
Forum Moderators: goodroi


How to stop google from indexing wordpress blog paginated archive

   
2:18 pm on Sep 14, 2012 (gmt 0)



Hello,

I want to remove my WordPress blog's paginated archive pages (like http://example.com/page/401) from Google's search index, but not the content that appears on those pages, i.e. the posts.

I am seeing a lot of visitors arriving from Google Image Search on these paginated archive pages, and by then the specific post may well have moved to a different archive page because new content has been published on the blog.

I tried blocking them by putting Disallow: */page* in my robots.txt file, but it doesn't seem to be working.

Please help.

Thanks
Abhinav
5:59 pm on Sep 14, 2012 (gmt 0)

5+ Year Member



Try this: Disallow: */page/*/*
9:47 am on Sep 15, 2012 (gmt 0)



Thanks Pankaj, I will try this.

Thanks
Abhinav
4:09 pm on Sep 15, 2012 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Disallow: /page/
will disallow anything beginning example.com/page/

The trailing * is redundant.
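
To illustrate the prefix matching described above, here is a minimal sketch using Python's standard-library robots.txt parser. The example URLs and the helper name is_crawlable are made up for the illustration, and this parser only does plain prefix matching (it does not implement Google's * wildcard extension), which is exactly the behaviour in question here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt using the plain prefix rule from this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /page/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if robots_txt allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Made-up URLs, only to show which paths the prefix rule catches.
    for url in (
        "http://example.com/page/401",
        "http://example.com/2012/09/some-post/",
        "http://example.com/category/news/page/2/",
    ):
        print(url, is_crawlable(url))
```

Note that a rule like this only matches paths that start with /page/, so a paginated archive nested under another path (the /category/news/page/2/ example) would not be covered by it.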
8:50 pm on Sep 15, 2012 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



the Disallow directive in robots.txt is about crawling, not indexing.
you probably want to implement a meta robots noindex element or the X-Robots-Tag: noindex HTTP Response header.
9:13 pm on Sep 15, 2012 (gmt 0)

10+ Year Member



you probably want to implement a meta robots noindex element or the X-Robots-Tag: noindex HTTP Response header.


I would go this route as well. Since your pages are already in the index, adding the Disallow to robots.txt at this point will stop the bots from crawling those pages, and as a result the pages will stay in the index, but with a blank description.
9:40 pm on Sep 15, 2012 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I forgot to mention (and it was implied by mslina2002) that if you implement a noindex you must remove the crawling exclusion from robots.txt, otherwise the crawler never fetches the page and never sees the noindex.
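
A rough sketch of why the two don't mix, written in Python purely for illustration. It is a toy model, not how Googlebot is actually implemented: the function crawler_sees_noindex, the sample page and the URL are all assumptions made up for this example. The point is only that a blocked page is never fetched, so a meta robots tag inside it is never read.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Made-up archive page carrying the meta robots element.
PAGE = (
    "<html><head>"
    '<meta name="robots" content="noindex, follow">'
    "</head><body>archive page 401</body></html>"
)
URL = "http://example.com/page/401"


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of any <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def crawler_sees_noindex(robots_txt: str, url: str, html: str) -> bool:
    """Toy model: the noindex only counts if the page may be crawled."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("*", url):
        return False  # blocked: page is never fetched, tag is never read
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    print(crawler_sees_noindex("", URL, PAGE))
    print(crawler_sees_noindex("User-agent: *\nDisallow: /page/\n", URL, PAGE))
```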
3:57 am on Sep 16, 2012 (gmt 0)



Thanks g1smd, phranque & mslina2002 for your replies. I would like to do as suggested above, but I am not sure how and where to do it.
9:58 am on Sep 16, 2012 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



the <meta> element is part of the Robots Exclusion Protocol.

About the Robots <META> tag:
http://www.robotstxt.org/meta.html [robotstxt.org]

one disadvantage to using the <meta> tag is it only helps you control indexing of HTML documents, since you can't put such a meta element in a pdf file, for example.

the X-Robots-Tag header is a Google extension that has the advantage of providing the meta robots functionality for web resources that are not HTML documents.

Robots Exclusion Protocol: now with even more flexibility:
http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html [googleblog.blogspot.com]
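
For what the header approach looks like in practice, here is a minimal sketch as a tiny Python WSGI app. This is not WordPress code (on a real WordPress site the header would be sent from PHP or from the web server configuration); the path pattern, the app and the response_headers helper are assumptions for the example. It only shows the idea: send X-Robots-Tag on the paginated archive URLs and nowhere else.

```python
import re
from wsgiref.util import setup_testing_defaults

# Assumed URL shape for paginated archives: /page/<number>, optional slash.
PAGINATED = re.compile(r"^/page/\d+/?$")


def app(environ, start_response):
    """Toy WSGI app that marks paginated archive pages as noindex."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if PAGINATED.match(path):
        # Same meaning as <meta name="robots" content="noindex, follow">
        headers.append(("X-Robots-Tag", "noindex, follow"))
    start_response("200 OK", headers)
    return [b"<html><body>placeholder</body></html>"]


def response_headers(path: str) -> dict:
    """Call the app in-process and return the headers it would send."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))
    return captured["headers"]


if __name__ == "__main__":
    print(response_headers("/page/401"))
    print(response_headers("/2012/09/some-post/"))
```

As with the meta element, the header is only seen if the URL is not blocked in robots.txt.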
10:06 am on Sep 16, 2012 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



by the way, i forgot to welcome you to WebmasterWorld, Abhinav!
2:35 pm on Sep 16, 2012 (gmt 0)



Thanks phranque for the tutorial links
3:53 pm on Sep 16, 2012 (gmt 0)

10+ Year Member



Another option:

If you are using WordPress you can add the meta tags either:

(1) manually in your archive files. I would only recommend this option if you are familiar with PHP and WordPress.

OR

(2) with a plugin -- I most likely can't name it here, but you can google "wordpress seo". There are several plugins that can help with this. With the plugin, all you have to do is go to the 'titles and meta' section and click the 'Noindex subpages of archives' button.
2:03 pm on Sep 18, 2012 (gmt 0)



Thanks mslina2002 for the other alternatives
6:57 am on Oct 3, 2012 (gmt 0)



Simply use Disallow: */page

You can use the Yoast SEO plugin for WordPress to tackle SEO tasks on a WordPress site/blog very easily.