Bad Content: If You Can't NOINDEX, Will Robots.txt Be Ok?

   
4:54 pm on Mar 16, 2011 (gmt 0)

WebmasterWorld Senior Member planet13 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Hi there, Everyone:

I am using a CMS directory, and many of my pages don't have any content on them.

Unfortunately, I can't add a NOINDEX meta tag to those pages without noindexing the whole site.

Will blocking them via robots.txt work if they have ALREADY been indexed by Google and the other search engines?

I am doing this because of the Farmer / Panda update: I don't want Google to see a lot of "low quality" pages.

Thanks in advance.
6:13 pm on Mar 16, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Disallowing in robots.txt is one of the suggestions from Google, recommended when you plan to eventually improve that content. See these in-depth recommendations from Google's JohnMu: [google.com...]

As to whether it will "work" - no one seems to have any reports so far of any change helping them recover lost rankings.
11:06 am on Mar 17, 2011 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Using robots.txt will eventually get those pages out of Google's index. Since you mention they are low quality, I doubt Google is crawling them often, so it may take weeks or months for all of the low quality pages to be de-indexed. That is probably why (as tedster points out) no one has yet confirmed that this made rankings bounce back for the rest of their site.
1:25 pm on Mar 17, 2011 (gmt 0)

WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month



"Using robots.txt will eventually get those pages out of Google's index"

Yes, but only if there are no external links to these pages. Otherwise they may hang around in the index, shown without a description.
12:15 am on Mar 21, 2011 (gmt 0)

5+ Year Member



I feel I must chime in here. Read John Mu's comments carefully. He is NOT suggesting disallowing with robots.txt. He's saying to add a meta robots tag with NOINDEX in cases where you are working on improving content. Those are very different! The former blocks crawling of specific pages, or of every page that matches a wildcard pattern. The latter tells Google to remove the pages from its index completely, and must be applied on a page-by-page basis.

John actually suggests NOT disallowing crawling of those pages, because if you do, Googlebot is blind to them (including the NOINDEX meta tag).
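To make that point concrete, here is a rough sketch (not anything Google or John Mu published) that checks whether pages you intend to NOINDEX are also disallowed in robots.txt. It uses Python's standard `urllib.robotparser`. The function name, the example.com URLs, and the `/directory/empty/` path are made up for illustration, and the standard-library parser only approximates how Googlebot reads robots.txt.

```python
from urllib.robotparser import RobotFileParser


def blocked_noindex_urls(robots_txt, noindex_urls, user_agent="Googlebot"):
    """Return the URLs that robots.txt blocks for the given user agent.

    A crawler that honors robots.txt never fetches these pages, so it
    cannot see a NOINDEX meta tag placed on them.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt and page list, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directory/empty/
"""

urls = [
    "http://www.example.com/directory/empty/page1.html",
    "http://www.example.com/directory/full/page2.html",
]

# Any URL printed here carries a NOINDEX tag that the crawler will never read.
print(blocked_noindex_urls(ROBOTS_TXT, urls))
```

If the list comes back non-empty, the Disallow rule for those URLs would need to go before the NOINDEX tag can have any effect.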
12:35 am on Mar 21, 2011 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Thank you, Fred. I had not read those comments closely enough - you are correct. In fact, he specifically says "make sure that they're not disallowed by the robots.txt file."

Re-thinking the opening question, I now assume that robots.txt will NOT be OK. It sounds like Google might be scoring a site based on the past record of its URLs. It's still a bit ambiguous because that answer has a certain specific context, but my assumption now is that we need to remove the URLs or enhance their content. If enhancing the content takes time to do, then use noindex during the process.
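For a CMS that shares one template across every page, one possible way to apply noindex page by page is a small conditional in that template. This is only a sketch under assumptions: the helper name, the `listing_count` parameter, and the threshold are hypothetical, and it assumes the CMS lets the template call code when it builds the page head.

```python
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta_for(listing_count, min_listings=1):
    """Return the noindex meta tag for thin pages, or an empty string.

    listing_count: number of content items on the page (hypothetical input).
    min_listings:  pages with fewer items than this are treated as thin.
    """
    return NOINDEX_TAG if listing_count < min_listings else ""


# An empty category page gets the tag; a populated one gets nothing,
# so the tag disappears automatically once content has been added.
print(repr(robots_meta_for(0)))
print(repr(robots_meta_for(3)))
```

The template would print the returned string inside the page head, and the pages must stay crawlable in robots.txt for the tag to be seen.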
 
