pageoneresults - 1:04 am on May 31, 2010 (gmt 0)
Do you ever use "noindex, follow"?
Yes, but I don't specify the follow directive; that's the default behavior.
We now use X-Robots-Tag for quite a bit of this.
We noarchive all documents by default. The cached copy is another area I don't like.
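For those wondering what the header route looks like: a minimal sketch, assuming Apache with mod_headers enabled (the directives are standard; the file pattern is just an example):

    # Site-wide default: keep every response out of the cache
    Header set X-Robots-Tag "noarchive"

    # Non-HTML files can't carry a meta tag, so the header covers them too
    <FilesMatch "\.(pdf|doc)$">
        Header set X-Robots-Tag "noindex, noarchive"
    </FilesMatch>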
We proactively noindex.
We aggressively noindex, nofollow.
On rare occasions, we may nofollow only. I typically don't like to send that type of signal. Very few documents qualify for the nofollow only treatment.
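For anyone keeping score, those three treatments as standard robots meta tags (a reference, not a prescription):

    <meta name="robots" content="noindex">
    <meta name="robots" content="noindex, nofollow">
    <meta name="robots" content="nofollow">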
I agree robots.txt is like having the curtains open.
I've had the opportunity to communicate with hacker types and I overheard them say that robots.txt is an area they review. ;)
Note: Every document out of the box contains the noindex, nofollow directive. We don't take any chances. Google finds URIs if you just think about them - they're psychic like that. ;)
Since we're quoting John Mueller on crawling and indexing, you'll find him recommending noindex. Note his suggestion NOT to include pages in your sitemaps that you don't want indexed. I see many folks throw the entire kitchen sink into their sitemap files; that's not good, and a waste of crawl equity.
John Mueller: It’s always a good idea for your XML Sitemap file to include all pages which you want to have indexed. If you have pages such as tag or archive pages which you prefer not to have indexed, it’s recommended to add a “noindex” robots meta tag to the pages (and of course, not to include them in the Sitemap file).
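A minimal sketch of what that leaves in the sitemap file; the URLs are placeholders:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- Indexable pages only; tag and archive pages carry noindex and stay out -->
      <url><loc>http://www.example.com/</loc></url>
      <url><loc>http://www.example.com/widgets/</loc></url>
    </urlset>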