There are some thin pages I don't want Google to index.
I've added a robots.txt to block these pages.
However, I still submit these pages via a sitemap. Why? Because I want to keep track of how many are being indexed this way.
I checked WMT and the first error I got was:
"When we tested a sample of the URLs from your Sitemap, we found that the site's robots.txt file was blocking access to some of the URLs. If you don't intend to block some of the URLs contained in the Sitemap, please use our robots.txt analysis tool to verify that the URLs you submitted in your Sitemap are accessible by Googlebot. All accessible URLs will still be submitted."
What has me worried is the last sentence, which to me suggests these URLs will still be submitted (it's not clear, though).
Another error that popped up was: "Sitemap contains urls which are blocked by robots.txt"
Does anybody know which takes precedence, the robots.txt or the sitemap? Can I tell Google not to index pages while still having them in a sitemap? The Submitted/Indexed ratio is still the same, which to me says Google is not obeying the robots.txt directive, but perhaps there is a delay?
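
In case it helps anyone reproduce the overlap, here is a minimal sketch of how I'd check locally which sitemap URLs my robots.txt blocks for Googlebot. The robots.txt rules, the example.com URLs, and the helper names (`sitemap_urls`, `blocked_urls`) are made up for illustration; they are not my real files. It uses Python's standard `urllib.robotparser`, which only approximates how Google reads robots.txt, and it says nothing about what Google actually indexes.

```python
from urllib.robotparser import RobotFileParser
import xml.etree.ElementTree as ET

# Hypothetical robots.txt contents (assumption, for illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /thin/
"""

# Hypothetical sitemap contents (assumption, for illustration only).
SITEMAP_XML = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/thin/page-1</loc></url>
  <url><loc>https://www.example.com/articles/full-page</loc></url>
</urlset>
"""

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def blocked_urls(robots_text, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    urls = sitemap_urls(SITEMAP_XML)
    blocked = blocked_urls(ROBOTS_TXT, urls)
    print(f"{len(blocked)} of {len(urls)} sitemap URLs are blocked by robots.txt:")
    for url in blocked:
        print("  " + url)
```

With real files, the two strings would be replaced by the contents of the site's actual robots.txt and sitemap. The blocked list should roughly match the URLs WMT is complaining about.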