phranque - 12:44 am on Jun 30, 2013 (gmt 0) [edited by: phranque at 12:56 am (utc) on Jun 30, 2013]
We have specific pages that we do not want indexed,
in this case you should allow the resource to be crawled and supply a meta robots noindex element or an X-Robots-Tag: noindex response header.
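for example, the element goes in the page's head and the header goes in the http response; both are standard syntax documented by google:

    <meta name="robots" content="noindex">

or, server-side:

    X-Robots-Tag: noindex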
The Googlebot shows up in logs all the time as having accessed those pages. These pages are properly blocked in robots.txt
have you checked the IP address to verify it wasn't something else merely spoofing the googlebot user agent?
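the documented check is a reverse dns lookup on the ip, then a forward lookup to confirm the hostname resolves back to the same address. a quick python sketch (the ip shown is just a sample - use addresses pulled from your own logs):

    import socket

    ip = "66.249.66.1"  # sample address from an access log
    # reverse dns: a genuine googlebot resolves to *.googlebot.com or *.google.com
    hostname = socket.gethostbyaddr(ip)[0]
    # forward-confirm: the hostname must resolve back to the original ip
    forward_ips = socket.gethostbyname_ex(hostname)[2]
    is_googlebot = hostname.endswith((".googlebot.com", ".google.com")) and ip in forward_ips
    print(hostname, is_googlebot)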
and have noindex in the headers
this is irrelevant when you have excluded the bot from crawling: googlebot never fetches the page, so it never sees the noindex.
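e.g. with a hypothetical /private/ section blocked like this:

    User-agent: *
    Disallow: /private/

the noindex served by pages under /private/ goes unread.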
These pages show up in the SERPs from time to time with the "description blocked by robots.txt" statement.
this is precisely the expected behavior when googlebot is excluded from crawling: google can still index a blocked url from links pointing to it, it just can't fetch the content for a description. so if the real googlebot did actually crawl that url, the index is ignoring that fact.