aakk9999 - 9:29 am on Jul 28, 2012 (gmt 0)
I have the following situation which I think proves Google is NOT honouring robots.txt:
- the page has been blocked by robots.txt ever since it was created. It is blocked under User-agent: * and I have no other user agents specified in robots.txt
- in WMT, if I test this URL, it shows as "blocked by line nn"
- however, in WMT, under the "internal links" section, if I hover over that URL, the page preview shows a screenshot of the blocked page.
So despite the page being explicitly blocked, Google HAS visited it in order to create the screenshot.
To me it is blatantly obvious that Google is not honouring robots.txt, as I cannot see any way Google could obtain the page screenshot other than by visiting the page.
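
For anyone who wants to double-check the robots.txt rule itself outside WMT, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs and the is_blocked helper are all placeholders made up for illustration, since the real file and path are not shown in this post. It only confirms that the rule matches the URL; it says nothing about what Google actually fetched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for all user agents, one blocked path.
# Replace with the contents of the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blocked-page.html
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs on example.com, not the real site.
    for url in (
        "http://www.example.com/blocked-page.html",
        "http://www.example.com/some-other-page.html",
    ):
        print(url, "blocked:", is_blocked(ROBOTS_TXT, "Googlebot", url))
```

With only a User-agent: * group in the file, Googlebot falls back to that group, so the first URL should report as blocked and the second as allowed, which would match what the WMT tester shows.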