And Now Google's Doing It. JS Stats Show GoogleBot
- 11:42 pm on May 14, 2011
Googlebots doing their stuff without reference to robots.txt
Or using a previously cached copy, perhaps.
To borrow a line from Pirates of the Caribbean, the rules of the pirate code (the Robots Exclusion Protocol) are only guidelines. There are no sanctions for misbehaviour.
It's all something of a charade; we just do what we can to get the right files into the index (and keep the wrong ones out). To quote keyplyr again:
I'm talking real world. G, Y, M$ all crawl disallowed files.
There are worse bots to worry about - ones that offer no potential benefits.
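If you want to see for yourself which bots are fetching disallowed files, something along these lines might do as a starting point. It is only a rough sketch: the robots.txt content, the user agent names and the paths are all made up for illustration, and in practice you would pull the (user agent, path) pairs out of your own access logs. It uses Python's standard urllib.robotparser, which applies its own reading of the rules and may not match how any particular search engine interprets them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical (user agent, path) pairs, as might be extracted from an access log.
REQUESTS = [
    ("Googlebot", "/index.html"),
    ("Googlebot", "/private/stats.js"),
    ("Slurp", "/private/report.html"),
]


def disallowed_hits(robots_txt, requests):
    """Return the requests that the given robots.txt rules would disallow."""
    parser = RobotFileParser()
    # Parse the rules from text rather than fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return [(ua, path) for ua, path in requests if not parser.can_fetch(ua, path)]


hits = disallowed_hits(ROBOTS_TXT, REQUESTS)
for ua, path in hits:
    print(f"{ua} requested disallowed path {path}")
```

A hit only tells you the request does not square with the rules as they stand now; as noted above, the bot may have been working from a previously cached copy of robots.txt.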