Does Googlebot sometimes misread robots.txt, or is this perhaps a fake Googlebot?
In my robots.txt file I have this, which has worked for years:
User-agent: *
Disallow: /see-this/
This is the Googlebot visit that sprang the trap:
---------------------------------------------
A bad robot hit /see-this/ 2006-11-27 (Mon) 00:34:21
address is 66.249.65.109
agent is Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Welcome to WebmasterWorld!
My first thought was that the bot was one of the many imitation Googlebots. But the IP you list is a Google IP.
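If you want to check that kind of thing yourself, the usual approach is a reverse DNS lookup on the IP followed by a forward lookup on the hostname it returns. Here is a minimal sketch of that check. The function name, the injectable lookup parameters, and the exact list of hostname suffixes are my own assumptions for illustration, not anything taken from your logs.

```python
import socket

# Hostname suffixes assumed to belong to Google's crawlers (an assumption;
# check Google's current documentation before relying on this list).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google crawler hostname
    that forward-resolves back to the same ip."""
    # The lookups are injectable (hypothetical parameters) so the logic
    # can be exercised without network access.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        # The leading dot in each suffix rejects names like "fakegooglebot.com".
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # The forward lookup must return the original address; otherwise
        # anyone controlling their own reverse DNS could fake the hostname.
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
```

Calling is_real_googlebot("66.249.65.109") with no extra arguments performs live DNS lookups, so the answer depends on what DNS reports at the time you run it.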
The part of your robots.txt that you posted is valid, and Google should not be crawling /see-this/ unless you have a section in your robots.txt specifically for Googlebot that allows it. Do you?
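To see how a Googlebot-specific section could open up that path, here is a small sketch using Python's standard urllib.robotparser. The second record and the /private/ path are hypothetical, since I have not seen the rest of your file, and Python's parser is only a stand-in for how Googlebot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the posted "*" record plus an invented
# Googlebot-only record that does not mention /see-this/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /see-this/

User-agent: Googlebot
Disallow: /private/
"""


def allowed(agent, path, robots_txt=ROBOTS_TXT):
    """Return True if this parser would let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The Googlebot record replaces the "*" record for Googlebot, so the trap
# path is no longer disallowed for it; other robots still get "*".
print(allowed("Googlebot", "/see-this/"))
print(allowed("SomeOtherBot", "/see-this/"))
```

If your real file has a record like that, repeating the Disallow: /see-this/ line inside the Googlebot record would close the gap.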
cheers
goodroi
Most robots will accept the first robots.txt record that matches their user-agent string or "*", whichever comes first. The major search robots go beyond that and accept the record that most specifically matches their user-agent string.
However, support for obeying multiple records that match to varying degrees is likely non-existent. Therefore, your robots.txt should be designed on the assumption that any given robot will obey only one record, and the best approach is to design for the simple first-match rule described above.
Another way to put this is that robots.txt records are per-robot, not per-URL-path.
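The two selection behaviors can be sketched roughly as follows. This is an illustration under my own assumptions, not any robot's actual code: the function names are invented, matching is a plain substring test, and "most specific" is taken to mean the longest matching agent token.

```python
def parse_records(text):
    """Split robots.txt text into records of (agent tokens, rules)."""
    records, agents, rules = [], [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a User-agent line after rules starts a new record
                records.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        records.append((agents, rules))
    return records


def matches(token, robot_name):
    return token == "*" or token in robot_name.lower()


def first_match(records, robot_name):
    """Simple robots: obey the first record that matches, "*" included."""
    for agents, rules in records:
        if any(matches(token, robot_name) for token in agents):
            return rules
    return []


def most_specific(records, robot_name):
    """Major robots: prefer the longest matching name, fall back to "*"."""
    best, best_len = [], -1
    for agents, rules in records:
        for token in agents:
            if matches(token, robot_name):
                length = 0 if token == "*" else len(token)
                if length > best_len:
                    best, best_len = rules, length
    return best


# Hypothetical file with the "*" record listed before a named record.
EXAMPLE = """\
User-agent: *
Disallow: /see-this/

User-agent: Googlebot
Disallow: /private/
"""
records = parse_records(EXAMPLE)
print(first_match(records, "Googlebot"))    # rules from the "*" record
print(most_specific(records, "Googlebot"))  # rules from the Googlebot record
```

Either way, only one record's rules come back. With the records in this order the two behaviors disagree for Googlebot, which is why I suggest listing named records before "*" and repeating any shared Disallow lines in each record.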
Jim