Msg#: 3672881 posted 10:32 am on Jun 12, 2008 (gmt 0)
A bad robot hit /bot-trap/index.php 2008-06-12 (Thu) 12:08:05 address is 18.104.22.168, hostname is ff-in-f133.google.com, agent is Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
It did not obey my robots.txt and got trapped. Should I unblock it?
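Before blaming the visitor, it can help to confirm that the trap path really is disallowed for the user agent in question. Here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `SAMPLE_ROBOTS` are hypothetical, since the site's actual robots.txt is not shown in this thread, and `is_disallowed` is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; substitute the site's real robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /bot-trap/
"""


def is_disallowed(robots_txt, user_agent, path):
    """Return True if robots_txt forbids user_agent from fetching path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(is_disallowed(SAMPLE_ROBOTS, "Googlebot", "/bot-trap/index.php"))
    print(is_disallowed(SAMPLE_ROBOTS, "Googlebot", "/index.html"))
```

If the trap path comes back as allowed, the problem is in the robots.txt rules rather than in the visitor.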
Msg#: 3672881 posted 12:58 pm on Jun 13, 2008 (gmt 0)
Googlebot has a very good track record of respecting and following robots.txt. Most of the time, when someone thinks Googlebot has misbehaved and ignored robots.txt, it turns out to be someone outside Google pretending to be Googlebot.
Sometimes it is someone at Google outside the search division (AdWords, AdSense, Local, etc.) who is looking at your site. But the Googlebot used for search does a good job with robots.txt.
I looked up the IP address that visited your site, and it appears to have been associated with Google Web Accelerator use. This might be valid usage. I'd keep an eye on it.
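For anyone who wants to check whether a visitor really comes from Google rather than from someone pretending, the usual approach is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the domain, then resolve that hostname back and see whether the original IP is among the results. Below is a rough Python sketch of that idea. The suffix list in `TRUSTED_SUFFIXES` and the function name are my own assumptions, and the lookups are injectable only so the logic can be tried without touching the network.

```python
import socket

# Assumed suffixes; adjust to whatever the crawler operator documents.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_google_host(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a single IPv4 address."""
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses).
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, addresses).
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # The PTR record alone can be forged, so it is only a first filter.
    if not hostname.lower().rstrip(".").endswith(TRUSTED_SUFFIXES):
        return False

    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False

    # Confirmed only if the hostname resolves back to the original IP.
    return ip in addresses
```

Note that a pass here only says the address belongs to Google. It does not tell the search crawler apart from other Google services such as Web Accelerator, so the user agent string and the request pattern are still worth a look.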
Msg#: 3672881 posted 1:38 pm on Jun 13, 2008 (gmt 0)
I was just thinking about the accelerator and wondering whether it follows all links on a page without regard to robots.txt. I had ruled out humans because the trap is the standard 1px transparent GIF, but the landing page has a form so that humans can un-ban themselves. I'm going to remove the IP from .htaccess. Thanks, Guido