Forum Moderators: goodroi

Google under cover?

     

Guido

10:32 am on Jun 12, 2008 (gmt 0)

5+ Year Member



A bad robot hit /bot-trap/index.php 2008-06-12 (Thu) 12:08:05
address is 66.249.85.133, hostname is ff-in-f133.google.com, agent is Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)

It did not obey my robots.txt and got trapped. Should I unblock it?
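For reference, here is a minimal sketch of what "obeying robots.txt" means for a compliant client, using Python's standard urllib.robotparser. The robots.txt content below is an assumption (the real file is not shown in this thread), and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the actual file from the site is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bot-trap/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a compliant client with this user agent may fetch the path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
    print(is_allowed(ROBOTS_TXT, agent, "/bot-trap/index.php"))
    print(is_allowed(ROBOTS_TXT, agent, "/index.html"))
```

A client that checked the file this way would never request the trap URL, so a hit on the trap suggests the visitor either ignored robots.txt or never read it.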

goodroi

12:58 pm on Jun 13, 2008 (gmt 0)

WebmasterWorld Administrator goodroi is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Googlebot has a very good track record of respecting and following robots.txt. Most of the time, when someone thinks Googlebot has misbehaved and ignored robots.txt, it turns out to be someone outside Google pretending to be Googlebot.

Sometimes it is someone at Google outside the search division (AdWords, AdSense, Local, etc.) who is looking at your site. But the Googlebot for search does a good job with robots.txt.

Looking up the IP address that visited your site, it seems to have been connected to Google Web Accelerator use. This might be valid usage. I'd keep an eye on it.
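One way to tell a real search crawler from an impostor or from other Google traffic is a forward-confirmed reverse DNS check: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it includes the original IP. The sketch below is only an illustration. The function names are hypothetical, and the default suffix of .googlebot.com is an assumption about which hostnames to treat as the search crawler.

```python
import socket
from typing import Callable, List, Sequence


def reverse_lookup(ip: str) -> str:
    """Resolve an IP address to its primary hostname (PTR record)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Resolve a hostname to its list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    suffixes: Sequence[str] = (".googlebot.com",),  # assumption: crawler hostnames end like this
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS: hostname must match a suffix and map back to the IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(tuple(suffixes)):
            return False
        # The forward lookup guards against spoofed PTR records.
        return ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses: lookup failed.
        return False


if __name__ == "__main__":
    # Performs live DNS queries; the result depends on current DNS data.
    print(is_verified_crawler("66.249.85.133"))
```

With the hostname from the trap log, ff-in-f133.google.com, the default suffix check fails, which fits the reading that the visitor came from Google but was not the search crawler. The resolvers are passed in as parameters so the check can be exercised without network access.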

Guido

1:38 pm on Jun 13, 2008 (gmt 0)

5+ Year Member



I was just thinking about the accelerator and wondering whether it follows all links on a page without regard to robots.txt.
I had ruled out a human visitor because the trap link is the standard 1px transparent GIF, though the landing page does have a form for humans to un-ban themselves. I'm going to remove the IP from .htaccess.
Thanks
Guido