
Sitemaps, Meta Data, and robots.txt Forum

    
Google under cover?
Guido

5+ Year Member



 
Msg#: 3672881 posted 10:32 am on Jun 12, 2008 (gmt 0)

A bad robot hit /bot-trap/index.php 2008-06-12 (Thu) 12:08:05
address is 66.249.85.133, hostname is ff-in-f133.google.com, agent is Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)

Did not obey my robots.txt and got trapped. Should I unblock it?

 

goodroi

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3672881 posted 12:58 pm on Jun 13, 2008 (gmt 0)

Googlebot has a very good track record of respecting and following robots.txt. Most of the time, when someone thinks Googlebot has misbehaved and ignored robots.txt, it turns out to be a non-Google visitor pretending to be Googlebot.

Sometimes it is someone at Google outside the search division (AdWords, AdSense, Local, etc.) who is looking at your site. But the Googlebot used for search does a good job with robots.txt.

Looking up the IP address that visited your site, it appears to have been associated with Google Web Accelerator use. This might be valid usage. I'd keep an eye on it.
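For anyone who wants to automate the kind of lookup described above, here is a minimal Python sketch of a reverse-then-forward DNS check. It is only an illustration, not something from the bot trap in question: the function names, the returned labels, and the list of accepted hostname suffixes are assumptions, so check the suffixes against Google's current crawler documentation before relying on them. The resolvers are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Assumption: hostname suffixes treated as Google-owned in this sketch.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the primary hostname that the IP address resolves to."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname):
    """Return the list of IPv4 addresses the hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def classify_visitor(ip, user_agent,
                     reverse=reverse_lookup, forward=forward_lookup):
    """Label a visitor using a reverse DNS lookup confirmed by a forward one."""
    try:
        hostname = reverse(ip)
        # The forward lookup must map back to the same IP; otherwise the
        # reverse record could have been set up by anyone.
        confirmed = ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return "unverified"
    if not confirmed or not hostname.lower().endswith(GOOGLE_SUFFIXES):
        return "not google"
    if "googlebot" in user_agent.lower():
        return "googlebot"
    # A Google-owned address whose user agent does not claim to be the crawler.
    return "google, not googlebot"
```

Under these assumptions, the visitor from the trap log (hostname ff-in-f133.google.com, MSIE user agent) would be labelled "google, not googlebot", provided the forward lookup maps that hostname back to the same IP. That is consistent with the visit coming from a Google service other than the search crawler.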

Guido

5+ Year Member



 
Msg#: 3672881 posted 1:38 pm on Jun 13, 2008 (gmt 0)

I was just thinking about the accelerator and wondering whether it follows every link on a page without honoring robots.txt.
I had ruled out humans because the trap is the standard 1px transparent GIF, but the landing page has a form that lets humans un-ban themselves. I'm going to remove the IP from .htaccess.
Thanks
Guido

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved