Interesting Google-bot Image Encounter

         

caribguy

5:42 pm on Apr 24, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Now this is interesting:

www.example.com 66.249.71.200 - - [22/Apr/2010:00:22:44 -0900] "GET /photos/image-02.jpg HTTP/1.1" 403 298 "http://images.google.com," "Googlebot-Image/1.0"

caribguy

2:31 pm on Apr 25, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Didn't see any headers from these requests. Could it be related to the Stinky Crawler Proxy [webmasterworld.com]?

jdMorgan

3:08 pm on Apr 25, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Real Googlebots generally do not send a referrer header, because 'bot requests are not referred by links on other Web pages. They are driven by previously compiled, de-duplicated URL lists held in a stored database.

You should also see the HTTP "From" header containing googlebot(at)googlebot.com. In Apache mod_rewrite form, it should match either of the following:
 RewriteCond %{HTTP:From} ^googlebot\(at\)googlebot\.com$ 

-or-
 RewriteCond %{HTTP:From} =googlebot(at)googlebot.com 


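A reverse-DNS lookup on the requesting IP address should also resolve to crawl-nn-nnn-nn-nn.googlebot.com (where the "nn" digits are the octets of that IP address, so 66.249.71.200 would give crawl-66-249-71-200.googlebot.com), and the address should fall within one of the known Googlebot IP address ranges.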

If not, you can kick it to the curb.
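For anyone who wants to script the reverse-DNS check, here is a minimal Python sketch. It only tests the crawl-nn-nnn-nn-nn.googlebot.com pattern described above. The function names and the injectable lookup parameters are my own, and the forward lookup back to the same IP is an extra step that is not part of the description above. It does not check the address against any list of known IP ranges.

```python
import socket


def expected_crawl_host(ip):
    """Build the crawl-nn-nnn-nn-nn.googlebot.com name for an IPv4 address."""
    return "crawl-" + ip.replace(".", "-") + ".googlebot.com"


def looks_like_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if reverse DNS for ip matches the crawl-... pattern
    and a forward lookup of that host name includes ip again.

    reverse_lookup and forward_lookup are hypothetical hooks so the
    check can be exercised without touching real DNS.
    """
    if reverse_lookup is None:
        # gethostbyaddr returns (hostname, aliases, addresses)
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        forward_lookup = lambda name: socket.gethostbyname_ex(name)[2]
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        if host != expected_crawl_host(ip):
            return False
        # Assumption: also confirm the name points back at the same IP.
        return ip in forward_lookup(host)
    except OSError:
        # Failed lookups (socket.herror, socket.gaierror) count as "not verified".
        return False
```

Calling looks_like_googlebot("66.249.71.200") with the default arguments performs live DNS lookups, so the answer depends on what DNS returns at the time you run it.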

Jim