The IP is 74.125.16.67, and I see it requesting images, as the OP over there stated, but I also see a few instances of "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14", which typically indicates screenshots, though I'm not so sure in this case.
I don't see it using robots.txt either.
Anyone have anything to share that can shed some light on this?
Thanks.
74.125.16.70 - - [11/Feb/2008:04:13:17 -0600] "GET /MyFolder/ HTTP/1.1" 200 13034 "Valid other website" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
71.43.82.zzz - - [11/Feb/2008:04:13:52 -0600] "GET /SameFolder/ HTTP/1.1" 200 13034 "SAME Valid other website" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
71.43.82.zzz <snip>images related to index pages of requested folder</snip>
There was also some other activity (possibly from one or both IPs), which I failed to record, in which pages on my other site led to requests for additional pages on this site from the 74.125.16.70 IP.
All this on the same day and within 2.5 minutes.
Please note: when the visitor's IP was used, all of the pages' images were requested as well.
The Google IP returned five days later, requesting a different page on one site via a valid (DMOZ-listed) referral.
The Google IP returned 13 days after the initial request (on the other website), arriving from a "valid" RIPE Google search, and was denied access (i.e., RIPE is denied on my sites).
Hope this helps.
Don
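For anyone who wants to pull this kind of activity out of their own logs, here is a minimal Python sketch. It assumes the Apache "combined" log format, as in the lines quoted above. The function name `summarize`, the default prefix `74.125.16.`, and the shape of the returned dictionary are my own choices for illustration, not anything from the posts. For each matching IP it records the requested paths, whether `/robots.txt` was ever fetched, and how many requests arrived with no user agent.

```python
import re
from collections import defaultdict

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def summarize(lines, prefix="74.125.16."):
    """Summarize requests from hosts whose address starts with `prefix`.

    The prefix is a hypothetical default taken from the IPs discussed
    in this thread; change it to whatever range you are watching.
    """
    summary = defaultdict(
        lambda: {"paths": [], "robots": False, "no_agent": 0}
    )
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m or not m["host"].startswith(prefix):
            continue  # unparseable line or a host we are not tracking
        entry = summary[m["host"]]
        entry["paths"].append(m["path"])
        if m["path"] == "/robots.txt":
            entry["robots"] = True
        # Apache logs a missing User-Agent header as "-"
        if m["agent"] in ("", "-"):
            entry["no_agent"] += 1
    return dict(summary)
```

Feed it an open log file (for example `summarize(open("access.log"))`, where the file name is whatever your server uses) and look at which IPs never touched `/robots.txt` and which ones sent requests without a user agent.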
Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.8.0.7)+Gecko/20060909+Firefox/1.5.0.7
The version of Firefox gets updated as time goes on:
Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.8.1.3)+Gecko/2007030919+Firefox/2.0.0.3
We have also seen this same IP address request numerous images and always without a user agent when doing so.
Sometimes they take a single HTML file, sometimes a JavaScript or CSS file, or a few images.
Sometimes it's a Linux UA, sometimes Windows, sometimes none at all (especially for images).
Sometimes they work in tandem with GoogleBot, sometimes with the Wireless Transcoder.
I have images, CSS and JavaScript disallowed in robots.txt and GoogleBot itself never takes these.
So I assume it is quality control, and they are making sure that I am not trying to fool them.
As long as they are really from Google I am unconcerned.
Others can expect no mercy.
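One common way to check whether a visitor is "really from Google" is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the domain, then resolve that hostname back and confirm it returns the same IP. Below is a minimal Python sketch of that idea. The function name `is_google_host` and the suffix list are assumptions for illustration. An address can belong to Google and still fail this check if its reverse DNS uses a different domain, so treat a negative result as "unconfirmed" rather than proof of a fake.

```python
import socket

# Assumed list of acceptable reverse-DNS suffixes; adjust for your needs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_host(ip,
                   reverse=socket.gethostbyaddr,
                   forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a single IPv4 address.

    `reverse` and `forward` default to the standard socket lookups and
    are parameters only so the logic can be exercised without network
    access.
    """
    try:
        hostname = reverse(ip)[0]  # PTR lookup: IP -> hostname
    except OSError:
        return False  # no reverse record, or the lookup failed
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False  # hostname is not under an accepted domain
    try:
        addresses = forward(hostname)[2]  # A lookup: hostname -> IPs
    except OSError:
        return False
    # The hostname must map back to the address we started with;
    # otherwise the PTR record could simply be spoofed.
    return ip in addresses
```

Doing a live lookup on every request is slow, so in practice you would probably cache the result per IP and only check addresses whose user agent or behavior looks like a crawler.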