Google Visit

         

DavidT

6:41 pm on Mar 23, 2003 (gmt 0)

10+ Year Member



216.239.39.5 - - [23/Mar/2003:09:13:11 -0800] "GET / HTTP/1.0" 200 6794 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

This IP belongs to Google. Is it a crawler, and if so, why doesn't it identify itself? Or is it just someone who works there?
Sorry if this is a stupid question.
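
For readers who want to pull the fields out of a log entry like the one above, here is a minimal sketch in Python. It assumes the line is in Apache's "combined" log format, which the sample appears to follow; the pattern and the function name `parse_log_line` are illustrative, not something posted in this thread.

```python
import re

# Combined log format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one combined-format log line as a dict, or None."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# The log entry quoted in the post above.
line = (
    '216.239.39.5 - - [23/Mar/2003:09:13:11 -0800] "GET / HTTP/1.0" 200 6794 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"'
)
entry = parse_log_line(line)
print(entry["host"], entry["referer"], entry["agent"])
```

A referer of "-" means no referrer was sent, and the user agent here is an ordinary browser string, so nothing in this entry alone marks the request as a crawler.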

Edouard_H

7:22 pm on Mar 23, 2003 (gmt 0)

10+ Year Member



I received a few similar visits to a site recently (direct requests) and assume it's a human visitor. The site's clean, so I'm not worried. In my case it could be AdWords, Catalog, or Froogle related; I haven't found any info yet on how those IP addresses are allotted.

DavidT

7:29 pm on Mar 23, 2003 (gmt 0)

10+ Year Member



Yes, but a weird human visitor: it requested only the HTML files, without the CSS files and images that make up the pages. The pages were taken in a logical pattern, but the gap between them was too long, and every request was direct.

Brett_Tabke

7:37 pm on Mar 23, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Who knows what setup they have for tools. The rest of the page requests could have come out of the cache. Was there a referrer?

DavidT

7:51 pm on Mar 23, 2003 (gmt 0)

10+ Year Member



No sir, no referrer, just as I posted above. That was the index/frameset page, followed by the 3 pages that make up the frameset, all without a referrer, which for default frameset pages is normally [mysite.com...]

Then there was a 10-20 second gap between the other files, but they were taken in a logical pattern, i.e. folder/blue-widgets.htm then folder/subfolder/more-about-blue-widgets.htm.

So I don't know.

WarmGlow

9:00 pm on Mar 23, 2003 (gmt 0)

10+ Year Member



DavidT,

Your visitor is using the Google language translation proxy. Look at the HTTP_REFERER string in the log file lines immediately following the first request.

wilderness

1:43 am on Mar 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



WarmGlow,
Welcome to Webmaster World.

Thanks for the heads up.

DavidT

5:49 am on Mar 24, 2003 (gmt 0)

10+ Year Member



Warm Glow,

Thanks, yes I see it now. That'll teach me to look a bit closer; the completely different IPs threw me. All requests using that service are direct though, so the visitor just got a bunch of 403s for images.

WarmGlow

8:21 am on Mar 24, 2003 (gmt 0)

10+ Year Member



"All requests using that service are direct though so visitor just got a bunch of 403s for images."

DavidT,

Although my server configuration replies with status code 403 to requests for image files from external referrers and known image file harvesters, I allow Google and Babel Fish language translators to display my pages with in-line images included. If a visitor has enough interest in my site's content to use a language translator, I do not want them to leave with the impression that my pages are broken.
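
The policy described above can be sketched as a small decision function. This is only an illustration in Python, not the actual server configuration: the host names in `OWN_HOSTS` and `TRANSLATOR_HOSTS` are assumptions, the function name `image_status` is made up, requests with no referrer are allowed here as a policy choice, and the check for known image harvesters is left out.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration only.
OWN_HOSTS = {"mysite.com", "www.mysite.com"}
TRANSLATOR_HOSTS = {"translate.google.com", "babelfish.altavista.com"}  # assumed
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")


def image_status(path, referer):
    """Return the HTTP status this sketch would send for a request."""
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return 200  # the policy only covers image files
    if not referer or referer == "-":
        return 200  # no referrer sent: allowed in this sketch
    host = urlparse(referer).hostname  # None if the referrer is not a full URL
    if host in OWN_HOSTS or host in TRANSLATOR_HOSTS:
        return 200  # our own pages, or a translator showing our pages
    return 403  # any other external referrer


print(image_status("/images/logo.gif", "http://translate.google.com/translate_c?hl=en"))
print(image_status("/images/logo.gif", "http://example.org/gallery.html"))
```

A site that also refuses image requests without a referrer, as DavidT's seems to, would return 403 in the second branch instead.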

WarmGlow

10:19 am on Mar 24, 2003 (gmt 0)

10+ Year Member



wilderness,

Thanks for the welcome. As the Google language translation proxy increases in popularity, alert site administrators like DavidT will be wondering what Google is doing. At first glance, it is certainly startling to see a request from a Google-owned remote host that sends a common web browser ID as the HTTP_USER_AGENT string. It is even more startling when the request is for a page that is excluded in the robots.txt file. I was relieved to find that these requests are made by human users through a helpful Google tool, and delighted that Brett is providing an opportunity to share that information.

wilderness

5:31 pm on Apr 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<snip>How can I tell if Google visited my page?</snip>

This thread lists the many IP ranges Google uses.
Google also includes its bot URL in the UA field of the log entry when spidering.

[webmasterworld.com...]
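
Putting the two signals together (source address and UA field), a rough check might look like the Python sketch below. The network in `GOOGLE_NETWORKS` is a single assumed example range chosen to contain the address from the first post; it is not the list from the linked thread, and the Googlebot string in the usage lines is only a sample. The function name `classify_visit` is made up.

```python
import ipaddress

# Assumed example range for illustration; substitute ranges you have verified.
GOOGLE_NETWORKS = [ipaddress.ip_network("216.239.32.0/19")]


def classify_visit(ip, agent):
    """Label a log entry by whether its IP and user agent point at Google's crawler."""
    address = ipaddress.ip_address(ip)
    from_google = any(address in network for network in GOOGLE_NETWORKS)
    names_bot = "googlebot" in agent.lower()
    if from_google and names_bot:
        return "crawler"
    if from_google:
        return "google address, browser user agent"
    if names_bot:
        return "claims Googlebot from an unlisted address"
    return "other"


# The visit from the first post: Google address, ordinary browser UA.
print(classify_visit("216.239.39.5", "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"))
# A sample crawler-style UA from the same address.
print(classify_visit("216.239.39.5", "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))
```

The second label is the case discussed earlier in this thread: a request from a Google address that does not name the bot, such as a visitor coming through the translation proxy.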