
Google News Archive Forum

    
what was googlebot looking for
Get / Http/1.1
getvisibleuk




msg:52416
 2:32 pm on Feb 6, 2003 (gmt 0)

My client's site was down for a week last month. All that is in the db now is the homepage, with the hosting company's error message as the title tag.

Googlebot didn't come back for ages so I submitted it yesterday.

Today I got in my logs:

crawl1.googlebot.com ¦ date ¦ GET /robots.txt HTTP/1.0 ¦ 404
crawl1.googlebot.com ¦ date ¦ GET / HTTP/1.0 ¦ 304

Does this mean that she is happy now or, as I suspect, is she raising her lip to the site?

It is an old site that I've struggled for 3 years to get the client to allow me to update (I know :o )

Any thoughts? The site has a PR of 4.

Ta GETVISIBLEUK

 

Dreamquick




msg:52417
 3:10 pm on Feb 6, 2003 (gmt 0)

That means that GB asked for the robots.txt to check what it was allowed to crawl but didn't find one (hence the 404). This won't be a problem.
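As a rough illustration of why a missing robots.txt is harmless, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It assumes an empty rule set is a fair stand-in for a 404 on /robots.txt; it shows how that library behaves, not how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# An empty rule list stands in for a robots.txt that does not exist
# (an assumption made for illustration only).
rules = RobotFileParser()
rules.parse([])

# With no rules on file, nothing is disallowed.
print(rules.can_fetch("Googlebot", "/"))
print(rules.can_fetch("Googlebot", "/any/page.html"))
```

With no rules to apply, the parser allows every path, which matches the "this won't be a problem" reading above.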

Secondly it asked for the default page on the root, and it was given a 304, which means "it hasn't changed since the last time you saw it". Incidentally that was one of the 216.* bots (aka "fresh" bot), which might explain the "has it changed" request rather than a plain "give me the content" request.
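For anyone who wants to read these hits out of a log programmatically, here is a small sketch. It assumes the "host ¦ date ¦ request ¦ status" layout quoted in the first post; `Hit`, `parse_hit`, `describe` and the `MEANINGS` table are names made up for this example, and a real server log will almost certainly use a different format.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    host: str
    when: str
    request: str
    status: int

# Rough meanings of the status codes discussed in this thread.
MEANINGS = {
    200: "document sent in full",
    304: "not modified since the crawler's last copy; body not sent",
    404: "not found",
}

def parse_hit(line: str) -> Hit:
    """Split one 'host ¦ date ¦ request ¦ status' line into fields."""
    host, when, request, status = (part.strip() for part in line.split("¦"))
    return Hit(host, when, request, int(status))

def describe(hit: Hit) -> str:
    """Summarise a hit as 'METHOD path -> status (meaning)'."""
    method, path, _protocol = hit.request.split()
    meaning = MEANINGS.get(hit.status, "other")
    return f"{method} {path} -> {hit.status} ({meaning})"

# The two lines from the first post ("date" is the poster's placeholder).
lines = [
    "crawl1.googlebot.com ¦ date ¦ GET /robots.txt HTTP/1.0 ¦ 404",
    "crawl1.googlebot.com ¦ date ¦ GET / HTTP/1.0 ¦ 304",
]
for line in lines:
    print(describe(parse_hit(line)))
```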

On the plus side it does know your site exists at the moment, but on the downside it's not the deepcrawler (64.*), and as I'm not an expert on such things I couldn't say if that means you will be in the next index or not, although my gut feeling says "yes"...

- Tony

getvisibleuk




msg:52418
 3:37 pm on Feb 6, 2003 (gmt 0)

It used to hold positions 1 - 3 on Google, AltaVista, Lycos, and the old Excite. It was roaring, but I wasn't asked to update it for ages so it slipped!

tristan




msg:52419
 4:03 pm on Feb 6, 2003 (gmt 0)

just for the record:
the 216.* is the deepcrawler, 64.* is freshbot

JayC




msg:52420
 4:05 pm on Feb 6, 2003 (gmt 0)

"Incidentally that was one of the 216.* bots (aka "fresh" bot) which might explain the "has it changed" request rather than just give me the content.

On the plus side it does know your site exists at the moment but on the downside its not the deepcrawler (64.*)"

Actually I believe you have those reversed. Googlebot hits from 216.* are the deep crawler; those from 64.* are the freshbot.

binki




msg:52421
 4:21 pm on Feb 6, 2003 (gmt 0)

Aha! So if that second line has a "200" instead of the "304", it means that the page has changed? Shouldn't that call for an extended visit?

I've had nothing but two of those two-line Freshbot visits in the last week and I am getting very antsy!

Dreamquick




msg:52422
 4:34 pm on Feb 6, 2003 (gmt 0)

I was basing my fresh bot IP on this post:

[webmasterworld.com...]

My apologies if I got it wrong...

- Tony

JayC




msg:52423
 4:57 pm on Feb 6, 2003 (gmt 0)

Tony, I've seen those little off-cycle quick hits from a 216.* googlebot, too, and have seen them on occasion since even before there was a "freshbot."

Perhaps they do sometimes run a freshbot from that IP range. In general though it's thought that 64.* means freshbot.
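If you want to tag hits in your own logs using this rule of thumb, a sketch might look like the following. It encodes only the first-octet convention described in this thread (216.* deep crawler, 64.* freshbot), which, as noted above, is not guaranteed to hold. `guess_crawler` is a hypothetical helper and the sample addresses are made up.

```python
def guess_crawler(ip: str) -> str:
    """Label a Googlebot hit using the first-octet rule of thumb from this thread."""
    first_octet = ip.split(".", 1)[0]
    if first_octet == "216":
        return "deepcrawler"
    if first_octet == "64":
        return "freshbot"
    return "unknown"

# Made-up addresses, purely for illustration.
for ip in ("216.0.0.1", "64.0.0.1", "10.0.0.1"):
    print(ip, guess_crawler(ip))
```

Treat the label as a guess at best, since the same range may be used for both kinds of visit.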

JayC




msg:52424
 5:09 pm on Feb 6, 2003 (gmt 0)

"Aha! So if that second line has a "200" instead of the "304", it means that the page has changed?"

Basically that's right, as long as your server is supporting the "If-Modified-Since" header. A 200 response is the server just saying "ok, here's your document." A 304 means the requester sent an "If-Modified-Since" request, and the server response was "it hasn't changed," so the document wasn't actually sent.

If your server isn't set up to support that, you'll only see 200s (or some other response for other situations), and never see a 304.
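To make the 200-versus-304 decision concrete, here is a minimal sketch of the check a server might perform. `status_for` is a hypothetical helper, the dates are invented, and a real web server does considerably more than this (ETags, caching rules and so on). It assumes `last_modified` is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional

def status_for(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the client's copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # unconditional request: always send the document
    try:
        client_copy = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the document
    if client_copy.tzinfo is None:
        client_copy = client_copy.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= client_copy:
        return 304
    return 200

# Invented dates: the page last changed on 10 Jan 2003.
page_changed = datetime(2003, 1, 10, 12, 0, 0, tzinfo=timezone.utc)
seen_after = format_datetime(datetime(2003, 2, 1, tzinfo=timezone.utc), usegmt=True)
seen_before = format_datetime(datetime(2003, 1, 1, tzinfo=timezone.utc), usegmt=True)

print(status_for(page_changed, seen_after))   # crawler fetched after the last change
print(status_for(page_changed, seen_before))  # page changed after the crawler's copy
print(status_for(page_changed, None))         # no If-Modified-Since header sent
```

A server that never looks at the header behaves like the last case every time, which is why such a server only ever logs 200s.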

A related thread: [webmasterworld.com...]

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved