
Google News Archive Forum

    
what was googlebot looking for
Get / Http/1.1
getvisibleuk

10+ Year Member



 
Msg#: 9105 posted 2:32 pm on Feb 6, 2003 (gmt 0)

My client's site was down for a week last month. All that is in the db now is the homepage, with the hosting company's error message as the title tag.

Googlebot didn't come back for ages so I submitted it yesterday.

Today I got in my logs:

crawl1.googlebot.com ¦ date ¦ GET /robots.txt HTTP/1.0 ¦ 404
crawl1.googlebot.com ¦ date ¦ GET / HTTP/1.0 ¦ 304

Does this mean that she is happy now or, as I suspect, is she raising her lip at the site?

It is an old site that I've struggled to get the client to allow me to update for 3 years (I know :o )

Any thoughts - the site has a PR 4.

Ta GETVISIBLEUK

 

Dreamquick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9105 posted 3:10 pm on Feb 6, 2003 (gmt 0)

That means that GB asked for the robots.txt to check what it was allowed to crawl but didn't find one - this won't be a problem.

Secondly it asked for the default page on the root, and it got given a 304 which means "it hasn't changed since the last time you saw it". Incidentally that was one of the 216.* bots (aka "fresh" bot) which might explain the "has it changed" request rather than just give me the content.
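To make the reading of those two hits concrete, here is a rough sketch that turns a log line into a plain-English summary. It assumes each hit sits on one line with four ¦-separated fields (host, date, request, status), as in the excerpt quoted above; `describe_hit` and the `EXPLANATIONS` table are made-up names for illustration, not part of any log tool.

```python
# Hypothetical helper: the field layout (host ¦ date ¦ request ¦ status)
# is assumed from the log excerpt quoted in this thread.
EXPLANATIONS = {
    200: "document sent in full",
    304: "not modified since the crawler's last copy; body not sent",
    404: "not found",
}


def describe_hit(line):
    """Summarise one ¦-separated access-log line."""
    host, _date, request, status = [part.strip() for part in line.split("¦")]
    method, path, _protocol = request.split()
    code = int(status)
    meaning = EXPLANATIONS.get(code, "other response")
    return f"{host} {method} {path} -> {code} ({meaning})"


for line in [
    "crawl1.googlebot.com ¦ date ¦ GET /robots.txt HTTP/1.0 ¦ 404",
    "crawl1.googlebot.com ¦ date ¦ GET / HTTP/1.0 ¦ 304",
]:
    print(describe_hit(line))
```

The 404 on /robots.txt just says the file isn't there; the 304 on / says the homepage was requested conditionally and hadn't changed.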

On the plus side it does know your site exists at the moment, but on the downside it's not the deepcrawler (64.*) and as I'm not an expert on such things I couldn't say if that means you will be in the next index or not - although my gut feeling says "yes"...

- Tony

getvisibleuk

10+ Year Member



 
Msg#: 9105 posted 3:37 pm on Feb 6, 2003 (gmt 0)

It used to hold positions 1 - 3 on Google, AltaVista, Lycos, and the old Excite - was roaring - wasn't asked to update for ages so it slipped!

tristan

5+ Year Member



 
Msg#: 9105 posted 4:03 pm on Feb 6, 2003 (gmt 0)

just for the record:
the 216.* is the deepcrawler, 64.* is freshbot

JayC

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9105 posted 4:05 pm on Feb 6, 2003 (gmt 0)

Incidentally that was one of the 216.* bots (aka "fresh" bot) which might explain the "has it changed" request rather than just give me the content.

On the plus side it does know your site exists at the moment but on the downside its not the deepcrawler (64.*)

Actually I believe you have those reversed. Googlebot hits from 216.* are the deep crawler; those from 64.* are the freshbot.

binki

10+ Year Member



 
Msg#: 9105 posted 4:21 pm on Feb 6, 2003 (gmt 0)

Aha! So if that second line has a "200" instead of the "304", it means that the page has changed? Shouldn't that call for an extended visit?

I've had nothing but two of those two-line Freshbot visits in the last week and I am getting very antsy!

Dreamquick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9105 posted 4:34 pm on Feb 6, 2003 (gmt 0)

I was basing my fresh bot IP on this post:

[webmasterworld.com...]

My apologies if I got it wrong...

- Tony

JayC

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9105 posted 4:57 pm on Feb 6, 2003 (gmt 0)

Tony, I've seen those little off-cycle quick hits from a 216.* googlebot, too, and have on occasion since even before there was a "freshbot."

Perhaps they do sometimes run a freshbot from that IP range. In general though it's thought that 64.* means freshbot.

JayC

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 9105 posted 5:09 pm on Feb 6, 2003 (gmt 0)

Aha! So if that second line has a "200" instead of the "304", it means that the page has changed?

Basically that's right, as long as your server supports the "If-Modified-Since" header. A 200 response is the server just saying "ok, here's your document." A 304 means the requester sent an "If-Modified-Since" request, and the server's response was "it hasn't changed," so the document wasn't actually sent.

If your server isn't set up to support that, you'll only see 200s (or some other response for other situations), and never see a 304.
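For anyone who wants to see the decision spelled out, here is a minimal sketch of the 200-versus-304 choice. It is not how any particular web server implements it; `choose_status` is a made-up name, the dates in the usage lines are invented, and only the Python standard library is used.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def choose_status(last_modified, if_modified_since=None):
    """Return 304 if the requester's copy is still current, else 200.

    last_modified is a timezone-aware datetime for the document;
    if_modified_since is the raw header value, or None if it was not sent.
    """
    if if_modified_since is None:
        return 200  # plain GET: just send the document
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: treat it as a plain GET
    if client_time.tzinfo is None:
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304  # unchanged: no body is sent
    return 200


# Invented dates, purely for illustration.
page_changed = datetime(2003, 1, 1, tzinfo=timezone.utc)
last_visit = format_datetime(datetime(2003, 2, 1, tzinfo=timezone.utc), usegmt=True)
print(choose_status(page_changed, last_visit))  # conditional request, page unchanged
print(choose_status(page_changed))              # no If-Modified-Since header
```

A server that ignores the header behaves like the first `return 200` every time, which is why such a server never logs a 304.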

A related thread: [webmasterworld.com...]
