what was googlebot looking for

Get / Http/1.1

     
2:32 pm on Feb 6, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 19, 2002
posts:92
votes: 0


My client's site was down for a week last month. All that is in the db now is the homepage, with the hosting company's error message as the title tag.

Googlebot didn't come back for ages so I submitted it yesterday.

Today I got in my logs:

crawl1.googlebot.com ¦ date ¦ GET /robots.txt HTTP/1.0 ¦ 404
crawl1.googlebot.com ¦ date ¦ GET / HTTP/1.0 ¦ 304

Does this mean that she is happy now or, as I suspect, that she is raising her lip to the site?

It is an old site that I've struggled to get the client to allow me to update for 3 years (I know :o )

Any thoughts - the site has a PR 4.

Ta GETVISIBLEUK

3:10 pm on Feb 6, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 25, 2002
posts:872
votes: 0


That means that GB asked for the robots.txt to check what it was allowed to crawl but didn't find one - this won't be a problem.

Secondly, it asked for the default page on the root and was given a 304, which means "it hasn't changed since the last time you saw it". Incidentally, that was one of the 216.* bots (aka "fresh" bot), which might explain the "has it changed?" request rather than just "give me the content".
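As a rough illustration of reading those two log lines, here is a minimal Python sketch. The `explain_hit` helper and the `STATUS_MEANING` table are hypothetical names, and the field layout (host, date, request, status, separated by ¦) is assumed to match the excerpt quoted above; real server logs will usually look different.

```python
# Hypothetical helper: field layout assumed from the log excerpt in this thread.
STATUS_MEANING = {
    200: "document sent in full",
    304: "not modified since the requester's copy, so no body was sent",
    404: "not found",
}


def explain_hit(line):
    """Turn one 'host ¦ date ¦ request ¦ status' line into a short explanation."""
    host, _date, request, status = [field.strip() for field in line.split("¦")]
    method, path, _protocol = request.split()
    code = int(status)
    if path == "/robots.txt" and code == 404:
        # A missing robots.txt is treated as "no crawl restrictions".
        meaning = "no robots.txt, so nothing is disallowed"
    else:
        meaning = STATUS_MEANING.get(code, "other status, check the HTTP spec")
    return f"{host} {method} {path} -> {code}: {meaning}"


if __name__ == "__main__":
    for hit in (
        "crawl1.googlebot.com ¦ date ¦ GET /robots.txt HTTP/1.0 ¦ 404",
        "crawl1.googlebot.com ¦ date ¦ GET / HTTP/1.0 ¦ 304",
    ):
        print(explain_hit(hit))
```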

On the plus side, it does know your site exists at the moment, but on the downside it's not the deepcrawler (64.*), and as I'm not an expert on such things I couldn't say if that means you will be in the next index or not - although my gut feeling says "yes"...

- Tony

3:37 pm on Feb 6, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 19, 2002
posts:92
votes: 0


It used to hold positions 1 - 3 on Google, AltaVista, Lycos, and the old Excite - it was roaring - but I wasn't asked to update it for ages, so it slipped!

4:03 pm on Feb 6, 2003 (gmt 0)

New User

5+ Year Member

joined:Oct 4, 2008
posts:
votes: 0


Just for the record: the 216.* is the deepcrawler, 64.* is freshbot.

4:05 pm on Feb 6, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 5, 2001
posts:724
votes: 0


Incidentally that was one of the 216.* bots (aka "fresh" bot) which might explain the "has it changed" request rather than just give me the content.

On the plus side it does know your site exists at the moment but on the downside its not the deepcrawler (64.*)

Actually I believe you have those reversed. Googlebot hits from 216.* are the deep crawler; those from 64.* are the freshbot.

4:21 pm on Feb 6, 2003 (gmt 0)

New User

10+ Year Member

joined:Oct 30, 2002
posts:27
votes: 0


Aha! So if that second line has a "200" instead of the "304", it means that the page has changed? Shouldn't that call for an extended visit?

I've had nothing but two of those two-line Freshbot visits in the last week and I am getting very antsy!

4:34 pm on Feb 6, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 25, 2002
posts:872
votes: 0


I was basing my fresh bot IP on this post:

[webmasterworld.com...]

My apologies if I got it wrong...

- Tony

4:57 pm on Feb 6, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 5, 2001
posts:724
votes: 0


Tony, I've seen those little off-cycle quick hits from a 216.* googlebot, too, and have on occasion since even before there was a "freshbot."

Perhaps they do sometimes run a freshbot from that IP range. In general though it's thought that 64.* means freshbot.

5:09 pm on Feb 6, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 5, 2001
posts:724
votes: 0


Aha! So if that second line has a "200" instead of the "304", it means that the page has changed?

Basically that's right, as long as your server supports the "If-Modified-Since" header. A 200 response is the server just saying "ok, here's your document." A 304 means the requester sent an "If-Modified-Since" request and the server's response was "it hasn't changed," so the document wasn't actually sent.

If your server isn't set up to support that, you'll only see 200s (or some other response for other situations), and never see a 304.
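To make the 200-versus-304 decision concrete, here is a minimal Python sketch of the comparison a server might make. The function name `status_for_request` and the dates in the usage lines are made up for illustration; real servers also handle other conditional headers that are left out here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def status_for_request(last_modified, if_modified_since=None):
    """Return 304 if the requester's copy is still current, otherwise 200.

    last_modified: timezone-aware datetime of the document's last change.
    if_modified_since: raw header value, or None for a plain GET.
    """
    if if_modified_since is None:
        return 200  # no conditional header: always send the document
    try:
        client_copy = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if client_copy.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption of this sketch).
        client_copy = client_copy.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution.
    if last_modified.replace(microsecond=0) <= client_copy:
        return 304  # unchanged: headers only, no document body
    return 200


if __name__ == "__main__":
    # Hypothetical dates, chosen only to show both outcomes.
    page_changed = datetime(2003, 1, 20, 12, 0, tzinfo=timezone.utc)
    last_crawl = format_datetime(
        datetime(2003, 2, 1, tzinfo=timezone.utc), usegmt=True
    )
    print(status_for_request(page_changed))               # plain GET
    print(status_for_request(page_changed, last_crawl))   # unchanged since crawl
    newer = datetime(2003, 2, 5, tzinfo=timezone.utc)
    print(status_for_request(newer, last_crawl))          # changed since crawl
```

A server that never makes this comparison simply behaves like the first branch every time, which is why its logs would show only 200s.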

A related thread: [webmasterworld.com...]