Bot on the Hard Prowl

Gbot running hard, fast, and deep.

   
5:44 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Just hammering some sites right now.

1 pv a second here; another site was at 10 pvs a second.

6:19 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Googlebot has fetched pretty much every one of my web pages every day, for the last five years or more.

I don't know why it needs to spider them that often, since they're almost entirely static. A weekly fetch would work fine for most pages, with daily spidering for the few pages where links to new pages appear.

Fortunately my small-medium static sites cope just fine. I shudder to think what search-engine spidering does to a big database-backed site like this, though.

6:22 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, today is the first time I've ever seen Google handed 304s all the way down the line.
6:28 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Dumb question:

What does a returned "304" mean exactly in this spidering context?
Is it a message indicating "You already spidered this page, it hasn't changed any yet"?
.. or am I missing something?
I'm starting to see 304s here too. - Larry

6:44 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Is it a message indicating "You already spidered this page, it hasn't changed any yet"?
That's pretty much it. It is the 304 Not Modified status, returned in answer to an If-Modified-Since request header.

Danny, Is your server set up to return the 304 header when appropriate?
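
For anyone wondering what "return the 304 when appropriate" amounts to, here is a minimal Python sketch of the decision a server makes on a conditional GET. The function name `conditional_status` and the dates are made up for illustration, and real servers also look at other headers (ETag, for instance), so treat this as a rough outline rather than how any particular server does it.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the page is unchanged since the client's copy, else 200.

    if_modified_since: raw If-Modified-Since header value, or None.
    last_modified: timezone-aware datetime of the page's last change.
    """
    if if_modified_since is None:
        return 200  # plain GET: send the full page
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the page
    if client_time.tzinfo is None:
        # Treat a date without a usable zone as GMT, as HTTP dates are.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304  # not modified: headers only, no body
    return 200


# Hypothetical page last changed on Feb 1, 2005 at noon GMT.
page_changed = datetime(2005, 2, 1, 12, 0, 0, tzinfo=timezone.utc)
print(conditional_status("Sat, 05 Feb 2005 05:44:00 GMT", page_changed))
print(conditional_status("Mon, 31 Jan 2005 00:00:00 GMT", page_changed))
```

The first call is a bot whose copy is newer than the last change, so it gets the 304; the second has a stale copy and gets the full 200.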

6:47 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Danny, Is your server set up to return the 304 header when appropriate?

Yep. My logs show lots of 304s.

Doesn't stop Googlebot coming back for those pages again the next day.

7:16 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member



No, it will still come back, but when it gets the 304 that's all it's supposed to get. When it gets a 200 the log entry will also show a number indicating the number of bytes of the downloaded file. This number should not exist with a 304.
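
If you want to check the byte count in your own logs, something like the following Python sketch tallies Googlebot hits by status and adds up the bytes served. It assumes the Apache combined log format, where an empty body is logged as "-"; the sample lines, the IP address, and the name `googlebot_summary` are invented for illustration, so adjust the pattern to whatever your server writes.

```python
import re
from collections import Counter

# Request line, status, size ("-" when no body was sent), referer, user agent.
LINE_RE = re.compile(
    r'"[A-Z]+ \S+ [^"]*" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_summary(lines):
    """Count Googlebot requests per status code and total the bytes sent."""
    statuses = Counter()
    total_bytes = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None or "Googlebot" not in match.group("agent"):
            continue
        statuses[match.group("status")] += 1
        if match.group("size") != "-":
            total_bytes += int(match.group("size"))
    return statuses, total_bytes


# Made-up log lines, not taken from any real server.
SAMPLE_LINES = [
    '192.0.2.1 - - [05/Feb/2005:05:44:00 +0000] "GET /a.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.1 - - [05/Feb/2005:05:44:01 +0000] "GET /b.html HTTP/1.1" 304 - "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.1 - - [05/Feb/2005:05:44:02 +0000] "GET /c.html HTTP/1.1" 304 - "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.7 - - [05/Feb/2005:05:44:03 +0000] "GET /a.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible)"',
]

statuses, total_bytes = googlebot_summary(SAMPLE_LINES)
print(dict(statuses), total_bytes)
```

In the sample, only the single 200 contributes to the byte total; the two 304s add nothing, which is the whole point of the conditional fetch.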
7:25 am on Feb 5, 2005 (gmt 0)



Same here. Gbot gone wild. I've noticed that they do this all the time after a PR and/or backlink update.
7:41 am on Feb 5, 2005 (gmt 0)

10+ Year Member



I'm seeing a lot more of Googlebot-Image than usual.
7:41 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is it a message indicating "You already spidered this page"?

That's pretty much it. It is the 304 Not Modified status, returned in answer to an If-Modified-Since request header.

Thanks Powdork, that's what I thought. I'm going to look at access_logs again.
Maybe I'll learn something new about G spidering. -Larry

7:46 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I noticed the post which linked to w3c.org is now gone. Is that sort of link no longer ok?
I'm not complaining, just asking.
8:21 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I deleted my post because of you. The mods deleted it because it was empty.
8:27 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member powdork is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Sorry about that. I took a break while posting and you posted yours in the meantime. Yours was better.
9:28 am on Feb 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I guess it makes sense for Google to keep checking 304 pages daily, though some kind of back-off would seem obvious.
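
To make the back-off idea concrete, here is a small Python sketch of one way a crawler could stretch its revisit interval for pages that keep answering 304. This is purely hypothetical: the function `next_interval`, the doubling rule, and the 1-day and 30-day limits are my own assumptions, not anything Google has described.

```python
def next_interval(current_days, status, min_days=1, max_days=30):
    """Pick the next revisit interval (in days) from the last response status.

    A 304 doubles the wait, up to max_days; anything else (the page
    changed or something went wrong) drops back to min_days.
    """
    if status == 304:
        return min(current_days * 2, max_days)
    return min_days


# Simulate a page that stays unchanged for five visits, then changes.
interval = 1
history = []
for status in [304, 304, 304, 304, 304, 200]:
    interval = next_interval(interval, status)
    history.append(interval)
print(history)
```

With these assumed limits the waits go 2, 4, 8, 16, 30 days while the page is static, then snap back to daily as soon as a 200 shows up.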
 
