Freshbot behaviour

What's the criteria for regular Freshbot crawling?

         

Nick_W

7:29 pm on Oct 4, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi everyone,

What makes G's Freshbot stop by for a visit?

It's been rampaging through my site continuously for about 6 weeks or so, and I only have PR5.

So what are the criteria here?
And what is the normal behaviour?

thanks

Nick

Quinn

7:32 pm on Oct 4, 2002 (gmt 0)

10+ Year Member



This is just an educated guess: a history of fresh content will increase the frequency of visits.

Clearly news sources are crawled often. A Yahoo directory listing will bring the bot around with more frequency.

jdMorgan

7:33 pm on Oct 4, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nick_W,

PR5 seems to be a threshold. I have low to mid PR5, and I usually get freshed every three days.

Jim

chiyo

7:41 pm on Oct 4, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google itself says that it crawls more often those pages that are likely to change more.

I agree with Quinn that you need to build up some sort of history of regular changes.

Some advice at WebmasterWorld is that it is related to PageRank. That may be true, but I see it as more due to regular changes on the page. It may well be that there is a minimum amount of change required, and (another big assumption) that it could be due to NEW linked pages, not just random non-linked content or old pages relinked to that page.

Googlebot crawls around 70 to 250 pages a day (mostly at the higher end of that range). They are almost always our fast-changing news pages OR pages linked from our index page. I am talking here about our news and opinion site. I'm not sure how Google has established that they are good pages to crawl. But apart from popular explanations such as PR and last-modified dates, we are also not ruling out such things as inclusion in news.google, in RSS/newscrawler sites like newsisfree, or indexes of news sites, or even our RSS feeds.

Nick_W

7:45 pm on Oct 4, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Right, well there's incentive to keep updating and modifying. I can't believe the phrase I'm showing number one for, which I only added a few days ago!

Not the most competitive phrase but not to be sniffed at either ;)

Nick

Sasquatch

8:00 pm on Oct 4, 2002 (gmt 0)



I doubt that last date modified has much to do with it. I'm guessing your news pages are fairly old, but the dynamic content is updated regularly.

It's probably as simple as keeping track of whether the page has changed much in past fresh crawls: "We have crawled it every 4 days for a month and it hasn't changed, so move it to the every-8-days list," or "Every time we crawl it on the 4-day schedule it has changed, so let's try every 2."

And higher PR + whatever gets higher priority on the fresh cycles, just like the regular updates.
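
To make that guess concrete, here is a minimal Python sketch of an adaptive recrawl schedule. It is purely illustrative and not anything Google has confirmed: the function name `next_interval`, the halve/double rule, and the 1-day and 32-day bounds are all assumptions of mine.

```python
def next_interval(current_days, recent_changes, min_days=1, max_days=32):
    """Guess the next crawl interval (in days) for one page.

    recent_changes holds one boolean per recent crawl: True if the page
    had changed since the crawl before it. All thresholds are made up.
    """
    if not recent_changes:
        # No history yet, so keep the current schedule.
        return current_days
    if all(recent_changes):
        # Changed on every visit: crawl twice as often.
        return max(min_days, current_days // 2)
    if not any(recent_changes):
        # Never changed: crawl half as often.
        return min(max_days, current_days * 2)
    # Mixed history: leave the schedule alone.
    return current_days


# A month of 4-day crawls with no changes moves the page to the 8-day list.
print(next_interval(4, [False] * 7))
# A page that changed on every 4-day crawl gets tried every 2 days.
print(next_interval(4, [True, True, True]))
```

The PR part would then just decide which pages get served first when there isn't enough crawl time to go around.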

rfgdxm1

3:19 am on Oct 5, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>PR5 seems to be a threshold. I have low to mid PR5, and I usually get freshed every three days.

Only every 3 days? I'm surprised if Googlebot stays away from the home page of either of my sites for more than 2 days, and one site is PR5 and the other PR4. It is almost enough to give me the "Big Brother is watching you" feeling. :( Although, I'm beginning to think Googlebot is just a nosy critter. Even the main site often gets fully crawled at least once a week. I'm hoping I'm just paranoid, and this is all just part of that "minty fresh" thing Googleguy talks about. The main site is listed in both Yahoo and DMOZ, and the second one is in DMOZ. Thus, these directory listings may have something to do with Googlebot's nosiness.

I've also noted that even a trivial change to a page is enough to get a fresh tag. I've had it happen doing just a minor punctuation tweak.
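
One possible explanation, and this is only an assumption on my part about how a crawler might detect changes, is a simple comparison of content fingerprints. In the Python sketch below, `page_fingerprint` and `has_changed` are hypothetical names, and SHA-256 is just a convenient choice of hash. With a check like this, a single added comma counts as a change.

```python
import hashlib


def page_fingerprint(html: str) -> str:
    """Return a hex digest of the page content (SHA-256 chosen arbitrarily)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def has_changed(previous_html: str, current_html: str) -> bool:
    """True if the two versions of the page differ by even one byte."""
    return page_fingerprint(previous_html) != page_fingerprint(current_html)


old_page = "<p>Hello world</p>"
new_page = "<p>Hello, world</p>"  # only a comma was added

print(has_changed(old_page, new_page))
print(has_changed(old_page, old_page))
```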

Sasquatch

3:35 am on Oct 5, 2002 (gmt 0)



I'm not in DMOZ or Yahoo, and it's a rare day that Googlebot doesn't hit my home page.