How are you tracking Googlebot?

Tracking Googlebot and Freshbot

brettb

1:06 am on Feb 28, 2003 (gmt 0)

10+ Year Member



I have been trying to figure out how all of you are able to track Googlebot and Freshbot so well. I am using Urchin 4, but I cannot figure out how to configure it for this. I am using Miva Merchant and have the main settings configured properly. If any of you have experience with Urchin and filters for the Googlebot and Freshbot IPs, I would really appreciate the help. Thanks in advance.

ga_ga

1:41 am on Feb 28, 2003 (gmt 0)

10+ Year Member



Can you run Perl CGI scripts on your server? I run a small homemade script which identifies Freshbot and the deep crawler from their IP addresses, which are available on this forum. You're most welcome to the details if you want them. That's just the way I do it; I'm sure there are many other options depending on your server or hosting.

Batman

1:49 am on Feb 28, 2003 (gmt 0)

10+ Year Member



Sorry to get a bit sidetracked here, but I'm a bit confused by Brett's post now. I'm rather new at SEO, so please forgive me for asking this. :) Here's what I thought to be the case:

There are two bots: Deep Crawl and Fast Crawl. Both of them are collectively called "Googlebot". Fast Crawl is also known as "Freshbot" as it quickly skims your site for new content and leaves the real work for Deep Crawl.

However, now that Brett seems to use Googlebot and Freshbot as separate bots, I'm a bit confused. So here's my new overview:

There are two bots: Deep Crawl and Fast Crawl. Fast Crawl is also known as "Freshbot" as it quickly skims your site for new content and leaves the real work for Deep Crawl. Deep Crawl is also known as the one and only Googlebot (it's considered the "real" bot of Google since it does the hard work).

Right?

aspdesigner

2:12 am on Feb 28, 2003 (gmt 0)

10+ Year Member



Batman, the deep crawler is what feeds the monthly updates. It typically has an IP around -

216.239.46.*

Freshbot is what tries to keep the index more up-to-date between monthly updates. It only hits some pages and sites; its effects can be temporary, and it's not always running. Freshbot-updated listings will show a date between the file size and the "Cached" link in the search results. Freshbot typically has an IP around -

64.68.82.*

P.S. - both the deep crawler and Freshbot will show up as "Googlebot" in your logs, so you need to look at the IP address to see which one it is.
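As a rough illustration of that IP check, here is a minimal Python sketch. It assumes the two prefixes quoted above (which are only "typical" and may change), that the client IP is the first field of each log line, and that the user-agent string appears somewhere in the line. The name `classify_hit` and the labels it returns are made up for this example.

```python
# Assumed prefixes, taken from the ranges quoted in this thread; they may change.
DEEP_PREFIX = "216.239.46."
FRESH_PREFIX = "64.68.82."


def classify_hit(log_line):
    """Return 'deepcrawl', 'freshbot', or None for one access-log line.

    Assumes the client IP is the first whitespace-separated field and that
    the user-agent string ("Googlebot") appears somewhere in the line.
    """
    fields = log_line.split()
    if not fields or "googlebot" not in log_line.lower():
        return None  # blank line, or not a Googlebot request at all
    ip = fields[0]
    if ip.startswith(DEEP_PREFIX):
        return "deepcrawl"
    if ip.startswith(FRESH_PREFIX):
        return "freshbot"
    return None  # Googlebot user agent, but from an IP outside both ranges
```

A Googlebot line from an IP outside both ranges comes back as `None`, so it is worth logging those separately in case the ranges have moved.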

Batman

2:17 am on Feb 28, 2003 (gmt 0)

10+ Year Member



OK, thanks. Learned yet another thing today :)

Sorry again for sidetracking for a moment. Back to bot tracking now :)

colemanator

3:44 am on Feb 28, 2003 (gmt 0)

10+ Year Member



I search for the IPs in my log files using UltraEdit, which has a very powerful (and fast) Find feature. I have been tempted to implement ga_ga's script method, as it would save a lot of unnecessary log searching.
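For anyone who would rather not search by hand, here is a rough Python sketch of the same lookup. It counts the log lines whose first field starts with one of the IP prefixes mentioned earlier in this thread. The prefixes, the labels, and the function name `count_bot_hits` are assumptions for illustration, not anyone's actual script.

```python
from collections import Counter

# Prefixes as quoted earlier in this thread; treat them as assumptions.
PREFIXES = {"216.239.46.": "deepcrawl", "64.68.82.": "freshbot"}


def count_bot_hits(lines):
    """Count log lines per bot label, matching on the leading IP field."""
    counts = Counter()
    for line in lines:
        ip = line.split(" ", 1)[0]  # assumes the IP is the first field
        for prefix, label in PREFIXES.items():
            if ip.startswith(prefix):
                counts[label] += 1
                break
    return counts
```

It accepts any iterable of lines, so an open log file can be passed in directly, for example `count_bot_hits(open("access.log"))`, where `access.log` is a placeholder path.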

Side note on the bots: does the deep crawl typically occur in between updates, or closer to the actual update, i.e., a week before, days before, etc.?

aspdesigner

3:47 am on Feb 28, 2003 (gmt 0)

10+ Year Member



It can vary, but it typically happens a few days after the update, and the results show up in the next update after that.

colemanator

3:55 am on Feb 28, 2003 (gmt 0)

10+ Year Member



So it's safe to assume that for the majority of the month Google is processing the data gathered by the deep crawl?