Forum Moderators: DixonJones


What stat system is best?

Want to see the crawl!


glenv

2:56 pm on Mar 15, 2003 (gmt 0)

10+ Year Member



I have several sites on my server and am jealous of not seeing the "crawl". I have the basic stats packages available, Analog and Webalizer. What can I use to see the many bots you guys and gals get so excited about? :-)

leskent

3:12 pm on Mar 15, 2003 (gmt 0)

10+ Year Member



I have been very impressed with the information I can glean with Sawmill 6. It is a drill-down style reporting system that, once you get used to its quirky interface, can provide a level of analysis I have found unavailable in any other reporting tool. It allows some amazing customization, but it can be rather complex to configure properly. For example, I have customized it to separate my paid referrals (AdWords, Overture, etc.) from my free referrals, which has been really important in my SEO analysis. I have also configured it to tell Google's deep-crawl and freshbot robots apart, so I can quickly see which pages are being sucked up by each crawler.

Les

4eyes

3:15 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use webalizer on all my sites.

I can see the number of times that Googlebot has hit my sites from the user agent report. I can't recall whether webalizer shows this by default or whether it's because I modified the config file.

If I want to see exactly which pages have been hit, I download the log file and hand-analyse it.

Usually the number of Googlebot hits is enough to tell me if something has gone wrong.
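The hand-analysis 4eyes describes can be scripted. A minimal sketch, assuming a combined-format Apache log (the file name access.log is a placeholder), that counts Googlebot hits per page:

```shell
# Count Googlebot hits per page from a combined-format Apache log.
# "access.log" is a placeholder -- point it at your real log file.
# In combined format, field 7 is the requested path.
grep -i googlebot access.log | awk '{ print $7 }' | sort | uniq -c | sort -rn
```

The output is a descending hit count per URL, which gives the same quick "has something gone wrong?" signal without waiting for the stats package to run.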

aspdaddy

3:25 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I like Log Parser - see the posts in Tracking and Logging.

Its big advantage is that you can easily run ad hoc queries in standard SQL and test a whole range of hunches using AND/OR, etc.

Examples:
cs(User-Agent) like '%googlebot%'
cs(User-Agent) like '%slurp%'
cs-uri-stem like '%sitemap%'
cs-uri-stem like '%guestbook%'
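For reference, fragments like those slot into a complete query. A hedged sketch of a Microsoft Log Parser command line, counting Googlebot hits per page (the ex*.log file pattern and the W3C input format are assumptions; adjust for your own log format):

```shell
logparser -i:W3C "SELECT cs-uri-stem, COUNT(*) AS hits FROM ex*.log WHERE cs(User-Agent) LIKE '%googlebot%' GROUP BY cs-uri-stem ORDER BY hits DESC"
```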

If you just want to look for spiders in your logs, you can just open the log files in notepad and use find.

HTH

kevinpate

5:14 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I dump the raw log, unzip it, and then dump that data into a spread sheet program. Once that new base copy is saved, I play with the spreadsheet sort feature a while to amuse myself, and to glean a bit of useful info in the process.
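The raw-log-to-spreadsheet step can also be scripted. A minimal sketch, assuming a combined-format Apache log (the names access.log and hits.csv are hypothetical), that pulls out CSV columns ready for import and sorting:

```shell
# Convert a combined-format Apache log into CSV columns:
# client IP, request line, status code, user agent.
# Note: a user agent containing a comma would need proper CSV quoting.
awk -F'"' '{
    split($1, pre, " ");    # pre[1] = client IP
    split($3, mid, " ");    # mid[1] = status code
    print pre[1] "," $2 "," mid[1] "," $6
}' access.log > hits.csv
```

From there the spreadsheet's sort feature does the rest, as described above.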

Jesse_Smith

6:21 pm on Mar 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you use SSI with CGI, you can make your own log and watch it update all day, live! Put this in the Perl file, create the text file, and set its permissions so the server can write to it.

# Path to the writable log file (the server needs write permission on it).
$database = "/path/to/site/public_html/cgi-local/google.txt";
$shortdate = `date +"%D %T %Z"`;
chomp ($shortdate);

# Match the UA case-insensitively so "Googlebot" is caught as well.
if ($ENV{'HTTP_USER_AGENT'} =~ /googlebot/i) {
    open (DATABASE, ">>$database") or die "Can't open $database: $!";
    print DATABASE "$ENV{'REMOTE_ADDR'} - $ENV{'HTTP_USER_AGENT'} - $ENV{'SCRIPT_URL'} - $shortdate\n";
    close (DATABASE);
}

Or you can create your own Googlebot log from a telnet (shell) session, piping the log through grep for Google's deep-crawl IP range:

cat /path/to/log/web.log | grep 216.239.46 > /path/to/googlelog/public_html/deepbot.log