Forum Moderators: DixonJones
Les
I can see the number of times Googlebot has hit my sites from the user-agent report. I can't recall whether Webalizer shows this by default or whether it's because I modified the config file.
If I want to see exactly which pages have been hit, I download the log file and analyse it by hand.
Usually the number of Googlebot hits is enough to tell me if something has gone wrong.
A big advantage of querying the logs this way is that you can easily run ad hoc queries using standard SQL and test a whole range of hunches with AND/OR conditions, etc.
Examples:
cs(User-Agent) like '%googlebot%'
cs(User-Agent) like '%slurp%'
cs-uri-stem like '%sitemap%'
cs-uri-stem like '%guestbook%'
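If you don't have the logs loaded into anything that speaks SQL, you can get much the same ad hoc filtering from the shell with grep and awk. A minimal sketch, assuming an Apache combined-format log (the access.log filename and the field positions are my assumptions, not from the posts above):

```shell
# Count Googlebot requests per URL in an Apache combined-format log.
# The log path is an assumption -- point it at your own file.
LOG=${LOG:-access.log}

# In the combined format, field 7 is the requested path.
# -i matches "Googlebot" regardless of case.
grep -i 'googlebot' "$LOG" | awk '{print $7}' | sort | uniq -c | sort -rn
```

This gives a quick hits-per-page table, which covers the same ground as the cs-uri-stem queries without a database.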
If you just want to look for spiders in your logs, you can open the log files in Notepad and use Find.
HTH
#!/usr/bin/perl
use strict;

my $database = "/path/to/site/public_html/cgi-local/google.txt";
my $shortdate = `date +"%D %T %Z"`;
chomp($shortdate);

# Googlebot's user-agent string has a capital G, so match case-insensitively
if ($ENV{'HTTP_USER_AGENT'} =~ /googlebot/i) {
    open(DATABASE, ">>", $database) or die "Cannot open $database: $!";
    print DATABASE "$ENV{'REMOTE_ADDR'} - $ENV{'HTTP_USER_AGENT'} - $ENV{'SCRIPT_URL'} - $shortdate\n";
    close(DATABASE);
}
Or you can create your own Googlebot log from the command line (e.g. over a telnet/SSH session).
cat /path/to/log/web.log | grep 216.239.46 > /path/to/googlelog/public_html/deepboot.log
The character between 'web.log' and 'grep' is the pipe (|), the vertical bar above the return key, with a space before and after it. This board can't display every character.
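One caveat with that pipeline: an unanchored grep for 216.239.46 will also match the string anywhere else on the line, e.g. inside a referrer URL. Anchoring the pattern and escaping the dots avoids that. A sketch (the anchored regex is my tweak; 216.239.46 is the prefix from the post above, and you should check Google's currently published crawler ranges before relying on it):

```shell
# Pull only lines whose client IP begins with 216.239.46.
# ^ anchors to the start of the line; \. makes the dots literal,
# so e.g. 216a239b46 or a referrer containing the prefix won't match.
LOG=${LOG:-/path/to/log/web.log}
grep '^216\.239\.46\.' "$LOG" > deepboot.log
```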