Identifying The Google Bot

What to look for in logs?

         

southpaw12

9:19 pm on Apr 17, 2003 (gmt 0)

10+ Year Member



I'm new to this search engine game, so this might be a stupid question. What do I look for in my logs to see if Google has visited? I know my site was indexed on the last crawl, but I don't see any Google addresses in my referrer logs.

Is there another way to determine if Google has been to my site? Can anyone recommend any good free (or inexpensive) tools for tracking the crawlers?

Also, can you set alerts to let you know when certain crawlers are on your site?

paladin

9:56 pm on Apr 17, 2003 (gmt 0)

10+ Year Member



To see Googlebot in your logs, look in the User-Agent field (not the referrer) for entries like:
Googlebot/2.1+(+http://www.googlebot.com/bot.html)
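A minimal sketch of scanning an access log for that entry, assuming each request sits on one line with the User-Agent string somewhere in it (as in Apache's combined format or IIS W3C logs). The function name `googlebot_lines` and the sample lines are made up for illustration.

```python
def googlebot_lines(lines):
    """Return the log lines that mention Googlebot (case-insensitive)."""
    return [line.rstrip("\n") for line in lines if "googlebot" in line.lower()]


# Hypothetical sample lines in a combined-style log format.
sample_log = [
    '216.239.46.10 - - [17/Apr/2003:21:19:00 +0000] "GET / HTTP/1.0" 200 512 '
    '"-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"\n',
    '10.0.0.5 - - [17/Apr/2003:21:20:00 +0000] "GET /about.html HTTP/1.1" 200 900 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0)"\n',
]

hits = googlebot_lines(sample_log)
print(len(hits))
```

To run it against a real log, pass an open file instead of the sample list, e.g. `with open("access.log") as f: hits = googlebot_lines(f)` (the file name is a placeholder).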

As for tools that notify you when Google visits your site: assuming you can do some kind of server-side scripting, try something along these lines:
1. Create a text file on your server (googlebot.txt).
2. Add something like the following to each page you want to track:


if user agent contains "googlebot" then
    add an entry to your googlebot.txt with file name, date, time, IP, etc.
    (and, if you want, email yourself a message)
end if

You may want to put this IF statement in an include file.
Obviously, the actual code will differ depending on which server and scripting language you use.
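For illustration only, here is roughly what that check could look like in Python. This is a sketch under assumptions: the function name `log_if_googlebot`, its parameters, and the log line layout are all invented here, and the email step is left out. How you obtain the user agent, page name, and IP depends on your server setup.

```python
from datetime import datetime, timezone


def log_if_googlebot(user_agent, page, ip, log_path="googlebot.txt"):
    """Append one line to log_path when the user agent mentions Googlebot.

    Returns True if an entry was written, False otherwise.
    """
    # Case-insensitive substring test; a missing user agent counts as no match.
    if "googlebot" not in (user_agent or "").lower():
        return False
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S GMT")
    # Open in append mode so earlier visits are kept.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{page} - {stamp} - {ip} - {user_agent}\n")
    return True
```

Keeping it in one function matches the include-file idea: every tracked page calls the same check with its own page name.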

Jesse_Smith

9:56 pm on Apr 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



freshbot: 64.68.82.* Bah, listed for only a few days. Short dinner date, then it dumps you.
deepcrawler: 216.239.46.* Good bot, it likes you, listed till death do you part. So don't make it mad or it will divorce your site. You can divorce the Googlebot yourself by using your robots.txt file. It's much cheaper and faster than going to the courts.
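If you want to tell the two apart in your own logs, a small lookup is enough. The sketch below assumes the two ranges quoted above, which are one poster's observations and not an official list; the names `CRAWLER_RANGES` and `classify_crawler` are made up for this example.

```python
import ipaddress

# Ranges as reported in the post above; treat them as an assumption.
CRAWLER_RANGES = {
    "freshbot": ipaddress.ip_network("64.68.82.0/24"),
    "deepcrawler": ipaddress.ip_network("216.239.46.0/24"),
}


def classify_crawler(ip):
    """Return 'freshbot', 'deepcrawler', or None for an IP address string."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IP address at all.
        return None
    for name, network in CRAWLER_RANGES.items():
        if addr in network:
            return name
    return None
```

For example, `classify_crawler("64.68.82.5")` falls in the first range, while an address outside both ranges gives `None`.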

Here's how I log the Googlebot. If you use SSI with CGI, put this code in the script, make the text file writable by the server, and change $database to the correct path:

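# Path to the log file; it must be writable by the web server.
my $database = "/public_html/cgi-local/google.txt";
# Timestamp from the system date command (date, time, time zone).
my $shortdate = `date +"%D %T %Z"`;
chomp($shortdate);    # strip the trailing newline

# Match case-insensitively: the user agent is spelled "Googlebot".
if (($ENV{'HTTP_USER_AGENT'} || '') =~ /googlebot/i) {
    # Append one line per visit; skip logging if the file can't be opened.
    if (open(my $log, '>>', $database)) {
        print $log "$ENV{'REMOTE_ADDR'} - $ENV{'HTTP_USER_AGENT'} - $ENV{'SCRIPT_URL'} - $shortdate\n";
        close($log);
    }
}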