Forum Moderators: DixonJones
I found a little script though that tracks raw bot activity: [psi-tech.com.au...] and it seems to do the job effectively.
Is it possible to install a script of some kind that logs Googlebot's activity on my site?
One thing I would like to know is the exact times the bot indexes my index page.
The
tail -f path/to/httpd/access.log | grep "Mediapartners-Google" >> /home/djingo/adsense-bot.log
command, as suggested by trillianjedi, will only work on a Linux system!
Here is a small PHP script that will do the job.
The script will send an email when Google hits your site.
(If you want, you can also write this to a database, but then you will need a bit more code.)
It does a host name lookup (to see if the hit is really coming from the Google network) and, if so, sends you an email saying which page was hit.
The time of the email will give you the exact times.
You will need to add (or include) this code on every page.
<?php
// Lets send an email when Google hits the site :-)
// MAKE SURE that you set the "your@emailaddress" part
if ($HTTP_SERVER_VARS["HTTP_X_FORWARDED_FOR"] != "") {
    $host = @gethostbyaddr($HTTP_SERVER_VARS["HTTP_X_FORWARDED_FOR"]);
} else {
    $IP = $HTTP_SERVER_VARS["REMOTE_ADDR"];
    $host = @gethostbyaddr($HTTP_SERVER_VARS["REMOTE_ADDR"]);
}
if (eregi("googlebot", $host)) {
    $emailaddress = "your@emailaddress";
    mail("".$emailaddress."", "Google detected", "Host name is: " . $host . "\n page hit was on: " . $_SERVER['REQUEST_URI']."");
}
?>
[edited by: Noel at 3:21 pm (utc) on Oct. 20, 2006]
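For anyone who would rather keep a log than get one email per hit, here is a minimal sketch of the same idea that appends a timestamped line to a file. The function names format_bot_hit and log_bot_hit and the log path are made up for illustration; only the idea comes from the script above.

```php
<?php
// Hypothetical helper: build one tab-separated log line for a bot hit.
function format_bot_hit($host, $uri, $timestamp)
{
    // gmdate() formats in UTC, so the result does not depend on the server timezone
    return gmdate('Y-m-d H:i:s', $timestamp) . " UTC\t" . $host . "\t" . $uri . "\n";
}

// Hypothetical helper: append the line to $logFile (an assumed, writable path).
function log_bot_hit($logFile, $host, $uri)
{
    // FILE_APPEND adds to the end of the file; LOCK_EX avoids interleaved writes
    $line = format_bot_hit($host, $uri, time());
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}
```

In the script above you could then call log_bot_hit('/path/to/google-hits.log', $host, $_SERVER['REQUEST_URI']) in place of mail(), assuming the web server is allowed to write to that path.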
But $HTTP_SERVER_VARS is deprecated; use $_SERVER instead.
Also, if you want speed, you should probably stay away from gethostbyaddr and use an IP list in a db or something.
Also, everything you are asking for would be available if you had raw logs.
Just get a new host; look at how much time, and therefore money, this host is already costing you.
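To illustrate the raw-logs point: if a host does give you the access log, a few lines of PHP can pull out the time and path of each Googlebot request. This is only a sketch. It assumes the Apache "combined" log format, the function name parse_googlebot_hit is hypothetical, and it matches on the user-agent string, which anyone can fake.

```php
<?php
// Hypothetical helper: return IP, time and path for one access-log line
// if the user agent mentions Googlebot, otherwise null.
// Assumes the Apache "combined" log format.
function parse_googlebot_hit($line)
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(?:\S+) (\S+)[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"/';
    if (!preg_match($pattern, $line, $m)) {
        return null; // not a combined-format line
    }
    if (stripos($m[4], 'googlebot') === false) {
        return null; // some other visitor
    }
    return array('ip' => $m[1], 'time' => $m[2], 'path' => $m[3]);
}
```

Feeding it every line of the log (for example with file() and a foreach loop) gives the exact request times without touching the pages themselves.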
Noel, that looks cool, thanks. Is it possible to alter the script so it only sends an email if it's the index page that's hit?
Sure. Just add (or include) it only in your index.php.
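If the snippet lives in a shared include that every page loads, a check on the requested path does the same thing. A minimal sketch, assuming the index page is served at "/" or "/index.php"; the function name is_index_request is hypothetical.

```php
<?php
// Hypothetical helper: true when the requested path is the site's index page.
// Assumes the index is served at "/" or "/index.php"; adjust for your site.
function is_index_request($requestUri)
{
    // parse_url() drops any query string, e.g. "/index.php?ref=x" -> "/index.php"
    $path = parse_url($requestUri, PHP_URL_PATH);
    return $path === '/' || $path === '/index.php';
}
```

Wrapping the mail() call in if (is_index_request($_SERVER['REQUEST_URI'])) { ... } would then limit the emails to index hits.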
"also if you want speed then you should probably stay away from gethostbyaddr and use an IP list in a db or something."
jatar_k is correct about this. The script WILL DO a host lookup every time to see if the hit is really from Google. If you have access to a database with all the Google IPs, it will be much faster!
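A rough sketch of the IP-list idea without a database: compare the visitor's address against a few CIDR ranges held in an array. The helper names are hypothetical, the range shown is only an example and would have to be replaced with a list you maintain yourself, and the code handles IPv4 only and assumes a 64-bit PHP build.

```php
<?php
// Hypothetical helper: is an IPv4 address inside a CIDR range like "66.249.64.0/19"?
function ip_in_cidr($ip, $cidr)
{
    list($subnet, $bits) = explode('/', $cidr);
    $ipLong = ip2long($ip);
    $subnetLong = ip2long($subnet);
    if ($ipLong === false || $subnetLong === false) {
        return false; // not a valid IPv4 address
    }
    // Build a mask with the top $bits bits set, limited to 32 bits
    $mask = (-1 << (32 - (int) $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($subnetLong & $mask);
}

// Hypothetical helper: true if $ip falls inside any range in the list.
function ip_in_any_range($ip, array $ranges)
{
    foreach ($ranges as $cidr) {
        if (ip_in_cidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}

// Example list only: an assumption for illustration, not an authoritative set
$exampleRanges = array('66.249.64.0/19');
```

This trades the DNS round trip for plain integer comparisons, at the cost of keeping the list up to date.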
"$HTTP_SERVER_VARS is deprecated, use $_SERVER"
Agreed. Here is the script again using $_SERVER:
<?php
// Let's send an email when Google hits the site :-)
// MAKE SURE that you set the "your@emailaddress" part
// Use the forwarded address if a proxy set one, otherwise the remote address
if (!empty($_SERVER["HTTP_X_FORWARDED_FOR"])) {
    $ip = $_SERVER["HTTP_X_FORWARDED_FOR"];
} else {
    $ip = $_SERVER["REMOTE_ADDR"];
}
// Reverse lookup; @ hides the warning on a malformed address
$host = (string) @gethostbyaddr($ip);
// stripos() replaces eregi(), which was removed in PHP 7
if (stripos($host, "googlebot") !== false) {
    $emailaddress = "your@emailaddress";
    mail($emailaddress, "Google detected", "Host name is: " . $host . "\n page hit was on: " . $_SERVER['REQUEST_URI']);
}
?>
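The script only does a reverse lookup and accepts any host name containing "googlebot". A stricter variant, sketched below, requires the name to end in an expected domain and then resolves it forward again to confirm it maps back to the same IP. The function name is_verified_googlebot is hypothetical, the two domain suffixes are an assumption you should check for yourself, and the resolver arguments exist only so the function can be tried without real DNS.

```php
<?php
// Hypothetical helper: reverse lookup, suffix check, then forward lookup.
// $reverse and $forward default to PHP's own resolver functions and can be
// swapped out for testing.
function is_verified_googlebot($ip, $reverse = 'gethostbyaddr', $forward = 'gethostbyname')
{
    $host = call_user_func($reverse, $ip);
    if (!is_string($host) || $host === $ip) {
        return false; // lookup failed (gethostbyaddr returns the IP unchanged)
    }
    // Assumed suffixes; anchored at the END of the name so that
    // "googlebot.com.evil.example" does not pass
    if (!preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }
    // The name must resolve back to the same IP
    return call_user_func($forward, $host) === $ip;
}
```

Called as is_verified_googlebot($_SERVER['REMOTE_ADDR']) it costs two DNS lookups per hit, so the speed concern raised earlier in the thread applies even more here.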