TheMadScientist - 10:37 pm on May 14, 2011 (gmt 0)
[edited by: TheMadScientist at 10:45 pm (utc) on May 14, 2011]
The first thing to establish is whether it is a genuine GoogleBot.
That's what PHP (and even parsing non-PHP extensions as PHP) is for, isn't it? ;)
# Only bother with the DNS lookups if the user agent claims to be one of the big crawlers
if(stripos($_SERVER['HTTP_USER_AGENT'],'GoogleBot')!==FALSE
|| stripos($_SERVER['HTTP_USER_AGENT'],'Slurp')!==FALSE
|| stripos($_SERVER['HTTP_USER_AGENT'],'BingBot')!==FALSE
) {
    # REMOTE_ADDR is the visitor's IP (REMOTE_HOST is a hostname and is often not set)
    $botip = $_SERVER['REMOTE_ADDR'];
    # Reverse lookup: IP to hostname (gethostbyaddr returns the IP unchanged on failure)
    $bothost = gethostbyaddr($botip);
    # Forward lookup: hostname back to IP, which has to match the original IP
    $verifiedbotip = gethostbyname($bothost);
    if($botip == $verifiedbotip && (substr($bothost, -14) == '.googlebot.com'
    || substr($bothost,-15) == 'crawl.yahoo.net'
    # Not sure if Y! still crawls from inktomisearch.com, but not a big deal to check for it
    || substr($bothost,-18) == '.inktomisearch.com'
    # AFAIK Bing still crawls from msn.com. May need to be updated at some point
    || substr($bothost,-15) == '.search.msn.com')
    ) {
        # What to do if it's a real bot
    }
    else {
        # What to do if it's an imposter
    }
}
Modified from jcoronella's post here: [webmasterworld.com...]
NOTE: The JS file / stats are two of the few I haven't been running a full verification on, but I think it might be time to start.