
Google News Archive Forum

    
Using an SSI to check for Googlebot
I don't have access to my log files. How can I tell whether Googlebot is spidering my site?
bokesch




msg:89369
 9:15 pm on Mar 4, 2003 (gmt 0)


How can I use a server-side include (e.g. Apache XSSI, PHP, or ASP) to embed or call a script that checks for
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" as the USER_AGENT, or for "crawl*.googlebot.com" or "crawler*.googlebot.com" as the HOST?

 

jatar_k




msg:89370
 9:50 pm on Mar 4, 2003 (gmt 0)

PHP:

Just include a script that looks at $_SERVER['HTTP_USER_AGENT'] (documented at php.net).

If it finds one of your matches, it can write the hit to a text file.
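
A minimal sketch of that idea, assuming PHP and a log file the web server can write to. The function names `is_googlebot_ua` and `log_googlebot_hit`, the tab-separated line layout, and the log file path are made up for illustration; only the `$_SERVER` keys come from PHP itself.

```php
<?php
// Hypothetical helper: true when the User-Agent string mentions Googlebot.
function is_googlebot_ua(string $userAgent): bool
{
    // stripos() is case-insensitive and returns false when there is no match.
    return stripos($userAgent, 'Googlebot') !== false;
}

// Hypothetical helper: append one tab-separated line per Googlebot hit.
// $server is expected to look like $_SERVER.
function log_googlebot_hit(string $logFile, array $server): bool
{
    $userAgent = $server['HTTP_USER_AGENT'] ?? '';
    if (!is_googlebot_ua($userAgent)) {
        return false; // not Googlebot, nothing is written
    }
    $line = implode("\t", [
        gmdate('Y-m-d H:i:s'),
        $server['REMOTE_ADDR'] ?? '-',
        $server['REQUEST_URI'] ?? '-',
        $userAgent,
    ]) . "\n";
    // FILE_APPEND adds to the file; LOCK_EX guards against interleaved writes.
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// In the included script (the path is an assumption):
// log_googlebot_hit(__DIR__ . '/googlebot.log', $_SERVER);
```

Passing `$_SERVER` in as an argument, rather than reading it inside the function, keeps the helper easy to try out from the command line with a hand-built array.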

weteo




msg:89371
 9:52 pm on Mar 4, 2003 (gmt 0)

PHP:

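Predefined values (the bare $HTTP_USER_AGENT and $REMOTE_ADDR globals only exist when register_globals is on, so the $_SERVER forms are safer):
User Agent: $_SERVER['HTTP_USER_AGENT']
Remote IP: $_SERVER['REMOTE_ADDR']

Remote name (may not work on your server):
$remote_hostname = @gethostbyaddr($_SERVER['REMOTE_ADDR']);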
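
A rough sketch of how the reverse lookup could be turned into a host check, assuming the `crawl*.googlebot.com` and `crawler*.googlebot.com` patterns from the original question. The function names are hypothetical, and the forward-lookup confirmation step is an addition that nobody in the thread mentions.

```php
<?php
// Hypothetical helper: does a host name look like crawl*.googlebot.com
// or crawler*.googlebot.com (the patterns quoted in the question)?
function is_googlebot_host(string $host): bool
{
    // Anchored at both ends so "crawl1.googlebot.com.example.net" fails.
    return preg_match('/^crawl(er)?[^.]*\.googlebot\.com$/i', $host) === 1;
}

// Hypothetical helper: reverse-resolve the IP, check the name, then
// forward-resolve the name and confirm it maps back to the same IP.
function is_googlebot_ip(string $ip): bool
{
    $host = @gethostbyaddr($ip);
    // gethostbyaddr() returns the unmodified IP when the lookup fails
    // and false on malformed input.
    if ($host === false || $host === $ip) {
        return false;
    }
    if (!is_googlebot_host($host)) {
        return false;
    }
    $forward = gethostbynamel($host); // list of IPv4 addresses, or false
    return $forward !== false && in_array($ip, $forward, true);
}
```

Both lookups hit DNS on every call, so on a busy page it may be worth running them only when the User Agent already claims to be Googlebot.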

Italy




msg:89372
 10:12 pm on Mar 4, 2003 (gmt 0)

ASP:

Request.ServerVariables("HTTP_USER_AGENT")

aspdesigner




msg:89373
 10:28 pm on Mar 4, 2003 (gmt 0)

ASP:

User Agent: Request.ServerVariables("HTTP_USER_AGENT")
IP Address: Request.ServerVariables("REMOTE_ADDR")

weteo is correct: the host name may not be available on your server. Many hosting companies turn reverse lookups off because they place an extra load on the server. If you only want to identify Googlebot, though, the host name is redundant, since you can do that with the User Agent. The IP address, which you did not mention, is also useful, because it lets you distinguish between the deep crawler and Freshbot.

If you don't want to track every access to your web pages, only those from Googlebot, you can wrap the logging code in a check like this:

If InStr(Request.ServerVariables("HTTP_USER_AGENT"), "Googlebot") > 0 Then

    ' (log code here)

End If
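
To illustrate why recording the IP address helps, here is a small PHP sketch that tallies Googlebot hits per address so that separate crawler addresses stand out. The function name `count_hits_by_ip` and the tab-separated line layout (timestamp, IP, URI, user agent) are assumptions made for this example. Which addresses belong to which crawler is not something the code can tell you.

```php
<?php
// Hypothetical helper: count Googlebot hits per IP address from log lines
// in the form "timestamp<TAB>ip<TAB>uri<TAB>user agent". That layout is an
// assumption for this sketch, not something specified in the thread.
function count_hits_by_ip(array $lines): array
{
    $counts = [];
    foreach ($lines as $line) {
        $fields = explode("\t", rtrim($line, "\r\n"));
        if (count($fields) < 4 || stripos($fields[3], 'Googlebot') === false) {
            continue; // skip malformed lines and other user agents
        }
        $ip = $fields[1];
        $counts[$ip] = ($counts[$ip] ?? 0) + 1;
    }
    arsort($counts); // busiest addresses first
    return $counts;
}
```

A possible use, assuming the log lives in a file named `googlebot.log`: `print_r(count_hits_by_ip(file('googlebot.log')));`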
