Here's my robots.txt:

User-agent: *
Disallow: /adserver/
Disallow: /addbiography.cgi
Disallow: /addtrivia.cgi
Disallow: /addquotes.cgi
Disallow: /search.cgi
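Just to double-check the rules themselves, here's a quick sketch (using Python's standard urllib.robotparser; the rules string is copied from the robots.txt above) confirming that the file really does block the URL the bot fetched:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /adserver/
Disallow: /addbiography.cgi
Disallow: /addtrivia.cgi
Disallow: /addquotes.cgi
Disallow: /search.cgi
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# The URL the crawler fetched anyway -- the rules say it is disallowed:
print(rp.can_fetch("Mediapartners-Google/2.1",
                   "/addquotes.cgi?celeb=Robin%20Williams"))  # False

# A page not covered by any Disallow line is still allowed:
print(rp.can_fetch("Mediapartners-Google/2.1", "/index.html"))  # True
```

So the file is fine; the question is why the bot didn't honour it.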
I added that about four hours ago. I'm aware Googlebot doesn't request robots.txt on every visit, but then the following showed up in my logs:
crawler9.googlebot.com - - [11/Mar/2003:00:55:19 +0000] "GET /robots.txt HTTP/1.0" 200 869 "-" "Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)"
crawler9.googlebot.com - - [11/Mar/2003:00:56:48 +0000] "GET /addquotes.cgi?celeb=Robin%20Williams HTTP/1.0" 200 12988 "-" "Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)"
crawler9.googlebot.com - - [11/Mar/2003:00:58:29 +0000] "GET /addquotes.cgi?celeb=Robin%20Williams HTTP/1.0" 200 12988 "-" "Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)"
crawler9.googlebot.com read robots.txt but still fetched /addquotes.cgi?celeb=Robin%20Williams. Why? Is more than one instance of Googlebot running at once? If so, is it possible the instance that fetched the addquotes.cgi page had read robots.txt earlier, before I added the Disallow lines?
Good idea. I'll change my CGI pages like you suggested, with <meta name="robots" content="noindex,follow" />.
If you ping crawler9.googlebot.com, it resolves to 64.68.87.79.
I know that. What I'm asking is whether the visits with that user-agent string come from IPs normally associated with Googlebot, or whether someone is masquerading as Gbot. I'm guessing a reverse-DNS lookup on the IP would show whether 'Mediapartners' really belongs to Google, but my stats program doesn't have that functionality to check.
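For what it's worth, here's a rough sketch of the check a stats program would do, in Python: reverse-DNS the IP, make sure the name is under googlebot.com, then resolve that name forward again and confirm it points back at the same IP (a spoofer can fake the user-agent string, but not a forward-confirmed reverse lookup). The function names are mine, not from any particular tool, and the live lookups obviously need network access:

```python
import socket

def is_googlebot_host(hostname: str) -> bool:
    """Pure string check: is this hostname googlebot.com or a subdomain of it?"""
    return hostname == "googlebot.com" or hostname.endswith(".googlebot.com")

def verify_crawler(ip: str) -> bool:
    """Forward-confirmed reverse DNS for a claimed Googlebot IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse lookup
    except socket.herror:
        return False  # no PTR record at all
    if not is_googlebot_host(hostname):
        return False  # resolves, but not to Google's crawler domain
    try:
        # Forward-confirm: the name must resolve back to the same IP.
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False

# e.g. verify_crawler("64.68.87.79") on the IP from the logs above
```

Note the hostname test rejects look-alikes such as fakegooglebot.com, since the comparison requires the dot before "googlebot.com".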