fragment from robots.txt:
User-agent: *
Disallow: /gfx/
Disallow: /cgi-bin/
Disallow: /protected/
Disallow: /private/
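For what it's worth, the fragment above is valid on its face: any compliant crawler matching the `*` record should stay out of all four paths. A quick sanity check is possible with Python's standard-library parser (a sketch, using the rules exactly as quoted above):

```python
from urllib import robotparser

# The robots.txt fragment quoted above, fed to the stdlib parser
# as a list of lines.
rules = [
    "User-agent: *",
    "Disallow: /gfx/",
    "Disallow: /cgi-bin/",
    "Disallow: /protected/",
    "Disallow: /private/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# A compliant crawler should be refused /cgi-bin/ but allowed elsewhere.
print(rp.can_fetch("Googlebot", "/cgi-bin/track.pl?os=www.example.com"))  # False
print(rp.can_fetch("Googlebot", "/index.html"))  # True
```

If this prints `False` for the /cgi-bin/ URL, the fragment itself isn't the problem; something else in the full file would be.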
fragment from access log:
64.68.82.26 - - [19/Jul/2002:12:44:02 +0100] "GET /cgi-bin/track.pl?os=www.example.com HTTP/1.0" 302 302 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
64.68.82.78 - - [19/Jul/2002:13:24:19 +0100] "GET /cgi-bin/track.pl?os=www.example.com HTTP/1.0" 302 294 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
64.68.82.6 - - [19/Jul/2002:13:54:35 +0100] "GET /cgi-bin/track.pl?os=www.example.com HTTP/1.0" 302 289 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
64.68.82.78 - - [19/Jul/2002:14:06:31 +0100] "GET /cgi-bin/track.pl?os=www.example.com HTTP/1.0" 302 297 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
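If you want to watch for further hits like these, a small log scan will flag crawler requests to paths the robots.txt disallows. This is a sketch for combined-format logs; the disallowed prefixes are taken from the fragment above, and the sample line is the first one from the log:

```python
import re

# Prefixes disallowed in the robots.txt fragment above.
DISALLOWED = ("/gfx/", "/cgi-bin/", "/protected/", "/private/")

# One combined-format log line (the first entry from the fragment).
line = ('64.68.82.26 - - [19/Jul/2002:12:44:02 +0100] '
        '"GET /cgi-bin/track.pl?os=www.example.com HTTP/1.0" 302 302 "-" '
        '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"')

# Pull the request path out of the quoted request line.
m = re.search(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"', line)
if m and m.group(1).startswith(DISALLOWED) and "Googlebot" in line:
    print("robots.txt violation:", m.group(1))
```

In practice you'd loop this over the whole log file instead of a single hard-coded line.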
[edited by: engine at 8:49 am (utc) on July 22, 2002]
[edit reason] Edited for generic website examples [/edit]
G.
Your robots.txt includes a lot of User-agent tokens that don't look legitimate to me, such as "Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)". The robots.txt convention says a token should be a single word naming the robot, without version information. Some of your tokens also contain characters rarely seen in robots.txt, such as parentheses.
It's possible those nonstandard tokens are causing Googlebot to misparse your robots.txt, but that's only an educated guess. My advice would be to remove every User-agent token you haven't verified against the crawlers you're actually trying to block, and see if that helps.
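To illustrate why multi-word tokens are risky: the old robots.txt convention treats the token as a plain robot name, and many parsers match it as a case-insensitive substring. This sketch uses a hypothetical matcher (not Googlebot's actual code) to show how a browser-style token can match far more than intended:

```python
def token_matches(token: str, robot_name: str) -> bool:
    """Hypothetical spec-style matching: case-insensitive substring
    check of the robot's name against the User-agent token."""
    return robot_name.lower() in token.lower() or token == "*"

# A nonstandard, multi-word token like the one quoted above:
token = "Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)"

print(token_matches(token, "Mozilla"))        # True: matches anything self-identifying as Mozilla
print(token_matches("Googlebot", "Googlebot"))  # True: a plain one-word token matches cleanly
print(token_matches("Googlebot", "Mozilla"))    # False
```

Since many crawlers include "Mozilla" in their user-agent strings for compatibility, a token like that can trip rules you never meant to apply to them; a plain one-word token avoids the ambiguity.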