We have had this problem on and off for some time now, and I just can't seem to find a fix for it. Hopefully someone will be able to offer me an explanation or point me in the direction of some further info.
We use rewrite rules for some of our pages:
RewriteRule ^(.*)ringtones/(.*)$ $1tones.php/$2
RewriteRule ^(.*)ringtones2/(.*)$ $1tones2.php/$2
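To check what the two patterns above actually capture, here is a minimal Python sketch that emulates them with plain regular expressions. It does not involve mod_rewrite at all. The function name `rewrite` and the sample paths are made up for illustration, and the sketch assumes the rules run in order with no flags, so the output of the first rule is fed to the second.

```python
import re

# Hypothetical emulation of the two RewriteRule lines. This only tests the
# regular expressions; real mod_rewrite behaviour (per-directory prefix
# stripping, flags, internal redirects) is NOT modelled here.
RULES = [
    (re.compile(r"^(.*)ringtones/(.*)$"), r"\g<1>tones.php/\g<2>"),
    (re.compile(r"^(.*)ringtones2/(.*)$"), r"\g<1>tones2.php/\g<2>"),
]


def rewrite(path):
    """Apply each rule in order, passing the result on to the next rule."""
    for pattern, substitution in RULES:
        if pattern.match(path):
            path = pattern.sub(substitution, path, count=1)
    return path


if __name__ == "__main__":
    for sample in (
        "ringtones/example.htm",  # made-up path for the first rule
        "ringtones2/artist-Foo-Fighters-Times-Like-These.htm",
    ):
        print(sample, "->", rewrite(sample))
```

Note that a `ringtones2/` path never matches the first pattern, because that pattern needs the literal text `ringtones/` and the `2` sits in the way. So, at least as far as the regular expressions go, the two rules should not interfere with each other.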
Googlebot comes to our page and a request is made for:
is directed to:
The above works fine; all pages return a '200 OK' result.
Googlebot comes to our page and a request is made for:
is directed to:
This fails and returns a 302. A second request, which to all intents and purposes looks identical, is then made and returns a '200 OK'.
This is what I see in my web logs.
www.ringtones-direct.com 18.104.22.168 - - [20/Jan/2003:04:04:37 +0000] "GET /ringtones2/artist-Foo-Fighters-Times-Like-These.htm HTTP/1.0" 302 263 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
www.ringtones-direct.com 22.214.171.124 - - [20/Jan/2003:04:04:40 +0000] "GET /ringtones2/artist-Foo-Fighters-Times-Like-These.htm HTTP/1.0" 200 34320 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
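To find every occurrence of this pattern rather than spotting them by eye, something like the following Python sketch could scan the log. It assumes each line has the same layout as the two lines above (virtual host first, then the usual combined log fields). The names `LOG_LINE`, `parse` and `redirect_then_ok` are made up for this example.

```python
import re

# Assumed layout: vhost, client IP, two ignored fields, [timestamp],
# "METHOD path protocol", status, size, "referer", "user agent".
LOG_LINE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def redirect_then_ok(lines):
    """List (path, ip_of_302, ip_of_200) where a 302 is directly followed,
    for the same path, by a 200."""
    last_seen = {}
    pairs = []
    for line in lines:
        entry = parse(line)
        if entry is None:
            continue  # skip lines in some other format
        previous = last_seen.get(entry["path"])
        if previous and previous["status"] == 302 and entry["status"] == 200:
            pairs.append((entry["path"], previous["ip"], entry["ip"]))
        last_seen[entry["path"]] = entry
    return pairs
```

The sketch also returns the client address of each request, which makes it easy to see whether the 302 and the 200 were served to different crawler addresses.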
Now the really curious part is this:
I have tried and tried to simulate the failed request from the command line with a telnet client, and every request I make returns a '200 OK'! The only way I can get it to fail is to leave the 'Host' header out of the HTTP request. Googlebot includes the Host header, so that should not be what is happening here.
I am totally stumped with this one. If anyone can help, you would make my year!
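For anyone who wants to repeat the telnet test in a scripted way, here is a rough Python sketch that sends the same HTTP/1.0 request with and without a Host header and prints the status line of each response. The helper names `build_request` and `fetch_status_line` are made up for this example, and the User-Agent string is simply copied from the log lines; nothing here reproduces whatever else Googlebot may send.

```python
import socket

# User-Agent copied from the log lines; any other Googlebot headers are unknown.
GOOGLEBOT_AGENT = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def build_request(path, host=None, agent=GOOGLEBOT_AGENT):
    """Build a raw HTTP/1.0 GET request, with a Host header only if host is given."""
    lines = ["GET {} HTTP/1.0".format(path)]
    if host is not None:
        lines.append("Host: {}".format(host))
    lines.append("User-Agent: {}".format(agent))
    # A blank line terminates the request headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_status_line(server, path, host=None, port=80, timeout=10):
    """Send the request to server:port and return the first line of the response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_request(path, host))
        status_line = sock.makefile("rb").readline()
    return status_line.decode("latin-1").strip()


if __name__ == "__main__":
    site = "www.ringtones-direct.com"
    page = "/ringtones2/artist-Foo-Fighters-Times-Like-These.htm"
    print("with Host:   ", fetch_status_line(site, page, host=site))
    print("without Host:", fetch_status_line(site, page))
```

If the request without a Host header is the only one that comes back as a 302, that would match what I see by hand in telnet.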
Thanks for reading