|Ink - 301 Googlebot - 200|
301 Moved Permanently ¦ 200 OK
After checking through my web stats, I found that my server was returning a 301 to Inktomi's crawler but a 200 to Googlebot for the same path:
"GET /dir1/widget-blue-large.html HTTP/1.0" 301 292 "-" "Mozilla/5.0 (Slurp/cat; email@example.com; [inktomi.com...]
"GET /dir1/widget-blue-large.html HTTP/1.0" 200 44023 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
I know that the 301 refers to the resource being assigned a new permanent URI, but am lost here - why does Googlebot find the requested path whilst Ink can't? (The path is valid!)
I use the following mod_rewrite rule:
RewriteRule ^(.*)$ /Navigate.do?cta=$1 [L]
Any help is appreciated.
Three possibilities that I can think of:
1. Ink is asking for your site under a different domain or hostname -- for example "yoursite.com" rather than "www.yoursite.com" -- and is being redirected, either by you or by your host configuration.
2. Google has updated its database to ask for "Navigate.do" directly, while Ink is still looking for "index.html" or similar. Ink appears to be horribly slow about updating - I've seen it ask for pages that went 404-Not Found or even 410-Gone more than six months ago, even though there are NO links to these files anywhere on the web.
3. Your code returns 302-Moved Temporarily status on requests for anything other than Navigate.do, so Slurp (correctly) assumes that these page names are being redirected only temporarily, and will eventually return to their original URLs.
In order to correct this situation, you may wish to change your RewriteRule to:
RewriteRule ^(.*)$ /Navigate.do?cta=$1 [R=301,L]
Thanks again Jim,
After checking my server stats again, I noticed that Slurp is requesting www.mysite.info, which I redirect to www.mysite.com with another mod_rewrite rule I have in place in the server root. So your first suggestion certainly looks like the likely cause.
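For reference, a host-based redirect of the kind described above might look roughly like the sketch below. The actual rule was not posted, so this is only an assumption: it presumes the rule lives in an .htaccess file in the document root, and the hostnames are simply the ones mentioned in this thread.

```apache
RewriteEngine On
# Assumed condition: only requests that arrive with the .info hostname match
RewriteCond %{HTTP_HOST} ^www\.mysite\.info$ [NC]
# Send an external 301 to the same path on the .com hostname
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
```

With a rule like this, a crawler that still requests the .info hostname is logged with a 301, while a crawler that requests the .com hostname never matches the condition and is served the page directly.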
The others I will follow up on and monitor over time; they are well worth noting :)
In addition, I am going to place the [R=301,L] into the .htaccess file as a backup.
[added]After implementing the [R=301,L] I have noticed that the URI changes to the Navigate.do?cta=widget-blue-large.html path rather than staying at www.mysite.com/dir1/widget-blue-large.html. Will this have any effect on search engines, now that the URL has the ? and parameters back in place?[/added]
If that's a problem, you may have to, errrr, cloak for Slurp.
Actually, I got in a hurry and misread your original RewriteRule. As originally posted, it was a server-internal rewrite, not an external redirect, so it would not have produced the 301 by itself. If you're reasonably sure it was the .info -> .com redirect that caused the anomaly, just put your posted rule back the way it was.
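To make the difference concrete, here is a rough side-by-side of the two forms discussed in this thread. The RewriteCond guard is an addition that was not in the posted rule; it is assumed here to keep the rule from matching its own target when used in .htaccess.

```apache
RewriteEngine On

# Variant 1: internal rewrite (the rule as originally posted, plus the guard).
# The server quietly serves /Navigate.do; the client still sees the URL it
# requested and receives no redirect status.
RewriteCond %{REQUEST_URI} !^/Navigate\.do
RewriteRule ^(.*)$ /Navigate.do?cta=$1 [L]

# Variant 2: external redirect (use INSTEAD of variant 1, not alongside it).
# The client receives a 301 with a Location header and must re-request
# /Navigate.do?cta=..., so the query-string URL becomes visible.
#RewriteCond %{REQUEST_URI} !^/Navigate\.do
#RewriteRule ^(.*)$ /Navigate.do?cta=$1 [R=301,L]
```

Variant 1 keeps the parameter-free path in front of browsers and crawlers, which is why going back to it removes the ? from the visible URL.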
It certainly sounds like this is the case... Google is requesting .com, and Slurp is requesting .info, thus the different server responses. Spoof yourself as Slurp and Googlebot at Wannabrowser if you'd like to see for yourself.