66.77.73.56 - - [26/Jan/2003:19:56:06 -0800] "GET /robots.txt HTTP/1.0" 301 243 "-" "FAST-WebCrawler/3.6 (atw-crawler at fast dot no; [fast.no...]
Freshbot has been getting a ton of 301 responses like this one, all for obscure files that I have never moved or renamed in the five months the site has existed.
Sometimes a second attempt later on succeeds. About half of its requests end in 301s. It seems to affect only robots, as in the log line above.
I use .htaccess for RedirectPermanents, but not for 'folder1' above, and those can't affect robots.txt anyway.
Would anyone have any ideas?
The most likely cause is that your site supports two or more URLs to get to the same pages. This is common; the simplest example is a site which can be reached at www.domain.com or at domain.com. If you or your host have put in place a redirect from the "non-standard" domain to the "standard" domain, and the 'bots come in using the non-standard domain name, then you'll see these 301s in your logs. Similarly, if visitors or other webmasters use the "non-standard" address in bookmarks or in links to your site, then their accesses will log a 301.
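To make the mechanism concrete, here is a rough Python sketch of the decision such a redirect makes for each request. It is only an illustration, not your host's actual configuration: the function name canonical_redirect is made up, and treating www.domain.com as the "standard" domain is an assumption (your setup may be the other way around).

```python
def canonical_redirect(host, path, canonical_host="www.domain.com"):
    """Mimic a host-canonicalizing redirect.

    Returns (301, new_url) when the request came in on a non-standard
    host name, otherwise (200, None) to mean "serve the file normally".
    canonical_host is an assumed value chosen for illustration only.
    """
    if host.lower() != canonical_host:
        # Same path, standard host: this is what shows up as a 301 in the log.
        return 301, "http://" + canonical_host + path
    return 200, None


if __name__ == "__main__":
    # A 'bot asking for robots.txt on each host name.
    for requested_host in ("domain.com", "www.domain.com"):
        print(requested_host, canonical_redirect(requested_host, "/robots.txt"))
```

Note that a rule like this applies to every path, robots.txt included, which would explain 301s on files you never moved or renamed.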
You can test this yourself using either a browser, or the WebmasterWorld server header checker [webmasterworld.com]. Just type in variations of your URL as above, and see if you get a redirect. In a browser, you will see the address bar "magically" change in response to a 301 or 302 redirect.
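If you would rather script the check, something along these lines should work as a starting point. It is a sketch using only Python's standard library; the helper names host_variants and check_redirect are my own, and it assumes the only variation that matters is a leading "www." on the host. http.client does not follow redirects, so you see the raw status code and Location header.

```python
import http.client
from urllib.parse import urlsplit, urlunsplit


def host_variants(url):
    """Return the URL without and with a leading 'www.' on the host."""
    parts = urlsplit(url)
    host = parts.netloc
    bare = host[4:] if host.lower().startswith("www.") else host
    return [urlunsplit(parts._replace(netloc=h)) for h in (bare, "www." + bare)]


def check_redirect(url, timeout=10):
    """Send one HEAD request; return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


if __name__ == "__main__":
    # Replace with your own domain; domain.com is the placeholder used above.
    for variant in host_variants("http://domain.com/robots.txt"):
        print(variant, check_redirect(variant))
```

A 301 on one variant with a Location header pointing at the other is the pattern described above.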
Jim
You might want to check to see if other sites link to yours using the "www" version. If Googlebot finds a link, it will follow it verbatim, so those incoming links to "www" will always 301 until they are corrected.
The 301 lets the 'bot know that the two domains are one, and will help avoid duplicate content issues. It will also "correct" the URL in a visitor's address bar, which helps to prevent bookmarks and links to the "wrong" domain. But if the webmaster linking to you never checks the link, it won't help.
Jim