|Sitemap warning - too many redirects|
I found the following warning in Google Webmaster Tools:
"When we tested a sample of the URLs from your Sitemap, we found that some URLs were not accessible to Googlebot because they contained too many redirects. Please change the URLs in your Sitemap that redirect and replace them with the destination URL (the redirect target). All valid URLs will still be submitted."
What does this mean and what do I need to do?
I didn't think I had any redirects. It is a small, simple site.
I'd say take some of the URLs from your sitemap and fire up Firefox with the Live HTTP Headers add-on installed. You'll learn quickly enough whether your server is really generating redirects for your sitemap URLs, or whether this is just another of the reporting problems Google appears to be experiencing right now.
Thank you, tedster,
I installed it and ran it, but how do I determine whether there are any redirects? I couldn't find any instructions anywhere.
Live HTTP Headers shows you all the back-and-forth information between the browser and the server. If there's no redirect, then the server's first response will include a 200 status.
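If you'd rather script the check than click through the browser, here's a rough Python sketch that fetches just the status line without following redirects (this is my own illustration, not part of any Google tool; the helper names are made up):

```python
import http.client
from urllib.parse import urlparse

def fetch_status(url):
    """Request a URL WITHOUT following redirects and return
    (status_code, Location header or None)."""
    parts = urlparse(url)
    conn_cls = (http.client.HTTPSConnection
                if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc)
    conn.request("GET", parts.path or "/")
    resp = conn.getresponse()
    location = resp.getheader("Location")
    conn.close()
    return resp.status, location

def is_redirect(status):
    # 301, 302, 303, 307, 308 are all redirect statuses
    return 300 <= status < 400
```

If `fetch_status` returns a 200 for a sitemap URL, the server isn't redirecting it; a 3xx with a Location header shows you exactly where it bounces to.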
It matters whether the trailing / is included or not. I had to add it to all the URLs in my sitemap.xml to fix this problem. It also matters if your sitemap URLs include www but the site you verified does not, or vice versa. This is my first post, so hello to all who have taught me everything I know. It's nice to be a help for a change.
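Those two mismatches (trailing slash and www) can be checked before the server is ever asked. A quick sketch, assuming your canonical host is the www form and that directory-style paths should end in a slash (both assumptions, adjust for your site):

```python
from urllib.parse import urlparse

def matches_canonical(url, canonical_host, require_trailing_slash=True):
    """Return True if a sitemap URL uses the expected host and,
    for directory-style paths (no file extension), ends with a slash."""
    parts = urlparse(url)
    if parts.netloc != canonical_host:
        return False
    path = parts.path or "/"
    # Only directory-style paths need the trailing slash;
    # paths ending in a filename like page.html are left alone
    last = path.rsplit("/", 1)[-1]
    if require_trailing_slash and last and "." not in last:
        return False
    return True
```

Any URL this flags is one the server will likely answer with a 301 to the canonical form, which is exactly what the Webmaster Tools warning complains about.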
[edited by: BCDesigns at 7:08 pm (utc) on July 30, 2008]
In the first section of the results is:
HTTP/1.x 200 OK
The first line in the first section is the URL I entered. There are about ten more sections for things like the stylesheet, images, AdSense, StatCounter, etc.
Is that normal? If so, does that mean it's Google's problem?
Also, how many of the URLs in the sitemap should I try?
Sure, you'll see all the chatter back and forth, which includes the images, stylesheets, whatever. Perfectly normal and even necessary, if the browser is going to be able to render the page you designed.
Try a few of your URLs, maybe a dozen or so, and if the results are like what you just saw, then it sounds like googlebot has the problem. Others have seen a similar problem, and there's nothing you can do about it on your end.
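If you want to pull the list of URLs to test straight from the file rather than copying them by hand, a sitemap.xml is easy to parse with the standard library. A minimal sketch (the sample XML below is just an illustration):

```python
import xml.etree.ElementTree as ET

# The sitemaps.org schema puts all elements in this namespace
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Extract the <loc> values from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]
```

Feed each returned URL to whatever header checker you're using and note any that don't come straight back with a 200.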
Just one other idea: install the User Agent Switcher add-on and visit your site with a googlebot user-agent. If your server is somehow treating googlebot differently, this will show it to you. Some hackers these days will even get into a server and install a script that serves special junk on your pages just for googlebot.
Sometimes webhosts take some misguided steps, too. So switch out the user-agent and give that a go.
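The same user-agent test can be done in a few lines of Python if you prefer. This is just a sketch; `fetch_as` and `headers_for` are names I made up, and only the User-Agent header is spoofed:

```python
import http.client

# A published Googlebot user-agent string; the exact string can vary
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

def headers_for(user_agent):
    """Build request headers that present the given user-agent."""
    return {"User-Agent": user_agent}

def fetch_as(host, path, user_agent=GOOGLEBOT_UA):
    """Request a page with a spoofed User-Agent and return the status.
    If this differs from what a normal browser gets, the server is
    treating googlebot specially."""
    conn = http.client.HTTPConnection(host)
    conn.request("GET", path, headers=headers_for(user_agent))
    status = conn.getresponse().status
    conn.close()
    return status
```

Compare the status you get as "googlebot" with the status you get as yourself; any difference points at server-side special-casing.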
On a Live HTTP Headers note: in the config you can exclude URLs (such as images, CSS, etc.), and the default filter already screens out a fair number of those file types. Or you can include only certain URLs. Those options cut down the 'chatter' if you're troubleshooting a particular URL or set of URLs.
You could also have a look at Googlebot's activity in your server logs to see whether and how it gets redirected (most likely with a 301 or 302 status code). Log files are pretty much the definitive record of spidering activity.
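For an Apache-style combined log, pulling out exactly the Googlebot requests that got a 3xx answer is a short script. A sketch, assuming the standard combined log format (field layout is an assumption; check your own log config):

```python
import re

# Matches Apache combined-log-format lines:
# ip ident user [time] "method path proto" status bytes "referer" "agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_redirects(lines):
    """Yield (path, status) for Googlebot requests answered with a 3xx."""
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        if "Googlebot" in m.group("agent") and m.group("status").startswith("3"):
            yield m.group("path"), m.group("status")
```

Run it over your access log and you'll have a list of exactly which paths Googlebot was bounced from, and with which status code.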
Five messages were cut out into a new thread about conflicts
between Live HTTP Headers and an HTML editor. The new thread:
[edited by: tedster at 7:12 pm (utc) on Aug. 1, 2008]