

Sitemap Errors for Duplicate URLs - plus greatly reduced crawling

         

jojy

12:12 am on Aug 3, 2009 (gmt 0)

10+ Year Member



Recently I found that Google Sitemaps no longer accepts a sitemap that has duplicate URLs. I made changes to my sitemap and resubmitted it, but almost 24 hours have passed and it is still showing me the same error.

I have also noticed that the crawl rate has been greatly reduced over the past 2 days. It was usually about 10K-16K pages daily; now it is just 500.

tedster

1:12 pm on Aug 3, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Crawling does go through cycles, so two days is a very short period to draw any conclusions. Still, that is a massive fall-off in crawling.

How bad is the duplicate URL issue on your sitemap? Unless you have a very high level of content churn, I'd be tempted not to use a sitemap at all - at least until you're sure the duplicate issue is fixed.

Is anyone else seeing this kind of treatment?

jojy

2:00 pm on Aug 3, 2009 (gmt 0)

10+ Year Member



Around 100 URLs were repeated, which is not many. I checked my sitemap stats after 10 days.

tedster

5:51 pm on Aug 3, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There is more to this new warning than URLs that are exactly duplicated, character for character. There's now some more information available from Google [google.com] staff member John Mueller (JohnMu). For example:

There is one item which may lead to confusion here though - Google's Sitemaps processing generally simplifies URLs in ways that make sense on a whole. This includes removing "/index.html" from the URL if that's the last part. In general, that makes sense, since you want to show users the relevant part of the URL...these warnings are new, but the processing of your Sitemaps files has not changed.

Thanks to SERoundtable [seroundtable.com] for spreading the word.
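
For anyone who wants to check their own file before resubmitting, here is a rough Python sketch of the idea described in the quote: simplify each URL first, then look for collisions. The only simplification applied is the one JohnMu mentions (dropping a trailing "/index.html"); keeping the trailing slash afterwards is my assumption, and Google's real processing almost certainly does more than this. The function names `simplify` and `find_duplicates` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by the standard sitemap protocol (sitemaps.org).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def simplify(url):
    """Assumed simplification: drop a trailing "index.html", keep the slash.

    This mirrors only the single rule mentioned in the quote above;
    it is not Google's actual normalization.
    """
    suffix = "/index.html"
    if url.endswith(suffix):
        return url[: -len(suffix)] + "/"
    return url


def find_duplicates(sitemap_xml):
    """Return {simplified_url: [original urls]} for entries that collide."""
    root = ET.fromstring(sitemap_xml)
    groups = defaultdict(list)
    for loc in root.iter(SITEMAP_NS + "loc"):
        raw = (loc.text or "").strip()
        if raw:
            groups[simplify(raw)].append(raw)
    # Keep only the simplified URLs that more than one entry maps to.
    return {key: urls for key, urls in groups.items() if len(urls) > 1}
```

Pass it the sitemap contents as a string, for example `find_duplicates(open("sitemap.xml", encoding="utf-8").read())`. Any key in the result is a URL that appears more than once after simplification, even if the original entries differ character for character.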