Forum Moderators: goodroi

Should XML sitemap ONLY include actual URLs (not sort variants, etc)?

XML sitemaps and sort variants

domino66

4:10 pm on Feb 9, 2015 (gmt 0)


I used the free Xenu program to generate an XML sitemap for my medium-sized Q&A site that has around 1,500 individual URLs.

HOWEVER, when I opened the XML file it generated, I saw that it included about DOUBLE the number of URLs my site actually has, because it listed all sorts of variants of the same pages, for example with different sort orders or filters applied. So the XML sitemap included URLs like:
- http://example.com/threads?direction=asc&page=17&sort=title (index of threads sorted by title)
- http://example.com/threads?direction=desc&page=2&sort=title&tag%23Education=on (index of threads with both a sort-by-title and category-filter applied)
- http://example.com/threads/show/2283-lawyer?sort_by=newest (an individual Q&A page with content sorted by date)

My question is simply whether those URL variants belong in an XML sitemap or not. Or does it really not matter (i.e. will Google ignore them anyway)? I figure cleaner is always better, so should I manually remove all of the URLs that are just sort/filter variants, leaving only the "real" URLs?

lucy24

5:05 pm on Feb 9, 2015 (gmt 0)


List only those URLs you want search engines to crawl. That's what the sitemap is for: "You may have overlooked this subdirectory which is hiding behind three layers of scripting".

:: insert boilerplate about whether an xml sitemap is even necessary at all ::

Including duplicate URLs in a sitemap would seem to be just asking for trouble.
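
Rather than removing the variants by hand, the cleanup could be scripted. What follows is only a rough sketch in Python, not something Xenu provides. It assumes the URLs have already been pulled out of the sitemap's `<loc>` elements as plain strings, and that the parameter names seen in the examples above (`sort`, `sort_by`, `direction`, and anything starting with `tag#`) are the only ones marking a variant. The names `VARIANT_PARAMS`, `VARIANT_PREFIXES`, `is_variant` and `canonical_urls` are made up for illustration. Whether `page` counts as a variant depends on the site, so it is left alone here.

```python
from urllib.parse import urlsplit, parse_qsl

# Assumed parameter names, taken from the example URLs in this thread.
VARIANT_PARAMS = {"sort", "sort_by", "direction"}
# "tag%23Education" decodes to "tag#Education", so match on the decoded prefix.
VARIANT_PREFIXES = ("tag#",)


def is_variant(url):
    """Return True if the URL carries a sort or filter parameter."""
    query = urlsplit(url).query
    for name, _value in parse_qsl(query, keep_blank_values=True):
        if name in VARIANT_PARAMS or name.startswith(VARIANT_PREFIXES):
            return True
    return False


def canonical_urls(urls):
    """Keep only non-variant URLs, preserving order and dropping exact repeats."""
    seen = set()
    kept = []
    for url in urls:
        if not is_variant(url) and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept
```

Running `canonical_urls` over the full list from the generated file should leave only the URLs without sort or filter parameters, and that shorter list is what would go back into the sitemap. Check the result against the site before submitting it, since the parameter list above is a guess.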