Home / Forums Index / WebmasterWorld / Webmaster General
Forum Library, Charter, Moderators: phranque

Webmaster General Forum

    
Issues with large site maps
vanderbolt




msg:4666965
 7:13 pm on Apr 29, 2014 (gmt 0)

I have a database-driven website with about 170K pages. I have all the URLs listed and ready to include in sitemaps. I know that a sitemap cannot contain more than 50K URLs and cannot be bigger than 10 megabytes. I also have to watch out for duplicate content.

Are there any other issues I need to know about? Are there any strategies that would help? Any mistakes to avoid?

 

bhukkel




msg:4670329
 9:10 pm on May 11, 2014 (gmt 0)

Create sitemaps based on your website sections or article categories rather than just 4 big sitemaps. With more sitemaps you can look in Google Webmaster Tools (GWT) and see which sections of your website got indexed or removed.
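
As a rough sketch of that idea, the URL list could be grouped by section and each group split so that no file exceeds the 50K limit. The sketch assumes the section is simply the first path segment of each URL; the function names and the `sitemap-<section>-<n>.xml` naming scheme are made up for illustration and should be swapped for whatever category data the site really has.

```python
from collections import defaultdict
from urllib.parse import urlparse

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit mentioned in the thread


def section_of(url):
    """Return the first path segment, or 'root' for top-level pages (assumed rule)."""
    path = urlparse(url).path.strip("/")
    return path.split("/")[0] if path else "root"


def plan_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Map hypothetical sitemap file names to the URLs each file should hold."""
    by_section = defaultdict(list)
    for url in urls:
        by_section[section_of(url)].append(url)

    plan = {}
    for section, members in sorted(by_section.items()):
        # Split an oversized section into numbered parts of at most max_urls.
        for start in range(0, len(members), max_urls):
            part = start // max_urls + 1
            plan[f"sitemap-{section}-{part}.xml"] = members[start:start + max_urls]
    return plan
```

With one file per section (or per part of a large section), the indexed counts reported per sitemap line up with site sections, which is the point of splitting this way.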

jmccormac




msg:4670341
 11:01 pm on May 11, 2014 (gmt 0)

It depends on how you can break down the webpages (by topic, alphanumerically, etc.). The best way might be to maintain a database table of pages with their lastmod date and priority, and generate the sitemaps from it.
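
A minimal sketch of the database table approach, assuming SQLite and a hypothetical `pages` table with `url`, `section`, `lastmod` and `priority` columns (none of these names come from the thread). It builds the XML for one section with the standard library only, and assumes lastmod is stored as an ISO date string such as `2014-05-11`.

```python
import sqlite3
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def create_pages_table(conn):
    """Create the hypothetical pages table; the primary key rejects repeated URLs."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS pages (
               url      TEXT PRIMARY KEY,
               section  TEXT NOT NULL,
               lastmod  TEXT NOT NULL,
               priority REAL NOT NULL DEFAULT 0.5
           )"""
    )


def build_sitemap(conn, section):
    """Return the sitemap XML for one section as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    rows = conn.execute(
        "SELECT url, lastmod, priority FROM pages WHERE section = ? ORDER BY url",
        (section,),
    )
    for url, lastmod, priority in rows:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # ElementTree escapes & and < in text
        ET.SubElement(entry, "lastmod").text = lastmod
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")
```

The returned string has no XML declaration, so a real generator would probably write it to a file with one prepended, and would also need to split any section that goes over the per-file limits.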

Also pay close attention to the most recent lastmod date in each sitemap, as you can use a sitemap index file for multiple sitemaps. (170K might sound large, but when you get to 1M pages or a few hundred million pages, things take some time to generate, and anything that can save traffic and unnecessary spidering is good. :) ) With the database table approach, only sitemaps with new lastmod dates need to be regenerated.
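
One way this could look, again as a sketch under assumptions: the same hypothetical `pages` table (with `section` and `lastmod` columns), lastmod stored as ISO date strings so that plain string comparison matches date order, one sitemap file per section, and a `generated` dict recording the lastmod each sitemap was last built with. The names and file naming scheme are illustrative only.

```python
import sqlite3
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def latest_lastmod_by_section(conn: sqlite3.Connection):
    """Return {section: most recent lastmod} from the hypothetical pages table."""
    rows = conn.execute("SELECT section, MAX(lastmod) FROM pages GROUP BY section")
    return dict(rows)


def stale_sections(latest, generated):
    """Sections whose newest lastmod is later than the one recorded at last build."""
    return sorted(
        section
        for section, lastmod in latest.items()
        if lastmod > generated.get(section, "")
    )


def build_sitemap_index(latest, base_url):
    """Return sitemap index XML listing one sitemap per section with its lastmod."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for section, lastmod in sorted(latest.items()):
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{section}.xml"
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(index, encoding="unicode")
```

Only the sections returned by `stale_sections` would need their sitemap files rewritten; the index itself is cheap to rebuild every time. Note that this comparison does not notice deleted pages, so removals would need separate handling.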

Regards...jmcc


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved