Forum Moderators: goodroi

Message Too Old, No Replies

Sitemap File Size up to 50MB

         

engine

2:55 pm on Dec 12, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Many of you may know that Bing now supports sitemap files of up to 50MB [blogs.bing.com]; previously, the limit was 10MB.
Each file is still limited to 50,000 URLs, and you can still gzip the file for transfer, but it's the uncompressed file that must not exceed 50MB.
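Since the cap applies to the uncompressed file, you can verify it with `gzip -l`, which reports the uncompressed size of an archive. A minimal sketch (the file name `sitemap.xml` is an assumption; here a tiny demo file is created so the commands run end to end):

```shell
# Create a tiny demo sitemap so the example is self-contained.
printf '<?xml version="1.0"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>' > sitemap.xml

# Compress a copy, keeping the original.
gzip -c sitemap.xml > sitemap.xml.gz

# The "uncompressed" column must stay <= 52428800 bytes (50 MB).
gzip -l sitemap.xml.gz
```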

If you have more than 50,000 URLs, you can split them across multiple sitemap files and list those in a sitemap index file, which in turn is limited to 50,000 sitemap entries and, again, a maximum size of 50MB.
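For example, a sitemap index pointing at several child sitemaps looks like this, per the sitemaps.org protocol (the file names and URLs are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.example.com/sitemap-1.xml.gz</loc>
    <lastmod>2016-12-12</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-2.xml.gz</loc>
  </sitemap>
</sitemapindex>
```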

Frequently changed URLs can be indicated in the sitemap with the lastmod tag. [sitemaps.org]
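In a regular sitemap file, lastmod goes on each url entry, like this (URL and date are illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/news/article.html</loc>
    <lastmod>2016-12-12</lastmod>
  </url>
</urlset>
```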

lucy24

6:05 pm on Dec 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Obligatory reminder:
The sole purpose of a sitemap (not to be confused with an HTML "Site Map" page) is to help search engines find pages they might otherwise overlook. It is not exclusive ("crawl and index only these pages"), it is inclusive ("be sure not to overlook these pages").

If you have 50,000 URLs that you're afraid search engines might not find, I'd worry that some of your human users might similarly have trouble finding the pages from within your site. And that's a much bigger problem. (Ever been to a page you found via a search engine, did some random exploring on the site, and then when you tried to return to the first page you couldn't find it by any conceivable link? I have. This is not a Good User Experience.)

Is there any evidence that search engines take "lastmod" seriously? I should think the only time they even might give it weight is if they're already very familiar with a given site, and know from experience that its "lastmod" tags are an accurate reflection of how often pages change.

engine

6:19 pm on Dec 12, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I agree about lastmod. If your site is crawled frequently, there's no need to use lastmod; the crawler will get around to it.

50,000 pages is quite a lot, and yes, it's very easy for a user to get lost in a site. I wouldn't want to use a sitemap myself, but then, the sitemap of such a site is not for me.

keyplyr

11:27 pm on Dec 12, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If your site is crawled frequently, there's no need to use lastmod, the crawler will get around to it.
On a site with about 300 indexable pages, I currently use Google Custom Search (although working on my own) as my site search utility. I have evidence that if I update the sitemap.xml lastmod, the updated page gets indexed very fast, usually within a few minutes to an hour. Otherwise it takes one to three days for my pages to get re-indexed.

I have always updated sitemap.xml every time I edit a page. This has always worked well, and indexing has been even faster in the last six months. I have not tested whether the re-indexed pages propagate across all data centers any faster.
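That lastmod bump on every edit is easy to automate. Here's one minimal sketch in Python using the standard library; the `touch_lastmod` helper, file path, and URL are all hypothetical, not something from this thread:

```python
# Hypothetical helper: set <lastmod> on one <url> entry in sitemap.xml
# whenever the corresponding page is edited.
import xml.etree.ElementTree as ET
from datetime import date

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", NS)  # keep the default namespace on output


def touch_lastmod(sitemap_path, page_url, when=None):
    """Set <lastmod> of the <url> whose <loc> matches page_url.

    Returns True if the entry was found and updated, False otherwise.
    """
    when = when or date.today().isoformat()
    tree = ET.parse(sitemap_path)
    for url in tree.getroot().findall(f"{{{NS}}}url"):
        loc = url.find(f"{{{NS}}}loc")
        if loc is not None and loc.text.strip() == page_url:
            lastmod = url.find(f"{{{NS}}}lastmod")
            if lastmod is None:
                lastmod = ET.SubElement(url, f"{{{NS}}}lastmod")
            lastmod.text = when
            tree.write(sitemap_path, encoding="utf-8", xml_declaration=True)
            return True
    return False
```

Usage would be a single call after saving an edited page, e.g. `touch_lastmod("sitemap.xml", "https://www.example.com/page.html")`.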