Should I include noindex/noarchive pages in sitemap.xml?


ZydoSEO - 12:52 am on Feb 19, 2013 (gmt 0)


As with most things SEO related... it depends.

I've never been a big fan of sitemap.xml files except for large sites (typically tens of thousands of pages or more). I would rather Google infer the importance/priority of the various URLs on my site by evaluating my internal linking structure and external inbound links. They are quite good at this.

If I encounter a site with crawlability issues, I would rather fix the issues preventing the site from being crawled... instead of putting a band-aid on it with a sitemap.xml.

If I have a brand new site, getting it indexed with a sitemap.xml is pretty much worthless. Without links, it's not going to rank for anything significant beyond any initial honeymoon period. New sites with no links that get indexed are like being all dressed up with nowhere to go. Rather than wasting time creating, managing priorities for, and submitting sitemap.xml files for a new site to get it indexed... I'd rather spend that time building links, which will both get the site indexed naturally AND provide it backlinks to assist with rankings once it is indexed.

However, if you choose to use a sitemap.xml file, there are times when you might want to include in it certain URLs flagged as NOINDEX, NOFOLLOW, or NOARCHIVE using a meta robots element.

A page flagged NOINDEX can still accumulate PageRank/link juice and pass it out to the pages it links to. It simply won't be shown in the SERPs. There may be times when you want to make sure such pages are crawled, for example by giving the NOINDEX URL a high priority in the sitemap, so that the engines will discover other pages that are linked to ONLY from such NOINDEXed pages.
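
To make that concrete, here is a rough Python sketch of a sitemap.xml that gives a NOINDEXed hub page a high priority. The URLs and the build_sitemap helper are made up for illustration, and keep in mind that the priority value is only a hint that crawlers are free to ignore.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for (url, priority) pairs.

    Hypothetical helper, not part of any library.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # <priority> ranges from 0.0 to 1.0 and is only a hint.
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs: the hub page carries a meta robots NOINDEX but is the
# only page linking to some deeper pages, so it gets a high priority.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", 1.0),
    ("https://www.example.com/noindex-hub.html", 0.9),
])
print(sitemap_xml)
```

The deeper pages themselves don't need to be listed for this to work; the point is only to get the hub crawled so its links are seen.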

You may want a page flagged NOFOLLOW to still be indexed even though you don't want crawlers to follow its outbound links or pass link juice through them.

You may want a page flagged NOARCHIVE to still be indexed even though you do not want a cached copy to be maintained at Google or a history of your pages to be maintained by the Wayback Machine at archive.org.
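
If you want to check which of these flags a page actually carries before deciding whether it belongs in the sitemap, something like the following Python sketch would do. It only reads the meta robots element from HTML you hand it; the RobotsMetaParser class, the robots_directives function, and the sample page are all hypothetical, and it ignores X-Robots-Tag headers and crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated and case insensitive.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html):
    """Return the set of lowercase meta robots directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page: kept out of the SERPs, but its links are still followed.
page = '<html><head><meta name="robots" content="NOINDEX, follow"></head></html>'
flags = robots_directives(page)
print("shown in SERPs:", "noindex" not in flags)
print("links followed:", "nofollow" not in flags)
print("cached copy allowed:", "noarchive" not in flags)
```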

If there were an issue with submitting URLs containing meta robots NOINDEX or meta robots NOARCHIVE elements, you'd likely get a warning when you submitted the sitemap to Google... similar to the way they warn you when you submit URLs that redirect to other URLs. I seriously doubt that including them will ever hurt your site, and in certain situations it may be useful in getting your site crawled and/or indexed better.


Thread source: http://www.webmasterworld.com/google/4546229.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com