|HTML Sitemap: Good? Bad? Indifferent?|
Hey there everyone:
My site has an XML site map, but I have heard from a few respected sources that an HTML site map can be very helpful, especially if you have lots of categories that are quite a few clicks from the home page.
So I would like to know if there are right and wrong ways to set up your html site map:
1) Should EVERY page really be linked to by the site map?
2) Should you link TO the site map from the home page?
3) How structured should the site map be? If a web site has a gallery, a blog, an articles section, ecommerce, a directory, does the site map have to be structured so that all the gallery links appear in one section of the site map, all the blog posting links appear in another section, all the ecommerce CATEGORY links appear in one section, all the ecommerce PRODUCT links appear in another section, etc?
3A) If the answer to question 3 above is that the different links should be segregated, is it best to have them on separate pages altogether? In essence, is it best to have all ecommerce PRODUCT links on one site map page, all the ecommerce CATEGORY links on a separate page, all the gallery links on another page, etc?
4) Should the site map be MORE than just a list of links? Should each link have an accompanying detailed description of what will be found on that page?
Any other key points I am missing?
Thanks in advance.
Thought I saw a thread on this a while back but I tried using the forum search function and couldn't find it.
On many of my sites the HTML sitemap and the 404 page are one and the same. I use a tree format that uses page titles and site structure to mold the outline flow.
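For anyone wondering what a tree format built from page titles and site structure might look like in practice, here is a rough Python sketch. It is not the poster's actual setup: the `build_tree` and `render` helpers, the (path, title) input format, and the sample URLs are all assumptions made up for illustration.

```python
from html import escape


def build_tree(pages):
    """Group (url_path, title) pairs into a nested dict that follows the URL structure."""
    root = {"title": None, "url": None, "children": {}}
    for path, title in sorted(pages):
        node = root
        # Walk (or create) one tree level per path segment.
        for part in [p for p in path.strip("/").split("/") if p]:
            node = node["children"].setdefault(
                part, {"title": None, "url": None, "children": {}}
            )
        node["title"] = title
        node["url"] = path
    return root


def render(node, indent=0):
    """Return the nested <ul> outline for a node's children as a list of lines."""
    if not node["children"]:
        return []
    pad = "  " * indent
    lines = [f"{pad}<ul>"]
    for key in sorted(node["children"]):
        child = node["children"][key]
        # Fall back to the path segment when a level has no page of its own.
        label = escape(child["title"] or key)
        if child["url"]:
            label = f'<a href="{escape(child["url"], quote=True)}">{label}</a>'
        lines.append(f"{pad}  <li>{label}")
        lines.extend(render(child, indent + 2))
        lines.append(f"{pad}  </li>")
    lines.append(f"{pad}</ul>")
    return lines


# Hypothetical pages, purely for illustration.
pages = [
    ("/", "Home"),
    ("/blog/", "Blog"),
    ("/blog/first-post", "First Post"),
    ("/gallery/", "Gallery"),
]
sitemap_html = "\n".join(render(build_tree(pages)))
```

The same `sitemap_html` string could be dropped into both the sitemap page and the 404 template, which is one way the two could end up being "one and the same".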
|On many of my sites the HTML sitemap and the 404 page are one and the same... |
I am curious; wouldn't you noindex the 404 page? If so, then wouldn't the site map thus be noindexed as well?
Yes, the html sitemap is noindex/nofollow.
I do the indexing through the sitemap.xml file only.
Instead of an HTML site map, on my sites I use what I call a site index. It contains simple HTML links (with anchor text) to all the other pages. Since my sites are relatively small, it's also practical for me to add a brief description of each page next to the links. Perhaps because of this extra content, Google includes these pages in its index and even sends them a little traffic from the SERPs.
I think an HTML sitemap is always useful as a way to reach deep pages with few links. The deeper Google crawls, the fewer pages it crawls. I think the best option is to have all your pages within 3 (at most 4) links of your homepage. This may be difficult given the site structure (imagine a blog with plenty of articles, some of them old and buried), but with an HTML sitemap it may be easy to achieve.
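The "3 or 4 links from the homepage" idea is easy to check if you have a list of your internal links. Below is a minimal Python sketch using a breadth-first search. The `click_depths` and `too_deep` helpers and the sample link graph are invented for illustration, and the 3-click limit is just the rule of thumb from the post above, not a documented threshold.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, home, max_clicks=3):
    """Reachable pages that sit more than max_clicks away from home."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_clicks)


# Hypothetical blog where an old post is only reachable through pagination.
links = {
    "/": ["/blog/"],
    "/blog/": ["/blog/page2"],
    "/blog/page2": ["/blog/page3"],
    "/blog/page3": ["/blog/old-post"],
}
deep_before = too_deep(links, "/")

# Same site after adding an HTML sitemap that is linked from the home page.
links_with_sitemap = dict(links)
links_with_sitemap["/"] = ["/blog/", "/sitemap.html"]
links_with_sitemap["/sitemap.html"] = ["/blog/old-post"]
deep_after = too_deep(links_with_sitemap, "/")
```

In this made-up graph the old post drops from four clicks to two once the sitemap links to it directly.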
@aristotle: it seems much the same thing to me. Why not noindex them so Google won't send traffic to them? Maybe it is better to show your homepage for that query. Noindex will prevent those pages from showing in the results but will still let their links be crawled.
mememax - Yes I know about the noindex, follow tag. And you could be right that these pages should be noindexed. But it doesn't matter that much, and I don't want to suddenly change something that's been in place for years. So I'll probably leave them as they are.
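For reference, the two variants mentioned in this thread (noindex/nofollow and noindex with links still followed) differ only in the content of the robots meta tag. Here is a tiny Python sketch that renders the tag; the `robots_meta` helper name is made up for this example, and only the tag format itself is standard.

```python
def robots_meta(index=True, follow=True):
    """Render a robots meta tag for the <head> of a page."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'


# Keep the page out of the results but let its links be crawled.
sitemap_page_tag = robots_meta(index=False, follow=True)

# Keep the page out of the results and ask bots not to follow its links.
strict_tag = robots_meta(index=False, follow=False)
```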
In Italy we say: "Don't change the winning team", so maybe you're right. If it works, why change it?
Oddly I was just thinking of this tonight and searched for this current topic.
1. I have an oddly named site map file and was getting ready to relocate it. I was wondering if anyone is aware of a "standard" name for the file. Sitemap.htm could easily be confused with a SE submitted sitemap file.
2. I do a combination sitemap/ask-us/google custom search page. If they can't find it, or are too lazy to search the site map for it, they can search. If they STILL can't find their answer they can e-mail us through a form.
3. Where should the site map be linked from for best effect? Every page? Every 1st level category page? 1st + 2nd level? Home page only? If you link it from almost every page, to benefit users the most, it gets a lot of PR, but almost never winds up on the SEs because it is so general in nature and mostly a bunch of links.
4. IF you include a Google Adsense search field on every page does it leak a lot of PR back to G? Instead should you have a "search" link from each page to a single, central, on-domain search page which is the only single place the Google Search field is located?
The XML version of your sitemap can be pointed out to search engine bots in your robots.txt file with a Sitemap: line that gives the full URL of the sitemap file.
To me you could call the XML file anything as long as the name matches in the robots.txt file. This is also a way to help the SEs reach deep pages with few links.
My preference for an HTML sitemap is to create it mostly for the user - and therefore to link it visibly from a "utility links" area somewhere in the template for every page. Now that xml sitemaps have become standard, I don't think it's important (or even good) to link to every URL in the site. Rather, I like to show the main structure in some sort of visual way - almost like a hot-linked wireframe.
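As a rough illustration of the robots.txt approach, here is a small Python sketch that writes out such a file. The `robots_txt` helper and the example.com URL are placeholders invented for this example; the Sitemap: line is part of the sitemaps protocol and should carry the full URL of the file, whatever the file is named.

```python
def robots_txt(sitemap_urls, disallow=()):
    """Build robots.txt text that allows crawling and announces the XML sitemap(s)."""
    lines = ["User-agent: *"]
    if disallow:
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    lines.append("")
    # One Sitemap line per file; the file name itself is arbitrary.
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"


# Placeholder domain and file name, purely for illustration.
text = robots_txt(["https://www.example.com/my-oddly-named-map.xml"])
```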
This gives the visitor a visual sense of how the site is organized that can be quite informative in a way that the main navigation may not be able to do. It can even show a different organization - a differently "faceted" way of organizing the information architecture - for instance, content by application rather than content by product name.
An html sitemap of this nature also has an SEO purpose, in that it gives search engines another signal as to what the priorities of the various pages are - another reason not to link to every page. And yes, I do like to include a short phrase or sentence to clarify where each link goes. Often I like linked anchor text rather than displaying the actual URL.
|It can even show a different organization - a differently "faceted" way of organizing the information architecture - for instance, content by application rather than content by product name. |
An html sitemap of this nature also has an SEO purpose, in that it gives search engines another signal as to what the priorities of the various pages are - another reason not to link to every page.
Thank you, tedster.
I think those are both REALLY important points.
I'm sorry, let me rephrase the question. Should the user-friendly, hand-coded site-map.htm page, which mostly shows only the top level navigation pages (as opposed to the ugly, auto-generated, automatically submitted to G, search-engine-targeted sitemap.xml.gz file containing every page on the site, which we don't link at all because it is not that useful to users), be LINKED FROM EVERY other page on the site, or just, say, from the top one or two levels of navigation pages which it links out to?
In the former instance (linked FROM EVERY page) it is definitely more convenient for the reader to find and get to, but you are sending a LOT of link PR to one central sitemap page which yes, will redistribute some of it back to the top level site pages it links, but in and of itself will tend to get indexed and rank very highly for what? The oh so useful and niche-applicable term "site map" which you probably anchor-text link it with from each page for clarity :)? Even if you link it as "Brown Widget Website Site Map" that dilutes your anchor text (in highly competitive niches) to the point of mostly being found only for the term "Brown Widget Website Site Map" which no one is going to search for.
Whether you should link to it from every page depends on what you're trying to accomplish with it. If it's mainly for users, then you should probably link to it from every page, except perhaps from pages that don't get much traffic. If it's mainly for SEO purposes, you can use it to redistribute additional PR to the pages that you want to promote. If I understand Tedster's example correctly, it's a compromise that combines both purposes.
I don't think it matters much whether the site map page gets indexed by Google, since it probably won't get but a trickle of traffic even if it is.
|but you are sending a LOT of link PR |
That's not the way I see Google actually working these days. If that link isn't in a content area of the template, then there's not the same big batch of PR being sent that you might assume, based on the original PR patent. And yes, whatever is sent does circulate back through the main pages. Really not a problem, from what I've seen.
100% indifferent. I removed one from a major site to relieve a small amount of load that was being caused by having to keep it up to date on the fly, and the search engines did not miss a beat or change rankings in any way.
Search no longer relies on them and so it's safe to retire them although, in most cases, leaving them active does no harm at all.
I know that might not be a popular answer, some swear that they have their benefits, but I am unable to replicate any benefit by adding one or observe any side effect of removing one so...
>>That's not the way I see Google actually working these days. If that link isn't in a content area of the template,<<
Hmm, so you're of the opinion that they are demoting internal navigational links/anchors somewhat? WOULD make sense.
I would just think that at some point, it would tend to compete with your home page links if you placed it anywhere near the top of the page.
|they are demoting internal navigational links/anchors somewhat? |
Am not speaking for Ted, but my take on that is that Google is flowing PageRank at different rates not just to internal links, but to external links as well. The idea is that some links are more likely to get clicked than others. Footer links less likely. Links in small text less likely. etc. Links more likely to get clicked are assigned more weight. Goes back to the original paper where the surfer gets bored and clicks out. Only in this version it resembles a horde of surfers and the links clicked are the averages in certain areas/phrases and possibly contexts that get clicked the most.
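To make the "links more likely to get clicked are assigned more weight" idea concrete, here is a toy Python sketch of a PageRank-style calculation where each link carries a weight. To be clear, this is not Google's algorithm: the `weighted_pagerank` function, the 0.85 damping value, and the weights (1.0 for a content link, 0.2 for a footer link) are all assumptions picked only to show the effect.

```python
def weighted_pagerank(links, damping=0.85, iterations=50):
    """Toy power iteration where each page splits its rank in proportion to link weights.

    links maps page -> {target: weight}; the weights are illustrative click likelihoods.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets the "bored surfer" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            out = links.get(page, {})
            total = sum(out.values())
            if total == 0:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
            else:
                for target, weight in out.items():
                    new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Hypothetical site: a content link to a product page, a footer link to the sitemap.
links = {
    "home": {"product": 1.0, "sitemap": 0.2},
    "product": {"home": 1.0},
    "sitemap": {"home": 1.0},
}
ranks = weighted_pagerank(links)
```

With these made-up weights the footer-linked sitemap page ends up with noticeably less rank than the content-linked product page, and whatever it does receive flows back to the home page, which is roughly the behavior described in the posts above.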
When was the last time I clicked a site map link outside of an SEO situation? Can't remember.