ergophobe - 7:30 pm on Nov 25, 2012 (gmt 0)
Let's start from the beginning. Why do you want to noindex your taxonomy?
Let's say there are 10 spots on the first page of the search results and you dominate. You have all 10. But seven are taxonomy pages and three are node pages, and page 2 is all your competitors - they take spots 11-20. So you noindex your taxonomy pages, and now the first page is your three node pages and seven pages from your competitors. It may not be in your best interest.
It's easy to set rules for robots meta tags on taxonomy pages in Drupal:
- d6: [drupal.org...]
- d7: [drupal.org...]
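For reference, what those modules end up doing is putting a robots meta tag in the page head. Here is a minimal Python sketch (standard library only) for checking whether a page you've fetched actually carries a noindex directive. The class and function names are my own, not part of Drupal or any module, and it only looks at the meta tag, not at an X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical sample markup, not taken from a real site.
sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(sample))
```

Run it against the HTML of a taxonomy page before and after changing the setting to confirm the tag is really there.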
But is it wise?
I see some SEOs recommend using NOINDEX on taxonomy/category/archive pages, as in your case. I think the issue commonly arises because you're duplicating content. In WordPress it's common for the entire post to show in multiple archives. In Drupal it's more common for it to be a snippet, but that snippet by default will be the first part of the page (500 characters I believe).
I think a better solution is to create unique summary snippets so that your taxonomy/archive content is unique, rather than duplicate. Typically, if the page is strong, it will outrank the snippet, but the unique snippet gives you a second run at a different set of terms and will bring in long-tail searches you wouldn't get at all (again, provided the content on the archive page is not duplicate). The title can be an issue, but in this context, the title of the page (h1 and meta) is the title of the taxonomy page, while the title of the snippet should be an h2 or not a title at all, and there will be no meta title.
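If you want to find out which of your snippets are just the opening of the full page, a rough check like the one below would do it. This is only a sketch: the function names are made up, it works on plain text you've already extracted, and it is not how Drupal itself trims teasers.

```python
def normalize(text):
    """Collapse whitespace and lowercase so formatting differences are ignored."""
    return " ".join(text.split()).lower()


def teaser_duplicates_body(teaser, body):
    """Return True when the teaser is just the opening of the body text."""
    # Drop a trailing ellipsis or period that trimming may have added.
    trimmed = normalize(teaser).rstrip(".\u2026 ")
    return bool(trimmed) and normalize(body).startswith(trimmed)


# Hypothetical example text.
body = "Drupal taxonomy pages list teasers.  Each teaser links to a node."
print(teaser_duplicates_body("Drupal taxonomy pages list teasers...", body))
print(teaser_duplicates_body("A hand-written summary of the node.", body))
```

Anything that comes back True is a candidate for a hand-written summary.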
But let's take a step back...
Allow me to digress for a second, but just bear with me. Say I have two problems:
1. A watch that doesn't keep time.
2. A piano that's out of tune.
I decide to fix the watch by taking a hammer to it, cracking it open, and setting the hands so they read the correct time whenever I need the watch to be right. It works for a second at a time, but I haven't really fixed the root problem.
I fix the piano by pulling out a tuning fork and a tuning wrench and gently tuning the strings until my Steinway sings.
Now, back to your site...
1. It's a watch problem.
To me, the noindex solution is the hammer solution - forcing the site to the "right" settings. By approaching it using noindex you are hiding the problem rather than fixing it, and you are removing an important diagnostic for understanding your site. If you noindex your taxonomy pages, those pages will stop cluttering up the SERPs, true, but they will still be draining off your link juice. However, instead of using that link juice on some low-priority pages on your site (i.e. your taxonomy pages), you will be throwing it away entirely - sending it to pages that have no chance of showing in the SERPs at all because you've specifically told the SEs not to index them.
You will also no longer be able to see what the true strongest pages on your site are. And finally, you'll be giving up slots in the SERPs.
To me, where noindex makes sense is when you have content that just shouldn't show in the index at all, such as checkout pages, shopping cart pages, things like that.
2. It's a piano tuning problem.
If we look at it like the piano problem, the SERPs are your tuning fork. They are telling you the architecture (organization, link flow, etc) is out of tune on your site. Your internal linking, titles, and so forth are your wrenches. By your definition/goal, an "in tune" site would have your "leaf" pages (i.e. nodes in Drupalspeak) outrank your "branch" pages (taxonomy).
So I think it's better to think in terms of "What actions do I need to take to get my 'leaf' pages to outrank my 'branch' pages?" In other words, "How do I bring my site back in tune?"
The things to look at could be:
- internal link flow, in particular in your main navigation and on your front page.
- titles, both meta and h1
- inbound links (though normally that will prefer your home page and your leaf pages and rarely your branch pages, so that's not likely part of the problem).
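To use the SERPs as that tuning fork, you could take the ranked URLs for a query and see which branch pages sit above your first leaf page. A minimal sketch, assuming Drupal's default unaliased paths (taxonomy/term/N and node/N); if you use path aliases you'd have to swap in your own patterns. The function names and example URLs are hypothetical.

```python
from urllib.parse import urlparse


def classify(url):
    """Label a URL as 'branch' (taxonomy), 'leaf' (node), or 'other'.

    Assumes default Drupal paths; adjust the prefixes for aliased URLs.
    """
    path = urlparse(url).path.lstrip("/")
    if path.startswith("taxonomy/term/"):
        return "branch"
    if path.startswith("node/"):
        return "leaf"
    return "other"


def branches_outranking_leaves(ranked_urls):
    """Return the branch URLs that appear before the first leaf URL."""
    ahead = []
    for url in ranked_urls:
        kind = classify(url)
        if kind == "leaf":
            break
        if kind == "branch":
            ahead.append(url)
    return ahead


# Hypothetical ranking for one query, best position first.
results = [
    "http://example.com/taxonomy/term/3",
    "http://example.com/node/12",
    "http://example.com/taxonomy/term/5",
]
print(branches_outranking_leaves(results))
```

An empty list for your important queries means the site is "in tune" by the definition above. A long list tells you where to start adjusting links and titles.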
There are some cases where noindex makes sense:
- Matt Cutts says you could do a noindex on an ugly site map because you never want this page to show (again, I think it would make more sense to leave it indexed to have it available as a diagnostic, but I'm just ergophobe and he's Matt Cutts) - [youtu.be...]
- Matt Cutts and others say you can use it to keep/get thin content out of the index if that content is making you look spammy. That makes more sense to me - these are low-value pages that aren't going to rank anyway, and they're making your site look bad. -- [webmasterworld.com...]
- Matt Cutts also gives the example of parked domains, which to me really makes sense for noindex -- [mattcutts.com...] (old article, so beware of some of the content).