| 6:17 am on Nov 24, 2012 (gmt 0)|
You might find it easier to apply the X-Robots-Tag header to the specific URLs you don't want indexed. Note that Google does not support a Noindex: directive in robots.txt.
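For what it's worth, one way to send that header for a whole URL pattern is in the Apache server config (this assumes mod_headers is enabled, and the /taxonomy/ path is just an example - substitute your own):

```apache
# Send noindex header for all taxonomy listing URLs
# (keeps "follow" so link juice still flows through)
<LocationMatch "^/taxonomy/">
    Header set X-Robots-Tag "noindex, follow"
</LocationMatch>
```

You can verify it with curl -I against one of those URLs and look for the X-Robots-Tag line in the response headers.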
| 6:29 am on Nov 24, 2012 (gmt 0)|
That answers an important question.
| 7:30 pm on Nov 25, 2012 (gmt 0)|
Let's start from the beginning. Why do you want to noindex your taxonomy?
Let's say there are 10 spots on the first page of the search results and you dominate - you have all 10. But seven of them are taxonomy pages and three are node pages, and page 2 is all your competitors - they take spots 11-20. So you noindex your taxonomy pages, and now the front page is your three node pages plus seven pages from your competitors. It may not be in your best interest.
It's easy to set rules for robots meta tags on taxonomy pages in Drupal:
- d6: [drupal.org...]
- d7: [drupal.org...]
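For reference, what those modules boil down to is a standard robots meta tag in the head of each taxonomy page (the exact content value depends on how you configure them - this is just the typical form):

```html
<meta name="robots" content="noindex, follow" />
```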
But is it wise?
I see some SEOs recommend using NOINDEX on taxonomy/category/archive pages as in your case. I think the issue commonly arises because you're duplicating content. In WordPress it's common for the entire post to show in multiple archives. In Drupal it's more common for it to be a snippet, but that snippet by default will be the first part of the page (500 characters, I believe).
I think a better solution is to create unique summary snippets so that your taxonomy/archive content is unique rather than duplicate. Typically, if the page is strong, it will outrank the snippet, but the unique snippet gives you a second run at a different set of terms and will bring in long-tail searches you wouldn't get at all otherwise (again, provided the content on the archive page is not duplicate). The title can be an issue, but in this context the title of the page (h1 and meta) is the title of the taxonomy page, while the title of the snippet should be an h2 or not a heading at all, and the snippet will have no meta title.
But let's take a step back...
Allow me to digress for a second, but just bear with me. Say I have two problems:
1. A watch that doesn't keep time.
2. A piano that's out of tune.
I decide to fix the watch by taking a hammer to it, cracking it open, and setting the hands so they read the correct time whenever I need the watch to be right. It works for a second at a time, but I haven't really fixed the root problem.
I fix the piano by pulling out a tuning fork and a tuning wrench and gently tuning the strings until my Steinway sings.
Now, back to your site...
1. It's a watch problem.
To me, the noindex solution is the hammer solution - forcing the site to the "right" settings. By approaching it using noindex you are hiding the problem rather than fixing it, and you are removing an important diagnostic for understanding your site. If you noindex your taxonomy pages, those pages will stop cluttering up the SERPs, true, but they will still be draining off your link juice. However, instead of spending that link juice on some low-priority pages on your site (i.e. your taxonomy pages), you will be throwing it away entirely - sending it to pages that have no chance of showing in the SERPs at all, because you've specifically told the SEs not to index them.
You will also no longer be able to see what the true strongest pages on your site are. And finally, you'll be giving up slots in the SERPs.
To me, where noindex makes sense is when you have content that just shouldn't show in the index at all, such as checkout pages, shopping cart pages, things like that.
2. It's a piano tuning problem.
If we look at it like the piano problem, the SERPs are your tuning fork. They are telling you the architecture (organization, link flow, etc) is out of tune on your site. Your internal linking, titles, and so forth are your wrenches. By your definition/goal, an "in tune" site would have your "leaf" pages (i.e. nodes in Drupalspeak) outrank your "branch" pages (taxonomy).
So I think it's better to think in terms of "What actions do I need to take to get my 'leaf' pages to outrank my 'branch' pages?" In other words, "How do I bring my site back in tune?"
This could be:
- internal link flow, in particular in your main navigation and on your front page.
- titles, both meta and h1
- inbound links (though normally that will prefer your home page and your leaf pages and rarely your branch pages, so that's not likely part of the problem).
There are some other cases where noindex makes sense:
- Matt Cutts says you could do a noindex on an ugly site map because you never want this page to show (again, I think it would make more sense to leave it indexed to have it available as a diagnostic, but I'm just ergophobe and he's Matt Cutts) - [youtu.be...]
- Matt Cutts and others say you can use it to keep/get thin content out of the index if that content is making you look spammy. That makes more sense to me - these are low-value pages that aren't going to rank anyway, and they're making your site look bad. -- [webmasterworld.com...]
- Matt Cutts also gives the example of parked domains, which to me really makes sense for noindex -- [mattcutts.com...] (old article, so beware of some of the content).
| 4:53 am on Nov 26, 2012 (gmt 0)|
|It may not be in your best interest. |
Good point. I was talking to the wife last night about that - at least they are coming to our site. This gives us potential for a click.
| 5:01 am on Nov 26, 2012 (gmt 0)|
BTW, thanks for the YouTube links and post. You sealed the deal for me. We will continue to work on the leaves.
| 7:03 pm on Nov 26, 2012 (gmt 0)|
It would be great if you check back in with a new thread when you figure it out (or don't) just to see how those ideas are working for you.
| 5:06 am on Dec 29, 2012 (gmt 0)|
We have been deleting junk articles from our site. I wish I could find a way to see the links to our site's individual pages. In Webmaster Tools we see tons of links, but the link: operator doesn't seem to work in Google Search. It doesn't provide any results except for our main domain (no individual pages).
| 5:25 am on Dec 29, 2012 (gmt 0)|
there are several tool sets available for discovering inbound links.
you can find these discussed in the Link Development forum at WebmasterWorld:
this thread discusses the 2 major providers of this service.
Majestic SEO / Open Site Explorer comparison by Eric Enge:
in my experience majestic provides the most useful and comprehensive data.
i've heard good things recently about ahrefs but haven't had a chance to try it yet.
| 3:23 am on Dec 30, 2012 (gmt 0)|
I have used Majestic. Tedster recommended it a while back. But I noticed they only show a total of 500 of my pages with backlinks. Analytics gives me 1000.
I see some other useful tools there, so thanks for the reminder.
I am going to try ahrefs now. :)
| 7:45 am on Dec 30, 2012 (gmt 0)|
You know what I like about Majestic and ahrefs?
They are easily searchable, unlike Analytics. I think they will both help in ways that Analytics doesn't.
| 9:24 am on Dec 30, 2012 (gmt 0)|
OK, I like Majestic the best. Ahrefs limited me to 10 queries.
| 10:29 am on Dec 30, 2012 (gmt 0)|
|I have used Majestic. Tedster recommended it a while back. But I noticed they only provide a total of 500 (my pages) with backlinks. |
i'm pretty sure you can see them all if you register for a free account.
| 10:21 am on Dec 31, 2012 (gmt 0)|
I did register, but they still truncate it at 500. BUT, I have been using their tool on pages one at a time. I can do a search with Majestic, unlike Google Analytics.
| 12:15 pm on Dec 31, 2012 (gmt 0)|
BTW phranque, I want to tell you what this is all about for me. Our main site (the one that paid all of our bills) got hit by Panda. It had over 25K articles. We purged tons of articles that weren't getting hits, using a list we compiled from Analytics. The spreadsheet included the URL of each article, plus how many times the article was viewed. We also included a list of those articles that Analytics said had links pointing to them.
I collected the data three times over the last two years, adding to the old list. So it is a huge list now.
We deleted all articles that didn't get any traffic. Now we are purging more. I expect to get from 14K now to about 7K when done.
The Majestic tool you recommended is a really good one. It tells me how many sites link to a specific page, PLUS it gives more information about the sites linking to us. Great info! I don't want to delete a page that has some good quality links, even if it didn't get much traffic.
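That purge logic - delete only pages with no traffic AND no decent backlinks - is easy to script once you've merged the Analytics and Majestic exports. Here's a minimal sketch; the column names ("url", "pageviews", "backlinks") and sample data are my assumptions, not from your actual spreadsheet:

```python
# Sketch: pick purge candidates from a merged traffic/backlink list.
# A page is safe to delete only if it has no traffic AND no backlinks.
def purge_candidates(rows, min_views=1, min_links=1):
    """Return URLs that fall below both thresholds."""
    return [
        r["url"]
        for r in rows
        if r["pageviews"] < min_views and r["backlinks"] < min_links
    ]

# Hypothetical merged export (one dict per article)
articles = [
    {"url": "/node/1", "pageviews": 120, "backlinks": 3},  # keep: traffic
    {"url": "/node/2", "pageviews": 0,   "backlinks": 2},  # keep: has links
    {"url": "/node/3", "pageviews": 0,   "backlinks": 0},  # purge
]

print(purge_candidates(articles))  # → ['/node/3']
```

Raising min_links lets you also drop pages whose only links are junk, if you trust the link-quality data.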