Another factor you should consider before removing a page is how much traffic it brings in.
Also, if a page has any backlinks from other websites, it might be best to re-direct it rather than delete it.
Speaking for myself, I don't see a lot of reasons to remove a page IF that page serves a purpose for the site (in other words, is not simply a bridge page). I would noindex it; remove it from sitemap.xml; and even block its indexing in robots.txt. Those 3 steps should make it clear to Google that they can leave it alone, and should alleviate any concerns on your part that the page(s) will hurt you with Panda.
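For anyone who wants to confirm what a robots.txt rule actually blocks before relying on it, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `/thin/` directory, the example URLs and the `is_crawlable` helper are all made up for illustration; substitute your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the /thin/ directory is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /thin/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example URLs are placeholders, not real pages.
blocked = is_crawlable(ROBOTS_TXT, "https://www.example.com/thin/page.html")
allowed = is_crawlable(ROBOTS_TXT, "https://www.example.com/articles/page.html")
```

One thing to keep in mind: a crawler that is blocked by robots.txt never fetches the page, so it cannot read a noindex tag placed on it. If noindex is the signal you care most about, it may be worth testing it on its own before adding the robots.txt block.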
I think I also need to go into analytics and see which pages are not gaining any traffic, look for high bounce rates, and evaluate from there.
We are doing the same; come the next Panda update, we will see what happens.
|I would noindex it; remove it from sitemap.xml; and even block its indexing in robots.txt |
Our test: is the page useful for our visitors? Yes, as reported in web statistics. Is it thin, or are its details found elsewhere, i.e. is it a waste of time having it in the Google index? Yes. If the answer to both is yes, remove it from Google but keep the page.
If we have thin pages whose content is not found elsewhere, we keep them in the index.
We will see...
The process you've described is a good one; however, if I were you I would try to create 2 groups of pages.
- Low quality pages in the eyes of Google: you can identify those pages as you said before, then divide them into:
  - the ones which give you traffic (maybe they're landing pages for a referral link): don't delete these, just be sure they're removed from the search engines with noindex and/or robots exclusion
  - the ones which neither add value nor receive traffic: you can delete these, redirecting the ones which have inbound links
- Low quality pages in the eyes of your users: have a look at those pages which receive traffic but have a high bounce rate and decide on the best strategy: URL removal, improving their content, or merging their content with another page.
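The two groups above could be sketched as a small triage function. This is only an illustration: the `PageStats` fields, the thresholds (`min_visits`, `high_bounce`) and the `thin_content` flag are assumptions you would replace with your own analytics data and judgement.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    monthly_visits: int
    bounce_rate: float   # 0.0 to 1.0
    inbound_links: int   # links from other websites
    thin_content: bool   # your own judgement of the page, not a Google signal


def triage(page: PageStats, min_visits: int = 10, high_bounce: float = 0.8) -> str:
    """Suggest an action for a page; thresholds are arbitrary placeholders."""
    has_traffic = page.monthly_visits >= min_visits
    if page.thin_content:
        if has_traffic:
            # Useful to visitors, so keep it but take it out of the index.
            return "keep but noindex"
        if page.inbound_links > 0:
            # No value and no traffic, but the links are worth preserving.
            return "delete and redirect"
        return "delete"
    if has_traffic and page.bounce_rate >= high_bounce:
        # Users arrive but leave: improve, merge or remove after a manual look.
        return "review: improve, merge or remove"
    return "keep"
```

For example, `triage(PageStats("/old-promo", 0, 0.0, 3, True))` would suggest deleting the page and redirecting it, because it has inbound links but no traffic.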
If you have the historical records from whatever analytics program you use, look first for those pages that lost the most Google search traffic when Panda first had its impact on your site.
From what I've seen it is most likely these pages that got the low Panda scores. Then that low score spread as a ranking factor from these "seed pages" to some degree throughout the other URLs of your site. Ideally, you would improve these seed pages - take them beyond whatever "shallowness" you can clearly see. When that seems impractical, then remove them. Or if that also seems impractical, then noindex them.
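If your analytics program can export Google search visits per URL, finding the candidate "seed pages" could look something like the sketch below. It assumes you have two exports covering equal-length periods before and after the update, loaded as plain dictionaries; the function name and the sample URLs are invented for the example.

```python
def biggest_losers(before: dict, after: dict, top_n: int = 10) -> list:
    """Rank URLs by Google search visits lost between two equal-length periods.

    before/after map URL -> visits. A URL missing from `after` counts as zero.
    """
    losses = []
    for url, visits_before in before.items():
        lost = visits_before - after.get(url, 0)
        if lost > 0:
            losses.append((url, lost))
    # Largest absolute loss first.
    losses.sort(key=lambda item: item[1], reverse=True)
    return losses[:top_n]


# Made-up numbers purely for illustration.
before_panda = {"/widgets-blue": 900, "/widgets-red": 400, "/about": 50}
after_panda = {"/widgets-blue": 200, "/widgets-red": 390, "/about": 60}
worst = biggest_losers(before_panda, after_panda, top_n=2)
```

Absolute losses favour pages that were big to begin with; you may prefer to rank by percentage lost instead, depending on how your traffic is distributed.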
Can I just clarify something... I guess Panda has spelt out the need for quality pages... but what if you have a 5 year old site with 3,000 unique, original content pages (I don't, but hypothetically speaking)... not every one of those pages will gain significant traffic or user views... and some will be buried deep in your category... perhaps relating to a specific event some years ago.
Does that mean you would have to noindex/amend robots.txt for all those pages which do not achieve much traffic (even though they are quality articles with original analysis)?
Just because Google deems them 'low quality'?
No - from what I see, pages such as you described are not hurting the websites that publish them. You only need to address those pages that LOST traffic on a Panda update, to the degree that you can. This often includes a variety of things - pages created merely to address subtle variations in keywords, for instance.
Google has given us quite a bit of input as to the kind of "content" that they don't want to rank well. Even if your pages aren't currently being devalued by Panda, it's still wise to future-proof your site by paying attention.
|...but what if you have a 5 year old site with 3,000 unique, original content pages... not every one of those pages will gain significant traffic or user views... |
Maybe moving those particular pages - if they have a common theme - to a NEW site that is more closely aligned with that theme will help out in terms of generating traffic and improving metrics for those pages.
That's not so much a Panda consideration, but more of a traffic / monetization consideration.