|Best way to get rid of thousands of pages|
I have thousands of pages that are indexed in Google, but are low quality pages. I want to get rid of them, primarily as a benefit to the users, as these pages don't have as much information as I had hoped.
What I want to do is 301 all of these pages to their parent page on the same site, but higher up in the tree.
Should I 301 all of them, and in about 6 months delete the files?
Should I do a robots.txt block on them, and then 301 them once they have fallen out of the index?
Should I just let them go 404, and fix the links so no user can see them?
I think a 301 would be the best, but with thousands of 301s going up overnight, will that hurt me with Google?
What do you guys think?
Thanks in advance.
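For reference, the redirect-to-parent idea in the opening post can be sketched as a lookup table from each low quality URL to the page one level up. This is only an illustration: the function names and the example.com URLs are made up, and it assumes the "parent page" is simply the containing directory.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def parent_url(url: str) -> str:
    """Return the URL one level up in the path tree (hypothetical helper)."""
    parts = urlsplit(url)
    # Drop a trailing slash so "/widgets/" is treated like "/widgets".
    path = parts.path.rstrip("/")
    parent = posixpath.dirname(path)
    # Directory-style URLs end in a slash; the site root is "/".
    if not parent.endswith("/"):
        parent += "/"
    # Query string and fragment are dropped on purpose.
    return urlunsplit((parts.scheme, parts.netloc, parent, "", ""))


def build_redirect_map(low_quality_urls):
    """Map every low quality URL to the parent page it would 301 to."""
    return {url: parent_url(url) for url in low_quality_urls}


if __name__ == "__main__":
    for old, new in build_redirect_map(
        ["http://example.com/widgets/blue/item-17.html"]
    ).items():
        print(old, "->", new)
```

The resulting map could then feed whatever redirect mechanism the server uses; that part is not shown here.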
As you can see in another thread [webmasterworld.com], I'm not a fan of using thousands of 301 redirects. My choice would be to take the low quality pages down, remove the links to them, and allow Google to get a 404.
I would just delete them too. If people want to see the content they can use Google cache, and once they no longer turn up in the results then Google will have eliminated them entirely. I have deleted a few pages lately and you get to see in WMT how quickly Google sorts it out.
But isn't there a penalty also for a lot of 404s?
You guys don't think it's a good idea to block them in robots.txt first?
Why would there be a penalty for "a lot of 404s?" Every so often we get a taste of that particular bit of "SEO mythology" here, and I sure wonder where it got started.
Maybe if your site has a lot of BAD INTERNAL LINKS, maybe that could hurt you a little bit. And even in that case, it would be the bad links that are hurting you, and not the fact that the server gives an accurate 404 response.
Every possible combination of characters except for your actual files is a 404, or at least it should be. If there really were a penalty for "too many 404s", imagine how easy it would be to take out a competitor just by aiming a bunch of bad links at their domain from another domain.
I understand about the competitor sending a bunch of bad links over, but that makes me wonder if Google might have different classifications of a 404.
|404 Example #1 |
A site had a page, page is now gone.
I would suspect this would hurt your site's "credibility". Would a credible site all of a sudden have thousands of missing pages?
|404 Example #2 |
An external link is pointing at a page that does not exist on the server; Google never visited this page in the past.
Would not hurt your site's credibility.
|404 Example #3 |
An external link is pointing to a page that used to exist, but does not anymore.
Same as Example #1, except here, Google thinks this 404 had some credibility from other sites, and now it's missing.
Does this make sense? This is just my thinking... curious about your thoughts. Thanks.
|Would a credible site all of a sudden have thousands of missing pages? |
If you're removing "low quality" pages - as you said you were - that can only help your credibility, I'd say. If you know they're low quality, you can be pretty sure Google thinks so, too.
Not just a 404, but a quality page that gives the visitor some positive choices.
Few sites NEED to say "404 hard luck mate"; most sites can have a purpose-designed page (or even a copy of the index page) that helps the lost reader to find what they want.
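A rough sketch of what such a page might look like in practice: the response still carries a 404 status, but the body offers the visitor somewhere to go. It is written as a bare WSGI app; the page contents, link list, and names (KNOWN_PAGES, HELPFUL_LINKS, app) are all invented for the example.

```python
from html import escape

# Hypothetical site content, for illustration only.
KNOWN_PAGES = {"/": "<h1>Home</h1>"}
HELPFUL_LINKS = [("/", "Home"), ("/sitemap.html", "Site map")]


def not_found_body(path: str) -> str:
    """Build a friendly error page listing places the visitor can go next."""
    links = "".join(
        '<li><a href="{}">{}</a></li>'.format(escape(href), escape(label))
        for href, label in HELPFUL_LINKS
    )
    return (
        "<h1>Sorry, that page is no longer here</h1>"
        "<p>We could not find {}. Try one of these instead:</p>"
        "<ul>{}</ul>".format(escape(path), links)
    )


def app(environ, start_response):
    """Serve known pages with 200; everything else gets a helpful 404."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PAGES:
        status, body = "200 OK", KNOWN_PAGES[path]
    else:
        # The status line stays 404 even though the body is useful.
        status, body = "404 Not Found", not_found_body(path)
    data = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(data))),
    ])
    return [data]
```

The point of the sketch is the separation: the status code tells crawlers the page is gone, while the body helps the human.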
If the pages were as poor as you say, hopefully few visitors will find them, with newer, better pages taking priority.
I see no reason for Google to worry about 404s, though if you delete pages that had a fair number of inbound links or visitors, then I'd 301 such pages to newer equivalents.
Might be a good time to run Xenu, too - I suspect the myth about 404s arose out of 404s following broken links. That *could* be a problem. But 404s and good internal navigation shouldn't be an issue. I'm sure Google worries much more about functionality than credibility.
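To make the broken-internal-link check concrete, here is a small sketch of the kind of scan a link checker performs. It is not how Xenu itself works; it assumes you already have the HTML of every live page in a dictionary, and the names (LinkCollector, broken_internal_links) and example.com URLs are hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, List
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_internal_links(pages: Dict[str, str], site_root: str) -> Dict[str, List[str]]:
    """pages maps each live URL to its HTML.

    Returns, per page, the internal link targets that are not live pages.
    """
    host = urlsplit(site_root).netloc
    broken: Dict[str, List[str]] = {}
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links and ignore any #fragment.
            target = urljoin(page_url, href).split("#")[0]
            if urlsplit(target).netloc != host:
                continue  # external link (or mailto:), not our concern here
            if target not in pages:
                broken.setdefault(page_url, []).append(target)
    return broken
```

Any page that shows up in the result still links to something that will 404, which is the situation worth fixing before deleting pages in bulk.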
We recently upgraded our forum software, which eliminated all of the old forum URLs. At the same time, we used 301s to redirect the URLs that had at least 10 views per month to the new forum URLs and let the others, about 1,500 of them, go 404.
Within 24 hours Google had noticed all of the missing pages, showed them as 404 in WMT and started spidering the heck out of our site.
The bottom line is that Google, MSN and Yahoo all started to furiously spider our content after the change, and many new URLs, including other static pages that had never ranked, started to appear in all 3 engines. So, in our case, having many pages go away helped our rankings in the big 3.
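The split described in this post (redirect the old URLs that still get traffic, let the rest 404) can be sketched roughly as follows. The threshold of 10 views per month comes from the post; the function name, parameter names, and sample paths are made up, and the sketch assumes you have a view count per old URL plus a table of new locations.

```python
from typing import Dict, List, Tuple


def plan_migration(
    monthly_views: Dict[str, int],
    new_url_for: Dict[str, str],
    min_views: int = 10,
) -> Tuple[Dict[str, str], List[str]]:
    """Split old URLs into 301 redirects and URLs left to return 404."""
    redirects: Dict[str, str] = {}
    gone: List[str] = []
    for old_url, views in monthly_views.items():
        target = new_url_for.get(old_url)
        # Redirect only if the page still gets traffic AND has a new home.
        if views >= min_views and target is not None:
            redirects[old_url] = target
        else:
            gone.append(old_url)
    return redirects, sorted(gone)


if __name__ == "__main__":
    views = {"/forum/old1": 25, "/forum/old2": 3}
    targets = {"/forum/old1": "/community/t1"}
    print(plan_migration(views, targets))
```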
|Would a credible site all of a sudden have thousands of missing pages? |
Maybe you've got a clothing e-commerce site, and you just cleared out your summer fashions.
Maybe you've got a travel agency and you're deleting the 2008 tour descriptions, cruise itineraries, etc. to make way for the 2009 offerings.
Maybe you run a university site, and you've replaced all of the 2007-2008 course descriptions with new pages for 2008-2009.
Or maybe you just redesigned your site and cleared out a lot of clutter.
In short, there could be any number of good reasons why a "quality site" might have major (and sudden) page turnover. And by using a 404 instead of a redirect to the home page, you're simply sending a signal that the pages are gone--you aren't doing anything that looks even remotely deceptive.
They are not even 'missing pages', so long as you have corrected internal navigation. They are simply 'the past'.
Leave everything alone, except to add a noindex, follow meta tag to each page. This will lead to them dropping out of the index without 404s. After they all drop out, make sure you don't have any links to the old URLs, and do one 301 to catch any stray traffic from the rest of the web.
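For anyone who has to apply that tag to thousands of static files, a minimal sketch of the insertion step might look like this. It assumes each page has an ordinary <head> element and uses a simple regular expression rather than a real HTML parser; the function name is hypothetical.

```python
import re

ROBOTS_META = '<meta name="robots" content="noindex, follow">'

# Matches <head> or <head ...attributes>, but not <header>.
HEAD_OPEN = re.compile(r"(<head(?:\s[^>]*)?>)", re.IGNORECASE)
HAS_ROBOTS = re.compile(r"<meta\s+name=[\"']robots[\"']", re.IGNORECASE)


def add_noindex_follow(html: str) -> str:
    """Insert the robots meta tag right after the opening <head> tag."""
    if HAS_ROBOTS.search(html):
        return html  # leave pages that already carry a robots meta tag alone
    return HEAD_OPEN.sub(lambda m: m.group(1) + "\n" + ROBOTS_META, html, count=1)
```

Pages without a <head> element come back unchanged, so it would be worth checking the output before uploading anything.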
So, if you have a site with 500 pages with a vast majority being product detail pages and you were to do a complete redesign, including the url structure, would you say it's best to 301 your category pages, contact us, informational pages, etc. and let the product detail page fall into a 404?
I have a homegrown site and it no longer offers the functionality I need, so I'm switching to an off-the-shelf e-commerce software package with all of the bells and whistles.
|So, if you have a site with 500 pages with a vast majority being product detail pages and you were to do a complete redesign, including the url structure, would you say it's best to 301 your category pages, contact us, informational pages, etc. and let the product detail page fall into a 404? |
If the product-detail page is replaced by a new product-detail page, use a 301 to the new location.
If you no longer have a product-detail page, use a 404.
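That rule is simple enough to write down directly. A sketch, assuming a lookup table of old product paths to their replacements (the table contents and the function name are invented for the example):

```python
from typing import Dict, Optional, Tuple


def respond_to_old_url(
    old_path: str, replacements: Dict[str, str]
) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a request to a pre-redesign URL."""
    new_path = replacements.get(old_path)
    if new_path:
        # The product page moved: permanent redirect to its new location.
        return 301, new_path
    # No replacement exists: report the page as not found.
    return 404, None


if __name__ == "__main__":
    table = {"/products/123.html": "/shop/blue-widget"}
    print(respond_to_old_url("/products/123.html", table))
    print(respond_to_old_url("/products/999.html", table))
```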
Instead of a debate on whether a lot of 404s hurt a site, why not just serve a 410?
No difference, Jim. Google treats a 404 and a 410 identically.
Yeah, I know they treat them the same, but at least it might show that the removals are (somewhat) intentional.
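For anyone who does want to signal intent this way, the distinction is just a matter of tracking which URLs were removed on purpose. A sketch with made-up path sets and a hypothetical function name; it says nothing about how any search engine treats the two codes.

```python
from http import HTTPStatus
from typing import Set


def status_for(path: str, live_paths: Set[str], retired_paths: Set[str]) -> HTTPStatus:
    """Pick a status code: 200 for live pages, 410 for pages removed on purpose,
    404 for anything that never existed."""
    if path in live_paths:
        return HTTPStatus.OK
    if path in retired_paths:
        return HTTPStatus.GONE  # 410: intentionally removed
    return HTTPStatus.NOT_FOUND  # 404: unknown URL


if __name__ == "__main__":
    live = {"/", "/shop/blue-widget"}
    retired = {"/products/123.html"}
    for p in ("/", "/products/123.html", "/typo"):
        s = status_for(p, live, retired)
        print(p, int(s), s.phrase)
```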
|If you no longer have a product-detail page, use a 404. |
I am going to try to take a broad view of this; some small to mid-size e-commerce sites might benefit from this thought.
Why one would want to take an established, trusted URI out of the inventory is still a mystery to me to this day. Why not rewrite the titles and description tags, alter some on-page content and replace the product with something similar?
The way I look at it, there are 3 kinds of "we don't have the thingy any more."
1. 404: dude, you are not in luck today, so beat it. (Visitor gets confused, leaves the site.)
2. 301: hmmm, I just visited this page 2 days ago. (Visitor gets confused, leaves the site.)
3. 200: display a message (image-based) that the thingy is no longer here. Then, once the content is altered, noarchive is added, and the page is reindexed and slightly re-ranked, boom, you've got an aged URI where it was.
This might not be the best way to get rid of thousands of pages, but as tedster mentioned - measure twice, cut once.
I really don't see any harm in using 301 redirects for 100s or 1000s of pages at all.
I have done it with positive results.