| 5:25 pm on Sep 29, 2011 (gmt 0)|
Pages with no traffic, no external links, and poor indexing - "410 Gone".
Pages that have an equivalent somewhere else in the site, especially if the page you're scrapping has good incoming external links - 301.
Don't mass redirect multiple URLs to one page, certainly not to the root.
Don't use a 302 redirect.
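The rules above boil down to a per-URL decision: 301 where a one-to-one equivalent exists, 410 where nothing does. A minimal Python sketch of that decision table (the URL paths are hypothetical examples, not from the thread):

```python
# Sketch of the decision rules above; the URL sets are hypothetical.
REDIRECTS = {
    "/old-widgets/blue": "/widgets/blue",  # has an equivalent page -> 301
}
GONE = {
    "/old-widgets/discontinued",           # no traffic, no links -> 410
}

def response_for(path):
    """Return (status, location) for a URL being retired."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]        # permanent, one-to-one redirect
    if path in GONE:
        return 410, None                   # gone for good
    return 200, None                       # page still live
```

Note that every redirect maps to exactly one target page; nothing funnels to the root, per the advice above.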
| 5:37 pm on Sep 29, 2011 (gmt 0)|
Thanks for the advice... I was leaning toward redirecting everything to category pages, but now one-to-one 301s look like the way to go. Aren't 301s bad, though? I thought I read somewhere that Google doesn't like to see 301s... or is that just internal links on your site pointing at 301s?
410's have me a tad nervous, as there are a couple of these pages that could come back down the road, depending on management...
| 6:20 pm on Sep 29, 2011 (gmt 0)|
g1smd's got it right. A couple weeks ago I did a mass 410 culling for a client based on the same criteria. The site lost no ranking even though we literally cut 50% of the pages on the site. (Their main phrases climbed 3-4 spots in rankings, but I can't directly attribute that to the removal of junk pages since we made several simultaneous improvements.) The 410'd pages are slowly falling out of Google and their 410s are being removed from .htaccess as they drop out.
|Aren't 301's bad though? I thought I read somewhere that google doesn't like to see 301's...or is that just links on your site to 301s |
301s on their own aren't bad and actually do a great job of retaining the reputation for removed pages. In my opinion the most important use for them is for a site re-launch where page names and directory structures have changed. With old URLs 301'd to *proper* new pages, the new pages will quickly rank where the old ones did.
I don't have any direct evidence, but I suspect that if you have a lot of off-site links that get 301'd Google might think your site is a bit stale - HOWEVER, if the 301 is generated by canonicalization I doubt it would hurt.
[edited by: SEOMike at 6:21 pm (utc) on Sep 29, 2011]
| 6:21 pm on Sep 29, 2011 (gmt 0)|
If a few of your widgets are temporarily out of stock, why not just leave the pages up and add a "noindex,follow" meta tag?
Put a note to the effect of "This item is on backorder; we'll notify you when it's in."
Take their email.
Offer other related product links, images, descriptions, etc.
When the widget is back in stock just change the meta tag.
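The steps above can be sketched as a page that stays live (200) while the robots meta tag tracks stock status. This is a hypothetical illustration; the names and markup are made up:

```python
# Hypothetical sketch: keep the page returning 200, flip the robots
# meta tag to "noindex, follow" while the widget is on backorder, and
# flip it back when stock returns. Markup is illustrative only.
def robots_meta(in_stock):
    return "index, follow" if in_stock else "noindex, follow"

def render_widget_page(name, in_stock):
    note = "" if in_stock else (
        f"<p>{name} is on backorder - leave your email and "
        "we'll notify you when it's in.</p>")
    return (f'<meta name="robots" content="{robots_meta(in_stock)}">\n'
            f"<h1>{name}</h1>\n{note}")
```

Because the URL never stops returning 200, there is no 404/410 to "come back to life" later; only the meta tag changes.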
| 6:51 pm on Sep 29, 2011 (gmt 0)|
SEOMike...sounds good. But I think there is at least a good chance that a few of these pages will have to come back...oh...in a couple of months. Wouldn't google be unhappy to see a 410 come back to life?
| 6:55 pm on Sep 29, 2011 (gmt 0)|
If the page might come back in the future use some code other than 404 or 410. "Gone" should mean gone for good.
The meta robots noindex tag is useful as a temporary measure but likely difficult to administer.
| 7:19 pm on Sep 29, 2011 (gmt 0)|
404's sound good then.
By chance, does anybody know if you need to be careful about the type of 404 content you use? Or does Google ignore everything else as soon as it sees the 404 status? I was thinking of perhaps doing a custom 404 error page for each widget...
| 7:35 pm on Sep 29, 2011 (gmt 0)|
Google does not index pages that return a 404 status code. The HTML content is there for the human visitor.
| 8:04 pm on Sep 29, 2011 (gmt 0)|
My main competitor doesn't use 404's for broken pages but 200's for EVERYTHING. So if you enter in site.com/asdfsadfl;jasdf you get a 200 status and a page that says the content can't be found (but of course the headers don't say that).
Could these guys be shooting themselves in the foot? I really hope so... For one term (100k searches a month) they had a subpage with this faux 404 page that ranked 12th, which was absolutely ridiculous.
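A 200 response that really means "not found" is what's usually called a soft 404, and it's easy to spot programmatically. A minimal heuristic sketch; the marker phrases are my own assumptions, not an exhaustive list:

```python
# Heuristic soft-404 check: a 200 response whose body reads like an
# error page. The marker phrases are illustrative, not exhaustive.
def looks_like_soft_404(status, body):
    markers = ("not found", "can't be found", "no longer available")
    return status == 200 and any(m in body.lower() for m in markers)
```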
| 8:19 pm on Sep 29, 2011 (gmt 0)|
They have an "infinite Duplicate Content" problem.
A few years ago that would have been a major issue, and a little bit of external malicious linking could have caused all sorts of mayhem for them.
Nowadays Google tries to factor those issues out of the equation, but they don't always get it right.
| 9:16 pm on Sep 29, 2011 (gmt 0)|
Thx for the info...
But how would malicious linking hurt somebody with infinite duplicate content?
Would they create a page with, say, a thousand links to site.com/fakelink1, site.com/fakelink2, etc.? Google then crawls this bait page and indexes all these fake URLs? But if Google doesn't index these pages, the site wouldn't be hurt, right?
| 9:40 pm on Sep 29, 2011 (gmt 0)|
As g1smd mentioned, today Google has more safeguards in place to avoid indexing those 200-status pages that are actually "not found" messages. In the past, when a duff URL returned a 200 status, every backlink to it resulted in that URL getting indexed - and the widespread "error text" then registered as duplicate content. It can still happen today, though it is no longer nearly so common.
That's why the status code is so important, especially for those who use a custom error message, which is where this problem used to creep in so frequently and easily.
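The pairing described above (custom error HTML for visitors, correct status line for Google) can be sketched like this; the handler and URL set are hypothetical, not tied to any particular server:

```python
# Sketch: the HTML body is for the human visitor; the status line is
# what Google acts on. gone_urls is a hypothetical set of retired pages.
def error_response(path, gone_urls=frozenset()):
    status = "410 Gone" if path in gone_urls else "404 Not Found"
    body = (f"<h1>Sorry, {path} is not available.</h1>"
            "<p>Try searching for the widget you wanted.</p>")
    return status, body
```

The body can be as custom and helpful as you like; the mistake the earlier posts describe is serving this page with a 200 status instead.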
| 9:57 pm on Sep 29, 2011 (gmt 0)|
|Wouldn't google be unhappy to see a 410 come back to life? |
Probably. There are two things in the spec that make me say that: 1. "This condition is expected to be considered permanent." and 2. "This response is cacheable unless indicated otherwise."
Eh - maybe, maybe not... I don't really like 404ing a large number of pages because I think it makes your site look broken. If we're talking about a small fraction of pages being 404'd, then it might not be that big of a deal. I don't know where the tipping point for "too many broken pages - reduce ranking" would be.
This might be an opportunity to experiment with a couple of server responses and see what works best. Maybe try 404ing some of them, 302ing some of them to different pages, and maybe 403ing some of them, to see which response works best for getting the pages re-indexed and re-listed later.
| 12:13 am on Sep 30, 2011 (gmt 0)|
I would not use a 302 at all here.
I have recently set 410 for several thousand URLs and 301 for another couple of thousand URLs on a site with a massive duplicate content issue.
It took several months for Google to drop all of the unwanted URLs from the SERPs, but the redirects retained the traffic while those URLs were still listed.
It has taken another couple of months for Google to remove those URLs from the WMT internal links report; only doing so once there are no links found pointing to the pages that no longer exist.
Traffic has increased during this time, even though the number of indexed URLs has dropped by 99%.
| 1:57 am on Sep 30, 2011 (gmt 0)|
|What about a 'this widget is no longer available' message? My worry is then that this would thin content and if all yanked pages have this verbiage this would be internal duplicate content? |
I presume that these pages have some content at the moment, so adding "This widget is no longer available" should not make them duplicates, as long as there is enough other content on each page to keep the pages distinct from one another.
If a widget is gone and not coming back, I would do as g1smd says: serve either a 410 or a 301, depending on whether there is a good page to redirect to.
If a widget is temporarily/currently not available, I would leave the page returning 200 and add a message along the lines of:
"This widget is not currently available. Why not try nnn mmm instead?" (where nnn mmm would be links to similar widgets).
| 4:48 am on Sep 30, 2011 (gmt 0)|
|I don't really like 404ing a large number of pages because I think it makes your site look broken. |
The problems I've seen come from not removing internal links to the URLs that now return 404. I've worked with redesigns and redevelopments where WAY over 50% of the URLs were removed and it did not cause big issues or penalties.
| 6:39 am on Sep 30, 2011 (gmt 0)|
That's an important point. Make sure that there are no broken links within the site.
The only issue will be where people click on an out of date link in the SERPs, or from another site, or in their bookmarks. That's why the suggestion is to redirect URLs that have traffic and a useful equivalent page of content, and 410 those that do not.
Run a Xenu LinkSleuth report before you start work. Save the data so you can refer back to it again and again.
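If you'd rather script the internal-link check than run a desktop crawler, a minimal stand-in using only the standard library can flag links that point at URLs you're about to remove. The sample markup and removed-URL set here are made up for illustration:

```python
# Minimal stand-in for a crawler like Xenu: collect the <a href> links
# on a page and flag any that point at URLs scheduled for removal.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

def broken_internal_links(html, removed_urls):
    """Return links in `html` that point at soon-to-be-removed URLs."""
    parser = LinkCollector()
    parser.feed(html)
    return [u for u in parser.links if u in removed_urls]
```

Run something like this across every page before the cull, and you know exactly which internal links to update so visitors never hit a 410 from within the site itself.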