How to best handle requests for removed documents

         

true_INFP

3:05 pm on Mar 24, 2009 (gmt 0)

10+ Year Member



We've removed dozens of pages without replacement (the content was no longer valid). These documents still have quite a lot of inbound links on the web. So far, we've been returning HTTP 404 (Not Found) to all visitors requesting those pages.

However, the question recently came up whether we have been needlessly wasting PR that way. Someone suggested 301-redirecting all such requests to the index page.

Would that be a problem for Google?

Thanks.

true_INFP

5:39 pm on Mar 24, 2009 (gmt 0)

10+ Year Member



I'd like to add that some of the removed documents are hot-linked images.

[edited by: true_INFP at 5:40 pm (utc) on Mar. 24, 2009]

tedster

10:58 pm on Mar 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How about keeping the url active, but changing the content of the page to explain that the content is no longer valid? That way the other links in the page template will continue to flow PR to the rest of your site, and your visitors will get some relevant information.

I'd like to add that some of the removed documents are hot-linked images.

No PR issue with images, as far as I know. Even Google's logo doesn't show PR. And since images do not link anywhere on their own, they cannot circulate PR at any rate.

true_INFP

2:39 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



Actually, the reasons for the removal of the pages are much more complex and numerous than the universal "content no longer valid". We don't have time to prepare such explanatory pages; that's why we've been serving HTTP 404.

Do you see any problem with 301-redirecting all such requests to the index page? Would Google like that?

[edited by: true_INFP at 2:40 pm (utc) on Mar. 25, 2009]

tedster

3:53 pm on Mar 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is not a good idea to redirect all non-existing files to the home page. You end up, over time, with all those bad urls being indexed as duplicates of your home page content.

If a url is no longer available, either redirect to a true replacement page or else let the request get a 404 status as you are currently doing. That way you don't risk the rankings for the rest of your site.

true_INFP

4:25 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



It is not a good idea to redirect all non-existing files to the home page. You end up, over time, with all those bad urls being indexed as duplicates of your home page content.

That doesn't seem to match the official information provided by Google that duplicate content issues are prevented using 301 redirects.

In other words, any URL that gives HTTP 301 is actually removed from the Google search index and replaced by the target URL of the redirection. That's why there should be no duplicate.

[edited by: true_INFP at 4:26 pm (utc) on Mar. 25, 2009]

Shaddows

5:00 pm on Mar 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



301 = moved permanently.
Implied: "It was here, now it's THERE"

While the URL is dropped (eventually), it's still resolving to the same place. It looks like you're serving the same content for multiple resource requests.

There are (were?) spamming techniques involving 301 PR funnelling. What you are doing superficially resembles this. You could get caught in that penalty.

If the resource isn't there anymore, you should be serving a 410 (Gone). There are plenty of references to indicate 410 is treated as a 404 by G, so just serve that.

If you want the content to be your home page, so be it, but serve a 404 response.
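
As a rough illustration of the point above (helpful content in the body, but an accurate status code in the header), here is a minimal sketch using Python's standard WSGI interface. The paths in `REMOVED_PATHS` are made up, and the sketch assumes this handler is only reached when no live document matches the request.

```python
from http import HTTPStatus

# Hypothetical list of paths that once existed and were deliberately removed.
REMOVED_PATHS = {"/old-report.html", "/images/chart-2007.png"}


def app(environ, start_response):
    """WSGI app for requests that match no live document.

    Removed paths get 410 Gone, anything else gets 404 Not Found.
    Either way the body can still point visitors somewhere useful.
    """
    path = environ.get("PATH_INFO", "/")
    status = HTTPStatus.GONE if path in REMOVED_PATHS else HTTPStatus.NOT_FOUND
    body = (
        f"<h1>{status.value} {status.phrase}</h1>"
        '<p>This page is no longer available. Try the <a href="/">home page</a>.</p>'
    ).encode("utf-8")
    start_response(
        f"{status.value} {status.phrase}",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should do. The same idea can be expressed in most web server configurations; the Python here is only for illustration.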

[edited by: Shaddows at 5:01 pm (utc) on Mar. 25, 2009]

tedster

5:12 pm on Mar 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're right - I was really talking about a 302 redirect, and not a 301. Still, I know that Matt Cutts has addressed this recently and discouraged the practice of redirecting all missing file requests to the home page. I'll see if I can find the reference.

Here's the point. If any old filepath on your domain resolves no matter what it is, then your server is not giving an accurate response to those bad requests. A file that doesn't exist is 404 (or possibly 410 Gone) - and that's an accurate response.

Shaddows

5:22 pm on Mar 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Also, you're not helping your upstream linkers who think they are sending a referral to a resource. That resource is returning a 200, so all should be well, when in fact it is not.

Admittedly, webmasters should notice their link-checker is getting served a 301 in the chain and investigate, but it's still not ideal.
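
For anyone who wants to check their own outbound links for this, here is a small sketch of a link check that does not follow redirects, so a 301 in the chain stays visible instead of being hidden behind the final 200. The function names are hypothetical, and it assumes the target server answers HEAD requests.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def check_link(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def looks_like_homepage_redirect(status: int, location: Optional[str]) -> bool:
    """True if the response redirects to the root of a site."""
    if status not in (301, 302) or not location:
        return False
    return urlsplit(location).path in ("", "/")
```

A deep link that comes back as a redirect to `/` is worth a manual look: the page you linked to has probably been removed.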

true_INFP

5:32 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



Here's the point. If any old filepath on your domain resolves no matter what it is, then your server is not giving an accurate response to those bad requests. A file that doesn't exist is 404 (or possibly 410 Gone) - and that's an accurate response.

I know that. But here's my point:

HTTP 4xx completely wastes formerly gained PR.

Shaddows

5:35 pm on Mar 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Are you 301-ing all invalid requests or URLs that did once exist?

true_INFP

5:45 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



Are you 301-ing all invalid requests or URLs that did once exist?

If you read just the topic, you'll know the answer. ;-)

true_INFP

5:49 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



I know that Matt Cutts has addressed this recently and discouraged the practice of redirecting all missing file requests to the home page. I'll see if I can find the reference.

I'd be sincerely grateful for that reference, tedster.

tedster

6:17 pm on Mar 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I know there's something much more recent - but it may only be in a video. At any rate, here's one exchange from 2006:

question: i want to make my blog more professional and have been removing old posts. will my 404 page take care of this or should I use a 301 that directs back to home?

Matt Cutts Said,
Aaron, if the posts are truly removed, I would go with a 404.

[mattcutts.com...]

If I find something more recent or stronger, I'll post it here, too.

In the meantime, just think about what a 301 is saying - "Yes, that content is here but it's now at a new address." But when content is removed, that is not true.

true_INFP

7:06 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



In the meantime, just think about what a 301 is saying - "Yes, that content is here but it's now at a new address." But when content is removed, that is not true.

Well, in some of those cases, 301 may be appropriate as well. For example, when the index page contains portions of the content (possibly reworked) of the removed pages.

If I find something more recent or stronger, I'll post it here, too.

Thanks. I appreciate that.

[edited by: true_INFP at 7:09 pm (utc) on Mar. 25, 2009]

true_INFP

8:06 pm on Mar 25, 2009 (gmt 0)

10+ Year Member



Perhaps I should have added that this is a PR7-8 site, which might not be as susceptible to some penalties as e.g. a random PR3 site...

Shaddows

9:23 am on Mar 26, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Ok, if you're not redirecting EVERYTHING, and you have a high PR, you're probably not going to get treated as a spammer.

I don't think you have a black/white answer here. You will probably be fine, but I think you are running a risk.

It's up to you to decide if the risk is high or low, and if the corresponding benefit (retaining PR) is worth it. Not forgetting that G could 'tweak' the way it looks at 301s at any point.

Funnily enough, the reason that you may very well be immune is the same reason I wouldn't take the risk, namely that you clearly have a successful and well regarded site.

oodlum

12:32 am on Mar 27, 2009 (gmt 0)

10+ Year Member



I'm facing a similar issue. Is there any evidence that having a lot of pages go 404/410 regularly has an adverse effect on ranking?

g1smd

1:16 am on Mar 27, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



*** Are you 301-ing all invalid requests or only URLs that did once exist? ***

That is an absolutely crucial point here.

true_INFP

5:43 pm on Mar 27, 2009 (gmt 0)

10+ Year Member



That is an absolutely crucial point here.

No, here it's actually irrelevant (see the topic and first post of this thread).

tedster

7:31 pm on Mar 27, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The caution people are expressing is a concern that you continue to return a 404 for urls that never existed on your server.

true_INFP

3:30 pm on Mar 28, 2009 (gmt 0)

10+ Year Member



The caution people are expressing is a concern that you continue to return a 404 for urls that never existed on your server.

I'm not sure why you use the word "continue".

We've never done that nor have I written anywhere in this thread that I intend to do anything like that. On the contrary, I specifically wrote that we're talking about handling requests for removed content that once existed.

So, again, your use of the word "continue" is quite difficult to understand for me.

tedster

4:08 pm on Mar 28, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'll try to clarify. A server normally returns a 404 status for urls that NEVER existed. I am assuming that your server has also done this.

If you begin serving a 301 redirect for the urls that DID exist but have been removed, make sure that the server still returns a 404 status for urls that NEVER existed.

true_INFP

11:43 am on Mar 29, 2009 (gmt 0)

10+ Year Member



Can you quote any part of my posts here that led you to believe that you needed to clarify/explain to me the difference between those two things? Just wondering if people actually read what I write...

Shaddows

10:17 pm on Mar 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



true_INFP, you're being incredibly pedantic. Stupidly so. All you have to say is

"No, I'm not. 301 only happens to pages that did once exist. Thank you all for taking time to reply"

The fact is, many people are cavalier in their approach to relating facts. Others are confused as to what facts they relay.

Take you for instance. You have stated you ARE 301-ing old pages. Great. You have NOT said what you do with the other pages. There is therefore doubt.

When you ask for help, it is good manners to be courteous to those who try to help, even if you perceive their questions to be superfluous.

[edited by: Shaddows at 10:52 pm (utc) on Mar. 29, 2009]

g1smd

10:46 pm on Mar 29, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



*** Just wondering if people actually read what I write. ***

Sure. And answered the question. And added extra information so that you would not go on to implement something that would be harmful.

So, again. Yes you can 301 requests for documents that did exist and which no longer exist. You should 301 to a page that is a similar topic. I would avoid redirecting all of those to the root home page. That's likely not a good profile to have.

And finally, to clarify something you need to be wary of in your implementation of a redirect: for URLs that have never existed, you should continue to return a 404 status for those. That is, do not start redirecting those.

Here I use the word 'continue' in the same context as Tedster used it above.
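
The rules summarised in this post can be sketched as a small lookup. This is only an illustration: the URL map is invented, and a real site would usually put the same logic in its server configuration rather than in Python.

```python
from typing import Optional, Tuple

# Hypothetical data: each removed URL maps to a topically similar
# replacement page, or to None when no suitable replacement exists.
REMOVED = {
    "/guides/old-setup.html": "/guides/setup.html",
    "/news/2007-announcement.html": None,
}


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, redirect_target) for a path with no live document."""
    if path in REMOVED:
        target = REMOVED[path]
        if target is not None:
            # Removed, but a similar-topic page exists: permanent redirect.
            return 301, target
        # Removed with no true replacement: 404 (410 Gone is also an option).
        return 404, None
    # Never existed: keep returning 404, never redirect.
    return 404, None
```

Note that nothing in the map points at the home page, and any path missing from the map falls through to a 404.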

true_INFP

9:22 pm on Mar 30, 2009 (gmt 0)

10+ Year Member



true_INFP, you're being incredibly pedantic.

Possibly. I just prefer people to read what I write especially when I've asked them to do so several times. ;-)