[httpd.apache.org...]
Somehow it doesn't seem exactly right, but going by how it's explained for that module, would this be how?
Redirect 410 /foo/page.html
Let me explain why. Some search engines keep 404s in their index, which is plausible, since the page may only be temporarily inaccessible. But for 410/gone, the documentation says:
Returns a "Gone" status (410) indicating that the resource has been permanently removed. When this status is used the URL argument should be omitted.
410-Gone was introduced in HTTP/1.1 to resolve this ambiguity. Over time, some HTTP/1.0 clients were 'extended' to support most but not all HTTP/1.1 requirements. The major search engine spiders that publish as HTTP/1.0 fall into this category.
If you are hosted on a dedicated server or VPS, the proper thing to do is to detect HTTP/1.1 or extended HTTP/1.0 requests by checking for the presence of the Host request header (available to mod_rewrite as %{HTTP_HOST}). If this header is present, the client will most likely understand a 410-Gone response. If the header is not present, a 404 should be returned instead.
If you are hosted on a name-based virtual server, this is a non-issue, since your site cannot be accessed at all by a true HTTP/1.0 client that does not send the Host header. Therefore, no conditional checking is needed, and you may use mod_alias or unconditional mod_rewrite code to return a 410 response.
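For the name-based case, the unconditional forms might look like the following. This is only a sketch: it assumes the rules live in the .htaccess file at the site root, and removed_page.html is a placeholder name. Use one option or the other, not both.

```apache
# Option 1 (mod_alias): unconditional 410 for a single removed URL-path.
# "gone" is the keyword form of status 410; no target URL is given.
Redirect gone /removed_page.html

# Option 2 (mod_rewrite, per-directory context): same result, no RewriteCond.
RewriteEngine On
RewriteRule ^removed_page\.html$ - [G]
```

The mod_alias form is the simpler of the two when no other conditions need to be tested.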
In order to test for HTTP_HOST, you can use mod_rewrite:
# Enable the rewrite engine (skip if it is already enabled earlier in the file)
RewriteEngine On
# Test for HTTP/1.1 (or extended HTTP/1.0) hostname request header
RewriteCond %{HTTP_HOST} .
# If present, return 410-Gone for removed page
RewriteRule ^removed_page\.html$ - [G]
# Else let server generate default 404-Not Found
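The same test could be applied to a whole removed directory. The following is a rough sketch under the same assumptions (.htaccess at the site root); removed_dir is a hypothetical name. Note that a RewriteCond applies only to the RewriteRule immediately after it, so the condition has to be repeated for each rule.

```apache
RewriteEngine On
# Host header present, so the client should understand 410-Gone
RewriteCond %{HTTP_HOST} .
# Hypothetical directory name: match the directory itself and everything under it
RewriteRule ^removed_dir(/|$) - [G]
# Requests without a Host header fall through to the default 404-Not Found
```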
However, I haven't seen compelling evidence that 410s are treated any differently from 404s yet -- I just use 410s because that's what HTTP/1.1 says we should use. If the search engines conform their behaviour in the future, I'm good to go. And if not, my doing it correctly doesn't have any significant downside.
Jim
Matt Cutts has said that Google treats a 410 the same as it treats a 404. So it wouldn't work with Google. I know it's frustrating. Google hates to drop URLs from their index, even URLs they haven't crawled.
But it isn't Google on this. I've caught some problems with Yahoo with removed/redirected pages and directories (like there used to be with Inktomi), and I'd like to give whatever I can a shot to try to see what might be done to deal with that. So I'll double-check the problematic Yahoo listings again and take it further from there.
Thanks!