The problem is simple. HTTP/1.1 [w3.org] has introduced the nice 410 response:
10.4.11 410 Gone
The requested resource is no longer available at the server and no forwarding address is known. This condition is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval. If the server does not know, or has no facility to determine, whether or not the condition is permanent, the status code 404 (Not Found) SHOULD be used instead. This response is cacheable unless indicated otherwise.
The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server's site. It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.
This, of course, is the ideal solution for those who want to delete pages. But Googlebot advertises herself as an HTTP/1.0 type of person.
So does Googlebot honor a 410 anyway? Does anyone here (hint, hint :) ) have a definite answer to that question?
This is a really good question. BlueSky's experience mirrors my own: Despite 410-Gone being an HTTP/1.1 response, it seems to be honored by many 'bots, even if they advertise HTTP/1.0.
If you are concerned with going 'by the book' with 404/410 responses, you can always use something like this to return 410 to HTTP/1.1 and higher user-agents only:
# Respond with 410-Gone status to HTTP/1.1 requests for removed resources.
RewriteCond %{THE_REQUEST} ^[^\ ]+\ [^\ ]+\ HTTP/(1\.[1-9]|[2-9]\.[0-9])
RewriteCond %{REQUEST_URI} ^/(announce|sp_event/event1|sp_event/event2)\.html$ [OR]
RewriteCond %{REQUEST_URI} ^/(2002news|2002weather)\.html$
RewriteRule .* - [G]
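If you want to sanity-check the patterns before putting them on a live server, here is a rough Python sketch that mimics the conditions above. It is only an approximation of what mod_rewrite does: the function name would_return_gone is made up for illustration, the request lines are assumed to have exactly three space-separated parts, and URL decoding is ignored.

```python
import re

# Same pattern as the first RewriteCond: matches request lines that
# declare HTTP/1.1 through HTTP/1.9, or HTTP/2.x through HTTP/9.x.
HTTP_11_OR_LATER = re.compile(r"^[^ ]+ [^ ]+ HTTP/(1\.[1-9]|[2-9]\.[0-9])")

# Same patterns as the two REQUEST_URI conditions, which are joined by [OR].
REMOVED_PAGES = [
    re.compile(r"^/(announce|sp_event/event1|sp_event/event2)\.html$"),
    re.compile(r"^/(2002news|2002weather)\.html$"),
]


def would_return_gone(request_line: str) -> bool:
    """Hypothetical helper: True if the ruleset would answer with 410."""
    parts = request_line.split(" ")
    if len(parts) != 3:
        # Assumption: anything that is not "METHOD URI PROTOCOL" is ignored.
        return False
    # REQUEST_URI holds only the path, so drop any query string.
    path = parts[1].split("?", 1)[0]
    # The protocol condition must hold, AND one of the URI conditions.
    if not HTTP_11_OR_LATER.search(request_line):
        return False
    return any(pattern.search(path) for pattern in REMOVED_PAGES)


if __name__ == "__main__":
    for line in (
        "GET /announce.html HTTP/1.1",
        "GET /announce.html HTTP/1.0",
        "GET /index.html HTTP/1.1",
    ):
        print(line, "->", would_return_gone(line))
```

A request line that declares HTTP/1.0 does not match, so it falls through to whatever the server would normally send for that URL, which should be a plain 404 if the file has been deleted.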
Jim