Why isn't a 404 enough to tell Googlebot that the URL is gone?
From the HTTP/1.1 Status Code Definitions [w3.org] (italics added):
10.4.5 404 Not Found
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
Since the server has not told Googlebot that the resource (page) is permanently gone, Googlebot correctly keeps trying to get it.
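To make the 404 versus 410 distinction concrete, here is a minimal sketch of how a server might pick a status code. The path sets and the function name `choose_status` are made up for illustration; a real server would take this information from its own configuration.

```python
from http import HTTPStatus

# Hypothetical data: pages that still exist, and pages removed on purpose.
LIVE_PATHS = {"/", "/index.html"}
GONE_PATHS = {"/old-page.html"}


def choose_status(path, live=LIVE_PATHS, gone=GONE_PATHS):
    """Return the status code to send for a requested path."""
    if path in live:
        return HTTPStatus.OK
    if path in gone:
        # The server knows the page was removed permanently: say so with 410.
        return HTTPStatus.GONE
    # Nothing is known about this path, so 404 makes no promise either way.
    return HTTPStatus.NOT_FOUND
```

Only the paths explicitly listed as removed get a 410; everything else that is missing still gets the non-committal 404.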
Partially related to this is Jakob Nielsen's belief that web pages must live forever [useit.com].
If there are no pages linking to those urls, then Googlebot will eventually give up.
There are two gotchas here.
The first is "eventually". Googlebot is a persistent gal!
The second is "no links". I have an old page that was referred to on bulletin boards long ago. I have removed all visible links, and neither Google nor Alltheweb acknowledges that it has any links. But they are there, and people still click on them two or three times a month, so I am sure that Googlebot also sees them.
So, either accept the 404s as a fact of life, or use 301s to redirect them to somewhere useful.
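For the 301 route, the idea can be sketched like this. The mapping and the name `respond` are hypothetical, just to show the shape of the response: a 301 status plus a Location header pointing at the new page.

```python
from http import HTTPStatus

# Hypothetical mapping from retired URLs to their replacements.
MOVED = {"/old-page.html": "/new-page.html"}


def respond(path, moved=MOVED):
    """Return (status, headers) for a path that no longer exists."""
    target = moved.get(path)
    if target is not None:
        # Permanent redirect: the client is told where the page lives now.
        return HTTPStatus.MOVED_PERMANENTLY, [("Location", target)]
    # No replacement is known, so fall back to a plain 404.
    return HTTPStatus.NOT_FOUND, []
```

Visitors who follow one of those old bulletin board links then land on something useful instead of an error page.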
I know this sounds odd, but I would advise you to do exactly that - in AllTheWeb.com. The reason is that Google does not show all incoming links when you use the "link:" command; AllTheWeb is better at this. Actually, you might as well try both.
If this search returns any pages, then you will know who is linking to that page - the pages linking to you will be the results. Then you might be able to get them to link to another page instead, by emailing them.
Still, Googlebot will give up once it's seen that 404 a few times. If you can give it a 410 using .htaccess, as Mohamed_E suggests, it will possibly stop sooner.
/claus