TheMadScientist - 3:32 pm on Mar 16, 2011 (gmt 0)
For those pages that return 404 or 410, I don't see google crawling them as they don't exist anymore.
Google operates a fairly compliant bot and 404 and 410 are two different things.
A 404 is a Not Found error and is the default server behavior. The situation may be temporary or may be permanent, so the URLs will likely keep being requested for years into the future while serving a 404, especially if there are links to the page(s).
Example of Temporary: The FTP program that does your uploading deletes the file on your server and then saves your local copy there ... Your upload stalls out after the file is deleted ... Anyone who requests the file will get a 404 Not Found error until the page is re-uploaded ... A 404 does not in any way indicate 'removed' or 'permanent' or 'no longer exists'; it means exactly what it says: Not Found.
A 410 Gone does indicate Permanent and is NOT a default behavior. It MUST be intentionally set, so if a page is intentionally removed and will not be replaced it is the correct code to use to slow down GBot from crawling the page. If I remember correctly, when it was first introduced they treated it much like a 404 in terms of request frequency, but have since adjusted GBot to not request the page as often, even though they will still occasionally check to see if it is still 'Gone', because, yes, it's tough to believe, but webmasters do make mistakes, and sometimes just plain change their mind, so they don't want to 'write off' a URL, even if it's Gone. (They always double check, repeatedly.)
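To make the 'default vs. intentionally set' difference concrete, here is a rough sketch in Python. It is only an illustration, not how any particular server does it: the path lists and the names status_for and app are made up for the example, and on a real site you would normally set the 410 in your server config instead.

```python
from http import HTTPStatus

# Hypothetical URLs that were removed on purpose and will not come back.
REMOVED_PATHS = {"/old-promo.html", "/discontinued/widget.html"}

# Hypothetical URLs that currently exist on the site.
LIVE_PATHS = {"/", "/index.html", "/about.html"}


def status_for(path, live=LIVE_PATHS, removed=REMOVED_PATHS):
    """Pick the status code for a requested path."""
    if path in live:
        return HTTPStatus.OK
    if path in removed:
        # Permanent removal: only happens because we listed the path explicitly.
        return HTTPStatus.GONE
    # Default for anything else the server cannot find.
    return HTTPStatus.NOT_FOUND


def app(environ, start_response):
    """Minimal WSGI app that applies status_for to each request."""
    status = status_for(environ.get("PATH_INFO", "/"))
    body = status.phrase.encode("ascii")
    start_response(
        f"{status.value} {status.phrase}",
        [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Anything not listed in REMOVED_PATHS falls through to the 404, which is the point: the 410 only shows up where someone deliberately asked for it.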
Anyway, I think the short answer to the above question is: 410 Gone if they are gone and you don't ever want them indexed.