lucy24 - 4:04 pm on Jun 27, 2011 (gmt 0)
At this point, I am about ready to submit a sitemap that includes all of our currently indexed 404 pages.
Google's sitemap handling seems to be cumulative: once a page is on a sitemap they keep crawling it forever, even if you feed them a new sitemap that's entirely different. (Bing's documentation implies that they do it differently.*) Once Google has decided a page exists, there's almost nothing you can do to purge it from their memory.
* Bing says that if a lot of the pages on a sitemap return a 301, they "don't trust" the sitemap. It would take a hell of a nerve to say this if they didn't revise their database when an up-to-date sitemap is submitted.
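
For anyone who wants to try the "sitemap of dead pages" idea from the top of this post, here is a rough sketch in Python of how such a file could be put together. The URLs and the build_sitemap name are made up for illustration; the only thing taken from the sitemaps.org protocol is the urlset/url/loc layout and the namespace. Whether any search engine actually drops the pages after recrawling them is a separate question that this doesn't answer.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing each URL once, in input order.

    build_sitemap is a hypothetical helper name, not part of any library.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    seen = set()
    for url in urls:
        if url in seen:
            continue  # skip duplicates so each URL is listed only once
        seen.add(url)
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs; replace with the pages that currently return 404.
    dead_urls = [
        "http://www.example.com/old-page.html",
        "http://www.example.com/gone/index.html",
    ]
    print(build_sitemap(dead_urls))
```

Save the output as a .xml file and submit it the same way as a normal sitemap.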