Robert_Charlton - 9:16 am on Dec 2, 2013 (gmt 0)
Google reported 3,000 new 404 URLs on the 22nd and another 11,000 on the 25th, all of which either never existed or have been gone for years (no links on my site point to these pages, and they correctly return a 404 header, etc).
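One way to confirm what the poster describes, that the reported URLs really answer with a proper 404 status rather than soft-404ing with a 200, is to fetch each one and check the status code. This is a minimal sketch using only the standard library; the local test server here is just a stand-in for a real site (everything but "/" is treated as long-gone), so in practice you'd point `status_of` at the URLs from your own crawl-error report.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen
from urllib.error import HTTPError

def status_of(url):
    """Fetch a URL and return its HTTP status code (4xx/5xx raise HTTPError)."""
    try:
        with urlopen(url) as resp:
            return resp.status
    except HTTPError as err:
        return err.code

class GoneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Stand-in site: only "/" exists; every other path is a proper 404.
        self.send_response(200 if self.path == "/" else 404)
        self.end_headers()
    def log_message(self, *args):  # silence per-request logging
        pass

# Spin up the throwaway server on a random free port.
server = HTTPServer(("127.0.0.1", 0), GoneHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

ok = status_of(base + "/")                # existing page
gone = status_of(base + "/never-existed") # URL from the crawl-error list
print(ok, gone)  # → 200 404
```

If a "missing" URL comes back 200 here, Google is seeing a soft 404, which is a different problem from the stale-404 recrawling discussed below.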
Note that a pattern very similar to this, of old 404s resurfacing, was discussed in this thread....
17 May 2013 - GWT Sudden Surge in Crawl Errors for Pages Removed 2 Years Ago?
I made a number of comments in that thread regarding the recrawling of old 404s. These particular observations might be the most helpful right now...
I've observed that in addition to periodically rechecking the lists of 404s it keeps, Google also often recrawls these lists when there's a refresh of the index, as might occur at a large update of the type we just had.
This observation from a 2006 interview with the Google Sitemaps Team is helpful... [smart-it-consulting.com...]
My emphasis added...
When Googlebot receives either (a 404 or 410) response when trying to crawl a page, that page doesn't get included in the refresh of the index. So, over time, as the Googlebot recrawls your site, pages that no longer exist should fall out of our index naturally.
My sense of the above is that, by recrawling the old lists at updates or refreshes, Google is able to generate "clean" reference points of sorts, with currently 404ed URLs removed from the visible index. That interview was in 2006, though, and the index has grown much more complex since, so it's hard to say whether the 404ed pages are removed from the index in one pass or over many.
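The mechanism described in the quoted interview can be sketched as a toy model (my own illustration, not Google's actual pipeline): on each refresh, any URL that answers 404 or 410 at recrawl time is simply not carried into the rebuilt index.

```python
def refresh_index(index, fetch_status):
    """Rebuild the index, keeping only URLs that still resolve.

    fetch_status(url) stands in for a recrawl returning the HTTP status;
    pages answering 404 or 410 fall out of the refreshed index naturally.
    """
    return {url for url in index if fetch_status(url) not in (404, 410)}

# Hypothetical site state at refresh time.
statuses = {"/home": 200, "/old-page": 404, "/retired": 410}
index = set(statuses)

index = refresh_index(index, statuses.get)
print(sorted(index))  # → ['/home']
```

Whether the real index converges in one such pass or over many, as questioned above, doesn't change the basic rule the model captures: a page has to be recrawled and seen to 404 before it can drop out.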
There is a separate crawl list, and your observation suggests that these old URLs are being recrawled. I note from your report that the number of 404s peaked at just about the time of the update, and that the number has been trending down gradually since.