crobb305 - 2:53 am on Mar 15, 2010 (gmt 0)
But similar to crobb305 and g1smd, I had incoming external links (valid ones from other websites) pointing to /index.html. I 301 all requests for /index.html to /.
You say that those incoming links were valid and pointed to index.html. Can you clarify? Are those inbound links incorrectly linking to your homepage as "index.html", or do they link to the canonical '/' as you intended? If webmasters are incorrectly linking to your homepage, I'd contact them and ask for a change. Otherwise, your problem sounds identical to mine: the inbounds correctly link to my canonical, but Googlebot is apparently trying to crawl /index.html anyway (and those requests were reported as 404s in Webmaster Tools, prior to my 301).
I can't find a single instance where another page/site has incorrectly linked to my homepage with /index.html (I don't even use .html for any of my pages), so I have no idea why those requests were being made by Gbot. The 301 has stopped the 404s, but at what cost?
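For anyone following along, the rule being discussed (301 any request for /index.html to /) can be sketched in Python as below. This is only an illustration of the logic, not my actual server config. The function name `canonical_redirect` and the choice to also handle index.html in subdirectories and to carry the query string over are assumptions made for the example.

```python
from typing import Optional, Tuple

INDEX_NAME = "index.html"  # assumed index filename; adjust for your site


def canonical_redirect(path: str, query: str = "") -> Optional[Tuple[int, str]]:
    """Return (301, target) when the path ends in /index.html, else None.

    Hypothetical helper: it only decides whether a redirect is needed;
    sending the response is left to whatever server or framework you use.
    """
    if path.endswith("/" + INDEX_NAME):
        # Strip the filename but keep the trailing slash: /index.html -> /
        target = path[: -len(INDEX_NAME)]
        if query:
            # Preserve any query string on the redirected URL.
            target += "?" + query
        return 301, target
    # Already canonical (or some other page): no redirect.
    return None
```

Calling `canonical_redirect("/index.html")` gives `(301, "/")`, while `canonical_redirect("/")` gives `None`, so the canonical URL never redirects to itself and there is no loop.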