But if you want the complete list of Google's IPs, that's easy ;)
64.68.80.0 - 64.68.87.255
66.102.0.0 - 66.102.15.255
64.233.160.0 - 64.233.175.255
216.239.32.0 - 216.239.63.255
216.200.251.112 - 216.200.251.119
That includes everything (spider, dc, etc.).
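If you want to test an address against that list, here is a minimal Python sketch using the standard ipaddress module. The ranges are copied from the list above. The function names (in_listed_ranges, as_cidr) and the sample addresses are just mine for illustration, and the list itself may well be out of date by the time you read this.

```python
import ipaddress

# Ranges copied from the list above as (first, last), both inclusive.
GOOGLE_RANGES = [
    ("64.68.80.0", "64.68.87.255"),
    ("66.102.0.0", "66.102.15.255"),
    ("64.233.160.0", "64.233.175.255"),
    ("216.239.32.0", "216.239.63.255"),
    ("216.200.251.112", "216.200.251.119"),
]


def in_listed_ranges(ip):
    """Return True if ip falls inside any of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(
        ipaddress.ip_address(first) <= addr <= ipaddress.ip_address(last)
        for first, last in GOOGLE_RANGES
    )


def as_cidr(first, last):
    """Express a first-last range as CIDR blocks (handy for firewall rules)."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(block) for block in blocks]


if __name__ == "__main__":
    for sample in ("64.68.82.1", "10.0.0.1"):  # illustrative addresses only
        print(sample, in_listed_ranges(sample))
    for first, last in GOOGLE_RANGES:
        print(first, "-", last, "=", as_cidr(first, last))
```

Each of the five ranges happens to collapse to a single CIDR block, so as_cidr is mostly a convenience if your tools want that notation.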
It requested hundreds of files whose filenames were valid up until June 2003, then returned a 301 between June 2003 and October 2003, and have returned a 404 since last October. The only thing this new IP is doing is trying to fetch these 404 files.
Today this IP is doing the same thing, with more than 1000 files requested already. This crawler always comes from the 64.68.92.X Class C and reverse-resolves to crawl?.googlebot.com (where ? is a number like 4 or 5). I don't know what the user-agent looks like.
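For anyone wanting to confirm that a visitor really reverse-resolves to a googlebot.com host, here is a rough Python sketch of a reverse lookup followed by a forward lookup. The hostname pattern is taken only from the crawl?.googlebot.com description above and is an assumption; the actual naming may differ or change. The function names are hypothetical, and the lookups need working DNS to return anything useful.

```python
import re
import socket

# Pattern assumed from the description above: "crawl" + digits + ".googlebot.com".
CRAWLER_HOST = re.compile(r"crawl\d+\.googlebot\.com")


def looks_like_googlebot_host(hostname):
    """Pure string check: does the hostname match the assumed pattern?"""
    return CRAWLER_HOST.fullmatch(hostname.lower().rstrip(".")) is not None


def verify_crawler(ip):
    """Reverse-resolve ip, check the name, then confirm it resolves back to ip."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:  # no PTR record, or DNS unavailable
        return False
    if not looks_like_googlebot_host(hostname):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False
    # Anyone can set a PTR record, so the forward lookup has to agree.
    return ip in forward_ips
```

The forward check matters because whoever controls an IP block controls its reverse DNS, so a matching name on its own proves little.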
The usual crawler, in the 64.68.82.X range, is still functioning normally. For several months, this usual crawler has been aware that the old file-naming format is now history, and it rarely tries to fetch those old files.
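To compare what the two crawlers are actually fetching, something like the following Python sketch can tally response codes per Class C from an access log. It assumes Common Log Format and IPv4 addresses. The sample lines, paths, and the name status_by_subnet are made up for illustration and are not taken from my real logs.

```python
import re
from collections import Counter

# Common Log Format: ip ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')


def status_by_subnet(lines):
    """Count (class-C subnet, status code) pairs; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        ip, status = match.groups()
        subnet = ip.rsplit(".", 1)[0] + ".x"  # e.g. 64.68.92.x (IPv4 only)
        counts[(subnet, status)] += 1
    return counts


# Hypothetical log lines, for illustration only.
sample = [
    '64.68.92.10 - - [01/Jan/2004:00:00:01 +0000] "GET /old/a.html HTTP/1.0" 404 210',
    '64.68.92.11 - - [01/Jan/2004:00:00:02 +0000] "GET /old/b.html HTTP/1.0" 404 210',
    '64.68.82.20 - - [01/Jan/2004:00:00:03 +0000] "GET /new/a.html HTTP/1.0" 200 5120',
    'not a log line',
]

if __name__ == "__main__":
    for (subnet, status), n in sorted(status_by_subnet(sample).items()):
        print(subnet, status, n)
```

Run against a real log with something like status_by_subnet(open("access.log")), where access.log is whatever your server writes.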
What could Google be up to?
Don't forget that even though your old pages are no longer there, there may still be pages out there that link to those old addresses. Google follows links, so it will follow old links on old sites that now lead to 404 pages. Maybe they keep a database of those URLs and have a separate crawler validate them from time to time.
I had a site that had been offline for a year, but a few weeks ago a link:www.domain.com/ search still brought up a list of real pages that link to the dead domain. I put the domain back online and donated it to someone else to put a site on, one with similar content to the old site that used to be there (so searchers won't be cheated). The site was back in the normal search index within just a few days.