tedster - 10:13 pm on Sep 19, 2012 (gmt 0)
By paying attention to articles and other information about Google's "Crawl Team". I do this both for my own SEO purposes and because I want to offer well-sourced information to this forum. Crawling has been done with some variant of this approach for many years, but I haven't yet located a definitive source. I'll keep trying, though, since how crawling works is a relatively common discussion here.
You can also see how this approach almost needs to be the case by considering what it takes to crawl all the URLs on the whole web on a frequent basis - especially given the wide variety of sources Google uses for URL discovery.
So as I see it, the crawl team builds a URL list and sends it to one of their googlebot servers to crawl. Those pages are then retrieved and examined for new URLs, internal linking, and so on. The next crawl list can then be built from that new data. This approach would certainly be far cheaper in computing resources than trying to decide on the next URL to request in real time.
However, if a brand new URL is discovered on a re-crawl of a known page, there may well be a special routine that kicks in and gets that new URL crawled ASAP. It's just that a crawl of an already-known page wouldn't need to trigger an immediate new request, the way crawling seemed to work in the 1990s.
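To make the idea concrete, here's a purely speculative sketch of that two-tier scheduling: crawl a precomputed batch list, collect any brand-new URLs discovered along the way, and fast-track those into the next round. Every name and detail here is hypothetical - this is just my reading of the approach, not anything from Google's actual systems.

```python
from collections import deque

# Hypothetical sketch of batch crawl lists plus a fast-track for new URLs.
# All names are illustrative; nothing here reflects Google's real code.

def crawl_batch(url_list, fetch, known_urls):
    """Crawl a precomputed list of URLs; return brand-new URLs found."""
    discovered = []
    for url in url_list:
        page = fetch(url)                      # googlebot-style retrieval
        for link in page.get("links", []):
            if link not in known_urls:         # new URL spotted on a re-crawl
                known_urls.add(link)
                discovered.append(link)        # fast-track it for the next round
    return discovered

def scheduler(seed_urls, fetch, rounds=10):
    """Build each crawl list offline from the previous round's data."""
    known = set(seed_urls)
    crawl_list = list(seed_urls)
    for _ in range(rounds):
        fast_track = deque(crawl_batch(crawl_list, fetch, known))
        # New discoveries jump the queue; in a real system the rest of the
        # next list would be rebuilt from stored data (priority, change
        # frequency, etc.) rather than consisting only of new URLs.
        crawl_list = list(fast_track)
        if not crawl_list:
            break
    return known
```

With a tiny fake link graph (`a` links to `b`, `b` links to `c`), the scheduler discovers all three pages over successive rounds, without any real-time per-URL decision making.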