Not all of the 1 trillion URLs that Google has discovered are in the visible index, not by a long way. So does this sentence mean that links on those discovered-but-not-indexed URLs are still part of the web-link graph, and that they can influence PageRank?
Why wouldn't they be part of it? If a random walk of the web is the fundamental basis for how PR accrues and gets distributed, does it matter to the random surfer whether or not a page/URL is indexed by Google, when they're just "randomly" coming across it?
"Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day."
It doesn't say "the part of the web graph that Google has included in their index"; it says "the entire web-link graph". So what are we to conclude when we try to read between the lines? Either they're using the entire web graph to recalculate PR daily, or they aren't. It's probably as simple as that.
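To make the question concrete, here is a toy sketch of the random-surfer calculation run two ways: once over the whole graph, and once with the unindexed page left out. Everything in it is an assumption for illustration only: the four-page graph, the page names, the 0.85 damping factor, and the even redistribution of rank from pages with no outlinks come from the textbook description of PageRank, not from anything Google has said about how it computes PR today.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank sitting on pages with no outlinks is spread evenly (a common convention).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


def share(rank, page, group=("A", "B", "C")):
    """Fraction of the rank held by `group` that belongs to `page`."""
    return rank[page] / sum(rank[p] for p in group)


# Hypothetical graph: A, B and C are indexed; X is discovered but not indexed.
full_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["A"],
    "X": ["C"],  # the only link on the unindexed page points at C
}
indexed_only = {page: targets for page, targets in full_graph.items() if page != "X"}

with_x = pagerank(full_graph)
without_x = pagerank(indexed_only)

for page in ("A", "B", "C"):
    print(page, round(share(with_x, page), 4), round(share(without_x, page), 4))
```

In this toy graph, C ends up with a larger share of the rank among the indexed pages when X's link is counted than when X is dropped. So if the recalculation really does run over the entire web-link graph, links on unindexed URLs would matter; if it runs only over the indexed part, they wouldn't. The sketch can't tell us which one Google does.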
[edited by: Marcia at 10:01 am (utc) on July 29, 2008]