Forum Moderators: open
First, let me explain a few concepts the way I understand them:
1) Google has 8 billion pages in their index.
2) To calculate PageRank, Google must make several "iterations" through the data. On the first iteration (over all 8 billion pages), Google has to record the outbound links from every page. On the second iteration, some pages gain rank because they have incoming links. But this is not enough; several more iterations must be completed in order to get a good reading and establish a rank.
3) The computing power required to do numerous iterations across 8 billion pages must be enormous.
4) Google uses "supplemental results" in addition to the main index, alluding to the idea that PageRank may only be established for the first 4 billion or so pages, and the rest is just used to "fill in".
5) Before Google moved to doing (allegedly visible) updates only once per quarter, there were numerous problems with Google keeping to their monthly schedule. People would become alarmed.
6) Even before the quarterly updates, Google was using "Freshbot" to help bring in new data between monthly updates. Please check me on this: Freshbot results did not have PageRank.
7) We have been told that even though there is no update to what we see in the little green bar, there is actually a "Continuous PageRank Update".
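The iteration described in point 2 can be sketched in a few lines. This is a toy illustration on a hypothetical four-page link graph (made-up data, not Google's actual implementation):

```python
# Toy PageRank power iteration on a hypothetical four-page link graph.
# links[p] lists the pages that page p links to (illustrative data only).
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = len(links)
d = 0.85                      # the standard damping factor
pr = [1.0 / n] * n            # first pass starts from a uniform guess

for _ in range(500):          # repeated passes over the whole graph
    new = [(1 - d) / n] * n
    for page, outs in links.items():
        share = pr[page] / len(outs)   # rank flows out along each link
        for target in outs:
            new[target] += d * share
    delta = max(abs(a - b) for a, b in zip(new, pr))
    pr = new
    if delta < 1e-10:         # values have settled; stop iterating
        break
```

Each pass spreads rank one hop along the links; the values only settle after many passes, which is why a single sweep over 8 billion pages is not enough to establish a rank.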
I find a continuous update of PageRank implausible. A true calculation requires passes across the entire dataset of 8 billion pages multiple times. We have already seen signs of trouble in the past (missed updates), attempts to remedy the problem (Freshbot), and additional measures to hide what is really going on (quarterly updates). Most likely, we are now in an age of "PageRank Lite".
And here is the "kicker": we have this mysterious "sandbox effect" (aka "March Filter") that seems to place a penalty on new links and/or new sites. Could it be a result of Google's failure to calculate PageRank across the entire index?
IMHO, Yes!
Quietly, Google has been building more datacenters. Recently, they opened a new one in Atlanta, GA, but there was no public announcement. There is not even a sign on the building. If you walk up to the door, the voice on the intercom won't tell you that you are at a Google facility either (source: Atlanta Journal-Constitution).
With the number of datacenters Google already has, the main reason for adding more is probably not uptime and dependability. Though those things are important, they certainly have plenty of datacenters, and you rarely see problems with downtime or server speed. The reason for adding these datacenters (quietly) must be that they need more computing power to calculate PageRank.
I believe I have provided many examples to support the idea that continuous updating of PageRank is indeed a farce. I also feel that this "sandbox effect" is a result of the inability to do complete calculations across the entire dataset (especially new additions to the data).
I look forward to hearing what others think.
With Googlebot constantly revisiting my sites, I can say that (at least on my sites) CPU is happening. I see recently added pages appearing in the SERPs and others taking higher positions. I can only see this as a sign that Google is giving PR to the new pages, and more to the older ones that are being linked back to.
I agree with your assumption that the Sandbox Effect is due to the "inability" to complete calculations, though I don't consider this an inability. All of my latest sites took less than two weeks (not months, as some people have suggested) to be spidered and to appear in Google's listings. This was before I went on a linking campaign.
Let's hear from some others on this CPU subject. The sandbox effect has been covered extensively.
However, I don't agree that there are fundamental problems with a continuous PR update. Even with 8 billion pages, one iteration should take less than a day. Using the PR values of the last iteration as input and just updating the linking structure would lead to an almost continuous update. Even for complex, large new sites, PR should be almost stable within a month.
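The warm-start idea can be illustrated on a hypothetical tiny graph (the function and all figures below are mine, purely for illustration): after a small change to the link structure, reusing the previous ranks as the starting vector converges in noticeably fewer passes than restarting from a uniform guess.

```python
# Sketch of "use the PR values of the last iteration as input"
# on a made-up four-page graph.
def pagerank(links, start, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate to convergence; return the ranks and the pass count."""
    n = len(links)
    pr = list(start)
    for i in range(1, max_iter + 1):
        new = [(1 - d) / n] * n
        for page, outs in links.items():
            share = pr[page] / len(outs)
            for target in outs:
                new[target] += d * share
        delta = max(abs(a - b) for a, b in zip(new, pr))
        pr = new
        if delta < tol:
            return pr, i
    return pr, max_iter

old_links = {0: [1], 1: [2], 2: [0], 3: [0]}
new_links = {0: [1], 1: [2], 2: [0], 3: [2]}   # one link changed today
uniform = [0.25] * 4

old_pr, _ = pagerank(old_links, uniform)       # yesterday's full run
_, cold = pagerank(new_links, uniform)         # recompute from scratch
_, warm = pagerank(new_links, old_pr)          # warm-start from old ranks
```

On this toy graph the warm start reaches the same tolerance in fewer passes than the cold start, which is the whole appeal of a continuous update.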
Using the PR values of the last iteration as input and just updating the linking structure would lead to an almost continuous update.
Exactly my point...the last complete update was a long time ago. They are just using their "old" update.
8 billion pages in one day? Questionable at best. Why would they be using two indexes (e.g. the supplemental one) if it were easy to do one iteration in one day?
Again, these domains were not assisted by any other domains and were not registered early in the year. I'm talking about domains newly registered as of September or so. About 11 different domains with different content and linking structures, and none suffered the sandbox effect.
Exactly my point...the last complete update was a long time ago. They are just using their "old" update.
"Continuously" does not necessarily mean that the final (stable) PR value is calculated for each configuration. Even a "continuous" PR update with a few (2-3) iterations a day would be better than the old monthly update, i.e. it results in a faster propagation of PR.
8 billion pages in one day? Questionable at best.
The calculation itself (for one iteration) isn't very time consuming. The problematic point might be getting the data.
The calculation itself (for one iteration) isn't very time consuming.
So, approximately, how is the number of calculations related to the number of pages, and how many calculations are currently required to complete one iteration? Also, how much memory is required?
I still have a hard time believing that an iteration can be completed in less than a day.
Kaled.
Nobody ever stated that calculations were done on the full data set. Also, remember that what you call a "true calculation" does not imply one fixed "true" value for a page - instead, it is an approximation of that true value, as the calculation comes closer with each iteration but never reaches the true value. Even if it did, the underlying data would change along the way. There's a whole body of mathematics dealing with such issues.
Let's say the real issue is something else: How much data and how many iterations do you really need in order to compute values that are "good enough"? Perfect doesn't exist on this scale, it's all tolerances / probabilities / deviations / quality levels / whateveryoucallit.
So, the task is to minimize the use of resources while maximizing the output quality. That's a pretty standard issue in the field of operations analysis, and you could probably express it in algebra if you wanted to, but computers tend to use brute force instead, as that's what they're made for :)
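The "good enough" trade-off can be put in rough numbers: with damping factor d, each pass shrinks the remaining error by about a factor of d, so the number of passes grows only with the log of the tolerance. A back-of-envelope sketch (d = 0.85 is the published damping factor; the tolerance values are arbitrary):

```python
import math

# Passes needed until 0.85**passes drops below each tolerance:
# roughly log(tol) / log(0.85).
d = 0.85
passes = {tol: math.ceil(math.log(tol) / math.log(d))
          for tol in (1e-2, 1e-4, 1e-8)}
```

Loose tolerances are cheap, while chasing a near-"perfect" value costs several times as many full passes over the data - exactly the resources-versus-quality trade-off described above.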
>> "Continuously" does not necessarily mean that the final (stable) PR value is calculated for each configuration
Indeed, a continuous calculation implies that there is no final, or stable value, as the calculation is always in progress (and new data is most probably fed in continuously as well).
Ideally, we'll see a slower (more continuous, heh) development in ranking for sites instead of the update jumps known from the past.
Still, when major changes occur (such as doubling the index size(*)), you would risk that the continuous method couldn't cope with all the new material in a satisfying manner. Then you would perhaps find it worthwhile to do a real full-scale PR calculation, just like in the old days, in order to get a new (set of) reference point(s).
For quality assurance, this should be done once in a while anyway - say, one or two times a year or so - just to make sure that the continuous values don't degenerate too much (e.g. if calculation speeds can't keep up with new data being added or something).
... come to think of it; if you clean out the sand from your eyes and put away your "great google conspiracy" glasses (and the 32 bit lenses too) - that's quite an interesting post, huh?
;)
-----
(*) ... which, btw, didn't happen overnight - you just don't do that. It's been proceeding since February 2004 [webmasterworld.com]. What happened overnight was that the figure on the Google front page changed.
So, approximately, how is the number of calculations related to the number of pages, and how many calculations are currently required to complete one iteration? Also, how much memory is required?
Assuming that the standard iteration scheme (Jacobi) is used, the number of operations per page should be less than 100, yielding less than 800 billion operations per iteration. This shouldn't be the problem.
A rough estimate for the memory would be #pages * 18 bytes (2 doubles + 1 integer), i.e. about 144 GB - this is a problem.
However, using block techniques and parallelization should solve the problem.
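The arithmetic behind those two estimates, using the poster's own figures (8 billion pages, at most 100 operations and 18 bytes per page):

```python
pages = 8_000_000_000

# Operation count per Jacobi sweep (upper bound from the post).
ops = pages * 100        # 8e11 operations per iteration

# Memory for the rank vectors alone: 2 doubles + 1 integer per page.
ram = pages * 18         # bytes; comes to 144 GB
```

Even at 2004-era speeds, 8 x 10^11 simple operations are tractable, while 144 GB just for the rank data (before any link structure) explains why memory, not arithmetic, is called the real problem.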
A rough estimate for the memory would be #pages * 18 bytes (2 doubles + 1 integer)
I would have thought that a database would be required that contained, for every page, a list of links on that page. That would increase the memory requirements considerably. Additionally, I'm not convinced that this problem lends itself readily to distribution (amongst many computers) but parallel processing (many CPUs sharing memory) might be ok I guess. Also, even if an iteration can be performed quickly, continuous updating of the database could be a real problem.
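The link database concern can be quantified under assumed averages (10 outbound links per page and a 4-byte page id are illustrative figures of mine, not known values):

```python
pages = 8_000_000_000
avg_outlinks = 10        # assumed average, for illustration only
bytes_per_id = 4         # one 32-bit page id per stored link

link_store = pages * avg_outlinks * bytes_per_id   # 320 GB
```

That dwarfs the rank vectors themselves, so the adjacency data would indeed dominate memory - one reason it would likely be streamed from disk or sharded across machines rather than held in RAM on a single box.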
Google like to use networks of cheap PCs but this problem seems comparable to global weather modelling to me and that is typically the domain of the super-computer.
Kaled.
The point is that there is no hint that this behaviour is caused by a change to a continuous PR update, or by an increase in PR calculation time (from the growing number of pages).
The 18 bytes per page was a very rough lower bound for the memory, because I neglected the transition matrix. The real memory requirements are even higher, i.e. the problem is even worse.
Database access would heavily slow down the calculation.
I think they are going mostly off old (pre-March 2004) data, and just doing their best to infuse this data with Freshbot type data. Alternately, they may be continuing to calculate PageRank on a smaller dataset.
Also evident is they are probably not calculating PageRank on "supplemental results". I'll bet these supplemental results are post-March 2004, but do not contain enough criteria to make them part of a competitive keyword dataset.
Is any of this starting to make sense?
I truly think I am on to something here.
I'm not too sure about that. Difficulty, that is. As implied in my post above, it's not unreasonable to believe that they could find perfectly good reasons not to perform it, even though they could do it. Still, once in a while they should probably do it anyway.
One page of my site shows toolbar PR 0 yet it's in the Google directory with a PR of 3. Also, another page has a toolbar PR of 6 but is in the Google directory with a PR of 4.
Given that it is widely believed that PR has been devalued over the last year or so, it seems likely that Google are less concerned now with keeping it up to date. Furthermore, the technology to calculate it was presumably designed with monthly dances in mind; therefore, it is likely that iterations take several days or even weeks to complete.
Kaled.