Forum Moderators: open

Message Too Old, No Replies

Personalizing Pagerank

An Analytical Comparison of Approaches

         

vitaplease

5:12 am on Jun 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



An Analytical Comparison of Approaches to Personalizing PageRank [stanford.edu]

Taher Haveliwala, Sepandar Kamvar and Glen Jeh

Personalising Pagerank, as far as I understand it, means not taking the normal probability distribution for visiting a random page (random walk) when a surfer is not following an outlink, but assigning some bias or preference to certain kind of pages.

Topic-Sensitive PageRank, Modular PageRank and BlockRank are compared briefly in "vectorised terms".

also by Haveliwala and Kamvar:

The Condition Number of the PageRank Problem [stanford.edu]

..This statement has implications for the accuracy to which PageRank can be computed, currently and as the web scales. Furthermore, it provides a simple proof that, for values of $c$ that are used by Google, small changes in the link structure of the web do not cause large changes in the PageRanks of pages of the web

msgraph

3:52 pm on Jun 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Great find vitaplease

It is good to see a paper that lists the top 3 Stanford proposals for adding some personalized PageRank.

I'm a little disappointed that they did not into more depth. I would've liked to have seen a list of real world examples for each topic. I mean you could go into each separate paper to dig some of these out but since they went ahead to put this together, but why not add more? Hopefully this is just a small draft and we'll see an extended and updated version in the future.

doc_z

8:23 pm on Jun 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Indeed, that paper just give a very short review about three different approaches to calculate a reduced basis of personalized PageRank vectors:

- Topic-Sensitive PageRank is using a small set (e.g. 16) of topic related PR vectors as basis (i.e. the teleportation probability is topic dependend)

- Modular PageRank is using a set of highly ranked pages as basis

- BlockRank is using local blocks (e.g. stanford.edu) as basis. (i.e. the teleportation probability is 0 for pages outside the block and uniform distributed for the pages inside the block)

As already said, this paper doesn't go into depth. (However, thanks for mentioning that paper.)

Also, I have some problems with a number of points within this paper:

- The authors are refering to the case d < 1 (d: damping factor, named c in that paper) as the computation of eigen vectors. This is not the best point of view and it's causing unnecessary problems, e.g. removing danging pages. Seeing PR calculation as solving a set of linear equations by matrix inversion is mathematically, computationally and practically the better way.

- The BlockRank techniques described in this paper, doesn't correspond to the description given in the paper by Kamvar, Haveliwala, Manning and Golub. In the original paper links going outside the block are neglected (local PR). (However, this wouldn't correspond to personalized PageRank.)

Also, I'm missing important references for the block technique.

In my opinion (unfortunately I haven't the time so far to study all papers in detail), BlockRank can just be used to compute PR faster (especially due to the possibility of parallelize the algorithm). Topic-sensitive and modular PageRank are more interesting for finding extentions/improvements of the current PR algorithm.