Topic Sensitive Page Rank

from WWW2002

         

JamesR

4:56 pm on May 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks to msgraph for alerting us to these new WWW2002 papers [webmasterworld.com].

A possible evolution of Google and/or its technology: Topic Sensitive PageRank [www2002.org]

It is similar to the Hilltop algorithm [webmasterworld.com] we discussed earlier, but takes it further. This approach aims to make Google more relevant by splitting PageRank into topic-specific scores and using the ones that match the search query. It is an effort to keep high-PR sites from showing up on irrelevant searches.

Our approach to biasing the PageRank computation is novel in its use of a small number of representative basis topics, taken from the Open Directory

That's novel?
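To make the "biasing" idea concrete, here is a minimal sketch, assuming the bias works by restricting the random-jump (teleport) step of PageRank to a topic's seed pages, such as the pages listed under one Open Directory category. The function name, the toy link graph, and the seed sets are all made up for illustration; this is not the paper's code.

```python
def biased_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """Power iteration where the random jump lands only on topic_pages.

    links: dict mapping a page to the list of pages it links to.
    topic_pages: non-empty set of seed pages for one topic (hypothetical
    stand-in for the pages listed in one directory category).
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    # Teleport vector: uniform over the topic's seed pages, zero elsewhere.
    teleport = {p: (1.0 / len(topic_pages) if p in topic_pages else 0.0)
                for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: hand its rank back to the topic seed set.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        rank = new_rank
    return rank


# Toy graph with made-up page names; one rank vector per topic.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
sports = biased_pagerank(links, {"a"})
computers = biased_pagerank(links, {"d"})
```

In this toy graph page "d" has no inbound links, so it only earns rank when the topic's seed set includes it. The same link graph therefore yields a different ranking per topic, which is the point of keeping several vectors instead of one.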

JamesR

5:04 pm on May 14, 2002 (gmt 0)


Other interesting bits:

For instance, the user's bookmarks and browsing history could be used in selecting the appropriate topic-sensitive rank vectors.

or the Google toolbar? :)

At the end of the paper, the authors acknowledge the "adversarial editors" factor in the ODP, but the algo still seems way too reliant on this one directory. It is an incomplete data pool on which to base an algorithm, IMO. Yet I can see that it may be the best option out there.

As I understand it, all this algo does is add one more step on top of the current PageRank system: one more inbound link (from the ODP) is used to determine the topic. After the topics are assigned, PR is calculated as usual.
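For what it's worth, the query-time side of the idea can be sketched as a weighted blend: each topic has its own precomputed rank vector, and the query (or the user's context, such as bookmarks) supplies a weight per topic. The snippet below is only an illustration under that assumption; the function name, page names, rank values, and weights are all hypothetical.

```python
def blend_topic_ranks(topic_ranks, topic_weights):
    """Combine per-topic rank vectors into one score per page.

    topic_ranks: dict of topic -> {page: rank under that topic}.
    topic_weights: dict of topic -> how strongly the query or user
    context matches that topic (assumed to come from some classifier).
    """
    total = sum(topic_weights.values())
    scores = {}
    for topic, weight in topic_weights.items():
        for page, rank in topic_ranks[topic].items():
            # Normalize the weights so they behave like probabilities.
            scores[page] = scores.get(page, 0.0) + (weight / total) * rank
    return scores


# Made-up numbers: two topics, three pages.
topic_ranks = {
    "sports": {"page_a": 0.6, "page_b": 0.1, "page_c": 0.3},
    "computers": {"page_a": 0.1, "page_b": 0.7, "page_c": 0.2},
}
# Suppose the query looks mostly like a "computers" query.
topic_weights = {"sports": 0.2, "computers": 0.8}
scores = blend_topic_ranks(topic_ranks, topic_weights)
best_page = max(scores, key=scores.get)
```

With these weights page_b comes out on top even though page_a dominates the "sports" vector, which is how a high-PR page in one topic gets held back on an unrelated query.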

bird

12:09 am on May 18, 2002 (gmt 0)


Isn't this the same paper we discussed here [webmasterworld.com]?

ggrot

1:22 am on May 18, 2002 (gmt 0)


I wonder if Google has ever considered starting up their own directory?

brotherhood of LAN

1:35 am on May 18, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I would love Google to have their own directory... but the way things are going for them, you can't blame them for relying on DMOZ (a freebie) for the directory results. If DMOZ wasn't part of Google it would fast disappear IMO :)

msgraph

1:52 am on May 18, 2002 (gmt 0)


Automatic categorization is a hot topic. The only problem is that no one has really pulled it off well.

Even one of Google's own software engineers wrote a paper on it: The Use of BiGrams to Enhance Text Categorization [serve.com] (1.1 MB PDF)
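As a rough illustration of what bigram features look like (not the method from the linked paper, whose details are not reproduced here), the sketch below counts single words plus adjacent word pairs in a piece of text. The function name and the sample text are made up.

```python
from collections import Counter


def unigram_bigram_features(text):
    """Count single words and adjacent word pairs in a piece of text."""
    tokens = text.lower().split()
    # Pair each token with the one that follows it.
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


features = unigram_bigram_features("open directory project open directory")
```

A categorizer fed these counts can tell "open directory" apart from pages that merely contain "open" and "directory" somewhere, which single-word counts alone cannot do.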

[edited by: msgraph at 1:15 pm (utc) on May 22, 2003]

msgraph

3:20 pm on May 21, 2003 (gmt 0)


An extended version has now been released.

Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search [dbpubs.stanford.edu]

vitaplease

3:17 pm on May 22, 2003 (gmt 0)


Thanks for the update on the paper update, msgraph.

A quick "spot the differences" suggests the main change is the addition of chapter 6.

Basically, it discusses new offline and query-time processing techniques, such as Quadratic Extrapolation and other recent speed-up algos.

Has anyone else found something interesting?