The problem of finding "similar" objects arises in many applications, and many domain-specific techniques have been developed, e.g., matching text across documents or computing overlap among item-sets. We propose a complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects. Effectively, we compute a measure that says "two objects are similar if they are related to similar objects." For a given domain, our general technique can be combined with other domain-specific similarity measures. The formalization and computation of our similarity measure, called "SimRank", is similar in spirit to previous recursive algorithms (such as PageRank) for computing importance of Web pages, although ours is more complex and expensive since we must consider object-pairs instead of single objects. We suggest techniques for efficient computation, and we provide experimental results on two application domains showing the computational feasibility and effectiveness of our approach.
[abstract pulled from link below]
Jeh, Glen; Widom, Jennifer. SimRank: A Measure of Structural-Context Similarity. Technical Report, Computer Science Department, Stanford University, 2001.
dbpubs.stanford.edu:8090/pub/2001-41
(check out the PDF)
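
For anyone who wants to play with the idea before reading the PDF, here is a rough sketch of the "two objects are similar if they are related to similar objects" recursion in Python. It is the naive all-pairs iteration, not the efficient techniques the abstract mentions. The function name `simrank`, the `decay` and `iterations` parameters with their default values, and the little professor/student graph are my own assumptions for illustration, not taken from the report.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive SimRank sketch over a dict mapping node -> list of in-neighbors.

    Assumes every node that appears as an in-neighbor is also a key of the
    dict. `decay` and `iterations` are illustrative defaults.
    """
    nodes = list(in_neighbors)
    # Start with each object similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # No incoming relationships means no structural evidence.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, damped by decay.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: Univ points to two professors,
# and each professor points to one student.
graph = {
    "Univ": [],
    "ProfA": ["Univ"],
    "ProfB": ["Univ"],
    "StudentA": ["ProfA"],
    "StudentB": ["ProfB"],
}
scores = simrank(graph)
```

In this toy graph the two professors come out similar because the same node points to both, and the two students inherit a weaker similarity from their professors. Note the loop touches every pair of objects on every pass, which is the "object-pairs instead of single objects" cost the abstract warns about.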