
Ask - Teoma Forum

Teoma Development

 2:35 pm on Dec 26, 2005 (gmt 0)

I'm a programmer and webmaster, and I'd like to find in-depth information about Teoma and how it works. Does anyone know where I can find information on Teoma's algorithm, ranking, source code, etc.?



 8:12 pm on Dec 26, 2005 (gmt 0)

Teoma has been rather willing to share information about their algorithm -- which uses insights from Jon Kleinberg's HITS algorithm to identify "web communities". Teoma also conquered quite a technical challenge in finding a way to give rapid results, building those community clusters on the fly. I've often wondered if the scalability of that solution isn't a major reason why AJ/Teoma hasn't taken on Google in a big way.
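For anyone who hasn't seen HITS before, here is a minimal sketch of the hub/authority iteration as Kleinberg published it. To be clear, this is plain textbook HITS and not Teoma's code; the function names (`hits`, `normalize`) and the dict-of-lists graph format are my own assumptions for illustration.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (leave all-zero vectors alone)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Textbook HITS on a small link graph.

    links: dict mapping a page to the pages it links to (hypothetical format).
    Returns (hub, authority) score dicts.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub score: sum of authority scores of the pages linked to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, ()))
            for p in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Two pages pointing at the same target: "c" ends up as the authority,
# while "a" and "b" are equally good hubs.
hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
```

Pages with high authority scores that are pointed to by the same set of strong hubs are roughly what gets called a "community" in this family of algorithms.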

Mike Grehan has a very informative interview with Paul Gardi, SVP Search at Ask Jeeves/Teoma online:

At the end of the interview there's a link to a free pdf with more information about HITS and linkage based algorithms.

<fixed spelling>

[edited by: tedster at 5:12 pm (utc) on Dec. 30, 2005]


 12:46 pm on Dec 27, 2005 (gmt 0)


Yes, I understand they base the majority of the ranking on the HITS principle. Is there anywhere that provides information on how they managed to improve upon it?


 5:56 pm on Dec 30, 2005 (gmt 0)

As I understand it, the main algorithm can be understood as HITS, modified by CLEVER [almaden.ibm.com], further modified by work done in the DISCOWEB [cse.lehigh.edu] project. That last link offers some good detail on the math involved, as well as further source papers in the footnotes.

I also always assumed that there was a pinch of HILLTOP [cs.toronto.edu] thrown in to limit manipulation by affiliated websites, but I can't find confirmation anywhere. Teoma's "topic distillation" seems to be more of an alternative to the Hilltop approach.

Beyond that, as I said earlier, the big deal for Teoma was creating a way to retrieve and cluster the results with a runtime measured in seconds rather than minutes -- but that is more operational than algorithmic. I don't think anything like exact source code is publicly available.
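To give a feel for why runtime matters here: in Kleinberg's original HITS, the link graph is assembled per query, starting from a "root set" of text-search hits and expanding it by one link hop into a "base set". A rough sketch of that step follows. It illustrates the published HITS procedure only, not how Teoma actually does it; `build_base_set`, `induced_links`, the `out_links`/`in_links` lookups and the `max_in_per_page` cap are all hypothetical names chosen for the example.

```python
def build_base_set(root_set, out_links, in_links, max_in_per_page=50):
    """Expand a query's root set by one link hop, as in Kleinberg's HITS.

    root_set:  pages returned by an ordinary text search (assumed given).
    out_links: dict page -> pages it links to (hypothetical lookup).
    in_links:  dict page -> pages linking to it (hypothetical lookup).
    """
    base = set(root_set)
    for page in root_set:
        # Add everything the root page points to.
        base.update(out_links.get(page, ()))
        # Add pages pointing at the root page, capped so that one very
        # popular page cannot blow up the size of the graph.
        base.update(list(in_links.get(page, ()))[:max_in_per_page])
    return base


def induced_links(base, out_links):
    """Keep only the links whose source and target are both in the base set."""
    return {
        page: [dst for dst in out_links.get(page, ()) if dst in base]
        for page in base
    }


# Tiny example: "r" is the only search hit, "x" and "y" are its neighbours,
# and "z" is unrelated so it never enters the graph.
out_links = {"r": ["x"], "y": ["r"], "z": ["q"]}
in_links = {"r": ["y"]}
base = build_base_set({"r"}, out_links, in_links)
graph = induced_links(base, out_links)
```

Every query needs its own link lookups and its own round of score iteration on the resulting graph, which is the part that is expensive to do while the user waits.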

Another good starting point is this pdf, also from Mike Grehan:

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved