
Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Academic Papers by Googlers

 3:20 am on Apr 25, 2008 (gmt 0)

This may not be everybody's cup of tea, I realize, but for those who are interested, there's a very solid online collection of academic papers written by Googlers.


Categories with the Current Number of Papers:

Algorithms and Theory (65)
Artificial Intelligence and Data Mining (31)
Audio, Video, and Image Processing (36)
Distributed Systems and Parallel Computing (71)
Education (1)
Human-Computer Interaction (32)
Hypertext and the Web (12)
Information Retrieval (33)
Machine Learning (53)
Natural Language Processing (35)
Operating Systems (2)
Science (11)
Security, Cryptography, and Privacy (36)
Software Engineering (13)



 3:58 am on Apr 25, 2008 (gmt 0)

Two examples of some gems I'm appreciating from that collection:

Finding Near-Duplicate Web Pages: A Large-Scale Evaluation of Algorithms by Monika Henzinger

This paper was not new to me - I believe Marcia pointed it out a while ago. It really opened my eyes to the challenge of attributing a document properly and filtering out the secondary versions. Some of the urls that hide behind "omitted results" links owe their hiding place to this kind of logic.
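To make the near-duplicate idea concrete, here is a rough sketch in Python. It is not the method from the paper and certainly not what Google runs; it just compares overlapping word windows ("shingles") between pages and flags pairs that share most of them. The function names, the window size of 3 and the 0.5 threshold are all my own assumptions for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return sorted (id, id) pairs whose shingle similarity >= threshold.

    docs maps a document id (for example a url) to its text.
    The threshold and k are illustrative assumptions, not tuned values.
    """
    sets = {doc_id: shingles(text, k) for doc_id, text in docs.items()}
    ids = sorted(sets)
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if jaccard(sets[first], sets[second]) >= threshold:
                pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    pages = {
        "a": "the quick brown fox jumps over the lazy dog near the river bank",
        "b": "the quick brown fox jumps over the lazy dog near the river shore",
        "c": "google publishes academic papers on many topics",
    }
    print(near_duplicates(pages))
```

Comparing every pair like this would never scale to a web-sized index, which is why the approaches the paper evaluates work from compact fingerprints of each page rather than the full text.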

Structured Models for Fine-to-Coarse Sentiment Analysis by Ryan McDonald et al.

Sentiment Analysis is a particular interest of mine. It's a kind of semantic processing that works to determine the "sentiment" of a document. That can mean many things, but two key areas for Google would be where a document falls on a positive to negative scale in its approach to a topic -- or where it falls on a spectrum of subjective (opinion-based) to objective (fact-based).

You can see how Google would be very interested in this kind of challenge. In fact, I thought I saw signs of Sentiment Analysis in the first page results last year. But I asked some Google staff about it at PubCon and was told it's not currently in use - and that it is definitely a "hard problem." If you just think about an algorithm trying to make sense of irony, you can quickly appreciate how hard the problem can be.
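For anyone who wants to see why irony is such a stumbling block, here is a deliberately naive sketch in Python of the positive-to-negative scale. It just counts words from two tiny lists that I made up for this post; it has nothing to do with the structured models in the paper, and the names POSITIVE, NEGATIVE and polarity are hypothetical.

```python
import re

# Hypothetical, tiny word lists for illustration only.
POSITIVE = {"great", "love", "excellent", "good", "helpful"}
NEGATIVE = {"bad", "terrible", "hate", "broken", "useless"}


def polarity(text):
    """Score text from -1.0 (all negative words) to +1.0 (all positive words).

    Returns 0.0 when the text contains no words from either list.
    """
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


if __name__ == "__main__":
    print(polarity("I love this excellent tool"))
    print(polarity("terrible, broken and useless"))
    # Irony: a human reads this as a complaint, the word counter does not.
    print(polarity("Oh great, the site is down again. Just what I needed."))
```

The last sentence scores as fully positive because the only listed word it contains is "great". A person reads it as a complaint straight away, and that gap is a small taste of the "hard problem."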

Those two words currently being tossed around by top staff - "diversity" and "serendipity" - certainly could incorporate some sentiment factors in the future.
