Fallacy of a Voting System to Identify Quality Content

At SEMconf I asked the SE reps how they identify quality content. We have heard over and over that to improve ranking you should just add quality content to your site. The reply across the board was that quality content is identified via a voting system: one site votes for another by linking to it (the traditional philosophy Google PR is based on).

The concept originates in academia, where it works reasonably well. For instance, if a research paper cites several outside sources, it can be assumed that those sources have some degree of quality for the subject at hand. The same theory is then applied to the web: one website links to another, and the link is counted as a vote.
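To make the "link as a vote" idea concrete, here is a minimal sketch in Python. It is plain inbound-link counting, not Google's actual PR calculation, which also weights each vote by the score of the site casting it. The site names and the link graph are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical link graph (an assumption for illustration):
# each key is a site, each value is the list of sites it links to.
links = {
    "site-a.example": ["site-c.example"],
    "site-b.example": ["site-c.example", "site-a.example"],
    "site-c.example": ["site-a.example"],
    "site-d.example": ["site-c.example"],
}

def count_votes(link_graph):
    """Count each outbound link as one vote for the site it points to."""
    votes = Counter()
    for source, targets in link_graph.items():
        for target in set(targets):   # at most one vote per source/target pair
            if target != source:      # a site cannot vote for itself
                votes[target] += 1
    return votes

votes = count_votes(links)

# Sites ordered from most votes to fewest.
ranking = [site for site, _ in votes.most_common()]
```

Note that the tally records only that a link exists. Nothing in it says why the link was given.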

The problem, even in academia, is the assumption that the writer has all available sources on a topic at his disposal, is familiar with all of them, and has honestly and accurately selected the best sources to quote in a given paper. As you can see, a lot of assumptions are being made here.

Now, when that system is transferred to the Internet and used as the core philosophy behind a ranking algorithm, even more flawed assumptions emerge:

Flawed Assumption 1 - Websites are not research papers. They serve many more purposes than just presenting an argument while quoting the best sources available. Sites may be informative, entertaining, commercial, hobby-related, or just plain nonsense.

Flawed Assumption 2 - Websites choose to link to other sites for a variety of reasons including:
a. link exchange agreement
b. link is purchased by another site
c. link is a personal favor from the webmaster
d. link is given to a site owned by the same company
e. link is given because the site does give quality information on the topic

Flawed Assumption 3 - Websites choose not to link for various reasons including:
a. there are no financial benefits involved
b. the site wants to retain its visitors
c. the site does not want to pass visitors or PR to its competition
d. company protocol does not allow external linking

The interesting thing is that the link voting system rests on the assumption that sites link for reason 2e above. However, we can see that there are many reasons (many of them not listed above) why a site would or would not link. In fact, 2e is rarely the reason (except maybe in the case of online research papers :)). Therefore, building an entire ranking and relevancy algorithm on this assumption is extremely flawed.
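The gap between "a link exists" and "a link was given for reason 2e" can be sketched as follows. Everything here is a hypothetical toy: the site names, the links, and especially the reason labels, which mirror 2a-2e above and which a real search engine has no direct way to observe.

```python
from collections import Counter

# Hypothetical inbound links: (linking site, target site, reason for the link).
# The reason labels are an assumption for illustration; they correspond to
# items 2a-2e above and are not visible to a crawler.
inbound = [
    ("s1.example", "shop.example", "exchange"),     # 2a
    ("s2.example", "shop.example", "purchased"),    # 2b
    ("s3.example", "shop.example", "favor"),        # 2c
    ("s4.example", "shop.example", "same_owner"),   # 2d
    ("s5.example", "guide.example", "quality"),     # 2e
    ("s6.example", "guide.example", "quality"),     # 2e
]

def tally(links, only_reason=None):
    """Count votes per target, optionally keeping only links given for one reason."""
    votes = Counter()
    for _source, target, reason in links:
        if only_reason is None or reason == only_reason:
            votes[target] += 1
    return votes

raw = tally(inbound)                              # what a pure link count sees
editorial = tally(inbound, only_reason="quality") # what the 2e assumption implies
```

In this toy data the raw count puts shop.example ahead of guide.example, while counting only the 2e links reverses the order. A pure vote count cannot tell the two cases apart.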

Granted, it is hard to come up with a better solution. Teoma boasts of its new community identification technology (term vectors), but it is based on the same flawed system, even if on a more narrowly defined scale. We all know on-page criteria can easily be abused and are therefore not a viable alternative to the voting system. Yet we have found that link criteria can also be abused, albeit not as easily or readily.

Conclusion:
The current identification of quality content via a link voting system is essentially flawed because it does not give enough weight to the reasons why one site would or would not link to another. The assumption that site A contains quality content because of a relatively high link count for a certain search term is misleading, given the many reasons the sites linking to A may have for doing so.
