Forum Moderators: not2easy

Fallacy of a Voting System to Identify Quality Content

         

JamesR

9:33 pm on May 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I asked the SE reps at SEMconf how they identify quality content. We have heard over and over again that to improve ranking, you just add quality content to your site. The reply across the board was that quality content is identified via a voting system: one site votes for another by linking to it (the traditional philosophy Google PR is based on).

The concept originates in academia where it kind of works. For instance, if one research paper quotes several outside sources, it can be assumed that those outside sources have a degree of quality for the subject at hand. The theory is then used with one website linking to another and counting the link as a vote.
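To make the "link as a vote" idea concrete, here is a minimal sketch in Python. It is not any search engine's actual implementation. The site names and link graph are made up, and the damping value of 0.85 is an assumption borrowed from common descriptions of PageRank. The first function just counts inbound links as votes; the second is a simplified PageRank-style iteration in which a vote from a highly ranked site counts for more.

```python
# Hypothetical link graph: each key links out to the sites in its list.
links = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "a.example"],
}


def count_votes(links):
    """Count inbound links per site, treating every link as one vote."""
    votes = {site: 0 for site in links}
    for targets in links.values():
        for target in targets:
            votes[target] = votes.get(target, 0) + 1
    return votes


def rank_scores(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scoring (an illustrative assumption).

    Each site splits its current score evenly among the sites it links to,
    so a vote from a high-scoring site is worth more than one from a
    low-scoring site.
    """
    sites = set(links) | {t for targets in links.values() for t in targets}
    n = len(sites)
    scores = {site: 1.0 / n for site in sites}
    for _ in range(iterations):
        new_scores = {site: (1.0 - damping) / n for site in sites}
        for source in sites:
            targets = links.get(source, [])
            if targets:
                share = damping * scores[source] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A site with no outbound links spreads its score evenly.
                share = damping * scores[source] / n
                for site in sites:
                    new_scores[site] += share
        scores = new_scores
    return scores


votes = count_votes(links)
scores = rank_scores(links)
```

Note that neither function knows anything about why a link exists. Every link in the graph is treated as an endorsement.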

The problem in academia is the assumption that the writer has all available sources on a topic at his disposal, is readily familiar with all sources, and has honestly and accurately selected the best source to quote in a given paper. As you can see, there are a lot of assumptions that are made here.

Now, when that system is transferred to the Internet and used as the core philosophy behind a ranking algorithm, even more flawed assumptions emerge:

Flawed Assumption 1 - Websites are not research papers. They serve many more purposes than just presenting an argument while quoting the best sources available. Sites may be informative, entertaining, commercial, hobby sites, or just plain nonsense.

Flawed Assumption 2 - Websites choose to link to other sites for a variety of reasons including:
a. link exchange agreement
b. link is purchased by another site
c. link is a personal favor from the webmaster
d. link is given to a site owned by the same company
e. link is given because the site does give quality information on the topic

Flawed Assumption 3 - Websites choose not to link for various reasons including:
a. there are no financial benefits involved
b. the site wants to retain its visitors
c. the site does not want to pass visitors or PR to its competition
d. company protocol does not allow external linking

The interesting thing is that the link voting system rests on the assumption that sites link for reason 2e above. However, we can see that there are many reasons (many not stated above) why a site would or would not link. In fact, 2e is rarely the reason (except maybe in the case of online research papers :)). Therefore, to build an entire ranking and relevancy algorithm based on this assumption is extremely flawed.
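A small sketch of the gap between the two counts, again in Python. The linking sites and their motives are invented for illustration, and the motive letters refer to list 2 above. A real crawler sees only the link, not the motive, which is exactly why the raw tally is all a simple vote count has to work with.

```python
from collections import Counter

# Hypothetical inbound links to one site. Each entry is
# (linking site, motive), where the motive letter refers to list 2 above:
# a = exchange, b = purchased, c = favor, d = same owner, e = quality.
inbound_links = [
    ("partner.example", "a"),
    ("directory.example", "b"),
    ("friend.example", "c"),
    ("sister-site.example", "d"),
    ("other-partner.example", "a"),
    ("reviewer.example", "e"),
]


def raw_votes(inbound):
    """What a plain link count sees: every link is one vote."""
    return len(inbound)


def quality_votes(inbound):
    """Votes that match the assumption behind the system (motive 2e only)."""
    return sum(1 for _, motive in inbound if motive == "e")


def votes_by_motive(inbound):
    """Break the tally down by motive letter."""
    return Counter(motive for _, motive in inbound)


total = raw_votes(inbound_links)
genuine = quality_votes(inbound_links)
breakdown = votes_by_motive(inbound_links)
```

With this made-up data the site is credited with six votes, but only one of them reflects a judgment about the quality of its content.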

Granted, it is hard to come up with a better solution. Teoma boasts of their new community identification technology (term vectors), but it is based on the same flawed system, even if on a more defined scale. We all know on-page criteria can easily be abused and are therefore not, in themselves, a viable alternative to the voting system. Yet we have found out that link criteria can also be abused, albeit not as easily or readily.

Conclusion:
The current identification of quality content via a link voting system is essentially flawed because it does not give enough weight to the reasons why one site would or would not link to another. The assumption that site A contains quality content because it has a relatively high link count for a certain search term is misleading, because the sites linking to A may be doing so for many different reasons.

See also:

Reasons Not to Link [webmasterworld.com]
One Way Linking [webmasterworld.com]

digitalghost

9:58 pm on May 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I never once assumed that quality content was identified by link votes and I'm not sure why anyone would. Popular content is linked to frequently and in some cases quality content is popular. Controversial content gathers links quickly but I've been to enough controversial sites to know that controversy isn't the hallmark of quality.

Quality is subjective. The web is a popularity contest as far as Google is concerned. They can talk about quality all they like, but if I type in a search phrase and I get about.com as my #1 result then we're not exactly talking about quality, are we? All Google did was shunt my query off to another site that will send me somewhere else, when all I really want is to find info on classic literature.

JamesR

10:03 pm on May 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"I never once assumed that quality content was identified by link votes and I'm not sure why anyone would."

My point exactly. :)

The hard part isn't really noticing the problems. The challenge is coming up with a solution, and for all my criticism, I admit I don't have one.

I hope that the average webmaster and many who are new on the web will identify the real issues when they hear "content is king" or "add content to your site to get higher rankings".

dmorison

10:31 pm on May 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi James,

"Therefore, to build an entire ranking and relevancy algorithm based on this assumption is extremely flawed."

I wouldn't be so harsh as to say extremely flawed. I would buy that a ranking and relevancy algorithm based on those assumptions is becoming less effective as the web becomes a more commercial place.

Remember that those search engines that have become popular were developing their technology back when the web _was_ a large collection of research papers.