Ok, this post stuck in my head like a grain of sand in my shoe, and it just wouldn't go away until I went back and gave it some deeper thought.
So I went back, re-read the article, and thought about it some more. Then I grabbed my GF and got her to read it and give me an opinion, just to make sure my train of thought wasn't going off the deep end.
And our basic conclusion was that this is an incredibly creepy and wrongheaded way of looking at things. Here's a rundown of the major issues we both ended up taking with the article:
Cross-Behaviour Posting
Depending on the topic, I tend to reply/participate in different ways. On some subjects I'll drop a quick single reply and move on, maybe checking in to see how other people responded. Sometimes they respond en masse, sometimes they don't. Other times, I'll get fully involved in a discussion/debate, bouncing posts back and forth as the thread grows. According to the formula they use, this would make me either an "answer guy" (read: trustworthy) or a prolific questioner (read: untrustworthy), depending on which of my threads you sampled. Either rating would be essentially wrong, and the same could apply to a great many people.
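The article doesn't spell out the actual formula, so here's a toy sketch of the kind of ratio-based classifier I mean. The `classify` function and the 2:1 threshold are invented stand-ins for illustration, not MSN's actual method, but they show the sampling problem: the same poster gets opposite labels depending on which of their threads you happen to look at.

```python
def classify(posts):
    """posts: list of 'reply' or 'question' actions by one poster
    in the sampled threads. Toy stand-in for the real formula."""
    replies = sum(1 for p in posts if p == 'reply')
    questions = sum(1 for p in posts if p == 'question')
    if questions == 0:
        return 'answer guy'  # read: trustworthy
    # Arbitrary threshold: twice as many answers as questions = trustworthy
    if replies / questions >= 2:
        return 'answer guy'
    return 'prolific questioner'  # read: untrustworthy

# The same person, sampled from different threads:
drive_by_threads = ['reply', 'reply', 'reply', 'reply']         # quick single replies
debate_threads = ['question', 'reply', 'question', 'question']  # back-and-forth debates

print(classify(drive_by_threads))  # -> 'answer guy'
print(classify(debate_threads))    # -> 'prolific questioner'
```

Same poster, two contradictory "trustworthiness" ratings, purely as an artifact of which threads were sampled.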
The Sample the Researcher Selected is Too Specific
The MSN researcher is building his data set from Usenet postings. And that's a HUGE mistake. Even if I thought the formula he derived for the system was relevant (which I don't), it would be relevant only to Usenet users. Usenet has become a "fringe" of the net, and a very specific type of person participates in it. Assuming that the population at large would act in a similar way in other forums is unjustified. Certain groups of people gravitate to Usenet, certain types to BBSes, certain types blog, some do all three, and the majority of the population participates in none of the above. The old McLuhan adage that "the medium is the message" tends to be quite true, anecdotally at least. Radio commentators act differently from TV commentators, who act differently from print journalists. Making a formula that quantifies a radio commentator's trustworthiness and then trying to apply the same formula to newspaper columnists would be wrong on its face. They're different professions, talking to different audiences, using different methods.
The Formula Would Eliminate Too Much Valuable Data
Any statistical model used to determine relevancy is essentially a negative response algorithm. It is used to weed out irrelevant data. In many fields, this can prove quite helpful, such as in medical research, where you can use it to identify anomalous responses to medications and treatments. It lets you select out the "freak" occurrences.
The problem is that the greater the share of the sample a negative response algorithm eliminates, the less valid it becomes. If such an algorithm eliminates 50% of a statistical population, then there is obviously a very large segment that is simply unquantified, and you can't draw any real conclusions from the sample the algorithm selects.
This MSN research looks to eliminate a far greater proportion of the data. They're trying to use it to base decisions on reading as few opinions as possible. At a guess, I'd say they're looking for 5 to 10% of the sample data to take back for further review. That way, they only have to read a small number of postings. This means eliminating 90%+ of the population as irrelevant.
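To illustrate with made-up numbers (not MSN's actual data or scoring method): a filter that keeps only the top 10% of a population by some "relevance" score produces a slice whose average looks nothing like the population it came from.

```python
import random

random.seed(1)

# Hypothetical "opinion metric" for 1000 posters, normally distributed.
population = [random.gauss(50, 15) for _ in range(1000)]

# Keep only the top ~10% by score; everyone else is deemed "irrelevant".
cutoff = sorted(population, reverse=True)[len(population) // 10]
kept = [x for x in population if x >= cutoff]

eliminated_pct = 100 * (1 - len(kept) / len(population))
print(f"eliminated: {eliminated_pct:.0f}% of the population")
print(f"population mean: {sum(population) / len(population):.1f}")
print(f"kept-sample mean: {sum(kept) / len(kept):.1f}")
```

The retained 10% always scores far above the population average, by construction. Any conclusion drawn from that slice describes the slice, not the population.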
I doubt if I need to explain how wrongheaded that concept is.
That Microsoft is applying this research to their product development cycle tells us more about MS's corporate mentality than it does about how people post in discussions.
It tells us that MS is losing touch with the "human factor" in product development, and wants to reduce customer response, and development issues related to customer response, to a statistical analysis of issues. They don't want to actually read through massive amounts of complaints and suggestions; they want an algorithm that tells them "Response A is statistically relevant, response B isn't, so we'll work on dealing with response A and ignore B."
Again, wrongheaded. It skews their development cycle towards appeasing a certain portion of the population that they have determined is statistically relevant, and ignores all else. If you respond in forums in certain ways, and are responded to in certain ways, then you're relevant. If you fall outside that narrow definition of relevancy, then obviously you aren't relevant and your opinion doesn't matter.
Good customer service is based on everyone's opinion being important. This doesn't mean you cater to everyone's opinion, it just means that you take as broad a sample of the population into consideration as possible.
MS doesn't want to deal with us as individuals. They want to eliminate 90% of our opinions as irrelevant, and work on the rest.