elsewhen - 10:03 pm on Jan 22, 2011 (gmt 0)
Everyone seems to be latching on to Matt Cutts's use of "content farm" and then defining that in all sorts of different ways. I think it would be valuable to unpack the phrase starting with Matt's own, albeit vague, definition:
As “pure webspam” has decreased over time, attention has shifted instead to “content farms,” which are sites with shallow or low-quality content
In this case, the use of the word "farm" is intended to be a derogatory description of content that is produced at a massive scale. The term seems to rely on the notion that the highest quality things are created by craftspeople in small batches vs. a factory approach of mass production.
Let's look at different types of textual content on the web that are all created at scale, starting with the shallowest and lowest quality and moving up the quality ladder:
1) Scraper sites. These are sites that automatically create pages about a particular topic by aggregating content that they find from around the web.
2) Sites with algorithmically-generated content that does add some value. There are many sites that aggregate info in useful ways.... The about-us type pages that aggregate everything possible about a domain can be marginally useful. A site that aggregates all social-media discussion about a particular topic/brand can be convenient too.
3) Sites with thin human-created content. wiki.answers has many pages with one-word answers. Most content by sites in this category is not checked by an editor. Some pages are valuable, many are worthless. Often, pages with questions but without answers pollute the SERPs (although Google seems to have reduced this problem recently).
4) Sites that rewrite existing web content. There is a wide variety of repackaging - simplifying a complex Wikipedia article into layman's terms could be quite valuable, but just rewriting it with synonyms adds nothing.
5) Rev-share content sites. These are the sites that allow users to contribute and market content and they share in the revenue that the pages create. Many of the links to these pages are given by the content producers themselves because they have something to gain, so the incoming link equity is not necessarily a true "editorial vote" that vouches for the quality of the content.
6) Rev-share content sites _with_ editorial oversight. Some rev-share sites have editorial oversight and reject submissions that do not meet their guidelines. Since there is virtually no cost (other than brand devaluation) for them to publish something of questionable quality, the tendency of these types of sites is to reject only the most egregious submissions.
7) Sites that pay upfront: Much of what Demand Media does goes here. They have editors that review all submissions; they reject poor submissions and fire poor contributors.
8) Sites that pay upfront where writers work with dedicated editors. These sites address the potential problem of quality at scale by breaking everything down into small teams - so you can think of this as a bunch of small craft brewers making as much beer as Budweiser.
9) Wikis. There are good arguments that suggest that wikis are much lower on the totem pole, but since they will never be labeled a "content farm" by Google, it doesn't really matter where we put them.
10) Traditional media. NYTimes is a content farm if you use the term "content farm" literally - content produced at scale... Take a look at pictures of their massive newsroom - it looks just like what you would think of as a "content factory" or "content mill."
The key question, then, is whether everything done at scale is necessarily bad. You may disagree with the order of the list above, but there are scores of sites much more in danger than eHow.
One last note: sites have more than one type of content. Some may have many original articles, but also many algorithmically generated pages that pollute the SERPs. eHow has topic pages that add little or no value - those could be at risk. Many people have accused eHow of mirroring the exact flow of other web content (see #4 above) - those pages are probably at risk as well.
I am a little surprised about the eruption that this story has caused. I think it is important to note that the two parties most discussing this issue have their own axes to grind: webmasters and traditional journalists.
Webmasters are upset when other sites rank above theirs. If eHow outranks you in the SERPs, then you are likely to join the army of other webmasters that are similarly outranked and hope that eHow somehow gets axed from the SERPs. Traditional journalists also hate eHow, but for a different reason... Demand Media's compensation model undercuts the high salaries that journalists used to make when newspapers were in their heyday.
I am not defending eHow or Demand - I just don't think any of their original articles are at the level of risk that many people have suggested. If it is the case that a big portion of their traffic goes to topic pages or to content that is rewritten from other web sources, then they should probably be a little worried. There are sites at the bottom of the quality ladder that still appear in the SERPs - Google will likely start there and then move up. I am not even convinced that they will ever get to #3, although the web would probably be better if they did.