|WSJ Writes about Google Quality Raters|
The "All Things Digital" blog at The Wall Street Journal has a 3-part series of interviews on Google's Quality Rater system. John Paczkowski (the All Things Digital blogger) interviews Scott Huffman (Director of Engineering), Matt Cutts (Senior Engineer, Spam Team), and Amit Singhal (Google Fellow, works with the Search Quality team).
The first interview includes a link to the Google Quality Rater manual (the 2007 edition). You can download this and see for yourself how Google's team of 10,000 (yes, ten thousand) contractors review and evaluate websites according to a long list of criteria.
It's the QUALITY of the page, not SEO, that counts. A website can be SEOed to the gills, but the Google Quality Raters evaluate it and if they don't like it, Google engineers write new filters to block it.
Google uses humans, not software, to evaluate sites. The software does the heavy work: the indexing of billions of pages. But "bad" pages creep into the top results: either the filter was poor, the page is spam, or the page uses SEO tricks. So humans look at the top results, evaluate them, and the filters are adjusted. As for bad pages, these are pushed down (Matt Cutts admits that in today's interview).
This means that much of what passes for SEO (keyword density, PageRank, backlinks, etc.) has limited value: it can get a page INTO the index and it can bring a page up in the rankings, but the Google Quality Raters will look at the page and evaluate it not on keyword density, meta tags, etc., but on quality, which means navigational, informational, or transactional criteria.
You can see the implications of this for corporate sites, mom-and-pop sites, and so on. You can also see why Google is doing this (and why this has been such a major secret at Google).
Scott Huffman interview [digitaldaily.allthingsd.com]
Matt Cutts interview [digitaldaily.allthingsd.com]
Amit Singhal interview [digitaldaily.allthingsd.com]
[edited by: tedster at 3:58 am (utc) on June 6, 2009]
[edit reason] Add links to the interviews [/edit]
But are ten thousand people enough to find all the sites that need to be weeded out? And why would Google have to keep creating new filters at this stage of the game?
|But are ten thousand people enough to find all the sites that need to be weeded out? |
GoogleGuy (when he used to hang out here) once explained that quality evaluators are used for benchmarking purposes. In other words, the 10,000 evaluators aren't there to weed out sites per se; their job is to help train Google's "black box" by identifying examples of sites that need weeding.
As for why Google would have to keep creating new filters at this stage of the game, I'd say it's because:
1) Search technology doesn't stand still.
2) SEO doesn't stand still, either.
The Google patent we discussed in 2006 [webmasterworld.com] makes it pretty clear how the human editorial input process works, at least in a broad way. The quality raters are reviewing a set of search results - they're not trying to exhaustively review every site there is. A site is only an issue if it ranks inappropriately for some common search phrase.
These interviews are pretty short and superficial, IMO. I guess the idea is to get a little bit of information to an audience that doesn't pay the kind of close attention that we do.
|A website can be SEOed to the gills, but the Google Quality Raters evaluate it and if they don't like it, Google engineers write new filters to block it. |
That's an oversimplification of what they said. From the Scott Huffman interview:
|We don't use any of the data we gather in that way. I mean, it is conceivable you could. But the evaluation site ratings that we gather never directly affect the search results that we return. We never go back and say, "Oh, we learned from a rater that this result isn't as good as that one, so let's put them in a different order." Doing something like that would skew the whole evaluation by-and-large. So we never touch it. |
Google's evaluators don't push individual sites up or down in the rankings. Google is using them to find subjects where the algorithms are producing spammy results.
Matt Cutts refers to "policy violations that are pretty egregious" like adult sites spamming their way into non-adult results.
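To make the benchmarking idea concrete, here's a rough sketch of how rater scores could be used to compare two versions of an algorithm without ever reordering an individual result by hand. Everything in it is an assumption for illustration: the 0-3 score scale, the example URLs, the "mean rating of the top k" metric, and the function name. None of it comes from the interviews or the manual, and Google's actual metrics aren't public.

```python
from statistics import mean

# Hypothetical rater scores for one query (0 = spam/off-topic, 3 = excellent).
# The scale and the URLs are made up for this sketch.
rater_scores = {
    "a.example": 3,
    "b.example": 2,
    "c.example": 0,
    "d.example": 1,
}

def mean_rating_at_k(ranking, scores, k):
    """Average rater score of the top-k results; unrated URLs count as 0."""
    return mean(scores.get(url, 0) for url in ranking[:k])

# Two orderings of the same results, produced by two algorithm versions.
current_algo = ["c.example", "a.example", "d.example", "b.example"]
candidate_algo = ["a.example", "b.example", "c.example", "d.example"]

current_quality = mean_rating_at_k(current_algo, rater_scores, k=2)
candidate_quality = mean_rating_at_k(candidate_algo, rater_scores, k=2)

# The ratings only decide which algorithm version looks better overall;
# no individual URL is moved by hand.
keep_candidate = candidate_quality > current_quality
print(current_quality, candidate_quality, keep_candidate)
```

The point of the sketch is that the rater data scores the algorithm's output as a whole, which fits Huffman's statement that ratings never directly change the order of results.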
You can download the Google Quality Rater manual from the Wall Street Journal blog and see for yourself what the quality raters do.
The Google Quality Rater manual lays out in detail how they evaluate all sites, not just spam. They evaluate corporate sites, informational sites, shopping sites, and so on.
The manual also discusses how to identify spammer sites. They also identify affiliate sites.
These evaluations are used to modify the algorithms to raise the visibility of desirable sites (sites which best match the navigational, informational, or transactional criteria) and downrank the undesired sites.
As Google says: "Feed the winners, starve the losers." The best sites move up, the remainder are moved down.
This is done both in natural search and paid search. A different team evaluates ads and landing pages. They look at criteria for shopping carts.
All of them do this: Google, Yahoo, and Microsoft have teams to evaluate paid ads. Both Google and Yahoo have teams to evaluate the organic results. I don't know if Microsoft Live had this; it's fairly obvious that Bing.com has human evaluation.
|and why this has been such a major secret at Google... |
It's not a secret. This has been discussed for years and years now.
Sep. 19, 2004
June 2, 2005
Google's Human Touch - Rater Hub - The Secret Google Lab?
Google Patent - human editorial opinion
Aug. 25, 2006
July 9, 2007
Google's Human Evaluators - 10,000 of them?
July 10, 2007
Google Has 10,000 Human Evaluators?
This is information for those who don't follow the industry as closely as our community, which has been following this particular story since 2004.
It's not a secret? Ask Google for a recent copy of the manual :-)
In my industry there are a few sites that have fake/self-generated user communities, fake/self-generated "breaking news" and lots of money to fool everyone with large press release distribution and nice designs.
These sites get a lot of traction from Google, and I can very well see a novice thinking that the information is great when it's fabricated. You can see that it's all fabricated, but unless you are a webmaster or in the industry and verify the info, it's not easy to tell.
Search engines, including Google and its human reviewers, still have a lot to do in my opinion. Reminds me of Google's CEO saying that the web is a cesspool. It is. Unfortunately, there are people ahead of Google's game all the time.