We've known for a while that Google uses human evaluation in evolving their algorithm. They've even got a patent for integrating human evaluators with the algorithm [webmasterworld.com]. On 2007-06-23, Google held a Scalability Conference in Seattle. Here's an interesting tidbit from a Q&A session with Marissa Mayer:
Q: How do they tell if they have bad results?
A: ...they have 10,000 human evaluators who are always manually checking the relevance of various results.