I have not read the full article yet, but I will.
My only question is:
10,000 truly unique human evaluators is a large staff, even to out-source.
With that number, why haven't we heard more about this before?
Or have we, and I was sick that day?
We've been hearing about it for years -- there was even a bit of a blow-up a few years back when someone leaked one of Google's training documents for these evaluators. Plus, if you follow various job opening boards, Google has openly advertised to fill these spots.
But I agree with your reaction - I was quite surprised by the number, too.
Yeah, I remember the past discussions and the leaked document, but 10,000 is a huge number.
Think of it this way: out of 10,000, 1% of them would be talking about it on blogs and the like. And since the leaked document (what was that, 2-3 years ago?) we have heard nothing, correct?
I have seen Google IPs hit my sites (corp IPs, that is, not bots).
I blocked one once for using a bad user agent, and the employee emailed me; it was all very funny.
Odd and interesting, I think, about the 10,000 number.
It is clearly a huge investment in quality control, if it is done correctly.
Even if they do have 10,000 quality evaluators, a lot of those people may be (and probably should be) part-timers.
|Even if they do have 10,000 quality evaluators, a lot of those people may be (and probably should be) part-timers. |
Yeah, I agree.
I would think that the burn-out threshold for doing a quality job at this task would be about two hours per day at maximum.
Maybe the 10,000 number comes from ALL Google employees being required to perform this task for one hour per week, which would make more sense...
and might explain the popularity of the Wiki results in the SERPs :)
I don't believe it. 10,000 is far too big a workforce to keep under control indefinitely.
We should have heard something (critical) from them by now.
I'd sooner believe that some Google rep said "we have human evaluators in our beta quality control program, somewhere between ten and a thousand", and the journalist immediately wrote "10,000".
I would not have even started this startling thread, except for the fact that Matt Cutts quoted this interview [mattcutts.com] in his blog today.
I think that gives the big number more credibility - and so I thought it was worth a thread. I'm still trying to wrap my mind around it, too. Matt finishes his post by saying "At this point, if you think that Google doesn't try to utilize human feedback in scalable, robust ways then you need to adjust your mental model."
And I still say hmmmm.... That's got to be a very interesting editorial feedback app they've got going.
|We've been hearing about it for years -- there was even a bit of a blow-up a few years back when someone leaked one of Google's training documents for these evaluators. |
eval.google.com - Google's Secret Evaluation Lab.."Rater Hub Google" Rumours? [webmasterworld.com]
I guess they refer to their normal staff. The employees are surfing the net during their work, and after work.
If they connect through some VPN through a Google proxy then Google gets hold of sites that are seen as quality by their employees (long stay time, many page views). Maybe they even have a little thumbs up/thumbs down icon somewhere in the browser? Or they all have to use the Google toolbar with all tracking options switched on?
Toolbar data is probably enough.
This sounds like another google fabrication.
OK, does the original excuse still apply? You know, collecting data to fine-tune the algo. (In 2005, when eval.google was leaked, this was the official stance, right?)
But actually, that's what the report-spam and vote buttons, the report-bought-links form (and the toolbar's spy features, click data and the rest) are there for... according to them, at least.
They don't need data for QC; they need people to actually do (something about) it. They have all the data they need, so let's suppose all these reports land on the evaluators' table.
No, make that we KNEW that.
If it's 10,000 people, that's not much.
Actually all employees are browsing the web.
A special toolbar comes to mind.
Before, they'd try to hide, deny, and cover up all leads to the "active human factor"; just take the case of TrustRank. Trademarked, yet there's not a single official comment? Pretty obvious as to why, though.
Or is it that nothing in what has been said gives away the "active" part?
10,000 may seem like a lot, but it's not enough. Their SERPs are full of scrapers and spam and have been for ages. I still see a lot of out-of-date, illegal, and poor content ahead of my site in the SERPs, including MFAs.
How exactly are these "evaluators" evaluated and evaluating? Can somebody fire them and get a new 10,000?
Well... it seems there are several "classes" of evaluators.
This class of evaluators is for folks with a BA/BS degree:
Search Quality Evaluator - Mountain View - Google Jobs [google.com]
However, don't expect that GOOG is looking for 10,000 of this kind to be "installed" at the Googleplex.
And here is another evaluator class, which doesn't require the BA/BS:
Search Quality Coordinator - Hyderabad, India [google.co.in]
As you might expect, there will be accordingly large differences in the quality of evaluations of the different evaluators!
Talking about "human-role-in search" in Google :-)
Personally, I don't see that much spam anymore (please don't shoot :) ). As a frequent searcher, I generally find what I'm looking for on page one, and if I get to page 3 it's a cold day when I have nothing better to do.
I'm not surprised they have people in the mix; it makes sense when you look at other search engines that seem to be using only an algo... lots of cheap spamminess coming up on page 1.
> 10,000 truly unique human evaluators is a large staff, even to out-source.
> With that number, why haven't we heard more about this before?
> or have we and I was sick that day?
We have. She is referring to the fact that she considers every employee at Google a "search evaluator". How to use Google, and how to report poor quality search results is part of orientation for any new employee. If you are a googler and find poor quality search results, it is your duty to report it.
What I believe we're looking at is people training a neural network type model for removing poor quality sites and promoting underranked sites. In the same way that Google feeds relevant spam reports into the model, I suspect these people are spread out across the major subject areas doing nothing but training.
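A minimal sketch of what "human ratings training a model" could look like in principle. Everything here is my invention for illustration (the features, the data, and the model choice), not anything known about Google's actual system: evaluators label results good (1) or poor (0), and a tiny logistic model learns feature weights from those labels.

```python
import math

# (features, human rating); hypothetical features:
# [query match, domain age, spamminess]
ratings = [
    ([0.9, 0.8, 0.1], 1),
    ([0.8, 0.6, 0.2], 1),
    ([0.3, 0.1, 0.9], 0),
    ([0.2, 0.2, 0.8], 0),
    ([0.7, 0.9, 0.3], 1),
    ([0.1, 0.4, 0.7], 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

weights, bias, lr = [0.0, 0.0, 0.0], 0.0, 0.5

for _ in range(2000):  # plain stochastic gradient descent over the ratings
    for x, y in ratings:
        pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        err = pred - y
        weights = [w - lr * err * xi for w, xi in zip(weights, x)]
        bias -= lr * err

# Score an unseen result: strong query match, older domain, low spamminess.
score = sigmoid(sum(w * xi for w, xi in zip(weights, [0.8, 0.7, 0.2])) + bias)
print("predicted quality:", round(score, 2))  # well above 0.5 -> "good"
```

The point is only that a relatively small pool of labeled judgments can generalize to unseen pages once it is turned into feature weights, which is what would make 10,000 raters "scalable".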
G is becoming a myth. Lots of post and publications about it have become silly or just conjectures about some patent someone found. "G buys psychics group to predict blah blah", "footers have huge impact on how fast G kick you out".
I'm at the point of just building for readers, people, as integrating what I read about SE is becoming irrational, not logic and almost absurd. Perhaps we will be hit hard by SEs but if mails keep coming, ads clicked and products being sold, I guess I could be happy.
Google note. [webmasterworld.com]
I use to work as a QA evaluator for G. We were contracted through a 3rd party. It was decent pay but the work was a bit tedious as you might imagine. They really had some good quality measures in place to ensure proper ratings by the QA folks. I'm not sure I'm allowed to delve into the details, but I will say I was impressed with the system they put together for the QA folks to use. It was all web-based (no vpn needed).
(I don't know why a vision of monkeys, typewriters and Shakespeare came into my head when I read the header of this item, but it did)
When you wrote this:
|We have. She is referring to the fact that she considers every employee at Google a "search evaluator". How to use Google, and how to report poor quality search results is part of orientation for any new employee. If you are a googler and find poor quality search results, it is your duty to report it. |
are you referring to the following section of Dare Obasanjo session with Marissa Mayer?
|Q: How do they tell if they have bad results? |
A: They have a bunch of watchdog services that track uptime for various servers to make sure a bad one isn't causing problems. In addition, they have 10,000 human evaluators who are always manually checking the relevance of various results.
It surprises me that Google uses these evaluators because the data should be available elsewhere.
Looking at click patterns on their site after SERPs change, and comparing the number of people who click through to page 2 or the number who refine their search, would gauge result quality.
Looking at toolbar or analytics data for time on a site after a search would also be insightful.
The best data that Google could possibly have is the feedback provided by end users in the form of clicks, time on a given result site, and whether they return to try a different result. Google has for a number of years released products that give them access to just this type of data (toolbar and analytics being notable examples). I think the judgment of the masses would (in this case) provide much better data than the opinion of a handful of quality evaluators.
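The behavioural signals described above can be sketched as simple aggregate metrics over a click log. The log format and the numbers here are entirely my invention for illustration; the idea is just that page-2 clicks, query refinements, and dwell time can stand in for human quality ratings.

```python
# Toy search log: (query, clicked_to_page_2, refined_query, seconds_on_result)
search_log = [
    ("cheap flights",   False, False, 180),
    ("cheap flights",   True,  True,   12),
    ("python tutorial", False, False, 240),
    ("python tutorial", False, True,   30),
    ("used cars",       True,  True,    8),
]

n = len(search_log)
page2_rate  = sum(1 for _, p2, _, _ in search_log if p2) / n
refine_rate = sum(1 for _, _, ref, _ in search_log if ref) / n
avg_dwell   = sum(t for *_, t in search_log) / n

# High page-2 and refinement rates, or short dwell times, suggest poor results.
print(f"page-2 rate: {page2_rate:.0%}, refinements: {refine_rate:.0%}, "
      f"avg dwell: {avg_dwell:.0f}s")
# -> page-2 rate: 40%, refinements: 60%, avg dwell: 94s
```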
Current number of employees at Google: 10,674
As with much of what "google" says, I have a hard time accepting this statement at face value.
I recently posted on Matt's blog questioning Google's commitment to search by asking him what the balance was between search staff and advertising staff.
This kind of release could be seen as a propaganda-type response to questions like mine.
I don't believe it at all. I seem to remember Matt saying that the webspam team had 200-300 people in it; he said that at the time Vanessa Fox left. All of this is based on recollection.
For the sake of future reference, I'm recalling the text of Google's job announcement [google.co.in] which I mentioned in my previous post:
|Search Quality Coordinator - Hyderabad |
Position based in Hyderabad, India.
Do you have a passion for Google? Do you desire to help improve the quality of Google's search results? Google is recruiting enthusiastic, web-savvy individuals for search quality evaluation.
Responsibilities:
* Reviewing assigned sites for quality and content.
* Troubleshooting website issues and identifying areas of concern and interest.
* Investigating web sites.
* Working on special projects, as needed.
Requirements:
* Excellent web research skills.
* Excellent analytical skills.
* Detail oriented; ability to complete a large volume of work quickly.
* Proven track record of exceptional performance, high productivity and meeting deadlines.
* Ability to work cooperatively and proactively with team members.
* Fluency in English.
* One to three years of related experience in an Internet company and with web research.
* Previous experience with a computer programming language.
* Familiarity with typical web practices, such as managing a domain name.
* Basic HTML experience.
Any guesses as to the salary?
Or are they building/training the algo for more advanced PC users of the future? Hmm.
|10,000 may seem like a lot but it's not enough |
Obviously not! Or, and that is a guess, the feedback cannot be integrated as it should be.
- "artificial sub domain link boost" works (at least for some major brands I watch)
- the "domain age advantage" is much too strong
- "brainless link bait" produces too good results for a human filter system, IMHO
These, and some other quality glitches I see when I google around, vote against a 10k-human ranking factor.
Also, the quote says "... user feedback ... is a signal ...", which I understand as another of the many signals that finally need to be digested by the algo.
I cannot believe that hiring 100 helping hands myself (which a lot of professionals here could afford) and having them file complaints about specific sites would make any difference in the rank of those sites, if all the other signals are OK!
This smells like another "we need good press about our quality" approach by a company that re-invented online marketing.
IMHO, the truth lies somewhere between all the facts and fictions around the algo and comments from single googlers.
So, a signal? Yes! More than 10k human rankers? Probably part-time, summing up to 500 full-time rankers... maybe even less. I think I could clear out 5 domains per hour (working really slowly) without really hard work, just using a solid commercial keyword list.
500 full-time human editors could do 2,500 domains per hour, which would sum up to 80,000 spam domains wiped from the index per week, yet there are still such poor results to be found? Either the signals are very, very weak or the editors are really, really bad!
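A back-of-envelope check of the throughput estimate above, using only the post's own numbers (nothing official): 500 full-time rankers at 5 domains per hour gives 2,500 per hour, and the 80,000/week figure then implies roughly 32 productive hours per ranker per week, i.e. a full week minus breaks and overhead.

```python
# Assumptions taken from the post itself, not from any official source.
full_time_rankers = 500
domains_per_hour_each = 5          # "working really slowly"

team_domains_per_hour = full_time_rankers * domains_per_hour_each
print(team_domains_per_hour)       # 2500

# The 80,000-per-week figure implies about 32 productive hours per week:
implied_hours = 80_000 / team_domains_per_hour
print(implied_hours)               # 32.0
```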