I think EWOQ is much more than just a way to test algorithms or to catch spammers; it looks like an extensive classification system that features prominently in the SERPs.
If this was just to catch spammers...then why have so many relevancy levels and this elite 'vital' rank they hand out? Why cross-reference these sites with specific queries?
If it is just to test the formula of the month...then why the extensive instructions on how to spot and label adult sites, scrapers, thin affiliates, hidden text, etc.? This sounds like stuff that goes on your permanent record. Plus it makes sense that some websites get this 'brand'/'vital'/'ownership' claim for certain keywords based on these manual reviews (such as the celebrity examples the guide gives).
Did some more searching on this program and found some interesting stuff... Apparently Google doesn't hire these raters directly but uses third-party companies (like Lionbridge, Leapforce and Butler Hill) to do the recruiting (apparently mostly targeted toward stay-at-home moms). The raters apparently do ad quality ratings and web quality ratings using a Firefox plugin called 'EWOQ Mobile User Agent Switcher'. Apparently some webmasters have reported a telltale 'https://www.google.com/evaluation/search/rating/task-edit?task=#*$!#*$!' in their web server logs, which may indicate they've been reviewed.
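If you want to check your own logs for that string, something like this rough Python sketch would do it. It only matches the path part of the URL quoted above, since the task ID isn't known. The sample log lines, the IP addresses and the "EXAMPLE" task value are made up for illustration, and I'm assuming the URL shows up somewhere in the raw log line (most likely the referrer field).

```python
# Marker taken from the URL reported above. The value after "task=" is unknown,
# so only the path portion is matched.
RATER_MARKER = "google.com/evaluation/search/rating/task-edit"


def find_rater_hits(log_lines):
    """Return the log lines that contain the rater task URL."""
    return [line.rstrip("\n") for line in log_lines if RATER_MARKER in line]


# Hypothetical sample lines in Apache combined-log style (invented for this example).
sample_log = [
    '203.0.113.5 - - [01/Jan/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"https://www.google.com/search?q=widgets" "Mozilla/5.0"',
    '203.0.113.9 - - [01/Jan/2012:10:05:00 +0000] "GET /about HTTP/1.1" 200 734 '
    '"https://www.google.com/evaluation/search/rating/task-edit?task=EXAMPLE" "Mozilla/5.0"',
]

hits = find_rater_hits(sample_log)
for line in hits:
    print(line)
```

For a real log you could pass an open file instead of the list, e.g. `find_rater_hits(open("access.log"))`, where the file name is whatever your server actually uses.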
Perhaps in Google's eyes, these stay-at-home moms are the 'ewoks' that will take down the Imperial AT-ST spam walkers... Although their results are cross-checked with each other, I doubt they're doing a good job. Many of these reviewers don't seem very technical and frequently get fired after poor performance reviews. That could explain why Cutts got so mock-flustered during one of those Google round table meetings as he described how difficult it was for reviewers to spot spam sites that were quite obvious to him.
I actually don't think the web is too big for a project like this. Yeah, there are probably upwards of 200 billion websites in the world, but the number of actually relevant, local websites will be a lot smaller, and Google would only really need to evaluate some of these sites at once. There are too many search queries to check (especially long tail ones), but we know Google keeps track of how many times each phrase has been queried, so it would be easy for them to check only queries that exceed 50k searches a month (which would eliminate the vast majority of queries and let Google find the types of queries most targeted by spammers). Google could also use an internal audit mechanism to flag suspect sites for review, but it wouldn't surprise me if 99% of sites at, say, PR3+ have been reviewed by the EWOQ program.
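To show how cheap that kind of cutoff is, here's a toy Python sketch of the 50k-a-month filter. The query strings and volumes are invented, and this is just my guess at how such a filter might look, not anything Google has described.

```python
REVIEW_THRESHOLD = 50_000  # monthly searches; the cutoff suggested above


def queries_to_review(monthly_counts, threshold=REVIEW_THRESHOLD):
    """Return queries whose monthly volume exceeds the threshold, busiest first."""
    selected = [q for q, n in monthly_counts.items() if n > threshold]
    return sorted(selected, key=lambda q: monthly_counts[q], reverse=True)


# Made-up monthly volumes, purely for illustration.
monthly_counts = {
    "cheap car insurance": 450_000,
    "celebrity name": 1_200_000,
    "blue widget repair in small town": 40,
    "best left-handed garden shears": 900,
}

review_list = queries_to_review(monthly_counts)
print(review_list)  # ['celebrity name', 'cheap car insurance']
```

The long tail queries drop out immediately, which is the point: only the handful of high-volume phrases would ever need a human look.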