TheMadScientist - 9:17 pm on Jan 13, 2013 (gmt 0)
Wave of page = ?
Everything they can measure algorithmically and compare to all other pages they find web-wide ... Basically, every variable they can apply to a page, but limited to those they can apply to every page ... 200+ variables at last count, and far too many to list, especially when you consider that image-search analysis could (easily, in my opinion) be applied to a site's pages, so colors, white space, design, layout, etc. can also be factored in.
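To make the metaphor concrete: a page's 'wave' can be pictured as a feature vector that gets compared against every other page's vector. This is a toy sketch of that idea only; the feature names, scaling, and similarity measure are my own illustrations, not anything Google has published.

```python
import math

def page_wave(page):
    """Toy feature vector for a page. These features are illustrative
    placeholders (word count, links, images, layout), not actual signals."""
    return [
        page["word_count"] / 1000.0,
        page["link_count"] / 100.0,
        page["image_count"] / 10.0,
        page["whitespace_ratio"],  # design/layout factors the post mentions
    ]

def similarity(a, b):
    """Cosine similarity: one way to compare a page's wave to any other's."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

p1 = page_wave({"word_count": 800, "link_count": 40,
                "image_count": 5, "whitespace_ratio": 0.3})
p2 = page_wave({"word_count": 820, "link_count": 42,
                "image_count": 5, "whitespace_ratio": 0.31})
print(similarity(p1, p2))  # close to 1.0 for near-identical pages
```

The point is only that once every page is reduced to the same set of variables, any page can be compared to any other page web-wide with a single cheap calculation.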
Wave of user = ?
Same variables as the wave of a page, and likely multiple waves based on determined query-intent typing and other 'relationships to queries' (which is likely where the 'knowledge graph' comes into the picture) ... Those waves would then be refined by an individual's specific behavior in the results, not their behavior on the pages of external sites: you don't really need the latter, and the amount of visitor-behavior data you can get from external sites is limited, so why use it when visitors will consistently tell you, right in the results, what they like and don't like over time?
In my opinion, the most reliable way to personalize results and 'grab the right answer out of the whole of the index' for a specific visitor is to base the personalization of the wave(s) assigned to that visitor on the results you show and how each visitor individually interacts with them, because that is by far the most consistently available information you have access to.
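A minimal sketch of that refinement loop, using nothing but clicks and skips inside the results: nudge the user's wave toward the waves of results they clicked and away from the ones they skipped. The update rule, learning rate, and numbers here are purely my own illustration, not a known algorithm.

```python
def refine_user_wave(user_wave, result_waves, clicked_indices, rate=0.1):
    """Toy per-user refinement: move the user's wave toward clicked
    results and away from skipped ones. Illustrative rule only."""
    refined = list(user_wave)
    for i, result_wave in enumerate(result_waves):
        sign = 1.0 if i in clicked_indices else -1.0
        for j, feature in enumerate(result_wave):
            refined[j] += sign * rate * (feature - refined[j])
    return refined

user = [0.5, 0.5]                     # neutral starting wave
results = [[0.9, 0.1], [0.1, 0.9]]    # two results with opposite waves
user = refine_user_wave(user, results, clicked_indices={0})
print(user)  # drifts toward the wave of the clicked result
```

Run repeatedly over time, a loop like this needs no data from external sites at all: the visitor 'tells you' their preferences query after query, entirely within the results.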
The results are what they're trying to refine, and they're trying to do it on a per-user, per-query, per-page basis to get the 'right answer' for an individual out of the whole of their index. The results are also what they would absolutely have to 'fall back on' when a visitor uses Internet Explorer and visits a page without Analytics (e.g. Yahoo! or Apple or Bing or Microsoft, etc.). So why would they ever 'leave the results' for info in the first place, rather than figuring out how to 'gauge behavior' (or 'make determinations') within them more accurately by comparing their refinements, or method of refining the results, to 'other data sources' (such as Chrome or Analytics) for verification of direction?
ADDED: That's exactly what they did by running Panda and automatically incorporating the blocks from Chrome as 'outside verification of direction', removing sites that 'fit the pattern' the algo found once they were 'Chrome-block verified' ... Looking to Chrome users for independent verification of the refinements, or the method of refining the results, is totally different from 'driving the results with Chrome' ... They simply 'automated the outside verification of direction' they wanted for refinement of the results, rather than having people physically look at a list to make sure the algo was 'hitting the right sites' when there was a 'high degree of certainty' the algo's pattern match was correct.
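That verification step could be sketched like this: the algo flags sites by pattern score on its own, and the Chrome block rate is consulted only to confirm the pattern before acting, so the blocks verify the direction rather than drive the results. The thresholds, site names, and numbers below are invented for illustration.

```python
def verified_demotions(pattern_scores, chrome_block_rates,
                       score_cutoff=0.8, block_cutoff=0.05):
    """Toy 'outside verification': act only on sites where the
    algorithmic pattern match is confirmed by user blocks."""
    flagged = {site for site, score in pattern_scores.items()
               if score >= score_cutoff}          # algo-only pattern match
    return {site for site in flagged
            if chrome_block_rates.get(site, 0.0) >= block_cutoff}

scores = {"thin-scraper.example": 0.92,   # algo says it fits the pattern
          "real-brand.example": 0.85,     # algo flags it too...
          "ok-site.example": 0.40}
blocks = {"thin-scraper.example": 0.12,   # ...users block this one often
          "real-brand.example": 0.01}     # ...but almost nobody blocks this
print(verified_demotions(scores, blocks))
# only the site where the algo and Chrome users agree gets demoted
```

Note the blocks never promote or demote anything on their own here; they only gate whether a pattern the algo already found is trusted enough to act on, which is the automated version of having people eyeball the list.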