Forum Moderators: Robert Charlton & goodroi
[edited by: TheMadScientist at 7:10 am (utc) on Jan 13, 2013]
When Panda launched initially, Google said that they didn’t use data about what sites searchers were blocking as a signal in the algorithm, but they did use the data as validation that the algorithm change was on target. They found an 84% overlap in sites that were negatively impacted by Panda and sites that users had blocked with the Chrome extension.
Now, they are using data about what searchers have blocked in “high confidence situations”. Google tells me this is a secondary, rather than primary factor. If the site fits the overall pattern that this algorithm targets, searcher blocking behavior may be used as confirmation.
[searchengineland.com...]
So, it's still (as of Panda 2.0) used as a 'confirmation', but now it's automated into the system: if a site fits the pattern AND it's blocked frequently, there's no 'independent verification' needed ... It's not actually the 'blocks' in control, it's the other way around ... A site 'fits the pattern' and blocks are automatically taken into account as 'verification' the site shouldn't be there.
IOW: It's the algo that 'starts the ball rolling' and is 'written for the whole', then if there's 'independent confirmation' via blocks, the algo's 'confirmed' and a site can tank, but the blocks aren't 'the driving force' of the algo, they're the 'confirmation' of it.
If you run a search engine the size of Google, with the goals of 'determining the one right answer' for people and 'organizing the world's information'
I do believe the "one right answer" is the goal, it's just not quite ready for public consumption yet.
My view is that Chrome data is used in Panda which, as we know, is an 'add-on' to the main algo. I think the main algo does all the usual relevance calculations, and user metrics then come into play with Panda. I think it's more likely that Google uses data collected by other means to verify the user metric data collected from Chrome.
The Internet comprises approximately 78 million servers that span the globe (that number is quite possibly very low). Information on the Internet is measured in Terabytes, and a Terabyte is 1,000 Gigabytes. One estimate in 2005 by Eric Schmidt, CEO of Google, put the total at nearly five million Terabytes of information on the Web.
Google's search engine managed to index about 200 Terabytes in seven years (as of 2005). As a comparison of how large that really is: 200 Terabytes is only 0.004% of five million Terabytes!
Around 700,000 new pages of information are added to that tally every minute. If the Internet stopped all forward progress, it would take Google another 300 years to index it all.
[voices.yahoo.com...]
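The 0.004% figure quoted above is easy to sanity-check. A minimal back-of-the-envelope sketch, using only the 2005 estimates cited in the thread (not current numbers):

```python
# Sanity check of the figures quoted above.
# Assumptions: 1 TB = 1,000 GB as stated, and both values are
# the 2005 estimates from the cited article.

total_tb = 5_000_000   # Eric Schmidt's 2005 estimate of data on the Web
indexed_tb = 200       # amount Google had indexed by 2005

fraction_indexed = indexed_tb / total_tb
print(f"{fraction_indexed:.3%}")  # prints 0.004%
```

So the arithmetic in the quote holds: 200 of 5,000,000 Terabytes is indeed 0.004%.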
Surveys are routinely conducted on a representative sample, and this has been found to be a reliable indication of what everyone thinks. Apply this to Chrome and, in my view, with a sample of a third of all Internet users in the world you could come up with very representative metrics about every site out there.
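The statistical intuition behind this can be sketched with the standard margin-of-error formula for a proportion, 1.96 * sqrt(p(1-p)/n) at 95% confidence. The sample sizes below are illustrative, not actual Chrome usage figures, and the formula assumes a *random* sample, which self-selected browser users are not:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case (p = 0.5) margin of error for a sample of size n,
    at 95% confidence (z = 1.96). Assumes simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# Margin of error shrinks rapidly as the sample grows.
for n in (1_000, 100_000, 10_000_000):
    print(f"n = {n:>10,}: ±{margin_of_error(n):.4%}")
```

Even a few thousand randomly chosen users gives a margin of error around ±3%, which is why a browser with hundreds of millions of users could, in principle, yield very tight estimates. The catch, raised later in the thread, is that browser users are not a random sample.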
There are pages on sites I've built that I've never even seen (and I'm not sure anyone has), but that doesn't mean they, or the sites they're on, aren't useful to anyone either ... But a single 'satisfied Chrome visit' wouldn't be enough to 'score' them, or the entire site, by any stretch either.
I would lay money on the fact that they are not ranked in the top 10 for any quantifiable search term either.
Chrome users will be more likely to fit into certain demographics, just as IE, Firefox, Opera, etc. users might be more likely to fit into different demographics. You'd be skewing the data based on that fact for sure.
I'm open-minded to other ideas, such as user metrics coming from another source, but I don't buy the argument that it's possible to programmatically replicate the million decisions a human makes in a second about a site.
My comment was that if a page had no visitors, no traffic, and had never been visited by Chrome users, then it probably is either not indexed or not relevant to a quantifiable search term. It had nothing to do with site metrics, but page metrics.