We've got a lot of data that we're collecting for our clients, and I realized that it might be possible to detect search engine algorithm changes retroactively, and possibly in real time as they're happening.
But we're not really mathematicians, so I'm not sure how to use the data to draw meaningful conclusions.
We track all the search engine traffic to our customers over a rolling 30-day period.
We have 3 million data points of traffic data containing the keyword and rank in Google (from the referrer)
We have a total of 5 million unique keywords in our database across all our customers
We're tracking 2500 different websites
We have 2 million pages indexed
Most of our clients are integrated with Google Webmaster Tools and Google Analytics, so we could draw data from there as well
... and much more.
We need to protect our client confidentiality, but we'd be happy to search for information in aggregate to detect search engine algorithm changes, or... anything. It didn't occur to us back then, but I'm pretty sure we would have detected Panda within minutes of it sweeping through our clients.
Let me give you an example.
I rank highly for the term "universe", so my site gets a few hundred search visitors a day for it. For roughly 30% of those visits, the referrer URL from Google contains the rank, so I get a measurement of my rank every few minutes. Obviously it's going to vary because of personalization settings, but it should be possible to pick out larger changes - and something like Panda would have been pretty significant.
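To make that concrete, here's a rough sketch of pulling the keyword and rank out of a single referrer. It assumes the keyword sits in the `q` query parameter and the rank in the `cd` parameter, which is my understanding of how Google referrers carry the position but not something I'd treat as guaranteed. The function name `rank_from_referrer` is just made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

def rank_from_referrer(referrer):
    """Return (keyword, rank) from a Google referrer URL, or None if unavailable.

    Assumption: keyword is in `q` and rank is in `cd`. Referrers without
    both parameters (the majority, per the ~30% figure) yield None.
    """
    parsed = urlparse(referrer)
    if "google." not in parsed.netloc:
        return None  # not a Google referrer
    params = parse_qs(parsed.query)
    keyword = params.get("q", [None])[0]
    cd = params.get("cd", [None])[0]
    if keyword is None or cd is None or not cd.isdigit():
        return None  # no usable rank in this referrer
    return keyword, int(cd)

example = "http://www.google.com/url?sa=t&source=web&cd=3&q=universe"
print(rank_from_referrer(example))
```

Since personalization makes individual readings jumpy, I'd probably take the median of a day's readings per keyword rather than the mean, so a handful of odd results don't drag the number around.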
Our plan right now is to give our clients some kind of diagnostic tool, so they can see how their overall rank is doing from day to day, but I think we could aggregate that across all our clients.
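As a starting point for the aggregate idea, here's one naive way it might work: take the median rank per site per day, then flag any day where an unusually large share of sites moved by several positions compared with the previous day. Everything here is an assumption on my part: the `(day, site_id, rank)` input shape, the function names, and especially the thresholds (`min_shift=3`, `alert_fraction=0.25`), which would need tuning against real data. It only outputs aggregate flags, so no individual client's numbers are exposed.

```python
from collections import defaultdict
from statistics import median

def daily_site_medians(observations):
    """observations: iterable of (day, site_id, rank).
    Returns {day: {site_id: median rank that day}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for day, site, rank in observations:
        buckets[day][site].append(rank)
    return {day: {site: median(ranks) for site, ranks in sites.items()}
            for day, sites in buckets.items()}

def shifted_fraction(medians, prev_day, day, min_shift=3):
    """Fraction of sites seen on both days whose median rank
    moved by at least min_shift positions."""
    common = medians[prev_day].keys() & medians[day].keys()
    if not common:
        return 0.0
    moved = sum(1 for s in common
                if abs(medians[day][s] - medians[prev_day][s]) >= min_shift)
    return moved / len(common)

def flag_days(observations, min_shift=3, alert_fraction=0.25):
    """Days where at least alert_fraction of sites shifted versus the prior day.
    Days must sort chronologically (e.g. ISO date strings or integers)."""
    medians = daily_site_medians(observations)
    days = sorted(medians)
    return [d for p, d in zip(days, days[1:])
            if shifted_fraction(medians, p, d, min_shift) >= alert_fraction]

# Made-up sample: sites "a" and "b" drop sharply on day 3, "c" is stable.
sample = [
    (1, "a", 1), (1, "a", 2), (1, "a", 1), (1, "b", 5), (1, "b", 5), (1, "c", 10),
    (2, "a", 1), (2, "a", 1), (2, "b", 5), (2, "b", 6), (2, "c", 10),
    (3, "a", 8), (3, "a", 9), (3, "a", 8), (3, "b", 12), (3, "b", 12), (3, "c", 10),
]
print(flag_days(sample))
```

A real version would probably want to compare against a longer baseline than just the previous day, and to weight sites by how many readings they contribute, but this is roughly the shape of the thing.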
Anyway, if anyone has any ideas, we'd be happy to try and implement them and pull some signal from the noise.
[edited by: tedster at 7:42 pm (utc) on May 23, 2011]