For my top target term there are still some differences between the Caffeine sandbox and .com results. It does not affect me and, looking at them dispassionately, the Caffeine results (subjectively) serve the user better. More of my significant real-world competitors appear and fewer of the buy-their-way-to-the-top jokers.
[edited by: tedster at 5:52 am (utc) on Sep. 7, 2009]
Remember the last major infrastructure change - Big Daddy? That didn't even go down as deep as changing the file structure itself.
I see lots of differences between Caffeine and DC of the day, when querying different DC's comparing various keywords for various genres.
I see big differences as well. Contrary to what some here are stating, there is no way the Caffeine results are live on Google.com (unless filters are just not applied on Caffeine).
Caffeine results will have propagated all SERP's
Sorry, but that just doesn't make sense to me at all. If the Caffeine infrastructure is already being used, as others have said in this thread, then we are already seeing Caffeine results now on Google.com. If it is not, then the results cannot "propagate"; what needs to be done is for the new software infrastructure to be installed.
I'm still not convinced that everyone contributing to this thread, including some senior members, has really got it. By "it" I mean what Caffeine is all about. IMHO it is about how the data is stored, indexed and results extracted. It is about when and how, in the message path, the algorithm is applied.
Having said that, the results on the Caffeine sandbox do have some penalties included, as I've seen "penalised" sites move out and then back in, but I still don't think that they have applied all of those penalties. I loosely hypothesise that they are trying to figure out at which stage in the extraction process they should apply penalties.
Just my 2c.
If it is not, then the results cannot "propagate"; what needs to be done is for the new software infrastructure to be installed.
IMHO It is about how the data is stored, indexed and results extracted.
I loosely hypothesise that they are trying to figure out at which stage in the extraction process they should apply penalties.
The pages that are no longer suffering "penalties" should not have been penalised. It's not that penalties have not been applied, it's that some pages are no longer "penalised". As such, I do not believe it has anything to do with any intentional penalty process.
Now on to some wild theorising...
Contrary to my previous post (or perhaps in addition), my working theory is that these are pages that slipped through the cracks of the previous infrastructure. Possibly, in an environment that suffers from resource competition, some data must be discarded. Either you can discard data randomly, or you can select data for discard.
Let's assume G selects. This will likely be low-level data (OBLs on PR<0.01 pages, for example), which you would expect to have negligible impact. However, the Butterfly Effect of adding in this data does have a noticeable impact.
As a purely speculative addendum to this theory, random "waves" of "penalties" could occur when the data loss gets scaled up, to PR<0.011 for example.
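To make the theory above a bit more concrete, here is a toy sketch in Python. It is not Google's algorithm or anything from Caffeine: it is a plain textbook power-iteration PageRank on a made-up six-page graph, and the page names, the `drop_low_rank_links` helper and the 0.03 cutoff (scaled up from PR<0.01 because the graph is so small) are all assumptions for illustration. It only shows that discarding the outbound links of very low-rank pages can shift the scores of pages further up the chain.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [outbound pages]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over every page.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def drop_low_rank_links(links, rank, threshold):
    """Hypothetical 'selective discard': forget the outbound links (OBLs)
    of every page whose rank is below the threshold."""
    return {
        page: ([] if rank[page] < threshold else list(links.get(page, [])))
        for page in rank
    }


# Made-up graph: three tiny pages with no inbound links all point at "b".
links = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub", "a"],
    "tiny1": ["b"],
    "tiny2": ["b"],
    "tiny3": ["b"],
}

full = pagerank(links)  # every link kept
pruned = pagerank(drop_low_rank_links(links, full, threshold=0.03))

for page in sorted(full):
    print(f"{page:6s} full={full[page]:.4f} pruned={pruned[page]:.4f}")
```

Running it prints the two scores side by side for each page, so you can see which pages move once the "negligible" links are discarded or restored.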
The pages that are no longer suffering "penalties" should not have been penalised.
The sites/pages I'm talking about were penalised for very understandable reasons. I liked them being penalised because my understanding of why they were helped me to plan my activities. Now I don't know if they are permanently back or if they will eventually have the penalty applied again. I'd like to know this as it would help me to plan.
As an aside I've noticed in my niche that virtually every site in the top 20 or 30 has bought links, some more than others. The current results for the most valuable terms are ranked by how good the site owner or their SEO is at buying links. If Google penalised everyone who had bought links the top of SERPS would be full of lame sites and that would have everyone running over to Bing. I wonder if Google's attitude will be forced to change in view of this.
Thanks for the reply, tedster. So with all this "complexity" and "odd anomalies", is it possible that the Google employees themselves don't know exactly how the SERPs will be affected when Caffeine is implemented?
is it possible that the Google employees themselves don't know exactly how the SERPs will be affected
That's why every major update has a series of aftershocks, as unforeseen problems are fixed, patched and bodged.
And also why Caffeine specifically asks for feedback.
It's like asking, "don't meteorologists know the impact on weather if you rearrange the ocean currents?" There's no way to accurately predict results given current ocean currents (or SE algos), let alone what will happen if a major component changes.
I wonder if this time they have ironed out the Christmas lead drought that seems to happen every year around Columbus Day and finally abates in the first week of January?
All the best
If the Caffeine infrastructure is already being used, as others have said in this thread
Right, and of course they [we] all know about as much about the timing of this release, whether it has [or has not] been already incorporated at least in some segmentation to the current SERP's, which is why I mentioned 'perhaps'.
I, for one, cannot see how the Caffeine results currently 'match' to the current SERP's in any way more than providing another 'flavor' of what is currently there. But I do subscribe to the summer "we busted it and have the fix now" theory.
It's the complexity effect - the GFS isn't like a MySQL database. Data is sharded into various kinds of pieces and then stored across a huge server farm.
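For anyone who hasn't met the idea of sharding, a rough Python sketch follows. It is only an illustration of splitting data into pieces and spreading replicas over several machines; it is not how GFS actually places data. The hash-based placement, the server names, the tiny chunk size and both helper functions are assumptions made up for the example.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int):
    """Cut a blob into fixed-size pieces (the last piece may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunks(name, chunks, servers, replicas=3):
    """Assign each chunk to `replicas` consecutive servers, starting at a
    position derived from a hash of the chunk id (illustrative only)."""
    placement = {}
    for index in range(len(chunks)):
        chunk_id = f"{name}#{index}"
        digest = hashlib.sha256(chunk_id.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(servers)
        placement[chunk_id] = [
            servers[(start + r) % len(servers)] for r in range(replicas)
        ]
    return placement


# Hypothetical server farm and a small blob standing in for index data.
servers = ["srv-a", "srv-b", "srv-c", "srv-d", "srv-e"]
data = b"some crawled page content " * 4
chunks = split_into_chunks(data, chunk_size=16)

for chunk_id, holders in place_chunks("index-part", chunks, servers).items():
    print(chunk_id, "->", holders)
```

The point is just that no single machine holds "the database": answering a query means collecting pieces from many servers, which is one reason an infrastructure change can have effects that are hard to predict.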
To me it's something like DRBD.
I see very outdated results in Caffeine, so it's as if they pushed all the data back to where it was years ago and are now reapplying algorithms, filters, etc.
But of course, it's not that simple :)