
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 238-message thread spans 8 pages (page 1 of 8).
Google's New Caffeine Search Engine - part 2
Hissingsid




msg:3984609
 8:56 am on Sep 5, 2009 (gmt 0)

< continued from [webmasterworld.com...] >

For my top target term there are still some differences between the Caffeine sandbox and .com results. It does not affect me, and looking at them dispassionately the Caffeine results (subjectively) serve the user better. More of my significant real-world competitors appear and fewer of the buy-their-way-to-the-top jokers.

Cheers

Sid

[edited by: tedster at 5:52 am (utc) on Sep. 7, 2009]

 

aristotle




msg:3984689
 1:09 pm on Sep 5, 2009 (gmt 0)

I'm still confused about what Caffeine represents. Does it simply refer to an enlargement and faster updating of the database that underlies the algo? Does it include changes in the algo itself?

tedster




msg:3984817
 7:22 pm on Sep 5, 2009 (gmt 0)

It's a new infrastructure that includes a rewrite of the underlying data storage technology - the Google File System (GFS). This will give Google more flexibility and speed in how they construct the SERPs in the future, but for now it seems they would like to make the final switchover almost seamless to end users - hence Google's request to webmasters for reports on ranking differences.

cangoou




msg:3984818
 7:25 pm on Sep 5, 2009 (gmt 0)

hence Google's request to webmasters for reports on ranking differences.

... which is a bit odd: If someone really knows all the differences, who would this be most likely?!?! ;-)

aristotle




msg:3984833
 8:18 pm on Sep 5, 2009 (gmt 0)

Well, I'm trying to understand how a change in the infrastructure and/or filing system would cause changes in the rankings. Unless perhaps it includes an expansion in the amount of data collected and analyzed for inputting into the algo, and also possibly an increase in the number of pages in the main index.

tedster




msg:3984836
 8:32 pm on Sep 5, 2009 (gmt 0)

It's the complexity effect - the GFS isn't like a mysql database. Data is sharded into various kinds of pieces and then stored across a huge server farm. When a query comes in, that data (now in smithereens) gets accessed and re-assembled into SERPs -- with all kinds of odd anomalies along the way.

Remember the last major infrastructure change - Big Daddy? That didn't even go down as deep as changing the file structure itself.
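
To make the sharding point concrete, here is a minimal Python sketch of a scatter-gather lookup over a sharded index. It is only a toy: the names `shard_for`, `build_shards` and `query`, the character-sum hash, and the sample URLs and scores are all assumptions made up for illustration, and nothing here describes how GFS or Google's serving stack actually works. It shows one way re-assembly can produce different SERPs from the same data: if one shard is stale or unreachable, whatever lived there drops out of the merged result.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

# term -> {url: score}; a toy stand-in for an inverted index (assumption).
Index = Dict[str, Dict[str, float]]


def shard_for(url: str, num_shards: int) -> int:
    """Pick a shard with a stable toy hash (sum of character codes)."""
    return sum(ord(ch) for ch in url) % num_shards


def build_shards(index: Index, num_shards: int) -> List[Index]:
    """Scatter each posting to the shard that owns its URL."""
    shards: List[Index] = [{} for _ in range(num_shards)]
    for term, postings in index.items():
        for url, score in postings.items():
            owner = shard_for(url, num_shards)
            shards[owner].setdefault(term, {})[url] = score
    return shards


def query(
    shards: List[Index],
    term: str,
    k: int,
    skip: Optional[Set[int]] = None,
) -> List[Tuple[str, float]]:
    """Ask every reachable shard for its local top k, then merge globally."""
    skip = skip or set()
    partials: List[Tuple[str, float]] = []
    for shard_id, shard in enumerate(shards):
        if shard_id in skip:  # simulate an unreachable or stale shard
            continue
        local = shard.get(term, {})
        partials.extend(heapq.nlargest(k, local.items(), key=lambda kv: kv[1]))
    # Highest score first; URL breaks ties so the order is deterministic.
    return sorted(partials, key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical sample data.
INDEX: Index = {
    "widgets": {
        "a.example": 0.9,
        "b.example": 0.5,
        "c.example": 0.7,
        "d.example": 0.2,
    }
}

if __name__ == "__main__":
    shards = build_shards(INDEX, num_shards=3)
    print(query(shards, "widgets", k=2))
    missing = shard_for("a.example", 3)
    print(query(shards, "widgets", k=2, skip={missing}))
```

The second call skips the shard that owns the top-scoring URL, so the merged list changes even though no score was touched.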

cyclinder




msg:3984843
 9:01 pm on Sep 5, 2009 (gmt 0)

I see a rollback to the two-month-old positions for my site on google.com (same for sandbox).

It's like the new algo is already active but over the old data.

My number of indexed pages has also decreased.

CainIV




msg:3984964
 1:10 am on Sep 6, 2009 (gmt 0)

I see lots of differences between Caffeine and DC of the day, when querying different DC's comparing various keywords for various genres.

MLHmptn




msg:3985045
 6:50 am on Sep 6, 2009 (gmt 0)


I see lots of differences between Caffeine and DC of the day, when querying different DC's comparing various keywords for various genres.

I see big differences as well. Contrary to what some here are stating, there is no way the Caffeine results are live on Google.com (unless filters are just not applied on Caffeine).

kevsta




msg:3985054
 7:29 am on Sep 6, 2009 (gmt 0)

Yes, I've still got noticeable differences. Presumably with something as big as this it'll be on or off?

I.e. they won't be able to blend the old and new SERPs to give a soft landing on arrival?

spadilla




msg:3985268
 11:51 pm on Sep 6, 2009 (gmt 0)

It seems to me as if the Caffeine results are lagging behind the current SERPs. I just overhauled a website for a client who holds the #1 place for a keyphrase. He had purchased the actual keyphrase domain some time ago and wanted to 301 his other (#1 in SERP) domain so the keyword domain would show instead. This was done and went live last week, and I've been checking each day this week for the update. Just today I see the current SERPs showing only the keyword domain at #1, while in the Caffeine results I am still seeing the old domain, as it has shown all along up until now.

cyclinder




msg:3985272
 12:06 am on Sep 7, 2009 (gmt 0)

yes, same impression, hope this will be fixed on the actual 'release'

CainIV




msg:3985274
 12:53 am on Sep 7, 2009 (gmt 0)

It's really tough to say. I am not certain myself that there needs to be a perfect match between the two engines. Perhaps we will wake up Tuesday and Caffeine results will have propagated all SERP's. I think it is difficult to know how this will roll out.

Vimes




msg:3985339
 5:39 am on Sep 7, 2009 (gmt 0)

Yeah, I'd say the results are older than the current DCs I'm looking at. As an example, a site I know went through a huge URL restructuring a couple of months ago, and Caffeine is still showing the old URL structure when using the site operator. The SERPs seem to match the older data for this particular website.

Vimes.

Hissingsid




msg:3985379
 7:46 am on Sep 7, 2009 (gmt 0)

Caffeine results will have propagated all SERP's

Sorry, but that just doesn't make sense to me at all. If the Caffeine infrastructure is already being used, as others have said in this thread, then we are already seeing Caffeine results now on Google.com. If it is not, then the results cannot "propagate"; what needs to be done is for the new software infrastructure to be installed.

I'm still not convinced that everyone contributing to this thread, including some senior members, has really got it. By "it" I mean what Caffeine is all about. IMHO it is about how the data is stored, indexed and results extracted. It is about when and how, in the message path, the algorithm is applied.

Having said that, the results on the Caffeine sandbox do have some penalties included, as I've seen "penalised" sites move out and then back in, but I still don't think that they have applied all of those penalties. I loosely hypothesise that they are trying to figure out at which stage in the extraction process they should apply penalties.

Just my 2c.

Sid

Shaddows




msg:3985421
 9:12 am on Sep 7, 2009 (gmt 0)

If it is not, then the results cannot "propagate"; what needs to be done is for the new software infrastructure to be installed.

Agreed. But iff (if and only if) Caffeinated DCs are not accessed from main SERPs due to load balancing, and iff redundancy batch processing hasn't sourced data from a GFS2/Caffeinated DC. In other words, Caffeine should not "roll out", if G has kept the data properly isolated, until each DC has had the new infrastructure installed.

IMHO It is about how the data is stored, indexed and results extracted.

Agreed. I think that's precisely what it's about, which, due to the arising complexity, causes different results.

I loosely hypothesise that they are trying to figure out at which stage in the extraction process they should apply penalties.

Strongly disagree.

The pages that are no longer suffering "penalties" should not have been penalised. It's not that penalties have not been applied, it's that some pages are no longer "penalised". As such, I do not believe it has anything to do with any intentional penalty process.

Now on to some wild theorising...

Contrary to my previous post (or perhaps in addition), my working theory is that these are pages that slipped through the cracks of the previous infrastructure. Possibly, in an environment that suffers from resource competition, some data must be discarded. Either you can discard data randomly, or you can select data for discard.

Let's assume G selects. This will likely be low-level data (OBLs on PR<0.01 pages, e.g.), which you would expect to have negligible impact. However, the Butterfly Effect of adding in this data does have a noticeable impact.

As a purely speculative addendum to this theory, random "waves" of "penalties" could occur when the data loss gets scaled up, to PR<0.011 for eg.
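
A toy Python sketch of that theory, with heavy caveats: this is not PageRank and not anything Google is known to do. It assumes a target's score is simply the sum of the PR values of the pages linking to it, and that links from pages below a cut-off are discarded. The names `score_targets` and `ranking`, the page names, and all the numbers are invented for the example. The point is only that restoring individually negligible links can flip an ordering.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical data: PR value of each linking page.
PR: Dict[str, float] = {
    "hub": 0.05,
    "mid": 0.03,
    "small1": 0.009,
    "small2": 0.009,
    "small3": 0.009,
}

# Hypothetical outbound links as (source page, target site).
LINKS: List[Tuple[str, str]] = [
    ("hub", "x.example"),
    ("mid", "y.example"),
    ("small1", "y.example"),
    ("small2", "y.example"),
    ("small3", "y.example"),
]


def score_targets(
    links: List[Tuple[str, str]], pr: Dict[str, float], min_pr: float
) -> Dict[str, float]:
    """Sum source PR per target, discarding links from pages below min_pr."""
    scores: Dict[str, float] = defaultdict(float)
    for source, target in links:
        if pr[source] >= min_pr:
            scores[target] += pr[source]
    return dict(scores)


def ranking(scores: Dict[str, float]) -> List[str]:
    """Order targets by score, highest first; name breaks ties."""
    return sorted(scores, key=lambda target: (-scores[target], target))


if __name__ == "__main__":
    # Old infrastructure (assumed): links from PR<0.01 pages are thrown away.
    print(ranking(score_targets(LINKS, PR, min_pr=0.01)))
    # New infrastructure (assumed): nothing is thrown away.
    print(ranking(score_targets(LINKS, PR, min_pr=0.0)))
```

With the cut-off at 0.01 only the "hub" and "mid" links count, so x.example leads; with no cut-off the three small links together push y.example ahead.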

Hissingsid




msg:3985430
 9:40 am on Sep 7, 2009 (gmt 0)

The pages that are no longer suffering "penalties" should not have been penalised.

The sites/pages I'm talking about were penalised for very understandable reasons. I liked them being penalised because my understanding of why they were helped me to plan my activities. Now I don't know if they are permanently back or if they will eventually have the penalty applied again. I'd like to know this as it would help me to plan.

As an aside I've noticed in my niche that virtually every site in the top 20 or 30 has bought links, some more than others. The current results for the most valuable terms are ranked by how good the site owner or their SEO is at buying links. If Google penalised everyone who had bought links the top of SERPS would be full of lame sites and that would have everyone running over to Bing. I wonder if Google's attitude will be forced to change in view of this.

Cheers

Sid

aristotle




msg:3985497
 12:27 pm on Sep 7, 2009 (gmt 0)

tedster wrote:
It's the complexity effect - the GFS isn't like a mysql database. Data is sharded into various kinds of pieces and then stored across a huge server farm. When a query comes in, that data (now in smithereens) gets accessed and re-assembled into SERPs -- with all kinds of odd anomalies along the way.

Thanks for the reply, tedster. So with all this "complexity" and "odd anomalies", is it possible that the Google employees themselves don't know exactly how the SERPs will be affected when Caffeine is implemented?

Shaddows




msg:3985511
 1:00 pm on Sep 7, 2009 (gmt 0)

is it possible that the Google employees themselves don't know exactly how the SERPs will be affected

Nailed on certainty, more like.

That's why every major update has a series of aftershocks, as unforeseen problems are fixed, patched and bodged.

And also why Caffeine specifically asks for feedback.

It's like saying, "don't meteorologists know the impact on weather if you rearrange the ocean currents?" There's no way to accurately predict results given current ocean currents (or SE algos), let alone what will happen if a major component changes.

Badcol




msg:3985533
 2:12 pm on Sep 7, 2009 (gmt 0)

Certainly a huge ripple effect after every alteration. It's always a bit like lion taming with G. They get the system in a big box and then poke it with a stick until it does what it's told ;-)

I wonder if this time they have ironed out the Christmas lead drought that seems to happen every year around Columbus Day and finally abates in the first week of January?

All the best

BC

Hissingsid




msg:3985591
 4:01 pm on Sep 7, 2009 (gmt 0)

They know that if a butterfly flaps its wings in China a member of the lepidoptera family may have fluttered somewhere in east Asia!

It's like chaos, only not so well organised.

Cheers

Sid

CainIV




msg:3985681
 7:44 pm on Sep 7, 2009 (gmt 0)

If the Caffeine infrastructure is already being used as others have said in this thread

Right, and of course they [we] all know about as much about the timing of this release, whether it has [or has not] been already incorporated at least in some segmentation to the current SERP's, which is why I mentioned 'perhaps'.

I, for one, cannot see how the Caffeine results currently 'match' to the current SERP's in any way more than providing another 'flavor' of what is currently there. But I do subscribe to the summer "we busted it and have the fix now" theory.

Badcol




msg:3985685
 8:00 pm on Sep 7, 2009 (gmt 0)

Hi Sid,

Good to hear your voice again ... been a long time !

However, there's probably an algo written to prevent a butterfly from flapping its wings in China these days ;-)

Col :-)

cangoou




msg:3985711
 8:27 pm on Sep 7, 2009 (gmt 0)

Yeah, if the butterfly flaps as is its nature, it gets banned 50 feet behind... Sorry, it had to be said ;-)

Hm, nothing big changed today, so what's the next big holiday we can wait for?

barretire




msg:3985745
 10:44 pm on Sep 7, 2009 (gmt 0)

The day's not over yet. I am still curious to see if anything changes when I wake up in the morning.

steveb




msg:3985756
 11:54 pm on Sep 7, 2009 (gmt 0)

The results are completely different still. Google likes holidays but this is not the only one.

They only asked for feedback a little while ago. They aren't in some big hurry.

brinked




msg:3985855
 6:44 am on Sep 8, 2009 (gmt 0)

I would look for some changes to Google on Tuesday morning. I always see major changes the day following a long weekend... I always remember waking up in the morning for work and checking my SERPs. I would not be surprised.

Hissingsid




msg:3985881
 8:22 am on Sep 8, 2009 (gmt 0)

I agree with steveb.

If you knew how well he does in the most competitive market outside of $orno and pills you would agree with him too!

Cheers

Sid

Badcol




msg:3985893
 8:57 am on Sep 8, 2009 (gmt 0)

I'm not seeing much of a change to the results since last Thursday, but I am seeing a massive URL update in the sandbox. A site I work on is now showing a full 2600 pages on sandbox, but only 232 on regular serps.

Cheers

BC

moftary




msg:3986229
 8:18 pm on Sep 8, 2009 (gmt 0)

It's the complexity effect - the GFS isn't like a mysql database. Data is sharded into various kinds of pieces and then stored across a huge server farm.

To me it's something like DRBD.
I see very outdated results in Caffeine, so it's like they pushed all the data back to how it was years ago and are now reapplying algorithms, filters, etc.

But of course, it's not that simple :)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved