Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 71 message thread spans 3 pages (this is page 2 of 3)
Google announces Caffeine is completed
travelin cat

 12:12 am on Jun 9, 2010 (gmt 0)

Today, we're announcing the completion of a new web indexing system called Caffeine. Caffeine provides 50 percent fresher results for web searches than our last index, and it's the largest collection of web content we've offered. Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was possible ever before.




 7:02 am on Jun 9, 2010 (gmt 0)

My posts, even in this thread, seem confused about what Caffeine is and isn't. These are confusing times. Though I'm still quite angry to see my quality site outranked by a mashup site, a sub-domain abuser and a "picture search" engine that's got ALL of my images hotlinked from Google cache and outranks me for them, I will be fair here.

Google has taken care of crap like that in the past, so I'll wait to see if they get it right again. My traffic has also climbed by about 9% since this rolled out (though it is still down quite a bit).

If Google can take care of the spam, I like the new changes. If the type of site currently ranking much better (mashup, multi-domain, big site with lots of pages that only spams a little) is what Google wants, this is a disaster; webmasters shouldn't have problems outranking sites that are re-displaying their own content.


 7:39 am on Jun 9, 2010 (gmt 0)

One question, talking about the number of pages indexed: how did you check those numbers? From WMT or from site: operator queries?


 8:32 am on Jun 9, 2010 (gmt 0)

Best news in a month, at least it is one less variable to try and work out.

It could also be the reason why they haven't downloaded our new sitemap yet, even though we submitted it 48 hours ago.


 9:41 am on Jun 9, 2010 (gmt 0)

Still a lot of pages missing. Is deep crawling still going on for you?

To be honest, I don't understand this "G re-crawls everything". I mean, in the past the date and timeline of the links you set played a big part (for example, you shouldn't set too many new links to one site). How does this go into the algo when they do a complete re-index?


 10:50 am on Jun 9, 2010 (gmt 0)

An article about this at PC World includes some quotes from Matt Cutts:

[pcworld.com]


 10:58 am on Jun 9, 2010 (gmt 0)

Five months, and now caffeine is here.

Google has cut way back on crawling, stopped updating 301s, and is often displaying the wrong URL duplicate content... but they can pile papers 3 miles high.

Wouldn't it be cool to be a near monopoly where you could just spend five months making your product worse and you don't even have to think it could be possible that you would lose market share?


LOL, and now going to check I see that in the past day Google has hurled back into the index tons of obsolete URLs and things that were duplicates six months ago but are not duplicates now.

However good or bad this change works out years from now, you just don't intentionally do moronic things like this, which means that while it might be rolled out, Caffeine is seriously screwed up.


 11:34 am on Jun 9, 2010 (gmt 0)

Am I to infer that this is primarily long tail results and by fresher Google means more recent? If so that's a spammer's paradise, content like guides and how-tos don't need to be re-written but are often scraped and regurgitated and worse - mashed up.

That is exactly what will happen. The old notion that it can take months to get into Google is now out the window. This is fertile ground for the scammers. SEO is basically a big game of "King of the Mountain" and now the king will be different every few hours. Why not just make the "I feel lucky" button the standard search? It sounds like it's gonna be the Wild West all over again.


 11:50 am on Jun 9, 2010 (gmt 0)

I'm not sure why they decided freshness=better, or where freshness is even relevant. They're indexing information, now they're saying on a broad swath what they want to index is news.

Perhaps that's the case, and what they intend. But it's not what's important in my niche. Stuff hasn't changed for 50 years.


 12:06 pm on Jun 9, 2010 (gmt 0)

Did I miss something? Where did they say freshness is better, if you search for "Anytime"? Can't confirm this at all, good old pages are stable as always - without having changed the content for 12 or 24 months now... I thought it means that new or changed pages are indexed faster but then ranked according to the algorithm. Is freshness really such a factor in that?

@steveb: I guess we are 12 months on now from the announcement of Caffeine...


 1:57 pm on Jun 9, 2010 (gmt 0)

One thing I've noticed over just the past few days is that my GWT reports TONS more 'crawl errors' of many more types!

My site is fairly large, 325k pages with 250k of those indexed (per GWT sitemap stats) so I bet Caffeine enabled them to spider/analyze the whole thing faster which is good. It's also a long-tail site that I previously thought was high-quality & trusted - but got entirely dropped from rankings/traffic. So if I can figure out why Mayday suddenly reversed my quality/rankings, hopefully caffeine will help recover it faster.

OTOH, I notice a long lag in the issues reported under Page Speed, but that's still in Labs, so maybe it isn't running off Caffeine.


 2:50 pm on Jun 9, 2010 (gmt 0)

From studying the various bits of documentation, here's how I understand that the Caffeine infrastructure will now affect ranking, no matter what algorithm changes Google cares to develop.

One of the functions mentioned in various Google documents is that Caffeine allows faster tagging, retrieval and refreshing of the various ranking signals. The scores for these signals are stored as layers of meta data that Google attaches to each web document in their index.

Once Caffeine is launched (that is, starting right now) Google has a new agility that allows them to create and refresh a greater NUMBER of signals. So the 200+ ranking factors that are so often mentioned will now have room to grow into a longer list.

The old Big Daddy infrastructure was apparently "topped out" in this area. It was also sluggish to refresh when a new crawl was done for a document. That refreshing apparently used to be done only for large chunks of the index at a time. Now the whole process can happen in a nearly continual fashion and can occur over smaller, more modular bits of the index.


 3:09 pm on Jun 9, 2010 (gmt 0)

One of the things I think it's important to keep in mind is:
This is what we've all been waiting for, so to speak...

Google had two different storage and indexing systems in place with an algo that ranked results for either. They also probably had teams of people working on developing and integrating Caffeine. Now that the rollout is complete they can work on moving forward a bit more with the fine tuning of things, because they should have way more 'free resources' (including people) to tune, adjust, and refine other things instead of concentrating on updating the structure and storage systems.

One of the things I know from having been reading the update threads for years is there are often periods of time after major changes in the algo or infrastructure where spam 'floats to the top'. It's happened before, and will probably happen again, but every time there have been 'Google's Broken', 'they can't keep doing this', 'they've really done it this time' threads they have always made adjustments and 'the show goes on'.

The threads we've had here about Google's showing of spam in the results aren't really new... The names of the posters have changed, but the same 'stuff' still applies, so if there are those who think the type of 'spammy results' they are reporting is new, I suggest reading back through the update threads, because you will probably notice the same thing has happened before, and interestingly, with each update and each time people have said Google has moved backward somehow, someway they keep moving forward.

I think now that Caffeine is actually in place and they have some more free resources we will probably see 'algo integration and refinement' over the next weeks and months, which is really what we've all been waiting for, and my guess is them too...


 4:49 pm on Jun 9, 2010 (gmt 0)

So pardon the newbie question here please: is this live? If I search google.com now, do I see Caffeine?


 5:00 pm on Jun 9, 2010 (gmt 0)

According to Google: Yes, you see Caffeine.


 5:08 pm on Jun 9, 2010 (gmt 0)

I did notice a much more rapid pickup of a few pages last night. I think we all need to distinguish Caffeine, which is the infrastructure/capabilities portion of search, from the algo change, which has been the real problem for most of us.


 6:42 pm on Jun 9, 2010 (gmt 0)

Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner...
What if you manage a regular website (online store, brochure site, non-profit org site...) and NOT a news site, blog or forum? Sounds like the new Google emphasis is on the new ONLY, so I guess you'll have to buy Adwords to rank.

 8:01 pm on Jun 9, 2010 (gmt 0)

It sounds like "new" pages could get an extra rankings boost just for their newness. But a page only remains "new" for a short time, so I'm hoping that this is a temporary boost, so that these pages will soon fade from the top and the established pages from old authoritative sites will quickly return to their old positions.


 8:07 pm on Jun 9, 2010 (gmt 0)

It may sound like that, but I don't think Google's full understanding of "fresh" is the same as simply "new" or "in the news". As I mentioned above, with Caffeine, Google now has the ability to refresh the ranking factors that are "attached" to any URL in a nearly continual fashion. This means that their search results can use fresh DATA ABOUT all their indexed content when they generate search results, but not that only fresh CONTENT can rank well.

It has been true for quite a while that new content can get a ranking boost - for some queries (it's been called QDF, for Query Deserves Freshness). That's not new with Caffeine and it certainly will continue - but it's only for certain queries, and it only seems to affect a few first page positions even on those searches.


 8:59 pm on Jun 9, 2010 (gmt 0)

Thanks for clearing that up, Tedster. Except I'm not convinced that "it's only for certain queries". I think it could happen in exceptional cases for almost any query. For example a new article about some old historical mystery that appears on a highly authoritative site might temporarily get a high ranking even though that subject normally isn't QDF.


 9:21 pm on Jun 9, 2010 (gmt 0)

No doubt - I've definitely seen that kind of short term ranking boost.

I also think that QDF may be assigned to a query term based on an assortment of social buzz metrics - it's not a "set it and forget it" kind of thing for Google. I'd also imagine that a sudden spike in the number of people searching on a term could earn it a QDF tag for a while.


 4:23 am on Jun 10, 2010 (gmt 0)

Caffeine is a disaster. I don't know what is going on but the search results are the worst I have ever seen. Example: search for "Black Duck Beauty In The Winds" (a famous Chinese musical group that has a CD called "Beauty In The Winds"). The results on page one are 90% about anything that has the name Black Duck and ignore the Beauty In The Winds. This example holds across the board. It's as if the SERPS have been nuked.


 5:14 am on Jun 10, 2010 (gmt 0)

I don't think fresh means "new", I think it means updated recently. If an article continually gets new comments added to it, that signifies the document is current and fresh, and I'd want to consider that if I were Google.

I'm thinking about revisiting the "change frequency" of my sitemap - build up a bit of trust with Google.
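For anyone unsure what the "change frequency" setting refers to, here is a minimal sketch of generating sitemap entries with a `changefreq` value, following the public sitemaps.org protocol. The function name `build_sitemap` and the example.com URLs are made up for illustration. The protocol describes `changefreq` as a hint to crawlers rather than a command, so whether changing it affects crawling or trust is an open question.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Values the protocol allows for <changefreq>.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}


def build_sitemap(entries):
    """Build sitemap XML from (url, changefreq) pairs (hypothetical helper)."""
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, changefreq in entries:
        if changefreq not in VALID_CHANGEFREQ:
            raise ValueError(f"invalid changefreq: {changefreq!r}")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs: a frequently commented article and a static guide.
    print(build_sitemap([
        ("http://www.example.com/articles/busy-thread", "daily"),
        ("http://www.example.com/guides/evergreen", "yearly"),
    ]))
```

The value should describe how often the page really changes; the sketch only rejects values the protocol does not define.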


 5:39 am on Jun 10, 2010 (gmt 0)

Oh please, let it be just a matter of updating old content to regain rankings... please?


 6:03 am on Jun 10, 2010 (gmt 0)

Just wondering something about Caffeine:

If you post something on a blog or, let's say, Twitter, it is updated within seconds, but what if you change the title of your webpage or the description? How long does it take for Caffeine to catch the new title?

It used to be about a week before; what about now?


 6:12 am on Jun 10, 2010 (gmt 0)

@member22, it depends on how often Googlebot crawls your site and indexes the changes. It can take 1 day or weeks.


 8:16 am on Jun 10, 2010 (gmt 0)

thanks let me know!

Robert Charlton

 8:16 am on Jun 10, 2010 (gmt 0)

Caffeine is a disaster.... It's as if the SERPS have been nuked.

Slinger - We generally don't allow specifics, but I don't think the particular search you mention is going to start a marketing war, as it's not very common on US results in all three major engines.

If you put the six words you suggest in quotes... ie, the group name and the song name together... you actually reduce the number of useful results, as the quotes make it a search for an exact word match.

So I searched two ways...

- for the six words (group and song name) without quotes (ie, an all-the-word search), and...
- for the group name and the song name each in quotes (ie, an exact match for each).
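To illustrate why the quoting matters, here is a toy sketch of the difference between an all-the-words match and an exact-phrase match. It is not how Google, Bing, or Yahoo actually work; the function names and the sample page text are invented for the example.

```python
import re


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_words(query, doc):
    """True if every query word appears somewhere in doc, in any order."""
    doc_words = set(tokens(doc))
    return all(word in doc_words for word in tokens(query))


def matches_exact_phrase(phrase, doc):
    """True if the phrase's words appear in doc consecutively, in order."""
    p, d = tokens(phrase), tokens(doc)
    return any(d[i:i + len(p)] == p for i in range(len(d) - len(p) + 1))


if __name__ == "__main__":
    # Invented page text that names the album and the group separately.
    page = "Beauty In The Winds, an album by Black Duck"

    # All six words, no quotes: the page matches.
    print(matches_all_words("Black Duck Beauty In The Winds", page))
    # All six words as one quoted phrase: the page does not match.
    print(matches_exact_phrase("Black Duck Beauty In The Winds", page))
    # Group name and title quoted separately: both phrases match.
    print(matches_exact_phrase("Black Duck", page)
          and matches_exact_phrase("Beauty In The Winds", page))
```

A page that mentions the group and the title in different places satisfies the unquoted search and the two separately quoted phrases, but not a single six-word quoted phrase.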

I searched Google, Bing, and Yahoo.

I'd say that for the six words without quotes, the first page of Google gave the most results that were directly useful... ie, with the album in question on the page returned. Bing gave the worst, with no accurate results, and Yahoo returned the page with the album (which Google had returned as #1) in its #4 spot.

Searching for the exact group name and song title, I got essentially the same results on Google and Yahoo, and no sites in English on Bing.

I don't know how many pages you were expecting to see satisfying the query, but I can pretty safely say that in English language serps as indexed by the three major search engines, there aren't enough results to fill the first page.

Beyond that, note that Caffeine is the infrastructure, and MayDay is the algo. It's the algo right now that's more controversial than the infrastructure, and justifiably so. I suspect the Caffeine infrastructure is pretty solid. The algo is still having some problems.

[edited by: Robert_Charlton at 8:17 am (utc) on Jun 10, 2010]


 8:17 am on Jun 10, 2010 (gmt 0)


Nothing changed with Google Caffeine? It is all a matter of PR? That sounds strange to me, but why not...


 9:59 am on Jun 10, 2010 (gmt 0)

I love this bit of the quote>>

"and it's the largest collection of web content we've offered"

... they will start to call it all their content soon


 10:08 am on Jun 10, 2010 (gmt 0)

I think now that Caffeine has been launched and completed, Mayday is going to be tweaked in the new environment and will get all sorts of filters added which may not have occurred on the "old" infrastructure... I do see some changes (though slight) when I look at our website and its queries, and it seems we are getting the favour again after the 90+% drop we have seen over the last couple of weeks.

I hope that user Dusky is right still :)


 10:55 am on Jun 10, 2010 (gmt 0)

Ah, guess that's why Google.co.uk is down at the minute then, used up all the resources lol!

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved