Forum Moderators: Robert Charlton & goodroi

Google Spanner & Infrastructure Information

         

inbound

4:39 pm on Oct 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not sure if this fits Google Search News; it's more about their infrastructure than about how that translates to the SERPs.

A very interesting presentation is available from a recent Large Scale Distributed Systems Symposium held at Cornell:

[cs.cornell.edu...]

This gives a lot of detail on how Google manage their data and datacenters. Some of it will be old if you follow the subject, but some is new and there are many figures. I'm not so sure how up to date they are, and Google may have changed some for competitive reasons too, but given the audience it was aimed at I think they will not be too far off.

Spanner is interesting: Google are looking to have around 1 to 10 million servers around the globe, in hundreds or thousands of locations.

I suppose the biggest thing to take from this is the methodology used to access data. We already know that Google has never been up for the idea of "big iron", preferring lots of networked commodity hardware and smart software. Well, it looks as though they are now cemented in that approach and have systems built around serving the needs of the majority. This leads me to believe that they are unlikely to be able to access enough data in a reasonable timeframe to massively increase the amount of information used for any given web search: they handle too many searches and hold too much data.

Is there room for a different approach to web search that serves a niche audience? I think there is.

Robert Charlton

6:43 pm on Oct 24, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Perhaps not exactly on point, but to add another reference here about Google and scalability, it's worth noting the first half-dozen or so posts about the Google File System, which are easily lost in our October 2009 Updates discussion...

[webmasterworld.com...]

Google File System v2

A couple of years ago at the first Seattle Conference on Scalability, Google's Jeffrey Dean remarked that the company wanted 100x more scalability. Unsurprising given the rapid growth of the web. But there was more to it than that: GFS - the Google File System was running out of scalability.

[storagemojo.com...]

Background Here:
[storagemojo.com...]

...and also...

The Google File System - Abstract
Google Research Publications
[labs.google.com...]

inbound

12:22 am on Oct 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Also, there are mentions of Colossus (GFS 3) in the presentation - sounds like a suitable name...