


Behind the Scenes at Google IT Department

1:56 am on Jul 11, 2006 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke — Top Contributor of All Time, 10+ Year Member

joined:Sept 21, 1999
votes: 96

An absolutely righteous article every Google watcher should read:


For example, the GFS ensures that for every file, at least three copies are stored on different computers in a given server cluster. That means if a computer program tries to read a file from one of those computers, and it fails to respond within a few milliseconds, at least two others will be able to fulfill the request. Such redundancy is important because Google's search system regularly experiences "application bugs, operating system bugs, human errors, and the failures of disks, memory, connectors, networking and power supplies," according to the paper.
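The failover pattern described above can be sketched in a few lines. This is a hypothetical illustration, not Google's code: the replica names, the `read_from` stand-in, and the timeout value are all invented for the example. The idea is simply that a reader tries one copy, waits only a few milliseconds for a response, and falls through to the next replica on failure.

```python
import concurrent.futures

# Invented replica list and timeout, for illustration only.
REPLICAS = ["chunkserver-a", "chunkserver-b", "chunkserver-c"]
TIMEOUT_S = 0.005  # "a few milliseconds"

def read_from(server, path):
    # Stand-in for a network read. A dead or overloaded server
    # would hang or raise instead of returning promptly.
    return f"<contents of {path} from {server}>"

def read_with_failover(path, replicas=REPLICAS):
    # One worker per replica so a hung attempt can't block the next one.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        for server in replicas:
            future = pool.submit(read_from, server, path)
            try:
                # If this copy doesn't answer within the deadline,
                # move on to the next copy instead of waiting.
                return future.result(timeout=TIMEOUT_S)
            except (concurrent.futures.TimeoutError, OSError):
                continue
    raise IOError(f"all replicas failed for {path}")
```

With three copies of every file, all three replicas would have to fail before a read is lost — which is the redundancy the paper is describing.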
2:52 am on July 11, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 8, 2003
votes: 0

A nice round-up but nothing new.
5:02 am on July 11, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 26, 2003
votes: 0

Egad! Now they are building in duplicate content!
5:55 am on July 11, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 19, 2004
votes: 0

Regarding the recent 'anomalies' with Google, this is pretty relevant information...
11:06 am on July 11, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 29, 2003
votes: 0

Google's market share for search among U.S. Internet users reached 43% in April, compared with 28% for Yahoo and 12.9% for The Microsoft Network (MSN).

Does anybody know how they define market share? It can't just be clicks on search results, can it?

12:22 pm on July 11, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 14, 2005
votes: 0

Just Guessing, I think that the sentiment Brett is thrusting to encapsulate is one of splendour and marvel, in deep appreciation of the merit of the aforementioned article.

Or as we Brits might say, it's "wicked"

11:36 am on July 13, 2006 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 25, 2003
votes: 0

Do you think Google have signed up with Hitwise :)
4:07 pm on July 18, 2006 (gmt 0)

Full Member

10+ Year Member

joined:May 13, 2004
votes: 0

I signed up, printed the PDF, and ... finally ... have read it all. Lots of information in there, after the initial summary.

Among many interesting bits, this stood out (emphasis mine):

Google's systems seem to work well for Google. But if you could run your own systems on the Google File System, would you want to? Or is this an architecture only a search engine could love?

Distributed file systems have been around since the 1980s, when the Sun Microsystems Network File System and the Andrew File System, developed at Carnegie Mellon University, first appeared. Software engineer and blogger Jeff Darcy says the system has a lot in common with the HighRoad system he worked on at EMC. However, he notes that Google's decision to "relax" conventional requirements for file system data consistency in the pursuit of performance makes its system "unsuitable for a wide variety of potential applications." And because it doesn't support operating system integration, it's really more of a storage utility than a file system per se, he says. To a large extent, Google's design strikes him as more of a synthesis of many prior efforts in distributed storage.

So Google has an amazing, exotic, distributed (and proprietary) file system, which, it seems, really only works well for Google's own applications ... and in part because of the decision to opt for performance over consistency?!
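To make that trade-off concrete, here is a toy sketch (not Google's code, and the class and record names are invented) of the kind of relaxed semantics Darcy is pointing at: an at-least-once append, where a client retry after an ambiguous failure can leave the same record in the file twice, and readers are expected to de-duplicate.

```python
class FlakyChunk:
    """Toy stand-in for a chunk of a distributed file."""
    def __init__(self):
        self.records = []

    def append(self, record, lose_ack=False):
        self.records.append(record)        # the write actually lands...
        if lose_ack:
            raise TimeoutError("ack lost")  # ...but the client never hears back

chunk = FlakyChunk()
try:
    # Ambiguous failure: the client can't tell whether the write succeeded.
    chunk.append(("id-42", "payload"), lose_ack=True)
except TimeoutError:
    # Safe response under at-least-once semantics: retry the append.
    chunk.append(("id-42", "payload"))

# The file now holds the record twice; a reader de-dupes by record ID.
unique = {rec_id: data for rec_id, data in chunk.records}
```

A conventional file system would be expected to prevent that duplicate; GFS instead pushes the cleanup onto the applications — fine for Google's own software, but, as Darcy says, unsuitable for many others.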

That has ramifications ...