

Google Handles Server Failure

     

Brett_Tabke

1:30 pm on Mar 3, 2005 (gmt 0)

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



[nwc.linuxpipeline.com...]

Ensuring reliability was another concern. With so many commodity hardware servers, "expect to lose one a day," he said. Google decided to "try to deal with that in an automated way. Otherwise, you will have lots of people running around trying to restart servers."

cnet [news.com.com]

According to Hoelzle, Google has inexpensively built out its computing infrastructure by using thousands of "commodity" servers, instead of fewer high-end, and high-priced, machines. The trick is to make these racks of hardware work together and to ensure that the failure of one machine doesn't derail an operation.

treeline

5:23 pm on Mar 3, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Secret to success: keeping your users' interests first, no matter the circumstances.

Hoelzle then flashed a picture on the screen of six fire trucks at a Google data center. "I can't tell you what happened, but it's not about one machine going down," he said. He didn't disclose when the incident occurred. "No users were harmed in this picture," he added.

Teknorat

12:51 am on Mar 4, 2005 (gmt 0)

10+ Year Member



Is it kind of weird they won't tell us what happened? Or is that just me?

Chico_Loco

3:03 am on Mar 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is it kind of weird they won't tell us what happened? Or is that just me?

If you owned a large public company and one of your "super-smart PhDs" forgot to extinguish a cigarette correctly, left a candle burning, or went crazy from coding too much and developed para maniac tendencies, would you want to tell people about it?

Tapolyai

12:35 pm on Mar 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Or used half a dozen extension cords in a chain, with everything plugged into one circuit, one extension cord. Seen it. Have put a fire out on it. Stinks. Literally.

carguy84

6:12 pm on Mar 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



man, I wish there was another 10 pages explaining their technology. :( good read.

rise2it

6:16 am on Mar 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If one of us had been there, we could've put out the fire with our free Google beach towel...

:-)

victor

7:11 am on Mar 5, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



another 10 pages explaining their technology.

One major chunk -- the filing system -- explained here:

[labs.google.com...]

Also possibly explains (see other current thread) why they need good operating system developers.

rich42

9:56 am on Mar 8, 2005 (gmt 0)

10+ Year Member



There are good lessons here for small-scale webmasters too...

I spent a long time trying to find a hosting provider that actually delivered on its 99.99% uptime claims (I never found one).

I finally figured out I was better off going with two cheap providers that were reasonably good, say 99.5% each, with some monitoring software that automates fail-over via dynamic DNS. A server can crash when I'm out hiking and things keep running...
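As a rough illustration of why two 99.5% providers can beat one provider claiming 99.99%, here is a minimal Python sketch. It assumes the two providers fail independently, which is optimistic (shared DNS, upstream networks, or a bad deploy can take out both), and it ignores the time the monitor needs to notice a failure plus any DNS TTL delay. The function names and the `is_healthy` callback are hypothetical and are not taken from any particular monitoring tool.

```python
HOURS_PER_YEAR = 365 * 24


def combined_availability(availabilities):
    """Availability of redundant providers, ASSUMING independent failures.

    The site is down only when every provider is down at the same time.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def pick_active(hosts, is_healthy):
    """Return the first host whose health check passes, or None.

    `is_healthy` is a hypothetical callback (for example an HTTP probe).
    A real fail-over script would then point the DNS record at the result.
    """
    for host in hosts:
        if is_healthy(host):
            return host
    return None


if __name__ == "__main__":
    single = 0.995
    pair = combined_availability([single, single])
    print(f"one provider:  {downtime_hours_per_year(single):.1f} h/year down")
    print(f"two providers: {downtime_hours_per_year(pair):.2f} h/year down")
```

Under those assumptions, one 99.5% provider works out to about 43.8 hours of downtime a year, while the pair comes to roughly 0.2 hours, which is better than a single provider at 99.99% (about 0.9 hours). In practice the fail-over delay will eat into that margin.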

side note: google actually seems to be down at the moment...

Xoc

2:41 pm on Mar 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wonder what kind of hardware they are using for their servers. Who makes the components? Can I buy the same hardware?

carguy84

8:05 pm on Mar 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not knowing what your specific requirements are, Xoc, have you seen these:

[google.com...]

Chip-

Xoc

9:59 pm on Mar 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yeah, but that's not what I'm interested in. What I want is a cheap, reliable web server. If Google is buying these things in the (tens of?) thousands, the price is probably pretty good, the components field-tested and reliable, etc. When something dies, can it be replaced easily? If it's good enough for Google, it's probably good enough for me.