Forum Moderators: phranque

Amazon Cloud Outage Affected Major Sites

     
10:29 am on Apr 22, 2011 (gmt 0)

Administrator from GB 

engine: WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member

joined:May 9, 2000
posts:23245
votes: 357


Amazon Cloud Outage Affected Major Sites [news.cnet.com]

A partial failure at Amazon Web Services' cloud-computing infrastructure brought down some Internet operations today, including the Web sites of Quora and Reddit.

The outage struck the Elastic Compute Cloud (EC2) service at Amazon's northern Virginia site, which handles AWS operations for the U.S. East Coast. The problems began at 1:41 a.m. PT, according to Amazon's AWS status dashboard, with delays and errors when connecting to servers over a network.

We'd like to provide additional color on what we're working on right now (please note that we always know more and understand issues better after we fully recover and dive deep into the post mortem). A networking event early this morning triggered a large amount of re-mirroring of EBS [Elastic Block Store] volumes in US-EAST-1. This re-mirroring created a shortage of capacity in one of the US-EAST-1 Availability Zones, which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes. Additionally, one of our internal control planes for EBS has become inundated such that it's difficult to create new EBS volumes and EBS backed instances. We are working as quickly as possible to add capacity to that one Availability Zone to speed up the re-mirroring, and working to restore the control plane issue. We're starting to see progress on these efforts, but are not there yet. We will continue to provide updates when we have them.
12:11 pm on Apr 22, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 23, 2002
posts:659
votes: 0


Clouds are great, but when they go down they are a single point of failure that will take your business out.
1:18 pm on Apr 22, 2011 (gmt 0)

Junior Member

10+ Year Member

joined:June 15, 2003
posts:125
votes: 0


We were in the final stages of deciding to move to AWS when this happened. We are now working on redundancy to make sure we don't get caught like this. Currently we have identical servers in very secure data centers 100 miles apart that monitor each other constantly with a program called RepliStor. If we can solve the redundancy issue, we estimate we will save 80% or more on current costs, not including the issue of replacing aging servers.
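
For anyone wondering what "monitor each other constantly" boils down to, here is a rough sketch of the heartbeat idea. It is not how RepliStor works internally (I am not claiming that); the class name, the 30-second timeout, and the timestamps are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Tracks heartbeats from the other site (hypothetical helper, not a real product API)."""

    timeout_s: float        # assumed threshold: silence longer than this counts as "down"
    last_seen: float = 0.0  # timestamp (seconds) of the most recent heartbeat

    def record_heartbeat(self, now: float) -> None:
        # Called whenever the peer site checks in.
        self.last_seen = now

    def peer_is_down(self, now: float) -> bool:
        # The peer is treated as down once it has been silent for longer than the timeout.
        return (now - self.last_seen) > self.timeout_s


def should_take_over(monitor: PeerMonitor, now: float, i_am_standby: bool) -> bool:
    # Only the standby site promotes itself, and only when the peer has gone quiet.
    return i_am_standby and monitor.peer_is_down(now)


# Usage: the standby site last heard from the primary at t=100 s.
monitor = PeerMonitor(timeout_s=30.0)
monitor.record_heartbeat(now=100.0)
print(should_take_over(monitor, now=120.0, i_am_standby=True))  # still within the timeout
print(should_take_over(monitor, now=140.0, i_am_standby=True))  # peer silent too long
```

The hard part in practice is telling "the other site is down" apart from "the link between the sites is down", which a timeout alone cannot do.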
3:25 pm on Apr 22, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Jan 30, 2006
posts:1661
votes: 10



Clouds are great, but when they go down they are a single point of failure that will take your business out.


Which is exactly how the cloud is NOT sold. It's the cloud, it's always on, it's always backed up, everything is happy up there!
6:10 pm on Apr 22, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 26, 2003
posts:1284
votes: 0


So essentially Amazon has way oversold its capacity and didn't correctly engineer its storage zones to be fault tolerant. Nice.

I interviewed there and it was about the most inhumane and depressing interview I've ever witnessed. They're so proud of their crap that they laugh you out the door if you speak on behalf of your own enterprise tools/SAN/NAS experience.

Look who's laughing now. ha ha ha :P
8:46 pm on Apr 22, 2011 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 22, 2005
posts:1165
votes: 7


People won't buy a Pepsi car, wear Crest shirts or live in Kodak homes. Why use Amazon's computer services?

Seriously, eleven years ago I worked for a large regional retailer that did what Amazon is doing: they sold their computer expertise. AND REAL COMPANIES BOUGHT IT. It was amazing. Did our customers get burned? Ohhhh yeah.

The last time I leased major computer power I took a look at Amazon. It was obvious that they were not serious about this business. And, we were building it out to sell. So, we went with a brand name firm. I was worried that our buyers would see Amazon as our provider as a negative.