
Forum Moderators: httpwebwitch & not2easy


Facebook Service Unavailable - DNS failure message

     
8:00 pm on Sept 23, 2010 (gmt 0)

Senior Member

joined:June 3, 2007
posts:6024
votes: 0


Ok, so where are you Facebook?
8:24 pm on Sept 23, 2010 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 15, 2005
posts:152
votes: 0


All those embedded like buttons are also showing up with that same error... If you implemented these, check your site for a big text-block error message.
8:28 pm on Sept 23, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member

joined:July 26, 2006
posts:1619
votes: 0


Guess someone didn't like all the publicity that they had on the news this morning.
9:01 pm on Sept 23, 2010 (gmt 0)

Moderator This Forum from CA 

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 29, 2003
posts:4059
votes: 0


My life is on hold
9:02 pm on Sept 23, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 20, 2004
posts:2377
votes: 0


Yup my FB comments boxes are down.
9:06 pm on Sept 23, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Here are some tidbits from Twitter:

Tweetdeck: Facebook is currently suffering a major outage which is impacting TweetDeck FB columns too. We suggest removing FB accounts until fixed.

[twitter.com...]

Facebook: Facebook may be slow or unavailable for some people because of site issues. We're working to fix this quickly.

[twitter.com...]

Where's Facebook's version of the Fail Whale?
9:16 pm on Sept 23, 2010 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5551
votes: 24


Seems to be alive now.

<unhold category="life" user="httpwebwitch" />

I like this tweet [twitter.com]:
BREAKING NEWS Facebook down. Worker productivity rises. US climbs out of recession.
9:32 pm on Sept 23, 2010 (gmt 0)

Full Member

10+ Year Member

joined:Dec 26, 2000
posts:323
votes: 0


I saw this myself. Now I'm getting 503 status on a completely different site that hosts data. Could something bigger be up, like a DoS attack/worm?
9:54 pm on Sept 23, 2010 (gmt 0)

Preferred Member

10+ Year Member

joined:Aug 1, 2003
posts:392
votes: 0


Maybe the major DoS attack that brought down Nettica for hours yesterday was a practice run or related.
11:58 pm on Sept 23, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Here's a summary.

Facebook likely disappointed millions of bored office workers again on Thursday afternoon with a widespread outage and latency, a day after an outage shut down the site for hours...

According to AlertSite, a Website performance service and vendor, Facebook only had 38 percent availability with 60 second response times.

Meanwhile, until service was restored, frustrated Facebook users overwhelmingly turned to micro-blogging site Twitter to tweet their unhappiness -- a slight irony due to the fact that Twitter itself was the recipient of a massive cross-site scripting attack that bombarded users with pop-ups, rainbow tweets and #*$!ography just two days prior.

[crn.com...]
12:03 pm on Sept 24, 2010 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month Best Post Of The Month

joined:May 9, 2000
posts:22305
votes: 239


More Details on Today's Outage [facebook.com]
The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition. An automated system for verifying configuration values ended up causing much more damage than it fixed. The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store. This works well for a transient problem with the cache, but it doesn't work when the persistent store is invalid.



Today we made a change to the persistent copy of a configuration value that was interpreted as invalid. This meant that every single client saw the invalid value and attempted to fix it. Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second.
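
To make the feedback loop described above concrete, here is a minimal sketch. It is not Facebook's code: every name in it (`is_valid`, `PersistentStore`, `client_get`, `simulate`, the `max_conn` key) and the validity rule are assumptions made up for illustration. It only shows why "refresh from the persistent store when the cached value looks invalid" repairs a bad cache after one query, but turns every request into a database query when the persistent copy itself is the invalid one.

```python
# Hypothetical illustration of the failure mode; all names are invented.

def is_valid(value):
    # Assumed rule for this sketch: a config value must be a positive int.
    return isinstance(value, int) and value > 0


class PersistentStore:
    """Stands in for the database cluster and counts the queries it receives."""

    def __init__(self, value):
        self.value = value
        self.queries = 0

    def read(self):
        self.queries += 1
        return self.value


def client_get(cache, store, key):
    """Return the cached value, refreshing it from the store if it looks invalid."""
    value = cache.get(key)
    if not is_valid(value):
        value = store.read()   # the "fix": one query to the persistent store
        cache[key] = value     # if the store is also invalid, nothing is repaired
    return value


def simulate(store_value, requests=1000):
    """Run `requests` client reads and return how many hit the store."""
    cache = {"max_conn": -1}   # start with a bad cached value
    store = PersistentStore(store_value)
    for _ in range(requests):
        client_get(cache, store, "max_conn")
    return store.queries


if __name__ == "__main__":
    print("store valid:  ", simulate(10), "store queries")
    print("store invalid:", simulate(-1), "store queries")
```

With a valid persistent value, only the first request reaches the store. With an invalid one, every request does, which is the pattern the quoted post blames for overwhelming the database cluster.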
1:45 pm on Sept 24, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 20, 2004
posts:2377
votes: 0


This same thing happens to my server on a much smaller scale. If my memcache were to suddenly think all cached objects needed to be replaced, my server would crumble to its knees while the db gets hammered.

It is a risk/tradeoff of a highly cached system.

I can only imagine the issues they face at the scale they deal with. So much dynamic content. Crazy.
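
For what it's worth, one common way to soften this on a small server is to let only one request rebuild a missing entry while the others wait for it. The sketch below is just an illustration under assumptions: it uses a plain in-process dict rather than a real memcache client, and `CoalescingCache`, `slow_db_lookup` and the `page:home` key are made-up names. It does not describe what Facebook or anyone in this thread actually runs.

```python
import threading
import time


class CoalescingCache:
    """Tiny in-process cache where a miss triggers at most one load per key."""

    def __init__(self, loader):
        self._loader = loader          # function that hits the database
        self._data = {}
        self._lock = threading.Lock()  # one global lock keeps the sketch simple
        self.loads = 0                 # how many times the loader really ran

    def get(self, key):
        # Fast path: value already cached, no lock and no database work.
        if key in self._data:
            return self._data[key]
        with self._lock:
            # Re-check inside the lock: another thread may have loaded it
            # while this one was waiting.
            if key not in self._data:
                self.loads += 1
                self._data[key] = self._loader(key)
            return self._data[key]


def slow_db_lookup(key):
    # Hypothetical stand-in for an expensive database query.
    time.sleep(0.05)
    return "value-for-" + key


def run_demo(workers=20):
    """Have many threads request the same cold key; return (loads, results)."""
    cache = CoalescingCache(slow_db_lookup)
    results = []

    def worker():
        results.append(cache.get("page:home"))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache.loads, results


if __name__ == "__main__":
    loads, results = run_demo()
    print(loads, "database load(s) for", len(results), "requests")
```

The single global lock serializes rebuilds of different keys too, so a real setup would more likely use per-key locks; the point here is only that the database sees one query per cold key instead of one per waiting request.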
5:26 pm on Sept 24, 2010 (gmt 0)

Full Member

10+ Year Member

joined:Jan 15, 2004
posts:274
votes: 0


The question is: was your site traffic higher than normal yesterday due to FB not being available? Mine was, though I'm not sure it was due to this. Facebook sucks all the air out of the room. I hate huge sites like this personally (unless of course I owned a site like this, then I wouldn't mind too much).
6:19 pm on Sept 24, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 20, 2004
posts:2377
votes: 0


Interesting observation/theory ddogg. If it were down a whole day I bet we would see some different numbers.
6:37 pm on Sept 24, 2010 (gmt 0)

Preferred Member

5+ Year Member

joined:Nov 20, 2007
posts:585
votes: 0


[bbc.co.uk]


One of Facebook's senior engineers Robert Johnson apologised to everyone who couldn't log on.

In a statement on his blog he said: "The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition.

"An automated system [to fix the problem] ended up causing more damage than it fixed."
9:00 pm on Sept 24, 2010 (gmt 0)

Moderator This Forum from CA 

WebmasterWorld Administrator httpwebwitch is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 29, 2003
posts:4059
votes: 0


An automated system for verifying configuration values ended up causing much more damage than it fixed.


Well, then obviously those facebook folks are a bunch of drooling morons! ha ha ha

But seriously, their systems are so huge and complex, it impresses me that humans are capable of understanding it all. I have only been offered a brief glimpse into their persistent data storage system, and it's... gargantuan. It's a special kind of disaster when something incredibly complex starts melting down.

Good work getting it back up & running again, FB crew
9:43 pm on Sept 24, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member sgt_kickaxe is a WebmasterWorld Top Contributor of All Time 5+ Year Member

joined:Apr 14, 2010
posts:3169
votes: 0


I knew something was up. I clicked on a like button last night and the popup box kept opening and closing itself over and over until I shut the browser.
6:49 pm on Sept 25, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 21, 2008
posts:193
votes: 0


BREAKING NEWS Facebook down. Worker productivity rises. US climbs out of recession.


lol
3:53 am on Sept 27, 2010 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


That senior engineer, Robert Johnson, wrote a longer article about the problem here: [facebook.com...] Essentially they had a problem that they could only fix by shutting down the site and then bringing it back online little by little. Sort of like rebooting your PC.