| This 129 message thread spans 5 pages: < < 129 ( 1 2 3  5 ) > > || |
|Westhost sites down|
It appears anything hosted by Westhost has been down since about 3:30pm eastern time (roughly three hours now), and they are working on the problem.
"The flip side to that argument (staffers calling customers) is the staffers should be working to restore service faster instead of wasting their time on the phone. "
I was, of course, referring to office personnel, NOT the tech people working to solve the problems.
"I can understand your feelings and am glad to see your cooling off "
LOL...actually, I was never 'hot' about it...a bit annoyed, maybe - as I would have switched our DNS to our backups a few hours sooner than I did had they been a little more forthcoming with their information on Saturday.
I do understand why some others ARE hot about it, though. Had I waited two days and THEN had to switch (once the truth came out about how bad things were), they'd be having to peel me off the ceiling.
The last day we didn't have an online order was in 1998...had I not switched when I did, that little record would have been broken.
Calling customers ..they have over 80,000 customers ..do the math ..
and you do have an offsite back up plan in the future ..in place now ..?
you do have separate daily or hourly incremental ..backups of what is on the machine(s) that you use for dev and to post here ..don't you all ..?
CYOA ..no one else will..
|Calling customers ..they have over 80,000 customers ..do the math .. |
But not all of them went down? In fact, they say it was only a handful of dedicated servers, and fewer than a dozen shared servers, so we 'should' be talking about hundreds, not thousands, of affected customers.
Then again, I'm proactive from a customer service standpoint - I would much prefer WE contact a customer, instead of waiting for that same (usually more upset) customer to contact us.
As far as backups, CYOA is right - and this (hopefully) will be a wake-up call to those that have not been doing so, and were instead relying on their hosting service (any company) to cover them.
Might be a good opportunity for someone to start a 'backup' thread so people could compare notes on exactly what plans they have in place.
We're good at doing backups (offsite) for our internal stuff here, and have guidelines for recovery written out. However, with domains and web hosting, I had never even heard of the word 'failover' before about 3 days ago.
|I was, of course, referring to office personnel, NOT the tech people working to solve the problems. |
Not sure how they're organized, but when we had a crisis even office personnel were hands on assisting, even if it's handing the workers tools, holding flashlights, verifying lists of customers per machine to make sure things match when it's reassembled, etc.
|But not all of them went down? In fact, they say it was only a handful of dedicated servers, and fewer than a dozen shared servers, so we 'should' be talking about hundreds, not thousands, of affected customers. |
This points to the larger problem - lack of transparency.
* Poor initial communication
* Actively covering up the magnitude of the problem for the better part of two days
* A bungled, dishonest approach to reputation damage control. From Wiki:
|A few days after the beginning of the outage, some users began to wonder if WestHost was being honest about the cause of the problem. The speculation began when a Twitter user noticed that the Wikipedia article for Inergen was edited to indicate that Inergen can damage hard drives. The edit took place shortly after the start of the service outage. Previous to the edit, the article made no mention of possible damage to server equipment. Furthermore, the IP address was traced back to UK2 Group. Soon after, the CEO of UK2 Group came forward and confirmed rumors about the edit: "Yes, we added that - I feel we had the proof to do so, and it was only fair to warn others. Can you blame us?" |
Credibility is a precious commodity. Hard to gain, but easily lost. I will be watching developments around this issue very closely as I determine whether or not to maintain hosting with them.
I have several sites that have experienced intermittent outages today (up to 45 minutes at a time). This suggests they continue to work on the issues, but something still doesn't quite feel right about this and the explanations.
Two sites went down (or came back up?) with DB tables needing repair. My other sites hosted with them were apparently unaffected. But then I get a series of emails saying one (or more) unidentified sites of mine were on affected servers. Still everything is chugging along fine. Then the intermittent outages begin.
I suspect, given what I have seen, that there is damage to core infrastructure beyond the reports of the odd server here and there. Why else would otherwise unaffected sites be experiencing downtime 3 days after the fact?
Ugh, another day gone, and although the 4.0 account is made, the DNS has not changed...so everything including email is still dead.
Does anyone have a solid company they can recommend that comes with pre-setup apache, php, mysql, mail, allows for ssh, and gives their users ftp access to the actual mysql folder for their account?
Ideally they would have the ability to let us edit the php.ini and httpd.conf file for our account as well.
"wsl05010, wsl05006, wsl06013, wsl03020, wsl02013, wsl02006, wsl01013, wsl00012 - Shared Servers
* UPDATE * 05:24pm MT 2/23/10 - We are currently evaluating our backup restoration tools and sorting out the best way to get these things started. I know everyone is in a rush to get this going, but we must be extremely careful with this restoration process - since part of our restoration tools were destroyed during this event, we're going to have to do these by hand. As soon as we get a realistic time as to when the backups can begin, we will post it here. As we stated earlier this week though, we want to be completely transparent at this point - it could still be another 24 - 72 hours for the backups to be completed."
My server is wsl05006.
Sorry to hear that, Ozzie.
Funny part for me is I did comprehensive backups just prior to this, I don't need my data restored, I just need the bloody machine to work so I can upload my material....and have it do what's supposed to ;)
Gone for the night, who knows, perhaps tomorrow will be a fresh new day :)
Oh trust me, we're going to make a big change, backing up to our own computers regularly after this. But after being with Westhost since 2004 in some form or another, we're also looking to change due to the way this was handled, with lies early on followed by delay tactics. The problem is finding a good host.
|We are currently evaluating our backup restoration tools and sorting out the best way to get these things started. |
Ummm, what did I say about relying on hosting company backups?
[laughing so hard I think I ruptured my spleen]
Let me get this right. They didn't put just the backups in the same room as the originals, they put their "restoration tools" in the same room as well?
This is a great thread that I forwarded to my clients who insist on using Westhost. Westhost is a reseller: they host nothing. Their techs are in an office, not a data center. They own no data centers. Zero. It's a total culture shock between the Site Manager 3.0 and the cPanel 4.0. I thought Site Manager was killed about ten years ago. It's funny this happened (not for those of you with servers there, of course), because it is exactly what I was talking about with my clients and downstream resellers.
I tried moving to the 4.0 yesterday....it's a lot prettier, but doesn't offer the same features the 3.0 does...I back up my mysql folder directly via ftp...the 4.0 version does not allow you access to the folder your databases are kept in....thank god I was able to keep the 3.0...would have been a royal pain in the butt to set up a local server to export everything to .sql
It's alive! :)
Seems not only is everything running again, but they did manage to find a current copy of the data.
Now to figure out how to keep management happy that this won't happen again...thinking of paying for a 2nd hosting plan and only activating it when this re-occurs.
It does not allow access because you can use the backup tool in 4.0 or, like I do on 3.0, use phpMyAdmin and pull the database. Or you could set up a cron job to do the same. 4.0 does some things differently and a lot of things better. However, you can't have the great spam tool that's in the mail package, mail aliases in 3.0. You got me on that feature ;) I'm going to have to keep the mail in Utah on Cogentco's web servers and the websites in California, on 10TB. Which is owned by the same company that owns Westhost.
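For anyone who wants to try the cron job route mentioned above, here is a minimal sketch of what such a script might look like. It is only an illustration: the database name and output folder are made up, it assumes `mysqldump` is installed on the account, and it assumes the login details live in `~/.my.cnf` so no password appears in the script.

```python
import datetime
import subprocess
from pathlib import Path


def build_dump_command(db_name, out_dir, now=None):
    """Return (argv, output_path) for a timestamped mysqldump of one database."""
    now = now or datetime.datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    out_path = Path(out_dir) / f"{db_name}-{stamp}.sql"
    argv = [
        "mysqldump",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        f"--result-file={out_path}",  # write the dump straight to a file
        db_name,
    ]
    return argv, out_path


def run_backup(db_name, out_dir):
    """Run the dump. Credentials are assumed to come from ~/.my.cnf."""
    argv, out_path = build_dump_command(db_name, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(argv, check=True)  # raises if mysqldump fails
    return out_path
```

Calling something like `run_backup("shop", "/home/you/backups")` from a script scheduled with cron would leave one dated `.sql` file per run. The file still has to be copied off the server to count as an offsite backup.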
> It's alive!
Glad to hear that, David. Since the one site I mentioned was one of the very first to come back up, I was feeling more than a bit guilty watching you suffer through days of downtime...
wsl00012, wsl05010 - Shared
"7:58pm MT 2/24/10 - These shared servers suffered a total loss, including the backup server the backups were located on. A few hours ago, we contracted a third party data restoration service (at a great expense) to attempt to recover the data off the backup drives. At this point, we are waiting to hear back from them to determine chances of full, partial, or no backup scenario. As soon as we get the official word, we will update this status post. The situation with this server is dire, but we are exhausting every avenue possible to retrieve as much data as we possibly can.
* UPDATE * The data recovery team is working on our backup drives - We won't know the result until tomorrow morning (2/25), expect an update then."
Anyone want to take a stab at an estimate of the number of sites on each server?
I know they fixed the dedicated servers first, that's one person paying something like $259 a month and up.
Wouldn't shared servers have a boatload of people paying between $10 and $54 a month like we are? Still not up by the way.
Nativeozzie, many of my sites on shared servers came up long before my DS.
Just wanted to say how sorry I am for those who have had to go through this with WH. I'm not with them, but the server my site is on had issues several weeks back, and was down for about two days. First the server went read only, then a disk in RAID failed. It was replaced, and then went read only again. Then a driver failed.
My host ended up installing the server on new hardware and restoring everything from backup. I lost a couple days business, and while they were honest about what was happening all along, those of us affected did complain that the updates weren't frequent enough.
They then made an attempt to keep us updated regularly on their forum.
I do have DNS on another server, in another data center, with a different host, so I can make changes quickly if I need to. I have backups locally as well. My site isn't big enough yet to afford failover or multiple redundant servers in different data centers, but those sure seem to be required precautions these days for those who absolutely cannot risk having their sites down.
Hang in there, I know how you feel. It's very frustrating.
My main site just came back online. I'm waiting for my test email to come through. Bit by bit...
I am still waiting for my main email to come back.
I got a 96 hour ETA Monday night from them.
I have over 30 clients hosted on Westhost, thankfully most on different servers. Only 2 are still down since last Sat. They've been on WH since about 2002 and have been moved to different servers through the years so I don't have the server # but must be one of those that was fried. Westhost's new NetStatus page is very informative if I just knew the server #.
Lorel, I think you have to go through them (chat/messaging) to get the server number, since it shows that in the control panel, which, of course, we can't get to if the server is down.
I have my site's dedicated IP, so I can check the status from there (still 404, of course)...this late into the game I'm assuming I'm on a fried server.
Thankfully I had backups with current versions of the most important pages sitting right on my main computer here from doing some long overdue site upgrading over the last month, but this shocked me into reality, as I'd gotten a little complacent lately about doing complete backups (as I think is human nature when you have those long stretches that everything is running smoothly).
My site is basically html (and I use a third party cart), so I didn't have the database issues to contend with that some other folks have had...I feel bad for those people.
Even with getting hosting up quickly elsewhere, and only losing about 12 hours before switching, we took one hell of a hit in business for the next 3 days because of the time it takes for the DNS to propagate (or however they spell it) to different ISPs.
I'd never heard the words 'failover' or 'dns hosting' before a few days ago, and we'll surely be working toward learning and setting that up so we won't be in this situation again.
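On the DNS delay described above: resolvers are allowed to cache a record for up to its TTL, so the TTL in place before a switch sets roughly how long visitors can keep landing on the dead server. A small sketch of that arithmetic follows. The function names and the example TTL values are illustrative only, and some resolvers reportedly hold records longer than the TTL, so treat the result as a rough guide.

```python
from datetime import datetime, timedelta


def latest_stale_time(change_time, ttl_seconds):
    """Latest moment a well-behaved resolver may still serve the old record."""
    return change_time + timedelta(seconds=ttl_seconds)


def hours_of_exposure(ttl_seconds):
    """Convert a TTL in seconds to hours of possible stale answers."""
    return ttl_seconds / 3600


# Example: the record is changed at 6:00pm on 2/20/10.
switch = datetime(2010, 2, 20, 18, 0)
one_day_ttl = 86400   # a record published with a one-day TTL
five_min_ttl = 300    # the same record published with a five-minute TTL

print(latest_stale_time(switch, one_day_ttl))   # a full day after the switch
print(latest_stale_time(switch, five_min_ttl))  # five minutes after the switch
```

The catch is that the lower TTL has to be in place before the outage; lowering it afterwards does nothing for resolvers that already cached the old record.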
Sometimes being a webmaster/entrepreneur is like trying to fight terrorism...you try to plan for every negative scenario, but there's always something out there that you (or maybe no one) had thought of or even knew about.
...or maybe it's gremlins...
MY email server just came back online about 15 minutes ago!
Everything is there, their postmaster even kept all new messages safe for me.
I am not happy about how long it took but I do understand why.
I am moving to a local host though, and my off-site backup plan with them is going to be that I go pick up a drive with my stuff backed up on it every night, at which point I will return the one from the day before, and so on.
The great thing about local companies is that you have a door you can knock on.
Glad to hear it...especially that you got all of your emails, which was a big concern to most.
Hmmm...just hit my dedicated IP, and my site is now up, so I guess I was NOT on one of the fried servers. Seems to be loading the graphics a bit slowly, so I guess they're still doing some maintenance on it.
Yes I have noticed a bit of a lag on the sites living in my reseller account as well.
I assume it is temporary, I will only bug them about it if it persists into next week. I am sure they have enough tickets to drudge through at the moment.
After days of my site being down, looks like I'm out of Google SERPs. Any tips for getting back in?
Also, it was my understanding from reading posts on the Westhost forums that Westhost does not have their own data center, but uses a colocation service. Their moderators never responded to those posts, so I was wondering if anyone here knew for sure.
It appears "their data center" is owned and operated by Consonus Technologies, and that WestHost does not own or operate a data-center facility.
The bulk of the wording of the outage notice posted at WestHost was actually a cut 'n paste from an email Consonus sent to WestHost, (re: Inergen release and drive failures).
Now, about INERGEN damaging drives or other hardware; form your own opinion -- I've formed mine based on the following:
According to TYCO (which manufactures and markets fire suppression systems and a "gaseous agent" under the registered trademark name INERGEN): "INERGEN agent is a mixture of three inerting (oxygen diluting) gases: 52% nitrogen, 40% argon, and 8% carbon dioxide."
The marketing material goes on to say: "..INERGEN agent will not threaten software or hardware because it does not break down to form damaging acidic byproducts..."
Then further explains that INERGEN does not liquefy, (even when compressed, it is still "gas" -- which means it could not have squirted, sprayed, or poured onto hardware). INERGEN works by depleting the oxygen in a [closed] room.
I wonder what really happened...?.?.?
I think I mentioned a few posts ago who WestHost is, or is not: "Cogentco's web servers and the websites in California, on 10TB. Which is owned by the same company that owns Westhost." They actually have two vendors. Cogentco was the one hit. Right on about INERGEN ..LOL. We are just assuming it was the electronics fire suppression system and not the sprinkler system, which is really what it sounds like was activated. That would create some excitement.