Forum Moderators: phranque

what's an acceptable downtime for host backup?

my server has been inaccessible for 2 hours

         

amznVibe

10:31 am on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am sure the instant reaction is that ZERO downtime is acceptable,
but my host has now been down from 3am EST to 5am EST.
Normally it's a lot shorter than this. I think they are a victim of their own success.

They say they are doing nightly backups from 4am to 5am EST,
which explains the extreme lag, but this is getting worrisome.

They are always open to my suggestions and do things for me
like minor software updates within an hour so I don't
want to bail out on them (also have like a dozen clients on
that server that I am responsible for).

What technical solutions can I ask them to do to their server backup
to speed up the process or allow some bandwidth in and out of it while this is happening?
(it's a "standard" apache server)

Thanks for any ideas! -aV-

amznVibe

12:47 pm on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just to follow up on this and possibly help other people with host issues in the future:

seems like my host does nightly rsync backups and then weekly full backups,
and somehow they both triggered at the same time while there was decent server traffic

that caused the server to run too low on memory and it crashed!
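
For anyone hitting the same overlap problem: one common way to keep two backup jobs from running at once is to have each job take an exclusive lock on a shared file before it starts, and skip the run if the lock is already held. Here is a minimal Python sketch of that idea. I have no idea how my host's scripts actually work, so the lock path, the function names, and the job names are all made up for illustration, and it assumes a Unix-like system where `fcntl.flock` is available.

```python
import fcntl
import os
import tempfile

# Hypothetical lock file shared by every backup job on the machine.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "backup.lock")


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_backup(name, lock_path=LOCK_PATH):
    """Run one backup job unless another job already holds the lock."""
    lock = try_lock(lock_path)
    if lock is None:
        return f"{name}: skipped, another backup is running"
    try:
        # The real backup command (rsync, tar, ...) would be started here.
        return f"{name}: ran"
    finally:
        fcntl.flock(lock, fcntl.LOCK_UN)
        lock.close()
```

With something like this, a weekly full backup that starts while the nightly job still holds the lock would just skip (or could be changed to wait) instead of piling on top of it.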

just bothers me it took someone two hours to discover and correct the situation...
they have at least 98%+ uptime since I have been with them though...

-aV-

sem4u

12:51 pm on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It would bother me as well, especially as some sites depend on visitors all around the world.

Can't they take a backup and keep the service level up at the same time?
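
One approach that is often suggested is to run the backup at low CPU priority and cap how much bandwidth it may use, so regular traffic still gets through. A rough Python sketch that builds such a command is below. It assumes the backup is done with rsync (which has a `--bwlimit` option) wrapped in the standard `nice` utility; the paths, the host name `backuphost`, and the limit of 2000 are made-up values, not anything this host is known to use.

```python
import shlex


def throttled_rsync_command(source, destination, kbytes_per_sec=2000, niceness=19):
    """Build an rsync command with low CPU priority and capped bandwidth."""
    return [
        "nice", "-n", str(niceness),      # lowest scheduling priority
        "rsync", "-a",                    # archive mode: keep permissions, times
        f"--bwlimit={kbytes_per_sec}",    # cap transfer rate
        source, destination,
    ]


# Hypothetical source directory and backup target.
command = throttled_rsync_command("/home/", "backuphost:/backups/home/")
print(shlex.join(command))
```

The list could be passed to `subprocess.run(command, check=True)` to actually start the transfer. This only limits CPU priority and transfer rate; it would not by itself stop a machine from running out of memory.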

amznVibe

1:08 pm on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



well they had two backups churning away, the partial and
the full, and apparently with the traffic it was just too
much... they just told me now there was some additional
time needed to bring the server back up because they were
checking the entire drive for errors since there was a
fatal crash...

If this happens once a year I can live with it... but if it
happens again this year I might have to have a serious talk
with a senior tech there...

bird

2:01 pm on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Most hosts seem to guarantee 99.9% uptime, which means the site can be down a maximum of 3.5 days per year. Even 99.99% uptime still allows for more than 8 hours. That makes it kind of hard to judge a single incident. What's their record in the long run?

<added>Zero downtime would require physical redundancy of the server with automatic switchover mechanisms in place. That can be had, but is very expensive.</added>

chiyo

2:20 pm on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



99.9% would be 4-5 hours a year surely?

That to me is acceptable and is what our host offers. I'm just an amateur, but would assume that this situation may have been a bit unprofessional on the part of your host.

I'm sure large hosts have all sorts of backup systems and system monitors to prevent that exact case from happening, as bird suggested. I think our host does mention that "redundancy" word in their FAQ's on their server set, and it impressed me, though I wouldn't have a clue what they were on about.

Timing is no excuse. Early morning in the US is actually our own peak time, so unless their clients have 90% or more North American users it's not much of an excuse.

If I were happy with all the other aspects of their service, however, I would be happy with a report of what they have learned from the experience and what they are doing in future to prevent a repeat, and would give them another chance or two.

bird

3:23 pm on Jan 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I demand a recount! ;)

Looks like I slipped a decimal. 99.9% is 8.76 hours a year. Consequently, 99.99% is less than an hour (52.56 minutes).

Just for the sake of completeness, the 98% uptime mentioned in the second post would allow for just over a week of downtime per year (about 7.3 days). I'm not sure if I would be happy with that.
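
For anyone who wants to run the numbers themselves, here is a small Python sketch of the arithmetic. It assumes a plain 365-day year (8760 hours) and ignores leap years; the function name is just something I picked for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, leap years ignored


def allowed_downtime_hours(uptime_percent):
    """Hours per year a site may be down while still meeting the uptime figure."""
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


for pct in (98, 99.9, 99.99):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% uptime allows {hours:.2f} hours/year "
          f"({hours / 24:.2f} days, {hours * 60:.1f} minutes)")
```

Each extra nine in the guarantee divides the allowed downtime by ten, which is why the figures are so easy to get wrong by a decimal place.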

Btw: Redundancy can be applied to many different things. Most commonly it is applied to connectivity and routing. But to prevent downtime on genuine server crashes, your site would need to be hosted on two separate machines at the same time, which is rare.