|Planning for 36 hr website outage - is 503 the best way?|
| 11:20 pm on May 5, 2008 (gmt 0)|
We're facing a major site outage, while we truck a couple hundred thousand dollars' worth of gear across town. My Ops team projects a minimum outage of 24 hrs and a worst-case scenario (moving back to the original datacenter after a failed migration) of 60 hours.
I know about using HTTP 503 for short outages; this is even written up in the Google webmaster blog:
However, is there a point at which Google stops believing "this is a site outage" and starts believing "this site is not coming back"?
In other words, at what point, if any, will Google drop our pages (millions of pages indexed, lots of good rankings) from the index?
Is HTTP 503 still the best way to go, even for an extended site outage?
During the outage, we plan to be serving a read-only version of the website from an alternate datacenter, and redirecting to there via HTTP 302 for individual page requests. Meaning, if someone tries to get to our site while we're down:
we'll redirect them via HTTP 302 to:
where they'll see a read-only version of the same page.
I'm inclined to prevent the temporary, read-only site from being indexed by having the redirection box serve HTTP 503 to Googlebot and 302 to everyone else, even though, technically speaking, this seems to violate the general principle of showing Googlebot exactly what everyone else sees. But that seems preferable to having Google think our entire site is 302'ing elsewhere for over a day.
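For illustration only, the split described above (503 to Googlebot, 302 to everyone else) could be sketched roughly as follows. The host name `readonly.example.com` and the function `choose_response` are hypothetical placeholders, since the thread does not name the read-only site, and matching on the User-Agent string is an assumption that is easy to spoof.

```python
# Hypothetical host for the read-only copy; the thread does not name it.
READONLY_HOST = "readonly.example.com"


def choose_response(user_agent: str, path: str):
    """Return (status, headers) for a request reaching the redirection box."""
    if "googlebot" in user_agent.lower():
        # Crawler: report a temporary outage rather than a redirect.
        return 503, {}
    # Everyone else: temporary redirect to the same path on the read-only site.
    return 302, {"Location": f"http://{READONLY_HOST}{path}"}
```

A visitor requesting `/widgets/42` would be sent to the same path on the read-only host, while a request identifying itself as Googlebot would only ever see the 503.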
| 12:29 am on May 6, 2008 (gmt 0)|
From the HTTP/1.1 specification [w3.org]:
|If known, the length of the delay MAY be indicated in a Retry-After header |
Serve the 503 to the 'bots (be careful to use all of their known IP ranges), forgo the 302, and just use DNS to switch to the temporary server. Set the 503 Retry-After header to 36 hours. Set the Time-To-Live on your DNS to a short period (30 minutes or so); then, after waiting for your previously-configured Time-To-Live to expire, you can change the DNS for your main domain to point to the temporary server on 30 minutes' notice. Similarly, you can change it to point back to the relocated servers once the relocation is complete.
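As a rough sketch of the 503 part of this advice: Retry-After accepts either an HTTP-date or a number of seconds, so 36 hours can be sent as 129600. The networks below are documentation placeholder ranges, not real crawler ranges, and the names `CRAWLER_NETWORKS`, `is_crawler`, and `outage_headers` are made up for this example.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), NOT real crawler ranges.
# Substitute the ranges you have actually verified for each 'bot.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# 36 hours expressed in seconds, the delta-seconds form of Retry-After.
RETRY_AFTER_SECONDS = 36 * 60 * 60


def is_crawler(remote_addr: str) -> bool:
    """True if the client address falls inside one of the listed networks."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in net for net in CRAWLER_NETWORKS)


def outage_headers() -> dict:
    """Headers to send along with the 503 status."""
    return {"Retry-After": str(RETRY_AFTER_SECONDS)}
```

The temporary server would call `is_crawler` on the connecting address and, when it returns true, answer with status 503 plus `outage_headers()` instead of the read-only page.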
| 12:20 pm on May 6, 2008 (gmt 0)|
I once had a **gulp** 15-hour outage. During this time, the whole site was completely offline (i.e. the server was not responding). Neither my rankings nor my traffic were affected by this.
(Well, during the outage the traffic was obviously affected, and certainly my health was affected too.)
Can't comment on 36 hours, though.
| 3:32 pm on May 6, 2008 (gmt 0)|
jdMorgan - we'd considered using a DNS change to redirect requests to the temporary datacenter, but wouldn't googlebot see that too?
I have seen DNS delays within googlebot -- it caches IPs in my experience, and reacts somewhat slowly to DNS changes, which makes me lean against a DNS change here because it might take several days after the datacenter move for googlebot to 'find' our site again. In other words, using DNS seems to create a week-long problem to solve a 36-hr-long problem, due to DNS propagation delays and DNS caching.
But in any case, I don't understand how we'd serve a 503 to googlebot without it also seeing the DNS change. Maybe that's a question for my Ops team?
| 7:39 pm on May 6, 2008 (gmt 0)|
Very interesting read. I assume, then, that there is no disaster plan in effect to handle this type of issue.
What happens if, say, there is a wreck and the equipment is damaged? I would consider setting up a mirror site before the move, then switching the DNS over to it. Whether you mirror the bare minimum or the whole site, move the equipment, check to make sure the setup is complete, then reset the DNS to the new IP. Hopefully, with this type of move, you are keeping the same IP address, as that will be another issue if you're not.
Moving this amount of equipment without a backup site is really bad business, and something your guys should have already taken care of. Heck, spending a couple thousand on setting up a mirror site at a hosting company is well worth it, given the worst case you could encounter with a freak accident.
We have yet to move our equipment without something going bad on us for some reason or another.
I say hold off until you set up a disaster recovery site and get a better plan in place. This will also give your people peace of mind, without the added pressure to get it done too fast, which is when mistakes are made. It will also give you a backup site to keep, just to be safe.
| 7:56 pm on May 6, 2008 (gmt 0)|
Googlebot would see the DNS change, and it would also see the short Time-To-Live setting that I recommended. I can't assure you what they would do because I don't work there. All you can do is use the DNS and HTTP protocols as specified, and hope that, as usual, they pay attention to the "messages" sent using these protocols.
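The timing that the short Time-To-Live buys can be worked through with a small sketch. It assumes resolvers honor the published TTL, which is exactly what cannot be guaranteed for any given crawler; the 24-hour old TTL in the usage note is an invented figure, and both function names are hypothetical.

```python
from datetime import datetime, timedelta


def earliest_cutover(ttl_lowered_at: datetime, old_ttl: timedelta) -> datetime:
    """Earliest time the DNS record can be changed on short notice.

    Until old_ttl has elapsed, resolvers may still hold the record
    that was cached with the old, longer TTL.
    """
    return ttl_lowered_at + old_ttl


def switch_complete_by(changed_at: datetime, short_ttl: timedelta) -> datetime:
    """Latest time a TTL-honoring resolver should pick up the new address."""
    return changed_at + short_ttl
```

For example, if the TTL is lowered to 30 minutes at midnight on May 6 and the old TTL was 24 hours, `earliest_cutover` gives midnight on May 7, and a change made at that moment should be visible to well-behaved resolvers by 00:30.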
I agree with bwnbwn that a fully-operational back-up site is the preferred solution, but I assumed that you were willing to risk some delays to avoid the cost and complications of a fully-live backup site.