How to check for downtime

Host doesn't believe server was down

         

dauction

9:06 pm on May 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have WHM and cPanel, and also have server access via a virtual dedicated setup. I also have the usual AWStats, Webalizer, etc.

I just want to locate a specific two-hour window and verify that there was no traffic, but I can't seem to find out how, or even whether I can, with the available tools.

bobothecat

9:28 pm on May 19, 2006 (gmt 0)



Do you have access to your raw log files?

dauction

9:39 pm on May 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



bobcat, that's the first thing I did today, but apparently it only saves on a daily basis. I have today's stats, for example, but not the 16th's.

If it saved these (older dates), which file would they be in?

bobothecat

9:48 pm on May 19, 2006 (gmt 0)



If you have SSH or telnet access to the site, I'd look in your 'logs' directory. Most hosts save a few days' worth of access logs; they could be named something like access_log.1.gz or similar.
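Once you have a rotated log in hand, a short script can show whether a given window really had no traffic. The following is only a rough sketch: it assumes the log uses the Apache common/combined timestamp format (for example [16/May/2006:14:03:22 +0000]) and an English locale for month names, and the function names are made up for illustration.

```python
import gzip
import re
from datetime import datetime

# Timestamp field as written in Apache common/combined log format (assumed).
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def read_log(path):
    """Return the lines of a plain or gzip-compressed log file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        return handle.readlines()


def parse_times(lines):
    """Yield a timezone-aware datetime for every line that has a timestamp."""
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            yield datetime.strptime(match.group(1), LOG_TIME_FORMAT)


def longest_gap(times, start, end):
    """Return (gap, gap_start, gap_end) for the longest quiet stretch in [start, end]."""
    inside = sorted(t for t in times if start <= t <= end)
    points = [start] + inside + [end]
    gaps = [(later - earlier, earlier, later)
            for earlier, later in zip(points, points[1:])]
    return max(gaps, key=lambda gap: gap[0])
```

To use it, pass `parse_times(read_log("access_log.1.gz"))` along with the start and end of the suspect window. A long gap on a normally busy site supports the claim of an outage, although a quiet period on a low-traffic site would look the same.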

dauction

9:58 pm on May 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Beautiful, got it. Thanks, bobcat.
you da man

Lobo

10:04 pm on May 19, 2006 (gmt 0)

10+ Year Member



I use InternetSeer to warn me of downtime.

It works fine, continually monitors your site and sends instant alerts when it goes down, then sends you a recovery alert with the exact amount of time it has been down.

You can monitor 2 sites for free.

I like the speed of it, and I can tell, to the second, on which day and at which time my server was down, without checking logs etc.

jdMorgan

10:22 pm on May 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Such services do not usually tell you exactly how long your site was down. They tell you the time elapsed from the last regularly scheduled (usually hourly) access to your server that failed until it began responding again. After they detect that your site failed to respond, they begin testing it more frequently -- once a minute instead of once an hour, for example.

In order to give you an exact server downtime -- to the second -- they would have to send a request to your server twice per second, and not many Webmasters would desire that frequency of access. It would also cost the service provider more in terms of bandwidth, and make it harder for them to provide the service free or at low cost.

In simplified terms, here's how it effectively works for a given server:

  1. Wait one hour.
  2. Poll server for response (usually a HEAD request to the URL you specified).
  3. If server responds, go to step 1.
  4. Else start downtime timer.
  5. Send "Server down" e-mail.
  6. Wait one minute.
  7. Poll server.
  8. If no response, go to step 6.
  9. Else stop downtime timer.
  10. Send "Back online" e-mail reporting elapsed time.
  11. Go to step 1.
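The steps above can be sketched in a few lines of Python. This is only an illustration of the loop as described, not the monitoring service's actual code: the function names, the injected `probe`, `notify`, `sleep`, and `clock` parameters, and the choice to treat any HTTP error status as "down" are all assumptions made for the example.

```python
import time
import urllib.error
import urllib.request


def head_ok(url, timeout=10):
    """Return True if a HEAD request to url succeeds without an HTTP or network error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses.
        return False


def watch_one_outage(probe, notify, sleep=time.sleep, clock=time.monotonic,
                     slow_interval=3600, fast_interval=60):
    """Run steps 1-10 once and return the reported downtime in seconds."""
    while True:
        sleep(slow_interval)            # step 1: wait one hour
        if not probe():                 # steps 2-3: poll, loop while up
            break
    down_since = clock()                # step 4: start downtime timer
    notify("Server down")               # step 5
    while True:
        sleep(fast_interval)            # step 6: wait one minute
        if probe():                     # steps 7-8: poll, loop while down
            break
    elapsed = clock() - down_since      # step 9: stop downtime timer
    notify(f"Back online after {elapsed:.0f} seconds")  # step 10
    return elapsed
```

Calling `watch_one_outage(lambda: head_ok("http://example.com/"), print)` in a loop would give step 11. Note that the timer starts at the first failed poll, not at the moment the server actually went down.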

So, the time they give you may be 59 minutes shorter than the actual outage, or it may be correct, or anywhere in between. It depends on the sampling rates they use. You can see the normal sampling rate in your log file.

Jim

Lobo

11:07 pm on May 19, 2006 (gmt 0)

10+ Year Member



Perhaps, but as I stated, that is not the case with that option; it works fairly well.

jdMorgan

11:33 pm on May 19, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use them too, on some sites, and the above is derived directly from their behaviour. Check your server logs: If they normally poll your site once an hour, then according to sampling theory [google.com], it is impossible for them to report accurate results to a one-second resolution.

In order to sample something and report a state change to an accuracy of time interval I, you must poll at an interval of I/2. That assumes that the polling frequency for both states of a 2-state system is the same. In this case, it is not: they poll slowly for failure, and poll much faster for recovery. The best accuracy they can achieve if they poll once an hour for failure and infinitely fast for recovery is a one-hour accuracy.

To illustrate: Suppose they poll once an hour and, polling just before your server died, find it OK. They would not poll again for an hour, during which your server is still down. At the next poll (after the first hour), they would find it down, and then begin polling very fast -- say once per second. If the server came up after one hour and one second of actual downtime, they would see that it had been down for only one second.
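The same illustration can be worked through numerically. The sketch below is a simplified model, not a description of any real service: it assumes slow polls land exactly on multiples of `slow` seconds, fast polls start at the first failed slow poll, and the function name and parameters are invented for the example.

```python
import math


def reported_downtime(outage_start, outage_length, slow=3600, fast=1):
    """Return the downtime (seconds) a two-rate poller would report, or None if missed.

    Times are in seconds, measured from a slow poll at t=0 that found the server up.
    """
    outage_end = outage_start + outage_length
    # First slow poll at or after the moment the server went down.
    first_failed_poll = math.ceil(outage_start / slow) * slow
    if first_failed_poll >= outage_end:
        return None  # the outage began and ended between two slow polls
    # Fast polls follow at first_failed_poll + k * fast; find the first that sees recovery.
    polls_until_up = math.ceil((outage_end - first_failed_poll) / fast)
    return polls_until_up * fast


# Server dies 1 s after a poll and stays down for a full hour.
print(reported_downtime(outage_start=1, outage_length=3600))
# A 10-minute outage that falls entirely between two hourly polls.
print(reported_downtime(outage_start=10, outage_length=600))
```

In this model an hour-long outage can be reported as a single second, and a ten-minute outage can go unreported altogether, which is the point of the paragraph above.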

Actually, they used to poll once an hour, give or take a (very) few minutes. In the past several years, though, I've seen them polling less frequently, and with a much larger variation in the sampling rate. I guess they just have too many subscribers, and too few machines to run their polling client on.

Regardless of the advertising hype and the description on their site, they cannot achieve the accuracy they imply in the downtime report with such a slow failure sampling rate. No one can break the laws of physics -- or Nyquist and Shannon. This is not to say the service is not useful -- it is. But you need to understand what you're getting.

Their paid service undoubtedly polls faster, and thus will have better downtime report accuracy.

Jim

bobothecat

11:42 pm on May 19, 2006 (gmt 0)



Personally... I would never question Jim :)

Lobo

11:50 pm on May 19, 2006 (gmt 0)

10+ Year Member



lol I've got to agree... yet I have been impressed with their accuracy; they seem to ping my sites far more frequently...

jdMorgan

12:08 am on May 20, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Bobo, please question -- I can't learn if I don't make mistakes! And I have learned plenty here. But thanks for the vote of confidence...

Lobo, the accuracy depends entirely on the polling frequency. If they poll your sites faster, then the report will be more accurate. But even on my sites, where they poll (about) once an hour, the downtime report still shows hours:minutes.

I don't think they mean to mislead, but it's just impossible to get that accuracy with the one-hour polling rate on my sites.

Jim