Forum Moderators: phranque


People can't access my site and I have no idea why

         

bcc1234

10:16 pm on Jan 5, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've got a big problem: some people can't access my site.

I run a web-based newsletter, so I get feedback that I wouldn't get if I weren't in contact with my visitors. I suspect this problem affects my other sites as well, but visitors to other sites (ones that aren't newsletters, message boards or otherwise membership-based) are not likely to contact me if the site is down. They would simply move on.

What makes the problem really strange is that the people who report they can't access the site say they could access it in the past, and are usually able to access it again later on. They all have different ISPs and different browsers.

I can't find anything in common among those who e-mail me and say they couldn't open the page.

I'm not able to replicate/catch any problems myself. I've also tried setting up a script to hit that site from another location every couple of minutes and log the results. I've tried uptime monitoring services, and nothing shows any problems.
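For anyone wanting to set up the same kind of check, here is a minimal sketch of what such a probe might look like. It is not the script actually used here; the function name `probe`, the result fields, and the injectable `resolve`/`connect` parameters are all made up for illustration. The one idea it tries to show is doing the DNS lookup and the TCP connect as separate steps, so the log says which one failed and how long it took.

```python
import socket
import time


def probe(host, port=80, timeout=10.0,
          resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Resolve the name and open a TCP connection as two separate steps,
    so a log entry records which step failed and how long it took."""
    start = time.monotonic()
    result = {"host": host, "stage": "dns", "ok": False, "ip": None, "error": None}
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
        result["ip"] = infos[0][4][0]  # first address returned by the resolver
        result["stage"] = "connect"
        conn = connect((result["ip"], port), timeout=timeout)
        conn.close()
        result["stage"] = "done"
        result["ok"] = True
    except OSError as exc:  # socket.gaierror and timeouts are OSError subclasses
        result["error"] = f"{type(exc).__name__}: {exc}"
    result["seconds"] = round(time.monotonic() - start, 3)
    return result
```

Run from cron every couple of minutes and appended to a log, a call such as `probe("www.mydomain.com")` would leave `stage` at `"dns"` for a lookup failure and at `"connect"` for a refused or timed-out connection. It can only report what is visible from the box it runs on, which is exactly the limitation described above.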

Yet at least once every other week I get an e-mail from someone who says they can't access the site and read the newsletter.

I've checked everything from dns to web server to application server to database to firewall, and everything seems fine.

Does anyone have any idea what I could try to find out what the problem is?

At first, I kept dismissing such e-mail and simply told people to try later, but it's been like that for a year now, and if I do lose some visitors because something is wrong with that box, then I would like to know what it is and fix it.

Any ideas?

txbakers

12:12 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's a tough one.

Are you doing anything cookie based? Any code that relies on cookies?

Another common problem could be invalid HTML markup that certain browsers are just not showing.

The best way to debug this is to somehow work directly with the people having the problem and troubleshoot various things. You might have to call them, or even invest in a WebEx type of session so you can see and control their screens.

I have a secure site that people can see, but when they log in, it kicks them back out. This was caused by a Microsoft bug in the Content Advisor. When that was enabled, it blocked Secure Cookie-based sites.

bcc1234

12:24 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nope, not on-page stuff.
They say they get a message "page not found" -- not 404, but site does not exist.

I've even tried running a script to do dns lookups every minute and log errors. It ran fine.

Really weird.

As far as asking them to help troubleshoot, I doubt I would get much help from them. Imagine if someone asks you to install some software so they can "see" your desktop and what's going on?

I doubt someone would agree to do it.

[edited by: bcc1234 at 12:26 am (utc) on Jan. 6, 2007]

DamonHD

12:26 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

Site not found implies that you have a DNS problem, ie their browsers cannot even find the IP address of your server given the URL.

Rgds

Damon

bcc1234

12:27 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I even thought about hiring a consultant to test the site. But I think I'll just end up with a report saying that all is well.

bcc1234

12:29 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Site not found implies that you have a DNS problem

I know. I worked as a system administrator in my previous life. Trust me, I've checked all the basic (and not so basic) things.

DamonHD

12:32 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

Have you tried using one of the 3rd party DNS checker sites that will come in from outside in the same way that a user would, eg dnsreport.com?

If it really isn't DNS then your hosting company may have terrible routing problems; get a few of your users that see the problems to run tracert/traceroute to your Web server.

Rgds

Damon

bcc1234

12:42 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



DamonHD, it's EV1. They seem pretty reliable. But yes, it does seem like some kind of a routing problem. I think IE would display a similar message for a DNS lookup failure and for a timeout while connecting to a remote host. It would "feel" different, but I doubt my subscribers would know the difference.

I just e-mailed one of them asking if they get an error right away or if it takes a bit of time while "loading" before they see that error. Maybe that would clue me in a bit.

I tried checking DNS from other locations. From my own boxes elsewhere and by using various on-line services.

DamonHD

1:30 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

A DNS issue will be obscured by the fact that once a good record is cached it'll last for your TTL, eg maybe a day or more, so it'll 'just work' for a bit.

So, as I'm sure you'll be doing, try to clear any caches and avoid going through any intermediate DNS resolvers before repeating each test.
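To make the caching point concrete, here is a toy model, not real resolver code. The class `CachingResolver`, the `flaky_authority` function, the outage window and the address 192.0.2.10 are all invented for illustration. It sketches why two users can see different things during the same short outage: one still has the record cached, the other's cache is empty.

```python
class CachingResolver:
    """Toy model of an ISP or browser DNS cache that honours a fixed TTL."""

    def __init__(self, upstream, ttl):
        self.upstream = upstream   # callable(name, now) -> ip, raises LookupError on failure
        self.ttl = ttl
        self.cache = {}            # name -> (ip, expiry time)

    def lookup(self, name, now):
        cached = self.cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]       # answered from cache; upstream is never asked
        ip = self.upstream(name, now)
        self.cache[name] = (ip, now + self.ttl)
        return ip


def flaky_authority(name, now):
    """Hypothetical authoritative server that is unreachable for 100 <= now < 200."""
    if 100 <= now < 200:
        raise LookupError(f"no answer for {name}")
    return "192.0.2.10"


def can_resolve(resolver, name, now):
    """Return True if the lookup succeeds at time `now`, False otherwise."""
    try:
        resolver.lookup(name, now)
        return True
    except LookupError:
        return False
```

With a TTL of 300, a resolver that looked the name up at time 0 still answers at time 150 from its cache, while a resolver asking for the first time at 150 fails and only recovers once the outage window has passed.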

Rgds

Damon

bcc1234

1:33 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A DNS issue will be obscured by the fact that once a good record is cached it'll last for your TTL

Not if I'm directly querying one of the root servers, then one of the com servers, then one of mine, and checking results along the way.

DamonHD

3:43 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

If you're doing that (and that puts you in the "power user" category), then the caching problem isn't confusing *you*, but it may still be muddying the symptoms seen and reported by your users.

But yes, sounds more like a routing problem.

Time to try as many 3rd-party traceroutes (there must be dozens of sites out there) from as many corners of the planet as possible!

I wonder if you could register your host/router/server with [internettrafficreport.com...] to have it do some free monitoring for you: I did it for one of my dodgy connections!

Rgds

Damon

bcc1234

3:54 am on Jan 6, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



but it may still be muddying the symptoms seen and reported by your users

In what kind of scenario? I can't think of anything.
If the servers authoritative for the com zone list my server as the authority for the mydomain.com zone, and my server returns an A record for www.mydomain.com -- how can caching bring in any problems? Assuming I don't have any weirdness in my zone like a retry time greater than refresh, not that it would matter.

As far as a routing problem goes, EV1 is not the smallest company in the world. I'm sure if they were having problems routing with their peers, it would be known.

I wonder if you could register your host/router/server with [internettrafficreport.com...] to have it do some free monitoring for you: I did it for one of my dodgy connections!

I doubt that. I'm way too small to be on that map :)

txbakers

2:31 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Did you ask some of the users to try to connect via IP address rather than dns?

I had that problem with ONE user (can't figure out why this ONE user can't connect via DNS) and gave her the IP address and it worked just fine.

bcc1234

2:34 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



txbakers,

Good idea. Next time someone complains, I'll just send them two links (one with the ip) and ask to click on both of them.

bcc1234

2:35 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Crap. I just thought about it. My webserver returns a 301 redirect to that domain name if the host request field doesn't match that name. So a request for that IP would get a response with a redirect to that domain, which would need a DNS lookup. And I doubt the user would be able to tell the difference. They would simply see both links fail.
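Roughly, the rule behaves like the sketch below. This is not the actual server config, just the logic written out in Python; `CANONICAL_HOST`, `respond` and the address 192.0.2.10 are placeholders.

```python
CANONICAL_HOST = "www.mydomain.com"  # placeholder name, not the real site


def respond(host_header):
    """Return (status, location) the way a canonical-host redirect rule would."""
    host = (host_header or "").split(":")[0].lower()  # ignore any :port suffix
    if host != CANONICAL_HOST:
        # Any other Host value, including a bare IP, is bounced to the name,
        # so the browser has to do a DNS lookup after all.
        return 301, f"http://{CANONICAL_HOST}/"
    return 200, None
```

So a link to the bare IP reaches the server with the IP in the Host field, `respond("192.0.2.10")` comes back as a 301 to the domain name, and the test is only meaningful if the rule is relaxed for that address or the page is served from a vhost that doesn't redirect.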

encyclo

2:54 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm watching this thread with interest, as I have a similar problem with one single user who is unable to access a site of mine. I managed to get them to pipe the results of a traceroute into a text file and email it to me, and it showed a timeout at a router in the hosting company's datacenter. However, no other user has complained, and many other users have the same ISP as the problematic member. I even got them to modify their hosts file to hard-code the server's IP address and avoid a DNS lookup, but I'm still no further forward.

bcc1234

3:29 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



encyclo, my problem is that all people that complain have different ISPs.

The ones I can remember right now have AOL, wmconnect, verizon.

Some say that they were able to access the site in the past (read previous issues), but can't access it "now".

For some, the problem seems momentary. When I ask them to try again, they say that now (usually the next day or several hours later) everything is working.

Some say they hadn't been able to access the site for a "very long time". I actually had a few people ask me to remove them from the list because they can't see the newsletters when they click the link.

Yet others say they can't access it at all -- never could from the start.

This is really driving me crazy. I can't find anything similar among the people that complain. And I'm just afraid to think that for each one that actually e-mailed me there are many more who didn't.

encyclo

3:39 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think that the evidence so far in what you're seeing (and me too I think) indicates a routing problem rather than a DNS problem. Time to start hassling the hosting company again...

bcc1234

4:07 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would love to, but what can I say to them?
I can't replicate the problem. I haven't even encountered it once myself.

DamonHD

11:44 am on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Routing problems *are* elusive usually.

Find a sensible EV1 techie, not some first-line droid who just wants to get you off the line.

Explain the issue as you have to us, and ask them what they might suggest so that you can demonstrate that it is or isn't a DNS or routing issue, and whatever other troubleshooting they suggest.

Point them at this thread!

Rgds

Damon

[edited by: DamonHD at 11:47 am (utc) on Jan. 7, 2007]

txbakers

3:11 pm on Jan 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I even got them to modify their hosts file to hard-code the server's IP address and avoid a DNS lookup, but I'm still no further forward.

What to do if the person is using a Mac?

DamonHD

12:41 pm on Jan 8, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



/etc/hosts?
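The entry format is the same as on other systems: an address, whitespace, then one or more names. A small sketch of building such a line follows; the helper `hosts_line`, the address 192.0.2.10 and the name www.mydomain.com are placeholders, and the user would still have to add the line to the file by hand with admin rights.

```python
import ipaddress


def hosts_line(ip, *names):
    """Format one hosts-file entry: address, a tab, then the hostnames."""
    ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    if not names:
        raise ValueError("at least one hostname is required")
    return f"{ip}\t{' '.join(names)}"
```

For example, `hosts_line("192.0.2.10", "www.mydomain.com")` gives the address and the name separated by a tab, which is the line to paste into the file.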