ERR CONNECTION RESET for some users

     
12:14 am on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


I have a website on CentOS 7.4 running Apache 2.4.6.
Recently some users have been unable to view the site. When they try to load any page they instantly get an ERR_CONNECTION_RESET and the page does not load.

I managed to recreate this using the 4G (mobile network) connection on my phone; when I use WIFI on the same phone the site works fine.

Any suggestions for things I could check which could be causing this?

I use mod_ssl with virtual hosts if that's relevant.

I cannot find anything in any error logs relating to this error.
10:59 am on Aug 3, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 13, 2016
posts:1194
votes: 285


ERR_CONNECTION_RESET is a rather loose error, because it can come from the server, the client, or anything in between :). That being said, clients usually get this error when the browser receives an RST (reset) packet without having received the content. Unlike the FIN packet (FIN = finish), which indicates that the connection is finished and, in theory, that all the data (content) packets have been sent, an RST aborts the connection immediately. I would think this is most of the time due to a bad network connection causing unrecoverable packet loss. In theory, when the client detects missing packets it tells the server to send them again, but if the communication is bad the server might not receive that message, and so on.

So I would think that, in your case, these users are suffering from a network issue. This might be temporary. Try to see whether these users are on the same ISP, are located in a given area, etc., and check whether it happens often, always, or rarely. You can also run traceroutes to your server from different locations to see whether there are peering problems.
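
To make the difference between an orderly close (FIN) and a reset (RST) concrete, here is a minimal Python sketch that runs entirely on the loopback interface. It is only an illustration, not a diagnosis of the site in question: the helper names `serve_once` and `probe` are made up for this example, and the `SO_LINGER` packing assumes a Linux-style `struct linger` made of two ints.

```python
import socket
import struct
import threading


def serve_once(listener, abort):
    """Accept one connection, then close it either cleanly or abortively."""
    conn, _ = listener.accept()
    if abort:
        # SO_LINGER enabled with a timeout of 0 makes close() send an RST
        # instead of the usual FIN (assumes Linux struct linger layout).
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
    conn.close()


def probe(abort):
    """Connect to a throwaway local server and report how the close looks."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    worker = threading.Thread(target=serve_once, args=(listener, abort))
    worker.start()
    client = socket.create_connection(listener.getsockname())
    worker.join()  # the server side has closed by now
    try:
        data = client.recv(1024)
        # An empty read means the peer closed in an orderly way (FIN).
        return "closed cleanly (FIN)" if data == b"" else "data"
    except ConnectionResetError:
        # The peer, or something pretending to be it, sent an RST.
        return "connection reset (RST)"
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(probe(abort=False))
    print(probe(abort=True))
```

The client code is identical in both runs; only the way the other side ends the connection changes what the client observes.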

Also, check your PHP logs (or those of whatever language you use) to see whether PHP is crashing or abruptly ending the communication.
12:04 pm on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


It's not letting me edit my original message.

I just wanted to update: I think it's some kind of TCP issue that is resetting certain connections. No idea why it does this to some users (i.e. types of connections) and not others.

I have been using netstat -s but I'm not sure which parts are relevant. It's a production server so I can make a failed connection attempt, but I cannot be sure I'm isolating it in the data.

These stats go up when comparing a few sample before/after outputs:

Tcp:
active connections openings
passive connection openings
failed connection attempts
segments received
segments send out
segments retransmited

TcpExt:
resets received for embryonic SYN_RECV sockets
delayed acks sent
packet headers predicted
acknowledgments not containing data payload received
predicted acknowledgments
times recovered from packet loss by selective acknowledgements
congestion windows partially recovered using Hoe heuristic
forward retransmits
TCPSackShiftFallback
TCPDeferAcceptDrop
TCPRcvCoalesce

IpExt:
InOctets
OutOctets
InNoECTPkts
InECT0Pkts
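
The before/after comparison described above could be automated with something like the following Python sketch. It assumes the usual `netstat -s` layout (an unindented section header such as `Tcp:`, followed by indented lines that are either `<number> <description>` or `<Name>: <number>`); the function names are made up for this example, and lines with the number in the middle of the sentence are simply skipped.

```python
import re


def parse_netstat_s(text):
    """Parse `netstat -s` style text into {(section, counter): value}."""
    counters = {}
    section = ""
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: a section header such as "Tcp:" or "TcpExt:".
            section = line.strip().rstrip(":")
            continue
        body = line.strip()
        # Form 1: "1234 segments received"
        m = re.match(r"^(\d+)\s+(.+)$", body)
        if m:
            counters[(section, m.group(2))] = int(m.group(1))
            continue
        # Form 2: "TCPDeferAcceptDrop: 5"
        m = re.match(r"^(\S+):\s+(\d+)$", body)
        if m:
            counters[(section, m.group(1))] = int(m.group(2))
    return counters


def diff_counters(before, after):
    """Return only the counters whose value changed between two snapshots."""
    return {key: value - before.get(key, 0)
            for key, value in after.items()
            if value != before.get(key, 0)}


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the real server.
    snap1 = "Tcp:\n    10 segments received\nTcpExt:\n    TCPDeferAcceptDrop: 5\n"
    snap2 = "Tcp:\n    15 segments received\nTcpExt:\n    TCPDeferAcceptDrop: 7\n"
    changed = diff_counters(parse_netstat_s(snap1), parse_netstat_s(snap2))
    for (section, name), delta in sorted(changed.items()):
        print(f"{section:8} {name}: +{delta}")
```

On a busy production server most counters will move between any two snapshots, so a diff like this narrows things down but does not isolate a single failed attempt.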

Any ideas?
12:06 pm on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


@Dimitri thanks for your reply.

I am on a mobile network in the UK, and one of the other users experiencing this is in the US, so I don't think it could be a problem with a particular ISP.

There are no problems browsing other websites on these connections
12:11 pm on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


I should also add, that this is not a widespread problem. Lots of users can use the site fine. If any substantial number were having problems, then I'd be getting messages on Facebook about the site being down etc.

Is there any way to log additional information each time I get a reset TCP connection?
12:33 pm on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


This means nothing to me but it's a tcpdump of a failed (reset) connection attempt:

12:26:37.001292 IP [IP].threembb.co.uk.55071 > mydomain.com.https: Flags [S], seq 2751361480, win 65535, options [mss 1326,sackOK,TS val 138818001 ecr 0,nop,wscale 8], length 0
12:26:37.001384 IP mydomain.com.https > [IP].threembb.co.uk.55071: Flags [S.], seq 2839564584, ack 2751361481, win 28960, options [mss 1460,sackOK,TS val 229549639 ecr 138818001,nop,wscale 7], length 0
12:26:37.004298 IP [IP].threembb.co.uk.55072 > mydomain.com.https: Flags [S], seq 1324045316, win 65535, options [mss 1326,sackOK,TS val 138818003 ecr 0,nop,wscale 8], length 0
12:26:37.004344 IP mydomain.com.https > [IP].threembb.co.uk.55072: Flags [S.], seq 4077890442, ack 1324045317, win 28960, options [mss 1460,sackOK,TS val 229549642 ecr 138818003,nop,wscale 7], length 0
12:26:37.155320 IP [IP].threembb.co.uk.55072 > mydomain.com.https: Flags [.], ack 1, win 343, options [nop,nop,TS val 138818059 ecr 229549642], length 0
12:26:37.155497 IP [IP].threembb.co.uk.55072 > mydomain.com.https: Flags [R.], seq 1, ack 1, win 32120, length 0
12:26:37.162855 IP [IP].threembb.co.uk.55071 > mydomain.com.https: Flags [.], ack 1, win 343, options [nop,nop,TS val 138818059 ecr 229549639], length 0
12:26:37.163614 IP [IP].threembb.co.uk.55071 > mydomain.com.https: Flags [R.], seq 1, ack 1, win 32120, length 0

note: I have replaced the connecting IP with [IP] and my domain name with mydomain.com
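
For anyone reading a dump like this, a small Python sketch along these lines can group the lines per connection and show which side sent the reset and whether any payload was exchanged first. It assumes the one-line-per-packet text format shown above; the regular expression and the name `summarize_resets` are made up for this example.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 12:26:37.155497 IP a.b.c.55072 > host.https: Flags [R.], seq 1, ..., length 0
LINE_RE = re.compile(
    r"^(\S+) IP (\S+) > (\S+): Flags \[([^\]]+)\].*length (\d+)"
)


def summarize_resets(dump_text):
    """Return one summary dict per connection that contains an RST."""
    flows = defaultdict(lambda: {"flags": [], "payload": 0, "rst_from": None})
    for line in dump_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        _timestamp, src, dst, flags, length = m.groups()
        # Both directions of a connection share the same endpoint pair.
        flow = flows[frozenset((src, dst))]
        flow["flags"].append(flags)
        flow["payload"] += int(length)
        if "R" in flags and flow["rst_from"] is None:
            flow["rst_from"] = src
    return [flow for flow in flows.values() if flow["rst_from"] is not None]
```

Run over the capture above, this should report two connections (ports 55071 and 55072), each with the flag sequence S, S., ., R. and zero payload bytes: the three-way handshake completes and then a reset arrives from the client's address before any data is sent. The capture alone cannot tell whether the phone itself or a device on the path generated that reset.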
5:56 pm on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


So I used tshark to analyse the connection on the same device (my mobile phone) using the mobile network (where I cannot access the site) and the WIFI (where I can access the site).

The mobile network attempt only has TCP entries; the WIFI connection has many more entries, which include stuff like:
SSL 612 Client Hello
TLSv1.2 207 Server Hello, Change Cipher Spec, Encrypted Handshake Message
TLSv1.2 117 Change Cipher Spec, Hello Request, Hello Request
TLSv1.2 1504 Application Data

I added logging for the ssl_engine, ssl_request, ssl_access in apache but they don't come up with anything, so something must be failing before that?
6:04 pm on Aug 3, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 13, 2016
posts:1194
votes: 285


I wonder if it could be related to old browsers or middleboxes that are too old to support your TLS protocols and ciphers.

edit: it is NOT a good idea to downgrade your protocol list and cipher suites, because older protocols are no longer safe.
9:03 pm on Aug 3, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


I'm not sure this is a cipher issue, because it doesn't get that far: it fails just before the client tries to say hello to the server.

Also, since this problem is recent, it seems unlikely that the client is using an older cipher now than it was last week?

The problem is present on both Chrome and the device's native browser.
9:51 am on Aug 4, 2019 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 30, 2002
posts:5040
votes: 57


Is it a high traffic site? Perhaps you're exceeding the max connections and any more connections get dropped.

[oxpedia.org...]
1:07 pm on Aug 4, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


No it's not high traffic. It works 100% of the time on WIFI and fails 100% of the time on the mobile connection. I've tested it 1000s of times while trying to fix this, so I don't think this is a random connection drop.

I'm getting this in the error log for the IP concerned with the mobile connection:

[ssl:info] [pid 6031] (70007)The timeout specified has expired: [client IP] AH01991: SSL input filter read failed.

It's only "info" alert level and I haven't had any luck trying to find an answer to that online, so it may not be relevant.

I have tried increasing the timeout settings in the Apache config to no avail. The ERR_CONNECTION_RESET message happens immediately when trying to access the website.

From my tshark analysis the client never sends a "hello" message, so I guess behind the scenes the server is timing out, but that is not what is causing the issue.

I'm just trying to work out why the server is not receiving a hello from the client.
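
One way to see at which stage a connection fails, from a machine on the affected network (for example a laptop tethered to the phone), is a small probe like the Python sketch below. It only separates "TCP connect failed" from "TLS handshake failed"; the function name `probe_tls` and the stage labels are made up for this example, and it uses Python's default certificate verification.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0):
    """Return (stage, detail), where stage is 'tcp_connect',
    'tls_handshake' or 'ok'."""
    try:
        # Stage 1: plain TCP three-way handshake.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return ("tcp_connect", repr(exc))
    try:
        # Stage 2: TLS handshake (Client Hello, Server Hello, ...).
        context = ssl.create_default_context()
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return ("ok", tls.version())
    except (ssl.SSLError, OSError) as exc:
        # A reset that arrives right after the TCP handshake shows up here,
        # typically as ConnectionResetError.
        return ("tls_handshake", repr(exc))
    finally:
        sock.close()


if __name__ == "__main__":
    # "mydomain.com" is the placeholder used earlier in this thread.
    print(probe_tls("mydomain.com"))
```

A result of `tls_handshake` with a connection-reset error, on a network where other HTTPS sites return `ok`, would match what the tcpdump and tshark captures show: TCP connects, but the TLS exchange never gets going.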
1:11 pm on Aug 4, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member Top Contributors Of The Month

joined:Nov 13, 2016
posts:1194
votes: 285


I don't know if it can help, but can you post your nginx configuration file?
10:13 pm on Aug 4, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11847
votes: 242


I have tried increasing timeout settings in apache config to no avail

can you post your nginx configuration file

when you instruct a server to "listen" to port 443, there is nothing to hear until a secure connection has been made.
in other words, until the secure handshake is complete the web server config is irrelevant.
10:23 pm on Aug 4, 2019 (gmt 0)

New User

joined:Aug 3, 2019
posts: 7
votes: 0


Thank you all for your responses.

So I've finally got to the bottom of this error, after a few days of scratching my head and spending a lot of time researching and fiddling with the server config.

It turns out the website has been blocked by a couple of ISPs who have now decided to deem the website as containing adult content (it doesn't, and it has never had this issue before in its 10-year history).

Rather than providing any kind of useful redirect to let a user know why they can't access the site, they simply don't send a hello packet and up comes the ERR_CONNECTION_RESET.

I must say, this is something that had crossed my mind early on in my investigations. However, rather unhelpfully, when logging into my mobile phone account it said the adult content filter was off. I discovered this wasn't in fact the case when I called them to double check!