| 2:42 pm on Apr 17, 2014 (gmt 0)|
One other thing I've been thinking about: the level of logging that takes place. On the same server I've got IIS log files, log4net log files, and MySQL logging for my own analytics. But it's a blazing fast SSD and I've made the logging asynchronous... Hmmm, maybe the answer is Microsoft sucks.
|brotherhood of LAN|
| 2:58 pm on Apr 17, 2014 (gmt 0)|
Just some ideas,
- Check connection limit values for the web server
- Check connection limit values for the DB
- Check slow query log for DB.
- Check for any hard limits on the system (like ulimit values on UNIX systems)
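For the UNIX-side check in that list, the soft/hard limits can be read programmatically rather than eyeballing `ulimit -a`. A minimal Python sketch (POSIX-only - the `resource` module doesn't exist on Windows, and which limits matter depends on your setup):

```python
import resource

# Hard limits that commonly cap concurrent connections on UNIX systems.
# RLIMIT_NOFILE: max open file descriptors (every socket counts as one).
# RLIMIT_NPROC: max processes/threads for the user.
LIMITS = {
    "open files (RLIMIT_NOFILE)": resource.RLIMIT_NOFILE,
    "processes/threads (RLIMIT_NPROC)": resource.RLIMIT_NPROC,
}

def report_limits():
    """Return 'name: soft=... hard=...' lines for each limit of interest."""
    lines = []
    for name, rlim in LIMITS.items():
        soft, hard = resource.getrlimit(rlim)
        lines.append(f"{name}: soft={soft} hard={hard}")
    return lines

if __name__ == "__main__":
    for line in report_limits():
        print(line)
```

If the soft limit on file descriptors is near your peak concurrent-connection count, that alone can look exactly like a mysterious ceiling on traffic.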
| 3:10 pm on Apr 17, 2014 (gmt 0)|
Thanks brotherhood. The act of writing that long post gave me a path to follow. Claiming that my software wasn't causing the issue was premature and naive. I think I need more measurement before asking for help.
|brotherhood of LAN|
| 3:14 pm on Apr 17, 2014 (gmt 0)|
I find fleshing something out in writing helps me too. That or sitting away from the PC for a bit. Glad you got it sorted.
The hardware does seem to be sturdy enough. What was/is the problem... web server / db ?
| 3:20 pm on Apr 17, 2014 (gmt 0)|
The problem is too nebulous right now. Basically unauthorized bots impact my revenue in a very noticeable way. Banning them by IP address spikes my revenue in a very noticeable way. Therefore, there must be something fundamentally wrong with my setup. The bots weren't that aggressive.
| 3:23 pm on Apr 17, 2014 (gmt 0)|
My load balancer is a shared resource not under my control. That piece definitely needs more scrutiny. I probably need to move to my own load balancer on bare metal, so I can log into it and watch traffic patterns.
| 3:35 pm on Apr 17, 2014 (gmt 0)|
In the past I had an issue where a spike of traffic (e.g. > 5 requests/second) caused an IIS server to suddenly start returning many HTTP 500 responses. The developers fixed it; it was something to do with thread management. Perhaps another angle to look at.
I was also recently involved in investigating another issue, this one with load balancers that, when traffic spiked above whatever limit they had, started returning HTTP 502 rather than passing requests on to the application. Increasing the number of concurrent requests the load balancer could handle solved it.
| 3:51 pm on Apr 17, 2014 (gmt 0)|
Whoah, Aaakk9999, that IIS issue you described hits very close to home. I am probably seeing a problem when I hit > 5 ASP.NET (4.5) requests per second. Do you have any other details? What IIS version? Was it patched by Microsoft or by the app developers?
You know, I've never looked at the response codes in my IIS logs. That's a most excellent idea. It did not occur to me that my requests might be failing.
I'm allocated 1,000 connections per second on my shared load balancer; it will cut off connections past 1,000. My graphs (which are smoothed) never go past 200 connections per second. I guess it could be a bug.
Does anyone know of a graphical tool that will let you analyze IIS logs and spot issues? I've written my own console parser, but sure would be nice to point a GUI at the log directories.
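In the absence of a GUI, a few lines of Python will tally response codes from W3C-format IIS logs. This is only a sketch - the column names are taken from the log's own `#Fields:` directive, and the sample rows below are invented for illustration:

```python
from collections import Counter

def status_counts(lines):
    """Tally HTTP status codes from W3C extended log lines.

    Column positions come from the '#Fields:' directive that IIS writes
    at the top of each log; other '#' lines are comments and skipped.
    """
    fields, counts = [], Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]          # column names after the directive
        elif line and not line.startswith("#"):
            row = dict(zip(fields, line.split()))
            counts[row.get("sc-status", "?")] += 1
    return counts

# Invented sample rows, just to show the shape of the input.
sample = [
    "#Fields: date time cs-uri-stem sc-status",
    "2014-04-17 14:00:01 /index.aspx 200",
    "2014-04-17 14:00:02 /search.aspx 500",
    "2014-04-17 14:00:02 /index.aspx 500",
]
print(status_counts(sample))   # Counter({'500': 2, '200': 1})
```

Feeding it a real day's log and watching the 5xx count would be the quickest way to confirm whether requests are silently failing.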
| 5:40 pm on Apr 17, 2014 (gmt 0)|
The server is IIS 8.5 running together with SQL Server 2012; .NET is 4.0.30319. I am not entirely sure what they did, but I had a similar problem on another website a few years back, and that one was running on IIS 6.
To test I used the Screaming Frog crawling tool, where you can adjust the number of threads and the number of requests per second. With it down to 1 thread and 2 requests/sec, there were no problems.
When I upped it to 5 threads and more than 5 requests/sec, I started to get HTTP 500 responses, which continued until the thread pool was restarted.
I will see if I can get more details from developers on what they have done.
|Banning them by IP address spikes my revenue in a very noticeable way. |
I find this interesting. When you ban IPs, I think this happens on the server, after the request has passed through the load balancer. If so, these IPs are being filtered by the server somehow (whatever method you use). That points more and more towards an application/threading issue, or perhaps an IIS memory/caching management issue.
| 12:49 pm on Apr 18, 2014 (gmt 0)|
I have heard from developers. The error that was causing HTTP 500 was "ExecuteReader requires an open and available Connection. The connection's current state is Connecting."
They said they use Entity Framework and that they changed the DB connection to be created on every DB call rather than keeping it static/global. The static/global connection caused HTTP 500 errors that only showed up under a larger number of concurrent requests.
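The fix they describe - a connection per call instead of one shared static connection - can be illustrated with a Python/sqlite3 analog. This is not the .NET code; sqlite3 stands in for the ADO.NET connection, and its cross-thread check plays the role of the "requires an open and available Connection" failure that only appears once multiple request threads touch the shared object:

```python
import sqlite3
import threading
from contextlib import closing

def query_shared(conn, results):
    # Anti-pattern stand-in: reuse one connection object from another thread.
    # sqlite3 connections refuse cross-thread use by default, which makes
    # the failure deterministic here; with ADO.NET it shows up as
    # intermittent errors under concurrent load.
    try:
        conn.execute("SELECT 1")
        results.append("ok")
    except sqlite3.ProgrammingError:
        results.append("error")

def query_per_call(results):
    # The fix: open a connection per call (or borrow one from a pool)
    # and release it when the call finishes.
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute("SELECT 1")
        results.append("ok")

shared = sqlite3.connect(":memory:")   # the "static/global" connection
res_shared, res_fresh = [], []
t1 = threading.Thread(target=query_shared, args=(shared, res_shared))
t2 = threading.Thread(target=query_per_call, args=(res_fresh,))
t1.start(); t2.start(); t1.join(); t2.join()
print(res_shared, res_fresh)   # ['error'] ['ok']
```

The per-call version looks wasteful, but real drivers (ADO.NET included) pool connections under the hood, so "open per call" is usually cheap.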
They pointed to this URL for some more info:
Perhaps you should use some kind of crawl tool, hammer the site for a short while, and see what happens - what gets returned on requests (if anything) - and try to debug it from there.
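A hammer like that can be sketched in a few lines of Python - here pointed at a throwaway local server so the example is self-contained; in practice you'd aim it at a staging copy of the site and watch which status codes come back as you raise the thread count:

```python
import http.server
import threading
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

class Handler(http.server.BaseHTTPRequestHandler):
    """Stand-in target that always answers 200; replace with a real URL."""
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *args):   # keep the console quiet
        pass

def fetch(url):
    try:
        with urllib.request.urlopen(url, timeout=5) as r:
            return r.status
    except urllib.error.HTTPError as e:
        return e.code            # 4xx/5xx responses still carry a code
    except OSError:
        return "no response"     # refused / reset / timed out

def hammer(url, requests=50, threads=10):
    """Fire `requests` GETs with `threads` workers; tally status codes."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return Counter(pool.map(fetch, [url] * requests))

if __name__ == "__main__":
    srv = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    print(hammer(f"http://127.0.0.1:{srv.server_port}/"))
    srv.shutdown()
```

If 500s (or "no response") start appearing only once the thread count crosses some line, that line is the number to take to whoever owns the load balancer or the thread pool.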