Best way to tell users we're too busy

Through Apache? mod_jk? Tomcat?

sublime1

1:42 am on Dec 19, 2004 (gmt 0)

We're regularly getting hammered by various bots, IE's "offline content" feature, site downloaders and all sorts of badly behaved user agents. From time to time they generate more load than our servers can handle.

When this happens, the servers don't crash or fail; they just get slower and slower. Any sustained flood of requests that exceeds our throughput pushes all requests into a queue that keeps growing, so response times degrade faster and faster and the site is, in effect, down.
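To put rough numbers on the queueing effect, here is a minimal sketch with made-up rates. The `BacklogModel` class, its method names and the figures (120 requests/s arriving, 100 requests/s served) are assumptions for illustration, not measurements from our site.

```java
/** Hypothetical back-of-the-envelope model of a growing request queue. */
class BacklogModel {

    /** Queued requests after `seconds`, starting from an empty queue. */
    static long backlogAfter(long arrivalsPerSec, long servedPerSec, long seconds) {
        // The queue only grows when requests arrive faster than they are served.
        long growthPerSec = Math.max(0, arrivalsPerSec - servedPerSec);
        return growthPerSec * seconds;
    }

    /** Approximate wait, in seconds, for a request that joins behind `backlog`. */
    static double waitSeconds(long backlog, long servedPerSec) {
        return (double) backlog / servedPerSec;
    }
}
```

With those assumed rates, one minute of overload leaves 1200 requests queued, so a new arrival waits about 12 seconds before it is even looked at, and the wait keeps climbing for as long as the flood lasts.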

Of course we're working on doing smart things like identifying and shooing away bad bots, optimizing our performance, increasing our capacity and the like.

But I want to find a "when all else fails" mechanism that returns a 503 Service Unavailable response. We have Apache sitting in front of Tomcat using the mod_jk connector. Some things I have considered are:

1) Setting the max<something> directives in Apache to limit the number of pending requests, or
2) Setting some maximum number of threads, connections, or something similar in the workers.properties file that configures mod_jk, or maybe in Tomcat's server.xml for the connector, or
3) Checking or monitoring for some maximum number of concurrent requests in the Java code.

Whatever we go with should reject requests it can't handle as quickly as possible, which suggests doing it in Apache. But the performance problems are not in Apache; they are in the application run by Tomcat (and our database back end, etc.), which suggests that Tomcat, or even our Java code, should do the monitoring.

What is the right directive in Apache, assuming I can say "when we have more than this many pending requests, we should turn away the rest"? Can anyone point me to an example?

Or, is there a way via Apache's conf file to measure actual average response time so I could say "If the average response is taking more than 5 seconds, defer other connections until things get better"?
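The averaging idea itself can at least be sketched in application code. Below is a minimal Java sketch, assuming a hypothetical `LatencyGate` class that keeps an exponential moving average of response times. The class, its method names, the smoothing factor and the 5-second threshold are all illustrative assumptions, not an Apache or Tomcat feature.

```java
/** Hypothetical gate: tracks a moving average of response times. */
class LatencyGate {
    private final double thresholdMillis; // e.g. 5000 for "more than 5 seconds"
    private final double alpha;           // smoothing factor, 0 < alpha <= 1
    private double averageMillis;         // exponential moving average
    private boolean seeded;               // false until the first sample arrives

    LatencyGate(double thresholdMillis, double alpha) {
        this.thresholdMillis = thresholdMillis;
        this.alpha = alpha;
    }

    /** Record how long one completed request took. */
    synchronized void record(double elapsedMillis) {
        if (!seeded) {
            averageMillis = elapsedMillis;
            seeded = true;
        } else {
            averageMillis = alpha * elapsedMillis + (1 - alpha) * averageMillis;
        }
    }

    /** True when the moving average is above the threshold. */
    synchronized boolean overloaded() {
        return seeded && averageMillis > thresholdMillis;
    }

    synchronized double average() {
        return averageMillis;
    }
}
```

One design caveat: if every request is turned away while the gate reports overload, no new timings are recorded and the average never recovers, so a real version would need to keep letting some fraction of requests through.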

I think there's a setting in the server.xml for the mod_jk connector that specifies some limit. Is this the right way to go? What is the right setting?

Or should my doGet() methods in Java check some servlet container state and handle the problem there?
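For the Java-side option, this is roughly what I have in mind: a minimal sketch, assuming a hypothetical `LoadShedder` helper (the class name, `handle` method and limit are illustrative, not part of Tomcat or the servlet API). It relies on `java.util.concurrent.Semaphore`, so it assumes Java 5 or later.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical helper: caps concurrent requests and sheds the rest at once. */
class LoadShedder {
    static final int SC_OK = 200;
    static final int SC_SERVICE_UNAVAILABLE = 503;

    private final Semaphore permits;

    LoadShedder(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Runs the work if a slot is free and returns 200.
     * Returns 503 immediately, without queueing, when all slots are busy.
     */
    int handle(Runnable work) {
        if (!permits.tryAcquire()) {
            return SC_SERVICE_UNAVAILABLE; // fail fast instead of waiting
        }
        try {
            work.run();
            return SC_OK;
        } finally {
            permits.release(); // always free the slot, even if the work throws
        }
    }
}
```

In a servlet, doGet() could wrap its real work in `handle(...)` and, on a 503 result, call `response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE)`. The point of `tryAcquire()` over `acquire()` is that rejected requests return at once rather than joining yet another queue.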

And now that I think of it, we might even be able to use our BigIP load balancer to protect our servers from being overloaded. Anyone have any suggestions for this kind of solution?

Any advice and pointers to specific suggestions on how to deal with this are gratefully accepted!

Sublime1

jatar_k

1:49 am on Dec 19, 2004 (gmt 0)

hey Sublime1,

I think the best approach is to get rid of the bad bots etc. Concentrate efforts on that. You really don't want to turn away any legitimate users because of a bunch of bots. The key to the situation, as I see it, is to weed out all unwanted traffic to allow the good traffic through.

I have a couple of threads that might help. They are quite long, but I suggest reading them in full.

A Close to perfect .htaccess ban list [webmasterworld.com]

sublime1

5:02 pm on Dec 19, 2004 (gmt 0)

Thanks jatar_k --

I have read all of these threads and am moving forward with this bad bot banning. It will probably solve the majority of our problems.

For the remaining cases I still contend that it's better to have a server report a "too busy" condition than to effectively not respond at all.

So, I'm still seeking guidance :-) Thanks all!

jatar_k

5:45 pm on Dec 19, 2004 (gmt 0)

my opinion is that if you ban everything useless and still have too much traffic, you need some new boxes or a new pipe.

I never turn away good traffic.

sublime1

6:31 pm on Dec 19, 2004 (gmt 0)

:-)

Right, of course. Exactly my objective, too.

My philosophy is to design, create, install, test and deploy a system that never fails. But, when it does fail... :-)

I appreciate your guidance. I'm still hoping someone else might be able to suggest a simple fall-back solution.