Hello all - I've been reading this forum for the past few days (as well as about a year ago) to learn more about blocking user agents and IPs using mod_rewrite. I've been largely successful for the past year, but now I'm facing an issue that is outside my realm of understanding.
Without going into too much detail - we have many sites, each with many pages, with each of those pages interacting with a database (reads/writes) for each request.
Typically this all works very well, but rogue user agents will occasionally come in and make outrageous numbers of requests. Not too big of a deal, as I can block user agents pretty easily. This does result in what I refer to as a "de facto DOS", i.e. - no one is trying to take down our server, they're just trying to scrape content or learn more about our sites.
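For context, the kind of user-agent block I'm using now looks roughly like this (the agent strings here are just placeholders, not the actual bots I've blocked):

```apache
RewriteEngine On
# Placeholder bot names - I match whatever rogue agent showed up that day
RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC]
RewriteRule .* - [F,L]
```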
Now, however, I'm having issues with requests that look "normal" in the sense that they're set up to look like they're coming from a regular user/browser (no referrer, but the user agent is Mozilla, in this case). I can't block these because they're identical - as far as I can tell - to requests for a bookmarked page or type-in traffic.
For the moment, I'm manually adding IPs to my mod_rewrite rules to block these as well, but I don't want my job to be playing with user agents and IP lists every day.
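The manual IP blocks look roughly like this (192.0.2.x is a documentation range used here as a stand-in, not a real offender):

```apache
# Return 403 for a specific offending address
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
RewriteRule .* - [F,L]
```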
So, to come to the point - I would like to use a script of some sort that will block requests according to a set of rules. Basically a simple if/then statement. For example, last night the IP in question made 17,000+ requests in under 3 minutes. A simple scenario would be: [if "X-number" of requests from an IP (or its "C" block) within an "X-seconds" time period, do 403].
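To make the if/then rule concrete, here's a minimal sketch of the sliding-window counter I have in mind - the names (`MAX_REQUESTS`, `should_block`, etc.) are mine, purely for illustration, and in practice this would be fed from the Apache access log:

```python
from collections import defaultdict, deque

MAX_REQUESTS = 100     # "X-number" of requests allowed...
WINDOW_SECONDS = 60    # ...within this "X-seconds" time period

# ip -> timestamps of that IP's recent requests
hits = defaultdict(deque)

def should_block(ip, now):
    """Record a request from `ip` at time `now` (seconds); return True
    once the request count inside the sliding window exceeds the limit."""
    q = hits[ip]
    q.append(now)
    # Drop timestamps that have aged out of the window.
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()
    return len(q) > MAX_REQUESTS
```

An IP that trickles in requests never trips the limit; one that fires off more than `MAX_REQUESTS` inside the window gets flagged, and the caller could then emit a 403 rule or a firewall entry.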
My question - am I thinking in the right direction here? This is outside my standard realm of experience/knowledge, and I may be reinventing the wheel. Or I could have devised the proper solution... Thoughts/recommendations?
Quick note to add: I just came across iptables for Linux. We're running Apache on FreeBSD 7.3.
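(Since we're on FreeBSD rather than Linux, I gather the rough equivalent would be pf or ipfw. A pf sketch of the kind of table-based block I'm picturing - the `<scrapers>` table name is made up:)

```
# /etc/pf.conf fragment (sketch only)
table <scrapers> persist
block drop in quick on $ext_if from <scrapers> to any

# A script could then add offenders at runtime, e.g.:
#   pfctl -t scrapers -T add 192.0.2.1
```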