RewriteCond %{HTTP_USER_AGENT} ^Download\ (b|Demon|Ninja) [NC,OR]
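(A condition ending in [NC,OR] is normally one of several feeding a final RewriteRule; the sketch below shows the usual shape of such a block -- only the Download condition above is from my config, the second condition and the closing rule are placeholder examples.)

RewriteEngine On
# [NC] = case-insensitive match, [OR] = or the next condition
RewriteCond %{HTTP_USER_AGENT} ^Download\ (b|Demon|Ninja) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC]
# Return 403 Forbidden for every request from a matching agent
RewriteRule .* - [F]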
211.1.xxx.213 - - [27/Oct/2003:00:51:10 -0800] "HEAD /favicon.ico HTTP/1.0" 403 - "-" "Download Ninja 2.0"
211.1.xxx.213 - - [27/Oct/2003:00:51:10 -0800] "GET /index.html HTTP/1.0" 403 - "-" "Download Ninja 2.0"
211.1.xxx.213 - - [27/Oct/2003:00:51:10 -0800] "GET /page1.html HTTP/1.0" 403 - "-" "Download Ninja 2.0"
211.1.xxx.213 - - [27/Oct/2003:00:51:12 -0800] "GET /page2.html HTTP/1.0" 403 - "-" "Download Ninja 2.0"
211.1.xxx.213 - - [27/Oct/2003:00:51:12 -0800] "GET /page3.html HTTP/1.0" 403 - "-" "Download Ninja 2.0"
Is there a way to refuse any further connections to the server after, say, 3 or 4 403s from the same user? (Some generic rule that works before I've identified his IP.)
Thanks.
You can ask your host to block them by IP address at the firewall, but that's the only way to "stop a connection."
An alternative is to make your custom 403 page very short - just put a link on it for humans to click for more information, and trim the <head> of the page down to the bare minimum tags. Humans may follow that link, but most bad-bots won't. At least this will keep your bandwidth down.
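A sketch of that arrangement, assuming the stripped-down page is saved as /403.html (the filename is only an example):

# Serve a small local page instead of the default Apache error response
ErrorDocument 403 /403.html

Keep that page to a few hundred bytes of bare HTML and the bandwidth cost of each blocked request stays tiny.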
I've raised the point before, but it bears repeating: HTTP is a stateless protocol; each request exists separately and independently of every other request -- the server makes no association between them. So while Webmasters may create scripts to track "sessions" -- associating groups of requests with each other -- the server itself does not. As such, solutions for problems like this lie outside the server software, and must be scripted or coded and compiled in separately.
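One common shape for such an outside-the-server solution: a separate script you write (a cron job or log-watcher, not part of Apache) maintains a deny list that the server merely reads. For example, the script might append lines like these to .htaccess -- the addresses shown are just placeholders:

# This block is maintained by an external log-watching script, not by Apache itself
Order Allow,Deny
Allow from all
Deny from 192.0.2.1
Deny from 192.0.2.2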
Jim
That's a good idea for crippling very simple 'bots. However, the ones that run multi-threaded can just open a new and separate session while the previous request is still sleeping, and the worst of them can hit you with dozens of requests in a second. They do have a limit on the number of threads they can support, though, so while this isn't a cure-all, it usually slows them down to some extent. Just make sure your server can support a large number of (sleeping) processes, otherwise the sleepers may hurt you more than they hurt the bad 'bots! :(
Jim
The ones that have pulled pages very rapidly from my site for several minutes have all been from the same IP so far. You bring up a very good point about the processes, though. On second thought, I guess I'll take this down just in case my jumpers decide to start attacking. The last thing I want to do is crash the server with too many sleeps. lolol
Back to the drawing board.