My gallery has been the target of comment spam attacks of late. Roughly 1,000+ attempts per day to post their links and such. My .htaccess file contains over 6 KB of banned IPs, but they still come.
There must be a better way.
Every instance of this spam uses "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.8) Gecko/20071008 Firefox/2.0.0.8 RPT-HTTPClient/0.3-3"
Is it possible to permanently block this agent?
Thanks
Jeff
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]
Keep in mind that in a regular expression, "+" means "one or more of the preceding character," not a literal space (and an unquoted space would end the pattern argument anyway). For example, the pattern "Mozilla/5\.0+\(Win" matches "Mozilla/5.0(Win" or "Mozilla/5.0000(Win", but it won't match "Mozilla/5.0 (Win".
By matching the unanchored "RPT-HTTPClient/" occurring anywhere in the user-agent string, your rule will be effective no matter what browser is claimed, and no matter what version of the HTTPClient is used.
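For contrast, trying to match a longer slice of that claimed user-agent would look something like this (the exact prefix shown is just an illustration, and the pattern must be quoted so the spaces survive Apache's config parsing):
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/5\.0 \(Windows; U; Windows NT 5\.1" [NC]
# Brittle: this breaks as soon as the spammer claims a different browser,
# OS, or version string, while the RPT-HTTPClient token stays put.
That kind of long, anchored pattern buys you nothing here and is easy to get wrong.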
Jim
Of course it does -- You're getting server errors!
The most likely cause is that you are using a custom 403 error page, but have made no provision to *allow* that error page to be served to denied requestors. So the denied request triggers a second 403 response, because the 403 page itself is denied to that client; that second response is denied in turn, producing a third, a fourth... ad infinitum, until the server gives up and throws a 500 Server Error, as you are seeing.
You should see this clearly in your server error log file...
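For illustration, the loop comes from a combination along these lines (the error-page path here is just a placeholder):
# /403.html stands in for whatever custom 403 page is configured
ErrorDocument 403 /403.html
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]
# Apache's internal request for /403.html matches ".*" as well, so it gets a
# 403 too, and that repeats until the server gives up and returns a 500 error.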
And there's another problem, too...
The simplest solution is to add RewriteConds to this rule to always allow your custom error page and robots.txt file to be served, even to denied clients:
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_URI} !^/path-to-custom-403-error-page\.html$
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]
You want to allow unconditional robots.txt access because some user-agents will interpret any error response to their request for robots.txt as carte blanche to spider the whole site. Creating 403-forbidden loops and denying spider and robot access to robots.txt are two very good ways to create "self-inflicted denial-of-service attacks" by overloading your server with error handling. :o
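Putting the pieces together, the complete block in .htaccess would look something like this; substitute your real custom error page for the placeholder /errordocs/403.html:
# The /errordocs/403.html path below is a placeholder for your real 403 page
ErrorDocument 403 /errordocs/403.html
RewriteEngine on
# Always allow robots.txt and the custom 403 page, even to blocked clients
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_URI} !^/errordocs/403\.html$
# Deny any request whose user-agent contains the RPT-HTTPClient token
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]
If "RewriteEngine on" already appears earlier in the file, you only need it once.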
Jim
Please note that I am not very well versed here (as you may have already realized, but just to be sure).
Anyway, thanks again for the help, Jim. If there's another suggestion, please pass it along.