
Deny Access to Specific User Agent?

     
5:30 pm on Jan 28, 2009 (gmt 0)

New User (joined: Jan 28, 2009; posts: 17)


Hi all,

My gallery has been the target of comment spam attacks of late: roughly 1,000+ attempts per day to log their links and such. My .htaccess file is over 6K's worth of banned IPs, but they still come.

There must be a better way.

Every instance of this spam uses "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.8) Gecko/20071008 Firefox/2.0.0.8 RPT-HTTPClient/0.3-3"

Is it possible to permanently block this agent?

Thanks
Jeff

8:09 pm on Jan 28, 2009 (gmt 0)

jdMorgan, Senior Member (joined: Mar 31, 2002; posts: 25430)


Yes, use mod_rewrite (or mod_access plus mod_setenvif) and send a 403-Forbidden response if the User-agent string *contains* "RPT-HTTPClient/".
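
For reference, the mod_setenvif plus mod_access route might look roughly like this (a minimal sketch using the pre-2.4 Order/Allow/Deny directives; the environment-variable name "bad_ua" is arbitrary):

# Flag any request whose User-Agent contains "RPT-HTTPClient/" (case-insensitive)
SetEnvIfNoCase User-Agent "RPT-HTTPClient/" bad_ua
# Deny flagged requests, allow everything else
Order Allow,Deny
Allow from all
Deny from env=bad_ua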

Jim

1:25 am on Jan 29, 2009 (gmt 0)

New User (joined: Jan 28, 2009; posts: 17)


Thanks Jim,

Here's what I've got, but it doesn't seem to be working:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5\.0+\(Windows;+U;+Windows+NT+5\.1;+en-US;+rv:1.8.1.8\)+Gecko/20071008+Firefox/2.0.0.8+RPT-HTTPClient/0.3-3$ [NC]
RewriteRule ^.*$ - [F]

1:43 am on Jan 29, 2009 (gmt 0)

jdMorgan, Senior Member (joined: Mar 31, 2002; posts: 25430)


You don't need to match the exact string; just match "contains 'RPT-HTTPClient/'", since it's unlikely you'd ever welcome such a request, regardless of the browser base.

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]

The problem with your original pattern is that the literal spaces were converted to plus signs -- kind of a double problem, in that you want to match a literal space "\ ", not a plus sign, *and* an unescaped "+" means "one or more of the preceding character, character group, or parenthesized subpattern."

For example, the pattern "Mozilla/5\.0+\(Win" matches "Mozilla/5.0(Win" or "Mozilla/5.0000(Win", but it won't match "Mozilla/5.0 (Win".
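
For comparison, a literal match of that full string would need the spaces escaped as "\ " and the remaining dots escaped as well; roughly (a sketch only -- the simple "contains" test above is the better choice):

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5\.0\ \(Windows;\ U;\ Windows\ NT\ 5\.1;\ en-US;\ rv:1\.8\.1\.8\)\ Gecko/20071008\ Firefox/2\.0\.0\.8\ RPT-HTTPClient/0\.3-3$ [NC]
RewriteRule .* - [F]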

By matching the unanchored "RPT-HTTPClient/" occurring anywhere in the user-agent string, your rule will be effective no matter what browser is claimed, and no matter what version of the HTTPClient is used.

Jim

3:40 pm on Jan 30, 2009 (gmt 0)

New User (joined: Jan 28, 2009; posts: 17)


Hi Jim,

Thanks again for the speedy assistance. The rewrite you supplied seems to be blocking access just fine; however, I expected a host of 403's in the log file, but instead have close to 1,100 500's (Internal Server Errors). Does it matter on a shared server?

4:57 pm on Jan 30, 2009 (gmt 0)

jdMorgan, Senior Member (joined: Mar 31, 2002; posts: 25430)


> [I] have close to 1100 500's (Internal Server Errors). Does it matter on a shared server?

Of course it does -- You're getting server errors!

The most likely cause is that you are using a custom 403 error page, but have made no provisions to *allow* that error page to be served to denied requestors. So, you will get a second 403 response as a result of a denied request, because the 403 page itself is denied to that client. Then because that too is denied, you'll get a third denied response, and a fourth... ad infinitum until the server gives up and throws a 500-Server error, as you are seeing.
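
To illustrate with a hypothetical path: if the custom page is configured as

ErrorDocument 403 /custom-403.html

then the internal request for /custom-403.html also matches the deny rule, earns its own 403, triggers another internal request, and so on until the server bails out with a 500.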

You should see this clearly in your server error log file...

And there's another problem, too...

The simplest solution is to add RewriteConds to this rule to always allow your custom error page and robots.txt file to be served, even to denied clients:


RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_URI} !^/path-to-custom-403-error-page\.html$
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]

There are several ways to code that more efficiently, but I want the example here to be clear.
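
One such more-compact variant, shown purely for illustration (the error-page path is still a placeholder to replace with your own), folds the two exemptions into a single alternation:

RewriteCond %{REQUEST_URI} !^/(robots\.txt|path-to-custom-403-error-page\.html)$
RewriteCond %{HTTP_USER_AGENT} RPT-HTTPClient/ [NC]
RewriteRule .* - [F]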

You want to allow unconditional robots.txt access because some user-agents will interpret any error response to their request for robots.txt as carte blanche to spider the whole site. Creating 403-Forbidden loops and denying spider and robot access to robots.txt are two very good ways to create "self-inflicted denial-of-service attacks" by overloading your server with error handling. :o

Jim

[edited by: jdMorgan at 4:58 pm (utc) on Jan. 30, 2009]

1:17 pm on Feb 1, 2009 (gmt 0)

New User (joined: Jan 28, 2009; posts: 17)


Yes, I am using custom 403 and 404 pages. I modified the rewrite conditions as you have outlined, but the 500's still continue!

Please note that I am not very well versed here (as you may have already realized, but just to be sure).

Anyway, thanks again for the help, Jim. If there's another suggestion, please pass it along.

7:25 pm on Feb 1, 2009 (gmt 0)

jdMorgan, Senior Member (joined: Mar 31, 2002; posts: 25430)


Look at your server error log -- It will likely tell you what the problem is.

If you modified the code, you're welcome to post it, in case there's a problem...

Jim

 
