RewriteCond / RewriteRule

How to block strange USER_AGENT

Maleville

12:32 am on Jan 30, 2004 (gmt 0)

10+ Year Member



Hello everybody

In my .htaccess I want to block visitors whose USER_AGENT is a random string of letters, spaces, and numbers, such as:

httakcygqunncwmljki v5lp5pkcosyv
obkbkuucrvetpoyymsjn8 wbepncnErok
mimfwacpgouquuluBqpnpocmckus B k Bow

Each string seems to be more than 15 characters long.
Is this code correct?

RewriteCond %{HTTP_USER_AGENT} ^(|0-9||[A-Z]{15})$ [OR]

jdMorgan

6:21 am on Jan 30, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maleville,

It would just be:


RewriteCond %{HTTP_USER_AGENT} [0-9A-Za-z]{15,} [OR]

but the problem is, that might block unexpected-but-valid user-agents, so be careful to check against a good long list of valid user-agents.
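One way to do that check before touching .htaccess is to run the same pattern over a list of user-agents offline. The following is only a sketch in Python, on the assumption that Python's `re` module treats this simple pattern the same way Apache's regex engine does. The names `would_block` and `false_positives` are made up for illustration, and the two "known good" strings are just a stand-in for a much longer list taken from your own logs.

```python
import re

# Same pattern as in the RewriteCond above: 15 or more contiguous letters/digits.
LONG_RUN = re.compile(r"[0-9A-Za-z]{15,}")

# The three strings quoted in the first post.
suspect_agents = [
    "httakcygqunncwmljki v5lp5pkcosyv",
    "obkbkuucrvetpoyymsjn8 wbepncnErok",
    "mimfwacpgouquuluBqpnpocmckus B k Bow",
]

# Illustrative stand-in for a "good long list" (replace with agents from your logs).
known_good_agents = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Googlebot/2.1 (+http://www.googlebot.com/bot.html)",
]


def would_block(agent):
    """Return True if the pattern matches anywhere in the user-agent string."""
    return LONG_RUN.search(agent) is not None


def false_positives(agents):
    """Return the agents from a trusted list that the pattern would block."""
    return [agent for agent in agents if would_block(agent)]


if __name__ == "__main__":
    for agent in suspect_agents + known_good_agents:
        print("BLOCK" if would_block(agent) else "allow", agent)
```

If `false_positives()` returns anything for your trusted list, the pattern needs tightening before it goes live.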

What might also work for many of them is to detect 2 and 3-letter "impossibilities" in the letter sequence. This would vary by language, but for example, "q not followed by u" or "two u's in a row not preceded by q".

Just some examples I can see: (nn[^aeiou]|q[^u]|[^q]uu|pn[^e]|mf[^ ])
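To see which "impossibility" fires on each of the sample strings, the alternation can be tried offline. This is a rough sketch in Python, again assuming the pattern behaves the same under Python's `re` as it does in a RewriteCond (case-sensitive, unanchored). The helper name `first_hit` is hypothetical.

```python
import re

# The letter-sequence "impossibilities" from the post, written with | as the
# alternation character.
IMPOSSIBLE = re.compile(r"(nn[^aeiou]|q[^u]|[^q]uu|pn[^e]|mf[^ ])")


def first_hit(agent):
    """Return the first substring that trips the pattern, or None."""
    match = IMPOSSIBLE.search(agent)
    return match.group(0) if match else None


if __name__ == "__main__":
    samples = [
        "httakcygqunncwmljki v5lp5pkcosyv",
        "obkbkuucrvetpoyymsjn8 wbepncnErok",
        "mimfwacpgouquuluBqpnpocmckus B k Bow",
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        "vacuum",  # an ordinary English word, included as a caution
    ]
    for agent in samples:
        print(repr(first_hit(agent)), "<-", agent)
```

Note that an ordinary word such as "vacuum" trips the `[^q]uu` branch, which is the same caution as before: these rules are language-dependent and need checking against real user-agents.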

Or, you might be better off blocking them behaviourally - with a bad-bot script, for example.

Jim

Maleville

11:01 pm on Feb 2, 2004 (gmt 0)

10+ Year Member



Thank you Jim.

Your code is for 15 or more letters/numbers. Correct?
RewriteCond %{HTTP_USER_AGENT} [0-9A-Za-z]{15,} [OR]

If I want to match a string of any length, is this one correct?
RewriteCond %{HTTP_USER_AGENT} ^[0-9A-Za-z]+$ [OR]

"Or, you might be better off blocking them behaviourally - with a bad-bot script, for example."
- I don't want to block innocent visitors who don't know about our problems with overloaded sites.
- I block other visitors with the famous trap.pl

jdMorgan

9:18 pm on Feb 3, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Your code is for 15 or more letters/numbers. Correct?
RewriteCond %{HTTP_USER_AGENT} [0-9A-Za-z]{15,} [OR]

The regex matches 15 or more contiguous upper- or lower-case letters or numbers ONLY - no spaces, slashes, semicolons, or other characters. There are very few user-agents that will contain that many contiguous letters and/or numbers without a space, slash, or semicolon.
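The difference between the unanchored `{15,}` pattern and the anchored `^[0-9A-Za-z]+$` pattern asked about earlier in the thread can be checked offline. This is a minimal sketch in Python, assuming both patterns behave the same there as in a RewriteCond. The helper `longest_alnum_run` and the short test string "abc123" are made up for illustration.

```python
import re

# 15+ contiguous letters/digits, found anywhere in the string.
UNANCHORED_RUN = re.compile(r"[0-9A-Za-z]{15,}")

# The whole string must be letters/digits only, at any length of 1 or more.
ANCHORED_ANY_LENGTH = re.compile(r"^[0-9A-Za-z]+$")


def longest_alnum_run(agent):
    """Length of the longest contiguous run of letters/digits in the string."""
    runs = re.findall(r"[0-9A-Za-z]+", agent)
    return max((len(run) for run in runs), default=0)


if __name__ == "__main__":
    samples = [
        "httakcygqunncwmljki v5lp5pkcosyv",                    # from the first post
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",  # ordinary browser
        "abc123",                                              # hypothetical short agent
    ]
    for agent in samples:
        print(
            longest_alnum_run(agent),
            bool(UNANCHORED_RUN.search(agent)),
            bool(ANCHORED_ANY_LENGTH.search(agent)),
            agent,
        )
```

The anchored pattern misses the sample strings from the first post because they contain spaces, and it would match any short all-alphanumeric user-agent, so it is not simply a "longer" version of the `{15,}` test.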

Jim