Forum Moderators: phranque

Message Too Old, No Replies

RewriteCond trying to block suspicious queries

         

csdude55

8:00 am on Nov 2, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've been trying to up my security a bit, and one of the things I'd stumbled across was adding this to my .htaccess:

RewriteCond %{QUERY_STRING} (?:[;<>'")]|%(?:0A|0D|22|27|3C|3E|00)).*(?:/\*|union|select|insert|drop|delete|update|cast|create|char|convert|alter|declare|order|script|set|md5|benchmark|encode) [NC]
RewriteRule ^ - [F]

# adding in breaks in an attempt to make the condition more readable
(?:
[;<>'")] |
%(?:0A|0D|22|27|3C|3E|00)
)
.*
(?:
/\* |
union |
select |
insert |
drop |
delete |
update |
cast |
create |
char |
convert |
alter |
declare |
order |
script |
set |
md5 |
benchmark |
encode
)


To be clear, I already have a section in my .htaccess that blocks suspicious things, but this was one that another site recommended adding to that pre-existing list.

I'm not sure that it does exactly as intended, though. I'm not entirely sure what it's trying to prevent, but the way that it's written blocks a query string like:

/blah.php?var1=js(foo)&var2=selectbar

That's a perfectly reasonable string, though, so the RewriteCond regex can't be right.

I suspect that it should be [^&]* instead of .*, but since I really don't get what it's trying to block in the first place, I'm not entirely sure.

Any suggestions?

lucy24

4:04 pm on Nov 2, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The non-final .* was the first thing that jumped out at me, as it makes the whole thing horribly inefficient. Yes, at a minimum it should be [^&]* so it's only looking at one parameter at a time. In fact it should be [^&=]* since I assume you're only concerned with the parameter name, not its value. More narrowly
(?:^|&;)[^=&]*{first-part-here}[^=&]*{second-part-here}

Does your site use queries at all in visible URLs? I'm assuming you do, or you could simply block all queries at the gate. Even so, most sites don't use a huge array of parameters, so you might be able to block all the non-authorized ones.
the way that it's written blocks a query string like
Do you mean that it blocks the query although you don't want it to, or that it's intended to block this query but doesn't? Either way, the rule is looking at the value of the parameter when you presumably only want it to look at its name.

Generic recommendation: As so often, it may work better to start by figuring out exactly what you want to do, and then make a rule to achieve that result, instead of starting with someone else's rule and reverse-engineering it.


[edited by: not2easy at 2:58 am (utc) on Nov 3, 2020]
[edit reason] typo correction requested [/edit]

csdude55

8:02 pm on Nov 2, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You're right, I do often use GET params. I was hoping to eventually use this server-wide, though, at /etc/apache2/conf.d/userdata/security.conf, so I'm trying to block as many hack attempts as I can. Then I can set up site-specific .conf files to restrict to pre-approved params.

I realize that I get a ton of traffic from bots trying to find a backdoor, so anything I can do to block that and save some resources will buy me some time before having to upgrade servers, too. And since I actually DID get hacked a few weeks ago, I'm now resorting to seeing what others have done in an attempt to stay ahead of it.

Do you mean that it blocks the query although you don't want it to, or that it's intended to block this query but doesn't?

It was blocking the query that I expected to be allowed.

Am I understanding correctly that I shouldn't be concerned if ANY of that is in the value, only in the name of the param? Meaning, this should be a concern:

?foo;bar+select...

but this should not be a concern:

?foo=bar;select...

I have 6 more conditions that test &{ QUERY_STRING}, but none of them are restricted to param names. False positives ARE a concern, so if they can all be tightened up to only look at names then that would help limit those false positives.

lucy24

1:51 am on Nov 3, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Pattern:
(?:[;<>'")]|%(?:0A|0D|22|27|3C|3E|00)).*(?:/\*|union|select|insert|drop|delete|update|cast|create|char|convert|alter|declare|order|script|set|md5|benchmark|encode)

i.e.
-- anything in the set [;<>'")] (which includes the close-parenthesis character) either as-is or in percent-encoded form (at least I assume that's the intention, though the lists aren't identical)
followed eventually by
-- one of a long list of strings, including the non-alphanumeric sequence “/*” (slash, asterisk).

Test string:
/blah.php?var1=js(foo)&var2=selectbar 

matches the pattern, because the close-parenthesis of (foo) is eventually followed by the string “select”.

That’s why, at a minimum, you should constrain the pattern to the same parameter. If the pattern is changed to
(?:[;<>'")]|%(?:0A|0D|22|27|3C|3E|00))[^&]*(?:/\*|union|select|insert|drop|delete|update|cast|create|char|convert|alter|declare|order|script|set|md5|benchmark|encode)
it will no longer match, because “js(foo)” and “selectbar” are values of different parameters.

To constrain any pattern to parameter name only, not value, make it look like this (note that there was a mistake in my previous post, which I’ve asked moderators to fix)
(?:^|&)[^&=]*{first-part-here}[^&=]*{second-part-here}
Yes, that illustrates two entirely different uses of ^ in RegEx: “beginning of string” and “group other than”.

The ?: wherever they occur aren't functionally necessary, but may save your server a pico-thingy of work, and could become relevant if you needed to capture.

csdude55

4:45 am on Nov 3, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



at least I assume that's the intention, though the lists aren't identical

I didn't notice that myself, but now that you've mentioned it... I have no idea what %0A, %0D, or %00 are.

If the plan was to list them as percent-encoded then the proper list would have been:

%(?:3B|3C|3E|27|22|29)

Or this, which is 4 bytes shorter but harder to read:

%(?:3[BCE]|2[279])

The ?: wherever they occur aren't functionally necessary, but may save your server a pico-thingy of work, and could become relevant if you needed to capture.

Yup, you taught me that one a couple of months ago and I've been using it ever since when a capture isn't needed. I pretend that it makes things run faster, but I honestly don't know... microseconds add up, though, so it's worth the keystrokes :-)