Using mod_rewrite to deny a specific QUERY_STRING?

         

spinnercee

12:40 am on Nov 21, 2006 (gmt 0)

I'm trying to deny certain query strings before they reach my PHP scripts, so that the script doesn't have to be parsed and executed just to send the visitor away with a 403:

RewriteEngine On

The strings I want to catch are h=string1 and p=string2 within the QUERY_STRING sent via HTTP GET. The URL I want to catch looks like:

www.example.com/php/script.php?h=string1&p=string2&...

I tried both of the following (but not at the same time):

(1) RewriteCond %{QUERY_STRING} h=string1&p=string2 [NC]

(2) RewriteCond %{REQUEST_URI} ^h=string1&p=string2 [NC]

I use the Forbidden [F] flag. I don't think what is rewritten matters if I'm just sending a 403, does it?

RewriteRule ^/$ - [F,L]

I'm trying to apply this in the server context, and not in an .htaccess, if that matters.

I'm thinking my regex matching is messed up somewhere?

jdMorgan

1:14 am on Nov 21, 2006 (gmt 0)

This should work in httpd.conf or conf.d:

RewriteCond %{QUERY_STRING} ^h=string1&p=string2$ [NC]
RewriteRule ^/php/script\.php$ - [F]

If "string1" or "string2" contains anything but letters and digits, then be careful to escape any characters that have special meanings to regex, especially quantifiers like ?, +, and *.

You must restart your server for the new rules to be compiled.

Flush your browser cache before testing any changes to your config files.
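For comparison, here is a sketch of the same rule in both contexts. It is only an illustration, and it assumes the per-directory version lives in an .htaccess file in the document root. The original attempt most likely never fired because the pattern ^/$ matches only a request for "/" itself, so a request for /php/script.php never reached the [F]. Also note that [F] implies [L], so the extra L flag is not needed.

```apache
# Server context (httpd.conf or inside a <VirtualHost>):
# the URL-path seen by RewriteRule keeps its leading slash.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^h=string1&p=string2$ [NC]
RewriteRule ^/php/script\.php$ - [F]

# Per-directory context (assumed: an .htaccess file in the document root):
# the directory prefix, including the leading slash, is stripped before matching.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^h=string1&p=string2$ [NC]
RewriteRule ^php/script\.php$ - [F]
```

Use one block or the other, not both; the only difference is the leading slash in the RewriteRule pattern.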

Jim

spinnercee

2:56 am on Nov 21, 2006 (gmt 0)

You're the man JD~! Perfect.

Some questions:

If h=www.example.com, would the dots have to be escaped in the RewriteCond like this: h=www\.example\.com?

Also, I assume the "$" anchors the end of line, and the "^" the beginning? So how could I catch the 2 specific [h|p]=string[1|2] pairs regardless of where they appear within the middle of the query string? i.e.:

.../script.php?a=string0&h=string1&g=stringG&p=string2&...n=stringN? Would this work:

RewriteCond %{QUERY_STRING} *.h=string1.* [NC]
RewriteCond %{QUERY_STRING} *.p=string2.* [NC]

Also, I assume I can check for the alternate leading "?" or "&" with [\?&]? i.e.: *.[\?&]p=string2.*

* multiple RewriteCond statements are implicitly ANDed?

To be more specific, I am getting the query_string targets from the HTTPd log, so they almost always appear in the same order and in the same way but it's possible that they could arrive arranged differently and in different places in the URL.

On a side note, is this a good use of the rewrite engine? The method to this madness is to stop certain known PHP script injections before they ever reach the PHP engine. This in part involves matching hostnames and port numbers that are submitted as script options.

jdMorgan

4:36 am on Nov 21, 2006 (gmt 0)

If h=www.example.com, would the dots have to be escaped in the RewriteCond like this: h=www\.example\.com?

Yes. Otherwise, an unescaped dot matches any single character in that position, so a value such as wwwXexampleYcom would be accepted as a match too.
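As a small sketch of the escaped form, reusing the script path from earlier in this thread and keeping p=string2 as a placeholder:

```apache
# Escaped dots: only the literal host name www.example.com matches.
RewriteCond %{QUERY_STRING} ^h=www\.example\.com&p=string2$ [NC]
RewriteRule ^/php/script\.php$ - [F]

# Unescaped, the pattern h=www.example.com would also match
# values such as h=wwwXexampleYcom, because "." matches any character.
```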

also, I assume the "$" anchors the end of line, and the "^" the beginning?

Yes. Standard regular-expressions start- and end-anchors -- See the citations in our forum charter for more info.

so how could I catch the 2 specific [h|p]=string[1|2] pairs regardless of where they appear within the middle of the query string? i.e.:

.../script.php?a=string0&h=string1&g=stringG&p=string2&...n=stringN? Would this work:

RewriteCond %{QUERY_STRING} *.h=string1.* [NC]
RewriteCond %{QUERY_STRING} *.p=string2.* [NC]

also, I assume I can check for the alternate leading "?" or "&" with [\?&]? ie: *.[\?&]p=string2.*

By being tricky... :)

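First, the "any characters" sequence is written ".*" (the quantifier follows the dot), not "*." as in your examples; a pattern that starts with a bare "*" has nothing to repeat. More to the point, forget leading and trailing ".*" sequences -- waste of characters and CPU. Leaving them off is fine, because an unanchored pattern can match anywhere in the string. That is: ^.*this.*$ == .*this.* == this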

The "?" never appears in the %{QUERY_STRING}, in the %{REQUEST_URI}, or in the URL-path examined by RewriteRule. It is a demarcation character between %{REQUEST_URI} and the %{QUERY_STRING}, and is included in neither.
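To make that split concrete, here is a sketch of what each variable holds for the example request from the first post, assuming server context. It also suggests why a test of %{REQUEST_URI} against h=string1... could not match: the query string is simply not in that variable.

```apache
# Request: GET /php/script.php?h=string1&p=string2
#   RewriteRule pattern is matched against : /php/script.php
#   %{REQUEST_URI}                         : /php/script.php
#   %{QUERY_STRING}                        : h=string1&p=string2
# The "?" itself is in none of them, so query-string tests
# belong in a RewriteCond on %{QUERY_STRING}:
RewriteCond %{QUERY_STRING} ^h=string1&p=string2 [NC]
RewriteRule ^/php/script\.php$ - [F]
```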

So, to be tricky, either use two RewriteConds:

RewriteCond %{QUERY_STRING} (^|&)h=string1(&|$) [NC]
RewriteCond %{QUERY_STRING} (^|&)p=string2(&|$) [NC]

or combine them with an in-line OR, with a bit of duplication to cover either order:

RewriteCond %{QUERY_STRING} (^|&)(h=string1&([^&]*&)*p=string2|p=string2&([^&]*&)*h=string1)(&|$) [NC]

or even this, which is also insensitive to the order of the name/value pairs:

RewriteCond %{QUERY_STRING}&%{QUERY_STRING} (^|&)h=string1&([^&]*&)*p=string2(&|$) [NC]

The (^|&) and (&|$) groups bound the parameters: whatever precedes "h=" or "p=" must be an ampersand or the start of the string, and whatever follows "string1" or "string2" must be an ampersand or the end of the string. This assures an exact match on the individual name/value pairs, preventing an incorrect match if you had another similar n/v pair, say, fish=string1 or h=string12, in the query. (A bare "&?" would not do this: the ampersand is optional, so in an unanchored pattern it constrains nothing.) The "([^&]*&)*" sequence, loosely translated, means, "Match zero or more characters that are not an ampersand, followed by an ampersand, and repeat that as many times as you like, including zero." In the last example, the query string is tested against two copies of itself joined by an ampersand, so a request that sends "p" before "h" still shows "h ... p" across the join.
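A RewriteCond does nothing by itself; it only qualifies the RewriteRule that follows it. Here is a sketch of a complete rule set in server context, using (^|&) and (&|$) to bound each name/value pair. The sample query strings in the comments are illustrations only.

```apache
RewriteEngine On
# Both conditions must match (implicit AND), in any order and position.
RewriteCond %{QUERY_STRING} (^|&)h=string1(&|$) [NC]
RewriteCond %{QUERY_STRING} (^|&)p=string2(&|$) [NC]
RewriteRule ^/php/script\.php$ - [F]

# Forbidden:
#   h=string1&p=string2
#   a=string0&p=string2&g=stringG&h=string1
# Passed through to the script:
#   h=string1                 (no p=string2 pair)
#   fish=string1&p=string2    ("h=" is not at a pair boundary)
#   h=string12&p=string2      (value is longer than "string1")
```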

* multiple RewriteCond statements are implicitly ANDed?

Yes. Unless explicitly [OR]ed. Note also the NOT operator "!", with which you can often apply DeMorgan's theorem
A+B == !(!A*!B)
A*B == !(!A+!B)
when convenient. However, the NOT operator is part of mod_rewrite, and not part of regular expressions; Therefore, it can only appear once in the RewriteCond or RewriteRule pattern, preceding the regular expression.
(If not clear in the two equivalences shown above, "*" is logical AND, "+" is logical OR)
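As a sketch of the two operators in use, again assuming server context and the script path from earlier in the thread, here are two independent alternatives:

```apache
# Alternative 1: explicit OR. Forbid if EITHER pair is present.
RewriteCond %{QUERY_STRING} (^|&)h=string1(&|$) [NC,OR]
RewriteCond %{QUERY_STRING} (^|&)p=string2(&|$) [NC]
RewriteRule ^/php/script\.php$ - [F]

# Alternative 2: NOT. Forbid if h=string1 is present AND p=string2 is absent.
# The "!" belongs to mod_rewrite and goes in front of the whole pattern.
RewriteCond %{QUERY_STRING} (^|&)h=string1(&|$) [NC]
RewriteCond %{QUERY_STRING} !(^|&)p=string2(&|$) [NC]
RewriteRule ^/php/script\.php$ - [F]
```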

On a side note, is this a good use of the rewrite engine? The method to this madness is to stop certain known PHP script injections before they ever reach the PHP engine. This in part involves matching hostnames and port numbers that are submitted as script options.

As long as you're not having to jump through hoops in mod_rewrite to do what you need to do, it's fine. You can view mod_rewrite as a tiny script interpreter; it's fast because the number of supported directives, variables, and argument types is tiny compared to PHP, and because the mod_rewrite executable is also tiny compared to PHP. The regular-expressions library may well be the same code in both cases: Apache 2.x uses PCRE, as do PHP's preg functions.

I don't get drawn into the Apache-versus-script argument easily. It's too hard to predict performance issues that might cause one to be more efficient than the other, and ultimately, computers are supposed to work for us, not the other way around. So one could easily argue 'comfort factor' if one was more familiar/proficient with one over the other.

You can even combine mod_rewrite with scripting using RewriteMap -- An option you should investigate if your list of hostnames/ports is greater than 12 but less than 100 entries long. Those aren't hard numbers, just based on maintainability versus initial complexity -- and my personal style, really. RewriteMap lets you pass a URL and/or associated HTTP and request variables to a script, and have that script pass back a URL and/or response status info -- all before invoking the content handler. It also lets you use text or hash table lookups for tabular data. That might be useful here if you have long lists.
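Here is a minimal sketch of the text-lookup variant. The map name blockedhosts, the file path, and the values "deny" and "ok" are all hypothetical, chosen only for illustration. RewriteMap itself must be declared in the server or virtual-host configuration, not in .htaccess. The lookup here is on the raw, undecoded value exactly as it appears in the query string, so URL-encoded or differently-cased variants would need their own entries or extra handling.

```apache
# Hypothetical map file /etc/apache2/blocked-hosts.txt, one "key value" per line:
#   www.example.com   deny
#   evil.example.net  deny

RewriteEngine On
RewriteMap blockedhosts txt:/etc/apache2/blocked-hosts.txt

# Capture the value of the h parameter; it becomes %2 for the next condition.
RewriteCond %{QUERY_STRING} (^|&)h=([^&]+)
# Look the value up; unknown hosts fall back to the default "ok".
RewriteCond ${blockedhosts:%2|ok} =deny
RewriteRule ^/php/script\.php$ - [F]
```

Adding or removing a host then means editing the map file rather than writing another RewriteCond.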

Jim