
Forum Moderators: Ocean10000 & phranque


Strip invalid characters from query_String

11:58 pm on Dec 14, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Nov 25, 2003
votes: 0

I am trying to use mod_rewrite to strip invalid characters from the query string to prevent cross-site scripting. Here is what I have so far:

rewriterule "([a-z.A-Z0-9.,_=/-]*)([^a-zA-Z0-9.,_=/-])+(.*)" $1$3 [NE,N,E=AC_REWRITE:true]
RewriteCond %{ENV:AC_REWRITE} true
rewriterule (.*) http://%{HTTP_HOST}/$1 [R=permanent,L]

Unfortunately, it only strips invalid characters from the actual script name and not from the query string. The only legal characters that I wish to keep are alphanumerics, commas, periods, equals signs, and underscores (and maybe the ampersand; I'm trying to keep it simple for now). How can I have mod_rewrite filter the invalid characters out of the query string?

Wing Lian

5:19 pm on Dec 15, 2003 (gmt 0)

Senior Member

jdmorgan (WebmasterWorld Top Contributor of All Time, 10+ Year Member)

joined:Mar 31, 2002
votes: 0


To clarify, do you need to strip characters from the filename at all, or just the query string?

If you only need to process the query string, then something like this should work:

RewriteCond %{QUERY_STRING} ^(.*)([^a-zA-Z0-9.,_=/&-])+(.*)$
RewriteRule (.*) /$1?%1%3 [N]

Or, to simply forbid all cross-site scripting attempts:

RewriteCond %{QUERY_STRING} [^a-zA-Z0-9.,_=/&-]
RewriteRule . - [F]

The first block of code runs repeatedly (the [N] flag restarts mod_rewrite processing) until no more 'illegal' characters are found. Because each restart begins again at the first rule, this block needs to be among the very first lines of code in your file; otherwise any rules placed ahead of it are re-run on every pass. It also avoids an external redirect, since both restarts and external redirects make the code slow.
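To make the restart behaviour concrete, here is a rough Python simulation of that loop. It is only a sketch: Python's `re` module is not the regex engine Apache uses, the function name `strip_illegal` and the `max_passes` guard are made up for illustration, and the character class is written with the hyphen last so it is read as a literal. Under those assumptions, each pass drops the last remaining illegal character, so a query string with many bad characters needs many passes.

```python
import re

# Same shape as the RewriteCond pattern: prefix, illegal character(s), suffix.
# The hyphen sits at the end of the class so it is a literal, not a range.
ILLEGAL = re.compile(r"^(.*)([^a-zA-Z0-9.,_=/&-])+(.*)$")


def strip_illegal(query, max_passes=100):
    """Mimic the [N] restart loop: rewrite, then start over until clean.

    Returns the cleaned query string and the number of passes that
    actually changed it. max_passes is a hypothetical safety limit.
    """
    passes = 0
    while passes < max_passes:
        match = ILLEGAL.match(query)
        if match is None:
            break  # nothing illegal left, so the loop ends
        # Equivalent of the substitution "%1%3": keep prefix and suffix.
        query = match.group(1) + match.group(3)
        passes += 1
    return query, passes


if __name__ == "__main__":
    print(strip_illegal("a=<b>&c=1"))
```

Because the leading `(.*)` is greedy, the match lands on the last illegal character each time, which is why the pass count tracks the number of bad characters.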

The problem with it, as with the original version, is that it simply drops the illegal characters; there is no 'intelligence' to make sure the query string is still valid after the characters are dropped. So your script must perform that check itself.
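As a rough idea of what such a check might look like on the script side, here is a minimal Python sketch. Everything specific in it is an assumption: the thread does not say what language the script is written in, and the parameter names `id` and `page`, the function name `parse_query`, and the allowed value characters are hypothetical. The point is only that a stripped query string can be structurally damaged (for example, `id=5&page=2` with the ampersand removed becomes `id=5page=2`), so the script should reject anything that no longer parses cleanly.

```python
import re

# Hypothetical whitelist of parameter names the script understands.
ALLOWED_NAMES = {"id", "page"}

# Characters permitted inside a value; "=" and "&" are deliberately excluded.
VALUE_OK = re.compile(r"[a-zA-Z0-9.,_-]*")


def parse_query(query):
    """Return a dict of parameters, or None if the query string is malformed."""
    if query == "":
        return {}
    params = {}
    for field in query.split("&"):
        name, sep, value = field.partition("=")
        # Reject fields without "=", unknown names, and repeated names.
        if not sep or name not in ALLOWED_NAMES or name in params:
            return None
        # Reject values containing anything outside the permitted set.
        if VALUE_OK.fullmatch(value) is None:
            return None
        params[name] = value
    return params


if __name__ == "__main__":
    print(parse_query("id=5&page=2"))   # well-formed
    print(parse_query("id=5page=2"))    # damaged by stripping the ampersand
```

Rejecting the whole request on any mismatch is a design choice made here for simplicity; a real script might fall back to default values instead.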

The second version simply returns an immediate 403-Forbidden server response if any attempt is made to include illegal characters.