Mod rewrite redirects in .htaccess file

Mod_rewrite redirects to remove newline characters

         

csherwood1234

4:44 pm on Nov 1, 2010 (gmt 0)

10+ Year Member



OK. I'm new to Mod_rewrite but I'm really trying here. I've spent the last week reading the documentation, reviewing code, and looking at literally 50 different sites on the net for examples. I'm just not getting it.

Firstly, I wanted to redirect all requests on our server to the www equivalent of the request if the www is not present. Yes, I found lots of examples and at least a dozen variations in those examples. They all *seem* to work but there must be a right way to do this. Are any of these even close?

# Rewrite all requests to include the www
RewriteCond %{HTTP_HOST} ^mysite.com [NC]
RewriteRule (.*) [mysite.com...] [R=301,L]

RewriteCond %{HTTP_HOST} ^\.mysite\.com [NC]
RewriteRule ^(.*)$ [mysite.org...] [R=301,NC,L]

RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Also, forgive my ignorance, but I notice that these rules will do the redirection AND then continue on and execute the rest of my rules. That's what I want, but honestly I don't understand WHY that happens, as I thought the [L] flag is supposed to make that the last rule. Can someone enlighten me?

Secondly, I started all of this trying to remove newline characters and carriage returns from requests. McAfee is scanning our site for PCI compliance and we're failing due to "HTTP Response Splitting".

I tried the following (and a hundred variations thereof) with no success:

# Block out carriage return and new line characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR]
# Block out carriage return and new line characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(%0A|%0D).* [NC]
RewriteRule ^(.*)$ [mysite.com...] [R=301,L]

This is the URL their form is submitting:
[mysite.com...]

I've also noticed that the order that I'm listing the rules has different effects on the result, but again, I can't seem to get it right.

This is my entire .htaccess file:

Options +FollowSymLinks
RewriteEngine On

# Rewrite all requests to include the www
RewriteCond %{HTTP_HOST} ^mysite.com [NC]
RewriteRule (.*) [mysite.com...] [R=301,L]

# Block out use of illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR]
# Block out use of New line characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(%0A|%0D).* [NC]
RewriteRule ^(.*)$ [mysite.com...] [R=301,L]

#Serve up the static widget page when the dynamic page is requested - only to deal with sites still linking to our dynamic page
RewriteRule ^cat--Special-Widgets--WIDGETS [mysite.com...] [R=301,L]

#Rewrite dynamic URLs for SEO
RewriteRule ^smallwidgets.htm /cgi-bin/ccp51/cp-app.cgi?seo=cat--Small-Widgets--SMALLWIDGETS [L]
RewriteRule ^bigwidgets.htm /cgi-bin/ccp51/cp-app.cgi?seo=page--Big-Widgets--BIGWIDGETS [L]
RewriteRule ^shop_by_price--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=shop_by_price--$1
RewriteRule ^cat--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=cat--$1
RewriteRule ^item--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=item--$1
RewriteRule ^page--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=page--$1
RewriteRule ^store /cgi-bin/ccp51/cp-app.cgi?pg=store
RewriteRule ^index$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_index_list
RewriteRule ^az-(.)-(.)$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_index_az&startltr=$1&endltr=$2
RewriteRule ^sitemap.xml$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_sitemap_proc


Any help would be greatly appreciated. I've burned an entire week on this already, crashed our CMS a hundred times, and feel like I'm still no closer to figuring this out.

g1smd

9:06 pm on Nov 1, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Add [L] to EVERY RewriteRule.

Place all external redirects before all internal rewrites.

Try this for the canonical domain redirect:

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


The [L] flag makes this the last matching rule of this attempt, but a redirect ends the current HTTP transaction and tells the browser to make a new request for a new URL in a new HTTP transaction. The server does not "remember" the previous request; mod_rewrite processing starts again for the new request.
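To see how the host patterns posted so far behave, here is a minimal sketch in Python. It assumes Python's re module is a close enough stand-in for the regex engine mod_rewrite uses for these simple patterns. The helper names and sample hostnames are hypothetical, and example.com stands in for the real domain throughout.

```python
import re

# Patterns from the thread, rewritten for example.com (illustrative only).
UNESCAPED = re.compile(r"^example.com", re.IGNORECASE)        # like ^mysite.com [NC]
LEADING_DOT = re.compile(r"^\.example\.com", re.IGNORECASE)   # like ^\.mysite\.com [NC]
EXACT_BARE = re.compile(r"^example\.com$", re.IGNORECASE)     # like ^mysite\.com$ [NC]
CANONICAL = re.compile(r"^(www\.example\.com)?$")             # used negated, no [NC]


def fires(pattern, host):
    """A plain RewriteCond is true when the pattern matches the Host header."""
    return pattern.search(host) is not None


def fires_negated(pattern, host):
    """A RewriteCond with a leading ! is true when the pattern does NOT match."""
    return pattern.search(host) is None


if __name__ == "__main__":
    # Hypothetical Host header values, including an empty one (HTTP/1.0 style).
    for host in ["example.com", "www.example.com", "examplexcom.test", "WWW.EXAMPLE.COM", ""]:
        print(
            repr(host),
            "unescaped:", fires(UNESCAPED, host),
            "leading-dot:", fires(LEADING_DOT, host),
            "exact-bare:", fires(EXACT_BARE, host),
            "negated-canonical:", fires_negated(CANONICAL, host),
        )
```

In this sketch the unescaped dot lets an unrelated hostname through, the leading-dot pattern never fires for any of the sample hosts, and the negated pattern redirects everything except the exact canonical hostname and an empty Host header.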

csherwood1234

9:38 pm on Nov 1, 2010 (gmt 0)

10+ Year Member



OK. So that gives me this as my .htaccess file:

Options +FollowSymLinks
RewriteEngine On

# Rewrite all requests to include the www
RewriteCond %{HTTP_HOST} !^(www\.mysite\.com)?$
RewriteRule (.*) [mysite.com...] [R=301,L]


# Block out use of illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR]
# Block out use of New line characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(%0A|%0D).* [NC]
RewriteRule ^(.*)$ [mysite.com...] [R=301,L]

#Serve up the static widget page when the dynamic page is requested - only to deal with sites still linking to our dynamic page
RewriteRule ^cat--Special-Widgets--WIDGETS [mysite.com...] [R=301,L]

#Rewrite dynamic URLs for SEO
RewriteRule ^smallwidgets.htm /cgi-bin/ccp51/cp-app.cgi?seo=cat--Small-Widgets--SMALLWIDGETS [L]
RewriteRule ^bigwidgets.htm /cgi-bin/ccp51/cp-app.cgi?seo=page--Big-Widgets--BIGWIDGETS [L]
RewriteRule ^shop_by_price--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=shop_by_price--$1 [L]
RewriteRule ^cat--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=cat--$1 [L]
RewriteRule ^item--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=item--$1 [L]
RewriteRule ^page--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=page--$1 [L]
RewriteRule ^store /cgi-bin/ccp51/cp-app.cgi?pg=store [L]
RewriteRule ^index$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_index_list [L]
RewriteRule ^az-(.)-(.)$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_index_az&startltr=$1&endltr=$2 [L]
RewriteRule ^sitemap.xml$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_sitemap_proc [L]

Can anyone offer any suggestions regarding the right way to remove the newline and carriage return characters? I would think that this code would be useful on everybody's server - unless of course this is a bad approach?

g1smd

9:03 am on Nov 2, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One thing about rule order: make the main domain canonicalisation rule the last rule of the redirects. If you don't do this, you'll get an unwanted double redirect, or redirection chain, for some requests.
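The double redirect can be illustrated with a small simulation, sketched here in Python under some loud assumptions: the two rule functions only imitate the canonical-host redirect and the Special-Widgets redirect from the .htaccess above, STATIC_PAGE is a hypothetical target (the real one is elided in the thread), and example.com stands in for the real domain.

```python
import re
from urllib.parse import urlsplit

# Hypothetical redirect target; the thread does not show the real URL.
STATIC_PAGE = "http://www.example.com/special-widgets.htm"


def canonical_rule(host, path):
    """Imitates: RewriteCond %{HTTP_HOST} !^(www\\.example\\.com)?$ + 301 to www."""
    if re.search(r"^(www\.example\.com)?$", host) is None:
        return "http://www.example.com/" + path
    return None


def widget_rule(host, path):
    """Imitates: RewriteRule ^cat--Special-Widgets--WIDGETS <static page> [R=301,L]."""
    if re.search(r"^cat--Special-Widgets--WIDGETS", path):
        return STATIC_PAGE
    return None


def follow(url, rules, limit=10):
    """Apply the rules in order, once per request, until no redirect fires.

    Returns (number_of_redirects, final_url). Each redirect starts a new
    request, so rule processing restarts from the top every time.
    """
    hops = 0
    while hops < limit:
        parts = urlsplit(url)
        host = parts.netloc
        path = parts.path.lstrip("/")  # .htaccess rules see no leading slash
        for rule in rules:
            target = rule(host, path)
            if target is not None:
                url = target
                hops += 1
                break
        else:
            return hops, url
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    start = "http://example.com/cat--Special-Widgets--WIDGETS"
    print("canonical first:", follow(start, [canonical_rule, widget_rule]))
    print("canonical last: ", follow(start, [widget_rule, canonical_rule]))
```

With the canonical rule first, the non-www request is sent to the www hostname and only then to the static page. With the canonical rule last, the specific redirect already points at the www hostname, so a single hop is enough.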

csherwood1234

1:41 pm on Nov 2, 2010 (gmt 0)

10+ Year Member



Thanks, g1smd. Maybe I'm getting closer, but my redirect for stripping newline and carriage return characters isn't working. I've tried so many variations now that I can't even really remember how the results changed . . . but it never did what I wanted it to do.

I think the rule condition isn't firing because there are subdirectories in the test URL - as I've been able to get it to redirect a few times when I removed the subdirectories and/or the "?" from the test URL. I just can't figure out the right pattern to match.

This is the test URL again:
[mysite.com...]

Essentially, all I want to do there is to redirect to my homepage anytime a request contains a newline or carriage return character. Any ideas?


My .htaccess file now:

Options +FollowSymLinks
RewriteEngine On

# Block out use of illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^.*(\\r|\\n|%0A|%0D).* [NC,OR]
# Block out use of New line characters in the Query String variable
RewriteCond %{QUERY_STRING} ^.*(%0A|%0D).* [NC]
RewriteRule ^(.*)$ [mysite.com...] [R=301,L]

#Serve up the static widget page when the dynamic page is requested - only to deal with sites still linking to our dynamic page
RewriteRule ^cat--Special-Widgets--WIDGETS [mysite.com...] [R=301,L]

# Rewrite all requests to include the www
RewriteCond %{HTTP_HOST} !^(www\.mysite\.com)?$
RewriteRule (.*) [mysite.com...] [R=301,L]

#Rewrite dynamic URLs for SEO
RewriteRule ^smallwidgets.htm /cgi-bin/ccp51/cp-app.cgi?seo=cat--Small-Widgets--SMALLWIDGETS [L]
RewriteRule ^bigwidgets.htm /cgi-bin/ccp51/cp-app.cgi?seo=page--Big-Widgets--BIGWIDGETS [L]
RewriteRule ^shop_by_price--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=shop_by_price--$1 [L]
RewriteRule ^cat--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=cat--$1 [L]
RewriteRule ^item--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=item--$1 [L]
RewriteRule ^page--(.*) /cgi-bin/ccp51/cp-app.cgi?seo=page--$1 [L]
RewriteRule ^store /cgi-bin/ccp51/cp-app.cgi?pg=store [L]
RewriteRule ^index$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_index_list [L]
RewriteRule ^az-(.)-(.)$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_index_az&startltr=$1&endltr=$2 [L]
RewriteRule ^sitemap.xml$ /cgi-bin/ccp51/cp-app.cgi?pg=ste_sitemap_proc [L]

csherwood1234

7:55 am on Nov 3, 2010 (gmt 0)

10+ Year Member



So, after experimenting and testing all day again today, I've discovered that my rules for removing newline and carriage return characters all work just fine. In fact, any one of them will do the trick . . . unless there happens to be an ampersand in the URL! Apparently, Mod_rewrite's RewriteCond stops scanning for matches when it encounters an ampersand?!?

So the following rule will work on requests that contain newline characters, but DON'T INCLUDE ANY AMPERSANDS BEFORE AT LEAST 1 NEWLINE CHARACTER SET IS MATCHED!

# Block out use of illegal or unsafe characters in the HTTP Request
RewriteCond %{QUERY_STRING} ^.*(\\r|\\n|%0A|%0D).* [NC]
RewriteRule ^(.*)$ [mysite.com...] [R=301,L]

Unfortunately, the test strings submitted by our PCI compliance certification company all have at least 1 ampersand in the request, which occurs before any of the newline characters - consequently, the RewriteRule never gets triggered.

Here's an example of the URL they're submitting:
[mysite.com...]

Any brilliant ideas? I think I've got about 100 hours in this since last week. :(
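One way to narrow this down is to test the regular expression on its own, outside Apache. The sketch below does that in Python, assuming Python's re module behaves like mod_rewrite's regex engine for this pattern. The query strings are hypothetical, since the real test URL is elided above. It checks only the pattern, so it cannot say what the server actually did with the request.

```python
import re

# The query-string pattern from the post, with [NC] mapped to re.IGNORECASE.
# Inside this regex, \\r and \\n match a literal backslash followed by the
# letter r or n, not a real carriage return or line feed character.
PATTERN = re.compile(r"^.*(\\r|\\n|%0A|%0D).*", re.IGNORECASE)


def cond_matches(query_string):
    """True when the RewriteCond pattern would match this query string."""
    return PATTERN.search(query_string) is not None


if __name__ == "__main__":
    # Hypothetical query strings, not the ones the scanner really sent.
    samples = [
        "id=1%0D%0ASet-Cookie:x=1",            # encoded CRLF, no ampersand
        "id=1&lang=en%0D%0ASet-Cookie:x=1",    # ampersand before the encoded CRLF
        "id=1&lang=en",                        # ampersand, nothing to block
    ]
    for qs in samples:
        print(qs, "->", cond_matches(qs))
```

In this sketch an ampersand ahead of %0D or %0A does not stop the pattern from matching, which suggests that the cause lies somewhere other than the way the regular expression treats ampersands.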

jdMorgan

5:32 pm on Nov 29, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The most likely cause of this problem is that your RewriteCond pattern leaves unescaped some characters that must be escaped. Your rule also fails to remove the query string, so if %0A or %0D is present in the requested query string, it will be retained in the redirection URL. Try:

# Redirect to index page and remove query string if illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^%]*\%0[AD] [NC]
RewriteRule ^ http://www.mysite.com/? [R=301,L]

A simpler and more HTTP-compliant solution would be:

# Return 403-Forbidden response if illegal or unsafe characters in the HTTP Request
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^%]*\%0[AD] [NC]
RewriteRule ^ - [F]

Note that \\r and \\n probably won't appear in any HTTP request, and if they do, they are harmless unless one of your scripts will accept them and "un-escape" them.

Jim
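The THE_REQUEST pattern above can be exercised against sample request lines with a short Python sketch. It assumes Python's re module is a fair stand-in for mod_rewrite's regex engine here, and the request lines are hypothetical. LOOSER is an illustrative alternative that is not proposed anywhere in this thread. It is included only because, as written, [^%]* cannot step over an earlier percent sign such as the one in %20.

```python
import re

# Pattern from the post, with [NC] mapped to re.IGNORECASE.
# THE_REQUEST holds the raw request line, e.g. "GET /index.html HTTP/1.1".
POSTED = re.compile(r"^[A-Z]+\ /[^%]*\%0[AD]", re.IGNORECASE)

# Assumption for illustration only: match %0A or %0D anywhere in the line.
LOOSER = re.compile(r"%0[AD]", re.IGNORECASE)


def blocked(pattern, request_line):
    """True when the RewriteCond would match, so the rule would fire."""
    return pattern.search(request_line) is not None


if __name__ == "__main__":
    # Hypothetical request lines.
    lines = [
        "GET /page?a=1&b=2%0D%0Afoo HTTP/1.1",   # encoded CRLF after an ampersand
        "GET /page?a=1&b=2%0afoo HTTP/1.1",      # lower-case hex digits
        "GET /page?q=a%20b&x=%0Ay HTTP/1.1",     # %20 appears before %0A
        "GET /page?a=1&b=2 HTTP/1.1",            # nothing to block
    ]
    for line in lines:
        print(line, "| posted:", blocked(POSTED, line), "| looser:", blocked(LOOSER, line))
```

In this sketch the posted pattern catches an encoded CR or LF when it is the first percent-encoded sequence in the request line, ampersands included. A request with another percent-encoded character earlier in the line slips past it, while the looser variant still matches.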