The excellent mod_rewrite rule set described in the library thread
"Guide to fixing duplicate content & URL issues" [webmasterworld.com] seems to incorrectly rewrite URLs that contain the percent symbol.
For example, if a visitor requests an existing page called "Document%20Index.htm" and there is a reason to rewrite the URL (such as a missing www. subdomain), the URL is rewritten to one that does not exist, i.e. "Document%2520Index.htm" (note the newly added string "25").
In some cases this even leads to a 301-recursion that ends with a Segmentation Fault (not sure if it is exploitable).
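The double escaping can be reproduced outside Apache. The following is only a rough Python sketch, not mod_rewrite's actual code: the helper name escape_once is made up for illustration, and it assumes the escaping step behaves roughly like urllib.parse.quote, turning each literal "%" into "%25".

```python
from urllib.parse import quote


def escape_once(path: str) -> str:
    """Hypothetical stand-in for the escaping applied to the redirect target.

    Assumption: every literal '%' is encoded as '%25', as urllib's quote does.
    """
    return quote(path, safe="/")


# The path arrives already percent-encoded (the space is "%20").
requested = "/Document%20Index.htm"

# Escaping it again encodes the "%" itself, giving "%2520".
redirected = escape_once(requested)
print(redirected)

# If that redirect is rewritten once more, the string keeps growing,
# which is consistent with the recursion mentioned above.
print(escape_once(redirected))
```

Each pass adds another "25" after the percent sign, so the redirect target never matches the file that actually exists.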
The fix appears to be quite simple:
Replace the line:
RewriteRule .? http://www.example.com%{ENV:myURI}%{ENV:myQS} [R=301,L]
with the following:
RewriteRule .? http://www.example.com%{ENV:myURI}%{ENV:myQS} [R=301,L,NE]
The only difference is the 'noescape' flag (NE), which prevents the percent symbol from being escaped again (to "%25") in the redirect target.
However, I'm not sure whether this fix breaks something else. What do you think?