htaccess to escape percent (%) from URL

asifroyal

11:37 am on Oct 22, 2010 (gmt 0)

10+ Year Member



Hi All,

A percent sign in a URL causes a 400 Bad Request error in the browser. I have a file on the server whose name contains a percent (%) sign.

Original File name:
204153_20090605_Aluminiumacetotartraat_DCB_oordruppels_1,2%.pdf


URL in the browser after clicking the download link:

http://www.example.com/204153_20090605_Aluminiumacetotartraat_DCB_oordruppels_1%2C2%25.pdf

This returns a 400 Bad Request error.

Please advise how we can fix this with an htaccess rule.

Thanks in advance.



The existing htaccess file contents are as follows:

# Turn on URL rewriting
RewriteEngine On

# Installation directory
RewriteBase /


# Protect hidden files from being viewed
<Files .*>
Order Deny,Allow
Deny From All
</Files>

# Protect application and system files from being viewed
RewriteRule ^(?:application|modules|system)\b.* index.php/$0 [L]

# Allow any files or directories that exist to be displayed directly
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Rewrite all other URLs to index.php/URL
RewriteRule .* index.php/$0 [PT]

jdMorgan

12:57 pm on Oct 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



400-Bad Request errors cannot be fixed in .htaccess, because these requests are rejected before the server executes any configuration files. The server is saying "This request is invalid, and I cannot process it at all."

You must correct the links on your pages (or correct the script that generates those links) to comply with the HTTP requirements for valid character-set use in URLs. See RFC 2396, Uniform Resource Identifiers (URI) [faqs.org].

You will not be able to use a literal "%" character in the filepath, because that character is reserved for escaping of all other characters in URLs. Because a URL may have to be multiply-escaped as it passes through various proxies and servers between the client and the end-server that will actually serve the request, the "%" character itself cannot be escaped.

For example, a space in a URL clicked on a page will be sent by the browser as %20, but may arrive at the server as %2520 (double-escaped), or even as %2525252525252520 if it passes through many proxies on the way.
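To make the arithmetic of that example concrete, here is a minimal Python sketch. Python and the helper name escape_repeatedly are illustrative assumptions, not something used in this thread. It only shows what repeated percent-encoding does to a string; it does not verify how real proxies behave.

```python
from urllib.parse import quote

def escape_repeatedly(text: str, passes: int) -> str:
    """Percent-encode `text` `passes` times, feeding each result back in."""
    for _ in range(passes):
        # safe="" means no character is exempt, so "%" itself becomes "%25"
        text = quote(text, safe="")
    return text

print(escape_repeatedly(" ", 1))  # %20
print(escape_repeatedly(" ", 2))  # %2520
print(escape_repeatedly(" ", 8))  # %2525252525252520
```

Each extra pass turns the leading "%" into "%25", so the string grows by one "25" per pass.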

The inventors of HTTP therefore had to choose whether %2520 means "doubly-encoded space" or "percent-sign, space." In order to allow escaping of any other characters, the former interpretation had to be chosen. So basically, you cannot send a percent sign in a URL using HTTP.
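The two readings of %2520 can be seen by decoding it one pass at a time. This is only a sketch in Python with a hypothetical helper, decode_passes; how many passes a given server or proxy applies is not something the code can tell you.

```python
from urllib.parse import unquote

def decode_passes(text: str, passes: int) -> list:
    """Return the intermediate result after each percent-decoding pass."""
    results = []
    for _ in range(passes):
        text = unquote(text)
        results.append(text)
    return results

# One pass leaves "%20", which reads as either a literal percent sign
# followed by "20" or a still-escaped space; a second pass yields a space.
print(decode_passes("%2520", 2))  # ['%20', ' ']
```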

As Webmasters, we are NOT free to use just any characters that we want in URLs.

Jim

asifroyal

1:15 pm on Oct 22, 2010 (gmt 0)

10+ Year Member



Thank you very much @jdMorgan.

Somebody suggested this:

Try the B flag to ensure the %25, unescaped to % by mod_rewrite, is re-escaped back to %25 when inserted back into the target path.

RewriteRule .* index.php/$0 [PT,B]


But it's not working either.

I don't have any control over the file names, as they are maintained by another application.

I simply read the file names from the directory using PHP and add each one as a link on a page.
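For readers following along, the link-generation step described here can be sketched roughly as below. This is Python rather than the PHP actually used, and the function name download_link and the base URL default are assumptions for illustration. PHP's rawurlencode() performs a comparable encoding. The sketch reproduces the href from the first post, so it only illustrates how that URL comes about and is not claimed to avoid the 400 response.

```python
from html import escape
from urllib.parse import quote

def download_link(filename: str, base_url: str = "http://www.example.com/") -> str:
    """Build an HTML anchor whose href percent-encodes the file name."""
    # safe="" encodes every reserved character, including "," and "%"
    href = base_url + quote(filename, safe="")
    # HTML-escape both the attribute value and the visible link text
    return '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(filename))

name = "204153_20090605_Aluminiumacetotartraat_DCB_oordruppels_1,2%.pdf"
print(download_link(name))
```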

jdMorgan

2:29 pm on Oct 22, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The problem is not limited to just Apache. It can occur anywhere in the network -- for example, any proxy between your visitors and your server will reject these requests. You have no control over this.

There is no way to fix this problem except to correct the URLs. As stated above, HTTP URLs *may not* contain literal "%" characters.

Jim