Most frustrating problem ever

         

ATWeb

11:44 am on Aug 29, 2008 (gmt 0)

10+ Year Member



Yesterday, I switched from a (100% working) PHP solution for handling URL rewriting to doing it in the Apache 2.2 config with mod_rewrite for optimization/purity reasons. (No, I will not go back to using PHP.)

Everything works fine except for one thing: I now get "Bad request" (error 400) spat back from Apache if I try to access the (purposely) invalid URL: http://www.example.com/article/123/This_is_an_invalid_URL_look_a_percentage_10%_and_I_know_it!

The URL http://www.example.com/article/123/This_is_a_valid_URL works and redirects as expected. Pretty much any character except for % seems to work.

First line after my RewriteEngine on in my vhost: "RewriteRule ^/article/([0-9]+)/.+$ /article/$1 [L,R=301]"
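To see what that pattern captures, here is a minimal sketch in Python rather than Apache config. It assumes Python's `re` module treats this particular pattern the same way mod_rewrite's regex engine does, and the `redirect_target` helper is hypothetical, for illustration only.

```python
import re

# Same pattern as the RewriteRule above (vhost context, so the leading slash is present).
ARTICLE_WITH_SLUG = re.compile(r"^/article/([0-9]+)/.+$")


def redirect_target(path):
    """Return the path the rule would redirect to, or None if the rule does not match."""
    match = ARTICLE_WITH_SLUG.match(path)
    if match is None:
        return None
    # Mirrors the substitution "/article/$1".
    return "/article/" + match.group(1)


print(redirect_target("/article/123/This_is_a_valid_URL"))
print(redirect_target("/article/123"))
```

The redirect target itself has no trailing slash and text, so it does not match the pattern again and the rule should not loop.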

Somebody please tell me what I am doing wrong. The reason for the URL to be invalid is that it is purely used for a redirect and to show a nice address to humans. Encoding the URL is not an option because that would defeat the entire purpose of this practice. Please don't question why I am doing this. I am way too tired at this point to explain this. Please try to help me with my problem at hand instead.

It cannot be a browser/protocol-side "hard stop" error, because as mentioned, it used to work earlier.

I have tried enabling rewrite logging, but nothing is logged there or in the normal error log. Even the access log only shows a "400" entry.

I am going crazy here and I suspect that the fix is as simple as adding some sort of [X] flag or something. Or some really obvious thing that I have overlooked.

Please help me. This is fatal to my business and it's already been broken for over 24 hours.

jdMorgan

1:07 pm on Aug 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The URL is invalid at the HTTP protocol level, because a literal % character *must* be encoded as %25 unless it is itself the start of an escape sequence, that is, unless it is followed by two hexadecimal digits -- see RFC 2396 [faqs.org].

It is not your RewriteRule rejecting the request; the server itself rejects it as a "Bad Request" when it validates the request against the HTTP protocol requirements, before mod_rewrite ever sees it.

In other words,
http://example.com/article/123/This_is_a_valid_URL_with_a_percentage_sign_%10_and_it_should_work!
-and-
http://example.com/article/123/This_is_a_valid_URL_with_encoded_percentage_sign_10%25_and_it_will_work!
-and-
http://example.com/article/123/This_is_a_violation_of_the_HTTP_protocol_and_will_fail_100%_of_the_time!
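The rule behind those three examples can be sketched in a few lines of Python. This only checks the percent escapes, not the rest of the URL syntax, and it is not Apache's actual validation code; the `has_valid_percent_escapes` name is made up for illustration.

```python
import re

# A "%" that is NOT followed by two hexadecimal digits is a malformed escape.
BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_valid_percent_escapes(url):
    """Return True if every "%" in the URL starts a two-hex-digit escape."""
    return BAD_PERCENT.search(url) is None


urls = [
    "http://example.com/article/123/This_is_a_valid_URL_with_a_percentage_sign_%10_and_it_should_work!",
    "http://example.com/article/123/This_is_a_valid_URL_with_encoded_percentage_sign_10%25_and_it_will_work!",
    "http://example.com/article/123/This_is_a_violation_of_the_HTTP_protocol_and_will_fail_100%_of_the_time!",
]
for url in urls:
    print(has_valid_percent_escapes(url), url)
```

Only the third URL fails the check, because its % is followed by "_o" rather than two hex digits.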

Jim

ATWeb

3:34 pm on Aug 29, 2008 (gmt 0)

10+ Year Member



So why did it work before?

Demaestro

3:44 pm on Aug 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not sure about the order of operations, but it could be because you were pre-parsing the URL in PHP before the request was sent to the server, so it was already handled before the server saw it.

ATWeb

4:24 pm on Aug 29, 2008 (gmt 0)

10+ Year Member



Stop everything! Delete this thread!

jdMorgan

4:31 pm on Aug 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> already handled

Right, it must have been hex-encoded, as required by the HTTP protocol. Any normal browser or "high-level" Web scripting language such as PHP will do this encoding, because it is a fundamental requirement for sending requests using HTTP.
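As a rough illustration of that encoding step, here is a minimal sketch using Python's standard `urllib.parse` module (in PHP, `rawurlencode()` plays a comparable role). The `slug` value is the title from the original post; using `safe=""` is a choice made for this example, not something the thread prescribes.

```python
from urllib.parse import quote, unquote

# The human-readable title, containing a bare "%" that is not legal in a URL as-is.
slug = "This_is_an_invalid_URL_look_a_percentage_10%_and_I_know_it!"

# Percent-encode everything outside the unreserved set; the bare "%" becomes "%25".
encoded = quote(slug, safe="")

print("/article/123/" + encoded)

# Decoding restores the original title, so nothing is lost by encoding the link.
print(unquote(encoded) == slug)
```

A link built this way still shows the title in most browsers' address bars in readable form, while the request that reaches the server is well-formed.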

Jim

jdMorgan

4:32 pm on Aug 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Rather than deleting this thread (which we almost never do), how about sharing your findings with other readers who may have the same problem later? :)

Jim

ATWeb

5:39 pm on Aug 29, 2008 (gmt 0)

10+ Year Member



OK. Apparently, it always gave the "Bad request" error on a bare %. I reverted to the old PHP solution after all and tested it: same result.