Trailing slash rewrite not quite right!

just a little bug somewhere..

         

BenSeb

11:41 pm on Dec 6, 2006 (gmt 0)

10+ Year Member



Hi

I have the code in place:

RewriteRule ^whats-on/(.*)/(.*)$ /whats-on/$1/$2/ [R]
RewriteRule ^whats-on/(.*)/(.*)/$ mytown/index.php?c=$2&d=0&m=0&y=0&radius=10 [L]

The second rule, which handles the actual rewriting of pages, works fine. The first line is meant to catch any URL like:

www.domain.com/whats-on/test/a1

and rewrite to

www.domain.com/whats-on/test/a1/

Then, obviously, the second rule would kick in. This is exactly as specified in the Apache docs. However, when I remove the trailing slash in the URL bar, it loops and tries to go to www.domain.com/whats-on/test/a1//

If I remove the leading / from /whats-on in the substitution:

RewriteRule ^whats-on/(.*)/(.*)$ whats-on/$1/$2/ [R]

it rewrites to

www.domain.com/home/vhosts/website/whats-on/test/a1/

Can anyone spot the error? Many thanks for your help

jdMorgan

12:15 am on Dec 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's a combination of two errors: Incorrect rule order (for those rules as-written) and the use of the greedy, promiscuous, and processing-intensive ".*" pattern.

A better approach for the example URLs you provided would be:


RewriteRule ^whats-on/([^/]+)/([^/]+)/$ mytown/index.php?c=$2&d=0&m=0&y=0&radius=10 [L]
RewriteRule ^whats-on/([^/]+)/([^/]+)$ http://www.example.com/whats-on/$1/$2/ [R=301,L]

If you also want to invoke these rules for additional subdirectory levels, and not just the exact one or two levels as originally shown in your example and implied by your rules, then the patterns will need to change a bit.

The pattern "[^/]+" means "match one or more characters not equal to a slash" or equivalently, "match one or more characters, up to, but not including, the next slash." As such, it is much more specific and efficient to process than the original ".*" pattern.
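To see why the original redirect rule keeps firing while the "[^/]+" version does not, here is a minimal Python sketch. It assumes Python's re module treats these simple patterns the same way Apache's regex engine does, and that the rules run in .htaccess context (no leading slash on the matched path). The helper name redirect_target is hypothetical, and it returns only the path part of the redirect, not the full http://www.example.com/ URL.

```python
import re

# Patterns copied from the thread (per-directory context: no leading slash).
ORIGINAL = r"^whats-on/(.*)/(.*)$"
REVISED = r"^whats-on/([^/]+)/([^/]+)$"


def redirect_target(pattern, path):
    """Return the path the redirect rule would send `path` to,
    or None if the rule does not match (hypothetical helper)."""
    m = re.match(pattern, path)
    if m is None:
        return None
    # Mirrors the substitution /whats-on/$1/$2/
    return "/whats-on/{}/{}/".format(m.group(1), m.group(2))


for path in ("whats-on/test/a1", "whats-on/test/a1/"):
    print(path)
    print("  original:", redirect_target(ORIGINAL, path))
    print("  revised: ", redirect_target(REVISED, path))
```

With the original pattern, the already-slashed path still matches (the second group is simply empty), so the redirect target gains another slash on every request. With "[^/]+", the slashed path does not match the redirect rule at all.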

Using the ".*" pattern twice in one rule forces the regular-expression matching routine into a doubly-nested back-off loop, where it first matches all characters into the first ".*", fails to find a match, backs off one character from the end, and tries again. This loop will repeat N times, where N is the length of the URL following the second slash plus one. In this case, with your example "/whats-on/test/a1" URL, this would be 3 iterations through the loop to find a match. If the URL-path after the second slash was 20 characters long, the original pattern would require 21 passes to find a match. The take-home lesson is to avoid "(.*)x(.*)" patterns whenever possible (which is almost always).
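The back-off count can be illustrated with a toy model. This is only a sketch of the greedy behaviour described here, not the real regex engine: the hypothetical function backoff_steps takes the part of the URL after "whats-on/" and counts how many characters a greedy first group has to give back before the literal slash that follows it can match.

```python
def backoff_steps(rest):
    """Toy model of a greedy first group: it starts by taking all of `rest`,
    then gives back one character at a time until the character just after
    the group is the '/' the pattern requires next."""
    end = len(rest)  # greedy: the group initially swallows everything
    steps = 0
    while end > 0:
        end -= 1     # back off one character
        steps += 1
        if rest[end] == "/":
            return steps
    return None      # no slash at all: the pattern cannot match


print(backoff_steps("test/a1"))            # example URL from the thread
print(backoff_steps("test/" + "x" * 20))   # 20 characters after the slash
```

Under this model the example URL needs 3 back-off steps and the 20-character case needs 21, which lines up with the N-plus-one figure given here.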

With the "[^/]+" negative-match patterns, the entire rule is matched in one pass, evaluating directly from left to right.

Jim

BenSeb

11:08 pm on Dec 7, 2006 (gmt 0)

10+ Year Member



Brilliant, thanks for such a detailed reply. I've implemented both those suggestions :)