Help further simplifying a RewriteRule

simplify rewriterule help

     
8:29 pm on Feb 27, 2013 (gmt 0)

5+ Year Member



I have the following RewriteRule that is working well. My question, out of curiosity, is how simple it can be made. I am sure it can be shorter, but I would love someone to show me how and why.

# Redirect old misspelt libarary folder
RewriteRule ^(Libarary|Library)(s|\'s|\%27s|27s|\%20s)/(.*)$ http://www.example.com/Libraries/$3 [R=301,NC,L]

One idea I had was this:

# Redirect old misspelt libarary folder
RewriteRule ^Liba?rary(s|\'s|\%27s|27s|\%20s)/(.*)$ http://www.example.com/Libraries/$2 [R=301,NC,L]

Is this good?

Any takers?
2:02 am on Feb 28, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



Ugh. Assuming for the sake of discussion that you really want to redirect all those people who can't spell "libraries" hahaha

you can collapse a little further:

s|\'s can become '?s (I don't think you need to escape the apostrophe)

\%27s|27s|\%20s can become \%?2[07]s

And you might want to throw in a

liba?r?ary

to get the "libary" folks. (I don't like the order of those question marks. Lemme think some more.)
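A quick way to sanity-check collapses like these is to run the old and new forms over a handful of candidate strings. The sketch below uses Python's re module as a stand-in for the regex engine Apache uses, which is an assumption that should hold for patterns this simple. The sample strings are made up for illustration.

```python
import re

# The original suffix alternation and the collapsed form suggested above.
original = re.compile(r"(s|\'s|\%27s|27s|\%20s)")
collapsed = re.compile(r"('?s|\%?2[07]s)")

# Hypothetical suffixes, chosen only to exercise each branch.
suffixes = ["s", "'s", "%27s", "27s", "%20s", "20s", "x"]


def matches(pattern, text):
    # fullmatch: the whole suffix must be consumed, nothing left over.
    return pattern.fullmatch(text) is not None


# Suffixes on which the two forms disagree.
differences = [s for s in suffixes if matches(original, s) != matches(collapsed, s)]
print(differences)

# liba?r?ary accepts the correct spelling plus two misspellings.
stem = re.compile(r"liba?r?ary", re.IGNORECASE)
words = ["Library", "Libarary", "Libary", "Libray"]
stem_hits = [w for w in words if stem.fullmatch(w)]
print(stem_hits)
```

In this sample the two suffix forms disagree only on the bare "20s", which the collapsed form accepts and the original did not. The stem picks up "Library", "Libarary" and "Libary" but not "Libray".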

But your rule doesn't cover one group: the people who get the final -ies part right, but misspell Librar-

And btw do you really want to keep the directory name in Title Case? Seems anomalous in the circumstances.

Can't help but wonder: Do all these misspellings really occur? Is there some history you're not telling us?
9:09 am on Feb 28, 2013 (gmt 0)

g1smd (WebmasterWorld Senior Member)



I've worked on a site that now redirects ^co[mn]ponen?ts?/(.*) due to a previous site design having certain mis-spellings in internal linking for a number of years.

You missed redirecting "Library". The addition of a question mark fixes that.

RewriteRule ^Liba?r?ary(('|\%?2[07])?s)?/(.*)$ http://www.example.com/Libraries/$3 [R=301,NC,L]


You can also fix requests where "ies" is right, but the preceding is wrong.

RewriteCond %{REQUEST_URI} !^/Libraries/
RewriteRule ^Liba?r?ar(ies|y(('|\%?2[07])?s)?)/(.*)$ http://www.example.com/Libraries/$4 [R=301,NC,L]
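With nested parentheses it is easy to lose track of which backreference holds the tail of the path. Here is a rough Python sketch that counts the groups in the second pattern and shows which requests it would catch. Python's re stands in for Apache's regex engine, re.IGNORECASE plays the part of [NC], and the redirect_target helper and the sample paths are hypothetical.

```python
import re

# Pattern from the second rule above; re.IGNORECASE plays the role of [NC].
rule = re.compile(r"^Liba?r?ar(ies|y(('|\%?2[07])?s)?)/(.*)$", re.IGNORECASE)

BASE = "http://www.example.com/Libraries/"


def redirect_target(path):
    """Return the redirect URL for a per-directory path, or None for no redirect."""
    # Stand-in for the RewriteCond: the correctly spelled, correctly cased
    # folder is left alone so the redirect cannot loop.
    if path.startswith("Libraries/"):
        return None
    m = rule.match(path)
    if m is None:
        return None
    # Groups 1-3 are the nested suffix groups; group 4 is the (.*) tail.
    return BASE + m.group(4)


print(rule.groups)
for sample in ["Librarys/a.html", "Libaries/index.html", "libraries/index.html",
               "Libraries/index.html", "Books/index.html"]:
    print(sample, "->", redirect_target(sample))
```

The pattern has four capture groups, so the tail lands in the fourth one. Note that a lower-case "libraries/" request is also redirected, because the case-insensitive match still applies while the exclusion is case-sensitive.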
9:22 am on Feb 28, 2013 (gmt 0)

5+ Year Member



I love this (('|\%?2[07])?s)?

When I read this I understand why, but I would not have thought of it myself. A great explanation as always, and I will use some of this logic with other code that I have to see if I can reduce it myself.

Thank you guys, I always come away feeling like I have learnt something.
10:07 am on Feb 28, 2013 (gmt 0)

g1smd (WebmasterWorld Senior Member)



The idea is that if something in the X|Y|Z pattern is repeated, then parts can be combined.

What you want to aim for is a single left to right parse of the input string. Once something has been "found" you don't want to have to come back to it again when following parts don't match up.

The oft quoted example is
(\.jpg|\.jpeg|\.gif|\.png)
Having found the period, why look for it again when "jpg" isn't a match for the ".png" input and you move to the next option? This simplifies to
\.(jpg|jpeg|gif|png)
Having found the "jp", why find it again when "g" isn't a match for the ".jpeg" input and you move to the next option? This simplifies to
\.(jpe?g|gif|png)
and parses left to right in one straight pass. There's no "backtracking" to rematch the period or the jp.
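When factoring an alternation like this, it is worth confirming that each step still accepts exactly the same strings. A minimal sketch, again using Python's re as a stand-in for Apache's regex engine; the sample extensions are invented for the test, and the code only checks that the three forms agree, it does not measure speed.

```python
import re

# The three forms from the post, from most repetitive to most factored.
patterns = [
    r"(\.jpg|\.jpeg|\.gif|\.png)",
    r"\.(jpg|jpeg|gif|png)",
    r"\.(jpe?g|gif|png)",
]

# Hypothetical inputs: four that should match and four that should not.
samples = [".jpg", ".jpeg", ".gif", ".png", ".jpe", ".jpgg", "jpg", ".bmp"]

# For each pattern, collect the samples it accepts in full.
results = [[s for s in samples if re.fullmatch(p, s)] for p in patterns]

for pattern, accepted in zip(patterns, results):
    print(pattern, "accepts", accepted)
```

All three forms accept the same four extensions and reject the rest, so the factoring changes how the match is found, not what is matched.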
10:25 am on Feb 28, 2013 (gmt 0)

5+ Year Member



That is a great example and helps explain how to work through the simplification process. I am creating some examples and testing them myself at the minute.
10:39 am on Feb 28, 2013 (gmt 0)

g1smd (WebmasterWorld Senior Member)



Another one is wanting to match "/xyz/123" and "/123".

You already know that (/xyz/123|/123) or /(xyz/123|123) is many shades of wrong.

You sometimes see (/xyz)?/123 but that means the initial "/" has to be matched again when the input is "/123".
Mentioning no names, she knows who it is. :)

/(xyz/)?123 matches the slash, then checks for the optional extra folder level and its trailing slash. When "xyz/" isn't a match, no backtracking is required as the leading slash has already been accounted for.

Related practical application: ^/(([^/]+/)*)index\.php$
(where ? meaning "0 or 1" is replaced by * meaning "0 or more").
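Both patterns can be tried out quickly in Python, whose re module is used here as a stand-in for Apache's regex engine. The ^ and $ anchors on the first pattern are an assumption added for the test, and the folder_prefix helper and the sample paths are hypothetical.

```python
import re

# Optional single folder level; anchors added here for the sake of the test.
optional_folder = re.compile(r"^/(xyz/)?123$")

# Any number of folder levels ahead of index.php, as in the post.
any_depth_index = re.compile(r"^/(([^/]+/)*)index\.php$")

candidates = ["/123", "/xyz/123", "/abc/123", "/xyz123"]
folder_hits = [p for p in candidates if optional_folder.match(p)]
print(folder_hits)


def folder_prefix(path):
    """Return the folder part captured ahead of index.php, or None if no match."""
    m = any_depth_index.match(path)
    return None if m is None else m.group(1)


for sample in ["/index.php", "/a/index.php", "/a/b/index.php", "/a/b/page.php"]:
    print(sample, "->", repr(folder_prefix(sample)))
```

The outer group captures the whole folder prefix (empty for a root-level index.php), which is what you would reuse in the redirect target.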
12:24 pm on Feb 28, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



^/

There speaks a man who has his own server ;)

RewriteCond %{REQUEST_URI} !^/Libraries/


may be the best escape route because then you can cover absolutely everything.

^Lib[ar]+[iey]+(?:'|%?2[07])?s?/{and now we're home free}
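A pattern this loose is worth trying against a few made-up folder names before it goes live. The sketch below tests only the part up to the slash, since the rest of the rule was left open, and assumes [NC] would be set. Python's re is a stand-in for Apache's regex engine and every path in the list is hypothetical.

```python
import re

# Pattern up to the slash, as posted; re.IGNORECASE assumes [NC] on the rule.
broad = re.compile(r"^Lib[ar]+[iey]+(?:'|%?2[07])?s?/", re.IGNORECASE)

# Hypothetical request paths in per-directory form (no leading slash).
paths = [
    "Libraries/a.html",
    "Librarys/a.html",
    "Libary's/a.html",
    "Libraires/a.html",
    "Libre/a.html",
    "Liberty/a.html",
]

broad_hits = [p for p in paths if broad.match(p)]
print(broad_hits)
```

In this sample the pattern catches the correct "Libraries/" too, which is why the exclusion is needed. It also catches an unrelated name like "Libre/", while a transposition like "Libraires/" slips through.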


Then again, you could have a preliminary

RewriteRule ^Libraries/ {blahblah} [S=1]


instead of the Condition.

The [S] flag makes me anxious, but if you have a lot of people spelling the name right, counterbalanced by a lot of others who get it wrong, it may save time.


I've just finished link-checking the General Index to the Paston Letters, complete with misspelled cross-references. I would be lost without [eiy].
12:45 pm on Feb 28, 2013 (gmt 0)

g1smd (WebmasterWorld Senior Member)



I strenuously avoid the [S] flag.

You have to remember to alter the number whenever you add new rules.

That's way beyond my level of organisational abilities.

I much prefer attaching the conditions directly to the rule they apply to, even if that means duplicating the same condition(s) on multiple rules.
 
