
Apache Web Server Forum

    
Help further simplifying a RewriteRule
simplify rewriterule help
hottrout




msg:4549583
 8:29 pm on Feb 27, 2013 (gmt 0)

I have the following RewriteRule that is working well. My question, out of curiosity, is how simple it can be made. I am sure it can be shorter, but I would love someone to show me how and why.

# Redirect old misspelt libarary folder
RewriteRule ^(Libarary|Library)(s|\'s|\%27s|27s|\%20s)/(.*)$ http://www.example.com/Libraries/$3 [R=301,NC,L]

One idea I had was this:

# Redirect old misspelt libarary folder
RewriteRule ^Liba?rary(s|\'s|\%27s|27s|\%20s)/(.*)$ http://www.example.com/Libraries/$2 [R=301,NC,L]

Is this good?

Any takers?

 

lucy24




msg:4549644
 2:02 am on Feb 28, 2013 (gmt 0)

Ugh. Assuming for the sake of discussion that you really want to redirect all those people who can't spell "libraries" (hahaha), you can collapse a little further:

s|\'s can become '?s (I don't think you need to escape the apostrophe)

\%27s|27s|\%20s can become \%?2[07]s
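
A quick way to check a collapse like this is to run both forms over the same suffixes. The sketch below uses Python's `re` module as a stand-in for Apache's regex engine (an assumption, though the two agree on syntax this simple), and the sample list is made up for illustration.

```python
import re

# Original alternation from the first post, and the collapsed form suggested above.
# Anchors are added here so each sample must match as a whole suffix.
original = re.compile(r"^(s|'s|%27s|27s|%20s)$")
collapsed = re.compile(r"^('?s|%?2[07]s)$")

# Hypothetical sample suffixes; "20s" and "x" are not in the original list.
samples = ["s", "'s", "%27s", "27s", "%20s", "20s", "x"]

results = {s: (bool(original.match(s)), bool(collapsed.match(s))) for s in samples}
for s, (o, c) in results.items():
    print(f"{s!r:8} original={o} collapsed={c}")
```

The collapsed form accepts everything the original did, plus "20s", which the original alternation never listed. That is probably harmless here, but it is worth knowing that the shorter pattern is slightly broader.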

And you might want to throw in a

liba?r?ary

to get the "libary" folks. (I don't like the order of those question marks. Lemme think some more.)

But your rule doesn't cover one group: the people who get the final -ies part right, but misspell Librar-

And btw do you really want to keep the directory name in Title Case? Seems anomalous in the circumstances.

Can't help but wonder: Do all these misspellings really occur? Is there some history you're not telling us?

g1smd




msg:4549769
 9:09 am on Feb 28, 2013 (gmt 0)

I've worked on a site that now redirects ^co[mn]ponen?ts?/(.*) due to a previous site design having certain mis-spellings in internal linking for a number of years.

You missed redirecting "Library". The addition of a question mark fixes that.

RewriteRule ^Liba?r?ary(('|\%?2[07])?s)?/(.*)$ http://www.example.com/Libraries/$3 [R=301,NC,L]

You can also fix requests where "ies" is right, but the preceding is wrong.

RewriteCond %{REQUEST_URI} !^/Libraries/
RewriteRule ^Liba?r?ar(ies|y(('|\%?2[07])?s)?)/(.*)$ http://www.example.com/Libraries/$4 [R=301,NC,L]
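
Nested parentheses shift the backreference numbers, so it helps to count opening parentheses from left to right. The sketch below checks where the trailing path lands for both patterns, again using Python's `re` as a stand-in for Apache's engine (an assumption). The request paths are invented test inputs.

```python
import re

# The two patterns from the rules above; re.IGNORECASE plays the role of [NC].
first = re.compile(r"^Liba?r?ary(('|\%?2[07])?s)?/(.*)$", re.IGNORECASE)
second = re.compile(r"^Liba?r?ar(ies|y(('|\%?2[07])?s)?)/(.*)$", re.IGNORECASE)

# Hypothetical request paths, without the leading slash (per-directory style).
m1 = first.match("Libarary's/books/index.html")
m2 = second.match("Libaries/books/index.html")

print(first.groups, m1.groups())   # number of groups, then their contents
print(second.groups, m2.groups())
```

In the first pattern the trailing path is the third group, and in the second it is the fourth. The groups in between hold the optional suffix pieces, or nothing when they did not take part in the match.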

hottrout




msg:4549774
 9:22 am on Feb 28, 2013 (gmt 0)

I love this (('|\%?2[07])?s)?

When I read this I understand why, but I would not have thought of it myself. A great explanation as always, and I will use some of this logic with other code that I have to see if I can reduce it myself.

Thank you guys, I always come away feeling like I have learned something.

g1smd




msg:4549783
 10:07 am on Feb 28, 2013 (gmt 0)

The idea is that if something in the X|Y|Z pattern is repeated, then parts can be combined.

What you want to aim for is a single left to right parse of the input string. Once something has been "found" you don't want to have to come back to it again when following parts don't match up.

The oft-quoted example is
(\.jpg|\.jpeg|\.gif|\.png)
Having found the period, why look for it again when "jpg" isn't a match for the ".png" input and you move to the next option? This simplifies to
\.(jpg|jpeg|gif|png)
Having found the "jp", why find it again when "g" isn't a match for the ".jpeg" input and you move to the next option? This simplifies to
\.(jpe?g|gif|png)
and parses left to right in one straight pass. There's no "backtracking" to rematch the period or the jp.
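
The three forms can be checked against each other to confirm the factoring changes nothing about what is accepted. This is a minimal sketch in Python's `re` (an assumption standing in for Apache's engine). The `$` anchor and the file names are added for the test. It shows only that the patterns match the same inputs, not how many steps the engine takes.

```python
import re

# The three stages of the simplification, each anchored at the end of the name.
patterns = {
    "repeated": r"(\.jpg|\.jpeg|\.gif|\.png)$",
    "factored dot": r"\.(jpg|jpeg|gif|png)$",
    "factored jp": r"\.(jpe?g|gif|png)$",
}

# Hypothetical file names: four that should match, three that should not.
names = ["a.jpg", "a.jpeg", "a.gif", "a.png", "a.jpp", "ajpg", "a.pngx"]

table = {label: [bool(re.search(p, n)) for n in names] for label, p in patterns.items()}
for label, row in table.items():
    print(f"{label:13} {row}")
```

All three rows come out identical, so the shorter pattern is a safe replacement for the longer ones.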

hottrout




msg:4549785
 10:25 am on Feb 28, 2013 (gmt 0)

That is a great example and helps explain how to work through the simplification process. I am creating some examples and testing them myself at the minute.

g1smd




msg:4549787
 10:39 am on Feb 28, 2013 (gmt 0)

Another one is wanting to match "/xyz/123" and "/123".

You already know that (/xyz/123|/123) or /(xyz/123|123) is many shades of wrong.

You sometimes see (/xyz)?/123 but that means the initial "/" has to be matched again when the input is "/123".
Mentioning no names, she knows who it is. :)

/(xyz/)?123 matches the slash, then checks for the optional extra folder level and its trailing slash. When "xyz/" isn't a match, no backtracking is required as the leading slash has already been accounted for.

Related practical application: ^/(([^/]+/)*)index\.php$
(where ? meaning "0 or 1" is replaced by * meaning "0 or more").
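
Both patterns can be tried out as follows. This is a sketch in Python's `re` (an assumption standing in for Apache's engine). The `^` and `$` around the first pattern are added here for the test, and the paths are invented.

```python
import re

# Optional single folder level, with the leading slash matched exactly once.
optional_folder = re.compile(r"^/(xyz/)?123$")
# Zero or more folder levels before index.php.
index_any_depth = re.compile(r"^/(([^/]+/)*)index\.php$")

folder_hits = [bool(optional_folder.match(p)) for p in ["/123", "/xyz/123", "/abc/123"]]
print(folder_hits)

deep = index_any_depth.match("/a/b/index.php")
root = index_any_depth.match("/index.php")
print(repr(deep.group(1)), repr(deep.group(2)))  # full folder path, then last folder only
print(repr(root.group(1)))                        # empty string at the root
```

The outer group captures the whole folder path, which is usually the one wanted in a redirect target. The inner group only ever holds the last folder that the `*` repeated over.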

lucy24




msg:4549819
 12:24 pm on Feb 28, 2013 (gmt 0)

^/

There speaks a man who has his own server ;)

RewriteCond %{REQUEST_URI} !^/Libraries/

may be the best escape route because then you can cover absolutely everything.

^Lib[ar]+[iey]+(?:'|%?2[07])?s?/{and now we're home free}

Then again, you could have a preliminary

RewriteRule ^Libraries/ {blahblah} [S=1]

instead of the Condition.

The [S] flag makes me anxious, but if you have a lot of people spelling the name right, counterbalanced by a lot of others who get it wrong, it may save time.
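
The reason the exclusion matters is that the loose pattern also matches the correct spelling, so without it the redirect would loop. Below is a rough sketch in Python's `re` (an assumption standing in for Apache's engine). The helper `needs_redirect` and the sample paths are hypothetical, and the paths are written without a leading slash, as a per-directory RewriteRule would see them.

```python
import re

# The catch-all pattern from above; re.IGNORECASE plays the role of [NC].
catch_all = re.compile(r"^Lib[ar]+[iey]+(?:'|%?2[07])?s?/", re.IGNORECASE)
# The correct folder name, case-sensitive, used as the exclusion.
correct = re.compile(r"^Libraries/")

def needs_redirect(path):
    """Hypothetical helper: True when the loose pattern hits but the name is not already right."""
    return bool(catch_all.match(path)) and not correct.match(path)

paths = ["Libraries/x", "Library/x", "Libarys/x", "Libary's/x",
         "Libraies/x", "libraries/x", "Liberty/x"]
decisions = {p: needs_redirect(p) for p in paths}
for p, d in decisions.items():
    print(f"{p:14} redirect={d}")
```

Because the exclusion is case-sensitive while the catch-all is not, a lowercase "libraries/" is also sent to the Title Case folder. Something like "Liberty/" is left alone because the character class after "Lib" only allows "a" and "r".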


I've just finished link-checking the General Index to the Paston Letters, complete with misspelled cross-references. I would be lost without [eiy].

g1smd




msg:4549834
 12:45 pm on Feb 28, 2013 (gmt 0)

I strenuously avoid the [S] flag.

You have to remember to alter the number whenever you add new rules.

That's way beyond my level of organisational abilities.

I much prefer attaching the conditions directly to the rule they apply to, even if that means duplicating the same condition(s) on multiple rules.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved