Redirect in one step to lowercase + hyphens

         

solof

6:46 pm on Oct 13, 2009 (gmt 0)

10+ Year Member



Problem: I would like to redirect (301) my current URLs, which use underscores and capital letters, to new URLs which use hyphens and only lowercase. A couple of directories (img & css) must be excluded from the entire process.

I came up with this (for use in httpd.conf):

RewriteRule ![A-Z] - [S=2]
RewriteCond %{REQUEST_URI} !^/(img¦css)/.*$
RewriteCond %{REQUEST_URI} ^/.*_
RewriteRule ^/(.*)_(.*) /$1-$2 [N,QSA]
rewritemap lowercase int:tolower
RewriteCond %{REQUEST_URI} !^/(img¦css)/.*$
RewriteCond $1 [A-Z]
RewriteRule ^/(.*)$ http://example.com/${lowercase:$1} [R=301,L]

Question: is this the best/fastest way to do this?

solof

1:15 am on Oct 14, 2009 (gmt 0)

10+ Year Member



just came up with the following:

Rewriterule ^(img¦css)/ - [L]
RewriteRule ![A-Z] - [S=2]
RewriteCond %{REQUEST_URI} ^/.*_
RewriteRule ^/(.*)_(.*) /$1-$2 [N,QSA]
rewritemap lowercase int:tolower
RewriteCond %{REQUEST_URI} !^/(css¦img)/.*$
RewriteCond $1 [A-Z]
RewriteRule ^/(.*)$ http://example.com/${lowercase:$1} [R=301,L]

I think it's getting better...

jdMorgan

1:59 pm on Oct 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This RewriteCond in the second rule is redundant, because the following rule's pattern already looks for essentially the same thing (an underscore). So
 RewriteCond %{REQUEST_URI} ^/.*_ 

can be removed.

Similarly, the second RewriteCond in the third rule is redundant and can be removed by changing the following rule's pattern:


RewriteMap lowercase int:tolower
RewriteCond %{REQUEST_URI} !^/(css¦img)/.*$
RewriteRule ^/([^A-Z]*[A-Z].*)$ http://example.com/${lowercase:$1} [R=301,L]

Jim

solof

2:11 pm on Oct 14, 2009 (gmt 0)

10+ Year Member



Thank you, Jim. I appreciate it.

It's 90% working. It seems that the first rule is *not* working: "Rewriterule ^(img¦css)/ - [L]"
Files in these folders are not being skipped. Replacing [L] with [S=3] does not help.

jdMorgan

2:26 pm on Oct 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Did you change the broken pipe "¦" character to a solid pipe character before trying to use that? Posting on this forum changes that character from a solid pipe to a broken one, which is treated as a literal instead of functioning as a 'local OR' operator.

Also, because of the patterns in the other rules, I assume that this code goes into httpd.conf or some other server configuration file, outside of any <Directory> container, and not in a .htaccess file. In that case, the RewriteRule's pattern must start with a slash, as shown in your other rules.
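Both points can be checked outside Apache. The following is a minimal Python sketch, not mod_rewrite itself: Python's re module stands in for Apache's regex engine, and the helper name skips and the sample paths are made up for illustration. It assumes, as described above, that in server context the pattern is matched against a URL-path that begins with a slash.

```python
import re

# Hypothetical sample request paths; in httpd.conf (server context) the
# URL-path seen by RewriteRule starts with a slash.
IMAGE_PATH = "/img/Logo_Big.png"
PAGE_PATH = "/Some_Page.html"

def skips(pattern, path):
    """Return True if a skip rule with this pattern would match the path."""
    return re.search(pattern, path) is not None

# Broken pipe: the character is an ordinary literal, so the group only
# matches the literal text "img¦css".
print(skips(r"^/(img¦css)/", IMAGE_PATH))   # False
# Solid pipe but no leading slash: never matches a path starting with "/".
print(skips(r"^(img|css)/", IMAGE_PATH))    # False
# Solid pipe and leading slash: matches /img/ and /css/ requests only.
print(skips(r"^/(img|css)/", IMAGE_PATH))   # True
print(skips(r"^/(img|css)/", PAGE_PATH))    # False
```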

Jim

solof

2:29 pm on Oct 14, 2009 (gmt 0)

10+ Year Member



Yes, it's a solid pipe.

Yes, it's meant for httpd.conf and outside any <Directory> container.

So should the first rule be this? Or do more slashes have to be added?
Rewriterule ^/(img¦css)/ - [L]

Is the 2nd rule (RewriteRule ![A-Z] - [S=2]) the best way to skip all other rules?

jdMorgan

5:49 pm on Oct 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, there are hundreds of ways to do this, and some of them will actually work... :)

For example, you could replace the whole thing with:


RewriteMap lowercase int:tolower
#
RewriteCond $1 !^(img¦css)/
RewriteCond $1 [A-Z]
RewriteRule ^/(([^_]*)_(.*))$ /$2-$3 [E=ReplacedUnderscore:Yes,N]
#
RewriteCond %{ENV:ReplacedUnderscore} =Yes
RewriteRule ^/(.+)$ http://example.com/${lowercase:$1} [R=301,L]

This code uses a user-defined variable to pass information from the first rule to the second: that underscores and uppercase letters were present in the requested URL-path, that the request was not for /img/ or /css/, and that one or more underscores were replaced. So these conditions need not be tested again in the second rule, any number of underscores can be replaced, and tolower is only called once.
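As a rough illustration of that flow, here is a minimal Python sketch, not mod_rewrite itself. It models the [N] loop as a while loop and the ReplacedUnderscore variable as a boolean. The function name redirect_target and the sample paths are invented for the example, and the solid pipe is assumed in the img/css pattern.

```python
import re

UNDERSCORE_RULE = re.compile(r"^/(([^_]*)_(.*))$")  # first rule's pattern
SKIPPED_DIRS = re.compile(r"^(img|css)/")           # first RewriteCond (negated)
HAS_UPPER = re.compile(r"[A-Z]")                    # second RewriteCond

def redirect_target(path, host="example.com"):
    """Return the 301 target for a request path, or None if no redirect."""
    replaced = False                     # stands in for ReplacedUnderscore
    while True:                          # stands in for the [N] flag
        m = UNDERSCORE_RULE.match(path)
        if m is None:
            break                        # no underscore left
        whole = m.group(1)
        if SKIPPED_DIRS.match(whole) or not HAS_UPPER.search(whole):
            break                        # a RewriteCond failed
        path = "/" + m.group(2) + "-" + m.group(3)  # replace one underscore
        replaced = True
    if not replaced:
        return None                      # second rule's condition fails
    return "http://" + host + "/" + path[1:].lower()  # tolower runs once

print(redirect_target("/My_Old_Page.html"))  # http://example.com/my-old-page.html
print(redirect_target("/img/My_Logo.png"))   # None
print(redirect_target("/MyPage.html"))       # None
```

In this model a path with uppercase letters but no underscore never sets the flag, so it is left alone.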

But, that's just one of many methods that can work... :)

(Remember to replace that broken pipe character)

If you only have a known-maximum number of underscores, you could write a stack of rules. The first rule could replace three underscores if found, the second could replace two, and the last rule could replace one, for up to a total of six. Then do the lowercasing. This avoids having to do looping in the code, which can be very slow if the underscore-replacing rule isn't at or very near the top of the code.

If you use that approach, it's important that each rule replace one less underscore than the one that precedes it, or you will get into a situation where the rule-set fails when certain numbers of underscores are present.

You'd also want to skip the whole stack if the requested URL-path does not contain both underscores and uppercase characters, or if it is for /img/ or /css/ paths.
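The stack idea can be modeled in a few lines. This is a hedged Python sketch, not Apache configuration: the three patterns are my own guess at what such a stack might look like (replace three, then two, then one underscore), the name apply_stack is invented, and the skip logic for /img/, /css/ and case checks is left out.

```python
import re

# Each entry stands in for one RewriteRule without the [N] flag.
# The rules replace three, then two, then one underscore.
STACK = [
    (re.compile(r"^/([^_]*)_([^_]*)_([^_]*)_(.*)$"), r"/\1-\2-\3-\4"),
    (re.compile(r"^/([^_]*)_([^_]*)_(.*)$"), r"/\1-\2-\3"),
    (re.compile(r"^/([^_]*)_(.*)$"), r"/\1-\2"),
]

def apply_stack(path):
    """Run the path through each rule once, top to bottom, with no looping."""
    for pattern, replacement in STACK:
        path = pattern.sub(replacement, path, count=1)
    return path

# Paths with 0 to 7 underscores: up to six are fully converted,
# while the seventh underscore survives because there is no loop.
for n in range(8):
    path = "/" + "_".join(["x"] * (n + 1))
    print(n, apply_stack(path))
```

With 3 + 2 + 1 replacements every count from one to six underscores is handled, which is the total of six mentioned above.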

Alternatively, you could define a RewriteMap to call a Perl script that both replaces all underscores at once and converts the result to lowercase, greatly simplifying the rule, which would then only have to test the conditions of the first rule in this post.
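To give an idea of what such an external map program does, here is a minimal sketch written in Python rather than Perl. It assumes the usual contract for a prg: RewriteMap: one lookup key per line on standard input, one answer per line on standard output, written without buffering. The function names normalize and serve are invented, and the demonstration uses in-memory streams so it can run without Apache.

```python
import io

def normalize(path):
    """Replace every underscore with a hyphen and lowercase the result."""
    return path.replace("_", "-").lower()

def serve(lines_in, out):
    """Answer one lookup key per input line with one value per output line."""
    for line in lines_in:
        key = line.rstrip("\n")
        out.write(normalize(key) + "\n")
        out.flush()  # the server waits for each answer, so never buffer

# Demonstration with in-memory streams standing in for stdin and stdout.
demo_out = io.StringIO()
serve(io.StringIO("My_Old_Page.html\nDocs/Read_Me.HTML\n"), demo_out)
print(demo_out.getvalue(), end="")
# my-old-page.html
# docs/read-me.html
```

Run as a real map program, the script would instead end with a call such as serve(sys.stdin, sys.stdout) after importing sys, and the rule would look the path up through that map in place of the int:tolower map.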

I don't know which way is faster or better; you'll have to test to see what kind of performance impact each method has on your server.

Jim

solof

6:19 pm on Oct 14, 2009 (gmt 0)

10+ Year Member



thanks again, Jim :-).

When I use your last example, wouldn't it be better to skip these rules when no underscores and/or uppercase characters are present? Or am I missing something here?

jdMorgan

10:25 pm on Oct 14, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You will note that the first RewriteRule's pattern only matches if there's an underscore. This pattern is evaluated first (see Apache mod_rewrite documentation for details). Then, if the rule pattern matches, the first RewriteCond is evaluated and rejects requests for /img/ or /css/. If that RewriteCond matches, then the next RewriteCond requires at least one uppercase character.

The second rule only runs if the first rule has set the 'ReplacedUnderscore' flag.

You can play around with the order of the first rule's RewriteConds if you like. Put whichever RewriteCond is likely to fail first. I put the img/css RewriteCond first because most requests to a typical web site are for images.

Jim

solof

11:10 pm on Oct 14, 2009 (gmt 0)

10+ Year Member



Currently, if one or more uppercase characters are present but no underscores, the uppercase characters are not converted to lowercase. What should I do?

jdMorgan

4:51 am on Oct 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Problem: I would like to redirect (301) my current URLs, which use underscores and capital letters, to new URLs which use hyphens and only lowercase. A couple of directories (img & css) must be excluded from the entire process.

Correct your initial requirements specification, and start again...

You may dispense entirely with the "ReplacedUnderscore" variable logic if you wish to correct case and underscores separately, and simply replicate the RewriteConds from the first rule to the second as needed -- or use a "skip rule" as you did at the start.

Jim

solof

10:01 am on Oct 15, 2009 (gmt 0)

10+ Year Member



I meant that your code works:
RewriteMap lowercase int:tolower
#
RewriteCond $1 !^(img¦css)/
RewriteCond $1 [A-Z]
RewriteRule ^/(([^_]*)_(.*))$ /$2-$3 [E=ReplacedUnderscore:Yes,N]
#
RewriteCond %{ENV:ReplacedUnderscore} =Yes
RewriteRule ^/(.+)$ http://example.com/${lowercase:$1} [R=301,L]

Except that when a URL has no underscores, the uppercase characters are not converted to lowercase.

So the problem is: replace underscores and/or uppercase characters in a single redirect.

solof

10:32 am on Oct 15, 2009 (gmt 0)

10+ Year Member



Do you mean something like this?
RewriteMap lowercase int:tolower
#
RewriteCond $1 !^(img¦css)/
RewriteCond $1 [A-Z]
RewriteRule ^/(([^_]*)_(.*))$ /$2-$3 [E=ReplacedUnderscore:Yes,N]
#
RewriteCond %{ENV:ReplacedUnderscore} =Yes
RewriteRule ^/(.+)$ http://example.com/${lowercase:$1} [R=301,L]

RewriteRule ![A-Z] - [S=1]
RewriteCond $1 [A-Z]
RewriteRule ^/(.+)$ http://example.com/${lowercase:$1} [R=301,L]

solof

10:38 am on Oct 15, 2009 (gmt 0)

10+ Year Member



I meant this:

RewriteMap lowercase int:tolower
#
RewriteCond $1 !^(img¦css)/
RewriteCond $1 [A-Z]
RewriteRule ^/(([^_]*)_(.*))$ /$2-$3 [E=ReplacedUnderscore:Yes,N]
#
RewriteCond %{ENV:ReplacedUnderscore} =Yes
RewriteRule ^/(.+)$ http://example.com/${lowercase:$1} [R=301,L]

RewriteRule ![A-Z] - [S=1]
RewriteCond $1 !^(img¦css)/
RewriteCond $1 [A-Z]
RewriteRule ^/(.+)$ http://example.com/${lowercase:$1} [R=301,L]

jdMorgan

1:24 pm on Oct 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You seem to be in love with 'skip rules'. Your third rule above is utterly redundant with the RewriteCond that follows in the fourth rule that checks exactly the same thing...

It seems to me that what you're wanting --because you have not explicitly corrected your original requirements statement quoted above-- is simply:


RewriteMap lowercase int:tolower
#
# If uppercase characters and underscores, but not an /img/ or
# /css/ subdirectory request, replace underscores with hyphens
RewriteCond $1 !^(img¦css)/
RewriteCond $1 [A-Z]
RewriteRule ^/(([^_]*)_(.*))$ /$2-$3 [N]
#
# If uppercase characters but not /img/ or /css/ request, convert to lowercase
RewriteCond $1 !^(img¦css)/
RewriteRule ^/([^A-Z]*[A-Z].*)$ http://example.com/${lowercase:$1} [R=301,L]

Note that this still leaves out the case where an underscored URL contains no uppercase characters; those URLs will not be redirected.

The most important step of any coding project is to correctly and precisely define your requirements. Too often there is a rush to coding when requirements have not yet been fully specified. The result is almost always a lot of wasted time and effort, as this thread now illustrates. For a more-joyful life, spend the time needed to write down concise requirements, and then comment your code accurately and thoroughly.

Jim

solof

1:31 pm on Oct 15, 2009 (gmt 0)

10+ Year Member



My sincere apologies. I know what I want, but it's hard to write it down precisely so that others understand.

The only thing I still need to get working is "...the case where an underscored URL contains no uppercase characters".

That's it.

Thanks again for your patience and help.

jdMorgan

1:44 pm on Oct 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Then remove the second RewriteCond from the first rule I posted in my immediately-previous post, so that underscore correction no longer requires uppercase characters to be present.

Jim

solof

2:04 pm on Oct 15, 2009 (gmt 0)

10+ Year Member



Just to be sure: are these rules skipped when a URL has *neither* uppercase characters nor underscores?

jdMorgan

2:31 pm on Oct 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I guess you're not making the connection between requested URLs and the logic and functions of RewriteRule pattern-matching and RewriteCond pattern and condition matching. I'd suggest a thorough study project to completely understand mod_rewrite -- and/or programming in general, if that's the source of the problem...

A RewriteRule won't run if the pattern in the RewriteRule itself does not match, or if a condition in a RewriteCond does not match. In addition, RewriteConds are not processed at all if the RewriteRule pattern does not match (see Apache mod_rewrite documentation for details).
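That evaluation order can be traced with a small model. This is a Python sketch under stated assumptions, not mod_rewrite: the function name rule_applies and the trace labels are invented, the conditions are ANDed as RewriteConds are by default, and the example uses the underscore rule with both of its RewriteConds as posted earlier in this thread (with a solid pipe).

```python
import re

def rule_applies(path, pattern, conditions):
    """Return (applies, trace); trace lists what was evaluated, in order."""
    trace = ["pattern"]
    match = re.search(pattern, path)
    if match is None:
        return False, trace          # the conditions are never looked at
    for label, test in conditions:
        trace.append(label)
        if not test(match.group(1)):
            return False, trace      # first failing condition stops the rule
    return True, trace

PATTERN = r"^/(([^_]*)_(.*))$"
CONDITIONS = [
    ("not img/css", lambda s: re.search(r"^(img|css)/", s) is None),
    ("has uppercase", lambda s: re.search(r"[A-Z]", s) is not None),
]

# Hypothetical request paths showing each way the rule can stop early.
for path in ["/about.html", "/img/My_Logo.png", "/my_page.html", "/My_Page.html"]:
    print(path, rule_applies(path, PATTERN, CONDITIONS))
```

A request without an underscore stops at the pattern, so neither condition is evaluated for it.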

Disregarding all of the details, though, you should be able to establish whether the proposed code works or not simply by testing two sets of URLs: Test URLs which should be redirected, and test other URLs which should not be redirected. If operation is correct for both sets of URLs, and the URL-sets are sufficiently representative of all of the URLs used by your site, then the code can be shown to "do what you want."
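Such a two-set test is easy to script. The following is a minimal Python sketch, offered as one possible approach: the function names, the sample URLs and their expected targets are all hypothetical, query strings are ignored, and a stand-in fetch function replaces the live server so the example runs offline.

```python
import http.client
from urllib.parse import urlsplit

def fetch_redirect(url):
    """HEAD the URL without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()

def failing_cases(cases, fetch=fetch_redirect):
    """Return the cases whose observed behavior differs from the expectation.

    Each case is (url, expected_location); expected_location is None for
    URLs that must not be redirected.
    """
    failures = []
    for url, expected in cases:
        status, location = fetch(url)
        if expected is None:
            ok = status not in (301, 302)
        else:
            ok = status == 301 and location == expected
        if not ok:
            failures.append((url, status, location))
    return failures

# Hypothetical expectations: one URL that should redirect, two that should not.
CASES = [
    ("http://example.com/My_Old_Page.html", "http://example.com/my-old-page.html"),
    ("http://example.com/img/My_Logo.png", None),
    ("http://example.com/about.html", None),
]

def fake_fetch(url):
    """Stand-in for a correctly configured server (assumption for the demo)."""
    canned = {CASES[0][0]: (301, CASES[0][1])}
    return canned.get(url, (200, None))

print(failing_cases(CASES, fetch=fake_fetch))  # []
```

Against a real server, calling failing_cases(CASES) with the default fetch would issue the HEAD requests and list any URL that misbehaves.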

I'm sorry, but I don't see or understand the 'disconnect' here that is blocking understanding. This is probably because I've been programming in one form or another for 40+ years, and I just "think in terms of logic and code" naturally.

Jim

solof

3:03 pm on Oct 15, 2009 (gmt 0)

10+ Year Member



Jim, I was asking because I am able to write the code myself but not to optimize it, mainly because I am not advanced in mod_rewrite, just as I am sure you are not advanced in every scripting or programming language.

So sorry for the n00b questions, but it was just to be certain. I already tested the script and it's running perfectly.

So thank you for your help and your fatherly lectures.

jdMorgan

3:26 pm on Oct 15, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



... at least you didn't say "grandfatherly"... :)

Glad it's working.

Jim

solof

6:21 pm on Oct 15, 2009 (gmt 0)

10+ Year Member



hehe, lol.

Thanks again for your help. I appreciate it a lot!