Apache Web Server Forum

    
Partial URL Rewrite Upper to Lower Case in Apache
jackx




msg:4592258
 10:18 pm on Jul 11, 2013 (gmt 0)

Greetings,

I've found a few working rewrite examples that convert URLs from upper to lower case. However, in my case I need to convert only the characters between the first and second forward slashes, as in the example below.

There will only be alphabetic characters in this string. Handling other characters as well is not mandatory for my use, but it may be helpful to others who read this thread.

http://www.example.com/MAKEALLLOWERCASE/casedoesntmatter

Thanks in advance for any and all help -

 

BillyS




msg:4592260
 10:42 pm on Jul 11, 2013 (gmt 0)

I faced a similar problem recently. Links on the site were mixed case and I wanted everything to be lowercase. I did this in two steps. I would suggest this approach (you can research this a bit more if you'd like). I added the following to my httpd.conf file:

RewriteEngine On
RewriteMap lc int:tolower

The second step involved changing the hard-coded links on the site. For that I used MySQL's REPLACE command. I was able to convert about 10,000 internal links in about five minutes. I can provide an example if you'd like.

jackx




msg:4592273
 10:58 pm on Jul 11, 2013 (gmt 0)

In my case I only need the characters between the first two forward slashes converted to lower case, as in this example: http://www.example.com/MAKEALLLOWERCASE/casedoesntmatter

The tolower statement would affect the entire URL string, wouldn't it?

In the event I've misunderstood, please elaborate or provide the example so I can learn from it.

Many thanks

lucy24




msg:4592283
 11:27 pm on Jul 11, 2013 (gmt 0)

Is it your own server or will this be happening in htaccess? RewriteMaps can only be defined in the config file :(

If it is htaccess, there are two practical approaches. If there are just a few URLs involved, make a separate rule for each page. If there are lots of them, rewrite to a PHP script that does everything and finally issues the redirect.

Yes, you can make RewriteRules that take a randomly cased URL and make it all lower-case using 26 rules in combination with a string of [N] flags. But unless your name is jdmorgan, I wouldn't recommend it.
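
For the first htaccess approach above (a separate rule per URL), a minimal sketch might look like the following. The directory names are hypothetical, borrowed from the example URL in this thread. Note that in htaccess the pattern has no leading slash, and that each rule matches only the exact casing written in it.

```apache
# .htaccess in the document root (assumes mod_rewrite is enabled and allowed here)
RewriteEngine On

# One rule per known wrong-case directory name; the rest of the path ($1) is kept as-is.
# "MAKEALLLOWERCASE" and "MakeAllLowerCase" are hypothetical examples.
RewriteRule ^MAKEALLLOWERCASE/(.*)$ http://www.example.com/makealllowercase/$1 [R=301,L]
RewriteRule ^MakeAllLowerCase/(.*)$ http://www.example.com/makealllowercase/$1 [R=301,L]
```

This only scales to a handful of known variants, which is why a script is the usual fallback when there are many.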

jackx




msg:4592286
 11:33 pm on Jul 11, 2013 (gmt 0)

I have access to the config files, will not be done in htaccess

lucy24




msg:4592309
 12:20 am on Jul 12, 2013 (gmt 0)

Super. Then you can use the built-in "tolower" discussed above.

The tolower statement would affect the entire URL string, wouldn't it?

Depends what you feed it. You can capture the to-be-changed and the not-to-be-changed parts separately, and feed only the first part to the map.

:: wandering off to experiment with MAMP ::

ymmv depending on exact Apache version, but from where I'm sitting it looks as if you still have to define the map, even though you're using a built-in function. So for example

RewriteMap lowercase int:tolower

RewriteRule ^([^/]+)/(.*) /${lowercase:$1}/$2 [R=301,L]

I have tested this.

Incidentally, Apache helpfully says (verbatim)
While you cannot declare a map in per-directory context it is of course possible to use this map in per-directory context.

Well, of course. Duh. Thank you, Apache.

Edit:
Since this rule will come before your generic "index.html" redirect, you'll need some additional jiggery-pokery to make sure you're not redirecting
/DIRECTORY/index.html
to
/directory/index.html
requiring a further, separate redirect a bit further on. You can't simply have a condition exempting requests ending in index.html (or whatever extension you use), because you still end up redirecting twice-- just in the other direction. But that can be sorted out later.
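
One possible way around that double redirect, as an untested sketch: give the wrong-case index.html request its own rule, placed before the general case rule, so it goes straight to the final URL. The hostname, the leading slash in the pattern (server config context), and the [A-Z] condition are assumptions for illustration.

```apache
RewriteMap lowercase int:tolower

# Wrong-case top-level directory plus index.html: lowercase the directory
# and drop "index.html" in a single redirect.
# The condition limits the rule to directory names containing an uppercase letter.
# Only covers /DIRECTORY/index.html, not index.html files deeper in the tree.
RewriteCond $1 [A-Z]
RewriteRule ^/([^/]+)/index\.html$ http://www.example.com/${lowercase:$1}/ [R=301,L]
```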

jackx




msg:4592327
 1:51 am on Jul 12, 2013 (gmt 0)

I'm going to give that a try, thank you - any other considerations, code examples, etc., that I should try?

Many thanks

phranque




msg:4592349
 3:54 am on Jul 12, 2013 (gmt 0)

i would suggest adding a condition to limit that RewriteRule from firing for lower case directory names.
you should also provide a fully qualified url that includes the canonical protocol and hostname for these urls.
RewriteMap lowercase int:tolower
RewriteCond $1 [A-Z]
RewriteRule ^([^/]+)/(.*) http://www.example.com/${lowercase:$1}/$2 [R=301,L]



casedoesntmatter

even though mixed case may be acceptable here, make sure the url of the requested resource is of the canonical case.
in other words only one of these requests should get a 200 OK response and the other one should get a 404/410 or a 301 to the canonical url:
http://www.example.com/alllowercase/casedoesntmatter
http://www.example.com/alllowercase/CaseDoesntMatter

lucy24




msg:4592351
 4:30 am on Jul 12, 2013 (gmt 0)

you should also provide a fully qualified url

My bad: I was cutting-and-pasting from MAMP (free version), where you can't put a full domain name in the target. Feel free to bring out the administratorial scissors ;)

i would suggest adding a condition to limit that RewriteRule from firing for lower case directory names.

Let's make that "You MUST add a condition" because otherwise you'll get an infinite redirect loop. (It didn't happen to me only because of the particular way I set up my experiments, or I would speedily have noticed!*)


* What I did notice was that MAMP includes mod_speling, so WrOnG CasE directory names still resolve. Luckily it doesn't happen on my live site. Can you spell "duplicate content"?

BillyS




msg:4592419
 10:22 am on Jul 12, 2013 (gmt 0)

I'm sure someone can improve on this, but I used the below:
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule (.*) ${lc:$1} [R=301,L]

jackx




msg:4593615
 6:19 pm on Jul 16, 2013 (gmt 0)

I just tried the code that phranque suggested, but no luck - any other suggestions that may help?

RewriteMap lowercase int:tolower
RewriteCond $1 [A-Z]
RewriteRule ^([^/]+)/(.*) http://www.example.com/${lowercase:$1}/$2 [R=301,L]

jackx




msg:4593636
 7:15 pm on Jul 16, 2013 (gmt 0)

This works, but rewrites the entire URL:

RewriteMap lc int:tolower
RewriteRule ^(.*?[A-Z]+.*) ${lc:$1} [R]

jackx




msg:4593705
 10:26 pm on Jul 16, 2013 (gmt 0)

Below is what ended up working. I welcome any and all feedback: is there a better way of doing it? Is there something I may be overlooking which could bite me later?

The goal was to convert to lowercase only the characters between the first and second slash, without converting anything beyond that string.

RewriteMap lowercase int:tolower
RewriteCond $1 [A-Z]
RewriteRule ^/([^/]+) /${lowercase:$1} [R,L]

lucy24




msg:4593715
 11:03 pm on Jul 16, 2013 (gmt 0)

If you want a temporary redirect and aren't particular about protocol and domain, you've nailed it.

What happens to the part of the request, if any, after the first directory?

phranque




msg:4593740
 1:30 am on Jul 17, 2013 (gmt 0)

I have access to the config files, will not be done in htaccess


i missed this - that's why my suggested solution didn't work.
in the server config context you need the leading slash or the Pattern won't match.

try this:
RewriteMap lowercase int:tolower
RewriteCond $1 [A-Z]
RewriteRule ^/([^/]+)/(.*) http://www.example.com/${lowercase:$1}/$2 [R=301,L]


as suggested by lucy24, you want to specify the canonical protocol and hostname (http://www.example.com) in the Substitution string and you probably want a 301 status (R=301) instead of the default 302.
and you surely want the second capture group in the Pattern and corresponding back-reference in the Substitution string.
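
Putting the pieces together, a fuller sketch for the server config might look like this. The VirtualHost wrapper, port, and ServerName are assumptions for illustration; the map, condition, and rule are the ones above.

```apache
<VirtualHost *:80>
    ServerName www.example.com

    RewriteEngine On

    # The map has to be declared in server or virtual-host context,
    # even though tolower is a built-in function.
    RewriteMap lowercase int:tolower

    # Only act when the first path segment contains an uppercase letter;
    # without this condition the rule would keep redirecting to itself.
    RewriteCond $1 [A-Z]

    # $1 = first segment (lowercased through the map), $2 = the rest, left untouched.
    RewriteRule ^/([^/]+)/(.*)$ http://www.example.com/${lowercase:$1}/$2 [R=301,L]
</VirtualHost>
```

Because the pattern requires a second slash, a request for /MAKEALLLOWERCASE with no trailing slash is not matched by this rule. Any query string is carried over to the redirect target by default.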

g1smd




msg:4593874
 12:39 pm on Jul 17, 2013 (gmt 0)

Make sure that clicking any link on the site results in a correct URL request. On-site clicks should not result in a redirect.

Make sure that the rule target includes the protocol and hostname. Order all of the rules such that no request results in a multiple step redirection chain.

Beware of "case doesn't matter" URLs. Every resource should have a single, canonical URL with specific casing. Incorrectly cased requests should be redirected. This can be done quite easily as the first function within the PHP or other script.
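
As a sketch of the rule-ordering point, assume the only other redirect on the site is a hostname canonicalization (an assumption for illustration). With the case rule first and a full hostname in its target, a request with both the wrong host and the wrong case is fixed in a single hop rather than two.

```apache
RewriteEngine On
RewriteMap lowercase int:tolower

# 1. Case fix first. The target already carries the canonical protocol and
#    hostname, so wrong-host plus wrong-case requests need only one redirect.
RewriteCond $1 [A-Z]
RewriteRule ^/([^/]+)/(.*)$ http://www.example.com/${lowercase:$1}/$2 [R=301,L]

# 2. Hostname canonicalization for everything that did not need a case fix.
#    The first condition skips requests that send no Host header at all.
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^/(.*)$ http://www.example.com/$1 [R=301,L]
```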

jackx




msg:4595493
 9:01 pm on Jul 22, 2013 (gmt 0)

a quick thanks to everyone for their input
