I'm trying to figure out a more efficient way of doing something with mod_rewrite. I have to rewrite a series of old URLs to new URLs using regexp substitution, converting all underscores to hyphens in the process. I don't have access to httpd.conf, so everything has to be done in .htaccess and hence without RewriteMap.
The old URLs look like this: [url.tld...]
The new URLs look like this:
[url.tld...]
There are, say, a dozen different sections. The old section and page names contain varying numbers of underscores, but the total number of underscores in an old URL is never more than six.
So far the best thing I've come up with is to convert old section names to new section names in one step, then convert underscores to hyphens in a second step, e.g.
RewriteRule old/section/name/(.*)$ [url.tld...] [R=301,L]
[followed by 11 more rules for other sections]
RewriteRule ([^_]+)_([^_\.]+).html [url.tld...] [R=301,L]
[followed by more rules covering the other possible numbers of underscores]
I like this because I only have to use 12+6=18 rules, instead of 12x6=72 rules were I to cover every possible combination of section name and underscore-hyphen rewrites in a unique rule.
What's inefficient about this (and potentially annoying to users) is that the only way I've gotten it to work is by using [R=301,L] after the first *and* second sets of rules, thus sending a partially rewritten URL back to the browser and only giving the correct URL on the second try. Which is ugly.
I have tried removing the flags from the rules in the first stage, but when I do that I end up with a completely unrewritten URL. How do I get it to continue rewriting after the first stage, and only send the fully rewritten URL back to the user when it's finished?
Many thanks,
akitinic
RewriteRule ^old/section/name/(([^_]+_)+[^_\.]+\.html)$ this-is-the-new-section-name/$1
RewriteRule ^([^_]+)_([^_\.]+)\.html$ http://www.example.com/$1-$2.html [R=301,L]
Jim
Many thanks for the suggestions. I tried leaving the [L] flag off the first set of rules, and it didn't seem to continue correctly to run the second set. In the end, I got the following to work:
# 1. Rewrite underscores to hyphens.
RewriteRule ^([^_]+)_(.+)$ $1-$2 [L]
# 2. Rewrite section names.
RewriteRule ^old-section-name(.*)$ new-section-name$1 [R=301,L]
# Followed by additional section-specific rules.
I only used rule #1 once, but with the [L] flag (a) Apache reruns the single rule until it has replaced all underscores in a URL with hyphens and (b) it only generates a 301 when it gets to rule(s) #2. This makes no sense to me, as I thought [L] was supposed to terminate rewriting, not redo it. It didn't work without the flag, and it didn't work with the [N] flag!
Happy I got it working, but a bit confused,
akatinic
The [N] flag may not have worked because of the bug I mentioned, which can cause the rewritten URL-path to become 'corrupted' as described.
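As for [L] seeming to "redo" the rewriting: as far as I know, in .htaccess (per-directory) context [L] only ends the current pass through the rules. If that pass changed the URL, the result is handed back to the server and the .htaccess rules run again from the top, which would explain why your single underscore rule appears to repeat. Here is a rough Python model of that loop, using your two rules. It is only a sketch, not Apache's actual code: the names `one_pass` and `handle`, the `max_passes` cap, and the sample path are all made up for illustration.

```python
import re

# Your two rules as (pattern, substitution, is_redirect).
# Both carry [L], so a pass stops at the first rule that matches.
RULES = [
    (re.compile(r"^([^_]+)_(.+)$"), r"\1-\2", False),                   # rule 1
    (re.compile(r"^old-section-name(.*)$"), r"new-section-name\1", True),  # rule 2, R=301
]

def one_pass(path):
    """Run the ruleset once, stopping at the first matching rule (the [L] effect)."""
    for pattern, substitution, is_redirect in RULES:
        if pattern.search(path):
            return pattern.sub(substitution, path, count=1), is_redirect, True
    return path, False, False

def handle(path, max_passes=10):
    """Model of per-directory processing: a rewritten URL re-enters the ruleset.

    max_passes is a hypothetical safety cap for this sketch.
    """
    trace = [path]
    for _ in range(max_passes):
        path, is_redirect, matched = one_pass(path)
        if not matched:
            return "serve", trace       # nothing matched: serve the path as-is
        trace.append(path)
        if is_redirect:
            return "301", trace         # external redirect ends this request
    return "loop limit", trace

status, trace = handle("old_section_name/some_page_name.html")
print(status)
for step in trace:
    print(step)
```

In this model each pass replaces only the first underscore, and the 301 is issued only once no underscores remain and rule 2 finally gets a chance to match, which is the behaviour you describe.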
Jim