RedirectMatch with replacement?

Replacing a character in a URL while doing a Redirect match

         

BigBadBurrow

10:49 pm on Mar 27, 2006 (gmt 0)


Hello all,

I'm trying to write a RedirectMatch for my website that will redirect URLs ending with -similar.htm and replace any tildes (~) in the URL with hyphens.

My first objective is easy, and I'm using this expression:

RedirectMatch (.*)-similar\.htm$ [mydomain.com...]

I'm not sure how to replace the tildes in the URL though. For example, I would like this URL:

[mydomain.com...]

to change to:

[mydomain.com...]

Does anyone have any ideas? I'm quite new to regular expressions so I would really appreciate your help!

Many thanks,

BBB

jdMorgan

11:41 pm on Mar 27, 2006 (gmt 0)


Because you can have multiple tildes in the path, it would be best not to redirect until all are changed. Otherwise, you'd be doing multiple redirects, one for each tilde, with the client having to re-request the URL several times. For this reason, mod_rewrite would be better-suited to this problem.

If you have a small number of possible tildes in the URL, a simple solution would be to apply (say three or four) rewrites, followed by a redirect:


Options +FollowSymLinks
RewriteEngine on
#
RewriteRule ^([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.html$ /$1-$2-$3-$4-$5-$6-similar\.html
RewriteRule ^([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.html$ /$1-$2-$3-$4-similar\.html
RewriteRule ^([^~]+)~([^~]+)~([^\-]+)-similar\.html$ /$1-$2-$3-similar\.html
RewriteRule ^([^~]+)~([^\-]+)-similar\.html$ /$1-$2-similar\.html
RewriteRule ^(.+)-similar\.html$ http://www.example.com/$1-new.html [R=301,L]

That will replace one through eleven tildes with hyphens, and then redirect. You can expand this to handle up to 29 tildes, and then a recursive method would be needed to avoid multiple redirects (see the mod_rewrite RewriteRule [N] flag).
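
The "one through eleven" figure can be checked outside Apache. What follows is a minimal Python sketch, not Apache configuration: it assumes Python's re module handles these simple patterns the same way Apache's regex engine does, it leaves out the leading slash in the substitutions, and the rewrite_chain helper and the sample paths are made up for illustration. Each rule is applied once, in order, and then the redirect target is built.

```python
import re

# The four tilde-replacing rules from the post, in order: (pattern, substitution).
# Apache's $1 back-references are written as \1 in Python.
RULES = [
    (r"^([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.html$",
     r"\1-\2-\3-\4-\5-\6-similar.html"),
    (r"^([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.html$",
     r"\1-\2-\3-\4-similar.html"),
    (r"^([^~]+)~([^~]+)~([^\-]+)-similar\.html$",
     r"\1-\2-\3-similar.html"),
    (r"^([^~]+)~([^\-]+)-similar\.html$",
     r"\1-\2-similar.html"),
]


def rewrite_chain(path):
    """Apply each rule once, in order, then build the redirect target.

    Hypothetical helper: returns None if the path does not end in
    -similar.html after the rewrites.
    """
    for pattern, replacement in RULES:
        # A rule that does not match leaves the path unchanged.
        path = re.sub(pattern, replacement, path)
    m = re.match(r"^(.+)-similar\.html$", path)
    if m is None:
        return None
    return "http://www.example.com/" + m.group(1) + "-new.html"


print(rewrite_chain("x~y~z-similar.html"))
# http://www.example.com/x-y-z-new.html
```

The rules replace five, three, two, and one tilde respectively, which is where the limit of eleven comes from. In this sketch a path with twelve tildes comes out with one tilde still in it.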

An alternative would be to handle the various cases individually, up to eight tildes:


# RedirectMatch is matched against the full URL-path, which starts with "/".
# The slash is matched outside the first group so that $1 does not begin with one.
RedirectMatch ^/([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-$4-$5-$6-$7-$8-$9-new.html
RedirectMatch ^/([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-$4-$5-$6-$7-$8-new.html
RedirectMatch ^/([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-$4-$5-$6-$7-new.html
RedirectMatch ^/([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-$4-$5-$6-new.html
RedirectMatch ^/([^~]+)~([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-$4-$5-new.html
RedirectMatch ^/([^~]+)~([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-$4-new.html
RedirectMatch ^/([^~]+)~([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-$3-new.html
RedirectMatch ^/([^~]+)~([^\-]+)-similar\.htm$ http://www.example.com/$1-$2-new.html

Using this RedirectMatch method, only eight tildes can be replaced with a single redirect, since we are limited to nine back-reference variables, $1 through $9.

Note that the above code has wrapped; each RedirectMatch directive should be all on one line.
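
To try the RedirectMatch list without touching a server, here is a rough Python sketch of the same idea. It is not Apache configuration: it assumes Python's re module matches these patterns the way Apache does, it assumes the first matching directive in the list is the one that applies, and it leaves the leading slash of the URL-path out for simplicity. The build_rule and redirect_target helpers and the sample paths are hypothetical.

```python
import re


def build_rule(tildes):
    """Return (compiled pattern, target template) for a given tilde count."""
    # One ([^~]+) group before each tilde, then ([^\-]+) for the last segment.
    groups = ["([^~]+)"] * tildes + [r"([^\-]+)"]
    pattern = "^" + "~".join(groups) + r"-similar\.htm$"
    # Apache's $1..$9 become \1..\9 in a Python template.
    refs = "-".join("\\%d" % i for i in range(1, tildes + 2))
    target = "http://www.example.com/" + refs + "-new.html"
    return re.compile(pattern), target


# Eight tildes first, one tilde last, in the same order as the directives.
RULES = [build_rule(n) for n in range(8, 0, -1)]


def redirect_target(path):
    """Return the target of the first matching rule, or None if none match."""
    for pattern, target in RULES:
        m = pattern.match(path)
        if m:
            return m.expand(target)
    return None


print(redirect_target("x~y~z-similar.htm"))
# http://www.example.com/x-y-z-new.html
```

Because the last group only excludes hyphens, it can still swallow tildes. That is why the longest pattern has to come first, and why a path with nine tildes keeps one of them in this sketch.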

FYI, the [^~]+ construct means, "match one or more characters, not equal to a tilde." In practical terms and as used here, it means, "match all the characters until you find another tilde." This method is much less ambiguous and much much more efficient than using multiple .* patterns.
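
To see the difference concretely, here is a small sketch in Python rather than Apache configuration. It assumes Python's re module treats these simple patterns the same way Apache's regex engine does, and the sample path is made up.

```python
import re

path = "x~y~z-similar.htm"  # hypothetical sample path

# [^~]+ cannot cross a tilde, so the first group must stop at the first one.
strict_first = re.match(r"^([^~]+)~", path).group(1)

# .* is greedy: it first swallows the whole string, then backtracks until a
# tilde can follow it, so the first group ends at the LAST tilde instead.
loose_first = re.match(r"^(.*)~", path).group(1)

print(strict_first)  # x
print(loose_first)   # x~y
```

With .* a tilde can end up inside a captured group, and the engine has to backtrack to find the split; with [^~]+ there is only one way to divide the path.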

Jim

BigBadBurrow

8:52 am on Mar 28, 2006 (gmt 0)


Thanks Jim!

The reason I'm doing a redirect as opposed to a mod_rewrite is that I'm changing the URLs sitewide from -similar.html to -new.html. If I use a mod_rewrite then I will in effect have the same content being displayed from two URLs (the new x-y-z-new.html and the old x~y~z-similar.html), which I think the search engines will see as an attempt to create thousands of doorway pages.

So for that reason I think your second solution using the different RedirectMatch cases will work for this situation, and will be the safer option.

Many thanks for your help, it's much appreciated.

BBB