How to 301 redirect using wildcards?

Need to change all underscores to hyphens

         

Imaster

8:31 pm on Nov 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I want to use a 301 redirect because I am moving to a new domain. The URLs on my old domain use underscores (_), and since I am moving anyway, I thought I might as well convert those underscores to hyphens (-).

I have many pages, and I do not want to make a complete list of old URLs and their corresponding new URLs in order to 301 redirect them. Is there a way to write a single 301 redirect line using wildcards that redirects to the new domain and also converts all underscores to hyphens?
e.g.
olddomain.com/a_b_c.html -> newdomain.com/a-b-c.html
olddomain.com/d_e_f.html -> newdomain.com/d-e-f.html
and so on

Thanks in advance

jdMorgan

5:03 pm on Nov 5, 2003 (gmt 0)




Imaster,

This is a very interesting problem that pushes the limits of non-scripted solutions. It's somewhat nasty and inefficient, but it can be solved in several ways using mod_rewrite:


# Two or more underscores -- replace the first one and restart loop internally
# (the pattern itself requires a second underscore, so it cannot consume the last one)
RewriteCond %{REQUEST_URI} _.*_
RewriteRule ^([^_]*)_([^_]*_.*)$ /$1-$2 [N]
# One underscore in URL -- replace it and do external redirect
RewriteCond %{REQUEST_URI} _
RewriteRule ^([^_]*)_([^_]*)$ http://www.example.com/$1-$2 [R=301,L]

Make sure you place this code at or very near the TOP of your file; it restarts mod_rewrite processing each time it finds and replaces an underscore!
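To make the looping idea easier to follow, here is a small Python sketch that mimics what the two rules are meant to do: replace one underscore per internal pass, then issue a single external redirect for the last one. It is only a model, not Apache itself. The function name simulate_loop, the NEW_HOST placeholder, and the exact form of the first pattern (written so that it only matches while at least two underscores remain) are assumptions of this sketch.

```python
import re

# Placeholder for the new domain, matching the example host used in the rules.
NEW_HOST = "http://www.example.com"

# Assumed pattern: matches only while two or more underscores remain.
FIRST_OF_MANY = re.compile(r"^([^_]*)_([^_]*_.*)$")
# Matches a path that contains exactly one underscore.
LAST_ONE = re.compile(r"^([^_]*)_([^_]*)$")


def simulate_loop(path):
    """Return (redirect_target, internal_passes) for a URL path.

    redirect_target is None when the path has no underscore at all.
    """
    passes = 0
    while True:
        m = FIRST_OF_MANY.match(path)
        if not m:
            break
        # Replace the first underscore only, then "restart" the loop.
        path = f"{m.group(1)}-{m.group(2)}"
        passes += 1
    m = LAST_ONE.match(path)
    if m:
        # Final underscore: replace it and redirect externally.
        return f"{NEW_HOST}/{m.group(1)}-{m.group(2)}", passes
    return None, passes


print(simulate_loop("a_b_c.html"))
```

Each extra underscore costs one more internal pass, which is why this method is best kept near the top of the file.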

An alternative approach, if you have only a few underscores per URL (up to five shown here) might be:


RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ http://www.example.com/$1-$2-$3-$4-$5-$6 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)$ http://www.example.com/$1-$2-$3-$4-$5 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)$ http://www.example.com/$1-$2-$3-$4 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)_([^_]*)$ http://www.example.com/$1-$2-$3 [R=301,L]
RewriteRule ^([^_]*)_([^_]*)$ http://www.example.com/$1-$2 [R=301,L]

The second method is better if you cannot place the code near the top of your file. Both methods are designed to avoid multiple external (and therefore slow) redirects. These methods may be given elsewhere as examples, but I just typed this code; it should work, but it is not tested at all. Test in a subdirectory or on a development server before you deploy this code on a live server!

If you have access to httpd.conf (main server configuration file), you can also use RewriteMap to do this. It might be more efficient.
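As a rough illustration of the RewriteMap idea, an external map program (the prg: map type) reads one lookup key per line on standard input and writes one result per line on standard output, without buffering. The Python sketch below does the underscore-to-hyphen substitution in that style. The map name, script path, and the rule shown in the comments are hypothetical and untested; only the read-a-line, write-a-line protocol is taken from how prg: maps work.

```python
#!/usr/bin/env python3
import sys

# Hypothetical httpd.conf wiring (names and path are assumptions, untested):
#   RewriteMap dashes prg:/path/to/underscore_map.py
#   RewriteRule ^/(.*_.*)$ http://www.example.com/${dashes:$1} [R=301,L]


def to_hyphens(key):
    """Replace every underscore in the lookup key with a hyphen."""
    return key.replace("_", "-")


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Answer one lookup per input line until the input stream is closed."""
    for line in stdin:
        stdout.write(to_hyphens(line.rstrip("\n")) + "\n")
        # The server waits for each answer, so output must not be buffered.
        stdout.flush()


# When deployed as the map program, end the script with a call to serve().
```

Because the program is started once and then kept running by the server, all substitutions happen in a single lookup regardless of how many underscores the URL contains.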

You might also consider calling a simple cgi script to do the character substitution and redirection, if you are more comfortable with that approach. Use only a 301-Moved Permanently redirect to avoid losing your search engine rankings!
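For readers who prefer the script route, a minimal CGI sketch in Python might look like the following. It assumes the server passes the originally requested URL in the REQUEST_URI environment variable and that requests containing underscores are routed to this script by some rule of your own; both points, along with the NEW_HOST placeholder and the function name, are assumptions rather than tested configuration.

```python
#!/usr/bin/env python3
import os
import sys

# Placeholder for the new domain, matching the example host used above.
NEW_HOST = "http://www.example.com"


def build_response(environ):
    """Build a CGI response that 301-redirects to the hyphenated URL."""
    uri = environ.get("REQUEST_URI", "/")
    # Only touch the path; leave any query string exactly as requested.
    path, sep, query = uri.partition("?")
    target = NEW_HOST + path.replace("_", "-") + sep + query
    return (
        "Status: 301 Moved Permanently\r\n"
        f"Location: {target}\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
        "Moved permanently\n"
    )


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ))
```

The explicit Status header is what makes this a 301 rather than the default temporary redirect.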

No matter which approach you use, I strongly suggest writing individual redirects for the pages, scripts, and images that currently consume the top 33% of your bandwidth. The above code should work, but as stated, it's not terribly efficient. Bypassing it for your busiest pages will likely improve your site's performance.
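One way to produce those individual redirects without typing them by hand is to generate them from a short list of your busiest URLs. The Python sketch below prints one explicit RewriteRule per page, intended to be pasted above the general underscore-handling code. The page list, the NEW_HOST placeholder, and the function name are hypothetical; check the generated lines on a development server before using them.

```python
import re

# Placeholder for the new domain, matching the example host used above.
NEW_HOST = "http://www.example.com"


def explicit_rule(old_path):
    """Build one RewriteRule line for a known URL path (no leading slash)."""
    new_path = old_path.replace("_", "-")
    # Escape regex metacharacters such as "." so the pattern matches literally.
    return f"RewriteRule ^{re.escape(old_path)}$ {NEW_HOST}/{new_path} [R=301,L]"


# Hypothetical list of the busiest pages, for example taken from access logs.
busiest = ["a_b_c.html", "d_e_f.html"]
for page in busiest:
    print(explicit_rule(page))
```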

Notes for all readers: It is not our normal practice to allow "write my code for me" posts here. However, this case is sufficiently interesting and complex that I decided to relax that rule temporarily. Despite that, this should not be taken as a precedent for change; the policy as stated in our Charter still stands. The link below may come in very handy if the code above is not immediately clear. Be sure to follow the links in the first post of the thread, as well as reading the post itself:

Ref links: Introduction to mod_rewrite [webmasterworld.com]

Jim

Imaster

5:35 pm on Nov 5, 2003 (gmt 0)




Many thanks, Jim. I really appreciate you taking the time to write the code for this complex case. Thank you.

Imaster

1:08 pm on Nov 6, 2003 (gmt 0)




I tried the method, but it looks like there is a limit on the number of backreferences I can use, the maximum being $9. Some of my pages have more than 9 underscores.

Is there a method that handles up to 15 underscores?

Thanks.

jdMorgan

4:20 pm on Nov 6, 2003 (gmt 0)




Imaster,

Method #1 should work regardless of the number of replacements needed, so I'll assume you are trying to use method #2.

Some subtle modifications are needed. Note that all the RewriteRules except the last one now use an internal path substitution, not a redirect. They also leave the final underscore of the current URI in place -- this is required in all cases to 'trigger' the last RewriteRule, which does the external redirect that tells the browser or spider that the URL has changed.


RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*)$ $1-$2-$3-$4-$5-$6_$7
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*)$ $1-$2-$3-$4-$5_$6
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*)$ $1-$2-$3-$4_$5
RewriteRule ^([^_]*)_([^_]*)_([^_]*)_(.*)$ $1-$2-$3_$4
RewriteRule ^([^_]*)_([^_]*)_(.*)$ $1-$2_$3
RewriteRule ^([^_]*)_([^_]*)$ http://www.example.com/$1-$2 [R=301,L]

This code, unlike the previous version, is now set up to process each rule on each request. It's better for your needs because you don't need so many rules; it allows up to 15 underscores to be replaced by 6 lines of code. If more than 15 underscores are present, then the first 15 will be replaced, the external redirect will take place, the browser (or spider) will issue a new request, and then the next 15 will be replaced, etc. This introduces the possibility of confusing search engine spiders (two 301 redirects while trying to fetch a resource), so make sure your limit is 15 underscores on files you *must* have properly indexed in the search engines. If not, then simply add another longer line with 8 backreferences at the top of this code, using the existing series as a pattern. Don't do it unless required, though -- it'll make the code ~15% less efficient.
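The cascade can be checked on paper, or with a short Python model like the one below, which applies the five internal patterns once each in order and then tries the final redirect pattern. It is a single-pass simplification: the function name and NEW_HOST placeholder are assumptions, and real Apache behavior may differ (for example, in .htaccess context the rules may be run again after an internal rewrite). In this model a name with 15 underscores comes out fully hyphenated, while a name with 17 underscores still has two left after the fifth rule, so the final pattern does not match.

```python
import re

# Placeholder for the new domain, matching the example host used above.
NEW_HOST = "http://www.example.com"

# (pattern, replacement) pairs copied from the five internal rules above.
INTERNAL_RULES = [
    (r"^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*)$",
     r"\1-\2-\3-\4-\5-\6_\7"),
    (r"^([^_]*)_([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*)$", r"\1-\2-\3-\4-\5_\6"),
    (r"^([^_]*)_([^_]*)_([^_]*)_([^_]*)_(.*)$", r"\1-\2-\3-\4_\5"),
    (r"^([^_]*)_([^_]*)_([^_]*)_(.*)$", r"\1-\2-\3_\4"),
    (r"^([^_]*)_([^_]*)_(.*)$", r"\1-\2_\3"),
]
# The last rule: exactly one underscore left, so redirect externally.
FINAL_RULE = r"^([^_]*)_([^_]*)$"


def single_pass(path):
    """Apply every rule once, in order; return the redirect URL or None."""
    for pattern, replacement in INTERNAL_RULES:
        # A rule that does not match leaves the path unchanged.
        path = re.sub(pattern, replacement, path, count=1)
    m = re.match(FINAL_RULE, path)
    if m:
        return f"{NEW_HOST}/{m.group(1)}-{m.group(2)}"
    return None  # zero underscores, or more than one still remaining


name15 = "_".join("abcdefghijklmnop") + ".html"  # 15 underscores
print(single_pass(name15))
```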

If you do notice performance problems, you might consider adding a 'skip' clause to the code to exclude file-types that do not contain underscores (if there are any), or put the code only in a subdirectory that requires it... anything to avoid processing this code for every page, image, script, and CSS file request to your server.

For example, to skip processing for .gif or .jpg files, you would add:


RewriteRule \.(gif|jpg)$ - [S=6]

to the top of the code above to skip the 6 rules that follow if the file type is .gif or .jpg. This speeds up serving images a lot, but slows down serving everything else a little. If there are more images requested than anything else (a common occurrence), this will help.

Jim