URL redirect for .htaccess

using regular expressions

         

jdkuehne

8:39 pm on Feb 10, 2005 (gmt 0)

10+ Year Member



I recently changed the file extensions on our company website from .html to .shtml so we can use SSI (server-side includes). While looking for a way to redirect search-engine hits on the old URLs, I ran across this ruleset on Apache's website:

# backward compatibility ruleset for
# rewriting document.html to document.shtml
# when and only when document.shtml exists
# but no longer document.html
RewriteEngine on
RewriteBase /~quux/
# parse out basename, but remember the fact
RewriteRule ^(.*)\.html$ $1 [C,E=WasHTML:yes]
# rewrite to document.shtml if exists
RewriteCond %{REQUEST_FILENAME}.shtml -f
RewriteRule ^(.*)$ $1.shtml [S=1]
# else reverse the previous basename cutout
RewriteCond %{ENV:WasHTML} ^yes$
RewriteRule ^(.*)$ $1.html

This works great for pages in the main directory, but requests for pages in subdirectories still end up at the 404 error page instead of being rewritten. Since regular expressions are making my head spin, can anyone point the way to a resolution? Thanks.

jdMorgan

9:41 pm on Feb 10, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It seems to me that the code should work as-is.

You can try rooting the substitution URL-path by preceding it with a slash and see if that helps:


RewriteRule ^(.+)\.html$ $1 [C,E=WasHTML:yes]
RewriteCond %{REQUEST_FILENAME}.shtml -f
RewriteRule ^(.*)$ /$1.shtml [S=1]
RewriteCond %{ENV:WasHTML} ^yes$
RewriteRule ^(.*)$ /$1.html

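The code above assumes that you need to process more RewriteRules after those shown. If that's not the case, you can speed it up somewhat by using [L] in place of [S=1] and adding [L] to the last rule, so that rewrite processing stops as soon as one of them matches: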

RewriteRule ^(.+)\.html$ $1 [C,E=WasHTML:yes]
RewriteCond %{REQUEST_FILENAME}.shtml -f
RewriteRule ^(.*)$ /$1.shtml [L]
RewriteCond %{ENV:WasHTML} ^yes$
RewriteRule ^(.*)$ /$1.html [L]

Another possible reason for problems is that the subdirectories are not inheriting the rules from their parents. In that case, try adding RewriteOptions inherit to .htaccess in a subdirectory, and see if that helps.
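As a rough sketch of what that might look like: the path below is hypothetical, and this assumes the subdirectory already has its own .htaccess containing mod_rewrite directives, which is the situation in which the parent's rules would otherwise be ignored there.

```apache
# Hypothetical file: /products/.htaccess (the directory name is an assumption)
RewriteEngine on
# Pull in the RewriteCond/RewriteRule directives from the parent directory's .htaccess
RewriteOptions inherit
# Any rules specific to this subdirectory would go here
```

Keep in mind that inherited rules are evaluated as if they had been written in the subdirectory's own .htaccess, so their patterns see the URL-path relative to that subdirectory.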

Jim

jdkuehne

3:11 pm on Feb 11, 2005 (gmt 0)

10+ Year Member



Thanks for the reply, Jim. Your first suggestion gave me internal server errors, but your suggestion to use 'RewriteOptions inherit' seems to be doing the job.

On a separate but related topic, are there any newbie-friendly tutorials on regular expressions? Every one I've looked at so far makes my eyes start crossing after a few pages. ;)

Thanks again.

jdkuehne

jdMorgan

4:39 pm on Feb 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I suggest the one cited in our Forum Charter. It's pretty good, and short.

One thing you'll notice as you deal more with regex is that it is hard to read, but much, much easier to write once you get comfortable with it. Regular expressions are so powerful, and each little nuance so important, that it's difficult to comprehend the "side effects" of a particular complex pattern unless you know exactly what the original goal was. However, if you set out with a clear goal in mind, then it gets quite easy to write the regular-expression pattern needed to reach that goal... Well, it's easy most of the time. ;)
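To illustrate the "start from the goal" idea, here is the first rule from this thread taken apart piece by piece. The annotations are only a reading aid, and the stated goal is my paraphrase of the ruleset's own comments.

```apache
# Goal: match any request ending in ".html" and capture everything before it.
#   ^       anchor at the start of the URL-path
#   (.+)    capture one or more characters of any kind as $1
#   \.      a literal dot (an unescaped "." would match any character)
#   html    the literal text "html"
#   $       anchor at the end of the URL-path
RewriteRule ^(.+)\.html$ $1 [C,E=WasHTML:yes]
```

Read in that order, the pattern is just the goal restated one token at a time.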

Jim

jdkuehne

7:38 pm on Feb 11, 2005 (gmt 0)

10+ Year Member



Thanks Jim,

I've looked at that one, but I'll try again. I guess I need to start trying to write some regular expressions. I do know how to read *.* I guess that's a start. :)

jdkuehne

jdMorgan

12:56 am on Feb 12, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I do know how to read *.* I guess that's a start. :)

Yes, but avoid using the regex wildcard .* whenever possible: it is the least efficient and most ambiguous pattern, and over-using it often leads to performance problems and "mysterious", unexpected mod_rewrite results.
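For example, the broad capture in the first rule from this thread could be narrowed along these lines. This is only a sketch, and it assumes that your page names contain no dots other than the one before the extension.

```apache
# Broad: (.+) accepts any run of characters, including slashes and extra dots
RewriteRule ^(.+)\.html$ $1 [C,E=WasHTML:yes]

# Narrower sketch: zero or more "dir/" segments, then a filename made of
# characters other than "/" and "." ; $1 is still the whole path without ".html"
RewriteRule ^(([^/]+/)*[^/.]+)\.html$ $1 [C,E=WasHTML:yes]
```

The second form says exactly what a valid page path looks like, so it is easier to predict what it will and won't match. Use one form or the other, not both.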

Jim