301 Redirects and 301 Rewrites

Using mod_alias with mod_rewrite in .htaccess

     
8:32 pm on Jan 13, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1378
votes: 18


All over the web you will find well-intentioned people advising that you should not - or even must not - use mod_alias Redirects and mod_rewrite Redirects in the same .htaccess file.

Sometimes a seemingly plausible reason ("redirect chain") is offered, but mostly the advice is just recited as received wisdom, a dogma that has been treated as gospel on webmaster forums for about fifteen years now.

This is despite the two modules being compatible when used together in the Apache configuration files, no bug having been reported, and the documentation stating that "mod_rewrite should be considered a last resort... understanding what other alternatives are available is a very important step towards mod_rewrite mastery".

The common belief that using mod_alias and mod_rewrite together is "wrong" was apparently derived from a misunderstanding of ancient posts in this forum, and I think it should be challenged.

Proof Of Concept
Tested on standard Apache 2.x (shared hosting, no SSL; the END flag used below needs Apache 2.3.9 or later):

# Samizdata Redirect Method (non SSL)

# Not Required
PassEnv REQUEST_URI
PassEnv REDIRECT_URI

# Redirect URLs
Redirect 301 /apache.htm http://www.example.com/litespeed.htm
RedirectMatch 301 ^/nginx /litespeed.htm

# Enable Rewrites
RewriteEngine On

# Check Redirects First
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) - [END]

# Canonical Fix (non SSL)
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.%{SERVER_NAME} [NC]
RewriteRule (.*) http://%{SERVER_NAME}/%{REQUEST_URI} [R=301,L]

# See SSL version below.

Results

Request: example.com/apache.htm : Log Entries

"GET /apache.htm HTTP/1.1" 301 245 "-"
"GET /litespeed.htm HTTP/1.1" 200 38 "-"

Request: example.com/apache.htm : Live Headers

Server: Apache
GET http://www.example.com/apache.htm [HTTP/1.1 301 Moved Permanently 0ms]
GET http://www.example.com/litespeed.htm [HTTP/1.1 200 OK 107ms]

Results speak for themselves.

But what you really want is this:

Untested SSL Version

# Samizdata Redirect Method (SSL)

# Redirect URLs
Redirect 301 /apache.htm https://www.example.com/litespeed.htm
RedirectMatch 301 ^/nginx /litespeed.htm

# Enable Rewrites
RewriteEngine On

# Check Redirects First
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) - [END]

# Canonical & Encryption
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.%{SERVER_NAME} [NC,OR]
RewriteCond %{HTTPS} !=on
RewriteRule (.*) https://%{SERVER_NAME}/%{REQUEST_URI} [R=301,L]

I can't test this as my secure server runs on LiteSpeed (which has a built-in fix), but it should work the same way as the unencrypted example - a single 301 will show in the http log, with a 200 in the https log as expected.
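One caveat for anyone copying the canonical rule: mod_rewrite expands %{...} variables in the TestString of a RewriteCond and in the RewriteRule substitution, but not inside the condition pattern, so a pattern such as !^www\.%{SERVER_NAME} is compared as literal text. REQUEST_URI also already begins with a slash. A minimal sketch of a more conventional form follows, assuming the canonical hostname is www.example.com (hard-coded here as an assumption, not part of the tested configuration):

```apache
RewriteEngine On

# Skip requests that carry no Host header (e.g. some HTTP/1.0 clients)
RewriteCond %{HTTP_HOST} .
# Redirect when the host is not the canonical one (optional port allowed),
# or when the request did not arrive over HTTPS
RewriteCond %{HTTP_HOST} !^www\.example\.com(:[0-9]+)?$ [NC,OR]
RewriteCond %{HTTPS} !=on
# REQUEST_URI already starts with "/", so no slash is added after the hostname
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```

The hostname is written as a literal because the pattern side of a RewriteCond is a plain regular expression; www.example.com is a placeholder to replace with your own domain.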

Forget about "magical incantations" or "voodoo".

Things happen in the right order, so code accordingly.

...

The Right Order

Apache processes an incoming request in several phases.

The last phase before a response is served is the Fixups phase.

The fixups phase is used by modules to 'reassert' their ownership or force the request's fields to their appropriate values. [httpd.apache.org...]

Fixups is when your DocumentRoot .htaccess files are scanned - and by design they are scanned several times, with the cycles taking all the modules invoked by your .htaccess into account on each pass.

# Enable Rewrites
RewriteEngine On

This invokes mod_rewrite, which is documented to run before mod_alias does.

# Check Redirect Status
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) - [END]

This first rule checks whether a response status code has already been set. If not, it sets the status to 200 (the default unless specified), terminates the first mod_rewrite Fixup pass (END flag), and moves on to the next module in the sequence (whatever it may be), leaving other RewriteRules to be processed in a later cycle.
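For readers unfamiliar with the flag: according to the mod_rewrite documentation, [L] ends only the current pass through the rule set, whereas [END] (Apache 2.3.9 and later) also prevents any further rewrite processing in per-directory (.htaccess) context. A small illustration, using hypothetical file names that are not part of the tested setup:

```apache
RewriteEngine On

# [L]: stop processing rules for this pass. In .htaccess, an internal
# rewrite can cause the rule set to be run again against the new URL.
RewriteRule ^old-l\.htm$ /new.htm [L]

# [END]: stop processing and do not run the rule set again
# for this request in .htaccess context
RewriteRule ^old-end\.htm$ /new.htm [END]
```

Neither flag affects the new request a browser sends after an external redirect; that request starts rule processing from the top.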

# Environment
PassEnv REQUEST_URI
PassEnv REDIRECT_URI

This code is actually not necessary - it was intended to ensure that REQUEST_URI and REDIRECT_URI were shared between modules (they are by default) and available to scripts (an as yet untested possibility). It illustrates the "magic" part of the Fixup process neatly.

Modules I invoke in my own .htaccess include mod_alias, mod_env, mod_setenvif, mod_mime, mod_headers and mod_dir as well as mod_rewrite, but our only concern here is with mod_alias redirection.

# Redirect URLs
Redirect 301 /apache.htm https://www.example.com/litespeed.htm
RedirectMatch 301 ^/nginx /litespeed.htm

This invokes mod_alias - RedirectMatch is more versatile than Redirect, does not require a full URL-path for the target, and can use regular expressions.
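The difference between the two directives can be sketched like this, using hypothetical paths chosen only for illustration:

```apache
# Redirect matches a URL-path prefix; whatever follows the prefix is
# appended to the target, so /old-dir/page.htm goes to /new-dir/page.htm
Redirect 301 /old-dir /new-dir

# RedirectMatch takes a regular expression and can reuse captured groups,
# here changing both the directory and the file extension
RedirectMatch 301 ^/archive/(.+)\.htm$ /articles/$1.html
```

When the target begins with a slash rather than a scheme, the server adds the scheme and hostname of the current request.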

Either type of directive will change the REQUEST_URI variable (to the redirect target) and the REDIRECT_STATUS (302 unless specified), then feed them back into the Fixups cycle.

Regardless of other modules, mod_rewrite will start from RewriteEngine On as usual on the next pass - but the REDIRECT_STATUS variable now has an existing value (200 or 301), which allows the request a passthrough to the second rule.

# Canonical & Encryption
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.%{SERVER_NAME} [NC,OR]
RewriteCond %{HTTPS} !=on
RewriteRule (.*) https://%{SERVER_NAME}/%{REQUEST_URI} [R=301,L]

On this mod_rewrite pass the canonical and encryption rule in our example is dealing with the REQUEST_URI variable from mod_alias (litespeed.htm) rather than the original request (apache.htm), now preserved as the REDIRECT_URI variable.

The substitution was done internally, so there is only one redirect.

If the original incoming request had been for a resource you are not redirecting, then that would be the REQUEST_URI, producing a single canonical and encryption redirect (with no SEO downside). A request for a nonexistent "iis.htm" would more likely trigger an ErrorDocument 404 directive (core or custom).

Any other rewrite rules below the canonical one are then processed in the order they are written. I have about 65 rules (with 1200 conditions) in my boilerplate .htaccess file, most of which deal with bot control.

I keep my mod_alias Redirects at the top of the file for convenience only - mod_rewrite always runs first, but I usually have at least 1,500 lines of code for that module under RewriteEngine On, and only 50 lines for all the other modules combined.

What you shouldn't do is mix lines of code from the separate modules in the hope of influencing the execution order - communication between the modules is done by modifying the environment variables that are available in the Fixups loop.

The Fixups phase eventually ends when a scan cycle of all the modules finds no more directives left to process.

The response to the request is then served:

"GET /apache.htm HTTP/1.1" 301 245 "-"
"GET /litespeed.htm HTTP/1.1" 200 38 "-"

One redirect, no chain, no problem.
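To watch how mod_rewrite handles a request on a server you control, its trace logging can help. A sketch, assuming Apache 2.4 and access to the main server or virtual host configuration (LogLevel is not permitted in .htaccess, so this is not available from a typical shared hosting account):

```apache
# Place in httpd.conf or inside a <VirtualHost> block, not in .htaccess.
# Keeps the global level at alert but traces mod_rewrite rule processing,
# which is written to the error log.
LogLevel alert rewrite:trace3
```

Trace logging slows the server noticeably, so it is best limited to a test machine and switched off afterwards.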

...

It's easy to do, and even easier to copy and paste.

Redirects in mod_alias are not a problem - but mod_rewrite logic can be.

My thanks to jdMorgan, who once advised me to read the Apache documentation.

Cheers Jim, I finally got around to it.

...

[edited by: phranque at 7:39 am (utc) on Jan 14, 2019]
[edit reason] unlinked urls [/edit]

10:17 am on Jan 14, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11875
votes: 246


for those still interested, much of this was already discussed in this thread:
[webmasterworld.com...]

that thread is where samizdata offered this advice:
Avoid using mod_alias and mod_rewrite together unless you either have access to the Apache configuration above Directory level or are on a LiteSpeed server (where the two modules seem to play nice).

The current Apache .htaccess implementation omits the Passthrough option offered at VirtualHost level, so a rewritten URI cannot be passed internally to mod_alias before execution (as it sometimes needs to be) in a shared hosting environment.

This may lead to a chain of two 301 Permanent redirects being executed, generally undesirable and not popular with search engines.

A future Apache release needs to address this issue - mod_rewrite and mod_alias will play nice if you ask them, but you can't ask them from a shared hosting account.
10:42 am on Jan 14, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1378
votes: 18


That was a proposed modification of the misguided boilerplate advice given out on WebmasterWorld.

It is superseded by the post above, which I recommend that you read carefully.

And my advice on the other thread was:

My advice is to RTFM and keep testing your RewriteRule until you get it right.

I have done the reading and testing for you, see above.

It is perfectly possible - desirable even - to use mod_alias and mod_rewrite together.

It always has been.

...
12:29 pm on Jan 14, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1378
votes: 18


This is the magic bullet:

# Enable Rewrites
RewriteEngine On

# Check Redirects First
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) - [END]

This instruction to check with mod_alias before further processing must be the first RewriteRule if you use any external redirects in mod_rewrite.

The second rule should be the problematic canonical redirect, which will not work as intended in all circumstances unless you do this check.

We all need the canonical redirect but it can only be done with mod_rewrite, so in order to use it without breaking another Apache module you must do the Environment variable check first.

The problem is not the mod_alias Redirect.

The problem is the mod_rewrite canonical Redirect.

The solution is the RewriteRule above.

...
2:23 pm on Jan 14, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1378
votes: 18


Q: What is the Canonical & Encryption RewriteRule (that we all use) actually for?

A: It is meant to capture *all* incoming requests and - if necessary - redirect them to https and your preferred canonical format.

Q: So why do I get a redirect chain if I use mod_alias Redirects?

A: Because your Canonical & Encryption RewriteRule does not capture *all* incoming requests.

Q: So how do I fix this?

A: Use the Environment variable check posted above to capture the mod_alias Redirects, problem solved.

Q: Are you crazy?

A: Some people think so, I beg to differ.

...
9:23 pm on Jan 14, 2019 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 29, 2006
posts:1378
votes: 18


IMPORTANT CORRECTION

Yes, I found a problem. No, I did not give up.

I corrected the first RewriteRule:

# Allow mod_alias Redirect
RewriteCond %{ENV:REDIRECT_STATUS} 301
RewriteRule (.*) - [L]

It allows a passthrough to the mod_alias redirect.

But it also inserts an unwanted extra slash.

Adding this should probably take care of it:

RedirectMatch ^/(.*?)/$ /$1
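Two things are worth checking before relying on that line. Without a status argument, RedirectMatch sends a 302, and the pattern matches every URL that ends in a slash, including real directories. A hedged variant with an explicit status, keeping the same pattern:

```apache
# Explicit 301; without a status argument RedirectMatch sends a 302.
# Caution: this also matches real directory URLs such as /images/,
# which mod_dir (DirectorySlash On, the default) redirects back to the
# trailing-slash form, so the two redirects can loop.
RedirectMatch 301 ^/(.*?)/$ /$1
```

The /images/ path is a hypothetical example; whether the loop occurs depends on which directories exist on the site.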

My advice was to keep testing until you get it right.

I am taking my own advice.

...
3:41 am on Jan 15, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11875
votes: 246


This is the magic bullet:
...
We all need the canonical redirect but it can only be done with mod_rewrite, so in order to use it without breaking another Apache module you must do the Environment variable check first.
...
The solution is the RewriteRule above.

am i understanding this correctly?
you are using mod_rewrite directives to avoid firing the hostname canonicalization (mod_rewrite) redirect ruleset in order to prove that you can make a mod_alias directive fire first.

why?

Q: Are you crazy?

i enjoy watching gymnastics but it isn't my preferred method of daily exercise.