Apache redirect best-practice query

mod_rewrite or virtual host change?

         

petjb

11:45 pm on Dec 19, 2010 (gmt 0)

10+ Year Member



Hi all,

Firstly a massive 'thanks!' to all in this forum - particularly jdMorgan - I've been doing a lot of reading in here over the last few weeks and have learned heaps. I've been working with web serving software for years, but primarily IIS, so I've had some catching up to do to become more familiar with Apache.

Apologies for the length of this post, but I want to properly explain the problem that I'm attempting to solve, and the possible solutions I've considered. Any feedback is very welcome, thank you! :)

My question relates to a bunch of domain names that all point to one IP-based Apache virtual host - so currently:
  • domain1.com
  • domain1.com/index.html
  • www.domain1.com
  • www.domain1.com/index.html
  • domain2.com
  • domain2.com/index.html
  • www.domain2.com
  • www.domain2.com/index.html
(x about 30-odd domain names)

...all point to the same content, with no URL rewriting occurring.

What I have been tasked with is pointing all domain names to the one canonical address: www.domain1.com.

The first step - redirecting any traffic hitting www.domain1.com/index.html to www.domain1.com - seems pretty simple, by adding the following rule to the httpd.conf file:

# redirect home page /index.html to [domain1.com...] for SEO
# cannot use Redirect or RedirectMatch, as they cause a redirect loop

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html [NC]
RewriteRule ^index\.html$ [domain1.com...] [R=301,L]

If anyone has any comments/recommendations re this change, I'd love to hear them.


The second part - redirecting any www.domainX.com or domainX.com traffic to www.domain1.com - seems a bit trickier, and I'd like to implement the best-practice option, if there is indeed one! :) From what I can see, there are three options to achieve this:

1) Create a new IP-based VirtualHost, with a new public IP address, which I can then point all the non-canonical versions of the domain names to (eg domain1.com, domain2.com, www.domain2.com etc), which then pushes the traffic to the canonical name with a mod_rewrite redirect.

Pros:
  • minimal chance of breaking anything currently in production
  • simple process to follow if new domain names are added in the future; simply point them at the redirection IP address
Cons:
  • quite a bit of work; new IP address allocation, configuration on prod Apache boxes, config on load balancer etc
  • requires updating A records for all non-canonical domain names


2) Tweak existing httpd.conf file to handle redirection to canonical address

This seems to be the simplest option, but I'm concerned that there may be a performance hit with handling the change this way. The proposed code to handle this redirection is below...

# 301 redirect all FQDNs (other than www.domain1.com) to www.domain1.com for SEO
# maintains any url path/filename past the domain name

RewriteCond %{HTTP_HOST} !^www\.domain1\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^/?(.*) [domain1.com...] [R=301,NE,L]

Pros:
  • simple change; update httpd.conf and that's it
  • no changes required for any new domain names added in the future
Cons:
  • possible performance hit?


3) Change existing IP-based VirtualHost to name-based, and add new IP-based VirtualHost to catch non-matching traffic

I don't even know if this is possible in this scenario, but it seems to be a valid concept... Basically the change would involve changing the existing IP-based VirtualHost to name-based (www.domain1.com), serving content as per normal. Traffic hitting this IP address that does NOT match that FQDN would then match against an IP-based VirtualHost, which would serve the above-mentioned mod_rewrite rule to redirect the traffic back to www.domain1.com, which (once redirected) would match on the name-based VirtualHost and serve the appropriate page.

However - I'm not sure if I can combine a name-based *and* IP-based VirtualHost like that...?
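
A minimal sketch of how this idea is commonly expressed, assuming Apache 2.2-style name-based virtual hosts on a single address, where the first virtual host listed for that address acts as the default for any Host header that matches no ServerName. The IP address, the DocumentRoot and the catch-all ServerName below are placeholders, not values from this setup:

```apache
# Enable name-based virtual hosting on the existing address (Apache 2.2 syntax).
NameVirtualHost 192.0.2.10:80

# Listed first, so it is the default: any hostname not matched below lands here
# and is redirected to the canonical host with the path preserved.
<VirtualHost 192.0.2.10:80>
    ServerName catchall.invalid
    Redirect permanent / http://www.domain1.com/
</VirtualHost>

# Canonical host: serves the real content.
<VirtualHost 192.0.2.10:80>
    ServerName www.domain1.com
    DocumentRoot /var/www/domain1
</VirtualHost>
```

Requests that carry no Host header at all would also reach the default virtual host, so treat this as a sketch of the concept rather than a drop-in replacement for the mod_rewrite rule.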

Once again, many thanks for any advice that can be offered.

Cheers
jb :)

g1smd

11:55 pm on Dec 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Change the pattern in the index 301 redirect so that it can redirect any index request at any folder depth to the equivalent URL without the index filename. This makes the code more robust.

If all domains on the server need to be redirected to one canonical domain, there's very simple and efficient code to achieve that. It's similar to your existing code, but the RewriteCond pattern should be
^(www\.example\.com)?$
or similar.

You do not need any extra servers or IPs. This can all be done on the existing server.

jdMorgan

12:27 am on Dec 21, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Note that the "/index.html" to "/" redirect must be executed before the domain canonicalization redirect, in order to avoid a chain of multiple stacked redirects (an SEO sin).

Therefore, either move all redirects to your config file, or put them all in .htaccess.

Follow all redirects with any internal rewrites.

Put the rules in each of these two groups in order from most-specific patterns and conditions to least-specific... This is why the index redirect precedes the canonicalization redirect, because it is specific to one URL-path only.

Access control rules, if any, should precede your redirects -- no use wasting time redirecting unwanted requests.
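
As a rough sketch of that ordering in a server config file (not .htaccess): the two redirects follow the patterns discussed in this thread, while the /private/ block and the /products/ rewrite are purely hypothetical placeholders added to show where such rules would sit.

```apache
RewriteEngine On

# 1. Access control first (hypothetical example: refuse a private path).
RewriteRule ^/private/ - [F]

# 2. External redirects, most specific first.
# 2a. index.html at any folder depth -> the folder URL on the canonical host.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^/(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]

# 2b. Any non-canonical hostname -> the canonical host, path preserved.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^/(.*) http://www.example.com/$1 [R=301,L]

# 3. Internal rewrites last (hypothetical example).
RewriteRule ^/products/([0-9]+)$ /product.php?id=$1 [PT]
```

Because rule 2a already points at the canonical hostname, a request for /index.html on a non-canonical domain is answered with one 301 instead of two.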

Jim

petjb

5:47 am on Dec 21, 2010 (gmt 0)

10+ Year Member



Thanks for your comments guys.

g1smd: Change the pattern in the index 301 redirect so that it can redirect any index request at any folder depth to equivalent URL without the index filename.

Gotcha. Something like this?

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.domain1.com/$1 [R=301,NE,L]



g1smd: If all domains on the server need to be redirected to one canonical domain, there's very simple and efficient code to achieve that. It's similar to your existing code, but the RewriteCond pattern should be ^(www\.example\.com)?$ or similar.

Would you mind expanding on this? The code I used was referenced from the Apache docs ([httpd.apache.org]), and seems to provide exactly what I want...

jdMorgan, thanks for the tips. These rules will be going in the config file; we don't use folder-level .htaccess files. I'll ensure that the index.html redirect is first in the list, followed by the canonical redirect rule. No access control rules exist.

Thanks again for the help! :)

g1smd

8:22 am on Dec 21, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^/(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^/(.*) http://www.example.com/$1 [R=301,L]


Every minuscule change to the code has a good reason.
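
For readers comparing this with the earlier attempts, the differences appear to be the following. This is a reading of the code, not g1smd's own explanation, and the rules are repeated unchanged with comments added:

```apache
# Patterns start with "^/" because in httpd.conf (server or VirtualHost
# context) the URL-path tested by RewriteRule includes the leading slash.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^/(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]

# The host test is anchored at both ends and has no [NC], so only the exact
# lower-case name "www.example.com" counts as canonical. Variants such as
# "WWW.EXAMPLE.COM", "www.example.com:80" or "www.example.com." are redirected.
# The "?" makes the group optional, so an empty Host header also matches and
# is left alone; this replaces the separate "!^$" condition.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^/(.*) http://www.example.com/$1 [R=301,L]
```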