Adding New Rewrite Rule for Newbie in httpd.conf

How to add a generic rewrite rule to remove www. from domains via 301 redirect

         

cenocre

10:23 pm on Jul 22, 2008 (gmt 0)

10+ Year Member



We have been trying to do a universal 301 redirect on Apache (Mac OS X Server 10.4.11) to redirect all domains from www.domain.com to domain.com to improve search engine rankings. We are doing it centrally for the server in httpd.conf so that we only need one rule for all domains.

This is what the default looks like in httpd.conf:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]
</IfModule>

Below are the two rules that have been suggested and I want to add one to the above configuration. Which rule looks better and how do I add one of them?

RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ [%1...] [R=301,L]

OR

RewriteCond %{HTTP_HOST} ^www\.(.*)
RewriteRule ^(.*)$ [%1...] [R=301,L]

Also, would anything additional have to be done to the server, such as modifying the individual host files?

jdMorgan

11:09 pm on Jul 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Both rules are flawed if you want to fully canonicalize the domain name. I'd suggest:

RewriteCond %{HTTP_HOST} ^www\.([^.]+(\.[^.:]+)+)(\.?:[0-9]+)?$ [NC]
RewriteRule ^/(.*)$ http://%1/$1 [R=301,L]

The reason you need to use this more-complex form is that the following would all be perfectly-valid values for the HTTP_HOST variable:
domain.com
domain.co.uk
domain.com.
domain.com:80
domain.com.:80
www.domain.com
www.domain.com.:80
etc.
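
One way to see how the suggested pattern treats these Host values is to run it outside Apache. The following is a minimal Python sketch, assuming Python's re module handles this simple pattern the same way Apache's regex engine does; the helper name stripped_host is made up for illustration and is not part of mod_rewrite.

```python
import re

# Same pattern as the RewriteCond above; re.IGNORECASE stands in for [NC].
WWW_HOST = re.compile(r"^www\.([^.]+(\.[^.:]+)+)(\.?:[0-9]+)?$", re.IGNORECASE)

def stripped_host(host):
    """Return what %1 would hold if the condition matches, else None."""
    m = WWW_HOST.match(host)
    return m.group(1) if m else None

# Print the captured hostname (or None) for a few Host header values.
for host in ["domain.com", "domain.com:80", "www.domain.com",
             "www.domain.co.uk", "WWW.domain.com:80", "www.domain.com.:80"]:
    print(f"{host!r:24} -> {stripped_host(host)!r}")
```

In this check, hosts without the www prefix are left alone, and a trailing ".:80" or ":80" is dropped from the captured name. Note that a bare trailing dot with no port (www.domain.com.) does not match here, because the optional suffix group requires a colon.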

You should also consider the possibility that bogus or incorrect subdomains might be prepended to the domain name or appended to the "www" -- For example, foo.example.com or www.foo.example.com. However, since the hostname has to be fully parsed to detect this, and since ICANN is opening up the TLDs so that just about anything will be able to be registered as a TLD, it would be easier to do this canonicalization on a per-host basis, and redirect all but the canonical hostname. This also makes it possible to detect and redirect accesses via IP address, if your hosts will have unique IP addresses. For example, in the VirtualHost container:


RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^canonical-domain\.tld$
RewriteRule ^/(.*)$ http://canonical-domain.tld/$1 [R=301,L]

This redirects all but the canonical hostname as long as the client sends a non-blank HTTP "Host:" header.
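
The decision made by those two conditions can be sketched in a few lines of Python. This is only an illustration of the logic, not of how Apache evaluates it internally; the function name redirect_target and the sample IP address are assumptions made for the example.

```python
def redirect_target(host_header, path, canonical="canonical-domain.tld"):
    """Mimic the two RewriteConds and the RewriteRule above.

    Returns the redirect URL, or None when the request is served as-is.
    `path` is the URL-path as seen in httpd.conf, with its leading slash.
    """
    if host_header == "":
        # RewriteCond %{HTTP_HOST} .   (fails on a blank Host header)
        return None
    if host_header == canonical:
        # RewriteCond %{HTTP_HOST} !^canonical-domain\.tld$
        # (no [NC] flag, so the comparison is case-sensitive)
        return None
    # RewriteRule ^/(.*)$ http://canonical-domain.tld/$1 [R=301,L]
    return f"http://{canonical}{path}"

for host in ["canonical-domain.tld", "www.canonical-domain.tld",
             "canonical-domain.tld:80", "192.0.2.10", ""]:
    print(f"{host!r:28} -> {redirect_target(host, '/page.html')}")
```

Anything that is not exactly the canonical name, including an IP address or a name with a port appended, is sent to the canonical host, while a blank Host value is served without a redirect.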

And just in case, the only reason that this may "improve search engine rankings" is that it forces all requests to a single form of the domain, and thus focuses PageRank and Link-popularity on a single domain. It also helps prevent people from bookmarking, copying and pasting, or linking to an incorrect version of the domain. There is nothing magic about the non-www version, and you could just as well reverse the domains and redirect to the www (or any other subdomain) if you wanted to.

For this reason, you should consider whether any of these domains will be used in radio or print advertising campaigns, where the "www" serves as a visual and auditory cue that a domain name follows...

As to what else you might do, well, there are a million things, depending on what you want to do...

Jim

nick279

11:31 pm on Jul 22, 2008 (gmt 0)

Can't you just use

RewriteCond %{HTTP_HOST} !^domain.com$

jdMorgan

11:46 pm on Jul 22, 2008 (gmt 0)

No, not in all cases. If the Host header is blank, the result would be an "infinite" redirection loop.

A lot depends on "future plans" -- If these sites are intended never to be hosted on an IP-based server, then using only the negative-match pattern RewriteCond would be safe. But if hosted on an IP-based server, the site is then exposed to HTTP/1.0 clients, which do not send a Host: header.
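
The loop can be illustrated with a small Python simulation. It is a sketch under simplifying assumptions: the negative-match condition is modelled as an exact string comparison, a client either always or never sends a Host header, and the function names are invented for the example.

```python
CANONICAL = "domain.com"

def redirects(host, guard_blank):
    """True if the rule would answer this Host value with a 301."""
    if guard_blank and host == "":
        return False              # RewriteCond %{HTTP_HOST} .  fails, rule skipped
    return host != CANONICAL      # negative match against the canonical name

def count_redirects(sends_host_header, guard_blank, max_hops=10):
    """Follow redirects as a client would; return hop count, or None on a loop."""
    host = "www.domain.com" if sends_host_header else ""
    for hops in range(max_hops + 1):
        if not redirects(host, guard_blank):
            return hops
        # The client follows the redirect to http://domain.com/...
        # A client that sends Host headers now presents "domain.com";
        # one that never sends a Host header still presents a blank value.
        host = CANONICAL if sends_host_header else ""
    return None

print(count_redirects(sends_host_header=True, guard_blank=False))   # 1
print(count_redirects(sends_host_header=False, guard_blank=False))  # None (loop)
print(count_redirects(sends_host_header=False, guard_blank=True))   # 0
```

Without the blank-Host guard, the client that never sends a Host header is redirected on every attempt; with the guard it is served directly.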

It's a matter of 'style' I suppose, but I believe in future-proofing designs.

Jim

cenocre

3:22 pm on Jul 23, 2008 (gmt 0)

Thanks, I'll try it, but my two main questions still remain: how do I add it to httpd.conf, and do I need to do anything to the individual virtual host configurations?

Given the default httpd.conf:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]
</IfModule>

Would this be the way to add your rule and not interfere with the rewrite rule that was already there?

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]
RewriteCond %{HTTP_HOST} ^www\.([^.]+(\.[^.:]+)+)(\.?:[0-9]+)?$ [NC]
RewriteRule ^/(.*)$ [%1...] [R=301,L]
</IfModule>

Also, given that your proposed solution would redirect domain.com:80, what if I need to access a port such as domain.com:9999? Will that get changed too?

jdMorgan

3:57 pm on Jul 23, 2008 (gmt 0)

Your proposed code position looks fine.

The code I posted will redirect any request with a period or port number appended to the hostname. Modify it to suit your needs.
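
As one possible way to "modify it to suit your needs", the Python sketch below compares the posted pattern with a hypothetical variant that only accepts an explicit :80 suffix. The variant is an assumption made for illustration, not something proposed in this thread, and the check again assumes Python's re module behaves like Apache's regex engine for these patterns.

```python
import re

# Pattern as posted earlier in the thread.
ORIGINAL = re.compile(r"^www\.([^.]+(\.[^.:]+)+)(\.?:[0-9]+)?$", re.IGNORECASE)

# Hypothetical variant: tolerate only ":80" (or ".:80") after the hostname,
# so requests on other ports are not matched and therefore not redirected.
PORT_80_ONLY = re.compile(r"^www\.([^.]+(\.[^.:]+)+)(\.?:80)?$", re.IGNORECASE)

def redirect_url(pattern, host, path="/"):
    """Return the URL the rule would redirect to, or None if it does not match."""
    m = pattern.match(host)
    return f"http://{m.group(1)}{path}" if m else None

for host in ["www.domain.com:80", "www.domain.com:9999", "domain.com:9999"]:
    print(f"{host!r:24} original -> {redirect_url(ORIGINAL, host)!r}, "
          f"variant -> {redirect_url(PORT_80_ONLY, host)!r}")
```

In this check the posted pattern matches www.domain.com:9999 and the redirect target has no port, whereas domain.com:9999 (no www) is not matched by either pattern.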

As with any software, coding is easy; it is precisely and completely defining your requirements that is difficult.

Jim

cenocre

5:57 pm on Jul 23, 2008 (gmt 0)

Tried the code below in httpd.conf and nothing happened.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ [%1...] [R=301,L]</IfModule>

When I tried the second condition and rule as below in an .htaccess file at the root of a domain, it worked fine. Any idea why?

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ [%1...] [R=301,L]

jdMorgan

12:15 am on Jul 24, 2008 (gmt 0)

Move </IfModule> to its own line, and restart your server.

Jim

cenocre

8:44 am on Jul 24, 2008 (gmt 0)

Sorry, there was an error in pasting. It was on its own line as below and did not work.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ [%1...] [R=301,L]
</IfModule>

and1c

3:36 am on Jul 26, 2008 (gmt 0)

Have you restarted apache?

cenocre

4:17 am on Jul 26, 2008 (gmt 0)

Yes. I restarted. I am using the .htaccess method for the moment. If the above script is correct I can try it again later as a few things on this server haven't worked the first time, whereas exactly the same code worked the next day.

jdMorgan

1:29 pm on Jul 26, 2008 (gmt 0)

> exactly the same code worked the next day.

There is no reason for this to happen "inexplicably." The most common reason this may appear to happen is if you test a new configuration, but do not flush your browser cache.

Be aware that URL-paths examined by RewriteRule patterns in httpd.conf, conf.d, etc. are full paths; they must specify the entire path from root on down. However, in .htaccess, the URL-paths are localized to the current directory in which this .htaccess file resides. In simple terms, that means that patterns in httpd.conf will always start with a slash and the full URL-path that follows the domain name, whereas in .htaccess, the leading slash and any URL-path info needed to reach the current .htaccess file's directory will not be present.

Your code works in .htaccess, so when trying to get your code working in httpd.conf, conf.d, etc., be sure that the config context is the same as that set up for your .htaccess file: Options FollowSymLinks or SymLinksIfOwnerMatch must be set (see mod_rewrite docs), RewriteEngine on must be specified, etc. Study the Options and other configuration settings that you've applied to your .htaccess <Directory> and replicate those at the <VirtualHost> or server level.

Jim

cenocre

2:33 pm on Jul 26, 2008 (gmt 0)

Thanks for the detailed reply.

Regarding browser cache, I'll check it out again. I do usually try things in multiple browsers and have one set with no cache just for the purpose of testing sites.

>However, in .htaccess, the URL-paths are localized to the current directory in which this .htaccess file resides.

Makes sense. In this case though, since I am only dealing with the domain name and no path beyond that, wouldn't the code be the same? After all, the code above that you said would work in httpd.conf did work when extracted verbatim for use in .htaccess.

> Study the Options and other configuration settings that you've applied to your .htaccess <Directory> and replicate those at the <VirtualHost> or server level.

As you can see from the code above I do declare that the RewriteEngine is on in httpd.conf and the same code is in the individual host conf. FollowSymLinks is set to AllowOverride None in httpd.conf and there is no declaration in the individual host conf.

When you say "<VirtualHost> OR server level", I have only done settings in httpd.conf and nothing in any individual virtual host file when NOT using .htaccess. I asked about this in my first question and it would appear from you saying "or" and the other answers that is okay when I am only doing a server-wide rule.

Thanks again, I will try just the httpd.conf method in the next few days.

jdMorgan

3:35 pm on Jul 26, 2008 (gmt 0)

> Makes sense. In this case though, since I am only dealing with the domain name and no path beyond that, wouldn't the code be the same? After all, the code above that you said would work in httpd.conf did work when extracted verbatim for use in .htaccess.

Either you didn't copy my code exactly, or you've got DocumentRoot set incorrectly, then. Look at these rules, which are correct for each context:
httpd.conf:
 RewriteRule ^/includes/perl/do-something\.pl$ /includes/perl/do-other-thing.pl [L]

/.htaccess
 RewriteRule ^includes/perl/do-something\.pl$ /includes/perl/do-other-thing.pl [L]

/includes/.htaccess
 RewriteRule ^perl/do-something\.pl$ /includes/perl/do-other-thing.pl [L]

/includes/perl/.htaccess
 RewriteRule ^do-something\.pl$ /includes/perl/do-other-thing.pl [L]
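
The four rules above can be checked with a short Python sketch. It simplifies what Apache does (the per-directory prefix is really stripped after the URL has been mapped to the filesystem) and assumes the URL layout mirrors the directory layout; the function name path_seen_by_rule is invented for the example.

```python
import re

def path_seen_by_rule(url_path, htaccess_dir=None):
    """Return the string a RewriteRule pattern is matched against.

    htaccess_dir=None models httpd.conf (server or VirtualHost context).
    Otherwise it is the URL-path of the directory holding the .htaccess
    file, e.g. "/" or "/includes/", and that prefix is stripped.
    """
    if htaccess_dir is None:
        return url_path
    if not url_path.startswith(htaccess_dir):
        return None               # this .htaccess file is not consulted
    return url_path[len(htaccess_dir):]

REQUEST = "/includes/perl/do-something.pl"
RULES = {
    None:              r"^/includes/perl/do-something\.pl$",
    "/":               r"^includes/perl/do-something\.pl$",
    "/includes/":      r"^perl/do-something\.pl$",
    "/includes/perl/": r"^do-something\.pl$",
}

for context, pattern in RULES.items():
    seen = path_seen_by_rule(REQUEST, context)
    matched = bool(re.search(pattern, seen))
    print(context or "httpd.conf", "->", repr(seen), "matches:", matched)
```

The same reasoning applies to the www rules earlier in the thread: in this model a pattern such as ^/(.*)$ matches what httpd.conf sees but not what a root .htaccess sees, while ^(.*)$ matches in both contexts.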

> FollowSymLinks is set to AllowOverride None in httpd.conf and there is no declaration in the individual host conf.

This statement is mixed-up. AllowOverride controls what Options in .htaccess can override server config-level Options settings. FollowSymLinks is an Option, settable at server, vHost, or .htaccess level. So, I don't know what you mean, but as documented, FollowSymLinks or SymLinksIfOwnerMatch must be enabled in order to enable mod_rewrite, no matter which context mod_rewrite is used in.

Each Apache directive has an entry at the top of its documentation labeled "Context" which describes where it can be used. Typically, it says things like "Server, Virtual Host, Directory, .htaccess". Some directives are limited in scope so that security can be maintained and so that users (e.g. Webmasters) on shared virtual hosting cannot shoot each other's Web sites down -- intentionally or unintentionally.

Outside of the designed-in context restrictions, you put your config code at the level where you want it (for security, administration, or maintainability reasons) -- Server level to apply to all vHosts, vHost level so the 'user' can't change it or, if you are the 'user', in .htaccess.

Jim