Forum Moderators: phranque


How to rewrite non-www URLs to www URLs?

         

glimbeek

11:26 am on Jan 18, 2010 (gmt 0)

10+ Year Member



This question has been asked and answered all over the web. However, mine has a small twist which I haven't figured out, and I haven't found anyone who could explain it to me.

I bet the answer is simple!

Anyway..

I have a root folder which has a .htaccess file.
I have a /blog/ folder in that root which has a Joomla installation and its own .htaccess file.

As the title says, I want to rewrite all non www urls to www.example.com.

In the .htaccess file of the root I put the following:

RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301]

That works like a charm, except for the /blog/ folder. That is, http://example.com/blog/ doesn't get redirected to http://www.example.com/blog/

Because the /blog/ folder has its own .htaccess file, I need to set up that file in such a way that it rewrites as well, right? Tell me if I'm doing/understanding this the wrong way.

To do this, I tried using the same code:

RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301]

Which results in the url being rewritten with www but without /blog/ in the url, so I end up with http://www.example.com/news/article/ instead of http://www.example.com/blog/news/article/.

To fix this I'm using:
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/blog/$1 [R=301,L]

That works, but it confuses me. Why do I need to add /blog/ in the RewriteRule line? Shouldn't it just pick up on it?

And in general, is there a better way of doing this? I looked into using a CNAME or an Apache ServerAlias, but a few people across the web told me that works fine on the browser side of things but not for search engines like Google.

Thanks in advance.

[edited by: engine at 12:20 pm (utc) on Jan. 18, 2010]
[edit reason] Please use example.com [/edit]

encyclo

1:18 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld glimbeek :) If the blog in question is running WordPress, then you need to set the canonical name as the www version directly in the WordPress admin control panel - WP does its own rewriting which can override your .htaccess rules.

glimbeek

1:22 pm on Jan 18, 2010 (gmt 0)

10+ Year Member



Hi encyclo,

and thanks for the welcome.

No, it's a Joomla website, for a small part. But the rewrite doesn't concern Joomla. The non-www URL should be rewritten to a URL with www even before Joomla does anything.

g1smd

7:21 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301]

There are a couple of flaws here. Firstly, you need to add the [L] flag to every RewriteRule line.

The other is that this rule does NOT redirect ALL non-canonical hostnames. It fails to redirect requests where a port number and/or a trailing period is appended to the hostname (for example example.com:80 or example.com.), and it also fails to redirect the same variations of the www hostname (for example www.example.com:80).

This fixes both of those flaws:

RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


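Remove the [NC] flag; otherwise the rule will fail to work properly. With a negated pattern, [NC] would let mixed-case hostnames such as WWW.Example.com count as canonical, so they would never be redirected.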

Finally, to clarify, when you use (.*) to pick up the path requested in the original URL request, note that the path is localised to the current folder where the .htaccess file resides. That is, the .htaccess file in the /path1/ folder at /path1/.htaccess can only 'see' the path2/path3 part of the /path1/path2/path3 URL request.
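To illustrate the point about localised paths, here is a minimal sketch of two ways the redirect in /blog/.htaccess could be written. The second form, which builds the target from %{REQUEST_URI} instead of the captured path, is an assumption about what would suit this setup and not something taken from the posts above. %{REQUEST_URI} holds the full requested URL-path including /blog/, so the rule no longer depends on the folder it lives in.

```apache
# /blog/.htaccess (sketch; assumes RewriteEngine On is already set in this
# file and that these lines sit above Joomla's own rewrite rules)

# Option 1: (.*) only captures the part of the path after /blog/,
# so the folder name has to be written back into the target by hand.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/blog/$1 [R=301,L]

# Option 2 (use instead of option 1, not together with it):
# %{REQUEST_URI} is the full URL-path, e.g. /blog/news/article/,
# so nothing folder-specific is hard-coded in the rule.
# RewriteCond %{HTTP_HOST} !^www\.example\.com$
# RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]
```

With option 2 the same two lines can be pasted unchanged into the root .htaccess and into any subfolder's .htaccess.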

jdMorgan

8:14 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It might be simpler to use one rule in example.com/.htaccess only, and then set
RewriteOptions Inherit
in the /blog/.htaccess file.

In this way, the rules in the main .htaccess will apply to the /blog URL-paths as well.
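A rough sketch of that layout follows. One assumption to check on your own server: as far as I know, inherited rules are evaluated as if they were written in the child folder, so a target built from %{REQUEST_URI} is likely safer here than one built from the captured (.*) path.

```apache
# /.htaccess (site root)
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]

# /blog/.htaccess (alongside the existing Joomla rules)
RewriteEngine On
RewriteOptions Inherit
```

According to the mod_rewrite documentation, rules inherited from the parent are applied after the rules in the child scope. It is therefore worth verifying that a request such as http://example.com/blog/news/article/ ends up at its www equivalent and not at an index.php URL produced by Joomla's internal rewrite.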

Jim

g1smd

9:41 pm on Jan 18, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Yes, I can certainly recommend having one .htaccess file in the root with all the rules for the site.

glimbeek

7:16 am on Jan 19, 2010 (gmt 0)

10+ Year Member



If I use only one .htaccess file in the root with RewriteOptions Inherit, will Joomla still work properly then?

This is something I can't really test, nor do I have the time to test it properly.

So an easier solution would be to add the following code to the Joomla .htaccess in the /blog/ folder as well, right?

RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

But if I use that, the URL:
http://example.com/blog/section/catergory/news-article/ gets rewritten to:
http://www.example.com/section/catergory/news-article/

It misses the /blog/

So I use this code:

RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/blog/$1 [R=301,L]

But why do I need to add /blog/ there?

g1smd

10:08 am on Jan 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Path details available for use in .htaccess rewrite rules are 'localised' as I explained above.

Also to be clear on terminology, the effect you are looking for is an external redirect. A rewrite does something else, even though the code is very similar. It is a very important difference.

glimbeek

11:54 am on Jan 19, 2010 (gmt 0)

10+ Year Member



Aha! Thank you for the replies.

Thanks for clarifying that again.
"external redirect. A rewrite does something else, even though the code is very similar. It is a very important difference."

Could you explain it in more detail? Sorry if I'm asking "stupid" questions, but I've read so many different things over the last week and a half that I'm getting confused.

To clarify the code used and please tell me if I'm wrong:
First line:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond = Tells Apache that the following RewriteRule should only be applied if it passes this condition
%{HTTP_HOST} = Checks the requested domain?
! = is a "if not"
^ = is the start of the condition/pattern you make the check on
The \ before the . is to escape the .
$ is the end for the pattern you make the check on.

Second line:
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
(.*) = In this case gathers everything from the URL that comes behind the domain name.
$1 = Puts everything gathered from the URL on this location
R=301 = Makes sure the rewrite is a 301 Moved Permanently rewrite
L = Makes sure apache handles and finishes this rewrite rule first before it does anything else.

g1smd

12:03 pm on Jan 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A rewrite takes a URL request and matches it to an internal folder and filepath inside the server. Content is served for the currently requested URL.

A redirect closes the current HTTP transaction and suggests that the browser makes a new request for a different URL.

R=301 : Makes sure the rule is a 301 Moved Permanently redirect.
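
To make the difference concrete, here is a small sketch for a root .htaccess file. The old-page, new-page and articles paths, the index.php script and its path parameter are made-up names for illustration only.

```apache
RewriteEngine On

# External redirect: the server answers with a 301 and a Location header,
# and the browser then requests http://www.example.com/new-page itself.
# The URL shown in the address bar changes.
RewriteRule ^old-page$ http://www.example.com/new-page [R=301,L]

# Internal rewrite: no R flag and no hostname in the target.
# The server quietly serves /index.php for the requested URL,
# and the URL shown in the address bar does not change.
RewriteRule ^articles/(.*)$ /index.php?path=$1 [L]
```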

glimbeek

12:28 pm on Jan 19, 2010 (gmt 0)

10+ Year Member



So the rewriterule is a redirect and not a rewrite.
Other than that, I am correct?

**EDIT**
Another question,

Do I need to use QSA? I don't have any query strings in my URLs. All my URLs at the moment are SEO friendly. But maybe for the future, or isn't QSA needed with the way you explained it to me?

Thought of something else as well.
I have a subdomain, is this going to cause a problem?

g1smd

1:23 pm on Jan 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Query strings are automatically re-appended unless you have replaced the query string with a different one, or you end the target with a question mark which clears the query string. So, you only need [QSA] in the first case, and only if you need to re-append the values.

To stop Duplicate Content issues, for sites that do not use query strings, I clear all query strings in the redirect. That means, if anyone does ask for a URL and they include an unwanted query string, the site does not serve content at a duplicate URL, they instead are redirected to the canonical form.

glimbeek

1:28 pm on Jan 19, 2010 (gmt 0)

10+ Year Member



Thank you g1smd!

My explanation of the code was correct then?

I have a subdomain, is this going to cause a problem?

"I clear all query strings in the redirect" How do you do that?

jdMorgan

3:15 pm on Jan 19, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's becoming obvious that you could save a lot of time (and typing) by reading the mod_rewrite documentation at Apache.org, particularly the RewriteRule and RewriteCond directive descriptions. Trying to write, evaluate, test, or modify mod_rewrite code without taking that first step is not only likely to be ineffective, it can also be dangerous, as a single typo or logic error can --if you are lucky-- immediately take your site offline. If you are not so lucky, that small error can just sit there, quietly destroying your search engine rankings over time.

The [QSA] flag is needed only to append additional query string data to an existing query string; If no additional query data is present in the RewriteRule substitution field, then the original query string is passed through RewriteRules unchanged.
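
A short sketch of the three cases, using a made-up /item/ URL, a made-up show.php script and made-up id and colour parameters. Only the last rule is active; the first two are commented out so the cases can be compared one at a time.

```apache
RewriteEngine On

# Example request: /item/42?colour=red

# Case 1: no "?" in the target, so the original query string passes through.
# Result: /show.php?colour=red
# RewriteRule ^item/([0-9]+)$ /show.php [L]

# Case 2: new query string in the target, no QSA, so the original is replaced.
# Result: /show.php?id=42
# RewriteRule ^item/([0-9]+)$ /show.php?id=$1 [L]

# Case 3: new query string plus QSA, so the original is appended to the new one.
# Result: /show.php?id=42&colour=red
RewriteRule ^item/([0-9]+)$ /show.php?id=$1 [QSA,L]
```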

If you have a subdomain, exclude all variations of it from the canonicalization rule g1smd suggested above by adding a negative-match RewriteCond. Then copy the resulting rule, and reproduce its function for the subdomain as well, by swapping all occurrences of the subdomain and main domain patterns and URLs. In this way, both the main domain and the subdomain will be canonicalized. Since the rules will be mutually-exclusive, you may put them in any order, although you would want the one that is most likely to run most often (probably the one for your main domain) to be placed first.
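
As a sketch of what that pair of rules might look like, assuming a hypothetical subdomain named sub.example.com that is served from the same document root and .htaccess file as the main domain:

```apache
RewriteEngine On

# Main domain: redirect any hostname that is not exactly www.example.com,
# unless it is some variation of the subdomain.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^(www\.)?sub\.example\.com [NC]
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]

# Subdomain: the same idea with the two hostnames swapped.
RewriteCond %{HTTP_HOST} !^sub\.example\.com$
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com [NC]
RewriteRule ^ http://sub.example.com%{REQUEST_URI} [R=301,L]
```

The first RewriteCond in each rule is the case-sensitive canonical check. The second is the negative-match exclusion and carries [NC] so that mixed-case forms of the other hostname are left for the other rule to handle. If the subdomain has its own document root and .htaccess file, only the second rule would be needed there, without its exclusion condition.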

You clear the query string by appending a "?" to the RewriteRule substitution field. As described in the documentation, this is a mod_rewrite operator, and will not appear in the rewritten or redirected path.
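
For example, a minimal sketch of the canonical redirect with the query string cleared, plus an optional second rule that is my own assumption about how to strip query strings from requests that already use the www hostname:

```apache
RewriteEngine On

# The trailing "?" gives the target an empty query string, so
# http://example.com/page?junk=1 is redirected to http://www.example.com/page
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

# Optional (assumption): also strip any non-empty query string from
# requests that already arrived on the canonical hostname.
RewriteCond %{QUERY_STRING} .
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
```

This is only suitable where no URL on the site legitimately uses a query string; in a folder whose application relies on query strings, the second rule would break those URLs. On Apache 2.4 and later, the QSD flag offers another way to discard the query string.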

Jim

glimbeek

2:13 pm on Jan 22, 2010 (gmt 0)

10+ Year Member



I want to thank everybody for the support!