I bet the answer is simple!
Anyway..
I have a root folder which has an .htaccess
I have a /blog/ folder in that root which has a Joomla installation and its own .htaccess file.
As the title says, I want to rewrite all non-www URLs to www.example.com.
In the .htaccess file of the root I put the following:
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301]
That works like a charm, except for the /blog/ folder. AKA http://example.com/blog/ doesn't get rewritten to http://www.example.com/blog/
Because the /blog/ folder has its own .htaccess file, I need to set up that file in such a way that it rewrites as well, right? Tell me if I'm doing/understanding this the wrong way.
To do this, I tried using the same code:
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301]
Which results in the url being rewritten with www but without /blog/ in the url, so I end up with http://www.example.com/news/article/ instead of http://www.example.com/blog/news/article/.
To fix this I'm using:
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/blog/$1 [R=301,L]
That works, but it confuses me. Why do I need to add /blog/ in the RewriteRule line? Shouldn't it just pick up on it?
And in general is there a better way of doing this? I looked in using CNAME or apache server alias but a few people across the web told me that works fine on the browser side of things but not for search engines like Google.
Thanks in advance.
[edited by: engine at 12:20 pm (utc) on Jan. 18, 2010]
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301]
There are several flaws here. Firstly, you need to add the [L] flag to every RewriteRule line.
The other is that this rule does NOT redirect ALL non-canonical hostnames. It fails to redirect when a port number and/or a trailing period is appended to the hostname. It also fails to redirect www hostnames that have those appended.
This fixes both of those flaws:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
Finally, to clarify, when you use (.*) to pick up the path requested in the original URL request, note that the path is localised to the current folder where the .htaccess file resides. That is, the .htaccess file in the /path1/ folder at /path1/.htaccess can only 'see' the path2/path3 part of the /path1/path2/path3 URL request.
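To illustrate the point about localised paths, here is a minimal sketch of what the same rule could look like in /blog/.htaccess. The first variant adds the folder name back by hand. The second uses %{REQUEST_URI}, which holds the full requested path including the leading slash, so the same line should work from any folder. Neither has been tested on the poster's setup, and both assume mod_rewrite is already switched on in that file with RewriteEngine On. Pick one variant, not both.

```apache
# Variant 1: in /blog/.htaccess, $1 holds only the part after "blog/",
# so the folder name has to be put back in the target by hand.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/blog/$1 [R=301,L]

# Variant 2: ignore the localised path and reuse the full requested path.
# %{REQUEST_URI} already starts with "/", so no slash is added after the hostname.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]
```

As for why the root rule never fires for /blog/ requests: by default, mod_rewrite rules in a parent folder's .htaccess are not inherited by a subfolder whose own .htaccess contains rewrite rules, which is most likely what the Joomla file is doing here.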
I can't really test this, and I don't have the time to test it properly.
So an easier solution would be to add the following code to the Joomla .htaccess in the /blog/ folder as well, right?
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
But if I use that, the URL:
http://example.com/blog/section/catergory/news-article/ gets rewritten to:
http://www.example.com/section/catergory/news-article/
It misses the /blog/
So I use this code:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/blog/$1 [R=301,L]
But why do I need to add /blog/ there?
Thanks for clarifying that again.
"external redirect. A rewrite does something else, even though the code is very similar. It is a very important difference."
Could you explain it in more detail? Sorry if I'm asking "stupid" questions, but I've read so many different things over the last week and a half that I'm getting confused.
To clarify the code used and please tell me if I'm wrong:
First line:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond = Tells Apache that the following RewriteRule should only be applied if it passes this condition
%{HTTP_HOST} = Checks the requested domain?
! = is a "if not"
^ = is the start of the condition/pattern you make the check on
The \ before the . is to escape the .
$ is the end for the pattern you make the check on.
Second line:
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
(.*) = In this case gathers everything from the URL that comes behind the domain name.
$1 = Puts everything gathered from the URL on this location
R=301 = Makes sure the rewrite is a 301 Moved Permanently rewrite
L = Makes sure apache handles and finishes this rewrite rule first before it does anything else.
A redirect closes the current HTTP transaction and suggests that the browser makes a new request for a different URL.
R=301 : Makes sure the rule is a 301 Moved Permanently redirect.
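The difference between a redirect and a rewrite may be easier to see side by side. This is only a sketch for a root .htaccess: old-page, new-page and index.php are made-up names, not anything from the site in question.

```apache
# External redirect: the R flag plus a full URL in the target. The server
# answers with "301 Moved Permanently" and a Location header, the browser
# makes a second request, and the address bar changes.
RewriteRule ^old-page$ http://www.example.com/new-page [R=301,L]

# Internal rewrite: no R flag and no hostname in the target. The server
# quietly serves a different file for the same request, and the URL shown
# in the browser does not change.
RewriteRule ^new-page$ /index.php [L]
```

Only the first form tells browsers and search engines that the URL has moved, which is why the www canonicalization has to be a redirect.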
**EDIT**
Another question,
Do I need to use QSA? I don't have any query strings in my URLs. All my URLs at the moment are SEO friendly. But maybe for the future, or isn't QSA needed with the way you explained it to me?
Thought of something else as well.
I have a subdomain, is this going to cause a problem?
To stop Duplicate Content issues, for sites that do not use query strings, I clear all query strings in the redirect. That means, if anyone does ask for a URL and they include an unwanted query string, the site does not serve content at a duplicate URL; instead, they are redirected to the canonical form.
The [QSA] flag is needed only to append additional query string data to an existing query string; if no additional query data is present in the RewriteRule substitution field, then the original query string is passed through RewriteRules unchanged.
If you have a subdomain, exclude all variations of it from the canonicalization rule g1smd suggested above by adding a negative-match RewriteCond. Then copy the resulting rule, and reproduce its function for the subdomain as well, by swapping all occurrences of the subdomain and main domain patterns and URLs. In this way, both the main domain and the subdomain will be canonicalized. Since the rules will be mutually-exclusive, you may put them in any order, although you would want the one that is most likely to run most often (probably the one for your main domain) to be placed first.
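One possible reading of that advice, as a sketch only: "sub" is a placeholder for the real subdomain name, and this assumes the subdomain is served from the same .htaccess as the main domain. The unanchored sub\.example\.com patterns are meant to catch variations such as an appended port or trailing period. Hostnames like www.sub.example.com are not handled here.

```apache
# Main domain: redirect every non-canonical hostname to www,
# except anything that starts with the subdomain.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteCond %{HTTP_HOST} !^sub\.example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# Subdomain: redirect every variation of the subdomain hostname
# (wrong case, appended port, trailing period) to its exact canonical form.
RewriteCond %{HTTP_HOST} ^sub\.example\.com [NC]
RewriteCond %{HTTP_HOST} !^sub\.example\.com$
RewriteRule (.*) http://sub.example.com/$1 [R=301,L]
```

Because the first rule skips every hostname the second rule handles, the two should never both match the same request, so their order only matters for efficiency.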
You clear the query string by appending a "?" to the RewriteRule substitution field. As described in the documentation, this is a mod_rewrite operator, and will not appear in the rewritten or redirected path.
Jim
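For reference, a rough sketch of the two query-string cases described above. The first rule is the canonicalization redirect with a trailing "?" added so that any incoming query string is dropped. The second is a made-up internal rewrite (the article/ path, index.php and the id parameter are all hypothetical) showing the one situation where [QSA] is needed.

```apache
# Redirect to the canonical hostname and discard any query string.
# The trailing "?" is consumed by mod_rewrite and does not appear in the result.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

# The target adds its own query data (id=...), so [QSA] is required to keep
# whatever the visitor sent: /article/5?print=1 becomes /index.php?id=5&print=1.
# Without [QSA], the visitor's print=1 would be replaced rather than appended.
RewriteRule ^article/([0-9]+)$ /index.php?id=$1 [QSA,L]
```

Note that the first rule only clears query strings on requests that were being redirected anyway. Clearing them on requests that already use the canonical hostname would need a separate rule.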