I hope everyone has had a pleasant start to the New Year, so far.
I have what may be an unusual mod-rewrite question, since I have been unable to find an answer elsewhere (as yet).
Background
I am using a shared hosting service provided by Site5.com. On this shared host, you are required to choose a primary (or main) domain for the account, and you can then add as many secondary domains as you like. You add a secondary domain either by "pointing" it to the host from another domain registrar account or by "parking" it on the Site5 host.
Here is my issue:
Each domain pointer's directory is located in the primary domain's public_html folder. So every time I "point" a new website domain to my server, it shows up as a new directory folder in my primary domain.
This means that when I point "www.secondarydomain.com" to my server, the index page will be indexed by the search engines at both of these addresses: "http://www.secondarydomain.com/index.html" and "http://www.primarydomain.com/secondarydomain/index.html" (the index page is just an example; ALL content is actually duplicated because of this issue).
As you all know, this creates duplicate content in the search engines.
Can anyone please tell me how to 301 permanently (and safely) redirect all requests for "http://www.primarydomain.com/secondarydomain/" to their corresponding "http://www.secondarydomain.com/" addresses?
Any and all help will be greatly appreciated.
Thank you
Options -Indexes +FollowSymLinks
# General rule to force all non-www URLs to be www URLs.
# This rule must be the last one of the redirects:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^primarydomain\.com [NC]
RewriteRule ^(.*)$ http://www.primarydomain.com/$1 [R=301,L]
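# Redirects all requests for the directory folder of the primary domain to the secondary domain
# (RedirectMatch patterns are tested against the URL-path, which begins with a slash)
RedirectMatch 301 ^/secondarydomaindirectory/(.*)$ http://www.secondarydomain.com/$1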
Hope this helps.
[edited by: jdMorgan at 3:35 pm (utc) on Jan. 5, 2010; edit reason: De-linked]
Firstly you need to have the specific redirect that I posted as the first item, otherwise for a non-www request for the folder URL there will be a double redirect. Note the words in the comments of your post where it says "last of the redirects"; don't ignore that.
Secondly, if you have a rule using RewriteRule, do not use any rules using Redirect or RedirectMatch in the same .htaccess file. The rules may be processed in the wrong order to properly fix the problem.
Directives are processed in per-module order, and not in the order you list them in the .htaccess file. That fact often traps the unwary. Use RewriteRule for all of the redirects and for any rewrites.
In the primarydomain .htaccess file:
# Setup Options
Options -Indexes +FollowSymLinks
RewriteEngine On
#
# Redirect the secondarydomain folder inside the primary domain to the secondarydomain
RewriteRule ^secondarydomainfolder/(.*) http://www.secondarydomain.com/$1 [R=301,L]
#
# General rule to force all non-www URLs to be www URLs.
# This rule must be the last one of the redirects:
RewriteCond %{HTTP_HOST} !^www\.primarydomain\.com$
RewriteRule (.*) http://www.primarydomain.com/$1 [R=301,L]
The full job is often more complicated than you might first expect.
In the secondarydomain .htaccess file:
# Setup Options
Options -Indexes +FollowSymLinks
RewriteEngine On
#
# General rule to force all non-www URLs to be www URLs.
# This rule must be the last one of the redirects:
RewriteCond %{HTTP_HOST} !^www\.secondarydomain\.com$
RewriteRule (.*) http://www.secondarydomain.com/$1 [R=301,L]
Note that any and all other changes to your code (however minor they might seem) are deliberate and each one has a specific purpose, especially the added ! and $ and the removal of [NC].
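For anyone reading along, here is how those three changes can be read, as an annotated sketch. The example hostnames in the comments are illustrative only, and the commented-out line is the earlier form of the condition adapted to the secondary domain.

```apache
RewriteEngine On
#
# Earlier form: fires only when the host STARTS WITH the bare domain,
# in any letter case. Other non-canonical hostnames slip through.
#   RewriteCond %{HTTP_HOST} ^secondarydomain\.com [NC]
#
# Revised form: fires whenever the host is NOT exactly the canonical name.
#   !        negates the match, so any host other than the canonical one qualifies
#   $        anchors the end, so "www.secondarydomain.com:80" or a
#            trailing-dot hostname no longer counts as canonical
#   no [NC]  makes the comparison case-sensitive, so a mixed-case host such as
#            "WWW.SecondaryDomain.com" is redirected to the lowercase form
RewriteCond %{HTTP_HOST} !^www\.secondarydomain\.com$
RewriteRule (.*) http://www.secondarydomain.com/$1 [R=301,L]
```

One side effect, as far as I can tell: a request that reaches this folder under any other hostname at all will also be redirected to the canonical hostname, which is worth keeping in mind if the same files are ever served under a second hostname on purpose.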
I have one major issue with the code you provided (due to my omission of some information).
Since I am using a shared host, when one of the websites needs an SSL certificate, the host directs it to a shared SSL certificate. For example, "http://www.secondarydomain.com/index.php?var=somevarible" will be redirected to "https://sudomain.webhost.com/~hostusername/secondarydomain/index.php?var=somevarible" during a secure transaction.
This shared SSL certificate is accessed via my primary domain, which means that the rewrite code you provided above will send "https://sudomain.webhost.com/~hostusername/secondarydomain/index.php?var=somevarible" BACK to "http://www.secondarydomain.com/index.php?var=somevarible".
Any suggestions? Maybe adding some type of exception to the primary domain rewrite rules to exclude requests that start with "https://sudomain.webhost.com/~hostusername/..."?
Thanks
You then need redirects in both directions to correct any URL requested with the wrong protocol. This can get a little complicated, which is why you need to make a list before you start.
RewriteCond %{SERVER_PORT} !=443
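A minimal sketch of where a condition like this could sit, assuming the shared SSL host serves HTTPS on the standard port 443 and that the server reports that port in SERVER_PORT (worth verifying on a shared host). The rules and folder name are the ones already posted in this thread. A RewriteCond applies only to the single RewriteRule that follows it, so the condition has to be repeated above each redirect that should leave HTTPS requests alone.

```apache
RewriteEngine On
#
# Skip the folder-to-domain redirect when the request arrived on port 443 (HTTPS)
RewriteCond %{SERVER_PORT} !=443
RewriteRule ^secondarydomainfolder/(.*) http://www.secondarydomain.com/$1 [R=301,L]
#
# Multiple conditions are ANDed by default, so this redirect fires only for
# non-HTTPS requests whose hostname is not the canonical one
RewriteCond %{SERVER_PORT} !=443
RewriteCond %{HTTP_HOST} !^www\.primarydomain\.com$
RewriteRule (.*) http://www.primarydomain.com/$1 [R=301,L]
```

The order of the two conditions in the second block does not change the result, since both must be true for the rule to run.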
Jim
Where do I place "RewriteCond %{SERVER_PORT} !=443"? I am guessing in the primary domain .htaccess file, but in what order? Should it be the first RewriteCond?
One question on canonicalizing all URLs. If all HTTPS URLs will use the same basic structure, such as "https://sudomain.webhost.com/~hostusername/secondarydomain/index.php?var=somevarible", then is there a general rewrite rule I can use for all secondary domains?
For instance, "http://www.secondarydomain1.com/index.php?var=somevarible" would use "https://sudomain.webhost.com/~hostusername/secondarydomain1/index.php?var=somevarible" as its HTTPS version and "http://www.secondarydomain2.com/index.php?var=somevarible" would use "https://sudomain.webhost.com/~hostusername/secondarydomain2/index.php?var=somevarible" as its HTTPS version, and so on.
Is there a general rule I can use to cover all secondary domains using "https://sudomain.webhost.com/~hostusername/secondarydomainname..." as the prefix for all HTTPS transactions, instead of manually writing each individual rewrite rule based on the number of variables that may be added to the end of the URL?
I hope I'm not being confusing, so let me try to clarify the above with an example. Is there a one- or two-line rewrite rule that would cover all variations such as "https://sudomain.webhost.com/~hostusername/secondarydomain1/index.php?var1=somevarible" OR "https://sudomain.webhost.com/~hostusername/secondarydomain1/index.php?var1=somevarible1&var2=somevarible2" OR "https://sudomain.webhost.com/~hostusername/secondarydomain1/index.php?var1=somevarible1&var2=somevarible2&var3=somevarible3" (and so on)?
Or do I have to write a new rewrite rule or condition every time a new variable may be added to the end of the HTTPS URL prefix, considering that ALL HTTPS transactions will ALWAYS START with the "https://sudomain.webhost.com/~hostusername/secondarydomain" format?
I hope my question made sense (.....I think I'm even confused now)
Thanks for all your help so far, guys.
You can write wild-card patterns using regex, or you can simply leave the required "^somevariable-name=<somevariable-value-pattern>" pattern without an end-anchor, in which case any query that starts with that name/value pair will match the pattern.
Regular expressions are a powerful and critical component of mod_rewrite, and they are useful not only in mod_rewrite code but in every major modern scripting and high-level programming language. They are well worth the investment of your time.
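A rough sketch of both points, using the placeholder folder and variable names from this thread. The second rule is purely illustrative: index.php and var1 are stand-ins, and its "-" substitution means "leave the URL unchanged".

```apache
RewriteEngine On
#
# A RewriteRule pattern is tested against the URL-path only, never the query
# string, and the original query string is appended to the redirect target
# unchanged by default. This single rule therefore covers ?var1=a,
# ?var1=a&var2=b, ?var1=a&var2=b&var3=c and so on.
RewriteRule ^secondarydomainfolder/(.*) http://www.secondarydomain.com/$1 [R=301,L]
#
# To act only on particular queries, test QUERY_STRING in a RewriteCond.
# With no end-anchor, this matches any query string that STARTS WITH "var1=",
# however many further name/value pairs follow it.
RewriteCond %{QUERY_STRING} ^var1=
RewriteRule ^index\.php$ - [L]
```

So a new rule should not be needed each time another variable is appended, unless the redirect itself has to depend on what the query string contains.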
We call these cryptic regex notations 'patterns' for good reason: Anything (e.g. a URL or a form entry or a shell command line) that follows a specific/specifiable format can be efficiently detected and manipulated using regular expressions. Chances are that almost everything you ever type into a computer is processed by one or more layers of regular-expression-powered routines. In the case of things done over the Web, likely dozens or even hundreds...
Jim