Forum Moderators: phranque
I come from a Google discussion about canonical fix.
I was not aware that my domain name could be resolved in many different ways:
192.168.123.123/foldername/
quux-foo.com/foldername/
www.quux-foo.com/foldername/
anythingyouwantrandom.quux-foo.com/foldername/
www.anythingyouwantrandom.quux-foo.com/foldername/
foldername.quux-foo.com/
www.foldername.quux-foo.com/
example.com/
www.example.com/ <=== Canonical!
example.com/
www.example.com/
I'm wondering about how to redirect the undesired domains to the canonical one:
What code should I add in my .htaccess file?
Thanks
[edited by: jdMorgan at 7:39 pm (utc) on July 2, 2007]
[edit reason] example.com [/edit]
Options +FollowSymLinks
RewriteEngine on
#
# Redirect to fix "foldername" requests
RewriteRule ^foldername(.*)$ http://www.example.com$1 [R=301,L]
#
# Redirect all non-canonical domain requests to requested resource in canonical domain
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
The fix for "foldername" may need some adjustment, as it wasn't entirely clear what "foldername" means -- Whether it represents one specific folder or multiple folders which are identifiable in some unspecified manner.
For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].
Jim
[edit] Fixed misspelled "RewriteRule" [/edit]
[edited by: jdMorgan at 6:58 pm (utc) on July 4, 2007]
Basically, if you build a new site and you pick one and only one version of the domain and then use code like this to redirect all others, you will never have this problem again. But if you have an existing site that is already listed under multiple variations of the domain, then you need to take into account how you have linked to the pages in the site, how others have linked to pages in the site, and how search engines have listed the pages in the site.
Jim
But if you have an existing site that is already listed under multiple variations of the domain, then you need to take into account how you have linked to the pages in the site, how others have linked to pages in the site, and how search engines have listed the pages in the site.
Well, that's just my case.
Links are all at http://www.example.com/ or also http://www.example.com/page.htm.
Google lists my pages as www.example.com/page.htm
but it displays different results (especially in the number of indexed pages) depending on whether I search for the "www" or the "non-www" domain. In particular, if I search for "non-www", it returns many more results...
Anyway, in both cases the URLs displayed are "www"... I hope that was clear...
What would be your advice?
[edited by: jdMorgan at 11:31 pm (utc) on July 6, 2007]
[edit reason] example.com [/edit]
This is my current .htaccess file:
# -FrontPage-
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName www.mydomain.net
AuthUserFile /home/user/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/user/public_html/_vti_pvt/service.grp
AddType application/x-httpd-cgi .htm
options -Indexes
Could I simply add your code above or below it?
Besides, I don't want the foldername fix: how can I remove it from that code?
Please be patient, I'm totally in the dark when it comes to .htaccess files...
Thanks for your kind support. :)
The order in which the modules process the .htaccess file is determined by the reverse LoadModule order on Apache 1.x, and by an internal priority scheme on Apache 2.x.
I encourage you to experiment and test. Keep a backup of your old .htaccess file, and if you break something, simply replace your new (broken) .htaccess with the old working backup. This limits any 'damage' to the few seconds that the broken .htaccess is active on your server.
Jim
I should add the following code to my .htaccess file:
Options +FollowSymLinks
RewriteEngine on
#
# Redirect to fix "foldername" requests
RewriteRule ^foldername(.*)$ http://www.example.com$1 [R=301,L]
#
# Redirect all non-canonical domain requests to requested resource in canonical domain
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}!^www\.example\.com
RewrireRule (.*) http://www.example.com/$1 [R=301,L]
below is my current .htaccess file:
# -FrontPage-
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName www.mydomain.net
AuthUserFile /home/user/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/user/public_html/_vti_pvt/service.grp
AddType application/x-httpd-cgi .htm
options -Indexes
What should I do?
I tried to add it above and below it, but it doesn't work.
I also tried the code on its own, but again it doesn't work.
Please help me. I know nothing about .htaccess issues.
How can I resolve this?
Thanks :)
# -FrontPage-
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
Order deny,allow
<Limit GET POST>
Allow from all
</Limit>
<Limit PUT DELETE>
Deny from all
</Limit>
AuthName www.mydomain.net
AuthUserFile /home/user/public_html/_vti_pvt/service.pwd
AuthGroupFile /home/user/public_html/_vti_pvt/service.grp
AddType application/x-httpd-cgi .htm
Options -Indexes +FollowSymLinks
RewriteEngine on
# Redirect all non-canonical domain requests to requested resource in canonical domain
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
I have to come back to this issue:
I discovered that the code above also affects subdomains:
[sub.example.com...] is resolved as http://www.example.com/sub/
I would like to keep my subdomains, since I have already given them out for linking. Is there a way to fix this, or must I resign myself to turning my subdomains into subdirectories and resubmitting them for linking?
Hope I was clear...
Thanks
[edited by: jdMorgan at 11:30 pm (utc) on July 6, 2007]
[edit reason] example.com [/edit]
Using an undefined set of subdomains opens up your site for malicious linking -- people linking to subdomains of your domain that do not exist. So, I suggest that you limit yourself to a short list of pre-defined subdomains, and redirect everything else to the canonical domain. Again, the solution depends on your site's specifics.
Jim
In the original example that this was copied from, foldername was the name of the folder on the main host domain that the add-on domain (a separate site) was being served from. The folder name fix was to allow the content in that folder to be only indexed under the add-on domain name that resolves directly to that folder as a separate site.
The redirect also catches all sub-domains, and sub-sub-domains, etc, and redirects them all to the canonical add-on domain name.
The code went in the root of the add-on domain, which is actually a folder off the main domain; that is, a folder off the main hosting account. The file did not go in the root of the main domain.
That is, the code was located:
# THIS FILE RESIDES AT: 123.123.123.123/foldername/.htaccess
# a.k.a. (www.)mainsite.com/foldername/.htaccess
# a.k.a. (www.)(anythingyouwant.)foldername.mainsite.com/.htaccess
# DOMAINS: (www.)some-site.com and (www.)that-site.com
# are parked and served from the same server and folder as add-on domains.
The site resolves only at www.some-site.com and all other possible URLs for the content serve a 301 redirect to the canonical URL.
# Redirect all non-canonical domain requests to requested resource
# in canonical domain except for recognized subdomains
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} !^subdomain1\.example\.com
RewriteCond %{HTTP_HOST} !^subdomain2\.example\.com
RewriteCond %{HTTP_HOST} !^subdomain3\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
# Redirect all non-canonical domain requests to requested resource
# in canonical domain except for recognized subdomains
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^(www|subdomain1|subdomain2|subdomain3)\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
Jim
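As a side note to the patterns above, here is a minimal sketch of a stricter variant. It assumes the same placeholder subdomain names, and it assumes you want a hostname with an explicit port to count as canonical. The patterns above have no end anchor, so any Host value that merely begins with an allowed name (for example one with a trailing dot or extra labels after ".com") would escape the redirect. Adding an end anchor closes that gap:

```apache
# Sketch only: subdomain1..3 are placeholders, as in the examples above.
# The trailing "$" requires the hostname to end right after ".com";
# the optional (:[0-9]+)? group lets a Host header such as
# "www.example.com:80" pass without being redirected (an assumption;
# drop the group if you would rather redirect those requests too).
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^(www|subdomain1|subdomain2|subdomain3)\.example\.com(:[0-9]+)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Whether this matters depends on which hostnames your server actually answers for, so test it against your own setup before relying on it.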
BUT...
I found another bug:
I have a search function onsite, based upon a perl .cgi script, under the sub http://sub.example.com/
When that search displays the results page it resolves at http://www.example.com/sub/ and all my home page links are "cut off" as they point all to http://sub.example.com!
Apparently I fixed that by adding the bolded string below to your code:
# Redirect all non-canonical domain requests to requested resource
# in canonical domain except for recognized subdomains
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} !^sub\.example\.com
RewriteCond %{HTTP_HOST} !^www\.sub\.example\.com
RewriteCond %{HTTP_HOST} !^sub2\.example\.com
RewriteCond %{HTTP_HOST} !^sub3\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
Now the search resolves at http://www.sub.example.com/ and that's compatible with my links, but I'm not sure that it is technically correct...
What's your opinion?
[edited by: jdMorgan at 5:38 pm (utc) on July 9, 2007]
[edit reason] example.com [/edit]
Without my "fix", starting from the correct subdomain http://sub.example.com/ and doing a search, the results page resolves at http://www.example.com/sub/.
With my fix it resolves at http://www.sub.example.com/.
Why it resolves at www.sub. instead of sub., I don't know at all.
All that I know is that in that way it works fine for my links.
I don't think it depends on the .cgi script...
Do you have a better idea?
[edited by: jdMorgan at 5:35 pm (utc) on July 9, 2007]
[edit reason] example.com [/edit]
Rule order is important. In general, you'll want to place your most-specific external redirect rules first, then the least-specific external redirects, then the most specific internal rewrite rules, then the least specific.
Used in conjunction with the [L] flag on RewriteRules, this helps to prevent redirects from exposing internally-rewritten URL-paths.
Jim
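The ordering described above might look like the following skeleton. This is only an illustration: the page names, the "products" path, and product.php are hypothetical and do not come from this thread. Only the canonical-hostname rule is taken from the earlier posts.

```apache
RewriteEngine on

# 1. Most-specific external redirect (hypothetical: a single moved page)
RewriteRule ^old-page\.htm$ http://www.example.com/new-page.htm [R=301,L]

# 2. Least-specific external redirect: canonical hostname (from this thread)
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# 3. Most-specific internal rewrite (hypothetical: friendly URL to a script)
RewriteRule ^products/([0-9]+)$ /product.php?id=$1 [L]

# 4. Least-specific internal rewrite (hypothetical: extensionless URLs to .htm)
#    Applies only when the request is not an existing file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]+)$ /$1.htm [L]
```

Because the redirects in steps 1 and 2 run before any internal rewrite, a visitor on a non-canonical hostname is sent to the public URL they asked for, not to an internal path such as /product.php?id=...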
Well,
Didn't I overwrite them by editing the .htaccess file with the new code?
Didn't the server follow the new rules?
I'm a bit confused...
Jim