Google Canonical Duplicate Penalty Help

4:19 am on Sep 30, 2005 (gmt 0)

New User

10+ Year Member

joined:May 19, 2005
votes: 0

Hi everyone:

My site (the home page) was dropped from page 1 to nowhere in Google a few months ago. I haven't done anything prior to that. After months of studying here and there, and in particular this forum, I think Google's Canonical duplicate penalty might be the key.

Here is the scenario:

1) I have internal pages with many different links pointing to my homepage at :
mydomain.com , www.mydomain.com , [mydomain.com...] , www.mydomain.com/index.htm , www.mydomain.com/index.html

2) Google indexed both my [mydomain.com...] and [mydomain.com...]

3) index.htm and index.html have similar but not identical content, which Google sees as duplicate pages. The reason is that either my FrontPage 2000 or my web host messed up: when I update my homepage and upload it, only index.html is updated and index.htm is not. That leaves 2 pages with very similar content.

Based on what I've read in this forum, is the best way to correct the above mess a 301 redirect in .htaccess?

If so, how do I write one piece of code to tackle all 3 of the above problems? Or should I write 2 or 3 different mod_rewrite rules, one for each problem, in the same .htaccess file?

If so, what's the exact code I should use?

Your help and suggestions are highly appreciated.

one frustrated site owner

2:47 pm on Sept 30, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 3, 2003
votes: 0

provided you are hosted on *nix/apache, the following will rewrite and redirect non www versions of your url:

RewriteCond %{HTTP_HOST} ^yourdomain\.com [NC]
RewriteRule (.*) [yourdomain.com...] [R=301,L]

this will rewrite and redirect the index.html:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html
RewriteRule ^index\.html$ [yourdomain.com...] [R=301,L]

this will rewrite and redirect your index.htm:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.htm
RewriteRule ^index\.htm$ [yourdomain.com...] [R=301,L]

You should go through your site and make sure that you only link to your home page, or all pages for that matter, with one style.
ie. [yourdomain.com...]

3:22 pm on Sept 30, 2005 (gmt 0)

Preferred Member

joined:July 19, 2002
votes: 0

I don't know about the last entry. This one seems to be working best for us and hopefully, in time, will clear all the dupe probs in both Google and MSN, allowing us to retain our previous positions in the SERPs.

RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.mysite\.com
RewriteRule (.*) [mysite.com...] [R=301,L]
RewriteCond %{QUERY_STRING} !^$
RewriteRule .* - [G]


3:30 pm on Sept 30, 2005 (gmt 0)

Inactive Member
Account Expired


I have not really gone for the redirects from index.html to the domain.

But I guess I may as well think about that soon.

Whatever it takes.

5:08 pm on Sept 30, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 9, 2005
votes: 0

Actually my3cents gave some pretty good advice...

The code could be a little tighter like this:

RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com
RewriteRule (.*) http://www.yoursite.com/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html?
RewriteRule ^index\.html?$ http://www.yoursite.com/ [R=301,L]

The first ruleset will redirect every request whose host is not exactly www.yoursite.com to www.yoursite.com. I use a negative condition because it catches typos (e.g. wwww.yoursite.com) in most cases, and in some cases it can help with breaking frames.

Cond 1: checks that the request has a Host header at all (HTTP/1.0 clients may not send one).
Cond 2: says 'if the Host is not www.yoursite.com', continue with the redirect.
Rule 1: captures everything requested in a back-reference ($1) and, if both conditions match, redirects to the same path on www.yoursite.com.

The second ruleset will redirect any original request (typed in a browser, a clicked link, etc.) for index.htm or index.html to www.yoursite.com/ (the ? makes the 'l' optional). It is important to use THE_REQUEST as the condition here: it holds the request line exactly as the client sent it, so the rule does not fire when the server internally fetches the index file to serve www.yoursite.com/. Without it, the server would not be able to access the content of the index for the root URL.
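
To tie the two rulesets back to the original question, here is a sketch of how they might sit together in one .htaccess file. It assumes mod_rewrite is available, that the host allows .htaccess overrides, and that www.yoursite.com (a placeholder) is the preferred host name. Putting the index rule first is a choice made for this sketch, not something stated above: it lets a request such as yoursite.com/index.html reach the canonical URL in one redirect instead of two.

```apache
# Sketch only: yoursite.com is a placeholder, and the rule order is an assumption.
RewriteEngine On

# 1) index.htm or index.html requested directly by the client -> root URL.
#    THE_REQUEST is the original request line, so the server's own internal
#    lookup of the index file for "/" is not redirected again.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html?
RewriteRule ^index\.html?$ http://www.yoursite.com/ [R=301,L]

# 2) Any host other than www.yoursite.com -> same path on www.yoursite.com.
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com
RewriteRule (.*) http://www.yoursite.com/$1 [R=301,L]
```

With these rules, yoursite.com, yoursite.com/index.htm, www.yoursite.com/index.htm and www.yoursite.com/index.html should all end up at http://www.yoursite.com/ with a 301.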


6:30 pm on Sept 30, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 3, 2003
votes: 0

thanks jd01, I'll use that.

I had a problem with G indexing tracking urls from ppc engines, so I use this:

RewriteCond %{QUERY_STRING} ^[a-z0-9] [NC]
RewriteRule (.*) [mydomain.com...] [R=301,L]

see any room for improvement?
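
One possible tightening, offered as a sketch rather than a tested fix: a single dot matches any non-empty query string, and ending the substitution with a bare ? tells mod_rewrite to drop the original query string from the redirect target. The host www.mydomain.com is a placeholder, and the sketch assumes that no page on the site legitimately needs its query string.

```apache
# Sketch only: www.mydomain.com is a placeholder host.
# Assumes no legitimate page on the site depends on a query string.
RewriteCond %{QUERY_STRING} .
# The trailing "?" drops the original query string from the target URL.
RewriteRule (.*) http://www.mydomain.com/$1? [R=301,L]
```

Because the redirect target carries no query string, the condition fails on the follow-up request and the rule does not loop.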

Also, I changed my actual code above because I use .shtml. Do you have suggestions for covering all bases: html, htm, shtml?

At one time last year Google had 8 different URLs indexed for my home page. I've been gone for 14 months so far with no hope in sight. G is still showing 2-3 versions of every page of my site, even though I implemented the above about 6 or 7 months ago.

I think all of these duplicate entries are what caused me to lose all ranking. It must be seeing the different URLs and considering them different pages, and therefore exact duplicates.
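
On the question above about covering html, htm and shtml in one rule, a minimal sketch follows. It assumes the same setup as the index rule earlier in the thread, with www.yoursite.com as a placeholder host. An optional s in front of the existing pattern is one way to do it.

```apache
# Sketch only: www.yoursite.com is a placeholder host name.
# s?html? matches index.htm, index.html and index.shtml (and also index.shtm).
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.s?html?
RewriteRule ^index\.s?html?$ http://www.yoursite.com/ [R=301,L]
```

This only redirects the index file in the site root; index files in subdirectories would need their own pattern.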