Forum Moderators: Robert Charlton & goodroi
My site's home page dropped from page 1 to nowhere in Google a few months ago. I hadn't changed anything before that. After months of reading here and there, and this forum in particular, I think Google's canonical duplicate-content penalty might be the cause.
Here is the scenario:
1) My internal pages link to my homepage in many different ways:
mydomain.com , www.mydomain.com , [mydomain.com...] , www.mydomain.com/index.htm , www.mydomain.com/index.html
2) Google indexed both my [mydomain.com...] and [mydomain.com...]
3) index.htm and index.html have similar but not identical content, which Google sees as duplicate pages. The reason is that either my FrontPage 2000 or my web host messed up: when I change my homepage and upload it, only index.html is updated and index.htm is not. That leaves 2 pages with very similar content.
Based on what I've read in this forum, is the best way to correct all of the above mess a 301 redirect in .htaccess?
If so, how do I write one piece of code to tackle all 3 problems above? Or should I write 2 or 3 separate mod_rewrite rulesets, one for each problem, in the same .htaccess file?
If so, what's the exact code I should use?
Your help and suggestions are highly appreciated.
Regards,
one frustrated site owner
RewriteCond %{HTTP_HOST} ^yourdomain\.com [OR]
RewriteCond %{HTTP_HOST} ^(www\.)?yourdomain\.com
RewriteRule (.*) [yourdomain.com...] [R=301,L]
This will redirect requests for index.html:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html
RewriteRule ^index\.html$ [yourdomain.com...] [R=301,L]
This will redirect requests for index.htm:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.htm
RewriteRule ^index\.htm$ [yourdomain.com...] [R=301,L]
You should go through your site and make sure that you link to your home page, and to all other pages for that matter, in only one style,
i.e. [yourdomain.com...]
RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.mysite\.com
RewriteRule (.*) [mysite.com...] [R=301,L]
RewriteCond %{QUERY_STRING} !^$
RewriteRule .* - [G]
But I guess I may as well think about that soon.
Whatever it takes.
The code could be a little tighter like this:
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com
RewriteRule (.*) http://www.yoursite.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html?
RewriteRule ^index\.html?$ http://www.yoursite.com/ [R=301,L]
The first ruleset will redirect all requests that are not for a true subdomain, or for www., to www.yoursite.com (I use a negative condition because it catches typos, e.g. wwww.yoursite.com, in most cases, and in some cases can help with breaking frames.)
Cond 1: checks that the request has a Host header at all (HTTP/1.0 clients may not send one), so that such requests are not redirected over and over.
Cond 2: says 'if the HOST is not www.yoursite.com' continue with the redirect.
Rule 1: says 'store' everything requested for use in a back-reference ($1), and if the conditions match, redirect to the same path on www.yoursite.com
The second ruleset will redirect any original request (typed in a browser, a clicked link, etc.) for index.htm or index.html to www.yoursite.com/ (the ? makes the 'l' optional). It is important to use THE_REQUEST as a condition for this: it holds the request line exactly as the client sent it, so the rule does not fire when the server itself internally maps www.yoursite.com/ to the index file. Without that condition, your server will not be able to access the content of the index to serve it at www.yoursite.com/
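For anyone pasting this into an empty .htaccess, here is a rough sketch of how the two rulesets above might sit together in one file. The rules are the same; the only changes are my own assumptions: the RewriteEngine line (borrowed from the earlier post in this thread), and putting the index rule first so that a request like yoursite.com/index.html should reach www.yoursite.com/ in one redirect instead of two. www.yoursite.com is a placeholder for your own host.

```apache
# Sketch only: assumes mod_rewrite is enabled and this .htaccess
# lives in the document root of the site.
RewriteEngine On

# 1) A client asked for /index.htm or /index.html directly:
#    send it to the bare "/" on the canonical host.
#    THE_REQUEST is the original request line, so internal lookups
#    of the index file by the server do not trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html?
RewriteRule ^index\.html?$ http://www.yoursite.com/ [R=301,L]

# 2) Any other request on a non-canonical hostname:
#    keep the requested path ($1) and switch to the www host.
#    The first condition skips requests that carry no Host header.
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.yoursite\.com
RewriteRule (.*) http://www.yoursite.com/$1 [R=301,L]
```

Either order ends up at the same URL; the ordering here only aims to save the visitor (and the spider) one extra hop.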
Justin
I had a problem with G indexing tracking URLs from PPC engines, so I use this:
RewriteCond %{QUERY_STRING} ^[a-z0-9] [NC]
RewriteRule (.*) [mydomain.com...] [R=301,L]
See any room for improvement?
Also, I changed my actual code above because I use .shtml. Do you have suggestions for covering all bases: html, htm, shtml?
At one time last year Google had 8 different URLs indexed for my home page. I've been gone for 14 months so far with no hope in sight. G is still showing 2-3 versions of every page of my site, even though I implemented the above about 6 or 7 months ago.
I think all of these duplicate entries are what caused me to lose all ranking. Google must be seeing the different URLs and considering them different pages, and therefore exact duplicates.
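On the two questions above (covering html, htm and shtml, and the tracking URLs), one possible approach is sketched below. It is untested against this particular site. www.example.com is a placeholder because the real redirect target is not shown in the post, and the sketch assumes that no page on the site needs a query string to work. If any page does, the second ruleset would break it and should be narrowed.

```apache
# Sketch only: www.example.com is a placeholder host.
RewriteEngine On

# Index files in any of the three extensions.
# "s?" makes the leading s optional and "l?" makes the trailing l
# optional, so this matches index.htm, index.html and index.shtml
# (and index.shtm as a side effect).
# The trailing "?" on the target drops any query string in the same hop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.s?html?
RewriteRule ^index\.s?html?$ http://www.example.com/? [R=301,L]

# Tracking URLs: any request that carries a non-empty query string
# is redirected to the same path with the query string removed.
# ASSUMPTION: nothing on the site depends on query parameters.
RewriteCond %{QUERY_STRING} .
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
```

A substitution ending in a bare "?" tells mod_rewrite to discard the original query string, which is what keeps the redirected URL clean. On Apache 2.4 the QSD flag does the same job.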