Avoiding duplicates with htaccess

Stopping all duplicate pages for spiders


Peter

1:40 am on Dec 4, 2004 (gmt 0)

10+ Year Member


Hello. The objective is to avoid all possible sources of apparent duplicate content for spiders.

- www.mysite.net is also accessible by .org and .com
- Except for things in cgi-bin, no parameters are used and all pages are .html
- For some reason the (shared) server responds identically to www.mysite.net// (two slashes), to www.mysite.net/index.html/qsdfgh, and also, of course, to www.mysite.net/index.html?qsdfgh (see the aside after this list)
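Aside: if the server happens to run Apache 2.x and the host allows the directive in .htaccess, the path-info case could also be refused at the source. This is only a sketch, separate from the rule set below:

# Apache 2.x only: reject trailing path-info on static files, so that
# /index.html/qsdfgh returns 404 Not Found instead of serving /index.html
AcceptPathInfo Off

On an Apache 1.3 host the directive does not exist, so the rewrite rules below have to cover that case instead.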

The following .htaccess at root level seems to prevent all these duplicates (except dups inside cgi-bin, which don't matter because they are excluded in robots.txt and by "noindex,nofollow" in the pages themselves).

Can anyone see any errors or improvements, please?

...
RewriteEngine On
...
# If wrong TLD or query outside cgi-bin, then redirect to .net without the query
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.mysite\.net [OR]
RewriteCond %{QUERY_STRING} .
RewriteCond $1 !^cgi-bin/
RewriteRule ^(.*)$ http://www.mysite.net/$1? [R=301,L]
# If anything after .html, strip it off and redirect
RewriteRule ^(.*)\.html(.) http://www.mysite.net/$1.html? [R=301,L]
# Rewrite www.mysite.net/folder//
RewriteRule ^(.*)// http://www.mysite.net/$1/? [R=301,L]
# Rewrite www.mysite.net//
RewriteRule ^/ http://www.mysite.net/? [R=301,L]
...

Thank you.

jdMorgan

4:06 am on Dec 4, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Peter,

Haven't seen you in a while... Welcome back to the Apache forum!

A few ideas:


# If anything after .html, strip it off and redirect
RewriteRule ^([^.]*)\.html. http://www.mysite.net/$1.html? [R=301,L]
# Redirect www.mysite.net/folder(s)//<anything>
RewriteRule ^(.+)//+(.*) http://www.mysite.net/$1/$2? [R=301,L]
# Redirect www.mysite.net//<anything>
RewriteRule ^/+(.*) http://www.mysite.net/$1? [R=301,L]

1st Rule: Improve pattern-parsing efficiency and delete unnecessary parentheses after "html"
2nd Rule: Catch 2 or more slashes, allow for subfolders
3rd Rule: Catch 2 or more slashes, added code to catch mysite.net//<something>

Other than that, good tight code!

Note that in the case where you get a request for "mysite.net//" the second rule will not be invoked. Rather, the third rule will now take care of that problem.
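For reference, the complete rule set with those changes folded in might read like this (nothing new, just the rules from this thread reassembled; the host/query block is unchanged from your original):

RewriteEngine On
# If wrong TLD or query outside cgi-bin, then redirect to .net without the query
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.mysite\.net [OR]
RewriteCond %{QUERY_STRING} .
RewriteCond $1 !^cgi-bin/
RewriteRule ^(.*)$ http://www.mysite.net/$1? [R=301,L]
# If anything after .html, strip it off and redirect
RewriteRule ^([^.]*)\.html. http://www.mysite.net/$1.html? [R=301,L]
# Redirect www.mysite.net/folder(s)//<anything>
RewriteRule ^(.+)//+(.*) http://www.mysite.net/$1/$2? [R=301,L]
# Redirect www.mysite.net//<anything>
RewriteRule ^/+(.*) http://www.mysite.net/$1? [R=301,L]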

Jim

Peter

1:15 pm on Dec 4, 2004 (gmt 0)

10+ Year Member



Thank you very much, Jim, for your welcome and your most instructive reply, as always.

By the way, can you see any legitimate reason for the server (over which I have no control) to accept these requests that we don't want, or does this suggest a bad configuration somewhere?

Peter.

jdMorgan

4:23 pm on Dec 4, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The Web specifications allow for requests such as "GET //robots.txt", so the servers allow them. The intent was to make the Web more robust in the face of human error.

You can ignore such requests, redirect them, or 403-Forbid them, as you choose.
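If you go the 403 route, one possible sketch (hypothetical, untested) keys on THE_REQUEST, which contains the request line exactly as the client sent it, before any path cleanup:

# Hypothetical: forbid any request whose path begins with two or more slashes
RewriteCond %{THE_REQUEST} ^[A-Z]+\ //
RewriteRule ^ - [F]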

Jim

Peter

7:42 pm on Dec 4, 2004 (gmt 0)

10+ Year Member



Thanks, that's all very clear now.

Peter.