Forum Moderators: open
Google sees both www.example.com and example.com. When I type in site:example.com, Google shows the domain but also has this below:
"In order to show you the most relevant results, we have omitted some entries very similar to the 1 already displayed. If you like, you can repeat the search with the omitted results included."
When I click that, it proceeds to show me the www and the non-www example domains on top of one another. I'm really concerned this may trip some duplicate content filter.
Right now when I type example.com and www.example.com separately, both names stick in the browser and I can browse without either changing. I really want Google, and people visiting, to only see the example.com version. Is there some way, using .htaccess perhaps, to force Google to only see example.com and to redirect people who type or visit www.example.com to example.com only?
Thanks for your time
Mark
You can use a 301 redirect to correct this for ALL SEs, and it will also consolidate your PR which is probably split between the domains presently.
I use this code in my .htaccess file to accomplish the redirect:
RewriteEngine on
RewriteCond %{HTTP_HOST} !^example\.com
RewriteRule ^(.*)$ http://example.com/$1 [R=301]
If memory serves, it can take several months for Google to get this all sorted out. Using the Server Header Check [webmasterworld.com] is a convenient way to verify that your redirect is doing what you intend.
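If you'd rather script the check yourself, the same verification takes only a few lines. This is a sketch of my own (not part of the tool above): it stands up a throwaway local server that mimics the 301, then reads back the status and Location header, which is exactly what a server header check reports. Point the http.client call at your real hostname instead to test a live redirect.

```python
# Sketch: verify that a server answers with a 301 and the expected
# Location header. A throwaway local server stands in for the real
# site so the example is self-contained.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Mimic the .htaccess rule: send everything to the bare domain.
        self.send_response(301)
        self.send_header("Location", "http://example.com" + self.path)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/page.html", headers={"Host": "www.example.com"})
resp = conn.getresponse()
print(resp.status, resp.getheader("Location"))
# A correct setup reports: 301 http://example.com/page.html
server.shutdown()
```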
Note the [R=301,L] flag in the RewriteRule in DaveAtIFG's second code post - you should use that form as well.
rfgdxm1,
One of the other advantages of doing this kind of redirect is that it makes it a lot harder for other webmasters to link to you at the "wrong address." Since most of us cut-n-paste, the fact that the browser address bar gets "corrected" can help a lot.
Jim
I have a .htaccess file with this in it
# -FrontPage-
IndexIgnore .htaccess */.?* *~ *# */HEADER* */README* */_vti*
<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName mysite.fp.domain.net
AuthUserFile /home/sc81/www/_vti_pvt/service.pwd
AuthGroupFile /home/sc81/www/_vti_pvt/service.grp
(whatever that is)
Can I just add the code to this file as well? If so, should I add it to the top or bottom, or doesn't it matter?
Thx.
CAUTION: IF YOU ARE USING THE FRONTPAGE EXTENSIONS IN YOUR PAGES, TO PROCESS A FORM FOR EXAMPLE, THESE MOD_REWRITE DIRECTIVES WILL NOT WORK AS ADVERTISED.
RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
Jim
Yep. It looks for the .htaccess in the directory where the requested file resides and then looks in all parent directories and processes them all.
But most servers are configured to do this anyway, so it doesn't make a difference.
In the end it's only peanuts ...
I happen to know someone whose husband was struck by lightning, so don't give up hope! www is just a subdomain, sure they can pick it up. They could even pick this up as a subdomain:
Aghhhhhhhhh;)
Googlebot will pick up anything that can't run away on its own two feet.
[webmasterworld.com...]
> Are you using an HTTP/1.0 client? If so, you'll need to add another RewriteCond [...]
Um, why?
> RewriteCond %{HTTP_HOST} .
This is saying "If there's an HTTP_HOST header AND [more rewrites...]", correct?
When trying to figure out what this looping condition is about, I did notice this in RFC 2616 (and 1945 & 2068):
Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing HTTP/1.0 user agents
will erroneously change it into a GET request.
This has nothing to do with my question, right? (And can't be controlled at the server-side, I should think...) Actually, that seems to be a stupid question, but I just thought it was something worth noting...
> Um, why?
HTTP/1.0 clients do not provide a Host: request header.
Therefore, the %{HTTP_HOST} environment variable will be empty if the request comes from or through an HTTP/1.0 client or proxy.
If the condition "RewriteCond %{HTTP_HOST} !^your_hostname_here" is used alone without checking for the existence of a hostname first, then that condition will match a blank hostname, and the redirect will occur for every request. The client and server will end up in a 301 redirect loop, until one or the other reaches its redirection limit.
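To make that failure mode concrete, here's a small Python model of the two rule variants (my own illustration - mod_rewrite itself is not involved, and the hostnames are the placeholders from this thread). The single-condition version keeps matching when HTTP_HOST is empty, so every HTTP/1.0 request gets redirected again; the two-condition version serves the page instead:

```python
import re

def naive_rule(host):
    r"""Only: RewriteCond %{HTTP_HOST} !^www\.example\.com"""
    if not re.match(r"^www\.example\.com", host):
        return "redirect"   # fires even when host is empty
    return "serve"

def safe_rule(host):
    r"""RewriteCond %{HTTP_HOST} .   (host must be non-empty)
        RewriteCond %{HTTP_HOST} !^www\.example\.com"""
    if re.search(r".", host) and not re.match(r"^www\.example\.com", host):
        return "redirect"
    return "serve"

# HTTP/1.1 client on the wrong hostname: both variants redirect once. Fine.
print(naive_rule("example.com"), safe_rule("example.com"))  # redirect redirect

# HTTP/1.0 client: no Host header, so %{HTTP_HOST} is empty. The naive
# rule redirects again on every follow-up request (the loop); the safe
# rule just serves the page.
print(naive_rule(""), safe_rule(""))                        # redirect serve
```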
Jim
The advantage of mod_rewrite is that it can test the requested Host HTTP header, and redirect only if it is "wrong" -- i.e. not the one you want to be known by. The problem comes in where a true HTTP/1.0 client is involved in the transaction. In that case, no Host header is sent with the request and as a result, there is nothing there for mod_rewrite to test. So the above code modification prevents a loop in that case.
The Host header was added in HTTP/1.1 in order to support shared IP addresses. Without the Host header, the server has no idea which virtual server to steer the request to. As a result, HTTP/1.0 is almost dead today, because it won't work with shared IP addresses. But there are still a few holdouts.
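A toy sketch (my own, with made-up document roots) of what name-based virtual hosting does with that header: the server chooses a site purely by Host, so a request that arrives without one can only fall back to a default.

```python
# Toy name-based virtual hosting: several sites share one IP, and the
# server picks the document root from the Host request header alone.
VHOSTS = {
    "www.example.com": "/var/www/example",
    "www.example.org": "/var/www/other-site",
}
DEFAULT = "/var/www/default"  # the fallback site

def docroot_for(host_header):
    # host_header is None when a true HTTP/1.0 client sends no Host.
    if host_header is None:
        return DEFAULT
    return VHOSTS.get(host_header, DEFAULT)

print(docroot_for("www.example.com"))  # /var/www/example
print(docroot_for(None))               # /var/www/default - HTTP/1.0 can't choose
```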
It is one thing if you're on a shared IP and HTTP/1.0 requests can't access your site. But it's potentially worse if HTTP/1.0 requests *can* access your site but your code sends them into a loop and possibly crashes your server!
Note that although Google and several other search engines "advertise" in access logs as HTTP/1.0 clients, they *do* send a Host header, and they do support many HTTP/1.1 functions. I think they advertise as HTTP/1.0 in order to at least function with servers that don't support HTTP/1.1, while not giving up all of the advantages of HTTP/1.1.
Jim