Trying to setup 301 Redirect that Google recognizes

         

JohnKelly

4:47 pm on Apr 14, 2005 (gmt 0)

10+ Year Member



I've got a site that for several years has had two domains pointing to the same content. After reading that this is a bad thing, I set up a 301 redirect from the secondary domain to the primary, and the secondary domain started disappearing from search engines as anticipated, except for Google.

Google seems to not recognize a 301 redirect using HTTP 1.1, as all the pages are still there under the secondary domain.

After some experimentation here's what I've found. I was using the following in my .htaccess:

RewriteEngine on
RewriteCond %{HTTP_HOST}!^www\.domain\.com [NC]
RewriteCond %{HTTP_HOST}!^$
RewriteRule ^(.*)$ [domain.com...] [R=301,L,QSA]

Using WebBug v5.3, the above produced a 200 OK code using HTTP 0.9 and HTTP 1.0, and a 301 Moved Permanently using HTTP 1.1.

I then commented out the RewriteCond %{HTTP_HOST}!^$ line:

RewriteEngine on
RewriteCond %{HTTP_HOST}!^www\.domain\.com [NC]
# RewriteCond %{HTTP_HOST}!^$
RewriteRule ^(.*)$ [domain.com...] [R=301,L,QSA]

This produced a 301 Moved Permanently code for HTTP 0.9, 1.0 and 1.1. I'm not entirely sure why I had that line in there....

What I wish to do is to forward the secondary domain to www.domain.com, and also forward some defunct subdomains to www.domain.com as well. Can anyone tell me if the above code is correct, and what the RewriteCond %{HTTP_HOST}!^$ line does?

jdMorgan

8:15 pm on Apr 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The original code was correct, with the exception of a few superfluous items.

The "!^$" pattern bypasses the redirect if the HTTP_HOST field is empty. HTTP/1.0 and below do not support passing the requested domain to the server using the "Host" header. A blank HTTP_HOST never matches www.domain.com, so without this check such a client would be redirected again on every request it makes. Therefore, the check for a blank HTTP_HOST is necessary to prevent real HTTP/1.0 clients from getting into an infinite loop. In other words, that line needs to be there, and domain-based redirects are impossible when dealing with real HTTP/1.0 clients and older.

Just to clean this up, I'd recommend:


RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.domain\.com [NC]
RewriteRule (.*) http://www.domain.com/$1 [R=301,L]

This is entirely equivalent to what you had, just a few characters shorter/faster. Checking for non-blank "." is equivalent to checking for NOT blank "!^$". Since you're using a single 'greedy' ".*" pattern, the start and end anchors (^ and $) in the RewriteRule pattern are not needed. The [QSA] flag is only needed if you wish to append a new query string part to an existing one; by default, the existing query string is passed through unmodified.
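To make the logic of the two RewriteCond lines concrete, here is a minimal Python sketch that models the decision they describe. It is only an illustration, not how Apache implements mod_rewrite; the function name redirect_target and its parameters are made up for this example.

```python
import re

# Mirrors the pattern in: RewriteCond %{HTTP_HOST} !^www\.domain\.com [NC]
CANONICAL = re.compile(r"^www\.domain\.com", re.IGNORECASE)

def redirect_target(host, path, query=""):
    """Return the Location a 301 would point to, or None if no redirect applies.

    Hypothetical helper: models the rule set, not Apache itself.
    """
    if not host:
        # RewriteCond %{HTTP_HOST} .  -> a blank Host header skips the redirect
        return None
    if CANONICAL.search(host):
        # Already on the canonical hostname (case-insensitive), nothing to do
        return None
    # RewriteRule (.*) http://www.domain.com/$1 [R=301,L]
    location = "http://www.domain.com/" + path.lstrip("/")
    if query:
        # The existing query string is carried over without needing [QSA]
        location += "?" + query
    return location
```

Under this model, a request with no Host header (as a bare HTTP/1.0 client might send) returns None, which corresponds to the server simply serving the page with a 200 OK.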

[added] As to whether Google will pick up this 301 redirect, that all depends on how often Google spiders your site. Generally, this should be daily if your site has a toolbar PageRank of 6, and less often below that. You could always go submit a few of your most-popular pages to Google once, but using the 'incorrect' domain names, and see if that helps. (This is about all that 'submit your site' is good for at Google -- correcting problems -- since they will usually find any page that has any link to it.) [/added]

Jim

JohnKelly

2:06 am on Apr 15, 2005 (gmt 0)

10+ Year Member



Thanks jdMorgan for the very helpful post... I'll make the changes and see what Google does.

JohnKelly

7:31 pm on Apr 15, 2005 (gmt 0)

10+ Year Member



I must be doing something wrong. I put the following code in my .htaccess file:

RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST}!^www\.domain\.com [NC]
RewriteRule (.*) [domain.com...] [R=301,L]

And then used WebBug to view the headers from [domain.com...] (which should 301 redirect to [domain.com...]). It still outputs a 200 OK code:

Sent Data:
HEAD / HTTP/1.0
Accept: */*
User-Agent: WebBug/5.0

Received Data:
HTTP/1.1 200 OK
Date: Fri, 15 Apr 2005 19:26:04 GMT
Server: Apache
Last-Modified: Thu, 03 Mar 2005 00:54:58 GMT
ETag: "1e00c1-1aa3-142f5c80"
Accept-Ranges: bytes
Content-Length: 6819
Connection: close
Content-Type: text/html; charset=UTF-8

jdMorgan

2:37 am on Apr 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Assuming that there is a space between "}" and "!" in your code (which gets removed when posting here, and would cause a different problem if it were missing) there is nothing wrong with your code.

However, you may be having problems with the mechanism by which these other domains are 'forwarded' to your server, with other code in .htaccess or in httpd.conf that is interfering with your code, or with the location of the .htaccess file. A universal truth that may apply in this case is that .htaccess code can only affect the directory in which it resides and the subdirectories of that directory.

If your domain 'forwarding' function involves additional redirects or page-framing, then this just can't work. You need to have real DNS entries that point those alternate domains to your server's IP address. This domain redirect cannot work if the mechanism used to reach the server does not leave the requested Host header intact, since this is the source of the HTTP_HOST variable's value tested by RewriteCond.

Jim

JohnKelly

11:12 pm on Apr 16, 2005 (gmt 0)

10+ Year Member



Yes, there was a space between the "}" and "!", in my code.

I have the .htaccess file in my root web directory. I have DNS setup for both domains pointing to the same IP address (my dedicated server), and am not using forwarding or framing.

There may possibly be something going on with the httpd config file(s), or even the WebBug program I'm using. I'll download another program to check it with to verify.

Thanks again for your help, and I'll return to post my findings here.

JohnKelly

6:37 pm on Apr 25, 2005 (gmt 0)

10+ Year Member



After some more testing, it appears that it is not possible to generate a 301 redirect using HTTP/1.0. I've tried modifying the ServerName in my httpd config file and still get a 200 OK unless I'm using HTTP/1.1.

Even testing yahoo.com only generates a 301 Redirect using HTTP/1.1 -- HTTP/1.0 generates a 200 OK.

jdMorgan

7:44 pm on Apr 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I mentioned that fact in the second paragraph of msg#2 above. Name-based virtual hosting was not possible with HTTP/1.0, a major reason for the development of HTTP/1.1.

Jim

sitz

1:03 am on Apr 26, 2005 (gmt 0)

10+ Year Member



Although, to be clear, according to [httpd.apache.org], most HTTP/1.0 clients *will* send a Host: header despite its absence in RFC-1945 [w3.org]. Your web bug client may not send it, but most HTTP/1.0-only browsers do (are there still any out there? =) ). You /can/ send something like the following:

HEAD / HTTP/1.0
Host: www.example.com

...and expect to get redirected.
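For anyone who wants to try this without a header-checking tool, here is a rough Python sketch that builds the request above by hand and reads back the status line. The function names build_head_request and fetch_status_line are made up for this example, and www.example.com is just the placeholder host from the request above.

```python
import socket

def build_head_request(path="/", host=None, version="1.0"):
    """Build a raw HEAD request; the Host header is included only if given."""
    lines = [f"HEAD {path} HTTP/{version}"]
    if host:
        lines.append(f"Host: {host}")
    lines.append("Accept: */*")
    # A blank line terminates the request headers
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

def fetch_status_line(server, request, port=80, timeout=10):
    """Send the raw request and return the first line of the response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(request)
        data = sock.recv(4096)
    return data.split(b"\r\n", 1)[0].decode("latin-1")
```

Calling fetch_status_line once with a request built with host=None and once with the secondary domain passed as host should show whether the missing Host header is what turns the 301 into a 200 OK on a given server.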