Forum Moderators: phranque
Here is what I used below:
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
The first line should be checking for a blank host, because HTTP/0.9 or 1.0 send host headers. The rewrite works properly for HTTP/1.1, and I don't really care about 0.9, but some clients still use HTTP/1.0, and I need to find a fix. I have also tried the standard rewrite without the negated pattern; it does the same thing.
phish
I assume that's a typo, since HTTP/0.9 and HTTP/1.0 *do not* send host headers, and therefore cannot be used to access name-based virtual servers. That means that for a true HTTP/0.9 or HTTP/1.0 client, your server must be on a unique (non-shared IP address) to be accessible, and the distinction between hostnames does not exist, except at the DNS level. For that reason, the RewriteRule is disabled if the hostname is blank, and no redirection can (or should) take place.
I should note that I used the "%{HTTP_HOST} ." in code that I posted here years ago in preference to using the protocol field at the end of the client request header (available in %{THE_REQUEST}) simply because Googlebot and others "advertise" HTTP/1.0 in that field, but in actuality, they support "extended" HTTP/1.0, which *does* support sending a Host: header in the request. As a result, they *can* access name-based hosts on shared IP addresses, and can handle a hostname-based domain redirect, so we want to give it to them.
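To make the logic of those two conditions concrete, here's a small sketch (my own toy model, not Apache itself; the hostnames are just the placeholders used in this thread):

```python
def should_redirect(host):
    """Toy model of the two RewriteConds: redirect only when the client
    sent a non-blank Host AND it isn't already the canonical name."""
    if not host:
        # RewriteCond %{HTTP_HOST} .  -- blank host (true HTTP/1.0 client),
        # so a name-based redirect makes no sense; serve as-is.
        return None
    if host.startswith("www.example.com"):
        # RewriteCond %{HTTP_HOST} !^www\.example\.com -- already canonical.
        return None
    # Otherwise issue the 301 to the canonical hostname.
    return "http://www.example.com/"

print(should_redirect(""))                 # blank host -> no redirect
print(should_redirect("example.com"))      # redirect target
print(should_redirect("www.example.com"))  # already canonical -> no redirect
```

Note that, like the unanchored regex, this matches on the prefix of the hostname only.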
Just trying to make sure I understand the question here...
JIm
Yes, it was a typo... sorry.
And yes, supposedly the site is set up on a static IP (by itself), which is why I'm also confused.
And yes again, Google is the reason for this.
This is a brand new site, so I want to make sure I have my ducks in a row before releasing it. This is actually for a client who wanted to use their own hosting over ours, which is why I'm having this issue. On our own dedicated server we do all redirects in httpd.conf, whereas here I'm trying to do this through .htaccess, and I know some of the rewrite rules differ between the two. I was making sure I didn't have a typo, or a missed escape character, or some regex problem.
If you have the DNS set up to point those two domains to different servers, then both RewriteCond lines can be omitted, and of course all content should be removed and put on the other server.
Another way to put this is that in the HTTP/1.0 world, domain names mean nothing to a server; all it knows/uses/cares about is its own IP address. The domain name is translated at the DNS level to an IP address before any HTTP/1.0 request is sent by the client, and is then essentially discarded for the remainder of the transaction.
So in HTTP/1.0, the rule should not redirect based on domain names, and if it did, it would have to be to a different server at a different IP address. If it were allowed to redirect to itself, it would continue doing so, in an 'infinite' loop, which is why we include the RewriteConds to prevent this should a 'real' HTTP/1.0 request actually arrive at your server.
HTTP/1.1 was released primarily so that name-based virtual servers could be used: IPv4 address consumption was projected to exhaust the available address space very soon, demanding that servers be able to share IP addresses. The only way to do that was to bolt a "Host:" header onto the HTTP request, so that a server could tell which of the many sites at the same IP address should receive the request. Thus "name-based virtual hosting" entered our lexicon, and we haven't run out of IPv4 addresses quite yet. But that "Host:" header is not used to route HTTP requests over TCP/IP; that still works in essentially the same manner as it did in the HTTP/1.0 days.
Does this help?
Jim
Googlebot lies. It is really capable of both HTTP/1.0 and HTTP/1.1
It sends requests claiming that it's using HTTP/1.0 because older servers can't work properly with HTTP/1.1
But it really sends "Host:" headers in its purported HTTP/1.0 requests, making them "extended HTTP/1.0" requests.
True HTTP/1.0 servers will ignore that header.
HTTP/1.1 servers don't care if a request claims HTTP/1.0, as long as it contains the Host: header when that header is required for the server to work properly (it usually is on any shared server, so the host can limit your domain names).
So I think maybe your test plan is flawed.
Here is the correct behaviour of your server, and I believe you'll find the code complies with this:
You can verify this operation manually using Telnet to connect to port 80, if you find it necessary to bypass any inconsistencies in whatever tool you're using. The "HyperTerminal" program bundled with Windows can be used for Telnet. (I'm not actually sure whether they still ship it with newer Windows releases like XP, so search for it on your machine and see.) It requires you to type in the entire HTTP request literally, the process is entirely case-sensitive, and there is no support for editing what you type, so it can be a real pain. You can write text "scripts" like those shown below into plain-text files and tell Telnet to send them, if you're a poor typist like me.
But just for example, you would type (ending with a blank line to complete the request):
GET / HTTP/1.0
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Now here's what to type to emulate what Googlebot might actually send:
GET / HTTP/1.0
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Host: example.com
Next, here's what Googlebot would probably send in response to that redirect:
GET / HTTP/1.1
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Host: www.example.com
Now just to verify that I spoke the truth above, try this:
GET / HTTP/1.0
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Host: www.example.com
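As an alternative to typing all of that into Telnet, the same raw requests can be scripted. Below is a rough Python sketch (my own addition, not from anyone in this thread): it stands up a tiny local stand-in server that applies the host check described above, then sends hand-built requests over a plain socket and prints each status line. It's a toy model of the behaviour being discussed, not of Apache itself.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

CANONICAL = "www.example.com"  # placeholder hostname from this thread

class Handler(BaseHTTPRequestHandler):
    # Stand-in for the .htaccess behaviour described above: 301 only
    # when a Host header arrived and it isn't the canonical name.
    def do_GET(self):
        host = self.headers.get("Host", "")
        if host and not host.startswith(CANONICAL):
            self.send_response(301)
            self.send_header("Location", "http://%s%s" % (CANONICAL, self.path))
        else:
            self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging for the demo

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def probe(raw_request):
    """Send a hand-built HTTP request, return the response status line."""
    with socket.create_connection(server.server_address) as s:
        s.sendall(raw_request.encode())
        return s.recv(4096).decode().splitlines()[0]

# True HTTP/1.0: no Host header at all.
print(probe("GET / HTTP/1.0\r\n\r\n"))
# "Extended" HTTP/1.0, Googlebot-style: Host header present.
print(probe("GET / HTTP/1.0\r\nHost: example.com\r\n\r\n"))
# HTTP/1.1, already on the canonical hostname.
print(probe("GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"))
```

To probe a real server, you would connect to its address and port 80 instead of the toy server; the request text is identical to what you'd type into Telnet.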
Jim
#1 GET / HTTP/1.0
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Sends a 301 redirect to www.example.com
-------------------------------------------------------
#2 GET / HTTP/1.0
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Host: example.com
Sends a 301 redirect to www.example.com
---------------------------------------------------------
#3 GET / HTTP/1.1
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Host: www.example.com
Sends a 200 and displays the page.
-----------------------------------------------------
GET / HTTP/1.0
User-agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Host: www.example.com
Sends a 200 and displays the page.
----------------------------------------
If I post a sanitized example copy of my virtual host container, can you take a look at it? Obviously I've been doing something wrong, which is probably why my rankings tanked. The way they set this up is confusing me...
phish