
Apache Web Server Forum

    
HTTP_HOST for HTTP/1.0 browser
Variable




msg:1503873
 6:23 pm on Oct 28, 2004 (gmt 0)

I have a site with several subdomains:

www.domain.com
subdomain1.domain.com
subdomain2.domain.com
etc.

and am using PHP to determine whether the URL is www.domain.com or one of the subdomains:

if ($_SERVER["HTTP_HOST"] == "domain.com" || $_SERVER["HTTP_HOST"] == "www.domain.com")
{
// URL is www
}
else
{
// assume URL is a subdomain
}

The above code works fine for the vast majority of my clients. My problem is that some clients, which I know use HTTP/1.0, fail this test: when they go to www.domain.com, the PHP code executes the else branch instead of the "www" branch.
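For illustration, here is a minimal sketch of a more defensive version of the check above. The function name classifyHost and its 'unknown' return value are hypothetical, not part of the original code. It assumes the else branch is reached because HTTP_HOST is missing or empty, and it reports that case separately instead of treating it as a subdomain.

```php
<?php
// Hypothetical helper: classify the requested host from a $_SERVER-style array.
// Returns 'www', 'subdomain', or 'unknown' when no Host header was sent.
function classifyHost(array $server, string $baseDomain = 'domain.com'): string
{
    // HTTP_HOST is only populated when the client sent a Host header.
    $host = isset($server['HTTP_HOST']) ? trim($server['HTTP_HOST']) : '';
    if ($host === '') {
        return 'unknown';
    }

    // Host names are case-insensitive and may carry a port ("www.domain.com:80").
    $host = strtolower($host);
    $host = preg_replace('/:\d+$/', '', $host);

    if ($host === $baseDomain || $host === 'www.' . $baseDomain) {
        return 'www';
    }

    // Like the original code, assume anything else is a subdomain.
    return 'subdomain';
}
```

Calling classifyHost($_SERVER) in place of the if/else makes the "no hostname at all" case visible, so it can be handled on its own.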

I searched the forums and found this comment:

"You may have had problems because some clients were using HTTP/1.0, which does not provide the Hostname request header tested by http_host"

So two questions I have:
1. Do HTTP/1.0 clients send an HTTP_HOST?
2. If they don't, is there a way I can get the URL they are requesting using PHP?

Many thanks in advance.

 

jdMorgan




msg:1503874
 7:18 pm on Oct 28, 2004 (gmt 0)

1. Do HTTP/1.0 clients send an HTTP_HOST?
2. If they don't, is there a way I can get the URL they are requesting using PHP?

1. Not reliably. HTTP/1.0 does not require it, so some clients omit it.
2. No. If the client does not send it, the information simply isn't in the HTTP/1.0 request.

This was a major reason for HTTP/1.1.
HTTP/1.0 does not allow for the use of shared IP addresses among several domains.
HTTP/1.1 added the Host header to HTTP requests to support shared virtual hosting.

You can detect HTTP/1.0 and redirect those visitors to a special page where they can select the desired "domain" through links that point directly to the correct subdirectory. Do not do this redirect if the visitor is already requesting a subdirectory; otherwise he or she will be stuck in a loop.

Also do not allow search engine spiders to index this special page, or massive duplicate-content problems will ensue.

And... to prevent problems, I suggest you detect HTTP/1.0 by detecting the blank hostname. Some search engine spiders claim to be HTTP/1.0 clients but can actually handle HTTP/1.1. If they supply a hostname, treat them as HTTP/1.1-capable.
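A rough sketch of that decision in PHP might look like the following. The function name chooserRedirect and the page path /choose-site.php are made up for the example, and "requesting a subdirectory" is assumed to mean that the path contains a second slash after the leading one.

```php
<?php
// Hypothetical helper: decide whether a request should be sent to the
// "choose your site" page. Returns the chooser path, or null for no redirect.
function chooserRedirect(array $server, string $chooserPath = '/choose-site.php'): ?string
{
    // Test for a blank hostname rather than the advertised protocol version,
    // so clients that claim HTTP/1.0 but do send a Host header are left alone.
    $host = isset($server['HTTP_HOST']) ? trim($server['HTTP_HOST']) : '';
    if ($host !== '') {
        return null;
    }

    // Strip the query string to get the requested path.
    $uri  = isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '/';
    $path = explode('?', $uri, 2)[0];

    // Never redirect the chooser page to itself.
    if ($path === $chooserPath) {
        return null;
    }

    // A slash after the leading one means a subdirectory is already being
    // requested; redirecting again would trap the visitor in a loop.
    if (strpos(ltrim($path, '/'), '/') !== false) {
        return null;
    }

    return $chooserPath;
}
```

In a page script this could be used as: $target = chooserRedirect($_SERVER); and, if $target is not null, header('Location: ' . $target); followed by exit;. The chooser page itself would still need to be kept out of the search indexes, for example with a robots noindex meta tag.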

Jim

gergoe




msg:1503875
 7:33 pm on Oct 28, 2004 (gmt 0)

In HTTP/1.0 the Host header is not required in each request, which is why you may have this problem. When an HTTP/1.0 client requests [domain.com...], the following happens:
  1. It resolves the IP address of www.domain.com
  2. It connects to the resolved IP address on port 80
  3. It sends GET /index.html HTTP/1.0 as the request line, with no Host header
  4. ...

So as you can see, neither the server nor the PHP script knows what the original URL was. The only thing you can do is insert a small piece of JavaScript into the else branch of your PHP code, because the browser still knows the original URL (window.location) and can do a client-side redirect based on it. Of course, this fails if the browser has no JavaScript support...
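To make the point concrete, here is a small sketch that pulls the Host header out of the raw text of a request. The function hostFromRawRequest and the two sample requests are invented for the illustration; PHP normally never sees the raw request like this, since the web server parses it and fills in $_SERVER.

```php
<?php
// Hypothetical helper: return the Host header value from raw request text,
// or null when the request carries no Host header.
function hostFromRawRequest(string $rawRequest): ?string
{
    $lines = preg_split('/\r\n|\n/', $rawRequest);
    array_shift($lines); // drop the request line ("GET /index.html HTTP/1.0")

    foreach ($lines as $line) {
        if ($line === '') {
            break; // a blank line ends the header section
        }
        // Split "Name: value" on the first colon only, so ports survive.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), 'Host') === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// A bare HTTP/1.0 request: only the path is sent, not the hostname.
$bare = "GET /index.html HTTP/1.0\r\n\r\n";

// The same request from a client that does send a Host header.
$withHost = "GET /index.html HTTP/1.0\r\nHost: www.domain.com\r\n\r\n";

var_dump(hostFromRawRequest($bare));     // NULL
var_dump(hostFromRawRequest($withHost)); // string(14) "www.domain.com"
```

In the first request the hostname never leaves the client, so nothing on the server side can recover it.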

Variable




msg:1503876
 10:05 pm on Oct 29, 2004 (gmt 0)

Thanks for your help guys. Though now I am curious.

I checked my logs and they show that Googlebot is using HTTP/1.0. Does this mean that any site with:

www.domain.com
subdomain.domain.com

will have both of the above URLs indexed in Google as the same page? I would guess not, as this would cause horrible indexing problems for any site using subdomains. So the next question is: does Googlebot send an HTTP_HOST header?

jdMorgan




msg:1503877
 10:19 pm on Oct 29, 2004 (gmt 0)

And... to prevent problems, I suggest you detect HTTP/1.0 by detecting the blank hostname. Some search engine spiders claim to be HTTP/1.0 clients but can actually handle HTTP/1.1. If they supply a hostname, treat them as HTTP/1.1-capable.

Google is one of those that advertises HTTP/1.0 but is actually HTTP/1.1-capable. You can prove this with a bit of research in Google search results.

Jim

bull




msg:1503878
 6:20 pm on Oct 30, 2004 (gmt 0)

The Googlebot with the User-agent
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

explicitly uses HTTP/1.1, while
Googlebot/2.1 (+http://www.google.com/bot.html)

claims to use HTTP/1.0 but actually uses 1.1 as well.

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved