Missing HTTP HOST

No value at all in HTTP_HOST

dstiles

1:17 am on May 11, 2010 (gmt 0)

I'm used to the occasional IP in HTTP_HOST instead of the target domain name but I've just seen a completely empty one - no IP, no domain name, nix.

Anyone else seen this?

How did it find me? I assume the incoming packets specify the target IP.

This hit from a single UK domestic broadband line has been repeated 58 times in the past four days. None of the usual headers are present. It hits either the home page or non-existent pages that died years ago or never existed at all - I can't tell because there are several sites, new and old, on the target IP.

caribguy

2:36 am on May 11, 2010 (gmt 0)

From http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23:

A client MUST include a Host header field in all HTTP/1.1 request messages. If the requested URI does not include an Internet host name for the service being requested, then the Host header field MUST be given with an empty value. [..] All Internet-based HTTP/1.1 servers MUST respond with a 400 (Bad Request) status code to any HTTP/1.1 request message which lacks a Host header field.


Was HTTP_X_FORWARDED_HOST in the request?

jdMorgan

2:43 am on May 11, 2010 (gmt 0)

If your server has a unique IP address, then it is likely an IP-based server, and can be accessed by its IP address using HTTP/1.0 or even HTTP/0.9. These early versions of the HTTP protocol do not define the HTTP Host request header, leaving HTTP_HOST blank.

If this is the case, then make sure that your domain canonicalization code does not attempt to redirect requests with a blank HTTP_HOST variable to the canonical hostname -- The result can be an 'infinite' loop.
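The loop arises because a client that sent no Host header on the first request will send none on the retry either. A minimal sketch of a guard, in Python purely for illustration: the `canonical_redirect` function and the `www.example.com` hostname are hypothetical, not taken from any server discussed here.

```python
from typing import Optional

CANONICAL_HOST = "www.example.com"  # hypothetical canonical hostname


def canonical_redirect(http_host: Optional[str], path: str = "/") -> Optional[str]:
    """Return a redirect URL, or None when no redirect should be issued."""
    host = (http_host or "").strip().lower()
    # A blank host means the client sent no usable Host header (e.g. HTTP/1.0).
    # Redirecting it would just produce the same blank request again.
    if not host:
        return None
    # Ignore an explicit default port when comparing hostnames.
    if host.endswith(":80"):
        host = host[:-3]
    if host == CANONICAL_HOST:
        return None
    return f"http://{CANONICAL_HOST}{path}"
```

A caller would issue a 301 only when the function returns a URL, and handle the blank-host case separately (serve a default page or return an error).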

If you're on a shared IP address/name-based virtual host, then the sites won't be accessible using HTTP/0.9 or HTTP/1.0, and a 400-Bad Request should be the result, as caribguy states. Check your raw access logs for these requests.

Jim

dstiles

4:57 pm on May 11, 2010 (gmt 0)

Thanks, Caribguy. No forwarding - see below. I'll change the response from 403 to 400 for that condition.

I began logging full environment vars after posting the OP. For reference, the headers given (excluding most locals) are:

AUTH_PASSWORD:
AUTH_TYPE:
AUTH_USER:
CERT_COOKIE:
CERT_FLAGS:
CERT_ISSUER:
CERT_KEYSIZE:
CERT_SECRETKEYSIZE:
CERT_SERIALNUMBER:
CERT_SERVER_ISSUER:
CERT_SERVER_SUBJECT:
CERT_SUBJECT:
CONTENT_LENGTH: 0
CONTENT_TYPE:
GATEWAY_INTERFACE: CGI/1.1
HTTPS: off
HTTPS_KEYSIZE:
HTTPS_SECRETKEYSIZE:
HTTPS_SERVER_ISSUER:
HTTPS_SERVER_SUBJECT:
LOCAL_ADDR: 83.170.nnn.nnn
LOGON_USER:
QUERY_STRING:
REMOTE_ADDR: 78.151.201.nnn
REMOTE_HOST: 78.151.201.nnn
REMOTE_USER:
REQUEST_METHOD: GET
SERVER_PORT: 80
SERVER_PORT_SECURE: 0
SERVER_PROTOCOL: HTTP/1.0

I suspect it's a badly made a) trojan, b) toolbar or c) spyware. Accesses were scattered throughout the 24 hours, so it's obviously a machine that is permanently on. Since many domestic users turn off their machines at night, it could be some script kiddy mucking about, I suppose. Odd that it's only targeted at a single IP of the several I have on the server, and not the server's default IP at that.

Jim - not an IP-based server. It's a virtual server with many domains and several IPs. A request with an IP instead of a domain name is blocked and returns an error code. Since it's HTTP/1.0 I suppose it could be just a stupid older bot, but I haven't seen a missing domain before, even on HTTP/1.0. And, of course, there are no other headers and certainly not a UA.

caribguy

5:23 pm on May 11, 2010 (gmt 0)

I've seen odd traffic from that service provider. Was it a common UA? Maybe consider whitelisting.

Note that a missing (not simply blank) HTTP_HOST is only illegal according to the 1.1 protocol.
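To make the missing-versus-blank distinction concrete, here is a rough Python sketch of the RFC 2616 rule quoted earlier. The `host_status` function is hypothetical, and it assumes you have the raw request headers as a mapping; in a CGI-style environment a missing header and an empty one may both show up as a blank HTTP_HOST, so the check may need to be done closer to the server.

```python
from typing import Mapping, Optional


def host_status(protocol: str, headers: Mapping[str, str]) -> Optional[int]:
    """Return 400 when the RFC 2616 Host rule is violated, else None."""
    # Header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in headers}
    # Only HTTP/1.1 requires the header; an empty value still counts as present.
    if protocol.upper() == "HTTP/1.1" and "host" not in names:
        return 400
    return None
```

Under this sketch an HTTP/1.0 request with no Host header passes the protocol check, which matches the request logged above; whether to serve it is then a policy decision rather than a protocol one.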

dstiles

9:30 pm on May 11, 2010 (gmt 0)

It's a basic UK broadband supplier - my brother uses them. Nothing on it that I wouldn't expect from any other BB IP - apart from this.

There WAS no UA and I certainly don't want to whitelist it - no point as it couldn't get a site with its header credentials anyway. :)

If an HTTP/1.0 client hits sites with no domain name in HTTP_HOST then it's not going to get very much from the majority of web sites, which are mostly hosted on virtual servers. Although several bots pretend to be 1.0, they obviously can't function as such unless they include the host.

caribguy

9:46 pm on May 11, 2010 (gmt 0)

"don't want to whitelist it"

I should learn to finish my sentences. I meant whitelisting so as to exclude this guy - you already seem to have that covered :)

Can you reproduce that behavior by telnetting to your port 80?

And agreed on the second paragraph. Even "the occasional IP" is IMO nothing but scanning for vulnerabilities.

dstiles

10:13 pm on May 11, 2010 (gmt 0)

Bad behaviour automatically gets a ban, usually temporary, with the length based on the frequency of misbehaviour. Currently this guy is banned for 72 days and rising! :)

Interesting idea but I'm not really familiar with telnet.

I tried a couple of connections direct to the IP and port but got back a 400 which wasn't generated by me - obviously the server didn't like it. The most relevant response was to GET /index.asp, which returned 400 (Invalid Hostname). That was using HTTP/1.1 - no idea how to change it to 1.0 (Linux command-line telnet).
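For reference, the protocol version is simply whatever is typed at the end of the request line, so in a telnet session entering `GET /index.asp HTTP/1.0` followed by an empty line sends a 1.0 request with no Host header. The same thing can be scripted; the following is a minimal Python sketch, with `build_request` and `send_raw` as hypothetical helper names, and the IP address in the usage note a documentation placeholder.

```python
import socket
from typing import Optional


def build_request(path: str = "/", version: str = "1.0",
                  host: Optional[str] = None) -> bytes:
    """Build a bare GET request; the Host header is omitted when host is None."""
    lines = [f"GET {path} HTTP/{version}"]
    if host is not None:
        lines.append(f"Host: {host}")
    # A blank line (CRLF CRLF) terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_raw(ip: str, request: bytes, port: int = 80,
             timeout: float = 10.0) -> bytes:
    """Send raw bytes to ip:port and read until the server closes the socket."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Something like `send_raw("192.0.2.10", build_request("/index.asp"))` should then show what the server returns to a host-less HTTP/1.0 request. With HTTP/1.1 the server may keep the connection open, in which case the read loop ends with a timeout rather than a clean close.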

dstiles

8:30 pm on May 14, 2010 (gmt 0)

I've just noticed Netcraft Survey are using this technique. They are not getting anything either. :)