Blank header fields

         

dstiles

10:40 am on Oct 11, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In the past I've blocked the four major, traditional header fields if a certain combination were empty (not naming them because it may help hackers). Over the past few months I've seen repeated genuine accesses that have gradually eroded the usefulness of those traps, especially from UK education proxies. Today I finally accepted they could ALL be empty.

Anyone else finding this or am I just unlucky?

lucy24

11:15 pm on Oct 11, 2017 (gmt 0)


Empty or absent? Don't know about you, but my access-control rules don't--because they can't--make a distinction. Casual stroll through header logs turns up a fair number of empty User-Agent fields, to which one can only say Meh, screw 'em.

How many IPs are involved? Could you toggle the relevant flags if the request comes from specified ranges, and/or there are other headers that point to a legitimate proxy?
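The idea of switching the trap off for specified ranges could be sketched roughly as below. This is only an illustration in Python, not anyone's actual rule set: the ranges are documentation placeholders (RFC 5737), and the names `TRUSTED_PROXY_RANGES`, `is_trusted_proxy` and `should_block` are made up for the example.

```python
import ipaddress

# Hypothetical trusted ranges. These are RFC 5737 documentation blocks,
# standing in for whatever proxy ranges you have verified yourself.
TRUSTED_PROXY_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_trusted_proxy(remote_addr: str) -> bool:
    """Return True if the client address falls inside a trusted range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # an unparseable address is never trusted
    return any(addr in net for net in TRUSTED_PROXY_RANGES)


def should_block(remote_addr: str, empty_header_trap_fired: bool) -> bool:
    """Apply the empty-header trap unless the request is from a trusted range."""
    return empty_header_trap_fired and not is_trusted_proxy(remote_addr)
```

With these placeholder ranges, `should_block("192.0.2.44", True)` lets the request through while the same trap still blocks an address outside the list.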

keyplyr

2:40 am on Oct 12, 2017 (gmt 0)


There are some apps that send empty header fields. I don't block them.

lucy24

4:20 am on Oct 12, 2017 (gmt 0)


How, if at all, do you distinguish between empty and absent?

Now, one thing I do block is the garbled header, like
User-Agent: User-Agent: blahblah

or
Accept-Language: blahblah Some-Other-Header-Name: blahblah
I don't currently find any of the latter, but they must once have been part of a widely circulated robot script--common enough to provoke me to block it.
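A garbled header of this kind can be spotted by looking for something shaped like a header name, followed by a colon and a space, inside the value. The Python below is a rough heuristic sketch, not the rule actually used on any server; the pattern and the name `is_garbled` are assumptions for illustration, and a real deployment would want to test it against its own logs for false positives.

```python
import re

# Heuristic: a token that looks like "Header-Name: " either at the very start
# of the value or after whitespace. The trailing space requirement keeps
# ordinary values such as "https://example.com/" from matching.
EMBEDDED_HEADER_NAME = re.compile(
    r"(?:^|\s)[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*:\s"
)


def is_garbled(value: str) -> bool:
    """Return True if a header value appears to contain another header name."""
    return EMBEDDED_HEADER_NAME.search(value) is not None
```

Here the value of the first example above is `User-Agent: blahblah` (the name was sent twice), and the value of the second is `blahblah Some-Other-Header-Name: blahblah`; both are flagged.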

keyplyr

4:36 am on Oct 12, 2017 (gmt 0)


"Casual stroll through header logs turns up a fair number of empty User-Agent fields, to which one can only say Meh, screw 'em"
I do block empty/missing UA fields.

"How, if at all, do you distinguish between empty and absent?"
Can't with the log data offered by most shared environments.

If you format the logs on your own web server, there's a step (if I remember correctly) to put a character, like a dash, in empty fields so you can tell the difference between an empty field and one that was not sent. But I haven't managed a front-facing web server in years so...
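Where the raw header lines are available, the empty-versus-absent distinction is easy to make in code. The following Python is a minimal sketch under that assumption; `parse_headers` and `header_state` are hypothetical helper names, and the input is assumed to be the header lines only, without the request line.

```python
def parse_headers(raw: str) -> dict:
    """Parse raw header lines into a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line, skip it
        headers[name.strip().lower()] = value.strip()
    return headers


def header_state(headers: dict, name: str) -> str:
    """Classify a header as 'absent', 'empty' or 'present'."""
    value = headers.get(name.lower())
    if value is None:
        return "absent"  # the header line was never sent
    if value == "":
        return "empty"   # the header line was sent with no value
    return "present"
```

For a request carrying `User-Agent:` with nothing after the colon and no `Accept-Language` line at all, `header_state` reports the first as empty and the second as absent.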

dstiles

10:24 am on Oct 12, 2017 (gmt 0)


Sorry, folks, I should have said "not counting user-agent". I block those if they are empty. And no, I don't detect the difference between missing and empty. I was referring to the other four traditional headers. I know at least one of them is now deprecated but the others used to be a good indicator in combination; now apparently no longer.

lucy24

6:44 pm on Oct 12, 2017 (gmt 0)


"put a character, like a dash, for empty fields so you can tell the difference between an empty field and one not sent."
Yes, that's how my logs come out. But when assessing the header with things like mod_setenvif, all you can say is "has a value" or "has no value" (!.); it can't distinguish between empty and absent. Fortunately I've yet to find a situation where it makes a difference.

My current access controls require three specific header fields to be present, unless the request comes from a named, authorized robot. It may be different on your own server; on shared hosting the "Host" and "Connection" headers are always present so I don't consider those. (Don't know about Connection, but with no Host header the request simply doesn't reach the appropriate vhost envelope.) Mobiles are also troublesome because they may either omit one header, or send a value that's otherwise only seen in robots. And there's no way in ### I can distinguish between genuine, currently-in-use Androids, and bogus ones.
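A rule of the "three required headers unless authorized robot" shape might look something like the Python sketch below. The three header names and the robot name are placeholders chosen for illustration; the thread deliberately does not say which headers are checked. Like the rules described above, it treats an empty value and a missing header the same way. Matching a robot by User-Agent alone is easy to spoof, so a real rule would normally pair it with an IP check.

```python
# Hypothetical choices: the actual required headers are not named in the thread.
REQUIRED_HEADERS = ("accept", "accept-language", "accept-encoding")

# Hypothetical allowlist of robot names, matched as User-Agent substrings.
AUTHORIZED_ROBOTS = ("examplebot",)


def is_allowed(headers: dict) -> bool:
    """Allow named robots; otherwise require every listed header to have a value."""
    normalized = {k.lower(): v.strip() for k, v in headers.items()}
    user_agent = normalized.get("user-agent", "").lower()
    if any(bot in user_agent for bot in AUTHORIZED_ROBOTS):
        return True
    # An empty string and a missing key are both falsy, so empty == absent here.
    return all(normalized.get(name) for name in REQUIRED_HEADERS)
```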

I also have an environment variable I call botheader which flags the presence of certain headers, including a number of common misspellings (leaving me in doubt about the intelligence of people who write robot scripts) as well as values that no human would ever send.
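The botheader idea can be sketched in Python as a simple lookup. Only the name `botheader` comes from the post; the misspelled header names and the suspect value below are invented placeholders, since the real lists are not given.

```python
# Hypothetical header names a browser would not send (misspellings and the like).
SUSPECT_HEADER_NAMES = frozenset({"referrer", "useragent", "accept-langauge"})

# Hypothetical per-header values treated as robot-only; purely a placeholder.
SUSPECT_HEADER_VALUES = {
    "accept-language": frozenset({"zz-zz"}),
}


def botheader(headers: dict) -> bool:
    """Return True if any header name or value is on the suspect lists."""
    for name, value in headers.items():
        key = name.strip().lower()
        if key in SUSPECT_HEADER_NAMES:
            return True
        if value.strip().lower() in SUSPECT_HEADER_VALUES.get(key, ()):
            return True
    return False
```

A request that sends `Referrer` instead of `Referer` would set the flag, while a correctly spelled header with an ordinary value would not.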