
Forum Moderators: phranque

Can anyone explain this partial website file access behavior?

web logs show strange file-access pattern

     
12:54 am on Oct 8, 2018 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 8, 2016
posts:69
votes: 0


A hit to my company website results in a total of 32 files being delivered to the requesting browser: one default.html file, one favicon.ico file, and 30 gif files. I recognize a legit/human web-hit primarily when the complete set of these files is downloaded to a believable IP address using a legit-looking user-agent. A referrer like google or bing is almost always present too - but sometimes there is no referrer. Now if someone decides to browse the site, then more files will be downloaded. If they go away, that's all I'll see in the logs.

Starting July 17 this year, I started to see rare hits where only the default.html file and two specific gif files were downloaded. These 2 gifs are usually the first 2 that are normally downloaded when a "full hit" happens. I've never seen bots do this - they go after the html files and ignore the gifs.
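
The full-hit vs partial-hit distinction can be checked mechanically. A minimal sketch, assuming log lines already parsed to (IP, path) tuples - the gif file names here are placeholders, not the OP's real files:

```python
from collections import defaultdict

# Hypothetical names; substitute the site's real first two gifs.
PARTIAL_SET = {"/default.html", "/img/logo.gif", "/img/banner.gif"}
FULL_COUNT = 32  # one html + favicon + 30 gifs on a complete page load

def classify_hits(records):
    """records: iterable of (ip, path) tuples from the access log.
    Returns {ip: 'full' | 'partial' | 'other'}."""
    by_ip = defaultdict(set)
    for ip, path in records:
        by_ip[ip].add(path)
    result = {}
    for ip, paths in by_ip.items():
        if len(paths) >= FULL_COUNT:
            result[ip] = "full"
        elif paths == PARTIAL_SET:
            result[ip] = "partial"   # the suspicious 3-file pattern
        else:
            result[ip] = "other"
    return result
```

Grouping by bare IP is a simplification; a real pass would also window by time so two visits from one IP aren't merged.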

Looking at these hits in detail - the user agent is one of these:

a) Intel Mac OSX 10_x_something (where x is 11 or 12 or 13) - about 20%
b) Windows NT 10 (about 50%)
c) Windows NT 6.1 (about 30%)

In all cases, browser is Chrome.

There have been 26 such hits, all from different IPs, the last being Oct 3. At most there are 7 or 8 days between hits, sometimes not quite 24 hours. Typical seems to be 3 to 5 days.

21 of the IPs are from western countries (7 US, 6 Canada); the other 5 are what I would call third-world. Two are major biotech companies, 1 is a US university, and the rest seem to be a mix of residential and business big-ISP broadband (Verizon, sbcglobal, Comcast, Roadrunner, Bell, etc). In at least one case I got a hit last year from the exact same IP, and that hit looked "normal".

I have a theory: because the site is http (not https), something in the browser or some network device at the user's location has decided to prevent the user from surfing our site, and only the 3 files in question made it out to the user (or their network) before the session was terminated.

Does this sound legit? Or is it something else? Some sophisticated web-caching going on at the user's end, where they already have a copy of our files somehow?
2:45 am on Oct 8, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:12913
votes: 891


> Does this sound legit? Or is it something else? Some sophisticated web-caching going on at the user's end, where they already have a copy of our files somehow?
Sounds like a bot spoofing a browser.

I hope you're actually examining your server access log and not just getting your information from some analytics report.

If it were legit browser caching, you'd see a lot of 301s. A mobile browser sometimes uses a network cache where you wouldn't see 301 requests, but you said the OS was not mobile & the browser was not mobile... so it's a bot.

Most of Your Traffic is Not Human [webmasterworld.com]

Blocking Methods [webmasterworld.com]
3:09 am on Oct 8, 2018 (gmt 0)

Administrator from US 

WebmasterWorld Administrator not2easy is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 27, 2006
posts:4161
votes: 262


Your 'partial load' sounds as good as any other explanation, especially when the IPs are not typically known for bots and when the browser signals may scare some people away. With various apps, extensions and sharing tools out there, it can be best guess time. Sometimes it is bot activity and sometimes it is the way people use the site. If you're not seeing signs of unwanted use (such as hotlinking) then I'd make note of it and see what else can be determined.
9:39 am on Oct 8, 2018 (gmt 0)

keyplyr (Senior Member from US)


Botnets will typically use the same, or nearly the same UA but come from a variety of compromised IPs, some company, some server farm and some ISPs.

However, your point about the browser blocking access after a couple round-trip requests being fulfilled may be a valid one, especially with Chrome.

I guess you have a reason for not using HTTPS, but expect more and more issues going forward. I think browsers will eventually block nonsecure sites outright. Of course that may be irrelevant if the search engines omit them from the index altogether.

What Will Happen if I Don't Switch to HTTPS? [webmasterworld.com]

Downsides of not using HTTPS [webmasterworld.com]

Why HTTPS Matters [developers.google.com]
12:33 pm on Oct 8, 2018 (gmt 0)

Preferred Member from CA 


joined:Feb 7, 2017
posts:521
votes: 46


If you log request headers, you could check these hits to see if they include a language. No language would suggest a bot; all human-run browsers should send one.

The volume of these odd requests is very low. I suspect this is a bot and not some human browser culling your site's contents.
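
If the Accept-Language header does make it into the logs, the check above is easy to script. A sketch assuming each parsed request is a dict with an optional "accept_language" key (field names are illustrative, not IIS's):

```python
def flag_missing_language(requests):
    """Return the IPs whose requests never carried an Accept-Language
    header - a common (though not conclusive) bot signal."""
    seen, with_lang = set(), set()
    for req in requests:
        seen.add(req["ip"])
        if req.get("accept_language"):
            with_lang.add(req["ip"])
    return seen - with_lang
```
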
3:41 pm on Oct 8, 2018 (gmt 0)

Junior Member (OP)


> I hope you're actually examining your server access log and
> not just getting your information from some analytics report.

I only look at raw IIS log files, one log file created each day for that day's hits. Usually these files run from 500 to 1500 lines (each line being an individual request for either "/" or a specific file). Google and bing bots probably account for 25% of the lines in a given log file. Weekends and holidays tend to generate smaller files. These numbers include IP pre-filtering in our Ubiquiti ER3-Lite router, which is currently blocking about 350 million IPv4 addresses (in CIDR blocks ranging from /24 to /8). We only have IPv4 internet access, through a single static IP.
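
For anyone scripting this kind of line-by-line review: IIS's W3C extended log format is self-describing - a `#Fields:` comment names the columns and data lines are space-separated. A minimal parser along those lines (exact fields vary by IIS version and configuration, and IIS4 may need the add-on filter the OP mentions to log some of them):

```python
from collections import Counter

def parse_w3c_log(lines):
    """Yield one dict per request from a W3C extended format log."""
    fields = []
    for line in lines:
        if line.startswith("#Fields:"):
            fields = line.split()[1:]   # e.g. date time c-ip cs-uri-stem sc-status
        elif line.startswith("#") or not line.strip():
            continue                    # skip other comments / blank lines
        else:
            yield dict(zip(fields, line.split()))

def hits_per_ip(lines):
    """Count requests per client IP."""
    return Counter(r["c-ip"] for r in parse_w3c_log(lines))
```
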

> If it was a legit browser caching, you'd see a lot of 301s

I've never seen 301 codes in the logs. I do see a moderate amount of 304's though.

> I guess you have a reason for not using HTTPS

I'm trying to get https up and running now on this NT4/IIS4 server. We run no server-side scripts, no e-commerce, no tracking of any sort, no integration with any ad network. It's a basic site whose structure / layout has not changed since it was created in the year 2000, apart from some minor tinkering.

> Of course that may be irrelevant if the Search Engines omit
> them from the index altogether.

Every on-line site safety/security scanner I submit my site to green-lights it with no issues, even though it is http (not https).

I see outfits like Trend Micro and Symantec hit our site every once in a while (Symantec has a division in Ireland that does this?) and I white-list them when I can identify them as such (vs being a random bot). BTW, I am blocking a lot of Google IP space (I think in the 35/8 network) - the rDNS comes back as "google-user" or google-user-content? I'm assuming that's a bot running on rented Google servers, and that blocking those has no implication for how google-search relates to us. Yes/no?
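
On that question: Google's documented way to tell its real crawler from something merely running on Google's cloud is a reverse-then-forward DNS check. A sketch (run it from any convenient machine, not necessarily the NT4 box):

```python
import socket

def looks_like_google_host(host):
    """True only for hostnames in Google's own crawler domains.
    Names ending in googleusercontent.com are Google Cloud customers
    (rented servers), not Google's crawler."""
    return host.endswith((".googlebot.com", ".google.com"))

def is_real_googlebot(ip):
    """Reverse-resolve the IP, check the domain, then forward-resolve
    the hostname and require that it maps back to the same IP."""
    try:
        host = socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return False
    if not looks_like_google_host(host):
        return False
    try:
        return ip in socket.gethostbyname_ex(host)[2]
    except socket.gaierror:
        return False
```

By this test, blocking googleusercontent.com ranges should not affect Googlebot crawling - though as always, verify against your own logs first.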

A google search for about a dozen keywords that we consider useful to our business always results in our site coming up on the first page of results, and could be as many as 5 of the first 10 results point back to us in some way. It's been this way for years, I haven't seen this change during the past few months.

> If you have implemented request headers you could check them to see if they have a language.

I'll have to see if there's a way to do that in IIS4. I remember that when the server was originally set up in like 1997 or 98, IIS didn't have the ability to log the requesting IP. I downloaded (from somewhere) an isapi filter dll that allowed IIS to log the requesting IP and referrer information.
1:32 am on Oct 9, 2018 (gmt 0)

Junior Member (OP)


Had a look at more recent logs today.

On Oct 6 another example of the mystery hit came in from Mount Sinai School of Medicine. User-agent was:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36

Yesterday afternoon another mystery hit from Emory University. User-agent was:

Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36

Between Feb 2016 and March 2018 there were 6 pdf requests from Emory (our pdf files are reprints of scientific papers). The last actual website hit was in January 2016 (and that one looked normal).

Then yesterday evening there were two hits from University of New South Wales (Australia). One was using Windows NT 10, the other was NT 6.1. Both using Firefox 62. Those were normal hits - all normal site files were requested. The referrer from one of them was duckduckgo.

Will be continuing this coming week to get https up and running and see if that changes anything. I'd like to know whether others operating http-only sites (which I admit are probably few to none - maybe some hobby / personal sites?) are seeing aborted site-access attempts like I'm describing, or whether there is otherwise solid info that *something* has changed in user, browser, or corporate / institutional firewall behavior when accessing non-https websites.
1:41 am on Oct 9, 2018 (gmt 0)

keyplyr (Senior Member from US)


> Had a look at more recent logs today...

That's how a botnet would work.

Good luck with the HTTPS security upgrade
1:52 am on Oct 9, 2018 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11567
votes: 182


> I do see a moderate amount of 304's though.

That should indicate a real browser that supports caching - are you seeing 304 responses from any of the same visitors that previously showed the partial access behavior?
2:33 am on Oct 9, 2018 (gmt 0)

Junior Member (OP)


I haven't yet seen examples where the partial-access people came back for another go at the site.

I have to think this (below) is the reason for what I'm seeing (as I said in first post, I started seeing this July 17).

And by the way, if this was bot activity, with nothing at all to do with http vs https, then some of you should be seeing this same behavior in your logs. And I've never seen bot activity from universities this heavy before (the exception being U-Mich and their IoT scanning).

============
[theverge.com...]

Feb 2 / 2018

Chrome will mark all HTTP sites as ‘not secure’ starting in July

Starting in July, Google Chrome will mark all HTTP sites as “not secure,” according to a blog post published today by Chrome security product manager Emily Schechter. Chrome currently displays a neutral information icon, but starting with version 68, the browser will warn users with an extra notification in the address bar. Chrome currently marks HTTPS-encrypted sites with a green lock icon and “Secure” sign.

Google has been nudging users away from unencrypted sites for years, but this is the most forceful nudge yet.

=====================

"Without that encryption, someone with access to your router or ISP could intercept information..."

Honestly, is that really a practical threat when you're at home, or using a hard-wired desktop/laptop?

Google has scanned my site a gazillion times and knows it inside and out, and Chrome has to put up a warning even though google search lists my site first or second (after a wikipedia entry), and third and fourth and fifth?

[edited by: phranque at 5:17 am (utc) on Oct 9, 2018]
[edit reason] fair use [/edit]

3:26 am on Oct 9, 2018 (gmt 0)

keyplyr (Senior Member from US)


> > I do see a moderate amount of 304's though.
>
> that should indicate a real browser that supports caching

Not with these requests.

> I've never seen bot activity from universities this heavy before (exception is U-Mich and their IoT scanning).

A "botnet" is not a normal bot. Botnets come from compromised ISP accounts, server farm accounts, shared hosting accounts, schools, and other places that have been infected with malicious code.

The code turns that server into a drone that sends requests out for specific reasons, usually to look for vulnerabilities on other servers, and so it goes on & on.

The requests from these compromised accounts usually come in bunches, requesting similar files, and will show in logs as coming from a dozen or more IP addresses from various sources.

Lists of these compromised accounts are traded & sold on the dark web to other bad actors.

I deal with this all the time.
3:39 am on Oct 9, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator martinibuster is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Apr 13, 2002
posts:14872
votes: 478


Chrome may be blocking users from your site because of the http.

I've been unable to access some sites because of the http issue. Chrome blocks the page entirely because it's insecure. I can use another browser to access it - but not Chrome.

Doesn't happen to all sites, just some sites.

Chrome can prefetch files; then the http issue hits and it stops. At that point, when the user clicks the link, Chrome prevents the user from reaching your site.
3:52 am on Oct 9, 2018 (gmt 0)

Junior Member (OP)


I get tons of requests for files and file-paths that I don't have on my site and have never had. I know what those hits look like. Sometimes they have long / weird strings of stuff in the user-agent or referrer. Or sometimes they do referrer spam. Or sometimes they scrape the site - only the .html files. I can tell you it's quite rare to see that coming from a US .EDU or medical center. Coming from India or China or Russia or Ukraine - sure.

I've only really been looking at each line of my log files in depth, every day, since mid-2015. Seeing a hit to default.html followed by only 2 gif files (always the same 2 gif files) is completely new behavior, and one that has zero usefulness from a bot / scraping POV. If this is distributed bot behavior, then why are they always grabbing the same 3 files? Why not the entire site as a group effort?

Is there a service (like spamhaus or senderbase for smtp spam) where I can submit an IP address and see if it has been identified or suspected of being a web bot/crawler?
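
One web-focused analogue of those SMTP blocklists is Project Honey Pot's http:BL, which works DNSBL-style: reverse the IP's octets, prepend a (free) API key, and query their DNS zone. A sketch following their published format - the key here is a placeholder, and the response decoding assumes their documented 127.days.threat.type layout:

```python
import socket

HTTPBL_KEY = "yourkeyhere"  # placeholder; requires free registration

def httpbl_query_name(ip, key=HTTPBL_KEY):
    """Build the http:BL lookup name: key + reversed octets + zone."""
    reversed_ip = ".".join(reversed(ip.split(".")))
    return f"{key}.{reversed_ip}.dnsbl.httpbl.org"

def check_httpbl(ip, key=HTTPBL_KEY):
    """Return (days_since_last_seen, threat_score, visitor_type) if the
    IP is listed, or None if not listed (lookup fails / NXDOMAIN)."""
    try:
        answer = socket.gethostbyname(httpbl_query_name(ip, key))
    except socket.gaierror:
        return None
    octets = answer.split(".")
    return int(octets[1]), int(octets[2]), int(octets[3])
```

Note this catalogs IPs seen misbehaving against their honeypots; as keyplyr says below, freshly compromised botnet nodes may not be listed anywhere yet.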
3:57 am on Oct 9, 2018 (gmt 0)

Junior Member (OP)


> Chrome can prefetch files then the http issue hits and it stops.
> At that point when the user clicks the link Chrome prevents the
> user from reaching your site.

I personally don't use Chrome, but I've just asked someone who does to look at the site and tell me what he sees. He says there is a little icon or something saying the site is not secure, but it's not putting up any message box telling the user they can't continue or some such garbage. If Chrome is preventing you from browsing to an http site - is it because you have enabled some other security setting? I'd like to know what Chrome 68/69's default behavior is regarding http browsing.
4:13 am on Oct 9, 2018 (gmt 0)

keyplyr (Senior Member from US)


> Is there a service (like spamhaus or senderbase for smtp spam) where I can submit an IP address and see if it has been identified or suspected of being a web bot/crawler?

Not for botnets. As previously stated, these are compromised accounts that will presumably be fixed at some point.

Usually a bot that pretends to be a human browser is malicious, so these aren't registered ranges.

Again, most of this info is outlined here: [webmasterworld.com...]
7:45 am on Oct 9, 2018 (gmt 0)

martinibuster (Moderator from US)


> I've just asked someone who does have it to look at the site and tell me what he sees.


Ask him to search for your keywords and click from the SERPs. That's how it's manifested to me.

It doesn't happen all the time, only to a few sites.

It's not consistent either. I've reached insecure news sites from Google News with no problems.
2:45 pm on Oct 9, 2018 (gmt 0)

Junior Member (OP)


> Ask him to search for your keywords and click from the SERPs. That's how it's manifested to me.

But then how is that browser-specific? Does Google give a user a different experience when they click on a "non-secure" SER link if they're using Chrome vs Firefox?
3:19 pm on Oct 9, 2018 (gmt 0)

martinibuster (Moderator from US)


Yes.
SERPs have frequently been different depending on browser. For years now.

Not limited to SERPs either. Googling on a VPN with Chrome can get you more CAPTCHAs than doing it on IE.
4:41 pm on Oct 9, 2018 (gmt 0)

Junior Member (OP)


> SERPs have frequently been different depending on browser.

What I mean is: I know you can get different SER's depending on what browser you use, even if the search query is exactly the same. But if one of the SER's is the exact same URL, and you click on it in one case with Chrome and in another with FF, will google throw up a warning or block you from proceeding with Chrome but not with FF?

By the way, I have confirmation that a google search for keywords pointing directly to our site, followed by clicking on those SER's, brings you to our site with no warning that the site is "insecure". This is with Windows / Chrome. The only indication that something is "wrong" is a "not secure" label at the left end of the url address bar. Presumably the entire page loads (all 32 files), not just the 3 files I'm seeing (the page wouldn't render correctly if only 2 of the 30 gifs were downloaded).

Could it be that this has to be tried on a system that has never hit our site before to fully test this theory? Maybe clear the cache and browser history to make it a real test? Maybe Chrome keeps some sort of reputation score for sites browsed in the past and will not throw up warnings or prevent access to http sites that were hit before?
7:19 pm on Oct 9, 2018 (gmt 0)

keyplyr (Senior Member from US)


> [Is] google throwing up a warning or is blocking you from proceeding when you're using chrome but not with FF?

The warnings are browser generated.

The search index doesn't create the warnings; each browser is responsible for what is displayed as far as safety warnings go.
 
