Seeing the hex string at the end of the UA again

coordinated hits to my PDF files

         

SumGuy

12:51 am on Oct 5, 2020 (gmt 0)

5+ Year Member Top Contributors Of The Month



I think I posted about this phenomenon before: seeing hits only to PDF files where the browser UA ended with some sort of hex string code. I'm seeing it again. This time the user-agent is this:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML like Gecko) Chrome/80.0.3987.132 Safari/STRING

Where STRING is a 6-character hex code. Here is a table summarizing these hits so far this year:

108.30.200.x 8/6/20 A B60089 nycmny.fios.verizon.net USA UUNET
75.179.161.x 8/6/20 A E4DF71 insight.res.rr.com USA TWC-10796-MIDWEST
76.88.245.x 8/9/20 B 9BE1F5 dc.res.rr.com USA TWC-20001-PACWEST
66.66.6.x 8/9/20 C D92E83 rochester.res.rr.com USA TWC-11351-NORTHEAST
169.234.245.x 8/9/20 C 6E37BA pv.reshsg.uci.edu USA UCINET-AS
73.157.84.x 8/19/20 D DD2599 hsd1.wa.comcast.net USA COMCAST-7922
87.75.25.x 9/17/20 E 4E53ED (none) UK Vodafone Enterprise U.K.
194.5.192.x 9/17/20 E AF79D6 (none) NL Softqloud GmbH
84.139.105.x 9/28/20 F A4E583 dip0.t-ipconnect.de DE Deutsche Telekom AG
90.120.95.x 9/28/20 F A320AE abo.wanadoo.fr FR Orange
89.158.103.x 9/28/20 F 0DBA89 rev.numericable.fr FR SFR SA
85.168.202.x 9/28/20 F 334ADA rev.numericable.fr FR SFR SA
76.8.203.x 9/29/20 G AC5AB1 (none) USA OFF-CAMPUS-TELECOMMUNICATIONS
172.115.99.x 9/30/20 H E5603C socal.res.rr.com USA TWC-20001-PACWEST
71.228.216.x 10/1/20 I 3F2936 hsd1.tn.comcast.net USA COMCAST-7922
86.246.37.x 10/4/20 J A6153E abo.wanadoo.fr FR Orange
109.221.135.x 10/4/20 J 211CF8 abo.wanadoo.fr FR Orange
92.232.186.x 10/4/20 J BC1968 cable.virginm.net UK Virgin Media Limited

(I'd like to make that list better formatted but all attempts to put spaces or tabs are being rejected)

This started in August. The 3rd column (a single letter from A to J) represents a unique PDF file that was requested. So on Sept 28 there were 4 requests for file "F" from 4 different IPs. The 4th column is the STRING (6 characters) and the rest is the reverse DNS host, country and network owner of the IP.

I don't know what to make of this, except that it looks like a coordinated global effort, seemingly from consumer IP space, to independently retrieve multiple copies of the same file. For what purpose, I don't know. Perhaps to verify that my server is serving the identical file regardless of who is asking for it? The PDF files in question are full reprints of biomedical scientific research papers.
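For anyone who wants to check their own logs for the same pattern, here is a minimal Python sketch. It assumes the suffix is always exactly six hexadecimal characters directly after "Safari/", as in the examples above. The function name `hex_suffix` and the two sample strings are made up for illustration, and pulling the user-agent field out of a log line is left to whatever log format your server writes.

```python
import re

# Assumption: the odd UAs end in "Safari/" followed by exactly six hex characters.
# An ordinary Chrome UA ends in "Safari/537.36", which contains a dot and so
# does not match this pattern.
HEX_SUFFIX = re.compile(r"Safari/([0-9A-Fa-f]{6})$")


def hex_suffix(user_agent):
    """Return the six-character hex suffix in upper case, or None if absent."""
    match = HEX_SUFFIX.search(user_agent.strip())
    return match.group(1).upper() if match else None


# Sample values for illustration only.
odd = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 "
       "(KHTML like Gecko) Chrome/80.0.3987.132 Safari/B60089")
normal = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36")

print(hex_suffix(odd))
print(hex_suffix(normal))
```

Run over the user-agent column of a log, this would flag only the requests with the hex ending and leave normal Chrome hits alone.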

lucy24

4:32 pm on Oct 5, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



<tangent>
all attempts to put spaces or tabs are being rejected
It should work if you put the whole thing into
code           markup with
non-breaking   spaces
(on my OS it’s option-space; dunno how That Other System does it).
</tangent>

not2easy

5:15 pm on Oct 5, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



That previous post was in April, 2019: [webmasterworld.com...]

It is possible, though I would need to test to verify, that it is due to the way the PDF file is requested. If your logs show an earlier hit from the same IP without the hex code, followed by the one with the code, it may be that they did not open the file but instead clicked "Save Link As", so the OS identifies the request with a default (auto-generated) hex string for the command. I am not saying that this is the definitive reason you see the hex string in your logs, only that it is one possibility.

I haven't used Windows for a while so I don't know whether it now offers that "Save Link As" option. It could be that on open, Mac OS or iOS users click 'File > Print > Save as PDF'. Either option is an extra OS request besides the browser's "GET".

SumGuy

3:45 pm on Oct 6, 2020 (gmt 0)

5+ Year Member Top Contributors Of The Month



My web server does not generate these codes or strings. These hits are coming in on HTTP (IIS4) with no referrer and are getting redirected 301 to the HTTPS server (Abyss). I'm going to have a closer look at the abyss logs to see what extra information is being logged there that might be of any use. About half the time for what I think are legit human hits to PDF files I will also see a hit to favicon.ico, but these hits do not do that. I would think that others would also be seeing at least a few of these examples in their logs, from that user-agent, when requesting PDF files.

not2easy

4:23 pm on Oct 6, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



My web server does not generate these codes or strings.
That is not what I suggested. "The OS" is not your server, it is the visitor's Operating System.

It helps a lot in these situations if the server environment is mentioned somewhere in the opening post. I do not use IIS4 servers, though I am aware they have been around for a very long time. It may be why you see things in your server logs that some of us aren't familiar with. I cannot help in this case and would not have tried if I had known that. Sorry for the confusion.

tangor

7:55 pm on Oct 6, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Amazing how these codes all look like hex color codes. Not helpful, just interesting.

lucy24

8:54 pm on Oct 6, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



these codes all look like hex color codes

Goes with the six-digit territory. Any six-digit hexadecimal will work out to a color, even if some of them are admittedly pretty ugly colors.

:: detour for global replace involving <span style = "background-color: #\1;">\1</span><br> ::

Hm. Surprisingly garish overall, leaning towards blues and purples. Make of this what you will.

Sorry, SumGuy, but it was just irresistible.
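For anyone who wants to repeat the color detour without a text editor's global replace, here is a rough Python sketch. The codes are copied from the table in the opening post. The function names `swatch` and `rgb` are made up for illustration, and the markup is just one way of writing the span.

```python
import re

# The 18 hex strings from the table in the opening post.
codes = [
    "B60089", "E4DF71", "9BE1F5", "D92E83", "6E37BA", "DD2599",
    "4E53ED", "AF79D6", "A4E583", "A320AE", "0DBA89", "334ADA",
    "AC5AB1", "E5603C", "3F2936", "A6153E", "211CF8", "BC1968",
]


def swatch(code):
    """Wrap a six-digit hex code in a span that uses it as the background color."""
    if not re.fullmatch(r"[0-9A-Fa-f]{6}", code):
        raise ValueError(f"not a six-digit hex code: {code!r}")
    return f'<span style="background-color: #{code};">{code}</span><br>'


def rgb(code):
    """Split a six-digit hex code into (red, green, blue) integers from 0 to 255."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


# One line of HTML per code, ready to paste into a test page.
html = "\n".join(swatch(code) for code in codes)
print(html)
```

The `rgb` helper is only there to make it easy to eyeball which channel dominates in each code.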