More Bot Fun from MSFT

New Headers are in use.

blend27

4:59 pm on Feb 27, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So these 6 IPs have been hitting one of my sites since 06/30/2019.
40.127.111.2x
40.113.85.12x
13.79.160.15x
13.69.199.20x
104.41.228.25x
104.41.225.4x


I posted about it in 2020 in this thread [webmasterworld.com...]
The last time they visited was on 09/01/2020.
-----------------------------
ip: 13.69.199.xx
remote host: 13.69.199.xx
TimeDiff(0)
time: {ts '2020-02-04 06:18:46'}
http_content:
method: GET
protocol: HTTP/1.1
connection: Keep-Alive
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
user-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate, sdch
host: www.example.com
Upgrade-Insecure-Requests: 1
content-length: 0

So this morning they came back with the new set of headers:
-----------------------------
ip: 13.79.160.159
remote host: 13.69.199.207
time: {ts '2021-02-27 02:35:31'}
http_content:
method: GET
protocol: HTTP/1.1
Accept-Language: en-US,en;q=0.5
user-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko
traceparent: 00-e573ada96200d34e8d1ae33b553fbfb1-70d23488b908da4e-00
Request-Id: |e573ada96200d34e8d1ae33b553fbfb1.70d23488b908da4e.
host: www.example.com
connection: Keep-Alive
Request-Context: appId=cid-v1:6aab17a2-3a95-4c54-a40a-4f8a00bcf3e3
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Upgrade-Insecure-Requests: 1
content-length: 0

Notice the 3 new headers included:
traceparent: 00-e573ada96200d34e8d1ae33b553fbfb1-70d23488b908da4e-00 
Request-Id: |e573ada96200d34e8d1ae33b553fbfb1.70d23488b908da4e.
Request-Context: appId=cid-v1:6aab17a2-3a95-4c54-a40a-4f8a00bcf3e3

This is the first time I have seen those headers. I searched my logs to see whether any other browsers send them: no, it has never happened before.

I tested whether stock IE11 on Win7 and Win10 sends these headers. Nope. The bot also somehow claims to support image/webp (sent via the Accept header). Nice try, eh?
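
For anyone who wants to flag this automatically, here is a minimal sketch of the check described above, assuming the request headers are already available as a plain dict. The function name `ie11_mismatches` and the `TRACING_HEADERS` set are made up for illustration; the rules only encode what was observed in this post (an IE11 user-agent combined with tracing headers or an image/webp accept value), not a general bot test.

```python
# Header names seen in the 2021-02-27 request; compared case-insensitively.
TRACING_HEADERS = {"traceparent", "request-id", "request-context"}


def ie11_mismatches(headers):
    """Return reasons why a request claiming to be IE11 does not look like stock IE11."""
    h = {name.lower(): value for name, value in headers.items()}
    if "Trident/7.0" not in h.get("user-agent", ""):
        return []  # not claiming to be IE11, nothing to compare
    reasons = []
    for name in sorted(n for n in TRACING_HEADERS if n in h):
        reasons.append("unexpected header: " + name)
    if "image/webp" in h.get("accept", ""):
        reasons.append("accept advertises image/webp")
    return reasons
```

An empty list means nothing suspicious was found by these two rules; it does not mean the request is a real browser.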

lucy24

8:35 pm on Feb 27, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



:: return to logged headers ::

Traceparent: Hey, there they are, with values all in the form 00-\h{32}-\h{16}-00 or -01. (Most text editors don’t have a shorthand for “hexadecimal” i.e. [\da-f] but SubEthaEdit does, trala.)

One, dated 30 November, came with “Elastic-Apm-Traceparent” which came up in the same search, along with “Tracesearch”.

Another, dated way back on 30 May, came accompanied by Request-Id and Request-Context, with similar values to the above-quoted, including the leading | pipe. I don't find any other Request-blahblah.

The biggest batch of “Traceparent” is dated 15 December, which had an unexpected poignancy since that was my father’s birthday. All the same IP, but various UAs.

All definitely robots.

Edit: On a hunch, I searched for the value
00-[\da-f]{32}-[\da-f]{16}-0[01]
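
The same search can be run outside a text editor. This is a rough sketch in Python, assuming the logged headers are stored one per line as "Name: value", like the dumps earlier in this thread. The helper name `headers_with_trace_value` is hypothetical, and the pattern is anchored so the whole value has to match.

```python
import re

# Same pattern as the search above; re.ASCII keeps \d to the digits 0-9.
TRACE_VALUE = re.compile(r"00-[\da-f]{32}-[\da-f]{16}-0[01]", re.ASCII)


def headers_with_trace_value(lines):
    """Return the lower-cased names of headers whose entire value matches the pattern."""
    names = set()
    for line in lines:
        # Split on the first colon only, so values containing colons stay intact.
        name, sep, value = line.partition(":")
        if sep and TRACE_VALUE.fullmatch(value.strip()):
            names.add(name.strip().lower())
    return names
```

Collecting the header names rather than the matching lines makes it easy to spot when the same value format turns up under a header you were not looking for.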

This yielded a much larger number of hits, attached to the header “Sentry-Trace” which I also don’t remember seeing. Again, all appear to be robots; many have the environment variable “botlang” which means
:: shuffling papers ::
Accept-Language ^en-US,en;q=0\.8,(ru|nl);q=0\.6$

It may all be the same robot--that is, the same robot script--from a handful of different IPs.

blend27

9:25 pm on Feb 27, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I forgot to mention that those 3 new headers have different values on each request, even from the same IP. The only thing that stays the same within a given request is the pair of IDs shared by these two headers:
traceparent: 00-e573ada96200d34e8d1ae33b553fbfb1-70d23488b908da4e-00
Request-Id: |e573ada96200d34e8d1ae33b553fbfb1.70d23488b908da4e.

Not a biggie, though...
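
A small sketch of that comparison, assuming the Request-Id value always has the shape seen in the sample (a leading pipe, then two hex IDs, each followed by a dot). The function name `same_trace` is invented for this example, and nothing here is based on a published format for Request-Id, only on the logged values.

```python
def same_trace(traceparent, request_id):
    """True if Request-Id carries the same two IDs as the middle of traceparent."""
    tp_parts = traceparent.strip().split("-")
    if len(tp_parts) != 4:
        return False  # not the four-part layout seen in the logs
    # Sample shape: "|<32 hex>.<16 hex>." so drop the pipe and the trailing dot.
    rid_parts = request_id.strip().lstrip("|").rstrip(".").split(".")
    return rid_parts == tp_parts[1:3]
```

A request where the two headers disagree would be worth a closer look, since every sample so far has them in step.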

lucy24

11:38 pm on Feb 27, 2021 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Clearly the unifying theme is “trace”. Wonder if it has a legitimate purpose?

:: google, google ::

Oh, will you look at that. w3 dot org has a whole page about Trace Context [w3.org]. Some day I'll be able to read more than two paragraphs without getting a headache. (Today's Signal Achievement was poring over the php dot net page about _SERVER variables. Buried in the Comments section was one they left out, REDIRECT_STATUS. We won't talk about the fact that the comment in question was nine years old, and the said variable has not yet been added to the main text. It gives me something to add to logged headers so I can immediately see which requests were blocked.)
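
For reference, the Trace Context page describes traceparent as four dash-separated hex fields: version, trace-id, parent-id and trace-flags, where the lowest bit of the flags means "sampled". That matches the -00 / -01 endings seen in the logs. Below is a minimal sketch of a parser, with the names `TraceParent` and `parse_traceparent` made up for illustration. It only accepts the plain four-field layout and rejects the all-zero IDs and the "ff" version, which the spec treats as invalid.

```python
import re
from typing import NamedTuple, Optional


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
_FORMAT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Split a traceparent value into its fields, or return None if it is malformed."""
    m = _FORMAT.fullmatch(value.strip())
    if m is None:
        return None
    version, trace_id, parent_id, flags = m.groups()
    # "ff" is not a usable version, and all-zero IDs are invalid.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 1))
```

Called on the value logged in the first post, this gives version "00" with sampled set to False.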