In recent months, I've noticed a strange dance performed by none other than AWS (always ap-southeast-1) and Bytespider. Initially the pattern involved a robots.txt-only hit from AWS running Bytespider, followed by a robots.txt-only hit from a non-AWS address (let's call those Independents) running a non-bot UA. This went on for ages until recently when the pattern changed to two AWS+Byte hits together, no Indies.
Here's the interesting bit: the timing.
Every. Single. Time. the second robots.txt-only hit follows the first by four minutes, give or take a few seconds. In ALL cases, and I've tracked many scores of these: both the original AWS+Indie pairs and, thereafter, the AWS+Byte pairs. Four minutes.
Well, I think it's interesting:)
Those of you who save your logs, take a look back at ".ap-southeast-1.compute.amazonaws.com" hits running Bytespider and asking for robots.txt. Then look about four minutes further on for a second robots.txt-only hit. See? Anybody?
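If you'd rather not eyeball it, here is a rough Python sketch of that search. It assumes an Apache/Nginx "combined" style log with reverse-DNS hostnames in the first field; the regex, the function names (parse, find_pairs) and the 30-second slack are my own assumptions, so adjust them to your setup. It also doesn't check that robots.txt was the ONLY thing each host asked for.

```python
import re
from datetime import datetime, timedelta

# Assumed "combined" log layout; change the pattern if your logs differ.
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

AWS_SUFFIX = ".ap-southeast-1.compute.amazonaws.com"


def parse(line):
    """Return (timestamp, host, path, user_agent), or None if the line doesn't match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return ts, m["host"], m["path"], m["ua"]


def find_pairs(lines, gap=timedelta(minutes=4), slack=timedelta(seconds=30)):
    """Pair each AWS+Bytespider robots.txt hit with any robots.txt hit ~gap later."""
    hits = [h for h in map(parse, lines) if h and h[2] == "/robots.txt"]
    hits.sort(key=lambda h: h[0])
    pairs = []
    for i, first in enumerate(hits):
        ts, host, _, ua = first
        if not (host.endswith(AWS_SUFFIX) and "Bytespider" in ua):
            continue
        for later in hits[i + 1:]:
            delta = later[0] - ts
            if delta > gap + slack:
                break  # hits are sorted, so nothing further can match
            if delta >= gap - slack:
                pairs.append((first, later, delta))
    return pairs

# Usage (hypothetical file name):
#   with open("access.log") as fh:
#       for first, later, delta in find_pairs(fh):
#           print(delta, first[1], "->", later[1], "|", later[3])
```

The second hit is deliberately not filtered by host or UA, so both the AWS+Byte and the Indie flavours of the second visit should show up.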
EXAMPLES: AWS+Byte & AWS+Byte (each pair listed later hit first)
ec2-47-128-54-170.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
/robots.txt GET: 21:05:17
ec2-47-128-52-93.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
/robots.txt GET: 21:01:18
ec2-47-128-27-73.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
/robots.txt GET: 17:44:20
ec2-47-128-48-133.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
/robots.txt GET: 17:40:23
EXAMPLES: AWS+Byte & Indie (each pair listed later hit first)
86-46-71-xxx-dynamic.agg2.ety.prp-wtd.eircom.net
Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.4324.1180 Mobile Safari/537.36
/robots.txt GET: 18:08:42
ec2-47-128-54-63.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
/robots.txt GET: 18:04:27
203.94.xx.x
Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.1244.1462 Mobile Safari/537.36
/robots.txt GET: 12:28:28
ec2-47-128-34-174.ap-southeast-1.compute.amazonaws.com
Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
/robots.txt GET: 12:24:26
(Personally, I find the latter examples more concerning: were the non-AWS twins knowingly involved, or were they just picked at random to pair up? Jes' musing :)
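For anyone who wants to check the arithmetic on the four example pairs above, a few lines of Python will do it. The timestamps are copied straight from the examples; the helper name gap_seconds is mine, and it assumes no pair straddles midnight.

```python
from datetime import datetime

# (later hit, earlier hit), in the order the examples list them
pairs = [
    ("21:05:17", "21:01:18"),  # AWS+Byte & AWS+Byte
    ("17:44:20", "17:40:23"),  # AWS+Byte & AWS+Byte
    ("18:08:42", "18:04:27"),  # Indie after AWS+Byte
    ("12:28:28", "12:24:26"),  # Indie after AWS+Byte
]


def gap_seconds(later, earlier):
    """Seconds between two same-day HH:MM:SS times."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds())


gaps = [gap_seconds(later, earlier) for later, earlier in pairs]
for (later, earlier), g in zip(pairs, gaps):
    print(f"{earlier} -> {later}: {g // 60}m{g % 60:02d}s")
# prints:
# 21:01:18 -> 21:05:17: 3m59s
# 17:40:23 -> 17:44:20: 3m57s
# 18:04:27 -> 18:08:42: 4m15s
# 12:24:26 -> 12:28:28: 4m02s
```

So the listed pairs all land within about fifteen seconds of the four-minute mark.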