It appears to be space on Abovenet: 64.124.0.0 - 64.125.255.255 64.124.0.0/15
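If you want to double-check that the dash range and the CIDR above describe the same block, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `in_range` and the two test addresses are just illustrations, not anything taken from the whois record.

```python
import ipaddress

# The CIDR quoted in the post above.
net = ipaddress.ip_network("64.124.0.0/15")

# First and last address covered by the block.
first, last = net[0], net[-1]
print(first, "-", last)


def in_range(addr: str) -> bool:
    """Return True if addr falls inside the /15 block (hypothetical helper)."""
    return ipaddress.ip_address(addr) in net


print(in_range("64.124.8.37"))   # an address inside the block
print(in_range("64.126.0.1"))    # an address just outside it
```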
lucy24
11:58 pm on Dec 2, 2023 (gmt 0)
:: quick look at raw logs ::
Oh, how odd. As far as I can tell, I've never set eyes on them before the past week or so (23 November and later), all from 64.124.8.various. And they never asked for anything but robots.txt, which doesn't mention them. Response size tells me they got the full version, not the minimalist Disallow-Everything that's served to selected bad actors. Wonder what they saw that scared them away?
:: deeper delve into archived logs in case there's a past history ::
Nope, nothing. I will keep my eyes peeled.
Skips
1:26 pm on Jan 19, 2024 (gmt 0)
I am actually pretty ****** (upset) with this bot. It has made over 20,000 requests to our server over the past 5 hrs, although our robots.txt disallows all bot traffic with the exception of a few major search engines that bring us traffic.
Funnily, their website says that their bots respect robots.txt ...but they do it in a rather peculiar way: if you specifically disallow ImagesiftBot in robots.txt they will leave your site alone, but if you don't specifically disallow their bot, they will follow directives for googlebot. Huh? Do they expect webmasters to know and include every single spam bot on the planet in robots.txt?
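For comparison, here is a rough sketch of how a conventional parser reads that kind of allow-list robots.txt, using Python's standard `urllib.robotparser`. The robots.txt text and the `/images/photo.jpg` path are made-up stand-ins, not our actual file. A bot that is not named falls under the `*` group and is shut out; it does not inherit the Googlebot group. The second file shows the workaround of naming ImagesiftBot explicitly.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical file: one major search engine allowed, everyone else refused.
ALLOW_LIST_ONLY = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

# Same file with an explicit group for the bot in question.
WITH_EXPLICIT_RULE = ALLOW_LIST_ONLY + """
User-agent: ImagesiftBot
Disallow: /
"""

path = "/images/photo.jpg"  # illustrative path only

generic = build_parser(ALLOW_LIST_ONLY)
print(generic.can_fetch("Googlebot", path))      # True: empty Disallow allows all
print(generic.can_fetch("ImagesiftBot", path))   # False: falls under the * group

explicit = build_parser(WITH_EXPLICIT_RULE)
print(explicit.can_fetch("ImagesiftBot", path))  # False: matched by its own group
```

So by the usual reading, the first file already tells the bot to stay away; the behaviour described on their site is a departure from that.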
not2easy
2:02 pm on Jan 19, 2024 (gmt 0)
You should know that robots.txt is only a suggestion. It does not control the activities of bots mentioned or not mentioned in your robots.txt file. To prevent their access, you would need to control via IP block (see above) or UA block.
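As a rough illustration of the two approaches, here is a minimal sketch of the decision logic in Python. The function name `should_block` and the two lists are assumptions for the example; in practice the same test would normally live in your server config or firewall rather than in application code.

```python
import ipaddress

# Assumed block list: the /24 discussed in this thread.
BLOCKED_NETWORKS = [ipaddress.ip_network("64.124.8.0/24")]

# Assumed UA substrings to refuse, compared case-insensitively.
BLOCKED_UA_SUBSTRINGS = ["imagesiftbot"]


def should_block(remote_addr: str, user_agent: str) -> bool:
    """Return True if the request matches the IP block or the UA block."""
    addr = ipaddress.ip_address(remote_addr)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_SUBSTRINGS)


# An in-range address is refused even when it sends a browser-like UA.
print(should_block("64.124.8.37", "Mozilla/5.0"))
# The named UA is refused from any address (192.0.2.10 is a documentation address).
print(should_block("192.0.2.10", "Mozilla/5.0 (compatible; ImagesiftBot)"))
# Anything else passes.
print(should_block("192.0.2.10", "Mozilla/5.0"))
```

The IP test catches requests that arrive with a browser-like UA; the UA test catches the named bot if it shows up from a different range.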
lucy24
5:33 pm on Jan 19, 2024 (gmt 0)
Curiouser and curiouser. I hadn't given this robot a thought since the thread started, so I re-checked logs. They continue to come by many times a day, never requesting anything but robots.txt. (Somewhere along the line I must have Disallowed them, but obviously can't have done so before I knew they existed.)
Further oddity: Before this blizzard of robots.txt requests began, I do find a lot of image requests from the same IP range, 64.124.8, interspersed with many robots.txt requests, all with a humanoid UA. These peaked towards the end of last June and then dropped off precipitously.
SumGuy
2:32 pm on Jan 20, 2024 (gmt 0)
The 64.124.8.0/24 being discussed here belongs to AS36321 (Castle Global Inc.) which has, in total, these CIDRs:
Interestingly, I am blocking some of those already in my WWW and SMTP blocking lists, but only one of those was already in my total-block-don't-log list. It was the CIDR in question -> 64.124.8.0/24.
In July 2022, 64.124.8.37 asked for robots.txt and landing page file. That's what probably led me to put it in my WWW blocking list. Looking now at my router's blocking logs, it was a month later that it returned and over several days tried to hit my server, about 80 times, and that would have triggered me to put it in my total block-don't-log list, so I have no records of what it's been up to since then.
But clearly it's been around since at least July 2022.
I found this on crunchbase:
Castle Global is a software program company focusing on visual intelligence problems for producing unstructured visual data. Castle Global is a deep learning company founded by two stanford CS graduate students focused on visual intelligence problems. Their flagship product is a hive, a full stack deep learning platform that is composed of data labeling, custom model building, and complete enterprise SaaS solutions.
The listing says it was founded in 2012, and that its current status is "closed". Other listing sites show a website URL that no longer exists.
lucy24
6:53 pm on Jan 20, 2024 (gmt 0)
The ongoing headscratcher: Why do some entities--the issue is not unique to Imagesift--honor one site's robots.txt but not another? They can't all be fakers, because sometimes the behavior all involves the same IP.