"This agent is possibly scanning for another actor."
In general, I would say it doesn't do a fat lot of good to ask for robots.txt under one name and pages under a different one. Admittedly I know of one (only one) entity that does exactly this--but they've got a fixed IP down to the last a.b.c.d so there's not much shadiness involved.
"All requests have been for robots.txt where this agent is not disallowed"
On my main site, robots.txt has been followed by requests for the basic directories, which were blocked on header grounds that I haven't looked into more closely. As an encouraging sign, the roboted-out area (pages linked from front page and from 403 page) has not been requested.
User-Agent: The Knowledge AI
Disallow: /
[edited by: keyplyr at 7:01 pm (utc) on Jun 24, 2018]
[edit reason] splice clean-up [/edit]
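For anyone who wants to check what a rule like the one above actually does, here is a rough sketch using Python's standard urllib.robotparser. The helper name is_allowed and the second agent name "SomeOtherBot" are made up for illustration; nothing here is claimed about how the bot itself reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted in the post above.
ROBOTS_TXT = """\
User-Agent: The Knowledge AI
Disallow: /
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Hypothetical helper: True if robots_txt lets user_agent fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The named agent is shut out of the whole site.
print(is_allowed("The Knowledge AI", "/"))
# "SomeOtherBot" is an assumed name; with no matching or wildcard record it is allowed.
print(is_allowed("SomeOtherBot", "/index.html"))
```

Note that this only shows what the file asks for. Whether a given robot honors it is a separate question, and the rule only applies if the robot identifies itself under that name when it reads the file.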
"Out of curiosity, has anyone met it on an HTTPS"
Of course... I posted, didn't I ;)
"if the bot was achieving what it was intended to do"
Whatever that is. I tend to doubt it was “intended to” see how many 301s it could rack up on a single site. On your secure sites, did it start out by making HTTP requests, or have you only ever met it on HTTPS from the beginning?