Forum Moderators: bakedjake
It could be that some webmasters will never see this bot because, AFAIK, it uses exclusively Amazon EC2 servers, which a number of webmasters block by default. Or could it be because their claims about robots.txt compliance are a barefaced lie?
User-Agent: *
Disallow: /
... which they proceed to ignore. A further quirk is that they then, just like a human, fetch all the supporting files associated with the 403 page. There's basically no way for anyone to tell if it's really their bot or if it's a faker. But when the requests are coming from the down-to-the-last-digit IPs listed on their own page, you kinda have to assume it's the real thing. Unless they've got offspring sneaking in after hours to play with the robot when nobody else is using it?
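For reference, honoring a blanket Disallow like the one above is trivial to implement, which makes ignoring it hard to excuse. A minimal sketch using Python's stdlib robotparser (the bot name "ExampleBot" and URL are hypothetical):

```python
import urllib.robotparser

# The same robots.txt shown above: disallow everything for every agent.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# A compliant crawler checks can_fetch() before requesting any URL.
# With a blanket Disallow, this is False for every agent and every path.
print(rp.can_fetch("ExampleBot", "https://example.com/page.html"))  # False
```

A crawler that performs this check and gets False simply never issues the request, so the webmaster sees at most the robots.txt fetch itself, never a follow-up hit on the 403 page or its supporting files.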