[edited by: keyplyr at 12:55 am (utc) on Aug 12, 2018]
[edit reason] removed active links [/edit]
UA: Mozilla/5.0 zgrab/0.x (compatible; Researchscan/foo-twiddle; +http://researchscan.comsys.rwth-aachen.de)
IP: 137.226.113.abc
robots.txt: no
First seen (by me): 27 January 2018
Protocol: HTTPS ONLY
where “foo-twiddle” * can be any of:

Can I request that my server be excluded?

To have your host or network excluded from future scans conducted by RWTH Aachen University, please contact researchscan@comsys.rwth-aachen.de with your IP address or CIDR block. Alternatively, you can configure your firewall to drop traffic from the subnet we use for scanning: 137.226.113.0/26

Or, Option C, you could instruct your robot to recognize the established mechanism by which a site conveys the request “Do not crawl here”. Or, Option D, I could continue blocking you on header grounds without having to take any action at all. (Is this one of those ventures where a 403 response actually conveys just as much information as a 200? Possibly.)
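For anyone who would rather handle this at the application level than in the firewall, here is a rough sketch of the check in Python. It only uses the subnet quoted above and the two UA substrings reported in this thread; the function name `should_block` and the marker list are my own assumptions, not anything published by the scanner's operators.

```python
import ipaddress

# Subnet quoted in the operators' opt-out notice.
SCAN_SUBNET = ipaddress.ip_network("137.226.113.0/26")

# Substrings taken from the UA reported in this thread (assumed markers,
# matched case-insensitively).
UA_MARKERS = ("zgrab", "researchscan")


def should_block(remote_ip: str, user_agent: str) -> bool:
    """Return True if a request looks like the research scanner.

    Hypothetical helper: matches on either the source subnet or the UA.
    """
    try:
        in_subnet = ipaddress.ip_address(remote_ip) in SCAN_SUBNET
    except ValueError:
        # Unparseable address: fall back to the UA check alone.
        in_subnet = False
    ua = (user_agent or "").lower()
    return in_subnet or any(marker in ua for marker in UA_MARKERS)
```

Note that a /26 only covers 137.226.113.0 through 137.226.113.63, so the UA check is what catches anything arriving from outside that range.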
[edited by: keyplyr at 12:58 am (utc) on Aug 12, 2018]
[edit reason] splice clean-up [/edit]
This would be a nifty naming convention in a computer-science class where each student's robot had to have some unique identifier while being overall the same, but that doesn’t seem to apply here.

I think that's exactly what it is, but there appear to be several schools participating in the project.