storm-crawler

Blackboard's new UA

keyplyr

10:07 pm on Apr 30, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



UA: Blackboard Safeassign/0.1 (a Storm-based Blackboard Safeassign web-crawler; https://github.com/DigitalPebble/storm-crawler; stormcrawler@digitalpebble.com)
Protocol: HTTP/1.1
Robots.txt: No
Host: blackboard.com
69.196.224.0 - 69.196.255.255
69.196.224.0/19
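
For anyone who wants to double-check that the dotted range and the CIDR block above describe the same addresses, here is a minimal sketch using Python's standard `ipaddress` module. The variable names are only illustrative; the two endpoints are copied from the lines above.

```python
import ipaddress

# Range endpoints as quoted above.
first = ipaddress.ip_address("69.196.224.0")
last = ipaddress.ip_address("69.196.255.255")

# summarize_address_range yields the smallest set of CIDR blocks
# that exactly covers first..last.
blocks = list(ipaddress.summarize_address_range(first, last))
print(blocks)  # [IPv4Network('69.196.224.0/19')]
```

A single /19 comes back, so the two notations agree.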

Blackboard, the educational technology company, is now using a user agent that carries a github.com URL, which reads like referrer spam.

AFAIK Blackboard UAs have never requested robots.txt. Even though their new UA calls itself a "web crawler", this has never been the behavior from Blackboard. I've only seen them do vertical requests.

lucy24

3:57 am on May 1, 2017 (gmt 0)

AFAIK Blackboard UAs have never requested robots.txt

Someone calling themselves
Blackboard Safeassign/0.1 (a Storm-based Blackboard Safeassign web-crawler; https://github.com/DigitalPebble/storm-crawler; stormcrawler@digitalpebble.com)

requested robots.txt twelve separate times last October 14. But on closer examination, all twelve came from those nefarious 52. and 54. ranges, not the usual 69.196.blahblah.

Hmm.

Sometimes I wonder if I should make exceptions for academic robots if they're engaged in a worthy cause like checking for plagiarism. But only if they give some clue that that's really what they are doing. Having "Blackboard" in your name isn't quite enough, even if they do seem to be concentrating their efforts on ebooks.

keyplyr

4:36 am on May 1, 2017 (gmt 0)

Sounds like a poser if coming from AWS. AFAIK Blackboard always comes from Blackboard ranges.

Having a somewhat edu-themed site, I allow all the citation checkers, plagiarism scanners, and the school ranges (which sometimes takes research, since they are often VPNs at server farms). IMO Blackboard is worthy of access. Read up at their site.

keyplyr

8:52 am on May 1, 2017 (gmt 0)

Just spoke with the SafeAssign team at Blackboard. I suggested editing their UA to exclude the "github" attribute. They are considering it.

I rewrote my rules to allow this UA from Blackboard, but I know of more than a few site owners, myself included, who block anything with github in the UA.
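
The actual rules are not shown in this thread, so the following is only a rough sketch of the idea discussed here: honor the SafeAssign UA when it comes from the Blackboard range quoted in the first post, and refuse it when it comes from anywhere else (such as the 52. and 54. ranges mentioned earlier). The function name `classify`, the return values, and the lowercase token match are all assumptions made for illustration.

```python
import ipaddress

# Range and UA text are taken from the first post; the rest is hypothetical.
BLACKBOARD_NET = ipaddress.ip_network("69.196.224.0/19")
UA_TOKEN = "blackboard safeassign"

def classify(remote_ip: str, user_agent: str) -> str:
    """Return 'allow', 'deny', or 'default' for a single request."""
    if UA_TOKEN not in user_agent.lower():
        # Not claiming to be SafeAssign; leave it to whatever other rules exist.
        return "default"
    if ipaddress.ip_address(remote_ip) in BLACKBOARD_NET:
        return "allow"
    # Claims to be SafeAssign but arrives from outside the Blackboard range.
    return "deny"
```

Checking the UA first means a github-based block elsewhere in the rule set can stay in place for everyone else, with this one exception carved out by IP.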

lucy24

9:27 pm on May 1, 2017 (gmt 0)

IMO Blackboard is worthy of access.

I'll trust your judgement on this, then.

:: wandering off to see what headers are involved ::