Paracrawl

         

lucy24

6:09 pm on Jun 26, 2022 (gmt 0)

I have never seen this robot before, and hope never to see it again.

IP: 128.220.117.abc
UA: Mozilla/5.0 (compatible; Paracrawl; +http://statmt.org/paracrawl/robots.html)
robots.txt: requests it, then disregards it
The URL in the UA first redirects to HTTPS and then gives a resounding 404. The hostname itself exists, and could be legit, but has no contact information.

The IP (full /16) appears to belong to Johns Hopkins, so I fired off a displeased email to noc@etcetera, but do not expect to hear back on a weekend--if ever.
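
For anyone who wants to flag the same range in their own logs, here is a minimal Python sketch. It assumes the /16 in question is 128.220.0.0/16 (inferred from the 128.220.117.x address above), and the function name `in_reported_range` is made up for illustration.

```python
import ipaddress

# Assumption: the /16 containing 128.220.117.x is 128.220.0.0/16.
REPORTED_RANGE = ipaddress.ip_network("128.220.0.0/16")

def in_reported_range(addr: str) -> bool:
    """Return True if addr is a valid IP address inside REPORTED_RANGE."""
    try:
        return ipaddress.ip_address(addr) in REPORTED_RANGE
    except ValueError:
        # Obfuscated or malformed addresses (e.g. "128.220.117.abc") do not parse.
        return False
```

Whether to block the whole range or only this one robot is a separate judgment call; the sketch only answers "did this request come from there?"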

I do not know whether the robot would have slunk away dejected if it had found its name in robots.txt. I do know that it asked for every html and pdf file on the site, including the entire contents of roboted-out directories. (From this I learned that my site has approximately 725 pages. I also learned--ahem, cough-cough--that one obscure interior page still has a link to a page I moved almost a year ago. Oops.)
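
To see which logged requests a compliant robot should have skipped, something like the following rough sketch could help. It uses Python's standard `urllib.robotparser`; the robots.txt contents, the sample paths, and the helper name `disallowed_hits` are all hypothetical, not taken from the actual site. Note that the parser matches on a robot's name token, so the sketch passes "Paracrawl" rather than the full UA string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; substitute the site's real file.
ROBOTS_TXT = """\
User-agent: Paracrawl
Disallow: /

User-agent: *
Disallow: /private/
"""

def disallowed_hits(robots_txt, robot_name, requested_paths):
    """Return the requested paths that robots_txt forbids for robot_name."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in requested_paths if not parser.can_fetch(robot_name, p)]

# Hypothetical paths pulled from an access log.
paths = ["/index.html", "/private/notes.pdf"]
print(disallowed_hits(ROBOTS_TXT, "Paracrawl", paths))
```

This only shows what the rules say; it cannot tell whether the robot would actually have honored them.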

Brett_Tabke

7:24 pm on Jun 26, 2022 (gmt 0)

I think it is these folks:
[paracrawl.eu...]