CliniComp Webcrawler/Nutch-1.11

         

w3bmastine

10:32 am on Apr 22, 2016 (gmt 0)

10+ Year Member



UA: CliniComp Webcrawler/Nutch-1.11
Protocol: HTTP/1.0
Robots.txt: requested, but ignored
Host: internap.com
NetRange: 63.251.0.0 - 63.251.255.255
CIDR: 63.251.0.0/16
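
For anyone who would rather match on the address range than the UA, here is a minimal sketch using Python's standard `ipaddress` module. The function name `in_reported_range` is made up for illustration, and the sample addresses are arbitrary; only the CIDR comes from the report above.

```python
import ipaddress

# CIDR taken from the report above; everything else here is illustrative.
REPORTED_NET = ipaddress.ip_network("63.251.0.0/16")


def in_reported_range(ip: str) -> bool:
    """Return True if the given IPv4 address falls inside 63.251.0.0/16."""
    return ipaddress.ip_address(ip) in REPORTED_NET


# The /16 spans the same addresses as the NetRange line.
first = str(REPORTED_NET.network_address)
last = str(REPORTED_NET.broadcast_address)
```

Blocking a whole /16 at a hosting provider will also catch unrelated tenants, so this is a blunter tool than a UA match.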

Visited on 17/Apr/2016, disappeared after being served with a 403 for having 'nutch' in the UA string.
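
A rough sketch of that kind of rule in Python, assuming a plain case-insensitive substring test on the UA string. The function name `status_for_user_agent` is hypothetical, and the actual server configuration used here was not posted.

```python
def status_for_user_agent(user_agent: str) -> int:
    """Return 403 when the UA contains 'nutch' (any case), otherwise 200."""
    # Treat a missing UA as an empty string so the check never raises.
    ua = (user_agent or "").lower()
    if "nutch" in ua:
        return 403
    return 200
```

The same test is usually written as a one-line rule in the web server config; the Python form is only meant to show the logic.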

keyplyr

11:08 am on Apr 22, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I allow only a few Nutch variants through. At one time I didn't allow any, thinking that if they were serious they would eventually write their own bot, but many never did, and a few are beneficial to my interests.
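
One way to sketch a "block Nutch, except a few" rule in Python. The allowlist entry `examplebot` is a made-up placeholder, since the variants actually let through are not named in this thread, and `is_blocked` is a hypothetical helper name.

```python
# Hypothetical placeholder: substitute the UA tokens you actually trust.
ALLOWED_NUTCH_VARIANTS = ("examplebot",)


def is_blocked(user_agent: str) -> bool:
    """Block any UA containing 'nutch' unless it also names an allowed variant."""
    ua = (user_agent or "").lower()
    if "nutch" not in ua:
        return False  # not a Nutch-based crawler, so this rule does not apply
    return not any(name in ua for name in ALLOWED_NUTCH_VARIANTS)
```

UA strings are trivially spoofed, so an allowlist like this is best paired with an IP or reverse-DNS check for the bots you let in.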

Don't know why CliniComp would have a web crawler though. That's the real question.

tangor

3:28 am on Apr 24, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I still block nutch. I've not yet seen one that is THAT beneficial. Then again, that's just me!

keyplyr

9:02 am on Apr 24, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Isn't everything "just me?"