HTML Analyzer

using Nutch

         

keyplyr

2:52 am on May 12, 2017 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month




UA: HTML Analyzer/Nutch-1.12
Protocol: HTTP/1.1
Robots.txt: Yes
Host: University of Coimbra, Portugal
IP range: 193.136.212.0 - 193.136.212.255
CIDR: 193.136.212.0/24
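
For anyone who wants to match this range in their own log or access checks, here is a minimal sketch using Python's standard `ipaddress` module. The function name `in_reported_range` is made up for illustration, and the only data it relies on is the /24 listed above.

```python
import ipaddress

# The /24 reported in this post (193.136.212.0 - 193.136.212.255).
HTML_ANALYZER_NET = ipaddress.ip_network("193.136.212.0/24")


def in_reported_range(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside the reported /24.

    Hypothetical helper: strings that are not valid IP addresses
    are treated as "not in range" rather than raising an error.
    """
    try:
        return ipaddress.ip_address(remote_addr) in HTML_ANALYZER_NET
    except ValueError:
        return False
```

The CIDR form and the dotted range describe the same 256 addresses, so a single network object covers both lines.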

keyplyr

6:53 pm on May 13, 2017 (gmt 0)


Although I block "Nutch" because it is so widely used and its operators are often unaccountable, I do like the fact that its default setting supports robots.txt.

tangor

1:08 am on May 14, 2017 (gmt 0)


I block Nutch simply because more serious projects tend to use something other than Nutch. However, if one has the time and energy to evaluate each bot on its merits, then it would make sense to poke holes in the block when reasonable.

keyplyr

1:43 am on May 14, 2017 (gmt 0)


Yes, I think I allow 5 or 6 bots that use Nutch.
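
The "block Nutch, but poke holes" approach discussed above could be sketched roughly as follows. This is only an illustration and not anyone's actual setup: the names in `ALLOWED_NUTCH_AGENTS` are invented placeholders, and `allow_request` is a hypothetical helper. User-agent strings are easy to fake, so a real rule would probably be paired with an IP range check as well.

```python
# Hypothetical allowlist: placeholder names, not real crawlers from this thread.
ALLOWED_NUTCH_AGENTS = {"examplebot", "researchcrawler"}


def allow_request(user_agent: str) -> bool:
    """Block any UA containing "nutch" unless it also matches the allowlist."""
    ua = user_agent.lower()
    if "nutch" not in ua:
        # Not a Nutch-based crawler, so this rule does not apply.
        return True
    # Nutch-based: let it through only if an allowed name appears in the UA.
    return any(name in ua for name in ALLOWED_NUTCH_AGENTS)
```

With this rule, "HTML Analyzer/Nutch-1.12" would be refused, while a UA such as "ExampleBot/Nutch-1.12" would get through because it matches a placeholder entry.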