Nutch from Looksmart

volatilegx

7:47 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First time I've seen the Nutch user agent coming from a Looksmart IP:

IP: 64.241.242.18
Host: sv-fw.looksmart.com
UA: NutchCVS/0.05 (Nutch; [nutch.org...] nutch-agent@lists.sourceforge.net)

jdMorgan

12:21 am on Nov 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Did it fetch robots.txt?

I warned Nutch.org about letting people use this thing without changing the user-agent name. The way they coded their robots.txt parser, it's near impossible to allow nutch.org while blocking unknown users of their code.

They're headed down the same path as Indy Library, Larbin, and the others... Good, useful 'bots ruined by their lack of legally enforceable terms of use... :(

Jim
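
To illustrate the robots.txt point above: rules are matched on the user-agent name alone, so two operators sending the same name cannot be told apart. The sketch below is only an illustration. It uses Python's standard `urllib.robotparser` as a stand-in, not Nutch's own parser, and the robots.txt contents, the abbreviated UA string, the operator labels, and the `is_allowed` helper are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that tries to block the crawler by name.
ROBOTS_TXT = """\
User-agent: NutchCVS
Disallow: /
"""

# Abbreviated form of the UA string quoted in the first post (assumption:
# every operator running the unmodified code sends the same string).
NUTCH_UA = "NutchCVS/0.05 (Nutch)"


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical operators; robots.txt sees only the UA, never the operator.
operators = {
    "nutch.org": NUTCH_UA,
    "unknown third party": NUTCH_UA,
}

if __name__ == "__main__":
    for operator, ua in operators.items():
        print(operator, "allowed:", is_allowed(ROBOTS_TXT, ua, "/page.html"))
```

Because both operators present the same name, any rule that blocks one blocks the other. Telling them apart would need something outside robots.txt, such as the IP or reverse-DNS host shown in the first post.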

volatilegx

10:28 pm on Nov 7, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know, Jim -- I only have the info secondhand.

Dan