Different installations of the Nutch software may specify different agent names, but all should respond to the agent name "Nutch". Thus, to ban all Nutch-based crawlers from your site, place the following in your robots.txt file:
# allowed bots here
# everyone else jump off a cliff
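For anyone who wants to sanity-check a whitelist-style robots.txt like the one those comments describe, here is a minimal sketch using Python's standard urllib.robotparser. The "Nutch" agent name comes from the quote above; "Googlebot" is only a placeholder assumption, since the thread does not say which bots were actually allowed, and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Whitelist-style robots.txt. "Googlebot" is a placeholder assumption;
# the thread does not say which bots were actually allowed.
ROBOTS_TXT = """\
# ban every Nutch-based crawler
User-agent: Nutch
Disallow: /

# allowed bots here
User-agent: Googlebot
Disallow:

# everyone else jump off a cliff
User-agent: *
Disallow: /
"""


def is_allowed(agent: str, path: str = "/index.html") -> bool:
    """Return True if a robots.txt-compliant crawler named `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Print the verdict for a banned bot, a whitelisted bot, and an unknown bot.
    for agent in ("Nutch", "Googlebot", "SomeOtherBot"):
        print(agent, is_allowed(agent))
```

Keep in mind this only models crawlers that actually read and obey robots.txt. Anything running in stealth mode never asks, so these rules do nothing against it.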
I block them, and all other bots, with .htaccess.
You block the ones that want to be seen. The rest are having a party on your server right now, as I'm typing this, zipping right past .htaccess in stealth mode.
However, it's the best you can do with the tools Apache ships with the server.