dirkz - 11:51 am on Oct 29, 2003 (gmt 0)
It's meant for Apache. There's also a win32 version of it, but I don't know whether it's suitable for production servers. I have experienced both sides: the bot programmer and the site owner. A "good" leeching bot (in the eyes of the leecher) will disguise its UA and never obey a robots.txt. It's quite easy to modify existing bots in Perl and Python to do so. It's also very easy to write your own. On the other side of the fence, as a site owner I strongly recommend real-time "traffic-shaping methods" that act on "offending" IPs, independent of UA and robots.txt. They work like firewalls detecting intrusion attempts and DoS attacks, only at a higher level (HTTP). Btw, in my experience a lot of leechers use sophisticated Perl/Python solutions. Sometimes I feel like telling them about wget and its mirroring options. A leecher's life could be so simple :)
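To make the per-IP idea concrete, here is a minimal sketch in Python. It is not the Apache module discussed above, just an illustration of the principle: count recent requests per client IP and refuse once an IP exceeds a threshold, without ever looking at the UA or robots.txt. The class name `IpRateLimiter`, its parameters and the thresholds are made up for this example.

```python
import time
from collections import defaultdict, deque


class IpRateLimiter:
    """Sliding-window request counter keyed by client IP (hypothetical helper)."""

    def __init__(self, max_requests=30, window_seconds=10.0, clock=time.monotonic):
        self.max_requests = max_requests      # assumed threshold, tune per site
        self.window_seconds = window_seconds  # assumed window length
        self.clock = clock                    # injectable so it can be tested
        self.hits = defaultdict(deque)        # ip -> timestamps of recent requests

    def allow(self, ip):
        """Return True if the request may proceed, False if the IP is over its limit."""
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # caller would answer e.g. 403/503 or delay the response
        recent.append(now)
        return True
```

A server would call `limiter.allow(client_ip)` before serving each request. A real deployment would also need to expire idle IPs from `hits` and decide what to do about proxies that put many users behind one address; both are left out here.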
Does this all only work on Linux/Unix hosting? Some sites fall prey to constant file pilfering, leeching and unwanted mass downloads