Best script to limit # of pages downloaded by a user/bot

Script for battling site scrapers by monitoring # of pages requested

         

tonyd32

1:07 pm on Dec 6, 2006 (gmt 0)



I ban bots by looking at their headers.
I ban bots by monitoring unauthorized access to a no-access directory disallowed in robots.txt.
I ban the bots that follow links hidden from 'normal' users.
I ban bots that use an IP address from known IP lists (proxies and the Tor network).

but it does not seem to be enough! So I'm planning to use a script that limits the number of pages a user can download. An example of such a script is the one Yahoo uses to limit the number of searches per day per IP.
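
To make the idea concrete, here is a rough sketch of the per-IP page limit described above. It is written in Python only to show the logic; it is not an Apache module and not the script Yahoo uses. The class name PageRateLimiter, the method allow, and the default limits (100 pages per 24 hours) are all made up for illustration.

```python
import time
from collections import defaultdict, deque


class PageRateLimiter:
    """Per-IP sliding-window page counter (hypothetical helper for illustration)."""

    def __init__(self, max_pages=100, window_seconds=86400):
        # Both defaults are assumptions; tune them to your own traffic.
        self.max_pages = max_pages
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of counted requests

    def allow(self, ip, now=None):
        """Return True if this request is within the limit, False if it should be blocked."""
        now = time.time() if now is None else now
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_pages:
            return False  # over the limit: do not count the blocked request
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = PageRateLimiter(max_pages=3, window_seconds=60)
    for i in range(5):
        # 192.0.2.1 is a documentation-only address
        print(i, limiter.allow("192.0.2.1", now=float(i)))
```

The per-request cost here is small (a dictionary lookup plus a few deque operations), but the counters live in the memory of a single process. A real Apache setup with several worker processes would presumably need a shared store (a database, shared memory, or similar), and that lookup is where any slowdown for normal users would most likely come from.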

Does anyone have a recommendation for such a script for Apache? I'm looking for a generic script, probably Perl-based. If you use such a script, was there a big impact on server access time for other (normal) users?

I'll let you know the results of my research. Thanks.
-T