How to prevent mass downloading?

without harming search engine spiders.

         

a_sh

11:12 am on May 12, 2006 (gmt 0)

10+ Year Member



Users with high-speed connections and "website download" programs almost bring my website down from time to time.
Are there any free solutions that limit this kind of activity without harming search spiders such as Googlebot?

stapel

8:23 pm on May 12, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In your .htaccess file, you can ban downloaders using code such as:

    RewriteEngine on
    RewriteCond %{HTTP_USER_AGENT} ^.*attach.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Ants.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BackWeb.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Bandit.*$ [NC,OR]
    ...
    RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Whacker.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Widow.*$ [NC,OR]
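    # Arbitrary request headers are read with the HTTP: prefix (Firefox prefetch sends "X-moz: prefetch")
    RewriteCond %{HTTP:X-moz} ^prefetch [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Microsoft\ Data\ Access\ Internet\ Publishing\ Provider.*$ [NC]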
    RewriteRule ^.* - [F]

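The "NC" flag means "no case", so upper-versus-lower case letters won't matter. The "OR" flag joins a condition to the next one, so omit it from the last condition in your list. The "F" flag on the last line, the "rule", means "forbidden": matching requests get a 403 response, so they won't get your pages.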
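If you want to check which visitors a list like this would catch before you upload it, a small script can help. The following is only a rough sketch in Python, not part of the .htaccess itself: the function name `would_be_blocked` is made up for illustration, the list holds only the names shown above (the elided entries are left out), and it covers only the user-agent conditions, not the X-moz one.

```python
import re

# Substrings copied from the RewriteCond lines above.
# The entries hidden behind "..." are deliberately not filled in.
BLOCKED_SUBSTRINGS = [
    "attach", "Ants", "BackWeb", "Bandit",
    "Wget", "Whacker", "Widow",
    "Microsoft Data Access Internet Publishing Provider",
]

# One case-insensitive alternation, mirroring the [NC,OR] chain:
# a user agent is refused if ANY of the substrings appears in it.
BLOCKED_RE = re.compile(
    "|".join(re.escape(s) for s in BLOCKED_SUBSTRINGS),
    re.IGNORECASE,
)

def would_be_blocked(user_agent: str) -> bool:
    """Return True if the user-agent conditions would match this string."""
    return BLOCKED_RE.search(user_agent) is not None

if __name__ == "__main__":
    samples = [
        "Wget/1.10.2",
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    ]
    for ua in samples:
        print(would_be_blocked(ua), ua)
```

Running your logged user-agent strings through something like this is a quick way to spot false positives, since short substrings such as "Ants" match anywhere in the string.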

Hope that helps a bit.

Eliz.

P.S. No, there is no requirement that things be listed alphabetically. But it can make it easier for you to review your listing later, if necessary.

a_sh

7:14 am on May 15, 2006 (gmt 0)

10+ Year Member



That works, but most downloaders don't send a proper user-agent header. According to my website logs, they identify themselves as IE, Mozilla, and so on.

Are there solutions that detect a high load coming from a single IP, or that ban an IP once it hits a honeytrap?

stapel

6:37 pm on May 15, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you don't want to depend on the user-agent being declared, then, yes, you'll probably need some sort of honeypot link near the top of your pages.
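To give a rough idea of how a honeypot can work, here is a minimal sketch in Python. Everything named in it is an assumption for illustration: the `/trap/` path, the one-day ban, the `Honeypot` class and the in-memory ban list are all hypothetical, and a real site would hook this into whatever serves its pages. The idea is that the trap URL is linked invisibly near the top of each page and disallowed in robots.txt, so spiders that honor robots.txt never request it, while a downloader that follows every link does and gets its IP banned.

```python
import time

TRAP_PATH = "/trap/"        # hypothetical URL, linked invisibly on each page
BAN_SECONDS = 24 * 60 * 60  # assumed ban length: one day

# Serve this as robots.txt so well-behaved spiders stay out of the trap.
ROBOTS_TXT = "User-agent: *\nDisallow: /trap/\n"

class Honeypot:
    """Keeps an in-memory table of banned IPs and their ban expiry times."""

    def __init__(self, ban_seconds=BAN_SECONDS):
        self.ban_seconds = ban_seconds
        self._banned_until = {}  # ip -> timestamp when the ban ends

    def check(self, ip, path, now=None):
        """Return the HTTP status to send: 403 if refused, otherwise 200."""
        now = time.time() if now is None else now
        if path.startswith(TRAP_PATH):
            # Anything requesting the trap ignored robots.txt: ban the IP.
            self._banned_until[ip] = now + self.ban_seconds
            return 403
        expiry = self._banned_until.get(ip)
        if expiry is not None:
            if now < expiry:
                return 403
            del self._banned_until[ip]  # ban has run out
        return 200

if __name__ == "__main__":
    hp = Honeypot()
    print(hp.check("192.0.2.1", "/index.html"))  # ordinary request
    print(hp.check("192.0.2.1", "/trap/"))       # falls into the trap
    print(hp.check("192.0.2.1", "/index.html"))  # refused while banned
```

Letting the ban expire keeps a shared or reassigned IP from being locked out forever.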

Eliz.