How to limit multiple requests for the same file?

download getting requested 100s of times per minute from some clients

         

stevenp

1:25 pm on Nov 27, 2007 (gmt 0)

10+ Year Member



Hello all,
I'm serving a .pdf download at my website. Periodically there will be a connection that requests (and gets) the file up to 200 times per minute from the same IP. The server responses are all 200s, not 206s, which makes me think they are not coming from "normal" download managers/browser optimizers requesting the file in parts. And most clients just ask for it once and get it once.

This is hammering my bandwidth, so I wonder what I can do about it. The abusive (to my mind) requests come from many different IP ranges, so banning by IP is not a solution. "Download Master" was the user agent from which I first noticed this happening, so that and other agents are banned in the download directory's .htaccess. However, it still happens from apparently innocent user agents.

Recently one connection was served a 403 for being "Download Master", and then changed its user agent to an apparently normal browser and went off on a multiple-request spree again. Is this a sign of some kind of attack? And if so what can I do about it?

I considered disabling KeepAlive but that would be sitewide, right? You can't turn it off on a per-directory basis?

Any advice would be very welcome. Thanks.

PS - installing mod_bw or mod_limitipconn is not an option with my current hosting.

vincevincevince

3:06 pm on Nov 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Change the file URL to a perl/php/... script which does the following:
  • Generate a big random number/alphanumeric $rand
  • Create a symlink from /downloads/$rand to /otherProtectedPath/theRealFile
  • Output a header stating Location: /downloads/$rand
  • Flush
  • Wait a second or two
  • Add the user's IP address to an .htaccess ban in /downloads/

    The download will have already started, so the new .htaccess ban will not affect it. Apache serves the file directly from disk, so you've dropped the additional scripting overhead. Run a periodic job to purge old IPs.
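A minimal sketch of the steps above, written in Python rather than the Perl or PHP the post suggests. It assumes a CGI-style script, a server that follows symlinks in the downloads directory, and Apache 2.2-style `Deny from` rules. The function names and the `# banned-at` comment used to date each ban are hypothetical, chosen only so that old bans can be purged later.

```python
import os
import secrets
import time

BAN_MARKER = "# banned-at "  # hypothetical comment format used to date each ban


def make_one_time_link(real_file, downloads_dir):
    """Symlink downloads_dir/<random token> to real_file and return the token."""
    token = secrets.token_hex(16)  # 32 hex characters, hard to guess
    os.symlink(os.path.abspath(real_file), os.path.join(downloads_dir, token))
    return token


def redirect_response(token, url_prefix="/downloads/"):
    """Build CGI headers that redirect the client to the one-time URL."""
    return "Status: 302 Found\r\nLocation: " + url_prefix + token + "\r\n\r\n"


def ban_ip(htaccess_path, ip, now=None):
    """Append a timestamped deny rule for ip (Apache 2.2 syntax, an assumption)."""
    stamp = int(time.time() if now is None else now)
    with open(htaccess_path, "a") as fh:
        fh.write(BAN_MARKER + str(stamp) + "\n" + "Deny from " + ip + "\n")


def purge_old_bans(htaccess_path, max_age, now=None):
    """Drop ban entries older than max_age seconds; keep every other line."""
    now = time.time() if now is None else now
    with open(htaccess_path) as fh:
        lines = fh.read().splitlines()
    kept, i = [], 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(BAN_MARKER) and i + 1 < len(lines):
            if now - int(line[len(BAN_MARKER):]) > max_age:
                i += 2  # skip the marker and the Deny line that follows it
                continue
        kept.append(line)
        i += 1
    with open(htaccess_path, "w") as fh:
        fh.write("".join(entry + "\n" for entry in kept))


def handle_request(client_ip, real_file, downloads_dir, out, delay=2.0):
    """Redirect to a fresh symlink, flush, wait, then ban the requesting IP."""
    token = make_one_time_link(real_file, downloads_dir)
    out.write(redirect_response(token))
    out.flush()
    time.sleep(delay)  # give the client time to start the real download
    ban_ip(os.path.join(downloads_dir, ".htaccess"), client_ip)
```

In a real CGI script, `client_ip` would come from the `REMOTE_ADDR` environment variable and `out` would be `sys.stdout`. The sketch does not remove old symlinks; a periodic job would need to clean those up alongside the bans.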

    jdMorgan

    3:18 pm on Nov 27, 2007 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Your options are limited. You could rewrite requests for that PDF file to a PHP or Perl script that uses "flock" file locking to ensure that only one instance of itself can run at a time. The script would "include" your PDF file, send it to the client, and then wait one second before releasing the lock.

    This would limit all client downloads of that PDF file to one per second.
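A rough sketch of that locking idea, in Python rather than PHP or Perl and assuming a Unix host where `fcntl.flock` is available. The function name, the separate lock file, and the parameter names are hypothetical; the post only describes the approach.

```python
import fcntl
import shutil
import time


def serve_serialized(pdf_path, lock_path, out, delay=1.0):
    """Send pdf_path to out under an exclusive lock, then pause before unlocking."""
    with open(lock_path, "w") as lock:
        # Blocks until no other instance of the script holds the lock.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            with open(pdf_path, "rb") as pdf:
                out.write(b"Content-Type: application/pdf\r\n\r\n")
                shutil.copyfileobj(pdf, out)  # stream the file to the client
            out.flush()
            time.sleep(delay)  # hold the lock a little longer to cap the rate
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

As a CGI script this would be called with `sys.stdout.buffer` as `out`. Because every request waits for the lock, legitimate visitors queue behind an abusive client too, which is the trade-off of this approach.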

    I can't think of anything else at the moment if the standard 'bandwidth limiters' are not an option, and you're seeing both IP and user-agent switching from these clients.

    You can turn off keep-alive on a per-directory/per-file basis in .htaccess if you like. Here's an example from one of my servers:


    # Disconnect client after 403 response
    <FilesMatch "^custom-403\.html$">
    SetEnv nokeepalive
    </FilesMatch>

    If you don't enclose the SetEnv directive in a <FilesMatch> or <Files> container, then it would apply to all files in the directory in which this .htaccess code resides, and to all subdirectories of this directory as well.

    Jim

    stevenp

    3:28 pm on Nov 27, 2007 (gmt 0)

    10+ Year Member



    Thanks for the reply, vince^3. I don't really understand it, but I might be able to find someone who does. ;)

    In the meantime, is it possible that serving the bots (or whatever they are) a custom 404 instead of a 403 might confuse them and make them give up? Or is fiddling with the .htaccess by itself basically not going to be sufficient for this problem?

    EDIT: jdMorgan, thanks for your response as well. Perhaps more hopeful. There's only one file in my download directory, so that could work, and I wouldn't need the <Files> container. Is there a substantive difference between writing "SetEnv nokeepalive" and "KeepAlive Off"?

    EDIT2: Ah, does setenv restrict it to that directory as I want?
    Thanks!

    stevenp

    9:03 am on Nov 28, 2007 (gmt 0)

    10+ Year Member



    Since implementing the
    SetEnv nokeepalive
    in the download directory's .htaccess 12 hours ago, I haven't seen any more request floods. :)

    Thanks very much for your help!