Pattern for rate limiting for submissions

NickMNS

10:35 pm on Mar 13, 2020 (gmt 0)

I'm developing a site that will allow unregistered users to submit data to the server. I would like to rate limit these submissions.

My first instinct is simply to save the user's IP in my DB with a TTL (time to live) index. While the IP exists in the DB, the user is blocked from submitting. But my concern is that two users could share an IP, e.g. two people in the same home or office.

Is there a pattern or standard practice for profiling users for the purpose of rate limiting?
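
For readers following along, the TTL idea above can be sketched roughly as follows. This is a minimal in-memory illustration, not a DB-backed implementation: the dict stands in for a collection with a TTL index, and the names `TtlBlocklist` and `try_submit` are hypothetical.

```python
import time


class TtlBlocklist:
    """In-memory stand-in for a DB collection with a TTL index (assumption)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock        # injectable so the logic can be tested
        self._expires_at = {}     # key -> time at which the block ends

    def try_submit(self, key):
        """Return True and start a new block if `key` is not currently blocked."""
        now = self.clock()
        expires = self._expires_at.get(key)
        if expires is not None and now < expires:
            return False          # entry still alive: reject the submission
        self._expires_at[key] = now + self.ttl_seconds
        return True


limiter = TtlBlocklist(ttl_seconds=60)
first = limiter.try_submit("203.0.113.7")    # accepted, block starts
second = limiter.try_submit("203.0.113.7")   # rejected while the entry lives
```

Because the key is the bare IP, two people behind the same address share one entry, which is exactly the concern raised in the post.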

Dimitri

10:50 pm on Mar 13, 2020 (gmt 0)

cookie ?

NickMNS

11:00 pm on Mar 13, 2020 (gmt 0)

Dimitri wrote: "cookie ?"

Cookies live on the user's computer and can be modified by the user. Possibly a combination of cookie and IP.
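
One possible way to combine the two, sketched under assumptions: the cookie carries a random visitor id plus an HMAC signature, so a modified cookie is detected and ignored, and the rate-limit key is built from IP plus visitor id. `SERVER_KEY`, `issue_cookie`, `visitor_id_from_cookie` and `rate_limit_key` are hypothetical names, and the key is generated at import time only for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical: a real deployment would load a persistent secret instead.
SERVER_KEY = secrets.token_bytes(32)


def _sign(visitor_id):
    return hmac.new(SERVER_KEY, visitor_id.encode(), hashlib.sha256).hexdigest()


def issue_cookie():
    """Create a random visitor id and sign it so tampering is detectable."""
    visitor_id = secrets.token_hex(16)
    return f"{visitor_id}.{_sign(visitor_id)}"


def visitor_id_from_cookie(cookie):
    """Return the visitor id if the signature checks out, else None."""
    visitor_id, sep, sig = cookie.partition(".")
    if not sep:
        return None
    expected = _sign(visitor_id)
    ok = hmac.compare_digest(sig.encode(), expected.encode())
    return visitor_id if ok else None


def rate_limit_key(ip, cookie):
    """Combine IP and visitor id; use the IP alone if the cookie is missing or forged."""
    visitor_id = visitor_id_from_cookie(cookie) if cookie else None
    return f"{ip}|{visitor_id}" if visitor_id else ip
```

Signing only addresses modification. A visitor who simply discards the cookie gets a different key, so this sketch does not help against that.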

lammert

11:22 pm on Mar 13, 2020 (gmt 0)

Cookies can be deleted, and once your server accepts IPv6 (now or in the future), the IP address is useless too, because each user can switch between billions of addresses.
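
A partial mitigation sometimes used for the IPv6 problem is to rate limit on the network prefix rather than the full address. The sketch below assumes a single visitor controls at most a /64, which is an assumption and not true for every provider; `rate_limit_bucket` is a hypothetical helper built on Python's standard `ipaddress` module.

```python
import ipaddress


def rate_limit_bucket(address, v6_prefix=64):
    """Map a client address to the key used for rate limiting.

    IPv4 addresses are used as-is; IPv6 addresses are collapsed to their
    enclosing network so that hopping within the prefix does not help.
    """
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        return str(ip)
    network = ipaddress.ip_network(f"{ip}/{v6_prefix}", strict=False)
    return str(network)


# Two addresses in the same /64 end up in the same bucket.
a = rate_limit_bucket("2001:db8:1:2:aaaa:bbbb:cccc:dddd")
b = rate_limit_bucket("2001:db8:1:2::1")
```

This widens the shared-address problem from the first post: everyone inside the prefix is limited together.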

To keep malicious players out you need more sophisticated systems like behavioral pattern matching, which I use on a few sites. That only works if a visitor submits data to your site in a manual way. Behavioral pattern matching looks at the real actions of a user (path through the site, time per step, mouse movements, etc.) and is difficult for robots to mimic.

If your site will allow automatic data submission, behavioral pattern matching won't work. In that case, I would look into a system with a public API key and a secret code, where you store only a SHA-512 hash of the secret code in your database. The API key and secret code can be generated anonymously by the users. Having an API key per user also gives you the option to selectively deny (ab)users access to your site.
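
A rough sketch of the API key plus hashed secret idea, assuming the secret is a long random string generated by the server. The dict and set stand in for database tables, and `register`, `authenticate`, `credentials` and `revoked` are hypothetical names.

```python
import hashlib
import hmac
import secrets

credentials = {}   # api_key -> SHA-512 hex digest of the secret (DB stand-in)
revoked = set()    # api keys that have been denied access


def register():
    """Create an anonymous key pair; only the hash of the secret is stored."""
    api_key = secrets.token_urlsafe(16)
    secret = secrets.token_urlsafe(32)
    credentials[api_key] = hashlib.sha512(secret.encode()).hexdigest()
    return api_key, secret


def authenticate(api_key, secret):
    """Check a presented secret against the stored hash for that key."""
    stored = credentials.get(api_key)
    if stored is None or api_key in revoked:
        return False
    candidate = hashlib.sha512(secret.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored)
```

Adding a key to `revoked` is the "selectively deny" step. A plain SHA-512 hash is reasonable here only because the secret is random and high-entropy; user-chosen passwords would normally call for a salted, deliberately slow hash instead.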

NickMNS

4:47 pm on Mar 14, 2020 (gmt 0)

lammert wrote: "That only works if a visitor submits data to your site in a manual way."

Only manual submission is possible.

The "behavioral pattern matching" you describe is rather elaborate, and would likely require more computation than what it is meant to prevent. My goal is not to prevent bots from submitting forms outright, but simply to rate limit all submissions (human and bot) so that they cannot overload the server.

NickMNS

5:14 pm on Mar 17, 2020 (gmt 0)

I've been thinking long and hard about this. I have come up with a scheme that is unconventional, but that I think will provide the protection I need. The basic idea is to stream the server's response and block the submit button until the stream is complete. It is essentially the same as implementing a setTimeout function, but driven from the server.

I will also add a key, a random hash, to the submission. When the form is submitted, the hash and the user's session id must match. When the submit is processed, the server immediately sends back the expected data via a stream. The server then starts a countdown timer (this achieves the rate limiting), and once it has elapsed, a new random hash is sent and the stream is closed. One hash provides the user with one submit, so if the user (bot or otherwise) somehow bypasses the disabled button, the server will not process the request.

Notes:
- Both the server and the client-side JS are using async functions, so waiting on responses should not cause any performance issues on either end.
- Each submit results in a write to the DB and the creation of a new document (record). The goal is to limit these DB writes.
- I am assuming that the matching of session id and hash will be efficient enough not to be a concern.
- I am assuming that a higher-level malicious-bot protection system may be required. I'm leaning on Cloudflare for this.
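
The one-hash-per-submit scheme described above could look something like this on the server side. It is only a sketch under assumptions: `asyncio.sleep` stands in for holding the response stream open, the dict stands in for session storage, and `TokenStore`, `handle_submit` and the `save` callback are hypothetical names.

```python
import asyncio
import hmac
import secrets


class TokenStore:
    """Maps each session id to the single token it may currently spend."""

    def __init__(self):
        self._tokens = {}

    def issue(self, session_id):
        token = secrets.token_hex(16)
        self._tokens[session_id] = token
        return token

    def consume(self, session_id, token):
        """Return True once per issued token; the token is removed on use."""
        expected = self._tokens.get(session_id)
        if expected is None:
            return False
        if not hmac.compare_digest(expected.encode(), token.encode()):
            return False
        del self._tokens[session_id]
        return True


async def handle_submit(store, session_id, token, cooldown_seconds, save):
    """Process one submit, then hold the response open for the cooldown."""
    if not store.consume(session_id, token):
        return None                        # no valid token: skip the DB write
    save()                                 # hypothetical callback doing the DB write
    await asyncio.sleep(cooldown_seconds)  # stands in for keeping the stream open
    return store.issue(session_id)         # next token, sent as the stream closes
```

While the cooldown is running the session holds no token at all, so a request that bypasses the disabled button is rejected before any DB write. A client that starts a fresh session would receive a fresh token, so session creation would need its own limit.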

Any comments are welcome.

ClosedForLunch

12:23 am on Mar 18, 2020 (gmt 0)

In the past I have spent a lot of time coding various anti-abuse constructs for sites I have developed, prior to them going live.

In reality, once the sites were live, the anti-abuse code was rarely, sometimes never, invoked by user behaviour.

My view now is that user behaviour is pretty well always within expected bounds so my anti-abuse code now is initially quite basic. If any behavioural anomalies crop up (by idiots or bots) I will then tweak the code to mitigate the particular abuse.

So, maybe wait and see how your users interact with the live site, and if your server is actually likely to get overloaded, before getting too deep into this right now.