Rate limiting with memcached
7th January 2009
On Monday, several high profile “celebrity” Twitter accounts started spouting nonsense, the victims of stolen passwords. Wired has the full story—someone ran a dictionary attack against a Twitter staff member, discovered their password and used Twitter’s admin tools to reset the passwords on the accounts they wanted to steal.
The Twitter incident got me thinking about rate limiting again. I’ve been wanting a good general solution to this problem for quite a while, for API projects as well as security. Django Snippets has an answer, but it works by storing access information in the database and requires you to run a periodic purge command to clean up the old records.
I’m strongly averse to writing to the database for every hit. For most web applications reads scale easily, but writes don’t. I also want to avoid filling my database with administrative gunk (I dislike database backed sessions for the same reason). But rate limiting relies on storing state, so there has to be some kind of persistence.
Using memcached counters
I think I’ve found a solution, thanks to memcached and in particular the incr command. incr lets you atomically increment an already existing counter, simply by specifying its key. add can be used to create that counter—it will fail silently if the provided key already exists.
Let’s say we want to limit a user to 10 hits every minute. A naive implementation would be to create a memcached counter for hits from that user’s IP address in a specific minute. The counter key might look like this:
ratelimit_72.26.203.98_2009-01-07-21:45
Increment that counter for every hit, and if it exceeds 10 block the request.
What if the user makes ten requests all in the last second of the minute, then another ten a second later? The rate limiter will let them off. For many cases this is probably acceptable, but we can improve things with a slightly more complex strategy. Let’s say we want to allow up to 30 requests every five minutes. Instead of maintaining one counter, we can maintain five—one for each of the past five minutes (older counters than that are allowed to expire). After a few minutes we might end up with counters that look like this:
ratelimit_72.26.203.98_2009-01-07-21:45 = 13 ratelimit_72.26.203.98_2009-01-07-21:46 = 7 ratelimit_72.26.203.98_2009-01-07-21:47 = 11
Now, on every request we work out the keys for the past five minutes and use get_multi to retrieve them. If the sum of those counters exceeds the maximum allowed for that time period, we block the request.
Are there any obvious flaws to this approach? I’m pretty happy with it—it cleans up after itself (old counters quietly expire from the cache), it shouldn’t use much resources (just five active cache keys per unique IP address at any one time) and if the cache is lost the only snag is that a few clients might go slightly over their rate limit. I don’t think it’s possible for an attacker to force the counters to expire early.
An implementation for Django
I’ve put together an example implementation of this algorithm using Django, hosted on GitHub. The readme.txt file shows how it works—basic usage is via a simple decorator:
from ratelimitcache import ratelimit
@ratelimit(minutes = 3, requests = 20)
def myview(request):
# ...
return HttpResponse('...')
Python decorators are typically functions, but ratelimit
is actually a class. This means it can be customised by subclassing it, and the class provides a number of methods designed to be over-ridden. I’ve provided an example of this in the module itself—ratelimit_post, a decorator which only limits on POST requests and can optionally couple the rate limiting to an individual POST field. Here’s the complete implementation:
class ratelimit_post(ratelimit):
"Rate limit POSTs - can be used to protect a login form"
key_field = None # If provided, this POST var will affect the rate limit
def should_ratelimit(self, request):
return request.method == 'POST'
def key_extra(self, request):
# IP address and key_field (if it is set)
extra = super(ratelimit_post, self).key_extra(request)
if self.key_field:
value = sha.new(request.POST.get(self.key_field, '')).hexdigest()
extra += '-' + value
return extra
And here’s how you would use it to limit the number of times a specific IP address can attempt to log in as a particular user:
@ratelimit_post(minutes = 3, requests = 10, key_field = 'username')
def login(request):
# ...
return HttpResponse('...')
The should_ratelimit()
method is called before any other rate limiting logic. The default implementation returns True, but here we only want to apply rate limits to POST requests. The key_extra()
method is used to compose the keys used for the counter—by default this just includes the request’s IP address, but in ratelimit_post
we can optionally include the value of a POST field (for example the username). We could include things like the request path here to apply different rate limit counters to different URLs.
Finally, the readme.txt includes ratelimit_with_logging
, an example that over-rides the disallowed()
view returned when a rate limiting condition fails and writes an audit note to a database (less overhead than writing for every request).
I’ve been a fan of customisation via subclassing ever since I got to know the new Django admin system, and I’ve been using it in a bunch of projects. It’s a great way to create reusable pieces of code.
More recent articles
- Six short video demos of LLM and Datasette projects - 22nd January 2025
- DeepSeek-R1 and exploring DeepSeek-R1-Distill-Llama-8B - 20th January 2025
- My AI/LLM predictions for the next 1, 3 and 6 years, for Oxide and Friends - 10th January 2025