I don't remember where I read it, but I now understand the statement that an API without rate limiting is an exception generator waiting to go off. The lesson hit home about a week ago, when I noticed increased load on my server and got several notifications from my uptime monitor. Here's what happened, and what I did to prevent it from happening again.


Digging into the access logs, I could see large bursts of API requests for an expensive resource. Since I'm behind CloudFlare, the first thing I checked was whether they offered any kind of rate limiting. They do, although not for free, so I decided to see what I could build with Nginx.

Rate limiting with Nginx behind CloudFlare

Since CloudFlare proxies all my requests, the IP address Nginx sees will actually be one of theirs, not the user's. Rate limiting based on these addresses is therefore probably not what you want. The fix was pretty easy, though: use the X-Forwarded-For header that CloudFlare provides.

To the point: adding this to your /etc/nginx/nginx.conf will create a rate-limited zone you can reference from your config files.

http {
  limit_req_zone $http_x_forwarded_for zone=yourzone:10m rate=5r/s;
  ...
}

It basically says that requests in this zone are limited to 5 requests per second, and that we'll use a 10MB data structure to keep track of things.

Next, add this to the location block you want to limit.

server {
  location / {
    limit_req zone=yourzone burst=10;
    ...
  }
}

You can find more info about these directives in the Nginx documentation for the limit_req module.
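Pieced together, the relevant parts of the config end up looking something like this (zone name, rate, and burst are the example values from above; the elided parts are whatever the rest of your config already contains):

```
http {
  limit_req_zone $http_x_forwarded_for zone=yourzone:10m rate=5r/s;

  server {
    location / {
      limit_req zone=yourzone burst=10;
    }
  }
}
```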

Test it first

Make sure to test your rate limiting before shipping it. It could cause unexpected issues in your app or for other API clients.
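To get a feel for what clients will see before you ship, it can help to reason through the limiter's behaviour. Nginx's limit_req uses a leaky-bucket scheme; here's a rough Python simulation of the accept/reject logic (my approximation, not Nginx's actual code):

```python
def simulate_limit_req(timestamps, rate=5.0, burst=10):
    """Rough simulation of nginx limit_req (leaky bucket with burst).

    timestamps: request arrival times in seconds, in order.
    Returns a list of True (allowed) / False (rejected, 503) per request.
    """
    excess = 0.0   # how far ahead of the steady rate the client is
    last = None
    verdicts = []
    for t in timestamps:
        if last is not None:
            # The bucket drains at `rate` requests per second.
            excess = max(excess - (t - last) * rate, 0.0)
        last = t
        if excess > burst:       # bucket full: reject this request
            verdicts.append(False)
        else:                    # accept (nginx may delay it, but it's served)
            excess += 1.0
            verdicts.append(True)
    return verdicts

# 20 simultaneous requests with rate=5r/s, burst=10:
# the first 11 get through, the remaining 9 are rejected.
print(sum(simulate_limit_req([0.0] * 20)))  # → 11
```

A burst of 20 simultaneous requests thus gets 9 rejections, while clients that stay at or below 5 requests per second are never rejected.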

Rate limiting based on user contexts

In addition to this, you can also add rate limiting at the application layer and limit requests per user etc. One common way to do this is the token bucket algorithm.
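Here's a minimal sketch of a per-user token bucket in Python (the class and function names are mine, not from any particular library): each user gets a bucket holding up to `capacity` tokens, refilled at `rate` tokens per second, and each request spends one token.

```python
import time

class TokenBucket:
    """Token bucket: up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# One bucket per user, created lazily on first request.
buckets = {}

def allow_request(user_id, rate=5.0, capacity=10):
    bucket = buckets.setdefault(user_id, TokenBucket(rate, capacity))
    return bucket.allow()
```

In a real app you'd key the buckets on whatever identifies a user (session, API key), and in a multi-process setup you'd keep the state somewhere shared, such as Redis, rather than in a module-level dict.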