Best Express Rate Limiting Packages 2026

TL;DR

Use express-rate-limit for the standard case: inbound rate limiting on an Express API. Use rate-limiter-flexible when you need Redis-backed distributed rate limiting, key blocking, or multi-tier limits. Use Bottleneck for outbound request throttling. Most Express applications need only express-rate-limit: it covers IP-based limits in about 10 lines of code with no infrastructure dependencies. If you're running multiple Express servers behind a load balancer and need limits to be shared across instances, you need a shared store: either express-rate-limit with its Redis store adapter or rate-limiter-flexible with Redis. Bottleneck is a different category: it throttles your app's outbound calls to external APIs, not inbound requests.

Quick Comparison

	express-rate-limit v7	rate-limiter-flexible v5	Bottleneck v2
Weekly Downloads	~4M	~1.8M	~1.2M
GitHub Stars	~11K	~5K	~2K
Bundle Size	~15KB	~40KB	~22KB
Redis Support	Via external store	Built-in	Built-in
Distributed (multi-server)	Needs store adapter	Yes	Yes (queue)
Primary Use Case	Inbound API limits	Inbound + distributed	Outbound throttling
Window Algorithm	Fixed window	Fixed window (optional even spacing via execEvenly)	Reservoir and minimum spacing
Per-User Limits	Yes (custom key)	Yes	Yes
TypeScript	Yes	Yes	Yes
License	MIT	ISC	MIT

express-rate-limit: The Standard Choice

express-rate-limit is the default npm package for Express rate limiting. It's used in over 150,000 public repositories and covers the most common use case — IP-based request throttling — in minimal code:

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  limit: 100,               // max 100 requests per window
  standardHeaders: 'draft-8', // RateLimit headers per the IETF draft (RFC 6585 only defines status 429)
  legacyHeaders: false,
  // Optional: custom message
  message: { error: 'Too many requests, please try again later.' },
});

app.use('/api', limiter);

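For most single-server applications, this is all you need. The response headers tell clients how much quota remains and how many seconds are left until the window resets. With 'draft-8', the values are combined into a single RateLimit header next to RateLimit-Policy; a throttled (429) response also carries Retry-After. The headers look roughly like this (abridged):

RateLimit-Policy: "100-in-15min"; q=100; w=900
RateLimit: "100-in-15min"; r=47; t=847

With 'draft-6', the same information is sent as separate RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers, where RateLimit-Reset is a number of seconds rather than a timestamp.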
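On the client side, a Retry-After value can be turned into a wait time before the next attempt. The helper below is a minimal sketch, not part of express-rate-limit; the name retryDelayMs is hypothetical, and it assumes the header carries either a number of seconds or an HTTP-date, the two forms the HTTP specification allows.

```javascript
// Hypothetical client-side helper: convert a Retry-After header value
// (delay in seconds, or an HTTP-date) into a wait time in milliseconds.
function retryDelayMs(retryAfter, now = Date.now()) {
  if (retryAfter == null || retryAfter === '') return 0;

  // Form 1: non-negative integer number of seconds, e.g. "847"
  const trimmed = String(retryAfter).trim();
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // Form 2: HTTP-date, e.g. "Mon, 13 Apr 2026 15:30:00 GMT"
  const date = Date.parse(trimmed);
  if (Number.isNaN(date)) return 0; // unparseable: no hint to act on
  return Math.max(0, date - now);
}

console.log(retryDelayMs('847')); // 847000
```

A fetch-based client could call retryDelayMs(response.headers.get('retry-after')) after a 429 response and sleep for that long before retrying.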

Custom Keys: Per-User and Per-Route Limits

The keyGenerator option lets you rate-limit by anything — user ID, API key, or a combination:

const apiKeyLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  limit: 60,
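  keyGenerator: (req) => {
    // Limit by API key when present, fall back to IP.
    // Only trust the header after the key has been validated; otherwise a
    // client can send a new random key on every request and never be limited.
    return req.headers['x-api-key'] || req.ip;
  },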
  skip: (req) => {
    // Don't rate limit internal health checks
    return req.path === '/health';
  },
});

// More aggressive limit for auth endpoints
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 10, // only 10 login attempts per 15 min
  keyGenerator: (req) => req.ip,
});

app.post('/auth/login', authLimiter, loginHandler);
app.post('/auth/register', authLimiter, registerHandler);
app.use('/api', apiKeyLimiter);

The Distributed Limitation

express-rate-limit's default memory store is in-process: each running instance has its own counter. If you deploy three Express servers behind a load balancer, each tracks limits independently, so a client whose requests are spread evenly across them can make up to three times the intended number of requests.
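The effect is easy to reproduce without Express. The sketch below is a simplified stand-in for an in-process store (createCounter is a hypothetical helper, not the library's MemoryStore), with three independent counters behind a round-robin balancer.

```javascript
// Minimal in-process counter, standing in for a per-instance memory store.
function createCounter(limit) {
  const hits = new Map();
  return (key) => {
    const count = (hits.get(key) ?? 0) + 1;
    hits.set(key, count);
    return count <= limit; // true = request allowed
  };
}

// Three "servers", each with its own private counter (limit: 100).
const servers = [createCounter(100), createCounter(100), createCounter(100)];

// One client sends 400 requests; a round-robin balancer spreads them out.
let allowed = 0;
for (let i = 0; i < 400; i++) {
  if (servers[i % servers.length]('203.0.113.7')) allowed++;
}

console.log(allowed); // 300: each instance admitted its own 100
```

With a shared store, all three instances would increment the same counter and the total would stay at 100.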

For single-server deployments, this is fine. For distributed setups, you need a shared store:

import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';

const client = createClient({ url: process.env.REDIS_URL });
await client.connect();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 100,
  store: new RedisStore({
    sendCommand: (...args) => client.sendCommand(args),
  }),
});

The rate-limit-redis adapter works well, but at this point you're adding Redis infrastructure anyway — which is when rate-limiter-flexible becomes worth evaluating.

rate-limiter-flexible: Distributed and Production-Grade

rate-limiter-flexible is designed for production multi-server setups from the start. It supports Redis, Memcached, MongoDB, and Postgres as backing stores, and it adds controls that express-rate-limit lacks out of the box, such as blocking a key for a set duration once it exceeds its limit:

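import { RateLimiterRedis } from 'rate-limiter-flexible';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect(); // the client must be connected before first use

const rateLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  useRedisPackage: true, // assumed here: flag for the node-redis (`redis`) client; check the docs for your version
  keyPrefix: 'api',
  points: 100,       // 100 requests
  duration: 900,     // per 15 minutes
  blockDuration: 60, // block for 1 minute if exceeded
});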

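// Express middleware:
const rateLimiterMiddleware = async (req, res, next) => {
  try {
    await rateLimiter.consume(req.ip);
    next();
  } catch (rejRes) {
    // A store failure (e.g. Redis down) rejects with an Error, not a limit result
    if (rejRes instanceof Error) return next(rejRes);
    // Limit exceeded: msBeforeNext is the time until points are available again
    const secs = Math.ceil(rejRes.msBeforeNext / 1000) || 1;
    res.set('Retry-After', String(secs));
    res.status(429).json({ error: 'Too many requests' });
  }
};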

app.use('/api', rateLimiterMiddleware);

Window Boundaries and Burst Control

express-rate-limit counts requests in fixed windows: if the window is 15 minutes and a client sends 100 requests just before it ends, the counter resets when the next window starts and the client can send 100 more immediately. That allows a burst of up to 200 requests within about a minute at a window boundary.

rate-limiter-flexible also counts in fixed windows (its README cites speed as the reason for preferring them to rolling windows), so the same configuration is subject to the same boundary effect:

import { RateLimiterRedis } from 'rate-limiter-flexible';

const rateLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  points: 100,
  duration: 900,
  // Fixed window: the counter for a key expires `duration` seconds
  // after that key's first request in the window
});

What it adds is a set of tools for containing the burst: blockDuration keeps an abusive key locked out after it exceeds the limit, the execEvenly option delays requests so they are spread across the window, and a short-window limiter can be stacked on top of a long-window one so that only a few requests per second get through. A true sliding window, in which the limit applies to every rolling 15-minute period, requires a different algorithm such as a sliding log or a sliding-window counter.
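The boundary burst can be reproduced in a few lines of plain JavaScript. The sketch below is not taken from either library: fixedWindow and slidingLog are hypothetical, simplified limiters written only to compare how many of 200 requests get through when they arrive on both sides of a window boundary.

```javascript
// Fixed window: one counter per aligned window of `windowMs`.
function fixedWindow(limit, windowMs) {
  let windowStart = -Infinity;
  let count = 0;
  return (t) => {
    const start = Math.floor(t / windowMs) * windowMs;
    if (start !== windowStart) {
      windowStart = start; // a new window begins: reset the counter
      count = 0;
    }
    count += 1;
    return count <= limit;
  };
}

// Sliding log: remember accepted timestamps from the last `windowMs`.
function slidingLog(limit, windowMs) {
  let log = [];
  return (t) => {
    log = log.filter((ts) => ts > t - windowMs); // drop expired entries
    if (log.length >= limit) return false;
    log.push(t);
    return true;
  };
}

const WINDOW = 15 * 60 * 1000;
// 100 requests one second before a boundary, 100 more one second after it.
const times = [
  ...Array(100).fill(WINDOW - 1000),
  ...Array(100).fill(WINDOW + 1000),
];

const countAllowed = (limiter) => times.filter((t) => limiter(t)).length;

const fixedAllowed = countAllowed(fixedWindow(100, WINDOW));
const slidingAllowed = countAllowed(slidingLog(100, WINDOW));
console.log(fixedAllowed, slidingAllowed); // 200 100
```

The sliding log is exact but stores one timestamp per accepted request, which is the memory and speed cost that fixed-window counters avoid.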

Multi-Tier Rate Limiting

rate-limiter-flexible supports stacking multiple limiters for tiered enforcement — for example, a per-second burst limit combined with a per-day sustained limit:

import { RateLimiterRedis, RateLimiterUnion } from 'rate-limiter-flexible';

const perSecond = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'per_second',
  points: 5,      // 5 requests per second (burst)
  duration: 1,
});

const perDay = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'per_day',
  points: 10000,  // 10,000 per day (sustained)
  duration: 86400,
});

const combined = new RateLimiterUnion(perSecond, perDay);

app.use('/api', async (req, res, next) => {
  try {
    await combined.consume(req.ip);
    next();
  } catch {
    res.status(429).json({ error: 'Rate limit exceeded' });
  }
});

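This pattern is common in public APIs that need both burst protection and daily usage caps. With express-rate-limit you stack two separate middleware instances; rate-limiter-flexible's RateLimiterUnion checks both limiters in a single consume call. The limiters are still consumed independently rather than atomically, so a request rejected by one limiter can still use up a point in the other.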
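The stacking idea itself is small enough to sketch without a library. The code below is an illustration under stated assumptions, not RateLimiterUnion's implementation: createTier and consumeAll are hypothetical names, the tiers are simple in-memory fixed-window counters, and every tier is charged on every request.

```javascript
// Fixed-window tier: one counter per key per window.
function createTier(name, limit, windowMs) {
  const state = new Map(); // key -> { windowStart, count }
  return {
    name,
    consume(key, now) {
      const windowStart = Math.floor(now / windowMs) * windowMs;
      let entry = state.get(key);
      if (!entry || entry.windowStart !== windowStart) {
        entry = { windowStart, count: 0 };
        state.set(key, entry);
      }
      entry.count += 1;
      return entry.count <= limit;
    },
  };
}

// A request passes only if every tier accepts it.
// filter() visits all tiers, so each one is charged even when another rejects.
function consumeAll(tiers, key, now) {
  const rejectedBy = tiers
    .filter((tier) => !tier.consume(key, now))
    .map((tier) => tier.name);
  return { allowed: rejectedBy.length === 0, rejectedBy };
}

const tiers = [
  createTier('per_second', 5, 1000),
  createTier('per_day', 10000, 86400 * 1000),
];

// Seven requests in the same second: the burst tier stops the last two.
const results = Array.from({ length: 7 }, () => consumeAll(tiers, '203.0.113.7', 0));
console.log(results.filter((r) => r.allowed).length); // 5
console.log(results[6].rejectedBy); // [ 'per_second' ]
```

Reporting which tier rejected the request lets the response distinguish "slow down for a second" from "daily quota exhausted".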

Bottleneck: Outbound Rate Limiting

Bottleneck solves a different problem: controlling how fast your application sends requests to external services. If you're calling a third-party API that enforces rate limits, Bottleneck prevents you from exceeding their limits rather than limiting who can call you.

import Bottleneck from 'bottleneck';

// GitHub API allows 5,000 requests/hour for authenticated users
const limiter = new Bottleneck({
  maxConcurrent: 10,         // max 10 in-flight requests at once
  minTime: 720,              // minimum 720ms between requests (~83/minute)
  reservoir: 5000,           // initial bucket of 5000 requests
  reservoirRefreshAmount: 5000,
  reservoirRefreshInterval: 60 * 60 * 1000, // refill hourly
});

// Wrap any async function:
const fetchGitHubRepo = limiter.wrap(async (owner, repo) => {
  const response = await fetch(`https://api.github.com/repos/${owner}/${repo}`);
  return response.json();
});

// These calls are automatically throttled:
const results = await Promise.all([
  fetchGitHubRepo('facebook', 'react'),
  fetchGitHubRepo('vercel', 'next.js'),
  fetchGitHubRepo('tailwindlabs', 'tailwindcss'),
  // ... 100 more — Bottleneck queues them automatically
]);
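The minTime value above follows directly from the quota: one hour divided by 5,000 requests. A small helper makes the arithmetic explicit; minTimeFor is a hypothetical name for illustration, not part of Bottleneck.

```javascript
// Spacing in milliseconds that spreads `requests` evenly across `intervalMs`.
// Rounds up so the resulting rate never exceeds the quota.
function minTimeFor(requests, intervalMs) {
  return Math.ceil(intervalMs / requests);
}

const HOUR = 60 * 60 * 1000;
const spacing = minTimeFor(5000, HOUR);

console.log(spacing);                     // 720
console.log(Math.floor(60000 / spacing)); // 83 requests per minute
```

The result can be passed as minTime when the upstream quota changes, for example if a provider documents a different hourly limit for a different plan.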

Bottleneck also supports Redis for distributed outbound throttling — useful when you have multiple servers all calling the same external API:

const limiter = new Bottleneck({
  maxConcurrent: 5,
  minTime: 200,
  id: 'github-api',            // shared queue ID
  datastore: 'ioredis',        // distributed queue via Redis
  clearDatastore: false,
  clientOptions: {
    host: process.env.REDIS_HOST,
    port: 6379,
  },
});

If you're scraping data, calling LLM APIs, or hitting payment processors from multiple workers, Bottleneck is the right tool. It's not a replacement for inbound rate limiting — it's a complement to it.

When to Use Which

Choose express-rate-limit when:

Single-server Express API with standard IP-based rate limits
Quick setup with minimal configuration
You don't need Redis yet (or you can add the Redis store adapter later)
Protecting specific routes from brute force (auth endpoints, password reset)

Choose rate-limiter-flexible when:

Multi-server or load-balanced deployment requiring shared state
You need to block keys that exceed their limit, or to spread requests evenly across a window
Multi-tier limits (burst + sustained) on the same endpoint
You're already using Redis and want a single package that handles all store types
Building a public API with strict SLA guarantees on rate limits

Choose Bottleneck when:

Throttling your app's outbound requests to external APIs
Controlling concurrency and request queuing
Respecting third-party API rate limits from multiple workers
Any use case involving "how fast can we call this service"

Production Decision Checklist

Requirement	Package to start with	Implementation note
Basic per-IP throttling for one Express server	express-rate-limit	Use `standardHeaders`, disable legacy headers, and keep auth routes on a stricter limiter than general API routes.
Per-user or per-API-key limits	express-rate-limit	Provide a `keyGenerator` that prefers authenticated user ID or API key, then falls back to IP only for anonymous traffic.
Multiple API servers behind a load balancer	express-rate-limit + Redis store or rate-limiter-flexible	In-memory counters are not shared. Add Redis before scaling horizontally.
Burst plus sustained quota enforcement	rate-limiter-flexible	Its multi-limiter patterns are better when a public API needs per-second, per-minute, and per-day policies.
Throttling calls your server makes to another API	Bottleneck	This is outbound concurrency and queue management, not inbound abuse protection.
Bot attacks, credential stuffing, or abuse investigations	rate-limiter-flexible plus app-level telemetry	You need block duration, risk-based keys, and structured events; middleware alone is not an abuse program.

The critical choice is not the npm download count. It is whether the counter state must survive process restarts and whether every server instance must see the same quota. If yes, choose a Redis-backed design from the start.

Implementation Checklist

Define separate limits for anonymous reads, authenticated writes, auth endpoints, and webhook callbacks.
Use a stable key: user ID for logged-in users, API key for programmatic clients, IP only as a fallback.
Return modern RateLimit-* and Retry-After headers so clients can back off gracefully.
Exempt health checks, internal cron probes, and trusted webhooks only after authentication or signature verification.
Log limit hits with route, key type, and user/account context; never log raw secrets or full API keys.
Add Redis before horizontal scaling, or the same user will receive a separate quota on each instance.
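The "stable key" item can be sketched as a small function suitable for a keyGenerator option. This is a minimal sketch under assumptions: rateLimitKey and keyType are hypothetical names, req.user is assumed to be set by your own authentication middleware, and the API key is assumed to have been validated before this runs.

```javascript
// Prefer user ID, then API key, then IP. The prefix records the key type,
// so a user ID can never collide with an IP address or an API key.
function rateLimitKey(req) {
  if (req.user?.id != null) return `user:${req.user.id}`;
  const apiKey = req.headers?.['x-api-key'];
  if (typeof apiKey === 'string' && apiKey !== '') return `key:${apiKey}`;
  return `ip:${req.ip}`;
}

// For logging: report only the key type, never the key material itself.
function keyType(key) {
  return key.slice(0, key.indexOf(':'));
}

const anonymous = { headers: {}, ip: '203.0.113.7' };
console.log(rateLimitKey(anonymous));          // ip:203.0.113.7
console.log(keyType(rateLimitKey(anonymous))); // ip
```

Logging keyType(key) together with the route covers the "key type" field from the checklist without writing an API key to the logs.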

For adjacent Express security decisions, see helmet vs cors vs express-rate-limit: Express Security Packages 2026 and bcrypt vs argon2 vs scrypt password hashing.

Practical Setup: express-rate-limit + Redis for Production

The most common production setup combines express-rate-limit's familiar API with the Redis store for distributed enforcement:

import express from 'express';
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';

const app = express();
const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 500,
  standardHeaders: 'draft-8',
  legacyHeaders: false,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
});

const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 10,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
    prefix: 'auth:', // keeps auth counters separate from the general API counters
  }),
});

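app.use('/api', globalLimiter);
app.post('/auth/login', authLimiter, loginHandler);       // loginHandler: your own route handler
app.post('/auth/register', authLimiter, registerHandler); // registerHandler: your own route handler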

This pattern scales to most production Express applications without requiring rate-limiter-flexible's additional complexity. Graduate to rate-limiter-flexible if you need key blocking, multi-tier enforcement, or abuse-response workflows that block keys for progressively longer periods.
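Blocking keys for progressively longer periods comes down to an escalation policy. The sketch below shows one possible policy, assuming a doubling schedule with a cap; blockSecondsFor and its default values are hypothetical choices for illustration, not something either library prescribes.

```javascript
// Hypothetical escalation policy: 60s for the first strike,
// doubling with each further strike, capped at 24 hours.
function blockSecondsFor(strikes, baseSeconds = 60, capSeconds = 24 * 60 * 60) {
  if (strikes <= 0) return 0;
  return Math.min(capSeconds, baseSeconds * 2 ** (strikes - 1));
}

console.log([1, 2, 3, 4].map((n) => blockSecondsFor(n))); // [ 60, 120, 240, 480 ]
console.log(blockSecondsFor(20)); // 86400
```

With rate-limiter-flexible, the strike count could itself be tracked by a second limiter keyed on the same client, and the computed duration passed to the limiter's block(key, seconds) method.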

Final Verdict for Express APIs

For a normal Express API, start with express-rate-limit and configure it carefully. The package is boring in the best way: low setup cost, recognizable middleware, clear headers, and enough extension points for user/API-key based limits. Add the Redis store before horizontal scaling.

Choose rate-limiter-flexible when rate limiting is part of the product surface — paid plans, public API quotas, login-defense workflows, or multi-region abuse controls. Choose Bottleneck when your Express app is the client and needs to avoid exhausting GitHub, Stripe, OpenAI, or another upstream provider's quota.
