TL;DR
p-limit is the go-to for simple concurrency limits: run up to N async functions at once. p-queue layers a full priority queue on the same concurrency-limiting idea: tasks can be ordered, prioritized, and paused/resumed. Bottleneck is the most powerful option: it handles rate limiting (requests per second), reservoir refills, and cluster-wide limits via Redis. For limiting concurrent API calls in a batch, use p-limit. For a job queue with priorities, use p-queue. For rate limiting that respects external service limits, use Bottleneck.
Key Takeaways
- p-limit: ~100M weekly downloads — simplest API, limit N concurrent promises
- p-queue: ~20M weekly downloads — priority queue, pause/resume, concurrency + rate limit
- bottleneck: ~3M weekly downloads — reservoir-based rate limiting, Redis cluster support
- p-limit is the default for batch processing (e.g., fetch 100 URLs, max 5 at a time)
- p-queue adds ordering and priority when processing matters beyond raw concurrency
- Bottleneck for respecting external API rate limits (GitHub: 5000 req/hr, npm: 100 req/min)
Download Trends
| Package | Weekly Downloads | Approach | Priority Queue | Rate Limit | Redis |
|---|---|---|---|---|---|
| p-limit | ~100M | Concurrency cap | ❌ | ❌ | ❌ |
| p-queue | ~20M | Priority queue | ✅ | ✅ | ❌ |
| bottleneck | ~3M | Reservoir | ✅ per-job priority | ✅ Excellent | ✅ |
p-limit
p-limit is the simplest concurrency control — wraps async functions to run at most N simultaneously.
Basic usage
```ts
import pLimit from "p-limit"

const limit = pLimit(5) // Max 5 concurrent

// Batch fetch with concurrency limit:
const packageNames = ["react", "vue", "angular", "svelte", "solid-js", "qwik", "preact", "lit"]

const results = await Promise.all(
  packageNames.map((name) =>
    // Without pLimit: all 8 fetch calls start simultaneously
    // With pLimit(5): only 5 run at once, next starts when one completes
    limit(() => fetchPackageData(name))
  )
)
```
Practical: limiting concurrent API calls
```ts
import pLimit from "p-limit"

// GitHub API allows 5000 requests/hr when authenticated
// Use pLimit to avoid hammering the API:
const limit = pLimit(10) // 10 concurrent requests max

async function fetchGithubStats(repos: string[]) {
  return Promise.all(
    repos.map((repo) =>
      limit(async () => {
        const response = await fetch(`https://api.github.com/repos/${repo}`, {
          headers: { Authorization: `Bearer ${process.env.GITHUB_TOKEN}` },
        })
        if (response.status === 403) {
          // Rate limited — read the Retry-After header:
          const retryAfter = parseInt(response.headers.get("retry-after") || "60", 10)
          await new Promise((r) => setTimeout(r, retryAfter * 1000))
          // Retry... (use a retry library like p-retry for this)
        }
        return response.json()
      })
    )
  )
}
```
Multiple limiters for different resources
```ts
import pLimit from "p-limit"

// Different limits for different resources:
const dbLimit = pLimit(20) // Database: 20 concurrent queries
const npmApiLimit = pLimit(5) // npm API: 5 concurrent
const fileLimit = pLimit(3) // File system: 3 concurrent

// Use the appropriate limiter per operation. Note: each group needs
// its own Promise.all so the destructured results stay separate:
const [dbData, npmData, files] = await Promise.all([
  Promise.all(packages.map((p) => dbLimit(() => db.query(p)))),
  Promise.all(packages.map((p) => npmApiLimit(() => fetchNpmData(p)))),
  Promise.all(packages.map((p) => fileLimit(() => readCacheFile(p)))),
])
```
Queue status
```ts
import pLimit from "p-limit"

const limit = pLimit(3)

// Check how many are queued/active:
console.log(limit.activeCount) // Currently running
console.log(limit.pendingCount) // Waiting to run

limit.clearQueue() // Cancel all pending (won't stop active tasks)
```
p-queue
p-queue is a full-featured priority queue — useful when you need ordering, priorities, or pause/resume capabilities.
Basic usage
```ts
import PQueue from "p-queue"

const queue = new PQueue({ concurrency: 5 })

// Add tasks to the queue:
queue.add(() => fetchPackageData("react"))
queue.add(() => fetchPackageData("vue"))

// Add with priority (higher = runs first):
queue.add(() => fetchPackageData("critical-package"), { priority: 10 })
queue.add(() => fetchPackageData("low-priority"), { priority: 0 })
queue.add(() => fetchPackageData("normal"), { priority: 5 })

// Wait for all tasks to complete:
await queue.onIdle()
console.log("All tasks complete")
```
Rate limiting (per interval)
```ts
import PQueue from "p-queue"

const queue = new PQueue({
  concurrency: 1, // 1 at a time
  intervalCap: 10, // Max 10 per interval
  interval: 1000, // 1 second interval
  carryoverConcurrencyCount: true, // Unfinished tasks count against the next interval's cap
})

// Now: max 10 requests per second, 1 at a time
// Great for APIs with rate limits like 600/min or 10/sec

// Batch 100 npm API calls at 10/sec:
const packageNames = Array.from({ length: 100 }, (_, i) => `package-${i}`)
const results: unknown[] = []

packageNames.forEach((name) => {
  queue.add(async () => {
    const data = await fetchNpmPackage(name)
    results.push(data)
  })
})

await queue.onIdle()
// Takes ~10 seconds: 100 packages at 10 per second
```
Pause, resume, and size
```ts
import PQueue from "p-queue"

const queue = new PQueue({ concurrency: 3 })

// Add work:
for (let i = 0; i < 50; i++) {
  queue.add(() => processPackage(i))
}

// Pause (currently active tasks continue, new ones don't start):
queue.pause()
console.log("Queue paused. Waiting:", queue.size)

// Do something...
await doMaintenanceTask()

// Resume:
queue.start()
console.log("Queue resumed")

// Progress monitoring. Note the terminology: queue.pending is the
// number of tasks currently running, queue.size is the number waiting:
queue.on("active", () => {
  console.log(`Working on item. Running: ${queue.pending}. Waiting: ${queue.size}`)
})

queue.on("idle", () => {
  console.log("Queue is idle — all done!")
})

queue.on("error", (err) => {
  console.error("Task failed:", err)
})

// Wait until the queue shrinks:
await queue.onSizeLessThan(5) // Resolves when fewer than 5 items are waiting
console.log("Almost done!")
```
Timeout per task
```ts
import PQueue from "p-queue"

const queue = new PQueue({
  concurrency: 5,
  timeout: 5000, // Each task must complete within 5 seconds
  throwOnTimeout: true, // Reject the task's promise if it times out
})

queue.add(async () => {
  // This will throw if it takes more than 5 seconds:
  return await slowOperation()
})
```
Bottleneck
Bottleneck uses a token/reservoir model — perfect for respecting external API rate limits exactly.
Basic rate limiting
```ts
import Bottleneck from "bottleneck"

// GitHub API: 5000 requests per hour
// = 5000 / 3600 ≈ 1.39 per second
const limiter = new Bottleneck({
  maxConcurrent: 10,
  minTime: Math.ceil(3600000 / 5000), // 720ms between requests
  // This paces requests to an average of 5000/hour
})

const wrappedFetch = limiter.wrap(async (url: string) => {
  const response = await fetch(url)
  return response.json()
})

// Now calls are automatically rate-limited:
const data1 = await wrappedFetch("https://api.github.com/repos/facebook/react")
const data2 = await wrappedFetch("https://api.github.com/repos/vuejs/core")
// Automatically spaced to respect rate limits
```
Reservoir (token bucket)
```ts
import Bottleneck from "bottleneck"

// npm registry: 100 requests per minute
const limiter = new Bottleneck({
  reservoir: 100, // Start with 100 tokens
  reservoirRefreshAmount: 100, // Reset to 100 tokens...
  reservoirRefreshInterval: 60 * 1000, // ...every 60 seconds
  maxConcurrent: 20,
})

// Queue 200 requests — Bottleneck handles the pacing:
const packageNames = Array.from({ length: 200 }, (_, i) => `npm-package-${i}`)

const results = await Promise.all(
  packageNames.map((name) =>
    limiter.schedule(() => fetchNpmPackage(name))
  )
)
// Takes ~2 minutes: 200 packages at 100 per minute
```
Cluster-wide rate limiting with Redis
```ts
import Bottleneck from "bottleneck"

// Shared rate limiting across multiple server instances.
// The ioredis package must be installed; Bottleneck creates
// the Redis client from clientOptions:
const limiter = new Bottleneck({
  id: "npm-api-limiter", // Unique ID for this limiter
  maxConcurrent: 10,
  minTime: 100,
  // Redis datastore — shared across all instances:
  datastore: "ioredis",
  clearDatastore: false,
  clientOptions: {
    host: process.env.REDIS_HOST,
    port: 6379,
  },
})

// All server instances share one rate limit counter in Redis
// No more "we have 5 servers and hit the API limit 5x faster" problems
```
Events and monitoring
```ts
import Bottleneck from "bottleneck"

const limiter = new Bottleneck({
  maxConcurrent: 5,
  reservoir: 1000,
  reservoirRefreshAmount: 1000,
  reservoirRefreshInterval: 3600000, // 1 hour
  trackDoneStatus: true, // Required for the DONE count below
})

// Monitor reservoir level:
limiter.on("depleted", (empty) => {
  if (empty) {
    console.warn("API rate limit reservoir empty — requests queuing")
  }
})

limiter.on("error", (error) => {
  console.error("Bottleneck error:", error)
})

// Get current counts (synchronous):
const { RUNNING, QUEUED, DONE } = limiter.counts()
console.log(`Running: ${RUNNING}, Queued: ${QUEUED}, Done: ${DONE}`)

// Check remaining reservoir tokens (async):
const reservoir = await limiter.currentReservoir()
console.log(`Remaining API calls this hour: ${reservoir}`)
```
Feature Comparison
| Feature | p-limit | p-queue | Bottleneck |
|---|---|---|---|
| Concurrency cap | ✅ | ✅ | ✅ |
| Rate limiting (per time) | ❌ | ✅ intervalCap | ✅ Excellent |
| Token bucket / reservoir | ❌ | ❌ | ✅ |
| Priority queue | ❌ | ✅ | ✅ per-job priority |
| Pause / Resume | ❌ | ✅ | ✅ |
| Redis / cluster support | ❌ | ❌ | ✅ |
| Timeout per task | ❌ | ✅ | ✅ |
| Bundle size | ~1KB | ~5KB | ~20KB |
| TypeScript | ✅ | ✅ | ✅ bundled |
| Wrap existing function | ❌ | ❌ | ✅ .wrap() |
| Event monitoring | ❌ | ✅ events | ✅ events |
When to Use Each
Choose p-limit if:
- You need simple concurrency: "max 10 async functions at once"
- Batch processing a list of items (file reads, API calls, DB queries)
- Tiny bundle size is important (~1KB)
- No priorities, ordering, or rate limits needed
Choose p-queue if:
- You need to prioritize some tasks over others
- Pause/resume control is needed (e.g., backpressure from downstream)
- Rate limiting at a fixed rate (10/second) with a proper queue
- Background job processing in a single-process app
Choose Bottleneck if:
- Respecting external API rate limits precisely (GitHub: 5000/hr, Stripe: 100/s)
- Multi-server environment where rate limits must be shared (Redis)
- Complex scenarios: reservoirs, retry strategies, token buckets
- You need to wrap an existing function with rate limiting
Production Rate Limiting Patterns for External APIs
Real-world external API rate limits are more nuanced than simple "N requests per second." GitHub's REST API allows 5,000 requests per hour authenticated, with secondary rate limits that trigger when too many requests arrive within a short time window even if the hourly budget has not been exhausted. The secondary limits are not published precisely, but hitting them results in 403 responses with Retry-After headers. Bottleneck handles this pattern well through its minTime option (minimum milliseconds between requests), which spaces out requests to avoid bursting. npm's public registry has rate limits that vary by IP and by request type; registry API calls are more restricted than download operations. Stripe's API allows up to 100 read requests and 100 write requests per second, with a test mode rate limit of 25 reads/second. For each of these external services, Bottleneck's reservoir model maps naturally to the service's credit-based rate limit system — the reservoir represents remaining credits, and the refresh interval represents the reset window. p-queue's intervalCap can model the same scenarios but with less precision for burst-aware rate limits.
Error Handling and Retry Integration
None of the three libraries build in retry logic — they control concurrency and rate, but when an operation fails (network timeout, API error, 429 Too Many Requests), the calling code is responsible for deciding whether to retry. This is by design: retry logic and concurrency control are orthogonal concerns. The recommended pattern is to wrap operations with p-retry (from the same sindresorhus ecosystem as p-limit and p-queue) for retry logic and use p-limit or p-queue for concurrency control. Bottleneck's .schedule() method can be used as the retry target — schedule a Bottleneck-wrapped function, and if it fails, p-retry will schedule it again (with exponential backoff), and Bottleneck will enforce the rate limit on the retry attempts. For production systems that call unreliable external services, the combination of p-limit + p-retry or Bottleneck + p-retry handles the full lifecycle: concurrency limiting, rate limiting, exponential backoff, and maximum retry count, each concern handled by a focused library.
Memory and Resource Management
In long-running processes, the queue size can grow unboundedly if tasks are enqueued faster than they are processed. This is a production concern that developers often discover in staging when a load test reveals the queue growing to millions of items, exhausting Node.js heap. p-limit has no built-in queue size limit — the pendingCount property lets you observe the queue, but there is no automatic rejection when it grows too large. p-queue has a throwOnTimeout option and a timeout option per task, but no global queue size cap. Bottleneck has a highWater option that controls the maximum number of queued requests — when the queue exceeds this limit, Bottleneck rejects new requests (or drops the oldest, based on strategy). This makes Bottleneck the most robust choice for systems where the producer can outpace the consumer indefinitely. For services that scrape data or process user-triggered events, implementing backpressure — sending a 503 or slowing down the producer when the queue is full — prevents memory exhaustion better than any queue size limit.
Observability and Monitoring in Production
Understanding what your concurrency control layer is doing in production is important for diagnosing performance bottlenecks. p-limit exposes activeCount and pendingCount as synchronous properties — these can be emitted as metrics at regular intervals to understand whether the limiter is the bottleneck (high pendingCount) or the upstream service is (high activeCount but low throughput). p-queue emits active and idle events and exposes queue.size and queue.pending, making it more observable than p-limit. Bottleneck is the most instrumentation-friendly: limiter.counts() synchronously returns { RECEIVED, QUEUED, RUNNING, EXECUTING, DONE } (the DONE count requires trackDoneStatus: true), and the depleted, dropped, error, and debug events provide hooks for monitoring systems. For services built on Bottleneck that call shared external APIs, integrating Bottleneck's metrics into your observability stack (via Datadog custom metrics, Prometheus gauges, or CloudWatch metrics) gives visibility into rate limit utilization and queue depth that is essential for capacity planning.
Combining Libraries for Complex Scenarios
Real production systems often combine these libraries. A common pattern is using p-limit for database query concurrency (limit to your connection pool size), p-queue for prioritized background job processing, and Bottleneck for third-party API calls that have official rate limits. These three concurrency layers operate independently and stack without conflict: an operation moving through the system encounters p-queue's prioritization when entering the processing pipeline, p-limit's concurrency cap when accessing the database, and Bottleneck's rate limiting when making the external API call. Composition like this is possible because all three libraries operate at the function call level — they wrap async functions and control when those functions execute, making them composable with other async patterns including stream processing, WebSocket message handling, and event-driven architectures. The key to successful composition is ensuring that each library's concurrency limit is set with knowledge of the others to avoid creating bottlenecks at the wrong layer.
Methodology
Download data from npm registry (weekly average, February 2026). Feature comparison based on p-limit v6.x, p-queue v8.x, and Bottleneck v2.x.
Compare async and concurrency packages on PkgPulse →
See also: cac vs meow vs arg 2026, Ink vs @clack/prompts vs Enquirer, and acorn vs @babel/parser vs espree.