TL;DR
autocannon is the fastest Node.js HTTP benchmarking tool — minimal setup, measures raw requests/second and latency percentiles, perfect for quick API benchmarks. k6 is the professional load testing tool — write tests in JavaScript, built-in metrics, cloud execution, threshold assertions, and CI integration. artillery is the full-featured load testing framework — YAML-driven scenarios, HTTP + WebSocket + gRPC, plugin ecosystem. In 2026: autocannon for quick benchmarks during development, k6 for CI/CD load testing pipelines, artillery for complex multi-step user flow testing.
Key Takeaways
- autocannon: ~400K weekly downloads — fastest Node.js HTTP benchmarker, CLI + programmatic API
- k6: ~200K weekly downloads (npm) — Go binary, JavaScript test scripts, cloud execution support
- artillery: ~150K weekly downloads — Node.js, YAML + JS scenarios, HTTP/WS/gRPC plugins
- autocannon is for raw throughput benchmarking — "how many req/s can my server handle?"
- k6 is for load testing with assertions — "does my API meet SLAs under 500 concurrent users?"
- artillery is for realistic user flow simulation — multi-step scenarios with think time
Why Load Testing?
Without load testing:
- Deploy → production crashes at 1000 concurrent users
- No idea where the bottleneck is
- p99 latency is unknown until users complain
With load testing:
- Know max throughput before deploying
- Find memory leaks under sustained load
- Establish latency SLAs with confidence
- Catch regressions in CI before they reach production
Key metrics:
Throughput → requests per second (req/s)
Latency p50 → median response time
Latency p95 → 95% of requests finish at or below this value
Latency p99 → 99% of requests finish at or below this value
Error rate → % of requests that fail
Concurrent → simultaneous users/connections
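To make the percentile definitions concrete, here is a minimal nearest-rank sketch in plain JavaScript. This is illustrative only: the tools themselves use more sophisticated histogram-based estimators, and the sample latencies below are made up.

```javascript
// Nearest-rank percentile over raw latency samples (ms).
// Simplified for illustration; real tools use HDR histograms.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(rank - 1, 0)]
}

const latencies = [2, 3, 3, 4, 5, 6, 8, 12, 45, 87] // fabricated samples (ms)
console.log(percentile(latencies, 50)) // → 5  (median)
console.log(percentile(latencies, 99)) // → 87 (tail)
```

Note how a single 87 ms outlier dominates p99 while leaving p50 untouched: this is why the tail percentiles, not the average, define user-perceived worst-case latency.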
autocannon
autocannon — fast Node.js HTTP benchmarking:
CLI usage
npm install -g autocannon
# Basic benchmark — 10 seconds, 10 concurrent connections:
autocannon http://localhost:3000/api/packages
# Custom duration and connections:
autocannon -d 30 -c 50 http://localhost:3000/api/packages/react
# -d: duration in seconds
# -c: concurrent connections
# -p: pipelining factor (requests per connection)
# POST request with JSON body:
autocannon \
-m POST \
-H "Content-Type: application/json" \
-b '{"name":"react","version":"19.0.0"}' \
http://localhost:3000/api/packages
# With auth header:
autocannon \
-H "Authorization: Bearer YOUR_TOKEN" \
http://localhost:3000/api/protected
Sample output
Running 10s test @ http://localhost:3000/api/packages/react
10 connections
┌─────────┬───────┬───────┬───────┬───────┬───────────┬───────────┬───────┐
│ Stat │ 2.5% │ 50% │ 97.5% │ 99% │ Avg │ Stdev │ Max │
├─────────┼───────┼───────┼───────┼───────┼───────────┼───────────┼───────┤
│ Latency │ 2 ms │ 3 ms │ 8 ms │ 12 ms │ 3.45 ms │ 2.12 ms │ 87 ms │
└─────────┴───────┴───────┴───────┴───────┴───────────┴───────────┴───────┘
┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
│ Stat │ 1% │ 2.5% │ 50% │ 97.5% │ Avg │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Req/Sec │ 2,701 │ 2,701 │ 2,881 │ 3,023 │ 2,864.4 │
├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Bytes/Sec │ 1.48 MB │ 1.48 MB │ 1.58 MB │ 1.65 MB │ 1.57 MB │
└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
28,644 requests in 10.01s, 15.7 MB read
Programmatic API
import autocannon from "autocannon"
async function benchmarkEndpoint(url: string) {
  const result = await autocannon({
    url,
    connections: 50,
    duration: 30,
    pipelining: 1,
    headers: {
      "Authorization": "Bearer test-token",
      "Content-Type": "application/json",
    },
    requests: [
      {
        method: "GET",
        path: "/api/packages/react",
      },
      {
        method: "GET",
        path: "/api/packages/vue",
      },
    ],
  })

  console.log(`Requests/sec: ${result.requests.average}`)
  console.log(`Latency p99: ${result.latency.p99} ms`)
  console.log(`Errors: ${result.errors}`)

  // Fail if too slow:
  if (result.latency.p99 > 100) {
    throw new Error(`p99 latency ${result.latency.p99}ms exceeds 100ms SLA`)
  }

  return result
}
Compare before/after
# Baseline (before optimization) — save machine-readable JSON:
autocannon -d 30 -c 100 --json http://localhost:3000/api/packages > before.json
# After optimization:
autocannon -d 30 -c 100 --json http://localhost:3000/api/packages > after.json
# autocannon-compare diffs two JSON result files:
npx autocannon-compare before.json after.json
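The comparison can also be scripted: run autocannon twice via its programmatic API and diff the summaries. A minimal sketch that diffs two objects shaped like autocannon's documented result (`requests.average`, `latency.p99`); the numbers are fabricated and `compareRuns` is a hypothetical helper, not part of autocannon.

```javascript
// Compare two autocannon-style results and report the delta.
// The object shape mirrors autocannon's result; values are illustrative.
function compareRuns(before, after) {
  const throughputChange =
    ((after.requests.average - before.requests.average) /
      before.requests.average) * 100
  const p99Change = after.latency.p99 - before.latency.p99
  return {
    throughputPct: Number(throughputChange.toFixed(1)), // positive = faster
    p99DeltaMs: p99Change,                              // negative = lower tail latency
  }
}

const before = { requests: { average: 2864 }, latency: { p99: 12 } }
const after = { requests: { average: 4100 }, latency: { p99: 9 } }
console.log(compareRuns(before, after)) // { throughputPct: 43.2, p99DeltaMs: -3 }
```

Throwing when `throughputPct` is negative turns this into a simple regression gate for a benchmark script.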
k6
k6 — professional load testing:
Installation
# macOS:
brew install k6
# Windows:
choco install k6
# k6 is a standalone Go binary and is not installed via npm.
# See grafana.com/docs/k6 for Linux packages and the Docker image.
Basic test script
// load-test.js
import http from "k6/http"
import { check, sleep } from "k6"
// Test configuration:
export const options = {
  vus: 100,        // Virtual users (concurrent)
  duration: "30s", // Test duration
  // SLA thresholds — test FAILS if these are exceeded:
  thresholds: {
    http_req_duration: ["p(95)<200", "p(99)<500"], // 95% < 200ms, 99% < 500ms
    http_req_failed: ["rate<0.01"],                // Error rate < 1%
  },
}

export default function () {
  // Each VU runs this function repeatedly:
  const response = http.get("http://localhost:3000/api/packages/react")
  check(response, {
    "status is 200": (r) => r.status === 200,
    "body contains name": (r) => r.json("name") === "react",
    "response time < 300ms": (r) => r.timings.duration < 300,
  })
  sleep(1) // Think time between requests (simulate real users)
}
Load ramping
// Gradually ramp up, hold, then ramp down:
export const options = {
  stages: [
    { duration: "30s", target: 50 },  // Ramp up to 50 VUs over 30s
    { duration: "2m", target: 100 },  // Ramp up to 100 VUs over 2m
    { duration: "1m", target: 100 },  // Hold 100 VUs for 1m
    { duration: "30s", target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ["p(95)<500"],
  },
}
Multi-endpoint scenario
import http from "k6/http"
import { check, group, sleep } from "k6"
export const options = {
  vus: 50,
  duration: "60s",
  thresholds: {
    "http_req_duration{endpoint:packages}": ["p(95)<200"],
    "http_req_duration{endpoint:compare}": ["p(95)<500"],
  },
}

export default function () {
  group("Browse packages", () => {
    const packages = http.get("http://localhost:3000/api/packages", {
      tags: { endpoint: "packages" },
    })
    check(packages, { "packages OK": (r) => r.status === 200 })
    sleep(0.5)
  })

  group("Compare packages", () => {
    const compare = http.get(
      "http://localhost:3000/api/compare?a=react&b=vue",
      { tags: { endpoint: "compare" } }
    )
    check(compare, { "compare OK": (r) => r.status === 200 })
    sleep(1)
  })
}
Run and output
# Run locally:
k6 run load-test.js
# Run with the built-in web dashboard (recent k6 releases):
K6_WEB_DASHBOARD=true k6 run load-test.js
# Output to JSON:
k6 run --out json=results.json load-test.js
# CI integration:
k6 run load-test.js
# Exit code 0 = all thresholds passed
# Exit code 1 = threshold failed → CI pipeline fails
artillery
artillery — load testing framework:
Installation
npm install -g artillery
YAML scenario
# load-test.yml
config:
  target: "http://localhost:3000"
  phases:
    - duration: 60
      arrivalRate: 10 # 10 new users/second for 60 seconds
    - duration: 120
      arrivalRate: 10
      rampTo: 50 # Ramp from 10 to 50 new users/second over 120s
  defaults:
    headers:
      Authorization: "Bearer {{ $processEnvironment.API_TOKEN }}"
      Content-Type: "application/json"

scenarios:
  - name: "Browse and compare packages"
    weight: 70 # 70% of users do this
    flow:
      - get:
          url: "/api/packages"
      - think: 2 # Wait 2 seconds (simulate reading)
      - get:
          url: "/api/packages/react"
      - think: 1
      - get:
          url: "/api/compare"
          qs:
            a: "react"
            b: "vue"
  - name: "Search packages"
    weight: 30 # 30% of users search
    flow:
      - get:
          url: "/api/search"
          qs:
            q: "form validation"
      - think: 3
Run
# Run the test:
artillery run load-test.yml
# Run with custom report:
artillery run --output results.json load-test.yml
artillery report results.json # Opens HTML report
# Quick test (no config file):
artillery quick --count 100 --num 20 http://localhost:3000/api/packages
With JavaScript logic
# load-test.yml
config:
  target: "http://localhost:3000"
  phases:
    - duration: 60
      arrivalRate: 10
  processor: "./load-test-helpers.js"

scenarios:
  - flow:
      - function: "generatePackageName"
      - get:
          url: "/api/packages/{{ packageName }}"
          capture:
            - json: "$.name"
              as: "capturedName"
      - log: "Got package: {{ capturedName }}"
// load-test-helpers.js
const PACKAGES = ["react", "vue", "svelte", "angular", "solid"]
module.exports = {
  generatePackageName(context, events, done) {
    context.vars.packageName =
      PACKAGES[Math.floor(Math.random() * PACKAGES.length)]
    return done()
  },
}
Feature Comparison
| Feature | autocannon | k6 | artillery |
|---|---|---|---|
| Primary use | Benchmarking | Load testing | Scenario testing |
| Language | Node.js | Go + JS | Node.js |
| Script format | JS / CLI | JavaScript | YAML + JS |
| Threshold assertions | ❌ | ✅ | ✅ |
| Load ramping | ❌ | ✅ | ✅ |
| Think time / pacing | ❌ | ✅ (sleep) | ✅ (think) |
| WebSocket testing | ❌ | ✅ | ✅ (plugin) |
| gRPC testing | ❌ | ✅ | ✅ (plugin) |
| Cloud execution | ❌ | ✅ (k6 Cloud) | ✅ (Artillery Cloud) |
| HTML report | ❌ | ✅ | ✅ |
| CI integration | ✅ (programmatic API) | ✅ (threshold exit codes) | ✅ |
| Weekly downloads | ~400K | ~200K | ~150K |
When to Use Each
Choose autocannon if:
- Quick benchmark during development ("how fast is my new cache layer?")
- Compare two implementations — A/B performance testing
- Need programmatic API in Node.js benchmark scripts
- Measure raw throughput and latency of a single endpoint
Choose k6 if:
- CI/CD load testing with pass/fail thresholds
- Realistic load patterns with staged ramp-up/ramp-down
- Multi-endpoint tests with complex JavaScript logic
- Need cloud execution (k6 Cloud) for distributed load
Choose artillery if:
- Multi-step user flow scenarios (browse → search → checkout)
- WebSocket or gRPC alongside HTTP testing
- Team prefers YAML config over JavaScript test scripts
- Need HTML reports for stakeholders
Establishing Meaningful Baseline Performance Metrics
Before running load tests, you need a reliable baseline to compare against. Run autocannon or k6 in a controlled environment with fixed concurrency settings and measure p50, p95, and p99 latencies along with requests-per-second. The baseline means nothing without controlling variables: database size, cache state, network conditions, and whether the application JIT-compiled its hot paths. Node.js applications are particularly sensitive to warm-up effects — the V8 JIT compiler optimizes frequently executed code paths after several thousand executions, so measurements taken in the first few seconds of a load test will show higher latencies than steady-state. Run your autocannon benchmark for at least 30 seconds and discard the first few seconds of results, or use k6's staged ramping to reach steady state before measuring. Store baselines in version control alongside test configurations so regressions are immediately visible.
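The warm-up discard described above is simple to sketch: collect per-second throughput samples, drop the first few, and average the rest. The samples and the `steadyStateAverage` helper below are illustrative, not an autocannon API.

```javascript
// Discard warm-up seconds before computing steady-state throughput.
// reqPerSec holds fabricated per-second req/s samples; the first few
// seconds are slower because V8 has not yet optimized hot paths.
function steadyStateAverage(perSecondSamples, warmupSeconds) {
  const steady = perSecondSamples.slice(warmupSeconds)
  return steady.reduce((sum, v) => sum + v, 0) / steady.length
}

const reqPerSec = [910, 1400, 2100, 2800, 2850, 2900, 2870, 2880]
const naive = reqPerSec.reduce((s, v) => s + v, 0) / reqPerSec.length

console.log(naive)                            // dragged down by warm-up seconds
console.log(steadyStateAverage(reqPerSec, 3)) // → 2860 (steady state only)
```

The gap between the two averages is the measurement error a too-short benchmark bakes into your baseline.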
Load Testing Database-Backed APIs
Most API load tests reveal database bottlenecks rather than application server limits. A Node.js API serving cached responses can handle tens of thousands of requests per second, while the same API making a database query per request typically caps at a few hundred per second, limited by database connection pool size. When autocannon results show high latency and low throughput simultaneously, inspect the database connection pool metrics — if all connections are in use and requests are queuing, you've found your bottleneck. PostgreSQL's pg_stat_activity view shows active connections and their states. A common pattern is to size the total connection budget at your database's max_connections minus headroom for admin operations, then divide that budget across application instances as you scale horizontally. k6's custom metrics API lets you emit database-layer metrics alongside HTTP metrics for correlation in the results.
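The pool-sizing arithmetic above can be sketched directly. The numbers and the `poolSizePerInstance` helper are illustrative, not a library API or a recommendation.

```javascript
// Per-instance pool size: leave headroom on max_connections for
// admin/migration sessions, then split the remainder across app
// instances. All figures below are illustrative.
function poolSizePerInstance(maxConnections, adminHeadroom, appInstances) {
  return Math.floor((maxConnections - adminHeadroom) / appInstances)
}

// PostgreSQL ships with max_connections = 100 by default; assume 10
// connections reserved for admin tooling and 3 app instances:
console.log(poolSizePerInstance(100, 10, 3)) // → 30
```

Adding a fourth instance without re-running this calculation is a classic way to exhaust max_connections under load.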
CI/CD Integration Patterns
Integrating load tests into continuous deployment pipelines requires balancing thoroughness with pipeline speed. Full load tests taking 10+ minutes at high concurrency are impractical for every commit — reserve these for release gates or nightly jobs. For per-PR validation, run abbreviated tests: 30 seconds at 20-50 concurrent users covering critical endpoints, with k6 threshold assertions that fail the pipeline if latency or error rates regress. k6's exit codes integrate cleanly with GitHub Actions, GitLab CI, and Jenkins — a non-zero exit code fails the pipeline automatically. Store k6 results as JSON artifacts and track p95 latency trends over time using a time-series dashboard to catch gradual performance regressions that don't trigger single-run thresholds. Artillery's HTML report format is useful for sharing results with stakeholders who need visual summaries without raw metrics access.
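A per-PR gate along these lines might look like the following GitHub Actions job. This is a sketch: the `grafana/setup-k6-action` step, version tags, and the staging URL are assumptions to adapt to your pipeline.

```yaml
# .github/workflows/load-test.yml — sketch; action versions and the
# staging URL are placeholders to adapt.
name: load-test
on: pull_request
jobs:
  k6-gate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: grafana/setup-k6-action@v1
      - name: Run abbreviated load test
        run: k6 run --out json=results.json load-test.js
        env:
          TARGET_URL: https://staging.example.com # placeholder
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: k6-results
          path: results.json
# A threshold breach makes `k6 run` exit non-zero, failing the job.
```

Uploading the JSON on every run (including failures) is what enables the trend tracking described above.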
Testing Authenticated APIs and Stateful Scenarios
Production APIs require authentication, which complicates load testing. For JWT-based APIs, generate a pool of test tokens in advance and distribute them across virtual users — never reuse a single token across all users since this may hit per-token rate limits and doesn't reflect realistic traffic patterns. k6's setup() function runs before the test and can perform authentication flows to acquire tokens that are then shared across virtual users. For session-based authentication, k6 stores cookies per virtual user automatically when you use the HTTP client, enabling realistic multi-step flows: login → browse → purchase → logout. artillery's YAML scenario format handles authentication flows well with its capture directive: extract a CSRF token from the login page response and include it in the subsequent POST. For OAuth flows, mock the OAuth provider in load test environments rather than hitting real identity providers.
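Distributing a pre-generated token pool across virtual users can be as simple as indexing by VU id (in k6 the current VU number is available as the `__VU` global; `tokenForVU` here is an illustrative helper, not a k6 API):

```javascript
// Deterministically assign each virtual user a token from a
// pre-generated pool so no single token absorbs all traffic.
function tokenForVU(vuId, tokens) {
  // VU ids are 1-based in k6 (__VU); wrap around with modulo.
  return tokens[(vuId - 1) % tokens.length]
}

const tokens = ["tok-a", "tok-b", "tok-c"] // pre-generated test tokens
console.log(tokenForVU(1, tokens)) // → tok-a
console.log(tokenForVU(2, tokens)) // → tok-b
console.log(tokenForVU(4, tokens)) // → tok-a (wraps around)
```

Deterministic assignment also makes per-token rate-limit behavior reproducible across test runs, unlike random token selection.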
Interpreting Latency Percentiles for SLAs
p99 latency — the value that 99% of requests complete within, leaving only the slowest 1% above it — is the most important metric for service level agreements but also the most sensitive to outliers. A single slow database query, a garbage collection pause, or a DNS timeout can spike p99 significantly above p50. When defining SLA thresholds in k6, use p99 thresholds for user-facing APIs (users notice latency above ~300ms for interactive actions) and p95 for backend-to-backend APIs. The difference between p95 and p99 reveals the tail latency distribution: a small gap indicates consistent performance, a large gap indicates periodic spikes (GC pauses, lock contention, cache invalidations). If your p99 is 10x your p50, investigate with distributed tracing rather than tuning load test parameters — no amount of connection pooling or caching fixes unpredictable tail latency without understanding the root cause.
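The p95/p99 gap heuristic can be made concrete; the 2x cutoff below is an arbitrary illustration for a dashboard or CI summary, not an industry standard.

```javascript
// Classify tail behavior from p95 and p99 latencies (ms).
// The 2x cutoff is an arbitrary illustrative threshold.
function tailProfile(p95, p99) {
  return p99 > 2 * p95 ? "spiky tail (investigate GC/locks)" : "consistent"
}

console.log(tailProfile(180, 210)) // → consistent
console.log(tailProfile(120, 900)) // → spiky tail (investigate GC/locks)
```

A check like this is cheap to compute from k6's summary output and flags runs worth a closer look even when absolute thresholds pass.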
Choosing the Right Tool for Your Testing Workflow
The three tools map to three distinct points in the development and deployment lifecycle. Use autocannon during local development immediately after implementing a new endpoint or optimization: the 10-second benchmark with a single command gives you immediate feedback on whether your change improved or degraded performance. Add k6 to your CI/CD pipeline as a performance gate that runs against a staging environment before every deployment — the JavaScript test scripts version-control well alongside your application code and the threshold assertions produce clear pass/fail signals. Use artillery for periodic realistic load simulation against production-like environments, testing the multi-step user flows that reflect actual usage patterns. Teams with mature performance cultures typically use all three: autocannon for fast local feedback, k6 for automated CI gates, and artillery for periodic comprehensive scenario testing that validates end-to-end user flows at realistic concurrency levels.
Compare testing and performance tooling packages on PkgPulse →
See also: supertest vs fastify.inject vs hono/testing (2026), cac vs meow vs arg (2026), and archiver vs adm-zip vs JSZip (2026).