
Guide

pm2 vs node:cluster vs tsx watch 2026

Compare pm2, Node.js cluster module, and tsx watch for process management and auto-restart in Node.js. Clustering, zero-downtime reload, and log management.

PkgPulse Team

TL;DR

pm2 is the production process manager for Node.js — it handles clustering, auto-restart on crash, log rotation, zero-downtime reload, and a monitoring dashboard. node:cluster is Node.js's built-in module — it forks your app across CPU cores, but you build the restart logic yourself. tsx watch (or node --watch) is for development only — it auto-restarts your process on file changes. In 2026: pm2 for production deployments on VPS/VMs, node:cluster if you want zero dependencies, and tsx watch/node --watch for development.

Key Takeaways

  • pm2: ~3M weekly downloads — production-grade, clustering + restart + logs + monitoring in one tool
  • node:cluster: built-in — manual clustering across CPU cores, no external dependency
  • tsx watch / node --watch: development only — file-change detection, auto-restart
  • pm2 zero-downtime reload (pm2 reload) — restarts workers one at a time, never drops requests
  • node:cluster gives you full control but requires you to handle worker death, signal forwarding, and graceful shutdown manually
  • In Docker/Kubernetes: you often DON'T need pm2 — the orchestrator handles restarts and scaling

When Process Management Matters

Development:
  Need: auto-restart on file changes
  Solution: tsx watch, node --watch, or nodemon

Production (VPS / bare metal):
  Need: crash recovery, clustering, log management
  Solution: pm2, systemd + node:cluster

Production (Docker / Kubernetes):
  Need: single process per container
  Solution: node dist/index.js (orchestrator handles restart)
  pm2 is unnecessary — Kubernetes restarts crashed pods automatically

pm2

pm2 — production process manager:

Quick start

npm install -g pm2

# Start in cluster mode (all CPU cores):
pm2 start dist/index.js -i max --name "pkgpulse-api"

# Or with specific number of instances:
pm2 start dist/index.js -i 4 --name "pkgpulse-api"

Ecosystem file

// ecosystem.config.cjs
module.exports = {
  apps: [
    {
      name: "pkgpulse-api",
      script: "dist/index.js",
      instances: "max",         // Use all CPU cores
      exec_mode: "cluster",     // Cluster mode (vs fork)
      max_memory_restart: "500M",  // Restart if memory exceeds 500MB
      env: {
        NODE_ENV: "production",
        PORT: 3000,
      },
      // Log management:
      log_date_format: "YYYY-MM-DD HH:mm:ss Z",
      error_file: "logs/error.log",
      out_file: "logs/output.log",
      merge_logs: true,
      // Auto-restart:
      watch: false,             // Don't watch files in production
      autorestart: true,        // Restart on crash
      max_restarts: 10,         // Give up after 10 consecutive unstable restarts
      restart_delay: 4000,      // Wait 4s between restarts
    },
    {
      name: "pkgpulse-worker",
      script: "dist/worker.js",
      instances: 2,
      exec_mode: "cluster",
      cron_restart: "0 */6 * * *",  // Restart every 6 hours (memory leak protection)
    },
  ],
}

# Start from ecosystem file:
pm2 start ecosystem.config.cjs

# Zero-downtime reload (restarts workers one at a time):
pm2 reload pkgpulse-api

# Stop:
pm2 stop pkgpulse-api

# Delete from pm2 process list:
pm2 delete pkgpulse-api

Zero-downtime reload

# Standard restart — kills all, starts all (brief downtime):
pm2 restart pkgpulse-api

# Graceful reload — one worker at a time (zero downtime):
pm2 reload pkgpulse-api

# How it works (per worker, one at a time):
# 1. pm2 starts a replacement worker 1'
# 2. When worker 1' is ready (listening), pm2 sends SIGINT to old worker 1
# 3. Worker 1 finishes in-flight requests and exits
# 4. pm2 moves on to worker 2, and so on
# → At NO point are zero workers serving traffic

Graceful shutdown in your app

// Your app must handle shutdown signals for graceful reload.
// pm2 sends SIGINT on reload; Kubernetes and systemd send SIGTERM.
import { createServer } from "node:http"

const server = createServer(app)  // app: your request handler
server.listen(3000)

function shutdown(signal: string) {
  console.log(`Received ${signal}, shutting down gracefully...`)
  server.close(() => {
    // Close database connections (db: your database client):
    db.end()
    process.exit(0)
  })

  // Force exit after 10 seconds if connections don't drain:
  setTimeout(() => process.exit(1), 10_000)
}

process.on("SIGINT", () => shutdown("SIGINT"))
process.on("SIGTERM", () => shutdown("SIGTERM"))

Monitoring

# Real-time dashboard:
pm2 monit

# Process list:
pm2 list

# Detailed info:
pm2 show pkgpulse-api

# Logs:
pm2 logs pkgpulse-api --lines 100

# Startup script (survive reboots):
pm2 startup     # Generates systemd/launchd script
pm2 save        # Saves current process list

node:cluster

node:cluster — built-in clustering:

Basic clustering

import cluster from "node:cluster"
import { availableParallelism } from "node:os"
import { createServer } from "node:http"
import { app } from "./app.js"

const numCPUs = availableParallelism()

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} starting ${numCPUs} workers`)

  // Fork workers:
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork()
  }

  // Restart crashed workers:
  cluster.on("exit", (worker, code, signal) => {
    console.warn(`Worker ${worker.process.pid} died (${signal || code}). Restarting...`)
    cluster.fork()
  })
} else {
  // Workers share the same TCP port:
  const server = createServer(app)
  server.listen(3000, () => {
    console.log(`Worker ${process.pid} listening on :3000`)
  })
}

Zero-downtime reload (manual)

import cluster from "node:cluster"

if (cluster.isPrimary) {
  // Graceful reload — restart workers one at a time:
  process.on("SIGUSR2", async () => {
    const workers = Object.values(cluster.workers!)

    for (const worker of workers) {
      if (!worker) continue

      // Start new worker:
      const replacement = cluster.fork()

      // Wait for replacement to be ready:
      await new Promise<void>((resolve) => {
        replacement.on("listening", () => resolve())
      })

      // Disconnect old worker (finish in-flight requests):
      worker.disconnect()

      // Wait for old worker to exit:
      await new Promise<void>((resolve) => {
        worker.on("exit", () => resolve())
      })

      console.log(`Replaced worker ${worker.process.pid} → ${replacement.process.pid}`)
    }
  })
}

// Trigger reload:
// kill -USR2 <primary-pid>

IPC between primary and workers

if (cluster.isPrimary) {
  // Broadcast to all workers:
  function broadcast(message: unknown) {
    for (const worker of Object.values(cluster.workers!)) {
      worker?.send(message)
    }
  }

  // Listen for worker messages:
  cluster.on("message", (worker, message) => {
    console.log(`Worker ${worker.process.pid}: ${JSON.stringify(message)}`)
    if (message.type === "cache-invalidate") {
      broadcast({ type: "cache-invalidate", key: message.key })
    }
  })
} else {
  // Worker sends message to primary:
  process.send?.({ type: "cache-invalidate", key: "packages:react" })

  // Worker receives broadcast:
  process.on("message", (message) => {
    if (message.type === "cache-invalidate") {
      cache.delete(message.key)
    }
  })
}

tsx watch / node --watch (Development)

node --watch (built-in, Node.js 18+)

# Built-in file watcher — no dependencies:
node --watch dist/index.js

# Watch a specific path instead of the entry point's import graph:
node --watch-path=src dist/index.js

# Preserve console output:
node --watch-preserve-output dist/index.js

tsx watch (TypeScript)

npm install -D tsx

# Watch and restart on file changes:
tsx watch src/index.ts

# With custom ignore patterns:
tsx watch --ignore node_modules --ignore dist src/index.ts

# In package.json:
{
  "scripts": {
    "dev": "tsx watch src/index.ts",
    "start": "node dist/index.js"
  }
}

Feature Comparison

| Feature                | pm2              | node:cluster      | tsx watch      |
| ---------------------- | ---------------- | ----------------- | -------------- |
| Environment            | Production       | Production        | Development    |
| Clustering             | ✅ (automatic)   | ✅ (manual)       | ❌             |
| Auto-restart on crash  | ✅               | Manual            | ❌             |
| Zero-downtime reload   | ✅ (pm2 reload)  | Manual            | ❌             |
| File watching          | ✅               | ❌                | ✅             |
| Log management         | ✅               | ❌                | ❌             |
| Memory limit restart   | ✅               | ❌                | ❌             |
| Monitoring dashboard   | ✅               | ❌                | ❌             |
| Startup on boot        | ✅ (pm2 startup) | Manual (systemd)  | ❌             |
| Dependencies           | External         | Built-in          | Dev dependency |
| Weekly downloads       | ~3M              | built-in          | ~5M            |

When to Use Each

Choose pm2 if:

  • Deploying on VPS, bare metal, or EC2 instances
  • Need clustering + auto-restart + log management in one tool
  • Want zero-downtime deployments with pm2 reload
  • Running multiple apps on the same server

Choose node:cluster if:

  • Want zero external dependencies — pure Node.js
  • Need fine-grained control over worker lifecycle and IPC
  • Building custom process management (job queue workers, etc.)
  • Already using systemd for process supervision

Choose tsx watch / node --watch if:

  • Local development with auto-restart on file changes
  • tsx watch for TypeScript, node --watch for JavaScript
  • Never use in production

Skip all of these if:

  • Using Docker + Kubernetes — k8s handles restart, scaling, and health checks
  • Using serverless (Vercel, Lambda) — no long-running process to manage
  • Run ONE process per container, let the orchestrator handle the rest

pm2 in Docker and CI/CD Environments

One of the most common misconceptions about pm2 is that it belongs everywhere Node.js runs. In container-first deployments, pm2 is often counterproductive. When you run a Docker container, the orchestrator — Kubernetes, ECS, or Docker Swarm — already handles crash recovery, health checks, and replica scaling. Adding pm2 inside that container creates two competing process managers and complicates signal forwarding (SIGTERM from Kubernetes must reach your Node.js process directly, not be absorbed by pm2 in a way that delays graceful shutdown).

The correct container pattern is a single Node.js process per container with a proper SIGTERM handler. Kubernetes sends SIGTERM before SIGKILL and waits for the terminationGracePeriodSeconds window. If your app handles SIGTERM by closing the HTTP server, draining in-flight requests, and disconnecting from the database, you get zero-downtime rolling deploys without pm2.
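That single-process pattern is small enough to sketch in full. The following is a sketch, not pm2- or Kubernetes-specific code; the 10-second grace value is an assumption chosen to stay under a typical terminationGracePeriodSeconds of 30:

```typescript
import { createServer, type Server } from "node:http"

// Close the listener, let in-flight requests finish, then resolve.
// The unref'd timer guarantees we resolve before the pod's grace period ends.
export function shutdown(server: Server, graceMs = 10_000): Promise<void> {
  return new Promise((resolve) => {
    const timer = setTimeout(resolve, graceMs)
    timer.unref() // don't keep the event loop alive just for this timer
    server.close(() => {
      clearTimeout(timer)
      resolve()
    })
  })
}

const server = createServer((_req, res) => res.end("ok"))
server.listen(3000)

// Kubernetes sends SIGTERM first, SIGKILL only after the grace period.
process.once("SIGTERM", async () => {
  await shutdown(server)
  process.exit(0)
})
```

With a handler like this in place, a rolling deploy drains each pod instead of dropping its in-flight requests.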

Where pm2 genuinely shines is on long-running VPS or bare-metal servers where no orchestrator is present. On a single DigitalOcean droplet running an API server and a background worker, pm2 replaces systemd service files with a much simpler interface. The ecosystem.config.cjs file captures the full runtime configuration in version-controlled form, and pm2 startup + pm2 save makes the process list survive reboots. The pm2 monit dashboard provides memory and CPU trends without installing additional observability tooling.


Graceful Shutdown Patterns Across All Three

Getting graceful shutdown right is critical regardless of which process management approach you choose. The core requirement is that your HTTP server stops accepting new connections, drains all in-flight requests, and then exits cleanly — otherwise load balancers report errors during deploys.

With pm2 in cluster mode, each worker receives a SIGINT signal during pm2 reload. Your app should call server.close() and then wait for open connections to finish. Node's keepAliveTimeout setting on the HTTP server controls how long idle keep-alive connections stay open; lowering it ensures idle sockets are reaped quickly so server.close() can complete and workers drain fast.
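The drain can be made concrete with Node's built-in connection APIs. A sketch: server.closeIdleConnections() and closeAllConnections() require Node 18.2+, and the 5 s / 10 s values are assumptions to tune against your load balancer's idle timeout:

```typescript
import { createServer } from "node:http"

const server = createServer((_req, res) => res.end("ok"))

// How long idle keep-alive sockets stay open (Node's default is 5s).
server.keepAliveTimeout = 5_000
server.listen(3000)

process.once("SIGINT", () => {
  // Stop accepting new connections; in-flight requests keep running.
  server.close(() => process.exit(0))
  // Drop idle keep-alive sockets immediately so close() can complete.
  server.closeIdleConnections()
  // Last resort: sever everything if draining takes longer than 10s.
  setTimeout(() => {
    server.closeAllConnections()
    process.exit(1)
  }, 10_000).unref()
})
```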

With node:cluster, graceful reload is entirely manual. The zero-downtime pattern sends SIGUSR2 to the primary, which forks new workers, waits for them to emit a listening event, then disconnects old workers via worker.disconnect(). Disconnecting closes the worker's server handles gracefully and emits a disconnect event inside the worker, where your shutdown logic can drain and exit. The advantage is explicit control over timing — you can add custom readiness checks before the old worker is disconnected.
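The "wait for the replacement" step can also be driven by an explicit readiness message instead of the listening event. Here is a sketch of such a helper; the { type: "ready" } message shape is a hypothetical convention of your own app, not a cluster API:

```typescript
import { EventEmitter } from "node:events"

// Resolve once the worker reports readiness over IPC, reject on timeout.
// cluster.Worker is an EventEmitter, so this accepts real workers too.
export function waitForReady(worker: EventEmitter, timeoutMs = 5_000): Promise<void> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      worker.off("message", onMessage)
      reject(new Error(`worker not ready after ${timeoutMs}ms`))
    }, timeoutMs)

    function onMessage(message: unknown) {
      if ((message as { type?: string })?.type === "ready") {
        clearTimeout(timer)
        worker.off("message", onMessage)
        resolve()
      }
    }

    worker.on("message", onMessage)
  })
}

// Worker side — report ready only once the server actually accepts traffic:
// server.listen(3000, () => process.send?.({ type: "ready" }))
```

In the reload loop shown earlier, `await waitForReady(replacement)` would replace the bare wait on the listening event, letting the worker run warmup or health checks before signaling readiness.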

With tsx watch and node --watch, graceful shutdown is irrelevant for development purposes, but the patterns you establish during development carry over to production. Using tsx watch during development means your SIGINT handler is exercised on every file-save restart, which surfaces shutdown bugs before they reach production. Teams that use tsx watch without a shutdown handler often discover they have unclosed database connections only after deploying to a pm2-managed server.


Choosing Between pm2 Cluster Mode and Application-Level Clustering

When you run pm2 in cluster mode, pm2 forks your application N times and distributes incoming TCP connections across those processes using the Node.js cluster module internally. This means pm2 cluster mode is actually built on top of node:cluster — the distinction is that pm2 manages the lifecycle while node:cluster gives you direct access to the IPC channel and worker events.

For stateless HTTP APIs, pm2 cluster mode is almost always the right choice. The ecosystem.config.cjs file handles everything: number of instances, environment variables, memory restart thresholds, and log rotation. There is no application code to maintain.
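One wrinkle the ecosystem file cannot express: work that must run once per machine. Every pm2 cluster instance executes the same entry point, so scheduled jobs fire N times unless you pin them to one instance. pm2 sets NODE_APP_INSTANCE ("0", "1", ...) on each worker, which a small guard can read. A sketch — the fallback-to-primary behavior outside pm2 is an assumption you may not want:

```typescript
// pm2 sets NODE_APP_INSTANCE on each cluster worker.
// Run singleton work (cron jobs, migrations) only on instance 0.
export function isPrimaryInstance(
  env: Record<string, string | undefined> = process.env,
): boolean {
  // Assumption: outside pm2 (plain `node dist/index.js`) the variable is
  // unset, and we treat the lone process as primary so jobs still run.
  return (env.NODE_APP_INSTANCE ?? "0") === "0"
}

if (isPrimaryInstance()) {
  // schedule the nightly cleanup here — runs once, not once per instance
}
```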

For stateful workloads, node:cluster's IPC channel becomes valuable. If workers need to share a distributed cache invalidation signal, a rate-limit counter, or a configuration reload event, the primary-to-worker message passing is the correct mechanism. Redis pub/sub is another option, but for coordinating workers on the same machine, IPC has lower latency and zero external dependencies. Building this on top of node:cluster directly — rather than trying to use pm2 inter-process messaging — gives cleaner, more testable code. Use pm2 for the outer lifecycle management while your application handles its own worker coordination via the cluster IPC channel.
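A discriminated union plus a type guard keeps that IPC channel honest. This sketch extends the cache-invalidation example from the node:cluster section; the config-reload variant is hypothetical:

```typescript
// Every message your workers exchange, as a discriminated union:
type ClusterMessage =
  | { type: "cache-invalidate"; key: string }
  | { type: "config-reload" }

// Narrow the unknown payload from process.on("message", ...) safely.
function isClusterMessage(message: unknown): message is ClusterMessage {
  if (typeof message !== "object" || message === null) return false
  const m = message as { type?: unknown; key?: unknown }
  if (m.type === "cache-invalidate") return typeof m.key === "string"
  return m.type === "config-reload"
}

process.on("message", (message) => {
  if (!isClusterMessage(message)) return // ignore unrelated/internal chatter
  switch (message.type) {
    case "cache-invalidate":
      // message.key is typed as string here — e.g. cache.delete(message.key)
      break
    case "config-reload":
      break
  }
})
```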


Logging and Log Aggregation in Production

pm2's built-in log management (pm2 logs, pm2 flush, log rotation with pm2-logrotate) covers the most common single-server logging needs, but it doesn't solve centralized log aggregation across multiple instances. For multi-server deployments, pm2 logs stay local to each machine unless you integrate with an external log shipper.

The recommended pattern is to configure your Node.js application to write structured JSON logs to stdout and stderr, let pm2 capture those streams per its configuration, and then use a log shipper (Filebeat, Fluentd, or a sidecar container) to forward them to a central store like Elasticsearch, Loki, or CloudWatch. This keeps pm2 responsible for process management while delegating log aggregation to a dedicated tool.

With node:cluster and a single managed process, the same stdout/stderr approach applies without any pm2-specific configuration. In Docker environments, stdout and stderr from any process (pm2 or not) are captured by the container runtime's logging driver and can be routed to any backend without application-level configuration.
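A structured stdout logger needs no library at all. This sketch emits one JSON object per line, the format shippers like Filebeat and Fluentd parse natively; the field names are assumptions, and pino or winston would be the usual production choice:

```typescript
type Level = "info" | "warn" | "error"

// One JSON object per line on stdout — the contract log shippers expect.
function log(level: Level, msg: string, fields: Record<string, unknown> = {}): string {
  const line = JSON.stringify({
    level,
    msg,
    time: new Date().toISOString(),
    pid: process.pid, // distinguishes cluster workers in aggregated logs
    ...fields,
  })
  console.log(line)
  return line
}

log("info", "server started", { port: 3000 })
log("error", "npm registry timeout", { pkg: "react", ms: 5021 })
```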

Methodology

Download data from npm registry (weekly average, February 2026). Feature comparison based on pm2 v5.x, Node.js 22 cluster module, and tsx v4.x.


See also: h3 vs polka vs koa 2026, proxy-agent vs global-agent vs hpagent, and better-sqlite3 vs libsql vs sql.js.
