PM2 Process Management for Node.js in Production

PM2 process management for Node.js production: ecosystem file configuration, zero-downtime restarts, log management, monitoring, clustering, and deployment workflows.

ECOSIRE Research and Development Team
March 19, 2026 | 9 min read | 1.9k words

When your Node.js application crashes at 2 AM, PM2 is the difference between it restarting automatically and your users seeing a blank page until you wake up. PM2 is a battle-tested process manager that handles automatic restarts, clustering for multi-core utilization, log aggregation, and zero-downtime deploys — all with a single configuration file that lives in your repository.

This guide covers a production PM2 setup managing five Node.js applications side by side: Next.js (frontend), NestJS (API, two clustered workers), Docusaurus (docs), and two brand sites. The patterns apply equally to single-process deployments.

Key Takeaways

  • The ecosystem.config.cjs file (CommonJS, not .js) works with both ESModule and CommonJS projects
  • --update-env flag is required when restarting to pick up new environment variables
  • Never use pm2 restart all without --update-env after updating .env.local
  • watch: false in production — file watching causes infinite restart loops with build outputs
  • max_memory_restart provides automatic memory leak protection without killing the process permanently
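  • node_args: '--max-old-space-size=<MB>' raises the V8 heap limit so memory-intensive operations are less likely to crash with an out-of-memory error; size it per process (the config below uses 1024 and 512)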
  • PM2 logs rotate with pm2-logrotate module — install it immediately after PM2 itself
  • pm2 save and pm2 startup persist your process list across server reboots

Installation

# Install PM2 globally
npm install -g pm2

# Install the log rotation module immediately
pm2 install pm2-logrotate

# Configure log rotation
pm2 set pm2-logrotate:max_size 50M
pm2 set pm2-logrotate:retain 7
pm2 set pm2-logrotate:compress true
pm2 set pm2-logrotate:dateFormat YYYY-MM-DD

Ecosystem Configuration File

The ecosystem.config.cjs file (CommonJS format to work with both ESM and CJS projects) defines all your processes:

// ecosystem.config.cjs
module.exports = {
  apps: [
    // ─── Next.js Frontend ────────────────────────────────────────────
    {
      name: 'ecosire-web',
      script: 'node_modules/.bin/next',
      args: 'start',
      cwd: '/opt/ecosire/app/apps/web',
      instances: 1,        // Single instance: `next start` runs as one fork-mode process
      exec_mode: 'fork',
      env: {
        NODE_ENV: 'production',
        PORT: 3000,
      },
      // Memory management
      max_memory_restart: '1G',
      node_args: '--max-old-space-size=1024',
      // Logging
      out_file: '/var/log/pm2/ecosire-web.out.log',
      error_file: '/var/log/pm2/ecosire-web.err.log',
      merge_logs: true,
      log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
      // Restart behavior
      watch: false,
      restart_delay: 3000,
      max_restarts: 10,
      min_uptime: '30s',   // Must stay up 30s to count as successful start
      autorestart: true,
      // Graceful shutdown
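      kill_timeout: 30000, // 30 seconds to shut down gracefully
      // wait_ready is left out here: `next start` does not call process.send('ready'),
      // so PM2 would wait the full listen_timeout before marking the app online
      listen_timeout: 60000,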
    },

    // ─── NestJS API ──────────────────────────────────────────────────
    {
      name: 'ecosire-api',
      script: 'dist/main.js',
      cwd: '/opt/ecosire/app/apps/api',
      instances: 2,         // Cluster mode for multi-core utilization
      exec_mode: 'cluster',
      env: {
        NODE_ENV: 'production',
        PORT: 3001,
      },
      max_memory_restart: '512M',
      node_args: '--max-old-space-size=512',
      out_file: '/var/log/pm2/ecosire-api.out.log',
      error_file: '/var/log/pm2/ecosire-api.err.log',
      merge_logs: true,
      log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
      watch: false,
      restart_delay: 2000,
      max_restarts: 10,
      min_uptime: '20s',
      autorestart: true,
      kill_timeout: 15000,
      // Graceful cluster reload support: wait for process.send('ready') from main.ts
      wait_ready: true,
      listen_timeout: 30000,
    },

    // ─── Docusaurus Docs ─────────────────────────────────────────────
    {
      name: 'ecosire-docs',
      script: 'node_modules/.bin/docusaurus',
      args: 'serve',
      cwd: '/opt/ecosire/app/apps/docs',
      instances: 1,
      exec_mode: 'fork',
      env: {
        NODE_ENV: 'production',
        PORT: 3002,
      },
      max_memory_restart: '256M',
      out_file: '/var/log/pm2/ecosire-docs.out.log',
      error_file: '/var/log/pm2/ecosire-docs.err.log',
      merge_logs: true,
      log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
      watch: false,
      restart_delay: 3000,
      max_restarts: 5,
      min_uptime: '30s',
      autorestart: true,
      kill_timeout: 10000,
    },

    // ─── Brand Site: Odovation ───────────────────────────────────────
    {
      name: 'odovation-web',
      script: 'node_modules/.bin/next',
      args: 'start',
      cwd: '/opt/ecosire/app/apps/odovation',
      instances: 1,
      exec_mode: 'fork',
      env: {
        NODE_ENV: 'production',
        PORT: 3010,
      },
      max_memory_restart: '512M',
      out_file: '/var/log/pm2/odovation-web.out.log',
      error_file: '/var/log/pm2/odovation-web.err.log',
      merge_logs: true,
      log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
      watch: false,
      restart_delay: 3000,
      max_restarts: 10,
      min_uptime: '30s',
      autorestart: true,
    },

    // ─── Brand Site: MuhammadAmir ────────────────────────────────────
    {
      name: 'muhammadamir-web',
      script: 'node_modules/.bin/next',
      args: 'start',
      cwd: '/opt/ecosire/app/apps/muhammadamir',
      instances: 1,
      exec_mode: 'fork',
      env: {
        NODE_ENV: 'production',
        PORT: 3020,
      },
      max_memory_restart: '512M',
      out_file: '/var/log/pm2/muhammadamir-web.out.log',
      error_file: '/var/log/pm2/muhammadamir-web.err.log',
      merge_logs: true,
      log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
      watch: false,
      restart_delay: 3000,
      max_restarts: 10,
      min_uptime: '30s',
      autorestart: true,
    },
  ],
};
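
Because the ecosystem file is plain JavaScript, the three Next.js entries above, which differ only in name, directory, port, and memory limit, could be generated by a small helper. The following is a minimal sketch under that assumption, not the configuration ECOSIRE actually runs; the helper name nextApp and its parameters are hypothetical, and the Next.js-specific shutdown options are omitted for brevity.

```javascript
// ecosystem.config.cjs (sketch): build the repeated Next.js entries from a helper.
// `nextApp` is a hypothetical helper for illustration, not part of PM2.
const APP_ROOT = '/opt/ecosire/app/apps';
const LOG_DIR = '/var/log/pm2';

function nextApp(name, dir, port, maxMemory = '512M') {
  return {
    name,
    script: 'node_modules/.bin/next',
    args: 'start',
    cwd: `${APP_ROOT}/${dir}`,
    instances: 1,
    exec_mode: 'fork',
    env: { NODE_ENV: 'production', PORT: port },
    max_memory_restart: maxMemory,
    out_file: `${LOG_DIR}/${name}.out.log`,
    error_file: `${LOG_DIR}/${name}.err.log`,
    merge_logs: true,
    log_date_format: 'YYYY-MM-DD HH:mm:ss Z',
    watch: false,
    restart_delay: 3000,
    max_restarts: 10,
    min_uptime: '30s',
    autorestart: true,
  };
}

const apps = [
  nextApp('ecosire-web', 'web', 3000, '1G'),
  nextApp('odovation-web', 'odovation', 3010),
  nextApp('muhammadamir-web', 'muhammadamir', 3020),
];

module.exports = { apps };
```

The trade-off is readability: a flat list is easier to scan and diff, while a helper keeps shared settings in one place once the number of similar sites grows.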

Core PM2 Commands

# Start all processes from ecosystem file
pm2 start ecosystem.config.cjs

# Restart all (with updated environment variables)
pm2 restart ecosystem.config.cjs --update-env

# Graceful reload (zero-downtime for cluster mode)
pm2 reload ecosystem.config.cjs

# Stop all processes
pm2 stop all

# Delete all processes from PM2 registry
pm2 delete all

# Individual process management
pm2 restart ecosire-api
pm2 stop ecosire-docs
pm2 logs ecosire-web --lines 100

# Real-time monitoring dashboard
pm2 monit

# Status overview
pm2 status
pm2 list

Startup on Server Reboot

Without startup configuration, all PM2 processes are lost on server reboot:

# Generate and install the startup script for your init system
pm2 startup
# Copy the output command and run it (it looks like:)
# sudo env PATH=$PATH:/usr/bin pm2 startup systemd -u ubuntu --hp /home/ubuntu

# Save the current process list
pm2 save
# This creates ~/.pm2/dump.pm2 — processes are restored on reboot

# Restore manually from dump.pm2 (a real reboot is the only full test of the startup hook)
pm2 resurrect

Every time you add or remove processes, run pm2 save again to update the dump file.


Zero-Downtime Deployments

For NestJS in cluster mode, PM2 supports true zero-downtime reloads:

# Reload restarts workers one at a time (zero-downtime)
# Old workers handle requests while new ones start
pm2 reload ecosire-api

# vs restart — kills all workers simultaneously (brief downtime)
pm2 restart ecosire-api

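Next.js runs here in fork mode with a single instance, so pm2 reload behaves like a restart and there is a brief gap while the new process boots. wait_ready does not change that. Where wait_ready and listen_timeout do help is the clustered API: with wait_ready: true set on the ecosire-api entry, PM2 keeps the old worker alive until the new one calls process.send('ready') or listen_timeout elapses, so traffic only moves to workers that have finished bootstrapping.

// Requires wait_ready: true on the ecosire-api entry in ecosystem.config.cjs.
// Send the ready signal once NestJS is listening: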

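// apps/api/src/main.ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module'; // path assumes the default Nest CLI layout

async function bootstrap() {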
  const app = await NestFactory.create(AppModule);
  await app.listen(3001);

  // Signal PM2 that the process is ready
  if (process.send) {
    process.send('ready');
  }
}

bootstrap();

Log Management

PM2 logs can fill your disk if not managed. Configure log rotation immediately:

# Install log rotation module
pm2 install pm2-logrotate

# Configuration
pm2 set pm2-logrotate:max_size 50M       # Rotate when log reaches 50MB
pm2 set pm2-logrotate:retain 7           # Keep 7 rotated files per log
pm2 set pm2-logrotate:compress true      # Gzip rotated logs
pm2 set pm2-logrotate:dateFormat YYYY-MM-DD
pm2 set pm2-logrotate:workerInterval 30  # Check rotation interval (seconds)
pm2 set pm2-logrotate:rotateInterval '0 0 * * *'  # Daily at midnight

Useful log commands:

# View all logs combined
pm2 logs

# View specific process logs
pm2 logs ecosire-api

# View with timestamps
pm2 logs --timestamp

# Flush all log files
pm2 flush

# Tail error logs only
pm2 logs ecosire-api --err --lines 200

Monitoring and Metrics

PM2 Plus (formerly Keymetrics) provides cloud-based monitoring. For self-hosted monitoring:

# Built-in terminal dashboard
pm2 monit

# Get JSON status for scripting/monitoring integration
pm2 jlist    # JSON process list
pm2 prettylist  # Formatted process list

# Integrate with your monitoring stack (requires: pm2 install pm2-server-monit)
pm2 set pm2-server-monit:interval 5  # Metrics collection interval
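
The JSON from pm2 jlist can feed a simple scripted health check. Here is a minimal sketch, assuming the output is an array of process objects that carry name, pm2_env.status, pm2_env.restart_time, and monit.memory; the function name findUnhealthy and the thresholds are hypothetical and would need tuning for a real setup.

```javascript
// Sketch: flag PM2 processes that are not online, restart too often,
// or use too much memory. Input is the parsed output of `pm2 jlist`.
// Thresholds are illustrative defaults, not PM2 settings.
function findUnhealthy(processes, { maxRestarts = 10, maxMemoryBytes = 1024 ** 3 } = {}) {
  const problems = [];
  for (const p of processes) {
    const env = p.pm2_env || {};
    const memory = (p.monit && p.monit.memory) || 0;
    if (env.status !== 'online') {
      problems.push({ name: p.name, reason: `status is ${env.status}` });
    } else if ((env.restart_time || 0) > maxRestarts) {
      problems.push({ name: p.name, reason: `restarted ${env.restart_time} times` });
    } else if (memory > maxMemoryBytes) {
      problems.push({ name: p.name, reason: `using ${memory} bytes of memory` });
    }
  }
  return problems;
}

// Hand-written sample in the same shape as `pm2 jlist` output
const sample = [
  { name: 'ecosire-api', pm2_env: { status: 'online', restart_time: 0 }, monit: { memory: 200e6 } },
  { name: 'ecosire-docs', pm2_env: { status: 'errored', restart_time: 5 }, monit: { memory: 0 } },
];
console.log(findUnhealthy(sample));
```

In practice the array would come from JSON.parse on the command's standard output, and a non-empty result could trigger an alert or a non-zero exit code for a cron job.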

For production monitoring, expose PM2 metrics to Prometheus:

# The exporter is a PM2 module, so install it through PM2 rather than npm
pm2 install pm2-prometheus-exporter
pm2 set pm2-prometheus-exporter:port 9209

# Scrape in Prometheus config:
# - job_name: pm2
#   static_configs:
#     - targets: ['localhost:9209']

Deployment Script Integration

A typical deployment sequence:

#!/bin/bash
# scripts/deploy-production.sh

set -e

echo "=== Starting deployment ==="

# Roll back to the commit that was deployed before this run and rebuild it.
# Reverting the commit alone is not enough: the build output must be regenerated.
# Database migrations are not reverted here.
rollback() {
  echo "$1 health check failed, rolling back to $PREVIOUS_COMMIT"
  git reset --hard "$PREVIOUS_COMMIT"
  pnpm install --frozen-lockfile
  npx turbo run build
  pm2 reload ecosystem.config.cjs --update-env
  exit 1
}

# 1. Pull latest code, remembering the currently deployed commit
PREVIOUS_COMMIT=$(git rev-parse HEAD)
git pull origin main

# 2. Install dependencies
pnpm install --frozen-lockfile

# 3. Build all apps (with Turbo remote cache)
TURBO_TOKEN="$TURBO_TOKEN" TURBO_TEAM="$TURBO_TEAM" \
  npx turbo run build

# 4. Run database migrations
pnpm --filter @ecosire/db db:migrate

# 5. Reload PM2 processes
# reload replaces cluster workers one at a time; fork-mode apps are restarted
# --update-env picks up changed environment variables
pm2 reload ecosystem.config.cjs --update-env

# 6. Wait for processes to stabilize
sleep 10

# 7. Health checks
curl -f https://ecosire.com/ -o /dev/null -s || rollback "Web"
curl -f https://api.ecosire.com/api/health -o /dev/null -s || rollback "API"

# 8. Save process state
pm2 save

echo "=== Deployment complete ==="
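
The fixed sleep in step 6 of the script either waits longer than needed or not long enough on a slow boot. One alternative is to poll the health endpoint until it answers. The sketch below shows that idea; waitForHealthy, its options, and the apiIsUp check are hypothetical names, and the global fetch assumes Node.js 18 or newer.

```javascript
// Sketch: poll a health check until it passes or the attempts run out.
// `check` is any async function that resolves to true when the service is healthy.
async function waitForHealthy(check, { attempts = 10, delayMs = 2000 } = {}) {
  for (let i = 1; i <= attempts; i++) {
    try {
      if (await check()) return true;
    } catch (err) {
      // Connection errors are expected while the process is still booting
    }
    if (i < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Example check using the global fetch available in Node.js 18+
const apiIsUp = async () => (await fetch('https://api.ecosire.com/api/health')).ok;

// Possible usage from a deploy step:
//   waitForHealthy(apiIsUp).then((ok) => process.exit(ok ? 0 : 1));
```

A script like this can replace both the sleep and the curl calls, since its exit code tells the shell script whether to roll back.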

Common Pitfalls and Solutions

Pitfall 1: Forgetting --update-env

After updating .env.local, running pm2 restart all without --update-env causes processes to restart with the old environment variables. Always use pm2 restart ecosystem.config.cjs --update-env.

Pitfall 2: Using watch: true in production

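watch: true restarts the process whenever a file under the watched directory changes. In production, every build rewrites output files, and anything the app itself writes there (caches, uploads, logs) triggers another restart, which can turn into a restart loop. Always set watch: false.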

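Pitfall 3: Not handling shutdown signals for graceful shutdown

PM2 sends SIGINT by default when restarting or stopping a process (the signal can be changed with the PM2_KILL_SIGNAL environment variable); systemd and Docker send SIGTERM. If your app doesn't exit on its own, PM2 waits kill_timeout milliseconds and then sends SIGKILL, which can drop in-flight requests. Handle both signals in NestJS:

// main.ts
const app = await NestFactory.create(AppModule);
await app.listen(3001);

// Graceful shutdown: SIGINT comes from PM2, SIGTERM from systemd or Docker
for (const signal of ['SIGINT', 'SIGTERM'] as const) {
  process.on(signal, async () => {
    await app.close();
    process.exit(0);
  });
}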
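
A shutdown handler that awaits a close call can hang if a connection never drains, in which case PM2 ends up sending SIGKILL anyway. A slightly more defensive version adds a timer and ignores repeated signals. This is a sketch in plain JavaScript; installShutdown and its options are hypothetical names, and closeFn stands for whatever closes your app (for NestJS, a function that calls app.close()).

```javascript
// Sketch: close the app on shutdown signals, with a fallback timer.
// `proc` defaults to the real process and is injectable for testing.
function installShutdown(closeFn, { signals = ['SIGINT', 'SIGTERM'], timeoutMs = 10000, proc = process } = {}) {
  let shuttingDown = false;

  const handler = async (signal) => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;

    // Force an exit if closing takes longer than timeoutMs. Keep this below
    // PM2's kill_timeout so the process exits before PM2 sends SIGKILL.
    const timer = setTimeout(() => proc.exit(1), timeoutMs);
    timer.unref();

    try {
      await closeFn(signal);
      clearTimeout(timer);
      proc.exit(0);
    } catch (err) {
      console.error('Shutdown failed:', err);
      clearTimeout(timer);
      proc.exit(1);
    }
  };

  for (const signal of signals) {
    proc.on(signal, handler);
  }
  return handler;
}
```

With the API's kill_timeout of 15000 from the ecosystem file, a timeoutMs of 10000 leaves a few seconds of margin.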

Pitfall 4: Running out of PM2 log disk space

Without pm2-logrotate, PM2 logs grow indefinitely. A heavily trafficked API can generate gigabytes of logs per day. Install pm2-logrotate immediately and set a reasonable max_size (50MB) and retain (7 rotated files per log).

Pitfall 5: Losing processes after reboot

pm2 start does not persist processes across reboots. Always run pm2 startup + pm2 save after initial setup. If processes disappear after a reboot, run pm2 resurrect to restore from the saved dump.


Frequently Asked Questions

When should I use cluster mode vs. fork mode?

Use cluster mode for HTTP servers that should use more than one CPU core or need zero-downtime reloads, such as the NestJS API here. PM2 starts the number of workers set in instances and distributes incoming connections between them through Node's cluster module. Workers do not share memory, so sessions and caches must live in an external store. Use fork mode for everything else: scripts launched through a CLI (next start, docusaurus serve), non-Node interpreters, or apps that keep state in process memory. In this setup each Next.js site runs as a single fork-mode process with instances: 1.
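
PM2 accepts instances: 'max' to start one worker per core, but on a small server memory can run out before cores do. The following sketch picks a worker count that fits both. It is a sizing heuristic, not PM2 behavior, and pickInstances, its parameters, and the 1024 MB reserve are hypothetical.

```javascript
// Sketch: pick a cluster size that fits both CPU cores and memory.
// All names and the default reserve are illustrative assumptions.
const os = require('node:os');

function pickInstances({ cpuCount, totalMemoryMb, perInstanceMb, reservedMb = 1024 }) {
  const byCpu = Math.max(1, cpuCount);
  // Leave reservedMb for the OS and other processes, then divide the rest
  const byMemory = Math.floor((totalMemoryMb - reservedMb) / perInstanceMb);
  return Math.max(1, Math.min(byCpu, byMemory));
}

const instances = pickInstances({
  cpuCount: os.cpus().length,
  totalMemoryMb: Math.floor(os.totalmem() / (1024 * 1024)),
  perInstanceMb: 512, // matches max_memory_restart for ecosire-api
});
console.log(`instances: ${instances}`);
```

Since the ecosystem file is ordinary JavaScript, the result could be assigned directly to the instances field of a cluster-mode entry.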

How do I run PM2 in a Docker container?

PM2 in Docker uses pm2-docker (or pm2-runtime) instead of pm2 to handle signals correctly. The runtime version doesn't daemonize (which would cause Docker to exit), properly forwards signals to child processes, and logs to stdout/stderr instead of files. Use CMD ["pm2-runtime", "ecosystem.config.cjs"] in your Dockerfile.

How do I monitor PM2 processes from a remote machine?

PM2 Plus (pay-per-process cloud service) provides a web dashboard. For self-hosted monitoring, expose PM2's metrics via the Prometheus exporter and visualize in Grafana. For simple status checks, you can SSH and run pm2 status, or expose the metrics via an HTTP endpoint that your monitoring system polls.

What's the difference between pm2 reload and pm2 restart?

pm2 restart kills all workers simultaneously and restarts them — there's a brief period with no running workers (downtime). pm2 reload is graceful: it starts new workers, waits for them to be ready, then shuts down old workers — zero downtime. Use pm2 reload for production deployments. Note: reload only works correctly in cluster mode; fork mode falls back to restart behavior.

How do I set different environment variables for different processes?

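Each process in ecosystem.config.cjs has its own env section and can add env_production (or any env_<name>) sections. The env_production section is applied when you pass --env production to PM2 commands. Never put secrets directly in the ecosystem file. Export them in the environment of the shell that runs pm2, or keep them in a .env.local file that the application or framework loads itself at startup (Next.js does this). PM2 caches the environment a process was started with, and --update-env tells it to refresh that cache from the current shell and ecosystem file on restart.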
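
As an illustration of the env and env_production sections, here is a minimal fragment for a single process. The development values and the DATABASE_URL variable name are hypothetical; only the env and env_production keys and the --env flag come from PM2 itself.

```javascript
// ecosystem.config.cjs (fragment): per-environment variables for one process.
// DATABASE_URL is a hypothetical secret name, read from the calling shell.
const config = {
  apps: [
    {
      name: 'ecosire-api',
      script: 'dist/main.js',
      // Used by default: pm2 start ecosystem.config.cjs
      env: {
        NODE_ENV: 'development',
        PORT: 3001,
      },
      // Used with: pm2 start ecosystem.config.cjs --env production
      env_production: {
        NODE_ENV: 'production',
        PORT: 3001,
        // Read from the environment when PM2 loads this file,
        // so the secret itself never lands in the repository
        DATABASE_URL: process.env.DATABASE_URL,
      },
    },
  ],
};

module.exports = config;
```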


Next Steps

PM2 is a fundamental part of any production Node.js deployment. ECOSIRE manages 5 PM2 processes in production — Next.js, NestJS, Docusaurus, and two brand sites — with automatic restarts, log rotation, and zero-downtime deployments on every push to main.

Whether you need DevOps engineering support, production deployment architecture, or help migrating to a containerized setup, explore our services to see how we can help.

Written by

ECOSIRE Research and Development Team

Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.
