PM2 Cluster on AWS EC2: Zero-Downtime Node.js Deployment Playbook

Cluster mode is still a practical option when you need predictable cost and simple operations. The key is disciplined process management, graceful shutdown, and repeatable release scripts.

1) Use an explicit PM2 ecosystem file

ecosystem.config.cjs (JavaScript)

module.exports = {
  apps: [
    {
      name: "api",
      script: "./dist/server.js",
      instances: "max",
      exec_mode: "cluster",
      listen_timeout: 10000,
      kill_timeout: 5000,
      max_memory_restart: "400M",
      env: {
        NODE_ENV: "production",
        PORT: 4000
      }
    }
  ]
};
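The `listen_timeout` value caps how long PM2 waits for a new worker to come up during a reload. By default PM2 treats a worker as up once it is listening. If the service needs more than an open port before it can serve traffic, PM2's `wait_ready: true` option makes the reload wait for an explicit `process.send("ready")` message instead. The sketch below shows one possible way to wire that up. It assumes `wait_ready: true` has been added to the app entry, which the config above does not do, and `signalReady` and `start` are hypothetical names used only for illustration.

```javascript
// Sketch only: assumes `wait_ready: true` is set in ecosystem.config.cjs.
// `signalReady` and `start` are hypothetical helper names.
const http = require("node:http");

// Tell the process manager that this worker can take traffic.
// process.send only exists when the process has an IPC channel
// (as it does under PM2), so guard against plain `node server.js` runs.
function signalReady(proc = process) {
  if (typeof proc.send !== "function") return false;
  proc.send("ready");
  return true;
}

// Bind the port first, then report readiness from the listen callback.
function start(port, onReady = signalReady) {
  const server = http.createServer((req, res) => res.end("ok"));
  server.listen(port, () => onReady());
  return server;
}
```

Calling `start(process.env.PORT || 4000)` from the entry point would be enough here. A real service would likely wait for its database or cache connections before calling `signalReady()`.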

2) Graceful shutdown is mandatory

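If the service does not stop cleanly, a PM2 reload can still drop in-flight requests. PM2 sends SIGINT by default when it stops or reloads a process and follows up with SIGKILL once kill_timeout elapses, so handle SIGINT and SIGTERM, close the HTTP server, and exit before that deadline.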

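server-shutdown.js (JavaScript)

// `app` is the HTTP application created elsewhere in the service.
const server = app.listen(process.env.PORT || 4000);
let shuttingDown = false;

function gracefulExit(signal) {
  if (shuttingDown) return; // ignore repeated signals
  shuttingDown = true;
  console.log("received", signal);
  // Stop accepting new connections; exit once open ones have closed.
  server.close(() => process.exit(0));
  // Force exit before PM2's kill_timeout (5000 ms in the ecosystem file) sends SIGKILL.
  setTimeout(() => process.exit(1), 4000).unref();
}

process.on("SIGINT", () => gracefulExit("SIGINT"));
process.on("SIGTERM", () => gracefulExit("SIGTERM"));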
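One detail the handler above leaves open is idle keep-alive connections: `server.close()` stops new connections but, on older Node.js versions, waits for idle keep-alive sockets to close on their own. The sketch below is a hedged variant that also closes idle sockets where `server.closeIdleConnections()` is available (Node.js 18.2 and later). `createShutdown` and its options are hypothetical names, and the 4000 ms default is an assumption chosen to sit below the 5000 ms `kill_timeout` in the ecosystem file.

```javascript
// Sketch only: `createShutdown`, `timeoutMs`, and `exit` are hypothetical names.

// Build a signal handler that drains `server` and then calls `exit`.
// `exit` is injectable so the handler can be exercised without ending the process.
function createShutdown(server, { timeoutMs = 4000, exit = process.exit } = {}) {
  let shuttingDown = false;

  return function shutdown(signal) {
    if (shuttingDown) return; // a second signal must not restart the sequence
    shuttingDown = true;
    console.log("received", signal);

    // Safety net: force a non-zero exit if draining takes too long.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    // Stop accepting new connections; the callback runs once all have closed.
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });

    // Idle keep-alive sockets would otherwise hold the server open.
    if (typeof server.closeIdleConnections === "function") {
      server.closeIdleConnections();
    }
  };
}
```

To use it, create the handler once with `const shutdown = createShutdown(server)` and register it for both signals with `process.on("SIGINT", () => shutdown("SIGINT"))` and `process.on("SIGTERM", () => shutdown("SIGTERM"))`.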

3) Keep Nginx config boring and explicit

/etc/nginx/sites-enabled/api.conf (Nginx)

server {
  listen 80;
  server_name api.dudelemon.com;

  location / {
    proxy_pass http://127.0.0.1:4000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
  }
}

4) Release script for zero-downtime deploy

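deploy.sh (Bash)

#!/usr/bin/env bash
# Abort on any error, unset variable, or failed pipeline stage.
set -euo pipefail

# Sync the working tree to the latest main, discarding local changes.
git fetch origin main
git reset --hard origin/main
# Install exact lockfile dependencies and build.
npm ci
npm run build
# Replace workers one at a time and persist the process list.
pm2 reload ecosystem.config.cjs --update-env
pm2 save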

Most outages during deploy come from process shutdown bugs, not PM2 itself.

5) Infrastructure baseline before first production deploy

Harden SSH access and disable password authentication.
Set up CloudWatch/Datadog monitoring for CPU, memory, disk, and error rates.
Provision TLS certificates and renewal automation.
Back up environment and deployment secrets outside the instance.

6) Release safety and rollback strategy

A rollback plan must be executable in minutes. Keep last known good artifact references and a scriptable rollback path. If health checks fail after deploy, rollback should be automatic or one command away.

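rollback.sh (Bash)

#!/usr/bin/env bash
set -euo pipefail

# Require the target ref so a missing argument fails with a clear message.
ref="${1:?usage: rollback.sh <tag-or-commit>}"

git fetch origin
git checkout "$ref"   # tag or commit hash
npm ci
npm run build
pm2 reload ecosystem.config.cjs --update-env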
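To make the "health checks fail after deploy" condition concrete, the sketch below polls an HTTP endpoint after a reload and reports whether it became healthy, so a deploy script can decide whether to call rollback.sh. This is a minimal sketch, not a definitive implementation: the `/health` path, the retry counts, and the names `probe` and `waitForHealthy` are all assumptions that do not appear in the original setup.

```javascript
// Sketch only: endpoint path, retry numbers, and function names are assumptions.
const http = require("node:http");

// Resolve with the status code of one GET; reject on network error or timeout.
function probe(url, timeoutMs) {
  return new Promise((resolve, reject) => {
    const req = http.get(url, (res) => {
      res.resume(); // discard the body so the socket is released
      resolve(res.statusCode);
    });
    req.setTimeout(timeoutMs, () => req.destroy(new Error("probe timed out")));
    req.on("error", reject);
  });
}

// Poll until the endpoint answers 200, or give up after `attempts` tries.
async function waitForHealthy(url, { attempts = 10, delayMs = 1000, timeoutMs = 2000 } = {}) {
  for (let i = 1; i <= attempts; i++) {
    try {
      if ((await probe(url, timeoutMs)) === 200) return true;
    } catch (err) {
      // Connection refused or timed out: the worker may still be starting.
    }
    if (i < attempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

A small wrapper script could call `waitForHealthy("http://127.0.0.1:4000/health")` and exit with a non-zero code when it resolves to `false`, which lets the deploy script run the rollback automatically.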

7) Observability signals that matter most

P95 and P99 latency by route.
Non-2xx response rate over 5-minute windows.
Process restarts and memory pressure per worker.
Connection errors between app and dependent services.
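As a rough illustration of the first two signals, the sketch below computes P95 and P99 latency per route and the non-2xx rate over a 5-minute window from in-memory request samples. In practice a monitoring agent usually does this work. The sample shape (`route`, `ms`, `status`, `at`) and the function names are assumptions for illustration, and the percentile uses the nearest-rank method, which is one of several common definitions.

```javascript
// Sketch only: sample shape and function names are assumptions.

// Nearest-rank percentile: the smallest sample with at least p% of values at or below it.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  // Multiply before dividing so integer percentiles avoid floating-point drift.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Share of responses in the last `windowMs` whose status is outside 200-299.
function non2xxRate(samples, now, windowMs = 5 * 60 * 1000) {
  const recent = samples.filter((s) => now - s.at <= windowMs);
  if (recent.length === 0) return 0;
  const bad = recent.filter((s) => s.status < 200 || s.status > 299).length;
  return bad / recent.length;
}

// Group latencies by route and report P95 and P99 for each.
function latencyByRoute(samples) {
  const byRoute = new Map();
  for (const s of samples) {
    if (!byRoute.has(s.route)) byRoute.set(s.route, []);
    byRoute.get(s.route).push(s.ms);
  }
  const report = {};
  for (const [route, ms] of byRoute) {
    report[route] = { p95: percentile(ms, 95), p99: percentile(ms, 99) };
  }
  return report;
}
```

Each sample here is an object such as `{ route: "/users", ms: 42, status: 200, at: Date.now() }`, recorded once per completed request.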

PM2 on EC2 FAQ

Q: Should we run one big EC2 instance or multiple smaller nodes?
A: Prefer multiple nodes for resilience and controlled failure domains.

Q: Is PM2 enough for all scaling needs?
A: PM2 handles process-level scaling well, but infrastructure auto-scaling and load balancing still matter.
