Most Node.js tutorials teach you how to build a REST API in 20 minutes. None of them teach you what happens when that API serves 10,000 requests per second and an unhandled promise rejection crashes the process at 3 AM.
At Pillai Infotech, we run Node.js in production across multiple client projects — API servers, real-time services, background workers. These practices come from real incidents: the memory leak that took down a payment service, the missing error handler that silently dropped orders, the unvalidated input that nearly became a security breach.
Project Structure That Scales
The flat file structure that works for a 500-line Express app doesn't work at 50,000 lines. Structure your project by domain, not by technical role.
// DON'T: Group by technical role (doesn't scale)
├── controllers/
│ ├── userController.js
│ ├── orderController.js
│ └── productController.js
├── models/
├── routes/
├── services/
// DO: Group by domain/feature
├── modules/
│ ├── users/
│ │ ├── user.controller.ts
│ │ ├── user.service.ts
│ │ ├── user.repository.ts
│ │ ├── user.routes.ts
│ │ ├── user.validation.ts
│ │ └── user.test.ts
│ ├── orders/
│ │ ├── order.controller.ts
│ │ ├── order.service.ts
│ │ └── ...
├── shared/
│ ├── middleware/
│ ├── errors/
│ ├── config/
│ └── database/
├── app.ts // Express setup
└── server.ts // HTTP server + graceful shutdown
Key principles: keep app.ts (Express configuration) separate from server.ts (HTTP server lifecycle). This lets you import the app for testing without starting the server. Each module is self-contained — you can understand the orders feature by reading one folder.
Error Handling: The #1 Production Issue
Unhandled errors are among the most common causes of Node.js crashes in production. Every error must be caught, logged, and handled, no exceptions.
Custom Error Classes
// shared/errors/app-error.ts
export class AppError extends Error {
constructor(
public statusCode: number,
message: string,
public isOperational = true // vs programmer errors
) {
super(message);
Error.captureStackTrace(this, this.constructor);
}
}
export class NotFoundError extends AppError {
constructor(resource: string) {
super(404, `${resource} not found`);
}
}
export class ValidationError extends AppError {
constructor(message: string) {
super(400, message);
}
}
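To show why the isOperational flag earns its place, the sketch below maps any thrown value to a status code and a client-safe message. The error classes are repeated (without the V8-specific captureStackTrace call) so the snippet stands alone, and toClientResponse is a hypothetical helper name rather than part of any library.

```typescript
// Self-contained copy of the error classes, so the snippet runs on its own.
class AppError extends Error {
  constructor(
    public statusCode: number,
    message: string,
    public isOperational = true,
  ) {
    super(message);
    // Keeps instanceof working even when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

class NotFoundError extends AppError {
  constructor(resource: string) {
    super(404, `${resource} not found`);
  }
}

interface ClientResponse {
  status: number;
  body: { error: string };
}

// Hypothetical helper: decide what the client is allowed to see.
function toClientResponse(err: unknown): ClientResponse {
  if (err instanceof AppError && err.isOperational) {
    return { status: err.statusCode, body: { error: err.message } };
  }
  // Anything else is treated as a programmer error: hide the details.
  return { status: 500, body: { error: 'Internal server error' } };
}

const notFound = toClientResponse(new NotFoundError('User'));
const bug = toClientResponse(new TypeError('x is undefined'));
console.log(notFound.status, notFound.body.error);
console.log(bug.status, bug.body.error);
```

Keeping this decision in one function means the client-facing contract can be unit tested without spinning up Express.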
Centralized Error Handler
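// shared/middleware/error-handler.ts
import type { Request, Response, NextFunction } from 'express';
import { AppError } from '../errors/app-error';
import { logger } from '../logger'; // assumed path to the shared Pino logger
// Express recognizes error handlers by their four parameters, so keep `next` even if unused
export function errorHandler(err: unknown, req: Request, res: Response, next: NextFunction) {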
// Operational errors: send to client
if (err instanceof AppError && err.isOperational) {
logger.warn({ err, path: req.path }, 'Operational error');
return res.status(err.statusCode).json({
error: err.message
});
}
// Programmer errors: log, don't expose details
logger.error({ err, path: req.path }, 'Unexpected error');
res.status(500).json({ error: 'Internal server error' });
// For truly unexpected errors, consider graceful shutdown
// process.exit(1) — let the process manager restart
}
// CRITICAL: Catch unhandled rejections and exceptions
process.on('unhandledRejection', (reason) => {
logger.fatal({ reason }, 'Unhandled rejection — shutting down');
// Throw to trigger uncaughtException handler
throw reason;
});
process.on('uncaughtException', (err) => {
logger.fatal({ err }, 'Uncaught exception — shutting down');
// Graceful shutdown: stop accepting new requests,
// finish in-flight, then exit
server.close(() => process.exit(1));
// Force kill after 10s if graceful fails; unref() so this timer alone doesn't keep the process alive
setTimeout(() => process.exit(1), 10000).unref();
});
Async Error Wrapper for Express
// Express 4 doesn't catch async errors automatically (Express 5 forwards rejected promises to next for you)
const asyncHandler = (fn) => (req, res, next) =>
Promise.resolve(fn(req, res, next)).catch(next);
// Usage: no try-catch needed in route handlers
router.get('/users/:id', asyncHandler(async (req, res) => {
const user = await userService.findById(req.params.id);
if (!user) throw new NotFoundError('User');
res.json(user);
}));
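To see what the wrapper does without starting Express, the sketch below calls it with stand-in request and response objects. The Req, Res, and Next types are simplified placeholders, not Express's real types.

```typescript
// Simplified stand-ins for Express's request, response, and next types.
type Req = { params: Record<string, string> };
type Res = { json: (body: unknown) => void };
type Next = (err?: unknown) => void;
type Handler = (req: Req, res: Res, next: Next) => unknown;

// Same wrapper as above, with types added.
const asyncHandler =
  (fn: Handler): Handler =>
  (req, res, next) =>
    Promise.resolve(fn(req, res, next)).catch(next);

// A handler that rejects, as a failed database lookup might.
const failing = asyncHandler(async () => {
  throw new Error('lookup failed');
});

const forwarded: unknown[] = [];
const done = Promise.resolve(
  failing({ params: { id: '42' } }, { json: () => {} }, (err) => {
    forwarded.push(err);
  }),
).then(() => {
  // The rejection reached next() instead of becoming an unhandled rejection.
  console.log('errors forwarded to next():', forwarded.length);
});
```

Once next() receives the error, the centralized error handler takes over, which is why route handlers no longer need their own try/catch.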
Security Hardening
Node.js apps are internet-facing by nature. These security measures aren't optional — they're the minimum for any production deployment.
Essential Security Headers
import helmet from 'helmet';
import rateLimit from 'express-rate-limit';
import cors from 'cors';
// Security headers (helmet sets about a dozen by default, e.g. Content-Security-Policy and Strict-Transport-Security)
app.use(helmet());
// CORS: whitelist specific origins
app.use(cors({
origin: ['https://yourapp.com', 'https://admin.yourapp.com'],
credentials: true
}));
// Rate limiting: prevent brute force
const limiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // 100 requests per window
standardHeaders: true,
legacyHeaders: false,
message: { error: 'Too many requests, please try again later' }
});
app.use('/api/', limiter);
// Stricter limit for auth endpoints
const authLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 5, // 5 login attempts per 15 minutes
});
app.use('/api/auth/login', authLimiter);
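To make windowMs and max concrete, here is a simplified in-memory fixed-window counter. It only illustrates the idea and is not how express-rate-limit is implemented; FixedWindowLimiter and its injected clock are hypothetical names, chosen so the behaviour can be checked without waiting 15 minutes.

```typescript
// Illustrative fixed-window limiter: at most `max` hits per key per `windowMs`.
class FixedWindowLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();

  constructor(
    private windowMs: number,
    private max: number,
    private now: () => number = Date.now, // injectable clock for testing
  ) {}

  // Returns true if the request is allowed, false if it should get a 429.
  allow(key: string): boolean {
    const t = this.now();
    const entry = this.hits.get(key);
    if (!entry || t - entry.windowStart >= this.windowMs) {
      // First hit, or the previous window has expired: start a new window.
      this.hits.set(key, { count: 1, windowStart: t });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}

let fakeTime = 0;
const loginLimiter = new FixedWindowLimiter(15 * 60 * 1000, 5, () => fakeTime);

const results: boolean[] = [];
for (let i = 0; i < 6; i++) results.push(loginLimiter.allow('203.0.113.7'));
fakeTime += 15 * 60 * 1000; // jump to the next window
const afterReset = loginLimiter.allow('203.0.113.7');
console.log(results, afterReset);
```

A counter like this lives in one process's memory, and express-rate-limit's default store behaves the same way. If you run several workers or servers, each keeps its own count unless you configure a shared store. A production limiter also needs to evict stale keys, which this sketch skips.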
Input Validation at Every Boundary
import { z } from 'zod';
// Validate with Zod — parse, don't validate
const CreateUserSchema = z.object({
email: z.string().email().max(255),
name: z.string().min(1).max(100),
password: z.string().min(8).max(128),
});
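router.post('/users', asyncHandler(async (req, res) => {
// parse() would throw a ZodError, which the error handler above reports as a 500.
// safeParse() lets us turn bad input into a 400 via ValidationError instead.
const result = CreateUserSchema.safeParse(req.body);
if (!result.success) {
throw new ValidationError(result.error.issues.map((issue) => issue.message).join('; '));
}
// result.data is now typed AND validated
const user = await userService.create(result.data);
res.status(201).json(user);
}));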
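The "parse, don't validate" idea can be shown without any library: a function takes unknown input and returns either a typed value or a list of problems. parseCreateUser below is a hand-rolled, hypothetical stand-in for what a schema library does, with a deliberately crude email check.

```typescript
interface CreateUserInput {
  email: string;
  name: string;
  password: string;
}

type ParseResult<T> =
  | { ok: true; data: T }
  | { ok: false; issues: string[] };

// Hypothetical hand-written parser: unknown in, typed value or issues out.
function parseCreateUser(input: unknown): ParseResult<CreateUserInput> {
  const issues: string[] = [];
  const obj = (typeof input === 'object' && input !== null ? input : {}) as Record<string, unknown>;
  const { email, name, password } = obj;

  if (typeof email !== 'string' || email.length > 255 || !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email)) {
    issues.push('email: must be a valid address of at most 255 characters');
  }
  if (typeof name !== 'string' || name.length < 1 || name.length > 100) {
    issues.push('name: must be 1 to 100 characters');
  }
  if (typeof password !== 'string' || password.length < 8 || password.length > 128) {
    issues.push('password: must be 8 to 128 characters');
  }

  if (issues.length > 0) return { ok: false, issues };
  // Only the known fields are copied, so unexpected properties are dropped.
  return {
    ok: true,
    data: { email: email as string, name: name as string, password: password as string },
  };
}

const good = parseCreateUser({
  email: 'test@example.com',
  name: 'Test',
  password: 'secure123',
  role: 'admin', // not part of the schema, so it will not survive parsing
});
const bad = parseCreateUser({ email: 'not-an-email', name: '', password: 'short' });
console.log(good, bad);
```

Everything downstream of the parser works with CreateUserInput, so the rest of the code never has to re-check the shape of the request body.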
More security essentials: never expose stack traces in production; use parameterized queries (never build SQL by string concatenation); store secrets in environment variables, not in code; keep dependencies updated (run npm audit in CI); and consider Node's permission model (the --permission flag, or --experimental-permission on older releases) to restrict filesystem access.
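For the environment-variable point, one option is to read and check configuration once at startup, so a missing secret fails fast instead of surfacing mid-request. The variable names DATABASE_URL, JWT_SECRET, and PORT below are placeholders for illustration, as is the loadConfig helper.

```typescript
interface AppConfig {
  databaseUrl: string;
  jwtSecret: string;
  port: number;
}

// Reads config from an env-like object (pass process.env in a real app).
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = ['DATABASE_URL', 'JWT_SECRET'].filter((key) => !env[key]);
  if (missing.length > 0) {
    // Name the missing variables, but never print their values.
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const port = Number(env.PORT ?? '3000');
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('PORT must be an integer between 1 and 65535');
  }
  return {
    databaseUrl: env.DATABASE_URL as string,
    jwtSecret: env.JWT_SECRET as string,
    port,
  };
}

const config = loadConfig({
  DATABASE_URL: 'postgres://localhost:5432/app',
  JWT_SECRET: 'dev-only-secret',
});
console.log(config.port);
```

Taking the environment as a parameter, rather than reading process.env inside the function, keeps the loader easy to test.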
Performance and Scaling
Clustering: Use All CPU Cores
Node.js executes your JavaScript on a single thread, so a 4-core server running a single Node process leaves roughly 75% of its CPU capacity idle. Use the cluster module or PM2 to fork multiple workers.
// server.ts — Cluster mode
import cluster from 'node:cluster';
import { availableParallelism } from 'node:os';
const numCPUs = availableParallelism();
if (cluster.isPrimary) {
console.log(`Primary ${process.pid} forking ${numCPUs} workers`);
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
cluster.on('exit', (worker, code) => {
console.error(`Worker ${worker.process.pid} died (code ${code})`);
cluster.fork(); // Replace dead workers
});
} else {
// Workers share the TCP port
app.listen(3000, () => {
console.log(`Worker ${process.pid} started`);
});
}
// Or just use PM2:
// pm2 start server.js -i max
Don't Block the Event Loop
This is the single most important Node.js performance rule. Any synchronous operation that takes more than ~10ms blocks every other request.
import fs from 'node:fs';
import crypto from 'node:crypto';
import util from 'node:util';
// BAD: Blocks the event loop
const dataSync = fs.readFileSync('large-file.json'); // Blocks until the whole file is read
const parsed = JSON.parse(hugeString); // Synchronous: a payload of tens of MB can stall the loop
const hashSync = crypto.pbkdf2Sync(password, salt, 100000, 64, 'sha512');
// GOOD: Non-blocking alternatives
const data = await fs.promises.readFile('large-file.json');
const hash = await util.promisify(crypto.pbkdf2)(
password, salt, 100000, 64, 'sha512'
);
// For CPU-intensive work: use Worker Threads
// (runInWorker is a helper you write yourself, not a Node.js API)
import { Worker } from 'node:worker_threads';
const result = await runInWorker('heavy-computation.js', inputData);
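One possible shape for the runInWorker helper used above is sketched below. It assumes the worker script reads its input from workerData and posts exactly one result back. The example worker is inlined as a data: URL purely so the snippet is self-contained.

```typescript
import { Worker } from 'node:worker_threads';

// Hypothetical helper: run a script in a worker thread and resolve with the
// first message it posts. `script` is a file path or a file:/data: URL.
function runInWorker<TInput, TResult>(script: string | URL, input: TInput): Promise<TResult> {
  return new Promise<TResult>((resolve, reject) => {
    const worker = new Worker(script, { workerData: input });
    worker.once('message', (result: TResult) => resolve(result));
    worker.once('error', reject);
    worker.once('exit', (code) => {
      // No-op if a message has already resolved the promise.
      reject(new Error(`Worker exited (code ${code}) before posting a result`));
    });
  });
}

// Example worker, inlined so the sketch stands alone. In a real project this
// would live in its own file, such as heavy-computation.js.
const workerSource = `
  import { parentPort, workerData } from 'node:worker_threads';
  let sum = 0;
  for (let i = 1; i <= workerData.upTo; i++) sum += i;
  parentPort.postMessage(sum);
`;
const workerUrl = new URL(`data:text/javascript,${encodeURIComponent(workerSource)}`);

const sumPromise = runInWorker<{ upTo: number }, number>(workerUrl, { upTo: 1000 });
sumPromise.then((sum) => console.log('sum from worker:', sum));
```

Starting a worker has a cost, so for frequent small jobs a reusable pool of workers is usually a better fit than spawning one per call.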
Connection Pooling
Creating a new database connection per request quickly becomes a bottleneck, because every request pays the connection setup cost. Use a connection pool instead; most mainstream database drivers provide one.
// PostgreSQL connection pool
import pg from 'pg';
const pool = new pg.Pool({
host: process.env.DB_HOST,
max: 20, // Max connections in pool
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
// Use pool.query — connections are automatically managed
const { rows } = await pool.query('SELECT * FROM users WHERE id = $1', [id]);
Memory Management and Leak Prevention
Memory leaks are insidious — your app works fine for hours, then response times spike and the process crashes with "heap out of memory." We've debugged dozens of these.
Common Leak Sources
- Global caches without eviction — objects accumulate indefinitely
- Event listeners not removed — especially in WebSocket or EventEmitter patterns
- Closures holding references — large objects trapped in scope
- Growing arrays/maps — request logs, error queues without size limits
// BAD: Unbounded in-memory cache — will leak
const cache = {};
function getUser(id) {
if (!cache[id]) cache[id] = db.findUser(id);
return cache[id];
}
// GOOD: LRU cache with max size
import { LRUCache } from 'lru-cache';
const cache = new LRUCache({ max: 500, ttl: 1000 * 60 * 5 });
// Monitor memory in production
setInterval(() => {
const mem = process.memoryUsage();
logger.info({
rss: Math.round(mem.rss / 1024 / 1024) + 'MB',
heap: Math.round(mem.heapUsed / 1024 / 1024) + 'MB',
}, 'Memory usage');
}, 30000);
Tip: run with --max-old-space-size=512 in development (lower than in production). Memory leaks then surface sooner, making them easier to catch before deployment.
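For intuition about what a bounded cache does, here is a minimal LRU built on Map's insertion ordering. It is a teaching sketch only (no TTL, no size accounting), TinyLRU is a hypothetical name, and the lru-cache package shown above remains the better choice in production.

```typescript
// Minimal LRU cache: Map iterates keys in insertion order, so the first key
// is always the least recently used one.
class TinyLRU<K, V> {
  private entries = new Map<K, V>();

  constructor(private maxEntries: number) {}

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

const users = new TinyLRU<number, string>(2);
users.set(1, 'Ada');
users.set(2, 'Grace');
users.get(1);          // touch 1, so 2 is now the oldest
users.set(3, 'Linus'); // exceeds the limit and evicts 2
console.log(users.size, users.get(2), users.get(1));
```

The key property is that memory use is capped by maxEntries no matter how many distinct keys the application sees.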
Logging and Observability
If you're using console.log in production, you're already in trouble. Structured logging is non-negotiable for any app that needs debugging.
// Use Pino, one of the fastest Node.js loggers
import pino from 'pino';
const logger = pino({
level: process.env.LOG_LEVEL || 'info',
// JSON in production, pretty in development
transport: process.env.NODE_ENV === 'development'
? { target: 'pino-pretty' }
: undefined,
});
// Add request context to every log
app.use((req, res, next) => {
req.log = logger.child({
requestId: req.headers['x-request-id'] || crypto.randomUUID(),
method: req.method,
path: req.path,
});
next();
});
// Usage in handlers
router.get('/orders/:id', asyncHandler(async (req, res) => {
req.log.info({ orderId: req.params.id }, 'Fetching order');
const order = await orderService.findById(req.params.id);
res.json(order);
}));
For full observability, add distributed tracing (OpenTelemetry), metrics (Prometheus client), and health check endpoints.
// Health check endpoint
app.get('/health', (req, res) => {
const health = {
status: 'ok',
uptime: process.uptime(),
timestamp: Date.now(),
memory: process.memoryUsage(),
};
res.json(health);
});
// Readiness check — includes dependency health
app.get('/ready', async (req, res) => {
try {
await pool.query('SELECT 1'); // DB alive?
await redis.ping(); // Cache alive?
res.json({ status: 'ready' });
} catch (err) {
res.status(503).json({ status: 'not ready', error: err.message });
}
});
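A readiness probe that hangs is as unhelpful as one that fails, so it can be worth putting a deadline on each dependency check. The sketch below is one way to do that. runChecks, withTimeout, and the check names are hypothetical, and the stand-in checks would be replaced by calls such as pool.query('SELECT 1').

```typescript
type Check = () => Promise<unknown>;

// Reject if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Run all checks in parallel and report which ones failed.
async function runChecks(checks: Record<string, Check>, timeoutMs = 1000) {
  const names = Object.keys(checks);
  const settled = await Promise.allSettled(
    names.map((name) => withTimeout(checks[name](), timeoutMs, name)),
  );
  const failed = names.filter((_, i) => settled[i].status === 'rejected');
  return { ready: failed.length === 0, failed };
}

// Stand-in checks: one healthy dependency, one that never answers.
const readiness = runChecks(
  {
    database: async () => 'ok',
    cache: () => new Promise(() => {}), // never settles
  },
  50,
);
readiness.then((result) => console.log(result));
```

In the /ready route you would respond with 503 when ready is false, and log the failed names rather than returning raw error messages to the caller.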
Testing Strategy
We use Vitest (faster than Jest, compatible API) with this testing pyramid:
- Unit tests (70%): services, utilities, validators. Fast, no I/O.
- Integration tests (20%): API endpoints hitting a real test database. Use supertest with the Express app (no server startup needed).
- E2E tests (10%): critical user flows only. Expensive, run in CI.
// Integration test with supertest + Vitest
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import request from 'supertest';
import { app } from '../app';
describe('POST /api/users', () => {
it('creates a user with valid data', async () => {
const res = await request(app)
.post('/api/users')
.send({ email: 'test@example.com', name: 'Test', password: 'secure123' })
.expect(201);
expect(res.body).toHaveProperty('id');
expect(res.body.email).toBe('test@example.com');
});
it('rejects invalid email', async () => {
await request(app)
.post('/api/users')
.send({ email: 'not-an-email', name: 'Test', password: 'secure123' })
.expect(400);
});
});
Deployment Patterns
Graceful Shutdown
When deploying, SIGTERM tells your process to stop. Without graceful shutdown, in-flight requests get dropped.
// Graceful shutdown handler
function gracefulShutdown(signal) {
logger.info({ signal }, 'Received shutdown signal');
server.close(async () => {
logger.info('HTTP server closed');
// Close database connections
await pool.end();
await redis.quit();
logger.info('All connections closed. Exiting.');
process.exit(0);
});
// Force shutdown after 30 seconds
setTimeout(() => {
logger.error('Forced shutdown after timeout');
process.exit(1);
}, 30000).unref(); // unref() so this timer alone doesn't keep the process alive
}
process.on('SIGTERM', gracefulShutdown);
process.on('SIGINT', gracefulShutdown);
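Because the same handler is registered for both SIGTERM and SIGINT, it can be called more than once (an impatient Ctrl+C pressed twice, for example). One way to keep shutdown idempotent and ordered is sketched below. createShutdown and the step names are hypothetical, and the exit function is injected so the logic can be exercised without ending the process.

```typescript
type Step = { name: string; run: () => Promise<void> };

// Builds a shutdown function that runs each step once, in order,
// even if it is triggered by several signals.
function createShutdown(steps: Step[], exit: (code: number) => void) {
  let inProgress: Promise<void> | undefined;

  return function shutdown(signal: string): Promise<void> {
    if (inProgress) return inProgress; // already shutting down
    inProgress = (async () => {
      let exitCode = 0;
      for (const step of steps) {
        try {
          await step.run();
        } catch (err) {
          // Keep going so later resources still get a chance to close.
          console.error(`Shutdown step "${step.name}" failed after ${signal}`, err);
          exitCode = 1;
        }
      }
      exit(exitCode);
    })();
    return inProgress;
  };
}

// Stand-in steps; in a real app these would wrap server.close(), pool.end(), etc.
const order: string[] = [];
const exitCodes: number[] = [];
const shutdown = createShutdown(
  [
    { name: 'http', run: async () => { order.push('http'); } },
    { name: 'db', run: async () => { order.push('db'); } },
  ],
  (code) => { exitCodes.push(code); },
);

const finished = Promise.all([shutdown('SIGTERM'), shutdown('SIGINT')]);
finished.then(() => console.log(order, exitCodes));
```

In production you would pass process.exit as the exit function and keep the forced-exit timer from the handler above as a backstop.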
Docker Best Practices
# Multi-stage Docker build for Node.js
FROM node:22-slim AS builder
WORKDIR /app
COPY package*.json ./
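# Install all dependencies: the build step typically needs devDependencies (e.g. the TypeScript compiler)
RUN npm ci
COPY . .
# Build, then remove devDependencies so only production modules are copied into the final image
RUN npm run build && npm prune --omit=dev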
FROM node:22-slim
WORKDIR /app
RUN addgroup --system app && adduser --system --ingroup app app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package.json ./
USER app
EXPOSE 3000
CMD ["node", "dist/server.js"]
For more on Docker best practices and CI/CD pipelines, see our dedicated guides.