We audited a client's Docker setup last quarter. Their Node.js API image was 1.8GB. It contained the full development toolchain, node_modules with 400 dev dependencies, a .git directory, and their .env file with production database credentials baked into the image layer. The image took 4 minutes to push to the registry and 3 minutes to pull on deploy.
We rebuilt it. Final image: 89MB. Build time dropped from 6 minutes to 45 seconds. Deploy time from 7 minutes to 90 seconds. And the credentials were no longer permanently embedded in a Docker layer anyone with registry access could extract.
At Pillai Infotech, we containerize applications for production weekly. The gap between "it works in Docker" and "it's production-ready in Docker" is where most of the real work happens.
Development Docker vs. Production Docker
The first mistake teams make: using the same Dockerfile for development and production. They have different goals.
| Concern | Development | Production |
|---|---|---|
| Image size | Doesn't matter | Critical — affects deploy time + cost |
| Build time | Fast iteration priority | Reproducibility priority |
| Volume mounts | Mount source code for hot reload | COPY code into image — no mounts |
| Dev dependencies | Include everything | Production deps only |
| Debug tools | Include (debuggers, shells) | Exclude (smaller attack surface) |
| User | Root is fine | Never root — use non-root user |
Solution: use multi-stage builds with a shared base and separate dev/prod targets, or maintain a Dockerfile (production) and Dockerfile.dev (development).
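As a rough sketch of the first option, a small wrapper can select which stage of a shared Dockerfile gets built. The function name and the stage names development and production are assumptions for illustration; use whatever your Dockerfile's AS names are.

```shell
#!/bin/sh
# build_target TARGET IMAGE_TAG
# Builds a single stage of a multi-stage Dockerfile in the current directory.
# The stage names "development" and "production" are hypothetical.
build_target() {
  case $1 in
    development|production) ;;
    *) echo "unknown target: $1" >&2; return 64 ;;
  esac
  # --target stops the build at the named stage
  docker build --target "$1" -t "$2" .
}

# Usage:
#   build_target development myapp:dev
#   build_target production  myapp:prod
```

Rejecting unknown targets up front avoids a slower failure inside docker build when a stage name is mistyped.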
Multi-Stage Builds: The Single Most Important Practice
If you take one thing from this article, let it be multi-stage builds. They solve image size, security, and build caching in one pattern.
Node.js Example
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Stage 2: Build
FROM node:20-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 3: Production image
FROM node:20-alpine AS production
WORKDIR /app
# Non-root user
RUN addgroup -g 1001 -S appgroup && \
adduser -S appuser -u 1001 -G appgroup
USER appuser
# Only production deps + built assets
COPY --from=deps --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=build --chown=appuser:appgroup /app/dist ./dist
COPY --from=build --chown=appuser:appgroup /app/package.json ./
EXPOSE 3000
CMD ["node", "dist/server.js"]
This image contains only what's needed to run: Node.js runtime, production dependencies, and compiled code. No TypeScript compiler, no test frameworks, no dev tools.
Go Example (Distroless)
# Build
FROM golang:1.22 AS build
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /server ./cmd/server
# Production — distroless (no shell, no package manager)
FROM gcr.io/distroless/static-debian12
COPY --from=build /server /server
USER nonroot:nonroot
ENTRYPOINT ["/server"]
Final image size: ~15MB. No shell, no package manager, no OS utilities — nothing an attacker could exploit if they somehow got into the container.
Python Example
# Build
FROM python:3.12-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Production
FROM python:3.12-slim
WORKDIR /app
RUN useradd --create-home --shell /bin/false appuser
USER appuser
COPY --from=build /install /usr/local
COPY --chown=appuser:appuser . .
CMD ["gunicorn", "app:create_app()", "-b", "0.0.0.0:8000", "-w", "4"]
Image Optimization: Every MB Counts
Layer Ordering Matters
Docker caches each layer. If a layer changes, all subsequent layers are rebuilt. Order your Dockerfile from least-changing to most-changing:
1. Base image (changes monthly)
2. System packages (changes monthly)
3. Dependency files: package.json, requirements.txt (changes weekly)
4. Install dependencies (rebuilds when #3 changes)
5. Source code (changes every build)
Wrong:
COPY . . # Source code change → everything below rebuilds
RUN npm ci # Reinstalls ALL deps on every code change
Right:
COPY package.json package-lock.json ./ # Only changes when deps change
RUN npm ci # Cached when deps haven't changed
COPY . . # Only source code rebuilds
Use .dockerignore
A proper .dockerignore prevents unnecessary files from entering the build context:
node_modules
.git
.env*
*.md
tests/
coverage/
.vscode/
docker-compose*.yml
Dockerfile*
Without this, docker build sends your entire directory (including node_modules and .git) to the Docker daemon. For a project with a 500MB node_modules, that's 500MB transferred before the build even starts.
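One way to keep this from regressing is a small CI check that the file still lists the entries you care about. This is a minimal sketch; the function name is hypothetical, and it only does exact line matches, so it does not understand negation (!) or overlapping glob patterns.

```shell
#!/bin/sh
# check_dockerignore FILE PATTERN...
# Prints every required pattern that is missing from FILE.
# Returns 1 if at least one pattern is missing, 0 otherwise.
check_dockerignore() {
  file=$1
  shift
  missing=0
  for pattern in "$@"; do
    # -x matches whole lines, -F treats the pattern literally
    # (so ".env*" is not interpreted as a regular expression)
    if ! grep -qxF -- "$pattern" "$file" 2>/dev/null; then
      echo "missing from $file: $pattern"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_dockerignore .dockerignore node_modules .git '.env*'
```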
Combine RUN Commands
Each RUN creates a layer. Combine related commands to reduce layers and image size:
# Bad: 3 layers, apt cache retained in first layer
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Good: 1 layer, clean in same layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
Base Image Size Comparison
| Base Image | Size | Has Shell? | Best For |
|---|---|---|---|
| node:20 | ~1.1 GB | Yes | Development only |
| node:20-slim | ~240 MB | Yes | Production (need shell) |
| node:20-alpine | ~180 MB | Yes (ash) | Production (good default) |
| gcr.io/distroless/nodejs20 | ~130 MB | No | Production (max security) |
| gcr.io/distroless/static | ~2 MB | No | Go, Rust (compiled binaries) |
Security Hardening
1. Never Run as Root
The default Docker user is root. If an attacker exploits your application, they have root access inside the container. With certain misconfigurations, that can escalate to host root.
# Create and switch to non-root user
RUN addgroup -g 1001 -S appgroup && \
adduser -S appuser -u 1001 -G appgroup
USER appuser
2. Pin Image Versions
Never use :latest in production. Pin to a specific version — or better, pin to a digest:
# Good — pinned version
FROM node:20.11.1-alpine
# Best — pinned digest (immutable)
FROM node@sha256:abc123def456...
Version tags can be overwritten by the publisher. Digests cannot. For critical production images, use digests.
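A pinning rule is easier to keep when something checks it. The sketch below lists FROM images that have no tag or use :latest. The function name is hypothetical, and the parsing is deliberately simple: it skips earlier stage names, scratch, and digest-pinned images, but it would also flag a FROM that uses an ARG variable.

```shell
#!/bin/sh
# unpinned_bases DOCKERFILE
# Prints each FROM image that is untagged or tagged :latest.
# Returns 1 if anything was printed, 0 otherwise.
unpinned_bases() {
  awk '
    toupper($1) == "FROM" {
      i = 2
      if ($i ~ /^--/) i++                  # skip an optional --platform=... flag
      image = $i
      name = (toupper($(i + 1)) == "AS") ? $(i + 2) : ""
      # Earlier stage names and "scratch" are not registry images.
      if (!(image in stages) && image != "scratch") {
        # A tag is a colon after the last slash; a digest always counts as pinned.
        if (image !~ /@sha256:/ && (image !~ ":[^/]+$" || image ~ /:latest$/)) {
          print image
          bad = 1
        }
      }
      if (name != "") stages[name] = 1
    }
    END { exit bad }
  ' "$1"
}

# Usage:
#   unpinned_bases Dockerfile || echo "pin the images listed above"
```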
3. Scan Images in CI
Run vulnerability scans on every build. We use Trivy in all our CI/CD pipelines:
# In GitHub Actions
- name: Scan image
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    format: table
    exit-code: 1 # Fail build on HIGH/CRITICAL
    severity: CRITICAL,HIGH
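Outside GitHub Actions, the same gate can be approximated with the Trivy CLI directly. This is a sketch that assumes trivy is installed on the CI runner; the function name is hypothetical.

```shell
#!/bin/sh
# scan_image IMAGE
# Exits non-zero when Trivy reports HIGH or CRITICAL findings,
# which is what makes the CI job fail.
scan_image() {
  trivy image --exit-code 1 --severity CRITICAL,HIGH --format table "$1"
}

# Usage:
#   scan_image "myapp:${GIT_SHA}"
```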
4. Don't Store Secrets in Images
Anything in a Dockerfile layer is permanently accessible — even if you delete it in a later layer. Docker layers are additive.
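# WRONG: the secret stays in the layer history forever
COPY .env /app/.env
RUN set -a && . /app/.env && npm run build
RUN rm /app/.env # DOESN'T HELP: the file is still in a previous layer
# RIGHT: use build secrets (BuildKit)
# "." is the POSIX form of source (the default /bin/sh may not have source);
# set -a exports the loaded variables so npm can see them
RUN --mount=type=secret,id=env,dst=/app/.env \
    set -a && . /app/.env && npm run build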
At runtime, inject secrets via environment variables, mounted secrets (Kubernetes), or a secrets manager (AWS Secrets Manager, HashiCorp Vault).
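The RUN --mount=type=secret,id=env line only works if the build command supplies a secret with that id. A minimal sketch of the build side, with a hypothetical function name and the secret file path passed in by the caller:

```shell
#!/bin/sh
# build_with_secret IMAGE_TAG SECRET_FILE
# Makes SECRET_FILE available to the build as the secret "env".
# The file is mounted only for the RUN step that requests it and is
# not written into any image layer.
build_with_secret() {
  docker build --secret "id=env,src=$2" -t "$1" .
}

# Usage:
#   build_with_secret myapp:dev .env
```

The id value has to match the id used in the Dockerfile mount; the source path can differ between a laptop and CI.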
5. Read-Only Filesystem
Run containers with a read-only root filesystem. This prevents attackers from writing malicious files:
docker run --read-only --tmpfs /tmp myapp:latest
# In Kubernetes:
securityContext:
  readOnlyRootFilesystem: true
Choosing the Right Base Image
Alpine vs. Debian Slim vs. Distroless
- Alpine: Small (~5MB base), uses musl libc instead of glibc. Great for most applications. Can cause issues with native extensions that expect glibc (some Python/Node packages). Our default recommendation.
- Debian Slim: Larger (~80MB) but uses glibc. Better compatibility with native packages. Use when Alpine causes build issues.
- Distroless: Minimal images from Google — contain only your application runtime. No shell, no package manager. Best security posture but harder to debug. Use for production workloads where security is paramount.
- Ubuntu: Familiar, and at ~75MB comparable in size to Debian Slim. Use for development or when you need specific Ubuntu packages. Not recommended for production.
Docker Networking & Compose for Production
Docker Compose for Local Development
Docker Compose is excellent for local development — running your app alongside databases, caches, and message queues. Here's a production-grade compose file pattern:
services:
  api:
    build:
      context: .
      target: production # Multi-stage build target
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/app
      - REDIS_URL=redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    healthcheck:
      # node:20-alpine does not ship curl; BusyBox wget is included
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    restart: unless-stopped
  db:
    image: postgres:16-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: app
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d app"]
      interval: 10s
      timeout: 5s
      retries: 5
  cache:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru
volumes:
  pgdata:
Key Networking Principles
- Service names are DNS names: In Compose, db resolves to the database container. In Kubernetes, use service names.
- Don't expose ports you don't need: Only the API needs an external port mapping. The database and cache communicate internally on the Docker network.
- Health checks are mandatory: depends_on without a health check condition only waits for the container to start, not for the service to be ready. A started PostgreSQL container is not a ready PostgreSQL.
Docker in CI/CD Pipelines
Build Once, Run Everywhere
Build the Docker image once in CI. Tag it with the git commit SHA. Push to a registry. Deploy the exact same image to staging, then production. Never rebuild for different environments.
# CI: Build and push
docker build -t registry.example.com/api:${GIT_SHA} .
docker push registry.example.com/api:${GIT_SHA}
# Staging: Deploy exact image
kubectl set image deploy/api api=registry.example.com/api:${GIT_SHA}
# Production: Same exact image
kubectl set image deploy/api api=registry.example.com/api:${GIT_SHA}
Environment-specific configuration comes from environment variables or mounted config files — not from the image.
Caching Strategies for Faster Builds
- Registry cache: Push cache layers to your registry, pull them in CI.
docker build --cache-from registry.example.com/api:cache
- BuildKit cache mounts: Cache package manager directories across builds.
RUN --mount=type=cache,target=/root/.npm npm ci
- GitHub Actions cache: Use actions/cache to persist Docker layers between CI runs.
With proper caching, repeat builds (where only source code changed) complete in 30-60 seconds instead of 5-10 minutes. This is critical for CI/CD pipeline speed.
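For the registry cache approach, one possible shape with docker buildx is sketched below. The function name is hypothetical and both references are supplied by the caller. Depending on the builder driver, exporting cache to a registry may require creating a dedicated builder with docker buildx create first.

```shell
#!/bin/sh
# build_with_registry_cache IMAGE CACHE_REF
# Reuses layers from CACHE_REF, builds and pushes IMAGE, then writes the
# updated cache back to CACHE_REF for the next CI run.
build_with_registry_cache() {
  image=$1      # e.g. registry.example.com/api:${GIT_SHA}
  cache_ref=$2  # e.g. registry.example.com/api:cache
  # mode=max also caches layers from intermediate build stages
  docker buildx build \
    --cache-from "type=registry,ref=$cache_ref" \
    --cache-to "type=registry,ref=$cache_ref,mode=max" \
    -t "$image" --push .
}

# Usage:
#   build_with_registry_cache "registry.example.com/api:${GIT_SHA}" registry.example.com/api:cache
```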
8 Docker Mistakes We See Every Week
1. Using :latest Tag
:latest means "whatever was pushed last." It's not versioned, not reproducible, and debugging "it worked yesterday" becomes impossible. Always use specific tags.
2. No .dockerignore
Without it, COPY . . includes .git, node_modules, .env, test fixtures, and IDE config. Your build context balloons, your image contains secrets, and builds are slow.
3. Installing Dependencies as Root
Installing npm packages as root can create files owned by root that the non-root runtime user can't access. Install as the app user, or use --chown on COPY.
4. Not Using Health Checks
Without a health check (a HEALTHCHECK instruction or Compose healthcheck for Docker, liveness and readiness probes for Kubernetes), the platform can't distinguish between a running container and a healthy one. Your load balancer routes traffic to crashed-but-running containers.
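Once a container defines a health check, Docker reports its status through docker inspect, and a deploy script can wait on it. A minimal polling sketch follows; the function name, attempt count, and delay are assumptions for illustration.

```shell
#!/bin/sh
# wait_healthy CONTAINER [ATTEMPTS] [DELAY_SECONDS]
# Polls the container's health status until it is "healthy".
# Returns 1 if it is still not healthy after ATTEMPTS checks.
wait_healthy() {
  container=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  status=unknown
  while [ "$i" -lt "$attempts" ]; do
    # inspect fails if the container is missing or has no health check
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=unknown
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$container not healthy after $attempts checks (last status: $status)" >&2
  return 1
}

# Usage:
#   wait_healthy api 30 2 || exit 1
```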
5. Storing State in Containers
Containers are ephemeral. Anything written to the container filesystem is lost on restart. Use volumes for databases, object storage (S3) for files, and external services for sessions.
6. Using ADD Instead of COPY
ADD auto-extracts archives and fetches URLs — unexpected behavior. Use COPY for copying files. Only use ADD when you explicitly want tar extraction.
7. Not Handling Signals
When Kubernetes sends SIGTERM, your app needs to handle it — finish in-flight requests, close database connections, then exit. If your entrypoint is a shell script, use exec to replace the shell process with your app, so it receives signals directly.
# Wrong — shell absorbs SIGTERM, app never gets it
CMD npm start
# Right — exec form, node receives SIGTERM directly
CMD ["node", "dist/server.js"]
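For the shell-script entrypoint case, the sketch below shows the exec pattern. It is written as a function so it can be read on its own; in a real entrypoint file the last line would be run_app "$@". The function name and the PORT default are illustrative assumptions.

```shell
#!/bin/sh
# Body of a hypothetical docker-entrypoint.sh.
# run_app COMMAND [ARGS...]
run_app() {
  # One-off setup goes before exec, for example default configuration.
  : "${PORT:=3000}"
  export PORT
  # exec replaces this shell with the app, so the app becomes the process
  # that receives SIGTERM instead of a shell sitting in front of it.
  exec "$@"
}

# In the Dockerfile (exec form for both):
#   ENTRYPOINT ["/docker-entrypoint.sh"]
#   CMD ["node", "dist/server.js"]
```

Anything placed after the exec line never runs, so cleanup logic belongs in the application's own SIGTERM handler.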
8. Ignoring Image Scanning Results
Running Trivy or Snyk but ignoring the findings. "We'll fix it later" becomes "we'll fix it after the breach." Block critical/high vulnerabilities in CI — no exceptions.
Frequently Asked Questions
What's the ideal Docker image size?
For compiled languages (Go, Rust): under 20MB. For Node.js/Python: under 150MB. For Java: under 300MB. These are achievable with multi-stage builds and proper base images. If your image is over 500MB, something is wrong.
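These budgets can be enforced in CI rather than checked by hand. A minimal sketch, assuming the image already exists locally; the function name is hypothetical, and sizes are reported in decimal megabytes.

```shell
#!/bin/sh
# check_image_size IMAGE MAX_MB
# Returns 1 if the image is larger than MAX_MB, 2 if it cannot be inspected.
check_image_size() {
  # docker image inspect reports the size in bytes
  bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 2
  mb=$((bytes / 1000000))
  if [ "$mb" -gt "$2" ]; then
    echo "$1 is ${mb}MB, over the ${2}MB budget" >&2
    return 1
  fi
  echo "$1 is ${mb}MB (budget ${2}MB)"
}

# Usage (150MB budget for a Node.js image):
#   check_image_size "myapp:${GIT_SHA}" 150
```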
Should I use Docker Compose in production?
For simple deployments (single server, a few containers), Docker Compose works fine. For anything that needs scaling, rolling updates, or multi-node — use Kubernetes, ECS, or Cloud Run. Compose is excellent for development and testing.
How do I pass secrets to Docker containers?
Never bake secrets into images. Use environment variables for simple cases, Docker secrets for Swarm, Kubernetes secrets or external secret managers (Vault, AWS Secrets Manager) for production. For build-time secrets, use BuildKit's --mount=type=secret.
Podman or Docker in 2026?
Both produce OCI-compliant images. Podman is rootless by default and daemonless — better security posture. Docker has a larger ecosystem and better tooling. For development, Docker Desktop. For production (especially security-sensitive), Podman is gaining ground. Either works.
How do I debug containers in production?
Don't install debug tools in your image. Use ephemeral debug containers: kubectl debug -it pod/myapp --image=busybox attaches a temporary container with shell access to the pod's network namespace. For distroless images, which have no shell to exec into, this is usually the only option.
Should I use Docker for databases in production?
Use managed databases (RDS, Cloud SQL, Atlas) for production. Containerized databases add complexity around data persistence, backup, and performance. The exception: if your team has strong Kubernetes and statefulset expertise, operators like CloudNativePG for PostgreSQL work well.