June 27, 2026 · 10 min. read

Docker Best Practices Production: Expert Guide for Devs

Learn Docker best practices for production environments from Nordiso, Finland's premium software consultancy. Optimize security, performance, and reliability.

Introduction

Docker has fundamentally transformed how we build, ship, and run applications. Yet, the gap between a working container and a production-grade deployment is vast—and frequently underestimated. Many teams can spin up a Docker Compose stack for local development, but when that same stack hits a production environment, it often crumbles under the weight of poor configuration choices. The difference lies in applying Docker best practices production teams have honed through real-world failures and hard-won experience. Whether you are orchestrating microservices on Kubernetes or managing a standalone Docker host, understanding these practices is non-negotiable.

This is not a beginner's guide. We are writing for senior developers and architects who already know the syntax and the tooling. You understand how Docker works; now it is time to make it work reliably under load. From minimal base images to read-only root filesystems, from graceful shutdowns to resource constraints that prevent noisy neighbors, we will dissect the critical decisions that separate a resilient production system from one that wakes you up at 3 AM. At Nordiso, we have consulted for countless teams struggling with these exact issues, and we are sharing the distilled wisdom below.

Before we dive into the specifics, understand that these production practices are not optional. They are the difference between a deployment that scales elegantly and one that collapses under traffic. They are the difference between a security breach and a hardened surface. And they are the difference between a system that costs you money and one that saves it. Ready? Let's harden your containers.

Choose Minimal and Secure Base Images

Why Alpine and Distroless Matter

Every layer in a Docker image adds surface area for potential vulnerabilities. The larger the base image, the more packages, libraries, and binaries it includes, many of which you never use. This is why choosing a minimal base image is one of the most impactful production practices you can adopt. For compiled languages like Go or Rust, consider using scratch (an empty image) and copying only the compiled binary. For interpreted languages, Alpine Linux (based on musl libc and BusyBox) provides a tiny footprint of roughly 5 MB for the base image. Alternatively, Google's distroless images contain only essential runtime dependencies: no shell, no package manager, no unnecessary tools that attackers could exploit.

# Bad practice: using full Ubuntu base
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3 && rm -rf /var/lib/apt/lists/*
COPY app.py .
CMD ["python3", "app.py"]

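# Good practice: using distroless Python
# The builder's Debian release and Python version must match the distroless
# runtime (Debian 12 ships Python 3.11), or the copied packages will not import.
FROM python:3.11-slim-bookworm AS builder
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

FROM gcr.io/distroless/python3-debian12
# The default distroless user is root, so /root/.local is on the user site path
COPY --from=builder /root/.local /root/.local
WORKDIR /app
COPY app.py .
# The image's entrypoint is already the Python interpreter
CMD ["app.py"]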

Verify Image Provenance

Using a minimal image is pointless if you pull it from an untrusted source. Always use official images from Docker Hub or your private registry, and pin to specific digest hashes (e.g., alpine:3.18@sha256:...) instead of relying on version tags alone. Tags are mutable and can be repointed at different content, but digests are immutable. This practice reduces exposure to supply chain attacks and ensures reproducible builds across environments, a core tenet of production reliability.
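Digest pinning is easy to enforce mechanically. The sketch below is a hypothetical CI helper (the function name and regular expressions are our own, not part of any Docker tooling) that lists FROM lines in a Dockerfile that pull an image without a sha256 digest. It deliberately ignores scratch and references to earlier build stages, since neither is a registry pull.

```javascript
// Hypothetical helper: report FROM images that are not pinned by digest.
const DIGEST_SUFFIX = /@sha256:[a-f0-9]{64}$/;
const FROM_LINE = /^FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?/i;

function findUnpinnedImages(dockerfileText) {
  const stageNames = new Set(); // names introduced with "AS <name>"
  const unpinned = [];

  for (const line of dockerfileText.split('\n')) {
    const match = line.trim().match(FROM_LINE);
    if (!match) continue;

    const [, image, alias] = match;
    const isStageRef = stageNames.has(image.toLowerCase());

    // "scratch" and earlier stages are not pulled from a registry.
    if (image !== 'scratch' && !isStageRef && !DIGEST_SUFFIX.test(image)) {
      unpinned.push(image);
    }
    if (alias) stageNames.add(alias.toLowerCase());
  }
  return unpinned;
}
```

A pipeline step could read the Dockerfile, call findUnpinnedImages, and fail when the returned list is not empty. Treat it as a starting point: it does not resolve ARG substitutions in image names.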

Implement Multi-Stage Builds Correctly

Separate Build and Runtime Stages

Multi-stage builds allow you to keep your final image lean by discarding build tools after compilation. Yet many teams misuse this feature, conflating the build process with runtime dependencies. A proper multi-stage build starts with a full-featured build stage (e.g., golang:1.21 for compiling Go code) and ends with a minimal runtime stage (e.g., alpine:3.18 or scratch). The key is to copy only the compiled binary or artifacts, not the entire build environment.

# Stage 1: Build
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -o /app/server .

# Stage 2: Minimal runtime
FROM alpine:3.18
RUN apk add --no-cache ca-certificates
COPY --from=builder /app/server /server
EXPOSE 8080
USER 1000:1000
CMD ["/server"]

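Notice we install ca-certificates in the final stage. This is a common oversight: if your application makes external HTTPS calls, it needs root certificates. A scratch image contains none, so there you must copy the certificate bundle in from the build stage. Distroless images ship a CA bundle, and on Alpine installing the package explicitly keeps the dependency visible in the Dockerfile. Test your image thoroughly in a staging environment before promoting to production to catch missing dependencies.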

Optimize Layer Caching and Build Speed

Order Your Dockerfile Instructions Strategically

Docker builds each instruction as a separate layer. When you change a file, only layers from that point onward are rebuilt. Maximizing cache hits dramatically reduces build times—essential for CI/CD pipelines. Always place instructions that change infrequently (like installing system packages) before instructions that change often (like copying application code). Furthermore, copy dependency files (e.g., requirements.txt, go.mod) separately and run dependency installation before copying the entire source tree.

# Optimized for layer caching
FROM node:18-alpine
WORKDIR /app
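# Copy only the dependency manifests first so the install layer stays cached
COPY package.json package-lock.json ./
# --omit=dev supersedes the deprecated --only=production flag in current npm
RUN npm ci --omit=dev && npm cache clean --force
# Assumes dist/ was built before the image build (otherwise add a build stage)
COPY . .
CMD ["node", "dist/server.js"]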

This pattern ensures that if only your source code changes, the expensive npm ci step is not re-executed—it loads from cache. For large projects, this can shave minutes off each build.
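To see why ordering matters, it helps to model the cache as a chain of keys. The following is a deliberately simplified model, not Docker's actual implementation: we assume a layer's key depends on its parent's key, the instruction text, and the content of any files the instruction copies. The file names and contents are made up for illustration.

```javascript
const { createHash } = require('node:crypto');

// Simplified model of build caching: each key chains on the previous one,
// so a change in any layer invalidates every layer after it.
function layerKeys(steps, files) {
  let parentKey = '';
  return steps.map(({ instruction, copies = [] }) => {
    const hash = createHash('sha256').update(parentKey).update(instruction);
    for (const name of copies) {
      hash.update(name).update(files[name] ?? '');
    }
    parentKey = hash.digest('hex');
    return parentKey;
  });
}

const manifests = ['package.json', 'package-lock.json'];
const steps = [
  { instruction: 'COPY package.json package-lock.json ./', copies: manifests },
  { instruction: 'RUN npm ci' },
  { instruction: 'COPY . .', copies: [...manifests, 'src/server.js'] },
];

const before = layerKeys(steps, { 'package.json': '{}', 'package-lock.json': '{}', 'src/server.js': 'v1' });
const after = layerKeys(steps, { 'package.json': '{}', 'package-lock.json': '{}', 'src/server.js': 'v2' });

// true means the layer could be reused after a source-only change.
const reused = before.map((key, i) => key === after[i]);
console.log(reused);
```

In this model the manifest copy and the install step keep their keys after a source-only change, while the final COPY does not. Had the full COPY come first, every key after it would have changed as well.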

Use Read-Only Root Filesystems

Stop Writing to the Container Layer

By default, Docker containers run with a writable layer. Writes to it (even temporary logs or caches) consume disk space on the host and vanish when the container is removed. Worse, if an attacker gains code execution, they can drop malicious files into the container. Enabling a read-only root filesystem (--read-only in the CLI or readOnlyRootFilesystem: true in Kubernetes) closes off both problems: application code and binaries become immutable, and you are forced to mount explicit volumes for any writable data.

# Kubernetes Pod spec
apiVersion: v1
kind: Pod
metadata:
  name: app   # placeholder name; a Pod manifest is rejected without one
spec:
  containers:
  - name: app
    image: myapp:latest
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}

Many applications need /tmp for temporary data, so mount an emptyDir volume there. For databases, logs, or user uploads, use Kubernetes PersistentVolumeClaims or host mounts. This practice aligns with the principle of least privilege and is a hallmark of production container security.
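On the application side, a read-only root filesystem works best when every write goes through one configurable location. Here is a minimal sketch, assuming a hypothetical SCRATCH_DIR environment variable (our own name) that falls back to the operating system's temp directory, which is where the emptyDir volume above is mounted.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// SCRATCH_DIR is a hypothetical variable name chosen for this example.
// os.tmpdir() honours TMPDIR and falls back to /tmp on Linux.
const scratchRoot = process.env.SCRATCH_DIR || os.tmpdir();

// Fail fast at startup instead of on the first write deep inside a request.
function assertWritable(dir) {
  fs.accessSync(dir, fs.constants.W_OK); // throws if the mount is missing or read-only
}

// Write temporary data under the scratch root only, never next to the code.
function writeScratchFile(name, data) {
  const dir = fs.mkdtempSync(path.join(scratchRoot, 'app-'));
  const file = path.join(dir, name);
  fs.writeFileSync(file, data);
  return file;
}
```

Calling assertWritable(scratchRoot) during startup turns a forgotten volume mount into an immediate, obvious crash rather than an intermittent runtime error.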

Set Resource Constraints and Limits

Prevent Noisy Neighbors

A single misbehaving container can consume all available CPU and memory on a host, starving other services. This is especially dangerous in multi-tenant environments like Kubernetes nodes. Always set both requests (the baseline resources guaranteed to the container) and limits (the maximum resources it can burst to). In Docker Compose, use deploy.resources.reservations and deploy.resources.limits; in Kubernetes, use resources.requests and resources.limits under the container spec.

# In docker-compose.yml
services:
  web:
    image: nginx:1.25
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
        reservations:
          cpus: '0.25'
          memory: 128M

Setting tight limits also forces you to profile your application's actual resource consumption. A container that exceeds its memory limit is OOM-killed, while one that exceeds its CPU limit is only throttled, so if a container consistently runs into its memory limit you need to either optimize the code or raise the limit. Remember, limits protect the cluster; requests protect the container during scheduling.
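Profiling is easier when the application can see its own limit. The sketch below assumes a cgroup v2 host, where the container's memory limit is exposed at /sys/fs/cgroup/memory.max as a byte count or the literal string max. The helper names are our own; on cgroup v1 hosts the path differs and readMemoryLimit simply returns null.

```javascript
const fs = require('node:fs');

// Assumption: cgroup v2 layout. The file holds a byte count or "max" (no limit).
const CGROUP_V2_MEMORY_MAX = '/sys/fs/cgroup/memory.max';

function parseMemoryMax(raw) {
  const value = raw.trim();
  return /^\d+$/.test(value) ? Number(value) : null; // null means "no limit known"
}

function readMemoryLimit(file = CGROUP_V2_MEMORY_MAX) {
  try {
    return parseMemoryMax(fs.readFileSync(file, 'utf8'));
  } catch {
    return null; // not in a container, or a different cgroup version
  }
}

// Fraction of the limit still unused by the process's resident set size.
function memoryHeadroom(limitBytes, rssBytes = process.memoryUsage().rss) {
  return limitBytes === null ? null : 1 - rssBytes / limitBytes;
}
```

Exporting memoryHeadroom(readMemoryLimit()) as a metric gives you a trend line to alert on well before the kernel steps in.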

Implement Proper Health Checks

Know When to Restart

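Docker and orchestrators rely on health checks to determine if a container is running correctly. Without them, a process that is alive but unresponsive (e.g., deadlocked) will never be restarted automatically. Use HEALTHCHECK instructions in your Dockerfile or configure livenessProbe and readinessProbe in Kubernetes. Keep in mind that a standalone Docker Engine only marks a failing container as unhealthy and leaves the restart to an orchestrator such as Swarm, and that Kubernetes ignores the Dockerfile HEALTHCHECK entirely in favor of its own probes. The health check should test actual application functionality (e.g., hitting a /healthz endpoint) rather than just checking if the process is alive.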

# Dockerfile health check for a web service
# Requires curl in the image; on Alpine, BusyBox wget is an alternative
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:8080/healthz || exit 1

In Kubernetes, separate liveness (is the container alive?) and readiness (is it ready to serve traffic?) probes. During rolling updates, the readiness probe prevents the service from receiving requests before it is fully initialized. We have seen countless production outages caused by flawed health checks—either too lenient (ignoring real failures) or too rigid (killing containers unnecessarily). Find the balance by testing with realistic network conditions and traffic patterns.
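The liveness and readiness split maps naturally onto two endpoints. Below is a minimal Node.js sketch; the /readyz path and the state.ready flag are our own illustrative choices, and a real readiness check would usually verify the dependencies it cares about (database, message broker) rather than a single boolean.

```javascript
const http = require('node:http');

// Hypothetical readiness flag: flip it once startup work
// (migrations, cache warm-up, connection pools) has finished.
const state = { ready: false };

function send(res, status, body) {
  res.writeHead(status, { 'Content-Type': 'text/plain' });
  res.end(body);
}

function healthHandler(req, res) {
  if (req.url === '/healthz') {
    // Liveness: the process can still answer requests at all.
    send(res, 200, 'ok');
  } else if (req.url === '/readyz') {
    // Readiness: refuse traffic until dependencies are available.
    send(res, state.ready ? 200 : 503, state.ready ? 'ready' : 'not ready');
  } else {
    send(res, 404, 'not found');
  }
}

// The server is created but not started here; call server.listen(8080) at startup.
const server = http.createServer(healthHandler);
```

Pointing livenessProbe at /healthz and readinessProbe at /readyz means a slow start keeps the Pod out of rotation without triggering a restart loop.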

Handle Graceful Shutdowns

Catch SIGTERM Correctly

When Docker needs to stop a container, it sends a SIGTERM signal to the container's main process (PID 1). If the process does not handle this signal and shut down gracefully within the timeout (default 10 seconds), Docker sends SIGKILL, effectively murdering the process mid-task. For web servers, this means in-flight requests are dropped. For background workers, it means jobs are lost. Always write your application to catch SIGTERM, complete ongoing work, and clean up resources before exiting. Use the exec form of CMD (the JSON array syntax) so that your application is PID 1; with the shell form, the signal goes to /bin/sh, which does not forward it.

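// Node.js graceful shutdown example
// Assumes `server` is an http.Server and `db` exposes a promise-returning close()
process.on('SIGTERM', () => {
  console.log('Received SIGTERM, shutting down gracefully...');
  // server.close() takes a callback rather than returning a promise: it stops
  // accepting new connections and calls back once open connections have ended
  server.close(async (err) => {
    await db.close();
    process.exit(err ? 1 : 0);
  });
});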

In Docker Compose, you can increase stop_grace_period for containers that need more time (e.g., databases with active transactions). In Kubernetes, Pods by default get 30 seconds to shut down gracefully (terminationGracePeriodSeconds). Adjust this value based on your application's maximum response time plus cleanup overhead. Ignoring graceful shutdowns is one of the most common production mistakes we encounter.
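Whatever grace period you configure, it is worth giving the cleanup its own deadline so the process exits on its own terms shortly before the SIGKILL arrives. The helper below is a sketch with names of our own choosing: it runs cleanup steps in order and resolves with 'timeout' if they have not finished within timeoutMs.

```javascript
// Run async cleanup steps in order, but stop waiting after timeoutMs.
// Resolves with 'clean' if every step finished, or 'timeout' otherwise.
function shutdownWithDeadline(steps, timeoutMs) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });

  const work = (async () => {
    for (const step of steps) {
      await step();
    }
    return 'clean';
  })();

  // Whichever settles first wins; always clear the timer so it cannot
  // keep the event loop alive after a clean shutdown.
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}
```

A SIGTERM handler might call shutdownWithDeadline([closeServer, closeDb], 25000) under a 30-second grace period, where closeServer and closeDb are placeholders for your own cleanup functions, and then exit with a non-zero code when the result is 'timeout'.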

Centralize and Manage Logging

Avoid File-Based Logs

Docker containers are ephemeral: they can be destroyed and recreated at any moment. Writing logs to files inside the container means losing them when the container is removed and replaced. Instead, log to stdout and stderr. Docker collects these streams and sends them to a logging driver (e.g., json-file, syslog, Fluentd, or AWS CloudWatch). This centralizes log management, enabling search, alerting, and retention policies.

For structured logging (recommended for production), output JSON to stdout. This makes it easy for tools like Elasticsearch or Loki to parse and index your logs. Avoid logging sensitive data (passwords, tokens, PII) even if it goes to a central system—log sanitization should happen at the application layer.
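A structured logger with application-level sanitization does not need a framework. Here is a minimal sketch; the field names in SENSITIVE_KEYS are an illustrative list, and the redaction only inspects top-level fields, so nested objects would need a recursive variant.

```javascript
// Illustrative list; extend it to match the fields your application handles.
const SENSITIVE_KEYS = new Set(['password', 'token', 'authorization']);

// Replace the values of sensitive top-level fields before they are serialized.
function redact(fields) {
  return Object.fromEntries(
    Object.entries(fields).map(([key, value]) =>
      [key, SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : value])
  );
}

// One JSON object per line, which log shippers can parse without custom rules.
function formatLog(level, message, fields = {}) {
  return JSON.stringify({
    time: new Date().toISOString(),
    level,
    message,
    ...redact(fields),
  });
}

// Errors go to stderr, everything else to stdout; Docker captures both streams.
function log(level, message, fields) {
  const stream = level === 'error' ? process.stderr : process.stdout;
  stream.write(formatLog(level, message, fields) + '\n');
}
```

For example, log('info', 'login succeeded', { user: 'alice', token: 'abc' }) emits a single JSON line in which the token value has been replaced.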

Scan Images for Vulnerabilities Regularly

Automated Scanning in CI/CD

A container image is only as secure as its last scan. Use tools like Docker Scout, Trivy, or Snyk to scan your images for known vulnerabilities (CVEs) before deploying to production. Integrate scanning into your CI/CD pipeline; if a critical CVE is found, fail the build. This proactive approach prevents vulnerable images from ever reaching your registry.

# Example using Trivy; --exit-code 1 makes the command (and the CI job) fail on matching findings
$ trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:latest

In addition to scanning, subscribe to security advisories for your base image's distribution. When a new CVE is disclosed for Alpine or Debian, you can rebuild and redeploy quickly. Cloud providers often offer managed scanning services (e.g., Amazon ECR image scanning) that automatically scan newly pushed images and alert on findings.
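If you need a gate with custom rules (for example, an allowlist of accepted findings), you can post-process a scanner's JSON report yourself. The sketch below assumes the report has the shape we understand Trivy's JSON output to have, with a top-level Results array whose entries carry Target and Vulnerabilities, and each vulnerability a VulnerabilityID and Severity. Verify those field names against the version you run; the function name and the sample data in the usage note are made up.

```javascript
// Collect findings that should block a release.
// Assumed report shape: { Results: [{ Target, Vulnerabilities: [{ VulnerabilityID, Severity }] }] }
function blockingFindings(report, { blocking = ['HIGH', 'CRITICAL'], allowlist = [] } = {}) {
  const findings = [];
  for (const result of report.Results ?? []) {
    // Targets without findings may omit the Vulnerabilities field entirely.
    for (const vuln of result.Vulnerabilities ?? []) {
      const blocked = blocking.includes(vuln.Severity);
      const accepted = allowlist.includes(vuln.VulnerabilityID);
      if (blocked && !accepted) {
        findings.push({
          target: result.Target,
          id: vuln.VulnerabilityID,
          severity: vuln.Severity,
        });
      }
    }
  }
  return findings;
}
```

A CI step could parse the report with JSON.parse, call blockingFindings, print the result, and set process.exitCode = 1 when the list is not empty. Keep the allowlist short and reviewed, since every entry is a risk you have chosen to accept.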

Conclusion

Adopting these production best practices will elevate your containerized applications from experimental to enterprise-grade. You will ship smaller, more secure images. Your deployments will be faster, your infrastructure more resilient, and your on-call rotations a little less stressful. The practices we covered (minimal base images, multi-stage builds, layer caching, read-only filesystems, resource limits, health checks, graceful shutdowns, centralized logging, and automated scanning) form the bedrock of any serious container strategy.

At Nordiso, we help companies across Finland and beyond build robust, scalable software solutions. If your team is struggling with container orchestration, performance bottlenecks, or security hardening, consider partnering with us. We bring decades of combined experience in cloud-native architectures, and we have guided dozens of teams through successful production deployments. Contact us to schedule a consultation.

Remember, production doesn't forgive. Debug now, not later.