Production-ready Docker deployments with Swarm and Compose. Covers secrets management, health checks, networking, rolling updates, and the 42 best practices every DevOps engineer needs.
Why Docker for Production in 2026?
While Kubernetes dominates enterprise conversations, Docker Swarm and Compose remain the pragmatic choice for teams that don't need, or can't afford, Kubernetes-level complexity.
Swarm is still the easiest orchestrator to explain to an ops team, and Docker Compose is production-capable when configured properly.
Docker Compose for Production
The 2026 Docker Compose Changes
Modern Docker Compose has evolved significantly:
| Change | Old Way | New Way (2026) |
|---|---|---|
| Version field | version: "3.8" | Omit entirely |
| Command | docker-compose | docker compose |
| Build | On-host | Multi-stage, CI/CD |
| Secrets | Environment variables | Docker secrets or vaults |
| Health checks | Optional | Required |
Production-Ready Compose Template
```yaml
# docker-compose.yml - No version field needed in 2026
services:
  web:
    image: myapp/web:1.5.2
    build:
      context: .
      dockerfile: Dockerfile
      target: production
    user: "1000:1000"
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    environment:
      - NODE_ENV=production
    env_file:
      - .env.production
    secrets:
      - db_password
      - jwt_secret
    ports:
      - "3000:3000"
    networks:
      - frontend
      - backend  # required to reach db and redis on the internal network
    depends_on:
      db:
        condition: service_healthy
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "5"

  db:
    image: postgres:16-alpine
    user: "70:70"  # the postgres user is UID/GID 70 in the Alpine image (999 in the Debian image)
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: myapp
      # Point the image at the mounted secret; without a password it refuses to initialize
      POSTGRES_PASSWORD_FILE: /run/secrets/postgres_password
    secrets:
      - source: db_password
        target: /run/secrets/postgres_password
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G

  redis:
    image: redis:7-alpine
    # redis-server has no password-file flag, so read the secret in a shell.
    # "$$" escapes "$" from Compose variable interpolation.
    command:
      - sh
      - -c
      - exec redis-server --appendonly yes --requirepass "$$(cat /run/secrets/redis_password)"
    volumes:
      - redis_data:/data
    secrets:
      - redis_password
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M

networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
    internal: true

volumes:
  postgres_data:
  redis_data:

secrets:
  db_password:
    file: ./secrets/db_password.txt
  jwt_secret:
    file: ./secrets/jwt_secret.txt
  redis_password:
    file: ./secrets/redis_password.txt
```
Health Check Patterns
Without a health check, Docker only knows whether a container's main process is running, not whether the service inside it can actually handle requests.
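A health check is only as useful as the endpoint it probes. As a minimal sketch, assuming the application exposes a `/health` route and tracks the state of its own dependencies, the response could be built as follows. The names `DependencyStatus` and `buildHealthResponse` are hypothetical and not part of any Docker or Node.js API.

```typescript
// Hypothetical shape: one boolean per dependency the service needs to do its job.
type DependencyStatus = Record<string, boolean>;

interface HealthResponse {
  statusCode: 200 | 503;
  body: { status: "ok" | "unhealthy"; failed: string[] };
}

// Build the payload for a /health route. Any failed dependency yields a 503,
// which HTTP probes such as `curl -f` report as a failed check.
function buildHealthResponse(deps: DependencyStatus): HealthResponse {
  const failed = Object.keys(deps)
    .filter((name) => !deps[name])
    .sort();
  return failed.length === 0
    ? { statusCode: 200, body: { status: "ok", failed: [] } }
    : { statusCode: 503, body: { status: "unhealthy", failed } };
}

// Example: the database is reachable but Redis is not.
console.log(buildHealthResponse({ db: true, redis: false }));
```

Returning a non-2xx status is what lets the container-level probe fail; a route that always answers 200 would hide a broken dependency.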
Health check examples by application type:
```yaml
# HTTP service
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
  interval: 30s
  timeout: 10s
  retries: 3

# Database
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 10s
  timeout: 5s
  retries: 5

# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 10s
  timeout: 5s
  retries: 5

# Custom script
healthcheck:
  test: ["CMD", "/app/healthcheck.sh"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 60s
```
Docker Swarm for Production
Docker Swarm provides native clustering and orchestration with built-in secrets management and rolling updates.
Swarm Architecture
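Swarm managers keep cluster state in a Raft log, so the number of managers decides how many manager failures the cluster can survive. The sketch below works through that arithmetic; the function names are illustrative only.

```typescript
// Raft needs a strict majority of managers to agree before cluster state changes.
function quorumSize(managers: number): number {
  if (!Number.isInteger(managers) || managers < 1) {
    throw new RangeError("managers must be a positive integer");
  }
  return Math.floor(managers / 2) + 1;
}

// Number of managers that can be lost while the rest still form a quorum.
function faultTolerance(managers: number): number {
  return managers - quorumSize(managers);
}

for (const n of [1, 2, 3, 4, 5, 7]) {
  console.log(
    `${n} manager(s): quorum ${quorumSize(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

An even manager count adds no tolerance over the odd count below it (four managers tolerate one failure, the same as three), which is why odd numbers of managers are the usual choice.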
Setting Up a Swarm Cluster
```bash
# Initialize Swarm on first manager
docker swarm init --advertise-addr 192.168.1.10

# Output provides join tokens
# To add a manager:
docker swarm join-token manager

# To add a worker:
docker swarm join-token worker

# Join additional managers (run on each manager node)
docker swarm join --token SWMTKN-1-xxx 192.168.1.10:2377

# Join workers (run on each worker node)
docker swarm join --token SWMTKN-1-yyy 192.168.1.10:2377
```
Swarm Secrets Management
Docker Swarm secrets are encrypted in transit and at rest in the managers' Raft log. A secret is delivered only to services explicitly granted access, where it is mounted as a file on an in-memory filesystem under /run/secrets/.
Creating and using secrets:
```bash
# Create from stdin
echo "my-super-secret-password" | docker secret create db_password -

# Create from file
docker secret create ssl_cert ./certs/server.crt
docker secret create ssl_key ./certs/server.key

# List secrets
docker secret ls

# Inspect secret metadata (not the value)
docker secret inspect db_password
```
Using secrets in services:
```yaml
# docker-compose.yml for Swarm
services:
  api:
    image: myapp/api:latest
    secrets:
      - db_password
      - source: ssl_cert
        target: /etc/ssl/certs/server.crt
        mode: 0444
      - source: ssl_key
        target: /etc/ssl/private/server.key
        uid: '1000'
        gid: '1000'
        mode: 0400

secrets:
  db_password:
    external: true
  ssl_cert:
    external: true
  ssl_key:
    external: true
```
Reading secrets in your application:
```typescript
// Node.js example
import { readFileSync } from 'fs';

function getSecret(name: string): string {
  try {
    // Secrets are mounted at /run/secrets/
    return readFileSync(`/run/secrets/${name}`, 'utf8').trim();
  } catch (error) {
    // Fallback to environment variable for development
    const envValue = process.env[name.toUpperCase()];
    if (!envValue) {
      throw new Error(`Secret ${name} not found`);
    }
    return envValue;
  }
}

const dbPassword = getSecret('db_password');
```
Production Stack Deployment
```yaml
# stack.yml - Full production example
services:
  traefik:
    image: traefik:v3.0
    command:
      - "--api.dashboard=true"
      # Traefik v3 moved Swarm support out of the Docker provider
      # (providers.docker.swarmMode) into a dedicated Swarm provider.
      - "--providers.swarm=true"
      - "--providers.swarm.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=admin@example.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - traefik_certs:/letsencrypt
    networks:
      - traefik-public
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.dashboard.rule=Host(`traefik.example.com`)"
        - "traefik.http.routers.dashboard.service=api@internal"
        - "traefik.http.routers.dashboard.middlewares=auth"
        - "traefik.http.middlewares.auth.basicauth.users=admin:$$apr1$$xxx"

  api:
    image: myapp/api:1.5.0
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://myapp@db:5432/myapp
    secrets:
      - db_password
      - jwt_secret
    networks:
      - traefik-public
      - backend
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 30s
        failure_action: rollback
        monitor: 60s
        order: start-first
      rollback_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 120s
      resources:
        limits:
          cpus: '1'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.api.rule=Host(`api.example.com`)"
        - "traefik.http.routers.api.entrypoints=websecure"
        - "traefik.http.routers.api.tls.certresolver=letsencrypt"
        - "traefik.http.services.api.loadbalancer.server.port=3000"
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  db:
    image: postgres:16-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: myapp
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    secrets:
      - db_password
    networks:
      - backend
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.db == true
      resources:
        limits:
          cpus: '4'
          memory: 4G

networks:
  traefik-public:
    driver: overlay
    attachable: true
  backend:
    driver: overlay
    driver_opts:
      encrypted: "true"
    internal: true

volumes:
  traefik_certs:
  postgres_data:

secrets:
  db_password:
    external: true
  jwt_secret:
    external: true
```
Deploying and Managing Stacks
```bash
# Deploy stack
docker stack deploy -c stack.yml myapp

# List stacks
docker stack ls

# List services in stack
docker stack services myapp

# View service logs
docker service logs myapp_api -f

# Scale a service
docker service scale myapp_api=5

# Rolling update
docker service update --image myapp/api:1.6.0 myapp_api

# Rollback
docker service rollback myapp_api

# Remove stack
docker stack rm myapp
```
Rolling Updates and Zero-Downtime Deployments
Update Configuration Options
```yaml
deploy:
  update_config:
    parallelism: 1            # Update one container at a time
    delay: 30s                # Wait 30s between updates
    failure_action: rollback  # Auto-rollback on failure
    monitor: 60s              # Monitor for 60s after update
    max_failure_ratio: 0      # Any failure triggers rollback
    order: start-first        # Start new before stopping old
```
| Option | Recommended Value | Purpose |
|---|---|---|
| parallelism | 1 | Update one at a time for safety |
| delay | 30s | Allow service to stabilize |
| failure_action | rollback | Automatic recovery |
| monitor | 60s | Catch delayed failures |
| order | start-first | Zero-downtime deployment |
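These settings trade speed for safety, and it helps to know roughly how long a rollout will take before choosing them. The sketch below is a back-of-the-envelope model, not Swarm's actual scheduler logic: it assumes each batch takes a fixed `startupSeconds` to come up (a hypothetical parameter), that `delay` applies between batches, and that the last batch is still observed for one `monitor` window.

```typescript
// Field names mirror the update_config keys; startupSeconds is an assumed
// figure for how long one new task takes to start and pass its health check.
interface RolloutPlan {
  replicas: number;
  parallelism: number;
  delaySeconds: number;
  monitorSeconds: number;
  startupSeconds: number;
}

// Rough lower bound on rollout duration under the assumptions stated above.
function estimateRolloutSeconds(plan: RolloutPlan): number {
  if (plan.replicas < 1 || plan.parallelism < 1) {
    throw new RangeError("replicas and parallelism must be at least 1");
  }
  const batches = Math.ceil(plan.replicas / plan.parallelism);
  return (
    batches * plan.startupSeconds +
    (batches - 1) * plan.delaySeconds +
    plan.monitorSeconds
  );
}

// Three replicas with the recommended values and an assumed 40s startup.
const recommended: RolloutPlan = {
  replicas: 3,
  parallelism: 1,
  delaySeconds: 30,
  monitorSeconds: 60,
  startupSeconds: 40,
};
console.log(`${estimateRolloutSeconds(recommended)}s`); // 3*40 + 2*30 + 60 = 240s
```

Under this model, raising `parallelism` to 3 would cut the estimate to 100 seconds, at the cost of replacing every replica in a single step.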
Networking Best Practices
Network Architecture
Network configuration:
```yaml
networks:
  # Public-facing services
  public:
    driver: overlay
    attachable: true

  # Internal services - no external access
  private:
    driver: overlay
    driver_opts:
      encrypted: "true"  # Encrypt traffic between nodes
    internal: true       # No external connectivity

  # Database network - extra isolation
  database:
    driver: overlay
    driver_opts:
      encrypted: "true"
    internal: true
```
The 42 Docker Production Best Practices
Image Optimization (1-10)
- 1. Use multi-stage builds to reduce image size
- 2. Pin base image versions (never use :latest)
- 3. Use Alpine or distroless images when possible
- 4. Order Dockerfile instructions by change frequency
- 5. Combine RUN commands to reduce layers
- 6. Use .dockerignore to exclude unnecessary files
- 7. Scan images for vulnerabilities (Trivy, Docker Scout)
- 8. Sign and verify images in production
- 9. Use BuildKit for faster builds
- 10. Cache dependencies separately from application code
Security (11-20)
- 11. Never run containers as root
- 12. Use read-only filesystems
- 13. Drop all capabilities, add only what's needed
- 14. Use secrets management (never environment variables for secrets)
- 15. Limit resources (CPU, memory)
- 16. Use internal networks for service-to-service communication
- 17. Enable TLS for all network communication
- 18. Regularly update base images
- 19. Use security contexts and AppArmor/SELinux profiles
- 20. Implement least-privilege principle
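Several of these practices can be checked mechanically before a deploy. As a rough sketch, assuming the Compose file has already been parsed into a plain object, a small audit for practices 11, 12, 13, and 15 might look like this. `ServiceConfig` and `auditService` are hypothetical names, and the check only sees the Compose file, so a non-root `USER` set inside the image would still be reported as missing.

```typescript
// Subset of Compose service keys relevant to practices 11, 12, 13, and 15.
interface ServiceConfig {
  user?: string;
  read_only?: boolean;
  cap_drop?: string[];
  deploy?: { resources?: { limits?: { cpus?: string; memory?: string } } };
}

// Return a human-readable finding for each practice the service does not meet.
function auditService(name: string, svc: ServiceConfig): string[] {
  const findings: string[] = [];

  const uid = svc.user?.split(":")[0];
  if (uid === undefined || uid === "" || uid === "0" || uid === "root") {
    findings.push(`${name}: no explicit non-root user`);
  }
  if (svc.read_only !== true) {
    findings.push(`${name}: root filesystem is writable`);
  }
  if (!svc.cap_drop?.some((cap) => cap.toUpperCase() === "ALL")) {
    findings.push(`${name}: does not drop all capabilities`);
  }
  const limits = svc.deploy?.resources?.limits;
  if (!limits?.cpus || !limits?.memory) {
    findings.push(`${name}: missing CPU or memory limit`);
  }
  return findings;
}

// Example: a service that sets a user but nothing else.
console.log(auditService("web", { user: "1000:1000" }));
```

Running a check like this in CI turns the list into something enforced instead of something remembered.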
Health & Reliability (21-30)
- 21. Add health checks to ALL services
- 22. Use start_period for slow-starting services
- 23. Implement graceful shutdown handlers
- 24. Configure restart policies appropriately
- 25. Set up proper logging drivers
- 26. Use depends_on with condition: service_healthy
- 27. Configure rolling updates with proper delays
- 28. Set failure_action to rollback
- 29. Monitor services with Prometheus/Grafana
- 30. Set up alerting for container failures
Performance (31-37)
- 31. Use tmpfs for temporary files
- 32. Configure appropriate resource limits
- 33. Use volume mounts for persistent data
- 34. Optimize container startup time
- 35. Use connection pooling for databases
- 36. Configure proper logging (avoid stdout flooding)
- 37. Use overlay2 storage driver
Operations (38-42)
- 38. Tag images with git commit SHA and semantic version
- 39. Implement blue-green or canary deployments
- 40. Automate deployments with CI/CD
- 41. Back up volumes and secrets regularly
- 42. Document runbooks for common operations
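Practice 38 is easy to automate in a build script. The sketch below shows one possible tagging scheme, assuming the CI job already knows the semantic version and the full commit SHA; the function names and the `sha-` prefix are illustrative conventions, not a Docker requirement.

```typescript
// Docker tags allow only letters, digits, "_", "." and "-" (up to 128 chars).
// Semver build metadata uses "+", so anything outside that set becomes "-".
function sanitizeTag(raw: string): string {
  return raw.replace(/[^A-Za-z0-9_.-]/g, "-").slice(0, 128);
}

// Produce a version tag, a version-plus-commit tag, and a commit-only tag.
function imageTags(repository: string, version: string, commitSha: string): string[] {
  const shortSha = commitSha.slice(0, 7);
  return [
    `${repository}:${sanitizeTag(version)}`,
    `${repository}:${sanitizeTag(`${version}-${shortSha}`)}`,
    `${repository}:sha-${shortSha}`,
  ];
}

console.log(imageTags("myapp/api", "1.6.0", "9f8e7d6c5b4a39281706f5e4d3c2b1a098765432"));
// → [ 'myapp/api:1.6.0', 'myapp/api:1.6.0-9f8e7d6', 'myapp/api:sha-9f8e7d6' ]
```

Deploying by the commit-qualified tag keeps a rollback unambiguous even if the plain version tag is later rebuilt.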
Monitoring and Observability
Basic Monitoring Stack
```yaml
# Pin these image tags to specific versions in production (practice 2)
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    networks:
      - monitoring
    deploy:
      placement:
        constraints:
          - node.role == manager

  grafana:
    image: grafana/grafana:latest
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      # Grafana reads file-based settings from variables ending in __FILE (double underscore)
      - GF_SECURITY_ADMIN_PASSWORD__FILE=/run/secrets/grafana_password
    secrets:
      - grafana_password
    networks:
      - monitoring
      - public

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    networks:
      - monitoring
    deploy:
      mode: global

volumes:
  prometheus_data:
  grafana_data:

networks:
  monitoring:
    driver: overlay
    internal: true
  # Assumes the public overlay network already exists
  public:
    external: true

secrets:
  # Assumes the secret was created beforehand with docker secret create
  grafana_password:
    external: true
```
Key Metrics to Monitor
| Metric | Alert Threshold | Action |
|---|---|---|
| Container CPU | > 80% for 5min | Scale or optimize |
| Container Memory | > 85% | Investigate leaks |
| Container Restarts | > 3 in 5min | Check logs |
| Health Check Failures | > 2 consecutive | Auto-restart |
| Disk Usage | > 80% | Cleanup or expand |
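Two of these thresholds are windowed conditions rather than single readings. In a Prometheus setup they would normally be written as alerting rules; the sketch below only makes the window logic explicit, with hypothetical function names and inputs (restart timestamps in Unix seconds, CPU samples as percentages at a fixed scrape interval).

```typescript
// True when more than `maxRestarts` restarts fall inside any window of
// `windowSeconds`, e.g. "> 3 in 5min" with the defaults.
function restartAlert(
  restartTimes: number[],
  maxRestarts = 3,
  windowSeconds = 300
): boolean {
  const sorted = [...restartTimes].sort((a, b) => a - b);
  let start = 0;
  for (let end = 0; end < sorted.length; end++) {
    // Shrink the window from the left until it spans at most windowSeconds.
    while (sorted[end] - sorted[start] > windowSeconds) start++;
    if (end - start + 1 > maxRestarts) return true;
  }
  return false;
}

// True when each of the most recent `minSamples` readings exceeds the threshold,
// e.g. "> 80% for 5min" is five consecutive one-minute samples above 80.
function sustainedAbove(samples: number[], threshold: number, minSamples: number): boolean {
  if (minSamples < 1 || samples.length < minSamples) return false;
  return samples.slice(-minSamples).every((value) => value > threshold);
}

console.log(restartAlert([0, 60, 120, 180]));
console.log(sustainedAbove([70, 85, 90, 95, 88, 91], 80, 5));
```

Requiring the condition to hold across the whole window avoids paging on a single CPU spike or an isolated restart.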
When to Graduate to Kubernetes
Docker Swarm works great, but consider Kubernetes/K3s when:
Brisbane Docker Consulting
At Buun Group, we help Queensland businesses deploy containerized applications:
- Docker architecture — design production-ready setups
- Swarm clusters — multi-node HA deployments
- Security hardening — secrets, networking, scanning
- Migration — move from VMs to containers
We've deployed production Docker workloads across various scales. We know what works.
