Docker & CI/CD Best Practices
After three years of containerizing applications and countless hours debugging CI/CD pipelines, I've accumulated a mental checklist of practices that separate smooth deployments from 2 AM production incidents. Let me share the lessons that cost me sleep so you can rest easy.
The Wake-Up Call
It was 11 PM on a Friday when our deployment pipeline broke. The build that took 5 minutes that morning was now taking 45 minutes. Our developers were waiting to push critical bug fixes. I was frantically Googling "why is Docker build so slow." That night taught me that Docker optimization isn't optional—it's essential.
Multi-Stage Builds: The Game Changer
Multi-stage builds transformed our Docker workflow. Before discovering them, our Python application images were 1.2 GB. Afterward? 180 MB. Here's the pattern:
```dockerfile
# ❌ Bad: Single-stage build (1.2 GB)
FROM python:3.11
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]

# ✅ Good: Multi-stage build (180 MB)
# Stage 1: Build stage with all build dependencies
FROM python:3.11-slim as builder
WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    g++ \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: Runtime stage with only necessary files
FROM python:3.11-slim
WORKDIR /app

# Copy only the installed packages from builder
COPY --from=builder /root/.local /root/.local
COPY . .

# Make sure scripts in .local are usable
ENV PATH=/root/.local/bin:$PATH

CMD ["python", "app.py"]
```
For Node.js applications, the difference is even more dramatic:
```dockerfile
# Build stage
FROM node:18-alpine as builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
```
The `node_modules` directory is copied straight from the builder stage, where `npm ci --only=production` produced a clean, reproducible install with no dev dependencies.

Layer Caching: The Hidden Performance Multiplier
Understanding Docker layer caching changed everything. Docker builds layers from top to bottom, and if a layer hasn't changed, it reuses the cache. The trick is ordering your Dockerfile intelligently.
The Wrong Way
```dockerfile
FROM python:3.11-slim
WORKDIR /app

# ❌ This invalidates cache every time code changes
COPY . .
RUN pip install -r requirements.txt

CMD ["python", "app.py"]
```
Every code change rebuilds dependencies, even though `requirements.txt` hasn't changed.

The Right Way
```dockerfile
FROM python:3.11-slim
WORKDIR /app

# ✅ Copy dependency file first
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy code last
COPY . .

CMD ["python", "app.py"]
```
Now dependencies stay cached until `requirements.txt` changes.

For more complex projects, be even more strategic:
```dockerfile
FROM node:18-alpine
WORKDIR /app

# Layer 1: Package files (changes rarely)
COPY package*.json ./

# Layer 2: Dependencies (cached unless package.json changes)
RUN npm ci

# Layer 3: Source code (changes frequently)
COPY src/ ./src/
COPY public/ ./public/

# Layer 4: Build step (only reruns if source changes)
RUN npm run build

CMD ["npm", "start"]
```
Think about your layers in order of change frequency: least frequently changing first.
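The rule behind this ordering can be modeled in a few lines: a layer is reused only if its own input and everything above it are unchanged, so a single miss invalidates every layer below it. Here is a rough Python simulation of that behavior (`plan_build` is an illustrative helper, not Docker's actual algorithm):

```python
import hashlib

def plan_build(layers, cache):
    """Simulate Docker's layer cache: walk layers top to bottom and
    reuse cached results until the first layer whose input changed."""
    results = []
    cache_valid = True
    prev_digest = ""
    for name, content in layers:
        # A layer's identity depends on its own content AND every layer above it.
        digest = hashlib.sha256((prev_digest + content).encode()).hexdigest()
        if cache_valid and cache.get(name) == digest:
            results.append((name, "CACHED"))
        else:
            cache_valid = False  # one miss invalidates all layers below
            cache[name] = digest
            results.append((name, "REBUILT"))
        prev_digest = digest
    return results

# First build populates the cache; a source-only change leaves deps cached.
cache = {}
v1 = [("package.json", "deps-v1"), ("npm ci", "install"), ("src/", "code-v1")]
plan_build(v1, cache)
v2 = [("package.json", "deps-v1"), ("npm ci", "install"), ("src/", "code-v2")]
print(plan_build(v2, cache))  # only the src/ layer is rebuilt
```

This is exactly why copying source code before installing dependencies is so costly: the `COPY . .` layer changes on every commit, dragging the install layer down with it.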
.dockerignore: The Forgotten Performance Booster
I once spent 30 minutes debugging why builds were slow, only to discover we were copying 500 MB of `node_modules` and the entire `.git` history into the build context. A `.dockerignore` file fixes this:

```
# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
.env
.env.local
*.md
.vscode
.idea
__pycache__
*.pyc
.pytest_cache
.coverage
dist
build
.DS_Store
```
This reduced our build context from 800 MB to 15 MB. The Docker daemon thanks you for not sending gigabytes of unnecessary files.
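If you want to estimate what your build context weighs before and after, it's easy to script. The sketch below uses `fnmatch`, which only approximates Docker's real `.dockerignore` matching (no `!` negation or `**` handling), so treat the numbers as a rough guide:

```python
import fnmatch
import os

def context_size(root, ignore_patterns):
    """Rough size in bytes of the Docker build context under `root`,
    skipping paths matched by .dockerignore-style patterns."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune ignored directories so we never descend into node_modules etc.
        dirnames[:] = [d for d in dirnames
                       if not _ignored(os.path.normpath(os.path.join(rel_dir, d)),
                                       ignore_patterns)]
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if not _ignored(rel, ignore_patterns):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total

def _ignored(rel_path, patterns):
    # Match either the whole relative path or any single path component.
    parts = rel_path.split(os.sep)
    return any(fnmatch.fnmatch(rel_path, p) or p in parts for p in patterns)

print(context_size(".", ["node_modules", ".git", "*.pyc", "dist", "build"]))
```

Run it once with your ignore patterns and once with an empty list; the gap is what `.dockerignore` is saving you on every build.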
GitHub Actions: From Slow to Fast
Our GitHub Actions workflow initially took 15 minutes. Here's how we got it under 4 minutes:
Strategy 1: Dependency Caching
```yaml
name: CI/CD Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      # Cache Python dependencies
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install -r requirements-dev.txt

      - name: Run tests
        run: pytest --cov=. --cov-report=xml
```
That single `cache: 'pip'` line caches pip's download cache between runs, keyed on a hash of `requirements.txt`, and saved us minutes on every build.

For Node.js:
```yaml
- uses: actions/setup-node@v3
  with:
    node-version: '18'
    cache: 'npm'
- run: npm ci
- run: npm test
```
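Under the hood, these setup actions derive a cache key from a hash of the dependency lockfile, so the cache is reused until dependencies actually change. A minimal Python sketch of that idea (the real actions use GitHub's `hashFiles()` expression; `cache_key` here is a hypothetical helper):

```python
import hashlib

def cache_key(prefix, *lockfiles):
    """Build a cache key from the contents of dependency lockfiles:
    same lockfiles -> same key -> cache hit; any change -> new key."""
    h = hashlib.sha256()
    for path in lockfiles:
        with open(path, "rb") as f:
            h.update(f.read())
    return f"{prefix}-{h.hexdigest()[:16]}"

# Hypothetical lockfile for demonstration.
open("requirements.example.txt", "w").write("flask==3.0.0\n")
print(cache_key("pip", "requirements.example.txt"))
```

This is also why you should pin dependencies in a lockfile: with loose version ranges, the key stays stable while the resolved packages drift underneath it.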
Strategy 2: Docker Layer Caching in CI
```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v2

- name: Build and push
  uses: docker/build-push-action@v4
  with:
    context: .
    push: true
    tags: myapp:latest
    cache-from: type=registry,ref=myapp:buildcache
    cache-to: type=registry,ref=myapp:buildcache,mode=max
```
This caches Docker layers between builds. Our Docker builds went from 12 minutes to 3 minutes.
Strategy 3: Parallel Jobs
```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.9', '3.10', '3.11']
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -r requirements.txt
      - run: pytest

  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: pip install flake8 black
      - run: flake8 .
      - run: black --check .

  build:
    runs-on: ubuntu-latest
    needs: [test, lint]
    steps:
      - uses: actions/checkout@v3
      - run: docker build -t myapp .
```
Tests, linting, and builds run simultaneously. Total pipeline time reduced by 60%.
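That 60% figure falls out of simple critical-path math: with parallel jobs, the pipeline takes as long as its slowest dependency chain rather than the sum of all jobs. A small sketch of that calculation (the durations are hypothetical):

```python
def pipeline_time(jobs):
    """Wall-clock time of a CI pipeline where independent jobs run in
    parallel: each job starts when the slowest of its dependencies finishes.
    `jobs` maps name -> (duration_minutes, [dependency names])."""
    finish = {}

    def done(name):
        if name not in finish:
            duration, needs = jobs[name]
            finish[name] = max((done(n) for n in needs), default=0) + duration
        return finish[name]

    return max(done(name) for name in jobs)

# Matrix tests and lint run side by side; build waits for both.
jobs = {
    "test": (6, []),
    "lint": (2, []),
    "build": (4, ["test", "lint"]),
}
print(pipeline_time(jobs))  # 10 minutes, vs 12 if run one after another
```

The practical takeaway: shortening the longest chain (here, test → build) speeds up the pipeline; shortening jobs off the critical path does nothing.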
Security Scanning: Catching Vulnerabilities Early
Nothing ruins your day like a security audit finding critical vulnerabilities in production. Integrate scanning into CI:
```yaml
- name: Run Trivy vulnerability scanner
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'myapp:latest'
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'

- name: Upload Trivy results to GitHub Security
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: 'trivy-results.sarif'
```
We caught a critical vulnerability in a Python dependency this way—before it hit production.
For even more comprehensive scanning:
```yaml
- name: Run Snyk to check for vulnerabilities
  uses: snyk/actions/python@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    args: --severity-threshold=high
```
Docker Compose for Local Development
One frustration I hear constantly: "It works on my machine." Docker Compose solves this:
```yaml
version: '3.8'

services:
  app:
    build: .
    ports:
      - "8000:8000"
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/mydb
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
    command: uvicorn app.main:app --host 0.0.0.0 --reload

  db:
    image: postgres:15
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  postgres_data:
```
Now every developer runs `docker-compose up` and gets an identical environment: app, database, and cache, wired together the same way on every machine.

Production Optimization: The Details Matter
1. Health Checks
```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```
This lets orchestrators like Kubernetes detect and restart unhealthy containers.
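For that HEALTHCHECK to pass, the app needs a `/health` endpoint to answer. A minimal sketch using only the Python standard library (a real check would also probe the database, queues, and other dependencies, not just return a constant):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """Tiny /health endpoint: 200 when healthy, 503 otherwise."""
    healthy = True  # a real app would check DB connections, queues, etc.

    def do_GET(self):
        if self.path == "/health":
            status = 200 if self.healthy else 503
            body = b"ok" if status == 200 else b"unhealthy"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep container logs quiet
        pass

def serve(port=0):
    """Start the server on a background thread; return the bound port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.server_address[1]

port = serve()
with urllib.request.urlopen(f"http://127.0.0.1:{port}/health") as resp:
    print(resp.status, resp.read())
```

Keep the endpoint cheap: orchestrators hit it every few seconds, and a health check that does heavy work becomes its own source of load.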
2. Non-Root User
```dockerfile
# Create a non-root user
RUN adduser --disabled-password --gecos '' appuser

# Change ownership of app files
RUN chown -R appuser:appuser /app

# Switch to non-root user
USER appuser

CMD ["python", "app.py"]
```
Running as root is a security risk. This simple change significantly improves your security posture.
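As a belt-and-braces complement to the `USER` directive, the application itself can refuse to start as root, which catches images where someone forgot the directive. A small sketch (`assert_not_root` is a hypothetical helper; `os.geteuid` is POSIX-only):

```python
import os

def assert_not_root(uid=None):
    """Fail fast if running as root. Pass uid explicitly for testing;
    defaults to the real effective uid of the process."""
    uid = os.geteuid() if uid is None else uid
    if uid == 0:
        raise SystemExit("refusing to run as root; set USER in the Dockerfile")

# Demonstrate with an explicit non-root uid so this runs anywhere:
assert_not_root(uid=1000)
print("startup check passed")
```

Call it once at startup, before the app binds any ports; a loud immediate failure beats a silent root process in production.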
3. Environment-Specific Configurations
```yaml
# docker-compose.prod.yml
version: '3.8'

services:
  app:
    image: myapp:latest
    restart: always
    environment:
      - NODE_ENV=production
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '1'
          memory: 512M
```
The Build-Test-Deploy Pipeline
Here's our complete production pipeline:
```yaml
name: Production Deployment

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
          cache: 'pip'
      - run: pip install -r requirements.txt
      - run: pytest --cov=. --cov-report=xml

  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: docker build -t myapp:scan .
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: 'myapp:scan'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'

  build:
    needs: [test, security]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: myapp:latest,myapp:${{ github.sha }}
          cache-from: type=registry,ref=myapp:buildcache
          cache-to: type=registry,ref=myapp:buildcache,mode=max

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to production
        run: |
          echo "Deploying to Kubernetes/ECS/your platform"
          # Your deployment commands here
```
This pipeline ensures every production deployment is tested, secure, and reproducible.
Lessons from the Trenches
- Optimize for the common case: Most builds are incremental. Cache aggressively.
- Fail fast: Run quick tests first, slow tests later.
- Security is not optional: Scan every image before it reaches production.
- Local-prod parity: Docker Compose should mirror production as closely as possible.
- Monitor your builds: If builds start slowing down, investigate immediately.
After implementing these practices across a dozen projects, our deployment confidence went from "fingers crossed" to "ship it with confidence." Docker and CI/CD done right aren't obstacles—they're accelerators.