Friday, 1 May 2026

TASK H - GUIDE - Dockerize the App

 

Containerization & Image Management (Docker) — Step-by-Step Guide

Overview

In this task, you containerize the HealthPulse Portal using Docker. You will build a production-ready Docker image, scan it for vulnerabilities, push it to a registry, and deploy it to your k3s cluster manually. Every step here is something the CI pipeline will eventually automate in Task F — you are doing it by hand first so you understand what the automation is actually doing.

Why now? You just finished Task G (bare-metal deployment) where you manually copied files to an Nginx server, reloaded services, and had no versioning, no rollback, and no environment parity. Docker solves all of those problems by packaging the application and its server into a single, immutable, versioned artifact.

What you'll do:

  1. Understand the multi-stage Dockerfile
  2. Build the Docker image locally
  3. Run and test the container
  4. Use Docker Compose for consistent configuration
  5. Scan the image for security vulnerabilities (manually)
  6. Tag and push to a container registry
  7. Deploy to k3s manually using kubectl
  8. Compare bare-metal vs container deployment
  9. Document in MkDocs
  10. Clean up

Time estimate: This is a Week 5 task, typically completed after Task G.


Prerequisites

Before starting Task H, ensure you have completed:

  •  Task G — Bare-metal deployment done (you have felt the pain of manual SCP, Nginx config, no versioning)
  •  Docker installed (docker --version → v24 or newer)
  •  Docker Compose installed (docker compose version → v2.x)
  •  The project builds locally (npm run build produces dist/, matching the npm commands used in the Dockerfile)
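
The version checks above can be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper names major_version and check_prereqs are illustrative and not part of the project.

```shell
# Print the major version from strings such as
# "Docker version 27.1.1, build 6312585" or "Docker Compose version v2.29.1".
major_version() {
  printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+)\..*$/\1/'
}

# Return 0 when Docker 24+ and Docker Compose v2+ are installed.
check_prereqs() {
  command -v docker >/dev/null 2>&1 || { echo "docker not found" >&2; return 1; }
  docker_major=$(major_version "$(docker --version)")
  compose_major=$(major_version "$(docker compose version)")
  [ "$docker_major" -ge 24 ] 2>/dev/null || { echo "Docker 24 or newer required" >&2; return 1; }
  [ "$compose_major" -ge 2 ] 2>/dev/null || { echo "Docker Compose v2 required" >&2; return 1; }
  echo "OK: Docker $docker_major, Compose v$compose_major"
}
```

Run check_prereqs once before starting; a non-zero exit status means one of the prerequisites is missing or too old.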

Why Task G first? If you skip straight to Docker, you won't appreciate what it solves. Task G is intentionally painful — it is the "before" picture. Task H is the "after."


Step 1: Understand the Dockerfile

1.1 — Open and Read the Dockerfile

Open docker/Dockerfile and study it line by line:

# ============================================
# Stage 1: Build the application
# ============================================
FROM node:20-alpine AS build

WORKDIR /app

# Copy dependency files first for layer caching
COPY package.json package-lock.json ./
RUN npm ci

# Copy source code
COPY . .

# Build arguments for environment-specific builds
ARG VITE_API_URL=http://localhost:3000/api
ARG VITE_ENV=production
ARG VITE_APP_VERSION=1.0.0

# Build the application
RUN npm run build

# ============================================
# Stage 2: Serve with Nginx
# ============================================
FROM nginx:1.27-alpine AS production

# Remove default nginx config
RUN rm /etc/nginx/conf.d/default.conf

# Copy custom nginx config
COPY docker/nginx.conf /etc/nginx/conf.d/default.conf

# Copy built application from build stage
COPY --from=build /app/dist /usr/share/nginx/html

# Add healthcheck
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:80/ || exit 1

# Expose port 80
EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

1.2 — What Is a Multi-Stage Build?

A multi-stage build uses multiple FROM statements. Each FROM starts a new stage. Only the final stage becomes the image you ship.

Stage 1: "build" (node:20-alpine)          Stage 2: "production" (nginx:1.27-alpine)
┌────────────────────────────────┐          ┌────────────────────────────────┐
│  Node.js 20                    │          │  Nginx 1.27                    │
│  npm                           │          │                                │
│  package.json + package-lock   │          │  nginx.conf (custom)           │
│  All source code (src/, etc.)  │          │  dist/ (copied from Stage 1)   │
│  node_modules/ (hundreds of MB)│          │                                │
│  dist/ (build output)          │  ──────> │  HEALTHCHECK configured        │
│                                │  COPY    │                                │
│  ~ 400-800 MB                  │ --from=  │  ~ 40-60 MB                    │
│                                │  build   │                                │
│  DISCARDED after build         │          │  THIS IS YOUR FINAL IMAGE      │
└────────────────────────────────┘          └────────────────────────────────┘

What stays: Only the dist/ folder (your compiled HTML/CSS/JS) and the Nginx server.

What is discarded: Node.js, npm, node_modules, source code, TypeScript files — everything needed to build but not to run. This is why the final image is ~50 MB instead of ~800 MB.

1.3 — Why Alpine?

Both stages use Alpine Linux variants (node:20-alpine and nginx:1.27-alpine).

Base Image           | Size    | Use Case
node:20 (Debian)     | ~1 GB   | Full OS, every tool included, good for development
node:20-alpine       | ~130 MB | Minimal OS, only what's needed, good for builds
nginx:1.27 (Debian)  | ~190 MB | Full Nginx with extras
nginx:1.27-alpine    | ~45 MB  | Minimal Nginx, perfect for serving static files

Alpine uses musl libc instead of glibc and apk instead of apt. It leaves out what a server image rarely needs (man pages, package caches, most of the GNU toolset) and provides basic shell utilities through BusyBox. For a production image that just serves static files, this is ideal.

1.4 — How the nginx.conf Works

The custom Nginx config (docker/nginx.conf) is copied into the image at build time:

server {
    listen 80;
    server_name _;
    root /usr/share/nginx/html;
    index index.html;

    # Security headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    ...

    # Gzip compression
    gzip on;
    ...

    # Cache static assets aggressively
    location /assets/ {
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    # SPA routing — serve index.html for all routes
    location / {
        try_files $uri $uri/ /index.html;
    }

    # Health check endpoint
    location /health {
        access_log off;
        # default_type sets the Content-Type of the body returned below
        default_type application/json;
        return 200 '{"status":"healthy"}';
    }
}

Key points:

  • try_files — This is critical for React Router. Without it, navigating to /dashboard and refreshing gives a 404 because there is no dashboard file on disk. try_files tells Nginx: "If the file doesn't exist, serve index.html and let React handle the route."
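  • /health — A synthetic health endpoint intended for probes such as Kubernetes liveness and readiness checks. Note that the HEALTHCHECK in the Dockerfile, as written, requests / rather than /health; either target confirms that Nginx is serving.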
  • Security headers — Prevent clickjacking (X-Frame-Options) and MIME sniffing (X-Content-Type-Options). X-XSS-Protection targets reflected XSS in older browsers only; current browsers ignore it.
  • Gzip — Compresses text-based assets before sending to the browser, reducing transfer size by 60-80%.
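
Before building, the config file can be syntax-checked in a throwaway container. This is a minimal sketch, assuming Docker is running and the command is issued from the project root; the function name validate_nginx_conf is illustrative.

```shell
# Validate docker/nginx.conf with the same Nginx image the Dockerfile uses.
# The file is bind-mounted read-only, and the container is removed afterwards.
validate_nginx_conf() {
  docker run --rm \
    -v "$PWD/docker/nginx.conf:/etc/nginx/conf.d/default.conf:ro" \
    nginx:1.27-alpine nginx -t
}
```

A zero exit status means Nginx accepted the file. A syntax error is reported with its line number, which is quicker to act on than a container that exits at startup.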

1.5 — Layer Caching Strategy

Notice the order of COPY instructions in Stage 1:

COPY package.json package-lock.json ./  # Step A — changes rarely
RUN npm ci                              # Step B — expensive (30-60s)
COPY . .                                # Step C — changes every commit
RUN npm run build                       # Step D — depends on source code

Docker caches each layer. If a layer's input hasn't changed, Docker reuses the cached version. By copying dependency files before source code:

  • If you only changed source code (Step C), Steps A and B are cached — npm ci is skipped entirely
  • A rebuild takes seconds instead of minutes

If you put COPY . . before npm ci, every code change would invalidate the dependency cache, forcing a full reinstall on every build. This ordering mistake is one of the most common causes of slow Docker builds.
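
One way to observe the cache at work is to count the steps that BuildKit reports as CACHED. This is a sketch, assuming BuildKit is the active builder and that its plain progress output marks reused steps with lines such as "#6 CACHED"; the function name count_cached_steps is illustrative.

```shell
# Count the build steps served from cache.
# Reads plain-progress build output on stdin and prints the count.
count_cached_steps() {
  grep -cE '^#[0-9]+ CACHED' || true
}

# Usage, from the project root after at least one earlier build
# (BuildKit writes progress to stderr, hence 2>&1):
#   docker build --progress=plain -t healthpulse-portal:local \
#     -f docker/Dockerfile . 2>&1 | count_cached_steps
```

After a source-only change, the steps up to and including npm ci should be among those counted. After editing package.json, they should not.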


Step 2: Build the Docker Image Locally

2.1 — Run the Build

From the project root (not the docker/ directory):

docker build -t healthpulse-portal:local -f docker/Dockerfile .

Flag                        | Meaning
docker build                | Build a Docker image from a Dockerfile
-t healthpulse-portal:local | Tag the image with name healthpulse-portal and tag local
-f docker/Dockerfile        | Use this specific Dockerfile (not the default ./Dockerfile)
.                           | Build context: the directory Docker sends to the build daemon. Here . is the project root, so COPY . . copies everything in the project
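
The three ARG values declared in the Dockerfile can be overridden at build time with --build-arg. The following is a sketch, assuming the app reads these values through Vite at build time, so a change requires a rebuild; the function name build_with_args and the example URL are placeholders.

```shell
# Build with explicit values for the ARGs declared in docker/Dockerfile.
# Arguments: <api-url> <version>
build_with_args() {
  docker build \
    --build-arg "VITE_API_URL=$1" \
    --build-arg "VITE_ENV=production" \
    --build-arg "VITE_APP_VERSION=$2" \
    -t "healthpulse-portal:$2" \
    -f docker/Dockerfile .
}

# Example (placeholder values):
#   build_with_args https://api.example.com/api 1.0.0
```

Any ARG that is not passed keeps the default written in the Dockerfile.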

2.2 — Watch the Build Output

You will see Docker executing each instruction:

[+] Building 45.2s (15/15) FINISHED
 => [build 1/6] FROM node:20-alpine@sha256:...                    2.3s
 => [build 2/6] WORKDIR /app                                      0.0s
 => [build 3/6] COPY package.json package-lock.json ./             0.1s
 => [build 4/6] RUN npm ci                                        28.4s
 => [build 5/6] COPY . .                                          0.3s
 => [build 6/6] RUN npm run build                                  8.1s
 => [production 1/4] FROM nginx:1.27-alpine@sha256:...             1.5s
 => [production 2/4] RUN rm /etc/nginx/conf.d/default.conf         0.2s
 => [production 3/4] COPY docker/nginx.conf ...                    0.1s
 => [production 4/4] COPY --from=build /app/dist ...               0.1s
 => exporting to image                                             0.2s

The first build will be slow (~45s) because there are no cached layers. Subsequent builds with only source code changes will be much faster (~10s) thanks to layer caching.

2.3 — Verify the Image

docker images healthpulse-portal
REPOSITORY            TAG       IMAGE ID       CREATED          SIZE
healthpulse-portal    local     a1b2c3d4e5f6   30 seconds ago   47.2MB

Compare this to bare-metal (Task G): On the EC2 server, you had Node.js installed, npm, build tools, source code — all living on the server. The Docker image contains only Nginx and the compiled static files. 47 MB vs a full Ubuntu server.
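
To check the size difference between the two stages yourself, Stage 1 can be built on its own with --target and listed next to the final image. This is a sketch; the tag build-stage and the function name compare_stage_sizes are illustrative.

```shell
# Build only Stage 1 ("build") under a throwaway tag, then the full image,
# and print the size of every healthpulse-portal tag.
compare_stage_sizes() {
  docker build --target build -t healthpulse-portal:build-stage -f docker/Dockerfile . &&
  docker build -t healthpulse-portal:local -f docker/Dockerfile . &&
  docker images --format '{{.Repository}}:{{.Tag}}  {{.Size}}' healthpulse-portal
}

# Afterwards, remove the throwaway tag:
#   docker rmi healthpulse-portal:build-stage
```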


Step 3: Run and Test Locally

3.1 — Start the Container

docker run -d --name healthpulse -p 8080:80 healthpulse-portal:local

Flag                     | Meaning
-d                       | Detached mode: run in the background (don't lock your terminal)
--name healthpulse       | Give the container a human-readable name
-p 8080:80               | Map port 8080 on your machine to port 80 inside the container
healthpulse-portal:local | The image to run

Port mapping explained:

Your Machine                     Docker Container
┌──────────────┐                ┌──────────────────┐
│              │  -p 8080:80   │                  │
│  localhost   │──────────────>│  Nginx           │
│  :8080       │               │  :80             │
│              │               │                  │
│  Browser     │               │  /usr/share/     │
│  curl        │               │  nginx/html/     │
└──────────────┘               └──────────────────┘

3.2 — Test with curl

# Health check
curl http://localhost:8080/health
# → {"status":"healthy"}

# Home page (should return HTML)
curl -s http://localhost:8080/ | head -5
# → <!DOCTYPE html>
# → <html lang="en">
# → ...

3.3 — Test in the Browser

Open http://localhost:8080 in your browser. You should see the HealthPulse Portal. Navigate around — try /dashboard/appointments, then refresh the page. If the page loads correctly on refresh, the try_files SPA fallback is working.
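
The refresh test and the security headers can also be checked from the terminal. This is a minimal sketch, assuming the container from Step 3.1 is listening on port 8080; the helper names status_of and has_header are illustrative.

```shell
# Print the HTTP status code for a path on the running container.
status_of() {
  curl -s -o /dev/null -w '%{http_code}' "http://localhost:8080$1"
}

# Succeed if the response headers on stdin contain the named header
# (matched case-insensitively).
has_header() {
  grep -qi "^$1:"
}

# Example checks:
#   [ "$(status_of /dashboard/appointments)" = "200" ] && echo "SPA fallback OK"
#   curl -sI http://localhost:8080/ | has_header X-Frame-Options && echo "header set"
```

A 404 from the first check would indicate that the try_files fallback is not in effect.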

3.4 — Inspect the Running Container

# See running containers
docker ps
CONTAINER ID   IMAGE                      COMMAND                  STATUS                    PORTS                  NAMES
a1b2c3d4e5f6   healthpulse-portal:local   "/docker-entrypoint.…"  Up 2 minutes (healthy)    0.0.0.0:8080->80/tcp   healthpulse

Note the (healthy) status — that is the HEALTHCHECK in the Dockerfile working.
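
A script can wait for that status instead of checking docker ps by eye. The sketch below polls docker inspect for the health state. The helper names wait_for and is_healthy, and the WAIT_DELAY variable, are illustrative.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <command...>
wait_for() {
  wf_attempts=$1
  shift
  wf_i=1
  while [ "$wf_i" -le "$wf_attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "${WAIT_DELAY:-1}"   # seconds between attempts
    wf_i=$((wf_i + 1))
  done
  return 1
}

# True once Docker reports the container's HEALTHCHECK as passing.
is_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' healthpulse 2>/dev/null)" = "healthy" ]
}

# Example: wait_for 60 is_healthy && echo "container is healthy"
```

The status reads starting until the first probe has run, so allow more attempts than the 30s interval set in the Dockerfile.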

# View container logs (Nginx access logs)
docker logs healthpulse

# Follow logs in real-time (Ctrl+C to stop)
docker logs -f healthpulse

# Inspect container details (image, network, mounts, etc.)
docker inspect healthpulse

# Execute a command inside the running container
docker exec -it healthpulse /bin/sh

# Once inside, explore:
ls /usr/share/nginx/html/      # Your built app
cat /etc/nginx/conf.d/default.conf  # Your Nginx config
nginx -t                       # Test Nginx config
exit

3.5 — Stop the Container

docker stop healthpulse

Checkpoint: You have built and run the HealthPulse Portal in a Docker container. The image contains everything needed to serve the app — Nginx, config, static files — in a single 47 MB artifact.


Step 4: Docker Compose

4.1 — Why Compose?

Running docker run with all its flags is error-prone. Docker Compose captures the entire runtime configuration in a YAML file so you can start the app with one command and get the same result every time.

Without Compose                                                      | With Compose
docker build -t healthpulse-portal:local -f docker/Dockerfile .      | docker compose -f docker/docker-compose.yml up -d
docker run -d --name healthpulse -p 8080:80 healthpulse-portal:local | (one command does both build + run)
Must remember every flag                                             | Flags are in the YAML file
Multiple commands for multiple containers                            | One file, one command

4.2 — Review the Compose File

Open docker/docker-compose.yml:

version: "3.8"

services:
  healthpulse:
    build:
      context: ..
      dockerfile: docker/Dockerfile
      args:
        VITE_API_URL: ${VITE_API_URL:-http://localhost:3000/api}
        VITE_ENV: ${VITE_ENV:-development}
        VITE_APP_VERSION: ${VITE_APP_VERSION:-1.0.0-dev}
    ports:
      - "${APP_PORT:-3000}:80"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:80/"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s

Key details:

  • context: .. — The build context is the parent directory (project root), not the docker/ folder
  • ${APP_PORT:-3000} — Uses the APP_PORT environment variable if set, defaults to 3000
  • restart: unless-stopped — Container restarts automatically if it crashes (unless you explicitly stop it)
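  • healthcheck — Same concept as the Dockerfile HEALTHCHECK, but defined at the Compose level. When both exist, the Compose definition overrides the one baked into the image, so the 5s timeout and 10s start period here take effect
  • version: "3.8" — Current Docker Compose releases treat the top-level version key as obsolete and ignore it with a warning; it is harmless but can be removed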
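
The Compose file can be exercised from the project root as follows. This is a sketch: the helper names are illustrative, and app_url uses the same default-value syntax as the Compose file, so it resolves to the port Compose will publish.

```shell
# Mirror the Compose default: APP_PORT if set and non-empty, otherwise 3000.
app_url() {
  printf 'http://localhost:%s\n' "${APP_PORT:-3000}"
}

# Build and start the service, then probe the health endpoint.
compose_up() {
  docker compose -f docker/docker-compose.yml up -d --build &&
  curl -fsS "$(app_url)/health"
}

# Stop and remove the service's container and network.
compose_down() {
  docker compose -f docker/docker-compose.yml down
}

# Example with a non-default port:
#   export APP_PORT=8080
#   compose_up
```

APP_PORT is exported so that docker compose sees the same value as the shell. If the probe runs before Nginx is listening, run the curl command again.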

Step 5: Manual Security Scanning

Before pushing your image to a registry, you should scan it for known vulnerabilities. In production, the CI pipeline (Task F) will automate this. Here, you do it by hand to understand what the automation does.

5.1 — What Is Vulnerability Scanning?

Docker images are built on base images (like nginx:1.27-alpine), which contain OS packages. Those packages may have known security vulnerabilities (CVEs). A scanner compares the packages in your image against public vulnerability databases and reports what it finds.

Your Image: healthpulse-portal:local
├── nginx:1.27-alpine (base)
│   ├── alpine 3.20 (OS)
│   │   ├── openssl 3.1.4  ← CVE-2024-XXXX (HIGH)
│   │   ├── curl 8.5.0     ← no known CVEs
│   │   ├── musl 1.2.5     ← no known CVEs
│   │   └── ...
│   └── nginx 1.27.0       ← CVE-2024-YYYY (MEDIUM)
├── Your static files (dist/)  ← no findings expected (no OS packages or dependency manifests)
└── nginx.conf                 ← not scanned (config file)

5.2 — Option A: Trivy (Recommended — Open Source)

Trivy is one of the most widely used open-source container scanners. Install it, then scan:

# Install Trivy (macOS)
brew install trivy

# Install Trivy (Debian/Ubuntu)
# apt-key is deprecated, so the signing key goes into a dedicated keyring
sudo apt-get install -y wget gnupg
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | gpg --dearmor | sudo tee /usr/share/keyrings/trivy.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/trivy.gpg] https://aquasecurity.github.io/trivy-repo/deb generic main" | sudo tee /etc/apt/sources.list.d/trivy.list
sudo apt-get update && sudo apt-get install -y trivy

# Install Trivy (Windows — via scoop or download binary)
scoop install trivy

Run the scan:

trivy image healthpulse-portal:local

Example output:

healthpulse-portal:local (alpine 3.20.3)

Total: 5 (UNKNOWN: 0, LOW: 2, MEDIUM: 1, HIGH: 2, CRITICAL: 0)

┌──────────────┬────────────────┬──────────┬────────────────┬───────────────┬─────────────────────────────────┐
│   Library    │ Vulnerability  │ Severity │ Installed Ver  │  Fixed Ver    │            Title                │
├──────────────┼────────────────┼──────────┼────────────────┼───────────────┼─────────────────────────────────┤
│ libssl3      │ CVE-2024-XXXX  │ HIGH     │ 3.1.4-r0       │ 3.1.4-r1      │ openssl: buffer overflow in ... │
│ libcrypto3   │ CVE-2024-XXXX  │ HIGH     │ 3.1.4-r0       │ 3.1.4-r1      │ openssl: buffer overflow in ... │
│ curl         │ CVE-2024-YYYY  │ MEDIUM   │ 8.5.0-r0       │ 8.5.1-r0      │ curl: header injection via...   │
│ busybox      │ CVE-2024-ZZZZ  │ LOW      │ 1.36.1-r15     │ 1.36.1-r16    │ busybox: unsafe temp file...    │
│ musl         │ CVE-2024-WWWW  │ LOW      │ 1.2.5-r0       │               │ musl: minor memory leak...      │
└──────────────┴────────────────┴──────────┴────────────────┴───────────────┴─────────────────────────────────┘

5.3 — Option B: Docker Scout

Docker Scout is built into Docker Desktop (v4.17+):

docker scout cves healthpulse-portal:local

Example output:

    i New version 1.14.0 available (installed version is 1.13.0)
    ✓ Image stored for indexing
    ✓ Indexed 45 packages

  ## Overview

                      │ Analyzed Image
  ────────────────────┼──────────────────────────
    Target            │ healthpulse-portal:local
    digest            │ sha256:abc123...
    platform          │ linux/amd64
    vulnerabilities   │ 0C  1H  2M  2L
    size              │ 47 MB
    packages          │ 45

  ## Vulnerabilities

    1H  libssl3       3.1.4-r0   (fixed in 3.1.4-r1)
    1M  curl          8.5.0-r0   (fixed in 8.5.1-r0)
    ...

5.4 — Option C: Snyk CLI

If your organization uses Snyk:

# Install Snyk CLI
npm install -g snyk

# Authenticate
snyk auth

# Scan the image
snyk container test healthpulse-portal:local

5.5 — Understanding Severity Levels

Severity | Meaning                                                             | Action
CRITICAL | Easily exploitable with severe impact, such as remote code execution | Fix immediately; do not deploy
HIGH     | Serious vulnerability; a working exploit is plausible                | Fix before production deployment
MEDIUM   | Vulnerability exists, exploit requires specific conditions           | Fix in next release cycle
LOW      | Minor issue, theoretical risk                                        | Track and fix when convenient

5.6 — Common Fixes

Problem                       | Fix
Vulnerabilities in base image | Pin a specific patched version: FROM nginx:1.27.1-alpine instead of FROM nginx:1.27-alpine
Unnecessary packages in image | Remove packages not needed at runtime: RUN apk del <package>
Image too large               | Use multi-stage builds (you already do), use Alpine variants
Outdated base image           | Update to the latest patch: check Docker Hub for newer tags

5.7 — Document Your Findings

Record the scan results in your MkDocs wiki. Include:

  • Which scanner you used
  • How many vulnerabilities at each severity level
  • Whether fixes are available
  • What actions you would take for each finding

Key insight: This is what the CI pipeline will automate in Task F. In the pipeline, a scan runs on every build, and the build fails if CRITICAL or HIGH vulnerabilities are found. You are doing it by hand first so you understand what the pipeline is checking and why.
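
The same pass/fail rule can be tried by hand with Trivy's --severity and --exit-code flags. This is a minimal sketch; the function name scan_gate is illustrative, and the actual pipeline in Task F may be configured differently.

```shell
# Exit non-zero when the image has HIGH or CRITICAL findings.
# Argument: <image reference>
scan_gate() {
  trivy image \
    --severity HIGH,CRITICAL \
    --exit-code 1 \
    "$1"
}

# Example:
#   if scan_gate healthpulse-portal:local; then echo "scan passed"; else echo "scan failed"; fi
```

Adding --ignore-unfixed would limit the gate to findings that already have a fixed version. That is a policy choice rather than something this guide prescribes.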


Step 6: Tag and Push to Registry

Your image only exists on your local machine. To deploy it elsewhere (k3s cluster, other servers, teammates), you need to push it to a container registry — a centralized store for Docker images.

6.1 — Image Tagging Strategy

Before pushing, tag your image with a version strategy:

# Tag with a specific version
docker tag healthpulse-portal:local <REGISTRY>/healthpulse-portal:1.0.0

# Also tag as latest
docker tag healthpulse-portal:local <REGISTRY>/healthpulse-portal:latest

Why two tags?

Tag    | Purpose
1.0.0  | Immutable version: this exact build, forever. Used for rollback, auditing, and reproducibility.
latest | Floating tag: always points to the most recent build. Convenient but dangerous in production (you don't know exactly which version you're running).

Best practice: Always deploy by version tag (1.0.0), never by latest. Use latest only for development convenience.
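
A small guard can enforce that rule when tagging. The sketch below accepts only plain MAJOR.MINOR.PATCH versions. The helper names and the example registry prefix are illustrative, and the version format is an assumption based on the 1.0.0 example above.

```shell
# Succeed only for versions of the form MAJOR.MINOR.PATCH, such as 1.0.0.
is_release_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Tag the local image for a registry, refusing floating or malformed tags.
# Arguments: <registry-prefix> <version>
tag_release() {
  is_release_version "$2" || { echo "not a release version: $2" >&2; return 1; }
  docker tag healthpulse-portal:local "$1/healthpulse-portal:$2"
}

# Example (placeholder registry):
#   tag_release registry.example.com/team 1.0.0
```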

6.2 — Option A: JFrog Artifactory (Enterprise Registry)

If your organization uses JFrog Artifactory:

# Set your registry URL (get this from your instructor or Artifactory admin)
REGISTRY="your-artifactory.jfrog.io/healthpulse-docker"

# Tag the image for Artifactory
docker tag healthpulse-portal:local $REGISTRY/healthpulse-portal:1.0.0
docker tag healthpulse-portal:local $REGISTRY/healthpulse-portal:latest

# Log in to Artifactory
docker login your-artifactory.jfrog.io
# → Username: your-username
# → Password: your-api-token (NOT your password — generate a token in Artifactory)

# Push both tags
docker push $REGISTRY/healthpulse-portal:1.0.0
docker push $REGISTRY/healthpulse-portal:latest

Verify in the Artifactory UI:

  1. Open your Artifactory URL in a browser
  2. Navigate to Artifacts → healthpulse-docker → healthpulse-portal
  3. You should see tags 1.0.0 and latest

6.3 — Option B: Docker Hub (Public Registry)

If you are using Docker Hub:

# Your Docker Hub username
DOCKERHUB_USER="your-dockerhub-username"

# Tag the image for Docker Hub
docker tag healthpulse-portal:local $DOCKERHUB_USER/healthpulse-portal:1.0.0
docker tag healthpulse-portal:local $DOCKERHUB_USER/healthpulse-portal:latest

# Log in to Docker Hub
docker login
# → Username: your-dockerhub-username
# → Password: your-access-token (generate at hub.docker.com → Account Settings → Security)

# Push both tags
docker push $DOCKERHUB_USER/healthpulse-portal:1.0.0
docker push $DOCKERHUB_USER/healthpulse-portal:latest

Verify at https://hub.docker.com/r/<your-username>/healthpulse-portal/tags.

This is what the CI pipeline will automate in Task F. On every successful build, the pipeline will: build the image, scan it, tag it with the build number, and push it to the registry. You are doing each step manually so you understand the full flow.
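
The four pipeline steps named above can be chained so that a failing step stops the rest. This is a sketch only, not the Task F pipeline: the function name release is illustrative, and REGISTRY and BUILD_NUMBER are assumed to be set by the caller.

```shell
# Build, scan, tag, and push in order; stop at the first failing step.
release() {
  [ -n "$REGISTRY" ] && [ -n "$BUILD_NUMBER" ] ||
    { echo "set REGISTRY and BUILD_NUMBER first" >&2; return 1; }
  image="$REGISTRY/healthpulse-portal:$BUILD_NUMBER"
  docker build -t healthpulse-portal:local -f docker/Dockerfile . &&
  trivy image --severity HIGH,CRITICAL --exit-code 1 healthpulse-portal:local &&
  docker tag healthpulse-portal:local "$image" &&
  docker push "$image"
}

# Example (placeholder values):
#   REGISTRY=registry.example.com/team
#   BUILD_NUMBER=42
#   release
```

Because the steps are joined with &&, an image that fails the scan is never tagged or pushed.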


