Sunday, 17 May 2026

GITLAB CI

TASK J: GitLab CI — CI Pipeline Guide

Overview

This guide gets your Continuous Integration (CI) pipeline running end-to-end.

You will complete this guide first. The CD (Continuous Delivery) components — Kustomize overlay updates, Argo CD sync — are covered in a separate guide. This separation is intentional: get CI green and your image in the registry first, then layer in CD.

By the end of this guide you will have:

  • A GitLab Runner installed and registered on your EC2
  • A .gitlab-ci.yml that installs dependencies, lints, tests, scans, and builds your app, publishes it as a Docker image, and scans that image, automatically on every push
  • Your image visible in GitLab's built-in Container Registry
  • Your compiled app uploaded to JFrog Artifactory

What this guide does NOT cover yet:

  • Kustomize overlay updates
  • CD repo commits
  • Argo CD syncing
  • Kubernetes deployments

Those come in the next guide once this pipeline is green.


Pipeline Overview

The CI pipeline you will build runs these stages in order:

install
   └── npm ci — downloads and caches dependencies

test  (parallel)
   ├── lint       — ESLint checks
   └── unit-tests — Vitest + coverage report

scan  (parallel)
   ├── gitleaks      — scans for committed secrets
   ├── sonarqube     — code quality + coverage gate
   └── snyk-security — dependency vulnerability scan

build
   └── build-app — compiles React/TypeScript → dist/

publish  (parallel)
   ├── docker-publish      — builds Docker image, pushes to GitLab registry
   └── artifactory-upload  — uploads dist/ to JFrog Artifactory

scan-image
   └── image-scan — scans the built Docker image for OS-level CVEs (Trivy)

At the end of a successful run you will see:

  • A green pipeline in GitLab → Build → Pipelines
  • Your image tagged <pipeline-number>-<git-sha> in GitLab → Deploy → Container Registry
  • Your build artefacts in JFrog Artifactory
  • A Trivy scan report confirming no HIGH or CRITICAL CVEs in the image
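
The tag format in the list above can be reproduced locally. This is a minimal sketch, assuming the pipeline number and short commit SHA are passed in as arguments; the function name build_version is illustrative and not part of GitLab.

```shell
#!/bin/sh
# Compose an image tag in the form <pipeline-number>-<git-sha>.
# build_version is a hypothetical helper name used only for illustration.
build_version() {
  pipeline_number="$1"   # in CI this would come from $CI_PIPELINE_IID
  short_sha="$2"         # in CI this would come from $CI_COMMIT_SHORT_SHA
  printf '%s-%s\n' "$pipeline_number" "$short_sha"
}

build_version 42 a1b2c3d   # prints 42-a1b2c3d
```

Putting the pipeline number first keeps tags in build order when sorted, while the SHA ties each image back to one commit.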

Two-Repo Setup — Quick Recap

You have two GitLab repositories:

Repo       What lives there
───────    ──────────────────────────────────────────────────────────
CI repo    Source code, Dockerfile, .gitlab-ci.yml, tests
CD repo    kubernetes/ (Kustomize overlays, Argo CD Application CRDs)

This guide works entirely in the CI repo. You will not touch the CD repo until the next guide.


Step 1: What is GitLab CI and Why Is It Relevant?

GitLab is more than a Git host

GitLab is a unified DevOps platform — it ships Git hosting, CI/CD, a container registry, security scanning, and environment dashboards in a single product. You do not need to wire together separate tools.

For the HealthPulse capstone this matters because:

  • Your .gitlab-ci.yml sits in the same repo as your source code — no separate CI server to configure
  • GitLab's built-in Container Registry gives you a free private Docker registry with zero setup ($CI_REGISTRY is pre-populated automatically)
  • CI/CD Variables store your secrets — no Vault, no .env files committed by accident
  • The Pipelines page shows every run, its logs, and pass/fail status in real time

How GitLab CI works — the big picture

Developer pushes to 'develop'
        │
        ▼
GitLab reads .gitlab-ci.yml
        │
        ▼
GitLab queues jobs for your Runner
        │
        ▼
Runner (your EC2) picks up each job
        │
        ▼
Runner starts a fresh Docker container per job
        │
        ▼
Scripts run inside the container
        │
        ▼
Container is destroyed, logs streamed back to GitLab
        │
        ▼
Pipeline shows green ✅ or red ❌

How jobs actually run — the container model

Each job is a fresh, isolated container that starts when the job begins and is destroyed when it ends.

The pipeline is the orchestrator. The containers are the workers. Here is what the CI pipeline looks like as containers:

Pipeline
├── stage: install
│     └── container: node:24-alpine       → runs npm ci → destroyed
├── stage: test (parallel)
│     ├── container: node:24-alpine       → runs lint → destroyed
│     └── container: node:24-alpine       → runs vitest → destroyed
├── stage: scan (parallel)
│     ├── container: gitleaks:v8.30.1     → scans repo for secrets → destroyed
│     ├── container: sonar-scanner-cli    → sends data to SonarQube server → destroyed
│     └── container: node:24-alpine       → npx downloads + runs snyk → destroyed
├── stage: build
│     └── container: node:24-alpine       → runs npm run build → destroyed
├── stage: publish (parallel)
│     ├── container: docker:29.4.3-dind   → builds + pushes image → destroyed
│     └── container: jfrog-cli-v2-jf      → uploads dist/ to Artifactory → destroyed
└── stage: scan-image
      └── container: aquasec/trivy        → pulls image from registry, scans for CVEs → destroyed

The only things that survive between containers:

What            How
────────────    ─────────────────────────────────────────────────────────────
Artifacts       Files uploaded at the end of a job, downloaded at the start of the next job that needs: it
Cache           node_modules/ reused between pipeline runs
The registry    The built Docker image, which lives there permanently after docker push

This is why image: is one of the most important keys in a job — it chooses the operating system and pre-installed tools for that container's run. Almost nothing needs to be pre-installed on your EC2 host.

GitLab's built-in container registry — is it free?

Yes. Included on every plan. Container registry storage is not counted against GitLab's 10 GiB project storage quota — it is tracked separately with no published hard limit on the free tier.

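GitLab pre-populates these registry variables in every job automatically:

Variable                 What it contains
─────────────────────    ─────────────────────────────────────────────────────────────
$CI_REGISTRY             Registry URL, e.g. registry.gitlab.com
$CI_REGISTRY_IMAGE       This project's image path in the registry, e.g. registry.gitlab.com/<group>/<project>
$CI_REGISTRY_USER        Username for the job's registry login: gitlab-ci-token
$CI_REGISTRY_PASSWORD    Temporary password (the CI job token), valid only while the job is running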

You do not need to add any variables to use it.

Registry options — if you prefer a different registry

This guide uses the GitLab built-in registry by default. The only changes needed for other registries are the login command and the image tag prefix:

JCR (JFrog): Add variables JCR_REGISTRY, JCR_USER, JCR_TOKEN, then:

before_script:
  - echo "$JCR_TOKEN" | docker login -u "$JCR_USER" --password-stdin "$JCR_REGISTRY"
script:
  - docker build -f docker/Dockerfile -t $JCR_REGISTRY/$APP_NAME:$BUILD_VERSION .
  - docker push $JCR_REGISTRY/$APP_NAME:$BUILD_VERSION

DockerHub: Add variables DOCKERHUB_USER, DOCKERHUB_TOKEN, then:

before_script:
  - echo "$DOCKERHUB_TOKEN" | docker login -u "$DOCKERHUB_USER" --password-stdin
script:
  - docker build -f docker/Dockerfile -t $DOCKERHUB_USER/$APP_NAME:$BUILD_VERSION .
  - docker push $DOCKERHUB_USER/$APP_NAME:$BUILD_VERSION

The rest of the pipeline is identical regardless of which registry you choose.
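
The prefix difference between the two alternatives can be sketched as a small helper. This is only an illustration: image_prefix is a hypothetical function, and the values assigned below are placeholders rather than real accounts. The variable names themselves (JCR_REGISTRY, DOCKERHUB_USER, APP_NAME) are the ones used in this guide.

```shell
#!/bin/sh
# image_prefix is a hypothetical helper, not part of Docker or GitLab.
# It prints the part of the image reference that comes before ":<tag>".
image_prefix() {
  case "$1" in
    jcr)       printf '%s/%s\n' "$JCR_REGISTRY" "$APP_NAME" ;;
    dockerhub) printf '%s/%s\n' "$DOCKERHUB_USER" "$APP_NAME" ;;
    *)         echo "unknown registry: $1" >&2; return 1 ;;
  esac
}

# Placeholder values for the example only
APP_NAME=healthpulse-portal
JCR_REGISTRY=jcr.example.com
DOCKERHUB_USER=exampleuser

image_prefix jcr         # prints jcr.example.com/healthpulse-portal
image_prefix dockerhub   # prints exampleuser/healthpulse-portal
```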


Step 2: What is a GitLab Runner?

The runner is the machine that does the work

GitLab.com is the coordinator — it stores your code, reads your .gitlab-ci.yml, and decides which jobs to run. But it does not execute the jobs itself.

A GitLab Runner is a separate agent process that runs on a machine you control. It polls GitLab for jobs, picks them up, runs the scripts, and streams the logs back.

GitLab Server                    Your EC2 (runner)
─────────────────                ──────────────────────────
Reads .gitlab-ci.yml
Queues jobs                      Runner polls: "any jobs for me?"
Assigns job to runner   ──────►  Runner receives job
                                 Runner spins up Docker container
                                 Runner executes scripts inside it
                        ◄──────  Runner streams logs back
Stores artefacts                 Runner uploads artefacts
Shows pass/fail in UI

Executor types

When you register a runner you choose an executor — how each job is isolated:

Executor          How it works                           Use when
──────────────    ───────────────────────────────────    ─────────────────────────────────────────────────────
Docker            Each job runs in a fresh container     ✅ Recommended: clean, isolated every time
Shell             Scripts run directly on the host OS    Legacy only: no isolation between jobs
Kubernetes        A new pod per job                      Production-grade at scale; overkill for this project
Docker Machine    Autoscaling VMs                        ❌ Deprecated by GitLab

Modern standard: the Kubernetes executor is common in large organisations that already run clusters. For your EC2-based setup, the Docker executor is the right choice: it gives you the same container-per-job isolation without the cluster management overhead, and the concepts transfer directly to the Kubernetes executor when you reach a production environment.

For the HealthPulse capstone you will use:

  • Docker executor for all test, scan, and build jobs
  • Docker-in-Docker (dind) as a service inside the docker-publish job

Registration tokens are disabled by default since GitLab 17.0

Before GitLab 17.0 you copied a "Registration Token" from Settings and passed it as --registration-token. That method is disabled by default since GitLab 17.0 and will be fully removed in GitLab 20.0.

New flow: create the runner in the GitLab UI first, get an authentication token (glrt- prefix), then register on EC2 using --token. This guide uses the new flow.
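
Because the two token types are easy to mix up, a quick prefix check before registering can save a failed attempt. A minimal sketch, assuming only the glrt- prefix described above; is_runner_auth_token is a hypothetical helper name and the sample tokens are made up.

```shell
#!/bin/sh
# Returns 0 if the argument looks like a runner authentication token
# (glrt- prefix followed by at least one character), 1 otherwise.
# is_runner_auth_token is a hypothetical helper, not a gitlab-runner command.
is_runner_auth_token() {
  case "$1" in
    glrt-?*) return 0 ;;
    *)       return 1 ;;
  esac
}

if is_runner_auth_token "glrt-example-token"; then
  echo "looks like a runner authentication token"
fi
```

The check only inspects the prefix; it cannot tell whether GitLab will accept the token.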


Step 3: Prepare Your EC2 for the Runner

3.1 — Check your EC2 specs

Resource    Minimum             Recommended
────────    ────────────────    ─────────────────────────────────────────────
CPU         1 vCPU              2 vCPU (t3.medium)
RAM         2 GB                4 GB
Disk        20 GB               30 GB (Docker image layers fill the disk quickly)
OS          Ubuntu 20.04 LTS    Ubuntu 22.04 / 24.04 LTS

t2.micro will not work. The docker-publish job runs Docker-in-Docker and needs at least 2 GB RAM to build the image without running out of memory.
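
The minimums in the table can be turned into a simple comparison. This is a sketch under stated assumptions: meets_minimum is a hypothetical helper, and it takes the instance's nominal vCPU count, RAM in MB, and disk in GB as arguments. The kernel reports slightly less memory than the nominal figure, so feed it the instance-type values rather than numbers read from /proc/meminfo.

```shell
#!/bin/sh
# meets_minimum CPUS RAM_MB DISK_GB
# Returns 0 when all three values reach the minimum column of the table
# (1 vCPU, 2 GB RAM, 20 GB disk), 1 otherwise.
# Hypothetical helper for illustration only.
meets_minimum() {
  cpus="$1"
  ram_mb="$2"
  disk_gb="$3"
  [ "$cpus" -ge 1 ] && [ "$ram_mb" -ge 2048 ] && [ "$disk_gb" -ge 20 ]
}

# t3.medium: 2 vCPU, 4 GB RAM, with a 30 GB volume attached
if meets_minimum 2 4096 30; then echo "t3.medium: ok"; fi

# t2.micro: 1 vCPU, 1 GB RAM, with an 8 GB volume attached
if ! meets_minimum 1 1024 8; then echo "t2.micro: too small"; fi
```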

3.2 — Install Docker Engine

Do not use docker.io — it is an unofficial Ubuntu-maintained package that lags on security patches and can conflict with official Docker packages. Docker's own documentation explicitly says to remove it first.

SSH into your runner EC2 and run:

# Step 1 — Remove any unofficial Docker packages if present
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do
  sudo apt-get remove -y $pkg 2>/dev/null || true
done

# Step 2 — Install required packages
sudo apt-get update
sudo apt-get install -y ca-certificates curl

# Step 3 — Add Docker's official GPG key
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Step 4 — Add the Docker CE repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Step 5 — Install Docker CE
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Step 6 — Start Docker and enable on boot
sudo systemctl start docker
sudo systemctl enable docker

Verify it works:

docker --version
# Expected: Docker version 29.x.x, build ...

sudo docker run hello-world
# Expected: "Hello from Docker!" message

3.3 — Why Docker is required

This guide uses the Docker executor. Every CI job specifies an image: — the Runner pulls that image and runs the job scripts inside a fresh container. Without Docker on the host, every job fails with:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock

What needs to be on the host vs what arrives via Docker images:

Tool                       Needed on host?           How it arrives
───────────────────────    ──────────────────────    ────────────────────────────────────────
Docker Engine              ✅ Yes, install it now     Step 3.2 above
Node.js                    ❌ No                      node:24-alpine image per job
SonarScanner               ❌ No                      sonarsource/sonar-scanner-cli:12.1 image
Gitleaks                   ❌ No                      ghcr.io/gitleaks/gitleaks:v8.30.1 image
Docker CLI (for builds)    ❌ No                      docker:29.4.3-dind image
git                        ✅ Yes                     Pre-installed on Ubuntu
curl                       ✅ Yes                     Pre-installed on Ubuntu
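
A quick preflight on the host can confirm the rows marked ✅ above before you install the runner. A minimal sketch; missing_tools is a hypothetical helper that relies only on the shell's built-in command -v lookup.

```shell
#!/bin/sh
# Print the name of every tool from the argument list that is not on PATH.
# Prints nothing when everything is present.
# missing_tools is a hypothetical helper for illustration.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The host-side tools listed in the table above
missing_tools docker git curl
```

An empty result means all three are installed. It does not prove the Docker daemon is running, which is what the hello-world check in Step 3.2 covers.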

Step 4: Install and Register a GitLab Runner

4.1 — Install the runner binary

On the EC2:

# Add the GitLab Runner package repository
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash

# Install
sudo apt-get install -y gitlab-runner

# Verify
gitlab-runner --version
# Expected: gitlab-runner 18.x.x (current stable: 18.11)

4.2 — Create the runner in GitLab UI

  1. Go to your CI repo on GitLab
  2. Click Settings → CI/CD → Runners
  3. Click New project runner
  4. Fill in:
    • Platform: Linux
    • Tags: healthpulse,docker
    • Description: healthpulse-runner
    • Tick Run untagged jobs
  5. Click Create runner
  6. GitLab shows a gitlab-runner register command with a pre-filled glrt- token — copy the full command

4.3 — Register the runner on EC2

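Paste the command GitLab gave you, but add the executor, --docker-privileged, and --docker-volumes flags before running:

sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --token "glrt-t3_xxxxxxxxxxxxxxxxxxxx" \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --docker-privileged \
  --docker-volumes "/certs/client" \
  --description "healthpulse-runner"

Why --docker-privileged? The docker-publish job uses Docker-in-Docker: the dind service starts a full Docker daemon inside a container, and a daemon can only run in a container that has privileged mode. Without this flag the image build step fails.

Why --docker-volumes "/certs/client"? The dind service generates TLS certificates when it starts and writes the client certificates to /certs/client. Sharing that path as a volume lets the job container's Docker CLI find them and connect to the daemon over TLS. Without it, docker commands in the job typically fail with "Cannot connect to the Docker daemon".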

Only enable privileged mode on a dedicated runner EC2, not on a shared or production server.
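
After registering, you can confirm the setting landed in the runner's configuration. This is a minimal sketch, assuming the default config location /etc/gitlab-runner/config.toml used by the Linux package; runner_is_privileged is a hypothetical helper, not a gitlab-runner subcommand.

```shell
#!/bin/sh
# Returns 0 if the given runner config contains a line "privileged = true".
# The pattern is anchored to the start of the line, so a key such as
# services_privileged does not match.
# runner_is_privileged is a hypothetical helper for illustration.
runner_is_privileged() {
  config_file="${1:-/etc/gitlab-runner/config.toml}"
  grep -Eq '^[[:space:]]*privileged[[:space:]]*=[[:space:]]*true' "$config_file"
}

# Usage on the runner EC2 (the file is readable by root only):
#   sudo sh -c '. ./check.sh; runner_is_privileged && echo "privileged mode is on"'
```

The check matches any runner entry in the file, so on a host with several runners registered, inspect the file by hand instead.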

4.4 — Start and verify the runner

sudo gitlab-runner start
sudo gitlab-runner status
# Expected: gitlab-runner: Service is running

sudo gitlab-runner list
# Expected: healthpulse-runner   Executor=docker  Token=glrt-...

Go to Settings → CI/CD → Runners in GitLab. The runner should show a green circle (online).

4.5 — Give the runner access to Docker

sudo usermod -aG docker gitlab-runner
sudo systemctl restart gitlab-runner

# Verify
sudo -u gitlab-runner docker ps
# Expected: empty table (no error)

Step 5: Configure CI/CD Variables

5.1 — Open the Variables settings

Go to your CI repo → Settings → CI/CD → expand Variables → click Add variable.

5.2 — Variables required for the CI pipeline

Security scanning:

Key               Value                         Protected    Masked
──────────────    ──────────────────────────    ─────────    ──────
SONAR_TOKEN       Your SonarQube user token     Yes          Yes
SONAR_HOST_URL    http://<SONARQUBE_IP>:9000    No           No
SNYK_TOKEN        Your Snyk API token           Yes          Yes

Artifactory upload:

Key                     Value                               Protected    Masked
────────────────────    ────────────────────────────────    ─────────    ──────
ARTIFACTORY_URL         http://<JCR_IP>:8082/artifactory    No           No
ARTIFACTORY_USER        healthpulse-deployer                No           No
ARTIFACTORY_PASSWORD    JCR access token                    Yes          Yes

GitLab registry — nothing to add. $CI_REGISTRY, $CI_REGISTRY_USER, and $CI_REGISTRY_PASSWORD are injected automatically by GitLab into every pipeline.

Variables for CD components (CD_REPO_TOKEN, CD_REPO_URL, Ansible Tower variables) are added in the CD guide. You do not need them yet.

5.3 — What Protected and Masked mean

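Protected: the variable is only injected into pipelines that run on protected branches or tags. Use this for tokens and passwords. Only the default branch is protected out of the box, so protect develop under Settings → Repository → Protected branches before pushing there; otherwise protected variables arrive empty and the jobs that need them fail to authenticate.

Masked: the value is replaced with [MASKED] in job logs. Always mask passwords and tokens.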
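
An empty protected variable usually surfaces as a confusing authentication error deep inside a tool. A small guard at the top of a job's script can fail early with a clearer message. A minimal sketch; require_vars is a hypothetical helper, and the variable names are the ones from the tables in 5.2.

```shell
#!/bin/sh
# require_vars NAME...
# Returns 0 when every named environment variable is set and non-empty.
# Otherwise prints each missing name to stderr and returns 1.
# Hypothetical helper; pass only trusted, literal variable names because
# the lookup uses eval.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing CI/CD variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example use inside a job's script section:
#   require_vars SONAR_TOKEN SONAR_HOST_URL || exit 1
```

The helper prints names only, never values, so it is safe to use with masked variables.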

5.4 — Where to get your Snyk token

  1. Go to app.snyk.io → Account Settings
  2. Under Auth Token, click Click to show
  3. Copy the token and paste it as SNYK_TOKEN in GitLab

Never call snyk auth in CI. It opens a browser OAuth flow and hangs in headless environments. The SNYK_TOKEN environment variable is the correct approach — the Snyk CLI picks it up automatically.


Step 6: Understanding the .gitlab-ci.yml

Read this section before creating the file. It explains every concept you will use.

What is it and where does it live?

It is a YAML file at the root of your CI repo. GitLab reads it automatically on every push and uses it to decide what pipeline to run.

healthpulse-ci/            ← your CI repo root
├── src/
├── docker/
│   ├── Dockerfile
│   └── nginx.conf
├── tests/
├── .gitlab-ci.yml         ← here, at the root
└── package.json

The Dockerfile — what it is and what it does

The pipeline has two separate steps that together produce the Docker image:

build-app (node:24-alpine)           docker-publish (docker:29.4.3-dind)
──────────────────────────           ───────────────────────────────────
npm run build                   →    docker build -f docker/Dockerfile
produces dist/ ─────────────────►    Dockerfile copies dist/ into Nginx
                  artifact

build-app runs npm run build inside a Node container. Vite compiles your React/TypeScript source into optimised static files in dist/. That folder is uploaded as a GitLab artifact.

docker-publish receives the dist/ artifact, then runs docker build. The Dockerfile packages those pre-built static files into a lightweight Nginx image.

Create docker/Dockerfile in your CI repo:

# =============================================================
#  HealthPulse Portal — Dockerfile
#  Serves the pre-compiled React app via Nginx.
#
#  The dist/ folder is built by the 'build-app' CI job and
#  passed here as a GitLab artifact. This image just packages
#  and serves it — no Node.js needed at runtime.
# =============================================================

FROM nginx:1.27-alpine

# Remove the default Nginx welcome page
RUN rm -rf /usr/share/nginx/html/*

# Copy the compiled React app from the CI build-app artifact
COPY dist/ /usr/share/nginx/html

# Copy the custom Nginx config (handles React Router client-side routing)
COPY docker/nginx.conf /etc/nginx/conf.d/default.conf

EXPOSE 80

CMD ["nginx", "-g", "daemon off;"]

Create docker/nginx.conf in your CI repo:

server {
    listen 80;
    server_name _;

    root /usr/share/nginx/html;
    index index.html;

    # React Router — serve index.html for all routes
    location / {
        try_files $uri $uri/ /index.html;
    }

    # Cache static assets aggressively
    location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2|ttf|eot)$ {
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    # Do not cache index.html — always serve the latest version
    location = /index.html {
        add_header Cache-Control "no-cache, no-store, must-revalidate";
    }

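    # Health check endpoint, used by the Kubernetes liveness probe.
    # default_type sets the Content-Type of the returned body; using
    # add_header here would send a second, conflicting Content-Type header.
    location /health {
        default_type text/plain;
        return 200 'OK';
    }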
}

Commit both files before creating the pipeline:

mkdir -p docker
# create docker/Dockerfile and docker/nginx.conf with the content above
git add docker/Dockerfile docker/nginx.conf
git commit -m "feat: add Dockerfile and Nginx config for React app"
git push origin develop

Top-level structure

variables:      # Global variables available to all jobs
  KEY: value

stages:         # Order stages run in
  - install
  - test

job-name:       # A job definition
  stage: test
  script:
    - echo "hello"

Anatomy of a job

job-name:
  stage: test              # Which stage this job belongs to
  image: node:24-alpine    # Docker image — the container for this job
  needs: [install]         # Jobs that must finish before this one starts
  variables:               # Job-level variables (override globals)
    MY_VAR: value
  before_script:           # Runs before script — use for setup (login, installs)
    - npm config set ...
  script:                  # The actual commands — this is the job's work
    - npm run test
  artifacts:               # Files to keep and pass to later jobs
    paths:
      - coverage/
    expire_in: 7 days
  cache:                   # Files to reuse between pipeline runs
    paths:
      - node_modules/
  services:                # Sidecar containers (e.g. Docker daemon)
    - docker:29.4.3-dind
  allow_failure: false     # If true, pipeline stays green even if this job fails

Key concepts explained

image: The Docker image that becomes the container for this job. It provides the OS and pre-installed tools. Each job can use a different image.

needs: Creates a dependency. The job starts as soon as the listed jobs finish, even if other jobs in earlier stages are still running, and downloads the listed jobs' artifacts first. Without needs:, a job waits for every job in all earlier stages to finish. Jobs in the same stage run in parallel either way.

artifacts: Files uploaded to GitLab after the job. Any job that needs: this job can download them. This is how dist/ travels from build-app to docker-publish.

cache: Files persisted between pipeline runs (not between jobs). This guide caches node_modules/. Note that npm ci deletes any existing node_modules/ before installing, so this cache does not stop npm ci from downloading packages again; caching npm's download directory (for example .npm/ together with npm ci --cache .npm) is what avoids the re-download.

services: Sidecar containers that run alongside the job. docker:29.4.3-dind starts a Docker daemon the job's Docker CLI can talk to. This is what Docker-in-Docker means.

allow_failure: Controls whether the pipeline fails if this job fails. false (default) means a failing job fails the pipeline. true means the job is informational: it reports findings but does not block.

YAML anchors — avoiding repetition

.node_cache: &node_cache       # Define once
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/

install:
  <<: *node_cache              # Merge in — same as writing it out
  script:
    - npm ci

&node_cache defines the anchor. *node_cache references it. <<: merges it into the job. This prevents repeating the same cache config in every job.

How each scan job works

gitleaks

The runner pulls the gitleaks image, starts a container, and GitLab automatically clones your repo into it before your script: runs. The scan of files and commit history happens entirely inside the container. GitLab makes a shallow clone by default, so only the most recent commits are visible to the scan unless the job sets GIT_DEPTH to 0.

entrypoint: [""] is required because the gitleaks image sets its default entrypoint to the gitleaks binary. Without overriding it, Docker would pass your script: lines as arguments to gitleaks instead of running them as shell commands, which breaks GitLab's wrapper.

needs: [] means the job has no dependencies, so it starts as soon as the pipeline is created rather than waiting for the install and test stages.

sonarqube

Same container mechanism, but needs: [unit-tests] does two things: waits for the job to finish and downloads its artifacts. That is how coverage/lcov.info arrives in this container.

The sonar-scanner CLI is a client — it reads your source files and coverage report, then ships everything to your SonarQube server ($SONAR_HOST_URL). The actual analysis runs on the server. -Dsonar.qualitygate.wait=true tells the scanner to keep polling the server until it returns a pass or fail result before the job exits.
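
The wait behaviour described above is ordinary polling. The sketch below illustrates the idea only and is not how sonar-scanner is implemented: wait_for_gate and the check command passed to it are hypothetical, and the OK / ERROR strings are assumed status values, with anything else treated as "not ready yet".

```shell
#!/bin/sh
# wait_for_gate CHECK_CMD MAX_ATTEMPTS [DELAY_SECONDS]
# Calls CHECK_CMD repeatedly until it prints OK (return 0) or ERROR
# (return 1). Any other output counts as "still pending".
# Returns 2 if no verdict arrives within MAX_ATTEMPTS tries.
# Hypothetical helper; CHECK_CMD stands in for a call to the server.
wait_for_gate() {
  check_cmd="$1"
  max_attempts="$2"
  delay="${3:-5}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    gate_status=$($check_cmd)
    case "$gate_status" in
      OK)    return 0 ;;
      ERROR) return 1 ;;
    esac
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 2
}

# Stand-in check that reports a passing gate straight away
demo_check() { echo OK; }
wait_for_gate demo_check 3 0 && echo "quality gate passed"
```

The separate return code for a timeout matters in CI: "the server never answered" and "the gate failed" call for different follow-up.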

snyk-security

The node:24-alpine image does not contain the Snyk CLI. That is why npx snyk is used. npx is the Node Package Runner — it checks node_modules/.bin/snyk first, then downloads Snyk from npm on the fly if not found.

needs: [install] brings node_modules/ into this container so Snyk can see the exact resolved version of every dependency and transitive dependency — not just the version ranges in package.json. Exact versions are what Snyk checks against its vulnerability database.

Job              Image contains the tool?           How the tool arrives
─────────────    ───────────────────────────────    ───────────────────────────
gitleaks         ✅ Yes, the image is gitleaks       Built into the image
sonarqube        ✅ Yes, baked into the CLI image    Pre-installed in the image
snyk-security    ❌ No, plain Node image             npx downloads it at runtime

Source code scans vs image scan — why you need both

The three scans above all run before the Docker image is built. They look at your source code and your node_modules/ dependencies. They cannot see what is inside the final image.

The image scan runs after docker-publish. It pulls the image that was just pushed to the registry and scans everything inside it — including the OS packages in nginx:1.27-alpine. These are completely different from your npm dependencies.

Source code scans (gitleaks, sonarqube, snyk)
  └── What they see: your .ts files, node_modules/, git history
  └── What they miss: Alpine Linux packages, Nginx binaries, OpenSSL in the image

Image scan (Trivy)
  └── What it sees: every layer of the built Docker image
  └── Catches: CVEs in nginx, Alpine libc, OpenSSL, libcrypto — anything in the base image
  └── Does not care about: your TypeScript source code

A real-world example: your source code could be perfectly clean, but nginx:1.27-alpine could ship with a version of libssl that has a known CVE. Only an image scan catches that.

Trivy (by Aqua Security) is a widely used open-source scanner for container images. It is free, fast, and runs entirely inside the job container, so no external scanning service is needed. It downloads its vulnerability database at runtime from Aqua's public database.

image-scan:
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  needs: [docker-publish]
  variables:
    TRIVY_USERNAME: $CI_REGISTRY_USER      # Trivy uses these to pull the image
    TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD  # from the GitLab registry for scanning
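  script:
    # $CI_REGISTRY_IMAGE is this project's path in the GitLab registry,
    # which is the path the job's registry credentials are scoped to
    - trivy image --exit-code 1 --severity HIGH,CRITICAL
        --no-progress --format table
        --output trivy-results.txt
        $CI_REGISTRY_IMAGE:$BUILD_VERSION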

--exit-code 1 means Trivy exits with code 1 (fail) if it finds any HIGH or CRITICAL CVEs — blocking the pipeline. --severity HIGH,CRITICAL ignores LOW and MEDIUM findings. You can tighten or loosen this threshold as your team's policy dictates.

TRIVY_USERNAME and TRIVY_PASSWORD are set to GitLab's auto-injected registry credentials so Trivy can pull the image it needs to scan. Without these, Trivy cannot authenticate with a private registry.
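
The combined effect of --severity and --exit-code can be pictured as a filter followed by a gate. The sketch below mimics that decision in plain shell for illustration; it is not Trivy's code, severity_gate is a hypothetical helper, and the input is an assumed list with one severity per line.

```shell
#!/bin/sh
# severity_gate BLOCKED_LIST
# Reads one severity per line on stdin. Returns 1 as soon as a line
# matches an entry in the comma-separated BLOCKED_LIST, else returns 0.
# Hypothetical helper that imitates the pass/fail decision only.
severity_gate() {
  blocked="$1"
  while IFS= read -r severity; do
    [ -n "$severity" ] || continue      # skip blank lines
    case ",$blocked," in
      *,"$severity",*) return 1 ;;
    esac
  done
  return 0
}

# Only LOW and MEDIUM findings: the gate stays open
printf 'LOW\nMEDIUM\n' | severity_gate HIGH,CRITICAL && echo "pipeline continues"

# One CRITICAL finding: the gate closes
printf 'LOW\nCRITICAL\n' | severity_gate HIGH,CRITICAL || echo "pipeline blocked"
```

Loosening the policy means shortening the blocked list, for example to CRITICAL alone; tightening it means adding MEDIUM.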


Step 7: Create the .gitlab-ci.yml

In the root of your CI repo, create .gitlab-ci.yml with the following content.

Option A — on your local machine:

cd ~/projects/healthpulse-ci    # adjust to wherever you cloned it
touch .gitlab-ci.yml
# open in your editor, paste the content below, save

Option B — GitLab web editor:

  1. Go to your CI repo on GitLab
  2. Click + → New file
  3. File name: .gitlab-ci.yml
  4. Paste the content below

# =============================================================
#  HealthPulse Portal — GitLab CI Pipeline (CI only)
#  Repo: CI repo (source code)
#  Flow: Install → Test → Scan → Build → Publish → Scan image
#
#  CD components (Kustomize overlay updates, Argo CD sync)
#  are added in the next guide after this pipeline is green.
# =============================================================

# ─────────────── GLOBAL VARIABLES ───────────────
variables:
  APP_NAME: healthpulse-portal
  NODE_IMAGE: node:24-alpine

# ─────────────── PIPELINE STAGES ───────────────
stages:
  - install        # npm ci — download dependencies
  - test           # lint + unit tests (parallel)
  - scan           # gitleaks + sonarqube + snyk (parallel)
  - build          # npm run build — compile the React app
  - publish        # docker build + push + artifactory upload (parallel)
  - scan-image     # trivy — scan the built Docker image for OS-level CVEs

# ─────────────── YAML ANCHOR — node cache ───────────────
.node_cache: &node_cache
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules/

# =============================================================
# STAGE: INSTALL
# Downloads npm dependencies and caches them.
# Uploads node_modules/ as an artifact so later jobs get it
# without repeating the install.
# =============================================================
install:
  stage: install
  image: $NODE_IMAGE
  <<: *node_cache
  script:
    - npm ci
  artifacts:
    paths:
      - node_modules/
    expire_in: 1 hour

# =============================================================
# STAGE: TEST
# lint and unit-tests run in parallel — both need node_modules.
# unit-tests uploads the coverage report as an artifact for
# SonarQube to use in the scan stage.
# =============================================================
lint:
  stage: test
  image: $NODE_IMAGE
  needs: [install]
  script:
    - npm run lint

unit-tests:
  stage: test
  image: $NODE_IMAGE
  needs: [install]
  script:
    - npm run test:coverage
  artifacts:
    when: always       # Upload coverage even if tests fail (useful for debugging)
    paths:
      - coverage/
    reports:
      junit: test-results/*.xml
    expire_in: 7 days
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'

# =============================================================
# STAGE: SCAN
# Three scans run in parallel. Each job starts as soon as the
# jobs listed in its needs: finish, not when the test stage ends.
#
# gitleaks:      checks for secrets committed to the repo
# sonarqube:     code quality + coverage gate
# snyk-security: dependency vulnerability scan
#
# gitleaks and sonarqube are blocking (allow_failure: false).
# snyk is informational (allow_failure: true).
# =============================================================
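gitleaks:
  stage: scan
  image:
    name: ghcr.io/gitleaks/gitleaks:v8.30.1
    entrypoint: [""]    # Override image entrypoint so GitLab can run its own shell script
  needs: []             # No dependencies: starts as soon as the pipeline is created
  variables:
    GIT_DEPTH: "0"      # Fetch full history; the default shallow clone hides older commits
  script:
    # No --config flag: gitleaks loads .gitleaks.toml from the repo root
    # if the file exists and falls back to its built-in rules otherwise.
    - gitleaks detect --source . --verbose
  allow_failure: false  # Pipeline fails if secrets are found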

sonarqube:
  stage: scan
  image:
    name: sonarsource/sonar-scanner-cli:12.1
    entrypoint: [""]
  needs: [unit-tests]   # Needs coverage report from unit-tests artifacts
  variables:
    SONAR_USER_HOME: "${CI_PROJECT_DIR}/.sonar"   # Cache scanner rules/plugins
  script:
    - sonar-scanner
      -Dsonar.projectKey=$APP_NAME
      -Dsonar.projectName="HealthPulse Portal"
      -Dsonar.sources=src
      -Dsonar.exclusions="**/test/**,**/node_modules/**"
      -Dsonar.javascript.lcov.reportPaths=coverage/lcov.info
      -Dsonar.host.url=$SONAR_HOST_URL
      -Dsonar.token=$SONAR_TOKEN
      -Dsonar.qualitygate.wait=true   # Block pipeline until quality gate result
  allow_failure: false

snyk-security:
  stage: scan
  image: $NODE_IMAGE
  needs: [install]      # Needs node_modules for accurate dependency tree scanning
  variables:
    # SNYK_TOKEN is set as a CI/CD Variable (masked + protected).
    # The Snyk CLI picks it up automatically — no 'snyk auth' call needed in CI.
    # 'snyk auth' opens a browser OAuth flow and will hang in headless environments.
    SNYK_TOKEN: $SNYK_TOKEN
  script:
    - npx snyk test --severity-threshold=high --json > snyk-results.json || true
    - npx snyk monitor --project-name=$APP_NAME
  artifacts:
    paths:
      - snyk-results.json
    expire_in: 30 days
  allow_failure: true   # Snyk findings are reported but do not block the pipeline

# =============================================================
# STAGE: BUILD
# Compiles the React/TypeScript app to static files in dist/.
# Waits for lint and unit-tests to both pass before building.
# The dist/ folder is uploaded as an artifact for docker-publish.
# =============================================================
build-app:
  stage: build
  image: $NODE_IMAGE
  needs: [install, lint, unit-tests]
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 7 days

# =============================================================
# STAGE: PUBLISH
# Two jobs run in parallel:
#
# docker-publish:     builds the Docker image and pushes to
#                     GitLab's built-in container registry.
#                     Also tags as :latest for convenience.
#
# artifactory-upload: uploads the compiled dist/ folder to
#                     JFrog Artifactory for traceability.
#
# BUILD_VERSION format: {pipeline-number}-{git-sha}
# Example: 42-a1b2c3d — unique, traceable, sortable
# =============================================================
docker-publish:
  stage: publish
  image: docker:29.4.3-dind
  services:
    - docker:29.4.3-dind    # Starts the Docker daemon this job will use
  needs: [build-app, sonarqube]
  variables:
    BUILD_VERSION: "${CI_PIPELINE_IID}-${CI_COMMIT_SHORT_SHA}"
  before_script:
    # $CI_REGISTRY_PASSWORD is a short-lived token GitLab generates per job
    - echo "${CI_REGISTRY_PASSWORD}" | docker login -u "${CI_REGISTRY_USER}" --password-stdin "${CI_REGISTRY}"
  script:
    # Build the image using the Dockerfile in docker/
    # Build the image using the Dockerfile in docker/.
    # $CI_REGISTRY_IMAGE is this project's own registry path
    # (e.g. registry.gitlab.com/<group>/<project>). The job's registry
    # credentials only allow pushes under that path, so a bare
    # $CI_REGISTRY/$APP_NAME reference is rejected.
    - docker build -f docker/Dockerfile -t $CI_REGISTRY_IMAGE:$BUILD_VERSION .
    # Push the versioned tag: this is what Argo CD will deploy later
    - docker push $CI_REGISTRY_IMAGE:$BUILD_VERSION
    # Also tag and push :latest, useful for manual pulls
    - docker tag $CI_REGISTRY_IMAGE:$BUILD_VERSION $CI_REGISTRY_IMAGE:latest
    - docker push $CI_REGISTRY_IMAGE:latest

artifactory-upload:
  stage: publish
  image: releases-docker.jfrog.io/jfrog/jfrog-cli-v2-jf:2.103.0
  needs: [build-app]
  variables:
    BUILD_VERSION: "${CI_PIPELINE_IID}-${CI_COMMIT_SHORT_SHA}"
  script:
    # The jfrog-cli-v2-jf image ships the CLI under the name jf
    - jf rt upload "dist/**" "healthpulse-builds/$BUILD_VERSION/"
      --url=$ARTIFACTORY_URL
      --user=$ARTIFACTORY_USER
      --password=$ARTIFACTORY_PASSWORD
      --build-name=$APP_NAME
      --build-number=$BUILD_VERSION

# =============================================================
# STAGE: SCAN-IMAGE
# Scans the built Docker image for OS-level CVEs using Trivy.
# Runs AFTER docker-publish: it pulls the image from the
# registry and scans every layer including the base OS packages.
#
# This is separate from the source code scans (gitleaks, snyk)
# which cannot see inside the Docker image. Trivy catches CVEs
# in nginx, Alpine Linux, OpenSSL, and anything else in the base image.
#
# TRIVY_USERNAME / TRIVY_PASSWORD are GitLab's auto-injected
# registry credentials. Trivy needs these to pull the private
# image from the GitLab Container Registry for scanning.
# =============================================================
image-scan:
  stage: scan-image
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  needs: [docker-publish]
  variables:
    BUILD_VERSION: "${CI_PIPELINE_IID}-${CI_COMMIT_SHORT_SHA}"
    TRIVY_USERNAME: $CI_REGISTRY_USER      # Authenticate with GitLab registry
    TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD  # to pull the image for scanning
  script:
    # Scan the image and fail the pipeline on any HIGH or CRITICAL CVE
    # --exit-code 1     → exit with failure if findings match severity filter
    # --severity        → only report HIGH and CRITICAL (ignore LOW/MEDIUM)
    # --no-progress     → cleaner CI log output
    # --format table    → human-readable table output
    # --output          → save report to file (uploaded as artifact)
    # The image reference must match the one pushed by docker-publish.
    - trivy image
        --exit-code 1
        --severity HIGH,CRITICAL
        --no-progress
        --format table
        --output trivy-results.txt
        $CI_REGISTRY_IMAGE:$BUILD_VERSION
    - cat trivy-results.txt
  artifacts:
    when: always       # Save report even if scan fails — useful for review
    paths:
      - trivy-results.txt
    expire_in: 30 days
  allow_failure: false  # Pipeline fails if HIGH or CRITICAL CVEs are found
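
The BUILD_VERSION format shared by the publish and scan-image jobs can be reproduced outside CI, for example to work out which tag belongs to a given pipeline. The sketch below is illustrative only: 42 and a1b2c3d are the example values from the comments, standing in for what GitLab injects as CI_PIPELINE_IID and CI_COMMIT_SHORT_SHA, and build_version and is_valid_tag are hypothetical helper names.

```shell
# Compose the tag the same way the pipeline's BUILD_VERSION variable does.
build_version() {
  # $1 = pipeline IID, $2 = short commit SHA
  printf '%s-%s\n' "$1" "$2"
}

# Check the result against Docker's tag rules: at most 128 characters from
# [A-Za-z0-9_.-], and not starting with "." or "-".
is_valid_tag() {
  case "$1" in
    ''|.*|-*) return 1 ;;
    *[!A-Za-z0-9_.-]*) return 1 ;;
  esac
  [ "${#1}" -le 128 ]
}

# Hypothetical values standing in for GitLab's predefined variables.
BUILD_VERSION=$(build_version 42 a1b2c3d)
if is_valid_tag "$BUILD_VERSION"; then
  echo "tag ok: $BUILD_VERSION"
fi
```

Because the pipeline IID comes first, tags for the same project sort by pipeline, and the SHA suffix ties each image back to a commit.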

Step 8: Push and Watch Your First Pipeline

8.1 — Commit and push

git add .gitlab-ci.yml
git commit -m "feat: add GitLab CI pipeline"
git push origin develop

8.2 — Watch the pipeline run

Go to your CI repo → CI/CD → Pipelines. A pipeline appears within a few seconds.

Click on it to see the stage graph. Jobs turn blue (running) and then green (passed) or red (failed) in real time.

Expected run order:

install
   ├─► lint (parallel with unit-tests)
   ├─► unit-tests (parallel with lint)
   │      └─► sonarqube (waits for coverage)
   ├─► gitleaks (starts immediately)
   ├─► snyk-security (waits for node_modules)
   └─► build-app (waits for lint + unit-tests)
          ├─► docker-publish (also waits for sonarqube; parallel with artifactory-upload)
          │      └─► image-scan (waits for docker-publish)
          └─► artifactory-upload (parallel with docker-publish)

8.3 — Read job logs

Click any job to see its full log output. Useful things to look for:

Job             What a successful log looks like
install         added N packages (npm finished)
lint            No output or N problems found
unit-tests      N tests passed with coverage summary
gitleaks        No leaks found
sonarqube       Quality Gate status: PASSED
build-app       dist/ built in Ns
docker-publish  digest: sha256:... (image pushed)
image-scan      Total: 0 (HIGH: 0, CRITICAL: 0), no CVEs found

Step 9: Verify the Image in GitLab Container Registry

After docker-publish completes:

  1. Go to your CI repo → Packages and registries → Container Registry
  2. You should see:
healthpulse-portal
  latest       pushed just now    registry.gitlab.com/your-group/healthpulse-ci:latest
  42-a1b2c3d   pushed just now    registry.gitlab.com/your-group/healthpulse-ci:42-a1b2c3d

Pull the image to verify it runs

On any machine with Docker installed and access to GitLab:

# Log in
echo "<your-personal-access-token>" | docker login registry.gitlab.com \
  -u <your-gitlab-username> --password-stdin

# Pull the image (replace latest with a versioned tag such as 42-a1b2c3d to test a specific build)
docker pull registry.gitlab.com/<your-group>/healthpulse-ci:latest

# Run it locally
docker run -p 8080:80 registry.gitlab.com/<your-group>/healthpulse-ci:latest

Open http://localhost:8080 in your browser. You should see the HealthPulse Portal.

If the image runs correctly locally, your Dockerfile and Nginx config are working. This is the same image that will be deployed to Kubernetes in the CD guide.
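
To check a specific build rather than latest, the same steps can be pointed at the versioned tag and followed by a quick HTTP check. This is a minimal sketch, assuming you are already logged in to the registry and that docker and curl are installed; the group path and tag are placeholders, and image_ref and smoke_test are hypothetical helper names.

```shell
# Build a full image reference from registry host, project path, and tag.
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Pull the image, run it detached on port 8080, and check that Nginx answers.
smoke_test() {
  ref=$1
  docker pull "$ref" || return 1
  cid=$(docker run -d --rm -p 8080:80 "$ref") || return 1
  sleep 2   # give Nginx a moment to start
  curl -fsS -o /dev/null http://localhost:8080/
  status=$?
  docker stop "$cid" >/dev/null
  return "$status"
}

# Placeholder values: substitute your own group and the tag shown in Step 9.
REF=$(image_ref registry.gitlab.com your-group/healthpulse-ci 42-a1b2c3d)
echo "would test: $REF"
# smoke_test "$REF" && echo "image serves on port 8080"
```

The smoke_test call is left commented out so the snippet can be pasted without side effects; uncomment it once the reference points at a real image.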


Step 10: Verify Artifactory Upload

After artifactory-upload completes:

  1. Open http://<JCR_IP>:8082 in your browser
  2. Log in to JFrog Artifactory
  3. Go to Artifactory → Artifacts
  4. Browse to healthpulse-builds/<pipeline-number>-<sha>/

You should see the contents of dist/ uploaded there — index.html, JS bundles, CSS, assets.


What's Next — Adding CD Components

Now that CI is green and your image is in the registry, you are ready to add the CD pipeline.

The CD guide covers:

  • Creating and configuring the CD repo (healthpulse-cd)
  • Adding CD_REPO_TOKEN and CD_REPO_URL variables
  • Adding update-dev-manifest stage — CI commits the new image tag to the CD repo
  • Adding update-uat-manifest stage — runs on release/* branches
  • Connecting Argo CD to watch the CD repo
  • The [skip ci] rules pattern that prevents an infinite loop
  • Prod promotion via merge request — no CI stage writes to overlays/prod/

Do not add CD stages to this pipeline yet. Complete the CD guide as a separate step. Mixing them before CI is stable makes debugging significantly harder.


Acceptance Criteria

Before marking this step complete, verify every item:

  •  GitLab Runner installed on EC2, Docker executor, privileged mode enabled
  •  Runner shows green (online) in Settings → CI/CD → Runners
  •  All CI/CD Variables configured: SONAR_TOKEN, SONAR_HOST_URL, SNYK_TOKEN, ARTIFACTORY_URL, ARTIFACTORY_USER, ARTIFACTORY_PASSWORD
  •  docker/Dockerfile and docker/nginx.conf committed to the CI repo
  •  .gitlab-ci.yml committed to the root of the CI repo
  •  Push to develop triggers a pipeline
  •  All six stages pass: install → test → scan → build → publish → scan-image
  •  Docker image appears in Packages and registries → Container Registry tagged <iid>-<sha>
  •  Image tagged latest also present
  •  Image runs locally and serves the HealthPulse Portal on port 8080
  •  Build artefacts visible in JFrog Artifactory under healthpulse-builds/<version>/
  •  image-scan job passes — trivy-results.txt artifact downloadable from the job
  •  Trivy report shows Total: 0 (HIGH: 0, CRITICAL: 0) or all findings are acknowledged

Troubleshooting

Runner is offline (grey circle in GitLab)

sudo gitlab-runner status
sudo gitlab-runner start

sudo gitlab-runner verify
sudo journalctl -u gitlab-runner -f

Jobs fail with "Cannot connect to the Docker daemon"

Two possible causes:

  1. Docker is not installed — run Step 3.2
  2. gitlab-runner is not in the docker group:
sudo usermod -aG docker gitlab-runner
sudo systemctl restart gitlab-runner

docker-publish fails with "permission denied" on docker socket

The runner was not registered with --docker-privileged. Re-register:

sudo gitlab-runner unregister --name "healthpulse-runner"

sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --token "glrt-..." \
  --executor "docker" \
  --docker-image "alpine:latest" \
  --docker-privileged \
  --description "healthpulse-runner"

sonarqube job fails — "Could not connect to SonarQube server"

  1. Check SONAR_HOST_URL is set correctly (no trailing slash, correct port)
  2. Check the SonarQube server is running: curl http://<SONARQUBE_IP>:9000/api/system/status
  3. Check the security group on the SonarQube EC2 allows inbound TCP 9000 from the runner EC2
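
The status endpoint in item 2 returns a small JSON document whose status field reads UP once the server is ready. A rough sketch of turning that into a one-word answer, assuming curl and sed are available; sonar_status is a hypothetical helper and the sample response is illustrative, not captured from a real server.

```shell
# Extract the "status" field from the JSON returned by /api/system/status.
sonar_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Illustrative response body; real id and version values will differ.
sample='{"id":"EXAMPLE","version":"0.0","status":"UP"}'
echo "$sample" | sonar_status

# Against a live server (placeholder host, run from the runner EC2):
# curl -fsS "http://<SONARQUBE_IP>:9000/api/system/status" | sonar_status
```

Running the live command from the runner EC2 rather than your laptop also exercises the security group rule in item 3.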

gitleaks fails — real secrets found

This is the correct behaviour. Gitleaks found actual secrets in your code or commit history. You must:

  1. Rotate the exposed credential immediately
  2. Remove the secret from the codebase
  3. If it is in git history, you need to rewrite history with git filter-repo
  4. Only if a finding turns out to be a false positive, allowlist its pattern in .gitleaks.toml; never allowlist a real secret
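
For the history rewrite, git filter-repo can replace a leaked value in every commit using a file of replacement expressions. The sketch below only prepares that file. The secret value is made up, write_replacements is a hypothetical helper, and the rewrite command itself is left commented out because it is destructive and is best run on a fresh clone.

```shell
# Write one literal replacement rule per leaked value.
# Line format read by `git filter-repo --replace-text`: literal:OLD==>NEW
write_replacements() {
  out=$1; shift
  : > "$out"
  for secret in "$@"; do
    printf 'literal:%s==>***REMOVED***\n' "$secret" >> "$out"
  done
}

# Made-up example value; keep this file out of the repository.
REPL_FILE=$(mktemp)
write_replacements "$REPL_FILE" 'EXAMPLE_TOKEN_123'
cat "$REPL_FILE"

# Then, inside a fresh clone of the affected repository:
# git filter-repo --replace-text "$REPL_FILE"
```

Rewriting history changes commit hashes, so the result has to be force-pushed and everyone else needs to re-clone. Rotating the credential (item 1) still comes first, because old clones keep the secret.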

unit-tests pass locally but fail in CI

Common causes:

  • Tests depend on environment variables not set in CI — add them as CI/CD Variables
  • Tests use Date.now() or timezone-sensitive functions — CI runner timezone may differ
  • Test imports use absolute paths that work locally but fail in the Alpine container
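
The timezone cause is easy to reproduce in a shell: the same instant formats as a different calendar date depending on TZ. A small sketch, assuming GNU or BusyBox date (for -d @0) and POSIX-style TZ strings; epoch_date is a hypothetical helper, and the runner's actual timezone is not stated in this guide.

```shell
# Format the Unix epoch (0 seconds) as a local calendar date under a given TZ.
epoch_date() {
  TZ=$1 date -d @0 +%Y-%m-%d
}

epoch_date UTC0    # 1970-01-01
epoch_date PST8    # 1969-12-31 (eight hours behind UTC)

# To run the test suite under a fixed timezone locally (assumes Vitest is
# invoked directly; adjust if your project wraps it in an npm script):
# TZ=UTC npx vitest run
```

If the tests pass under one TZ value and fail under another, pin the timezone in the job's variables or make the assertions timezone-independent.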

docker-publish fails — image push rejected

denied: access forbidden

Check that $CI_REGISTRY_PASSWORD is being passed correctly. The --password-stdin flag expects it from stdin, not as a positional argument:

- echo "${CI_REGISTRY_PASSWORD}" | docker login -u "${CI_REGISTRY_USER}" --password-stdin "${CI_REGISTRY}"

image-scan fails — CVEs found in the image

Trivy found HIGH or CRITICAL CVEs in the Docker image. Download trivy-results.txt from the job artifacts to see which packages are affected.

Common causes:

  • The base image (nginx:1.27-alpine) has known CVEs in Alpine or Nginx packages. The 1.27-alpine tag moves forward with each patch release, so keep FROM nginx:1.27-alpine in your Dockerfile and re-run the pipeline to pick up the latest 1.27.x
  • The base image is pinned to an old digest, or stale layers are being reused: update the digest, or rebuild without the cache to pull fresh layers

If a CVE has no fix yet (Trivy shows "fixed version: none"), you can lower the threshold temporarily while tracking the issue:

- trivy image --exit-code 1 --severity CRITICAL ...   # only fail on CRITICAL, not HIGH

Document why you changed the threshold in a comment.
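
Before changing the threshold it helps to see how many findings sit at each severity. The sketch below reads the summary lines from the report, assuming they have the form shown in Step 8.3 (Total: N (HIGH: x, CRITICAL: y)); severity_count is a hypothetical helper and the sample report is fabricated for illustration.

```shell
# Sum the counts for one severity across every "Total:" summary line.
severity_count() {
  # $1 = severity name (HIGH or CRITICAL), $2 = report file
  sed -n "s/^Total:.*$1: \([0-9][0-9]*\).*/\1/p" "$2" |
    awk '{ sum += $1 } END { print sum + 0 }'
}

# Illustrative report; a real one also lists each affected package.
report=$(mktemp)
printf 'Total: 3 (HIGH: 2, CRITICAL: 1)\n' > "$report"

echo "HIGH=$(severity_count HIGH "$report")"
echo "CRITICAL=$(severity_count CRITICAL "$report")"

# Alternative to lowering the threshold: Trivy's --ignore-unfixed flag skips
# findings that have no fixed version yet.
```

If CRITICAL is zero and every HIGH finding is unfixed upstream, that is the case where a temporary, documented change is easiest to justify.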

image-scan fails — "unauthorized" pulling image from registry

Trivy cannot authenticate with the GitLab Container Registry. Check that TRIVY_USERNAME and TRIVY_PASSWORD are set correctly in the job variables:

variables:
  TRIVY_USERNAME: $CI_REGISTRY_USER
  TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD

These are GitLab-injected variables — they should always be available. If the job runs on an unprotected branch and your registry is set to "Private", confirm the runner has access to the project.

artifactory-upload fails — authentication error

  1. Verify ARTIFACTORY_URL does not have a trailing slash
  2. Verify the user has deploy permissions on the repository in JFrog
  3. Test the credentials manually: curl
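
One way to test the credentials by hand is Artifactory's system ping endpoint. This is a sketch under assumptions, not the guide's own command: it assumes ARTIFACTORY_URL is the base URL ending in /artifactory and that curl is installed; ping_url is a hypothetical helper and the host below is a placeholder.

```shell
# Build the ping URL, tolerating a trailing slash on the base URL.
ping_url() {
  printf '%s/api/system/ping\n' "${1%/}"
}

# Placeholder host for illustration.
ping_url "http://artifactory.example:8082/artifactory/"

# With the real values exported in your shell; a healthy server that accepts
# the credentials is expected to reply OK, and rejected ones return HTTP 401:
# curl -fsS -u "$ARTIFACTORY_USER:$ARTIFACTORY_PASSWORD" "$(ping_url "$ARTIFACTORY_URL")"
```

A passing ping only confirms that the login works. Deploy permission on the target repository (item 2) still has to be checked separately.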
