# HealthPulse Portal — Complete Capstone Project
**HealthPulse Inc.** is a healthcare technology startup that has built a patient portal as a **React/TypeScript single-page application**. The application allows patients to view appointments, lab results, and medications, and to communicate with their care team.
Currently, the development team **manually builds and deploys** the application by:
1. Running `npm run build` on a developer's laptop
2. SCP-ing the `dist/` folder to a single Nginx server
3. SSHing into the server and restarting Nginx
The application is currently hosted at:
https://healthpulse-capstone.vercel.app/
This process takes **45 minutes per deployment**, is error-prone, and has caused **3 production outages** in the last quarter from misconfigurations. There is **no testing in the pipeline**, **no code quality checks**, **no security scanning**, and **no monitoring**.
**HealthPulse Inc. has hired your DevOps team** to design and implement a complete CI/CD pipeline, multi-environment infrastructure, container orchestration, and observability platform on **AWS**.
---
## Application Details
| Item | Detail |
|------|--------|
| **App Name** | HealthPulse Portal |
| **Tech Stack** | React 18, TypeScript, Vite, shadcn/ui, Tailwind CSS |
| **Testing** | Vitest (unit), Playwright (e2e) |
| **Build Output** | Static files (`dist/`) served by Nginx |
| **Container** | Multi-stage Dockerfile (Node build → Nginx serve) |
| **Health Endpoint** | `GET /health` → `{"status":"healthy"}` |
Stack: React 18 + TypeScript + Vite + shadcn/ui + Tailwind CSS + Recharts
| New File | Purpose |
|---|---|
docs/mkdocs.yml | MkDocs config with Material theme, dark/light toggle, nav, extensions |
docs/Dockerfile | Multi-stage build (mkdocs-material → nginx:alpine) |
docs/docker-compose.yml | Prod on port 84 + live-reload dev mode on port 8084 |
docs/docs/index.md | Home page with project overview, team roster template |
docs/docs/architecture.md | ADR templates (CI/CD platform + container orchestration) |
docs/docs/environments.md | Environment matrix table (Dev/UAT/QA/Prod) |
docs/docs/runbooks.md | 4 runbook templates (deploy, rollback, scale, incident) |
docs/docs/pipeline.md | CI/CD pipeline stage docs with diagrams |
The table above summarizes the Task A deliverables.

## TASK A: Documentation Platform (Docs-as-Code)
1. Set up MkDocs with Material theme inside the deployment repo
2. Create a docker-compose.yml to serve docs on port 84
3. Write initial documentation pages:
- Team roster and roles
- ADR: "Why we chose [Jenkins/GitLab/Azure DevOps]"
- Environment matrix (Dev/UAT/QA/Prod)
- Runbook template
4. Build docs via Docker (multi-stage: mkdocs build → nginx serve)
5. CI pipeline auto-builds docs site on push to /docs folder
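If your team chooses GitHub Actions from the CI/CD options listed later, step 5 could look roughly like the workflow below. This is a hedged sketch, not a required implementation: the file path, job name, and smoke test are assumptions, and a Jenkins or GitLab CI equivalent is equally valid.

```yaml
# .github/workflows/docs.yml (hypothetical sketch of the docs auto-build)
name: Build docs site
on:
  push:
    paths:
      - "docs/**"          # only rebuild when documentation changes
jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build the same multi-stage image that serves the docs on port 84
      - name: Build docs image
        run: docker build -t healthpulse-docs:ci -f docs/Dockerfile docs
      # Smoke test: serve the built site once and make sure it responds
      - name: Smoke test
        run: |
          docker run -d --name docs -p 8084:80 healthpulse-docs:ci
          sleep 3
          curl -f http://localhost:8084/ > /dev/null
```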
Acceptance Criteria:
- Docs served on port 84 via Docker
- mkdocs.yml and all markdown files committed to Git
- Multi-stage Dockerfile builds and serves the docs
- 4 documentation pages created with real content
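For the `mkdocs.yml` acceptance criterion, a minimal sketch of what the config could contain is shown below; the palette toggle, extensions, and nav entries are illustrative and should be adapted to your own pages.

```yaml
# docs/mkdocs.yml (illustrative sketch, adjust to your team's nav and branding)
site_name: HealthPulse DevOps Wiki
theme:
  name: material
  palette:
    - scheme: default      # light mode
      toggle:
        icon: material/weather-night
        name: Switch to dark mode
    - scheme: slate        # dark mode
      toggle:
        icon: material/weather-sunny
        name: Switch to light mode
markdown_extensions:
  - admonition
  - pymdownx.superfences
nav:
  - Home: index.md
  - Architecture: architecture.md
  - Environments: environments.md
  - Pipeline: pipeline.md
  - Runbooks: runbooks.md
  - Incidents: incidents.md
  - Changelog: changelog.md
```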
### 1. Live Reload Dev Mode
When writing documentation (editing the Markdown files), you need to see how your changes look in real time. That's what dev mode does:
Student edits runbooks.md → saves file → browser auto-refreshes → sees updated page instantly
Without dev mode: Edit markdown → rebuild Docker image → restart container → refresh browser → check result. That's painful and slow.
With dev mode: MkDocs watches the files. The second you hit save, the browser updates automatically. It's the same concept as npm run dev for the React app — hot reload for docs.
In the docker-compose.yml, there are two services:
| Service | Port | Purpose |
|---|---|---|
docs-prod | 84 | Built static site served by Nginx (what users/team see) |
docs-dev | 8084 | Live preview with auto-refresh (only used while writing docs) |
Students use 8084 while writing, then build and deploy to 84 for production. It's a workflow thing — not two permanent servers.
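As a reference, the two services could be wired together along these lines. Treat this as a sketch: the upstream mkdocs-material image name and the internal ports are assumptions, and the provided `docs/docker-compose.yml` may differ.

```yaml
# docs/docker-compose.yml (illustrative sketch of the prod + dev-preview pair)
services:
  docs-prod:
    build: .                 # multi-stage Dockerfile: mkdocs build, then nginx
    ports:
      - "84:80"              # team-facing static site
  docs-dev:
    image: squidfunk/mkdocs-material:latest   # assumed upstream authoring image
    command: serve --dev-addr=0.0.0.0:8000
    volumes:
      - .:/docs              # mount the docs source so edits trigger live reload
    ports:
      - "8084:8000"          # authoring preview with auto-refresh
```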
### 2. Runbook Template
A runbook is an operational instruction manual — step-by-step procedures for when things happen in production. Think of it like a recipe book, but for servers.
Every real DevOps team has them. For example, when it's 2 AM and production is down, you don't want the on-call engineer guessing — you want them following a tested checklist.
Here's an example of what you would fill in as you complete the project:
RUNBOOK: Deploy New Version
═══════════════════════════
When to use: New release ready for production
Who can run: DevOps team lead
Steps:
1. Verify build passed in Jenkins → check #healthpulse-builds Slack
2. Confirm SonarQube quality gate passed
3. Approve deployment in pipeline (manual gate)
4. Monitor Datadog dashboard during rollout
5. Verify /health endpoint returns 200
6. If health check fails → pipeline auto-rolls back via Ansible
───────────────────────────
RUNBOOK: Rollback Production
════════════════════════════
When to use: Production deployment caused errors
Who can run: Any DevOps team member
Steps:
1. Run: ./scripts/k8s-manage.sh rollback
OR: Trigger Ansible Tower rollback job
2. Verify previous version is serving traffic
3. Check Datadog for error rate returning to normal
4. Post incident summary in wiki
───────────────────────────
RUNBOOK: Scale Application
══════════════════════════
When to use: High traffic / slow response times
Who can run: Any DevOps team member
Steps:
1. Check Datadog → confirm CPU/memory is the bottleneck
2. Run: REPLICAS=6 ./scripts/k8s-manage.sh scale
3. Monitor HPA: kubectl get hpa -n healthpulse-prod
4. Scale back down after traffic normalizes

Docs repository structure:

healthpulse-docs/
├── mkdocs.yml # Site config + navigation
├── Dockerfile # Multi-stage build (mkdocs → nginx)
├── docker-compose.yml # Prod (port 84) + dev (port 8084)
└── docs/
├── index.md # Home — project overview, team roster, quick links
├── architecture.md # ADR templates (CI/CD choice, orchestration choice)
├── environments.md # Environment matrix (IPs, URLs, sizing)
├── pipeline.md # CI/CD pipeline stages and config
├── setup-template.md # Reusable template — copy for each tool install
├── runbooks.md # Deploy, rollback, scale, health check procedures
├── incidents.md # Incident log template — track issues + root causes
└── changelog.md # Weekly progress log — what was built, when, by whom
How Students Use It
| Page | When |
|---|---|
| Setup Template | Copy to setup-jenkins.md, setup-sonarqube.md, setup-artifactory.md, setup-ansible-tower.md, setup-datadog.md — one per tool they install. Documents every command they ran. |
| Runbooks | Fill in real commands and URLs as they complete Tasks F-H |
| Incident Log | Every time something breaks during the project, they log it |
| Changelog | Weekly entries tracking progress across all tasks |
| Architecture/Environments/Pipeline | Fill in as they make decisions and provision infrastructure |
One template; students create as many copies as they need. This keeps it simple.
## TASK B: Version Control & Code Security
Plan & Code
App Name: HealthPulse
- Workstation A - Team Pipeline Pirates - 3.15.209.165
- Workstation B - Team DevopsAvengers - 3.143.221.53
- Workstation C - Team Devius - 3.142.240.0
Create two repositories:
| Repository | Purpose | Access |
|---|---|---|
HealthPulse_App | Application source code | Developers |
HealthPulse_Deployment | IaC, Ansible, pipelines, scripts | DevOps team |
Implement GitFlow in the App repository:
main ─────────────────────────────────────────►
└── develop ─────────────────────────────────►
├── feature/login-page ──► (merge to develop)
├── feature/dashboard ───► (merge to develop)
└── release/1.0.0 ───────► (merge to main + develop)
### B.3 — Repository Security (Layer 1 & Layer 3)
Repository security follows a defense-in-depth approach with 3 layers. In this task you set up Layer 1 (local hooks) and Layer 3 (branch protection). Layer 2 (gitleaks in the CI pipeline) comes later in Task F once the pipeline exists.
Layer 1 (this task): Local hooks → fast feedback for developers
Layer 2 (Task F): CI pipeline scan → server-side safety net
Layer 3 (this task): Branch protection → platform-enforced rules
Layer 1: Local Git Hooks (pre-commit + pre-push)
Install pre-commit and pre-push hooks so developers get early feedback when they accidentally commit secrets. Understand that developers can bypass these with --no-verify — that's why Layer 3 exists.
| Hook | Tool | Purpose |
|---|---|---|
| pre-commit | detect-secrets | Scans staged changes for secrets using entropy + pattern analysis |
| pre-push | custom script | Warns on direct push to main/develop |
Use the provided .pre-commit-config.yaml and scripts/setup-git-hooks.sh.
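For orientation, the provided config will look roughly like the sketch below; the hook revisions and the extra housekeeping hooks are assumptions, so use the file fetched in Step 1 as the source of truth.

```yaml
# .pre-commit-config.yaml (approximate sketch of the provided config)
repos:
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0              # assumed pin; match whatever the provided file uses
    hooks:
      - id: detect-secrets   # entropy + pattern scan of staged changes
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: check-added-large-files
      - id: end-of-file-fixer
```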
# Step 1: Install the pre-commit framework
curl -O https://raw.githubusercontent.com/princexav/security/refs/heads/main/.pre-commit-config.yaml
pip install pre-commit
# Step 2: Install hooks into the repo
pre-commit install
# Step 3: Test it — this should be BLOCKED
echo "AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" >> test.txt
git add test.txt && git commit -m "test secret"
# Expected: detect-secrets blocks the commit
# Step 4: Clean up
git checkout -- test.txt
# Step 5: Test the pre-push hook
git checkout main
git push origin main
# Expected: Warning message about direct push to protected branch

Key lesson: Run `git commit --no-verify -m "test"` and notice the hook is skipped entirely. This is why local hooks alone are NOT enough — you need Layer 3.
Layer 3: Branch Protection Rules (platform-level — cannot be bypassed)
Configure these in your Git hosting platform (GitHub / GitLab / Bitbucket). Unlike hooks, these are enforced by the server — no developer can skip them.
| Rule | Setting |
|---|---|
| Require pull request before merging | main and develop |
| Require at least 1 approval | main and develop |
| Do not allow bypassing the above | Even admins must follow the rules |
Note: The rule "Require CI status checks to pass" will be added in Task F once your pipeline is built. For now, configure the PR and approval requirements.
# Test it — this should be REJECTED by the platform
git checkout main
git commit --allow-empty -m "testing direct push"
git push origin main
# Expected: Rejected — branch protection requires a pull request

Acceptance Criteria:
- Both repos created with proper access controls
- GitFlow branching strategy demonstrated (main, develop, feature/, release/)
- SSH key authentication configured for repo access
- `pre-commit install` runs successfully and hooks are active
- Demonstrate: committing a fake AWS key is blocked by `detect-secrets`
- Demonstrate: `--no-verify` bypasses the hook (explain why this matters)
- Demonstrate: pre-push hook warns on direct push to `main`
- Branch protection rules configured on `main` and `develop` (screenshot required)
- PR requires at least 1 approval before merge
- Direct push to `main` is rejected by the platform (not just the hook)
- Document the security setup in your MkDocs wiki
Before containers, deploy the application the traditional way — built files served directly by Nginx on an EC2 instance. This teaches what containers replace and why they exist.
Use the provided terraform/baremetal/ configuration to create a VPC, subnet, and EC2 instance with Nginx pre-installed.
See guides:
- IAM setup: https://www.devopstreams.com/2026/03/aws-credentials-setup-best-practices.html
- Step-by-step guide: https://www.devopstreams.com/2026/03/task-c-bare-metal-deployment-nginx-on.html
- Terraform files: https://github.com/princexav/mkdocs/tree/main/baremetal
cd terraform/baremetal
terraform init
terraform plan -var-file=dev.tfvars -var="ssh_public_key=$(cat ~/.ssh/healthpulse-key.pub)"
terraform apply -var-file=dev.tfvars -var="ssh_public_key=$(cat ~/.ssh/healthpulse-key.pub)"

What Terraform creates:
| Resource | Detail |
|---|---|
| VPC + Subnet | Isolated network with internet gateway and route table |
| EC2 Instance | Ubuntu 22.04, t2.micro |
| Nginx | Installed and configured via user_data bootstrap |
| Security Group | Ports 22 (SSH), 80 (HTTP), 443 (HTTPS) |
| Elastic IP | Static public IP |
| Nginx Config | SPA fallback, gzip, security headers, /health endpoint |
| Deploy Path | /var/www/healthpulse |
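The Nginx install and configuration happen in the instance user_data. A minimal cloud-init sketch of what that bootstrap could look like follows; the provided Terraform may use a shell script instead, and the gzip and security-header settings from the table are omitted here for brevity.

```yaml
#cloud-config
# Illustrative user_data sketch, not the exact bootstrap shipped with terraform/baremetal
packages:
  - nginx
write_files:
  - path: /etc/nginx/sites-available/healthpulse
    content: |
      server {
        listen 80 default_server;
        root /var/www/healthpulse;
        # SPA fallback: unknown routes fall through to index.html
        location / { try_files $uri $uri/ /index.html; }
        # Health endpoint used by the pipeline and load balancer
        location /health {
          default_type application/json;
          return 200 '{"status":"healthy","deploy":"baremetal"}';
        }
      }
runcmd:
  - mkdir -p /var/www/healthpulse
  - ln -sf /etc/nginx/sites-available/healthpulse /etc/nginx/sites-enabled/healthpulse
  - rm -f /etc/nginx/sites-enabled/default
  - systemctl enable --now nginx
```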
Detailed walkthrough: See `guides/TASK-G-GUIDE.md` for step-by-step instructions.
Manual deploy (for learning):
# SSH into the server
ssh -i ~/.ssh/healthpulse-key.pem ubuntu@<ELASTIC_IP>
# On the server — this is what Ansible automates
cd /var/www/healthpulse
# Copy dist/ files here
sudo systemctl reload nginx
# Verify
curl http://localhost/health
# → {"status":"healthy","deploy":"baremetal"}Acceptance Criteria:
- EC2 instance provisioned via Terraform with Nginx running
- Application accessible at `http://<ELASTIC_IP>`
- Health check returns 200 at `/health`
- Pain points documented in MkDocs wiki
- SSH into the server and explain what Nginx is serving and from where
Provision the DevOps tooling servers:

| Tool | Instance Type | Purpose |
|---|---|---|
| Jenkins / GitLab / GitHub Actions / Azure DevOps | t2.large | CI/CD server |
| SonarQube | t2.xlarge | Code analysis |
| Ansible Tower | t2.2xlarge | Configuration management |
| JFrog Artifactory | t2.2xlarge | Artifact repository |

Acceptance Criteria:
- k3s cluster provisioned with 3 nodes (1 master + 2 workers)
- `kubectl get nodes` shows all nodes Ready
- Infrastructure tagged properly (namespaces created for Dev, QA, Prod)
- Can `terraform destroy` and re-create cleanly
- HPA configured (`kubectl get hpa` shows targets)
- Can SSH into master and explain the cluster architecture
- Documentation in MkDocs wiki and DevOps tools set up
Install and configure Datadog agents on all servers.

Use the provided `monitoring/datadog/datadog-agent-setup.yml` Ansible playbook.

| Requirement | Detail |
|---|---|
| Infrastructure metrics | CPU, memory, disk, network |
| Container monitoring | Docker container metrics |
| Process monitoring | Running process visibility |
| Server tagging | `app:healthpulse`, `env:<environment>`, `team:<team-name>` |

Acceptance Criteria:
- Datadog agent running on all servers
- Infrastructure metrics visible in Datadog dashboard
- Containers monitored with docker integration
- Process-level monitoring enabled
- All servers tagged and filterable by environment
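A sketch of what the referenced `datadog-agent-setup.yml` playbook could look like, assuming the official `datadog.datadog` Ansible role; the variable names, tag values, and process-monitoring toggle shown here are assumptions to verify against the provided playbook.

```yaml
# monitoring/datadog/datadog-agent-setup.yml (hypothetical sketch, not the provided playbook)
# Assumes the official role is installed first:
#   ansible-galaxy install datadog.datadog
- name: Install and configure the Datadog agent
  hosts: all
  become: true
  roles:
    - role: datadog.datadog
  vars:
    datadog_api_key: "{{ lookup('env', 'DD_API_KEY') }}"   # never hard-code the key
    datadog_config:
      # Tags make servers filterable by app, environment, and team
      tags:
        - app:healthpulse
        - "env:{{ deploy_env | default('dev') }}"
        - "team:{{ team_name | default('pipeline-pirates') }}"
      process_config:
        process_collection:
          enabled: true      # process-level monitoring
```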
Register a team domain and configure DNS.

| Requirement | Detail |
|---|---|
| Domain | e.g., `team-healthpulse.com` |
| DNS Provider | Route 53 (preferred), GoDaddy, etc. |
| Records | A/CNAME pointing to ALB |
| Environments | `dev.team-healthpulse.com`, `uat.team-healthpulse.com`, `team-healthpulse.com` |

Acceptance Criteria:
- Domain registered
- DNS records pointing to load balancers
- Application accessible via domain name
See guide: https://www.devopstreams.com/2026/05/task-f-guide.html
Now take the same application you deployed as bare files and package it into a Docker container. Build it, run it locally, scan it for vulnerabilities, push it to a registry, then deploy it to your k3s cluster manually.
Detailed walkthrough: See the guide at https://www.devopstreams.com/2026/05/task-h-guide-dockerize-app.html
Docker files: https://github.com/princexav/mkdocs/tree/main/docker
Why manual first? Every step you do by hand here becomes an automated pipeline stage in Task F. When the pipeline breaks, you'll know how to debug it — because you've done each step yourself.
Review the provided docker/Dockerfile:
Stage 1: Node 20 Alpine
├── corepack enable (activate pnpm)
├── pnpm install --frozen-lockfile
└── pnpm build → produces dist/
Stage 2: Nginx Alpine
├── Copy dist/ from Stage 1
├── Copy custom nginx.conf
└── Expose port 80
Key concept: The entire build environment (Node, pnpm, dependencies) exists only in Stage 1 and is discarded. The final image is just Nginx + your static files — small, fast, and secure.
# Build the Docker image
docker build -t healthpulse-portal:local -f docker/Dockerfile .
# Run it locally
docker run -d --name healthpulse -p 8080:80 healthpulse-portal:local
# Test it
curl http://localhost:8080/health
# → {"status":"healthy"}
# Open in browser
# → http://localhost:8080
# Check the running container
docker ps
docker logs healthpulse
# Stop and remove
docker stop healthpulse && docker rm healthpulse

# Start with docker-compose (uses docker/docker-compose.yml)
docker compose -f docker/docker-compose.yml up -d
# Check status
docker compose -f docker/docker-compose.yml ps
# View logs
docker compose -f docker/docker-compose.yml logs -f
# Tear down
docker compose -f docker/docker-compose.yml down

Before pushing your image to a registry, scan it for vulnerabilities. This is what the CI pipeline will automate in Task F — do it manually first so you understand the output.
# Option 1: Trivy (open-source, recommended)
# Install: https://aquasecurity.github.io/trivy/
trivy image healthpulse-portal:local
# Option 2: Docker Scout (built into Docker Desktop)
docker scout cves healthpulse-portal:local
# Option 3: Snyk CLI (if installed)
snyk container test healthpulse-portal:local

What to look for:
| Severity | Action |
|---|---|
| CRITICAL | Must fix — update base image or package |
| HIGH | Should fix — update if feasible |
| MEDIUM | Note and track — fix when time allows |
| LOW | Acceptable risk for a capstone |
Common fixes:
- Update `FROM nginx:alpine` to `FROM nginx:alpine3.20` (pin the version)
- Remove unnecessary packages in the final stage
- Use `--no-cache` in `apk add` to reduce attack surface
Document your findings in MkDocs: what vulnerabilities did you find? What did you fix? What was acceptable risk?
Choose one registry — either your team's Artifactory or Docker Hub:
Option A: JFrog Artifactory (enterprise registry)
docker tag healthpulse-portal:local <ARTIFACTORY_URL>/healthpulse-portal:1.0.0
docker tag healthpulse-portal:local <ARTIFACTORY_URL>/healthpulse-portal:latest
docker login <ARTIFACTORY_URL>
docker push <ARTIFACTORY_URL>/healthpulse-portal:1.0.0
docker push <ARTIFACTORY_URL>/healthpulse-portal:latest

Option B: Docker Hub (public registry)
docker tag healthpulse-portal:local <DOCKERHUB_USERNAME>/healthpulse-portal:1.0.0
docker tag healthpulse-portal:local <DOCKERHUB_USERNAME>/healthpulse-portal:latest
docker login
docker push <DOCKERHUB_USERNAME>/healthpulse-portal:1.0.0
docker push <DOCKERHUB_USERNAME>/healthpulse-portal:latest

Note: The CI pipeline (Task F) automates this on every build. Here you're doing it manually to understand the process.
Now pull your image from the registry and deploy it to the k3s cluster by hand:
export KUBECONFIG=~/.kube/healthpulse-config
# Create an image pull secret (if using private registry)
kubectl create secret docker-registry regcred \
--docker-server=<REGISTRY_URL> \
--docker-username=<USERNAME> \
--docker-password=<PASSWORD> \
-n healthpulse-dev
# Apply deployment (update image in deployment.yml first)
kubectl apply -f kubernetes/deployment.yml -n healthpulse-dev
kubectl apply -f kubernetes/service.yml -n healthpulse-dev
# Watch pods come up
kubectl get pods -n healthpulse-dev -w
# Test
curl http://<K3S_MASTER_IP>:<NODE_PORT>/health

This is the manual version of what the pipeline will automate. Feel how many commands it takes — that's why CI/CD exists.
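The `kubernetes/deployment.yml` and `service.yml` applied above might look roughly like the sketch below; the image reference, replica count, probe, and NodePort are placeholders to adapt rather than the provided manifests.

```yaml
# kubernetes/deployment.yml (illustrative sketch)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: healthpulse-portal
  labels:
    app: healthpulse
spec:
  replicas: 2
  selector:
    matchLabels:
      app: healthpulse
  template:
    metadata:
      labels:
        app: healthpulse
    spec:
      imagePullSecrets:
        - name: regcred              # created above for a private registry
      containers:
        - name: healthpulse
          image: <REGISTRY_URL>/healthpulse-portal:1.0.0
          ports:
            - containerPort: 80
          readinessProbe:            # the pipeline relies on /health
            httpGet:
              path: /health
              port: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
---
# kubernetes/service.yml (illustrative sketch)
apiVersion: v1
kind: Service
metadata:
  name: healthpulse-portal
spec:
  type: NodePort
  selector:
    app: healthpulse
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080              # the <NODE_PORT> used in the curl test
```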
After running both ways, document the comparison in your MkDocs wiki:
| Aspect | Bare-Metal (Task G) | Container (Task H) |
|---|---|---|
| Server setup | Install Node, Nginx, configure manually | docker run — everything is inside the image |
| Build output | dist/ folder copied to server | Docker image with Nginx + dist/ baked in |
| Deploy time | Minutes (download, extract, reload Nginx) | Seconds (pull image, start container) |
| Rollback | Restore from tar backup | docker run previous-image:tag |
| Environment parity | Hope configs match across servers | Guaranteed — same image everywhere |
| Dependencies | Installed on the OS — can conflict | Isolated inside the container |
| Reproducibility | "Works on my machine" problems | Same image runs everywhere |
| Security scanning | Manual audit of server packages | trivy image — automated CVE check |
| Cleanup | Files scattered across the OS | docker rm — clean removal |
Acceptance Criteria:
- Docker image builds successfully with `docker build`
- Application runs locally via `docker run` and is accessible at `http://localhost:8080`
- Health check returns 200 at `/health`
- Navigate through the app — all pages work (SPA routing via Nginx)
- Manual vulnerability scan completed (Trivy, Docker Scout, or Snyk)
- Scan findings documented — what was found, what was fixed, what was accepted
- Image pushed to registry (Artifactory or Docker Hub) with version tag
- Image pulled from registry and deployed to k3s cluster manually
- Bare-metal vs container comparison documented in MkDocs wiki
- Can explain: what is in the final Docker image? What was discarded?
Now that your applications are running on k3s, add Kubernetes-native monitoring using Prometheus and Grafana. This complements Datadog (Task D) by providing deep visibility into pod-level metrics, deployment health, and cluster performance.
Detailed walkthrough: See `guides/TASK-I-GUIDE.md` or https://www.devopstreams.com/2026/05/task-i-kubernetes-monitoring.html for the complete step-by-step guide.
Datadog vs Prometheus — why both?
| | Datadog (Task D) | Prometheus + Grafana (Task K) |
|---|---|---|
| Scope | Infrastructure (OS-level) | Kubernetes (pod/container-level) |
| Runs where | Agent on each server → SaaS cloud | Inside the k3s cluster |
| Metrics | CPU, memory, disk, network, processes | Pod resource usage, deployment health, HPA scaling, request rates |
| Dashboards | Datadog web console | Grafana (self-hosted on k3s) |
| Cost | Free tier (5 hosts) → paid | Free (open source) |
| Industry | Used alongside Prometheus in most orgs | Standard for Kubernetes monitoring |
# Install Helm (if not installed)
# https://helm.sh/docs/intro/install/
# Add the Prometheus community Helm chart repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install the kube-prometheus-stack (includes Prometheus + Grafana + Node Exporter)
helm install monitoring prometheus-community/kube-prometheus-stack \
--namespace monitoring --create-namespace \
--set grafana.adminPassword=healthpulse123

This single command installs:
| Component | Purpose |
|---|---|
| Prometheus | Scrapes and stores metrics from all k8s components |
| Grafana | Visualization dashboards |
| Node Exporter | Hardware/OS metrics from each node |
| kube-state-metrics | Kubernetes object metrics (pods, deployments, etc.) |
| Alertmanager | Alert routing and notifications |
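The `--set` flag works for a single value; for anything more, teams usually keep a values file. A minimal sketch follows, where the Grafana password matches the command above and the retention and resource numbers are assumptions.

```yaml
# monitoring/kube-prometheus-values.yml (hypothetical values file)
# Install with:
#   helm install monitoring prometheus-community/kube-prometheus-stack \
#     --namespace monitoring --create-namespace -f monitoring/kube-prometheus-values.yml
grafana:
  adminPassword: healthpulse123    # same password as the --set example above
prometheus:
  prometheusSpec:
    retention: 7d                  # assumed retention window for a capstone cluster
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
```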
# Port-forward Grafana to your local machine
kubectl port-forward -n monitoring svc/monitoring-grafana 3000:80
# Open in browser: http://localhost:3000
# Login: admin / healthpulse123

The Helm chart includes pre-built dashboards. Navigate to Dashboards in Grafana and explore:
| Dashboard | What It Shows |
|---|---|
| Kubernetes / Compute Resources / Namespace (Pods) | CPU + memory per pod, per namespace |
| Kubernetes / Compute Resources / Node (Pods) | Which pods are using resources on each node |
| Node Exporter / Nodes | OS-level metrics per node (CPU, memory, disk, network) |
| Kubernetes / Networking / Namespace (Pods) | Network traffic per pod |
- Go to the Namespace (Pods) dashboard
- Select namespace: `healthpulse-prod`
- You'll see CPU and memory usage for your HealthPulse pods
- Deploy a new version and watch the metrics change in real-time
Create a dashboard with these panels:
- Pod count by namespace — how many pods per environment
- CPU usage by pod — which pods are consuming resources
- Memory usage trend — are pods leaking memory over time?
- Pod restart count — are pods crash-looping?
- HPA replica count — is the autoscaler active?
Use k9s to cross-reference what Prometheus reports:
k9s
# :pods → see pod status
# Compare with Grafana dashboards — do the numbers match?

Acceptance Criteria:
- Prometheus + Grafana installed on k3s via Helm
- Grafana accessible and pre-built dashboards visible
- HealthPulse pod metrics visible in Grafana (CPU, memory)
- Custom dashboard created with at least 4 panels
- Can explain: what does Prometheus scrape? How does Grafana query it?
- Datadog vs Prometheus comparison documented in MkDocs wiki