Sunday, 7 November 2021

Deploy Prometheus and Grafana Helm Chart on Kubernetes using Terraform

Helm is a package manager for Kubernetes that packages applications, even complex ones, as charts. Essentially, it templates the YAML components of an application from a single file of custom values. Terraform providers exist for both Helm and Kubernetes, so you can integrate them into your codebase.
In this lab, we'll set up a monitoring stack with Prometheus and Grafana. Terraform will configure the chart values so the two can communicate. With the Helm provider, each chart release is treated as a Terraform resource; reusing an attribute from one release in another creates an implicit dependency, which lets Terraform orchestrate the deployment order.


Note: Before you can start using kubectl, you have to install the AWS CLI and kubectl on your computer.
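
If you want to confirm both tools are available before continuing, the following commands should print version information (exact output will vary with your installed versions):

    aws --version
    kubectl version --client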

Prerequisites
  •  Create an IAM User 
  •  Configure Terraform Backend with S3 Storage
  •  Setting up CD Pipeline for EKS Cluster
  •  Create Terraform Workspace for EKS Cluster
  •  Map IAM User to EKS using ClusterRole & ClusterRoleBinding
  •  Run Pipeline Job
  •  Configure AWS Credentials
  •  Authenticate to EKS Cluster 
  •  Verify EKS Cluster is Active and Nodes are Visible
  •  Verify Helm Deployment
  •  Access Grafana Dashboard
  •  Access Prometheus UI

  • Create an IAM User 
  • Go to AWS Console
  • Search for IAM as shown below



  • Select Users and create a user called terraform-user with console access and the AdministratorAccess policy attached. Be sure to download your credentials once the user has been created, as they are required to log in to the EKS cluster.



  • Navigate to Policies and select Create policy; call it eks-assume


  • Select JSON and paste the code below to create the policy (Note: be sure to replace AWS-ACCOUNT-NUMBER with your account number):

    {

        "Version": "2012-10-17",

        "Statement": {

            "Effect": "Allow",

            "Action": "sts:AssumeRole",

            "Resource": "arn:aws:iam::AWS-ACCOUNT-NUMBER:role/terraform-eks-cluster"

        }

    }
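
If you prefer the CLI to the console, a minimal sketch of the same step, assuming you saved the JSON above locally as eks-assume.json, would be:

    # Create the eks-assume customer-managed policy from the JSON document above
    aws iam create-policy --policy-name eks-assume --policy-document file://eks-assume.json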




  • Navigate to Policies and select Create policy; call it eks-permission

  • Select JSON and paste the code below to create the policy:

    {

        "Version": "2012-10-17",

        "Statement": [

            {

                "Effect": "Allow",

                "Action": [

                    "eks:DescribeNodegroup",

                    "eks:ListNodegroups",

                    "eks:DescribeCluster",

                    "eks:ListClusters",

                    "eks:AccessKubernetesApi",

                    "ssm:GetParameter",

                    "eks:ListUpdates",

                    "eks:ListFargateProfiles"

                ],

                "Resource": "*"

            }

        ]

    }


  • Create a group called eksgroup, attach the eks-permission and eks-assume policies to it, and add terraform-user to the group (a CLI sketch follows below)

  • Verify the group and policies are attached to terraform-user by navigating to Users.
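
A hedged CLI alternative for the group setup above (it assumes the two policies already exist in your account and uses the same AWS-ACCOUNT-NUMBER placeholder):

    # Create the group, attach both customer-managed policies, and add the user to it
    aws iam create-group --group-name eksgroup
    aws iam attach-group-policy --group-name eksgroup --policy-arn arn:aws:iam::AWS-ACCOUNT-NUMBER:policy/eks-assume
    aws iam attach-group-policy --group-name eksgroup --policy-arn arn:aws:iam::AWS-ACCOUNT-NUMBER:policy/eks-permission
    aws iam add-user-to-group --group-name eksgroup --user-name terraform-user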


Configure Terraform Backend with S3 Storage
  • Create an S3 bucket in AWS to configure the backend and store Terraform state files. (Name the S3 bucket whatever you prefer; a CLI sketch follows below.)
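
If you would rather script this, a rough sketch using the AWS CLI (S3-BUCKET-NAME is a placeholder, the region matches the us-east-2 backend used later, and enabling versioning is optional but handy for state recovery):

    # Create the state bucket and enable versioning
    aws s3api create-bucket --bucket S3-BUCKET-NAME --region us-east-2 \
        --create-bucket-configuration LocationConstraint=us-east-2
    aws s3api put-bucket-versioning --bucket S3-BUCKET-NAME \
        --versioning-configuration Status=Enabled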


Setting up CD Pipeline for EKS Cluster
  • Go to Jenkins > New Item. Enter eks-pipeline in the name field > Choose Pipeline > Click OK


  • Select Configure after creation.
  • Go to Build Triggers and enable Trigger builds remotely.
  • Enter tf_token as Authentication Token

 

Bitbucket Changes
    • Create a new Bitbucket Repo and call it eks-pipeline
    • Go to Repository Settings after creation and select Webhooks
    • Click Add Webhooks
    • Enter tf_token as the Title
    • Copy and paste the url as shown below
              http://JENKINS_URL:8080/job/eks-pipeline/buildWithParameters?token=tf_token

    • Status should show Active
    • Check Skip certificate verification
    • Under Triggers, select Repository push
  • Go back to Jenkins and select Configure
  • Scroll down to Pipeline and use the drop-down to select Pipeline script from SCM
  • Enter your Bitbucket credentials, leave master as the default branch, and make sure the script path is Jenkinsfile.
  • Apply and Save.
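
Once the job exists and has run at least once (so its parameters are registered), you can test the remote trigger from a terminal using the same URL the webhook will use (JENKINS_URL is a placeholder; add -u username:api-token if your Jenkins instance does not allow anonymous build triggers):

    curl -X POST "http://JENKINS_URL:8080/job/eks-pipeline/buildWithParameters?token=tf_token"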


Create Terraform Workspace for EKS Pipeline

  • Open File Explorer, navigate to Desktop and create a folder called my-eks-cluster

  • Once the folder has been created, open Visual Studio Code and add the folder to your workspace







  • Open a new terminal
  • Before cloning the repo, run: git init
  • Navigate to the eks-pipeline repo in Bitbucket
  • Clone the repo with SSH or HTTPS (a sketch follows below)
  • Make sure to cd into eks-pipeline and create the new files inside that folder
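
A minimal sketch of the clone step over HTTPS (<workspace> is a placeholder for your Bitbucket workspace name):

    git clone https://bitbucket.org/<workspace>/eks-pipeline.git
    cd eks-pipeline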

 
  • Create a new file eks-asg.tf and copy the code below


















 


resource "aws_eks_cluster" "tf_eks" {

  name            = local.cluster_name

  enabled_cluster_log_types = ["authenticator","api", "controllerManager", "scheduler"]

  role_arn        = aws_iam_role.tf-eks-master.arn

  version         = var.kube_version


  vpc_config {

    security_group_ids = [aws_security_group.eks-master-sg.id]

    subnet_ids         = data.aws_subnet_ids.public.ids

  }


  timeouts {

    create = var.cluster_create_timeout

    delete = var.cluster_delete_timeout

  }

  depends_on = [

    aws_iam_role_policy_attachment.tf-cluster-AmazonEKSClusterPolicy,

    aws_iam_role_policy_attachment.tf-cluster-AmazonEKSServicePolicy,

  ]

  

  tags = local.common_tags

}


########################################################################################

# Setup AutoScaling Group for worker nodes

########################################################################################


locals {

  tf-eks-node-userdata = <<USERDATA

#!/bin/bash

set -o xtrace

/etc/eks/bootstrap.sh --apiserver-endpoint '${aws_eks_cluster.tf_eks.endpoint}' --b64-cluster-ca '${aws_eks_cluster.tf_eks.certificate_authority.0.data}' '${local.cluster_name}'

USERDATA

}


resource "aws_launch_configuration" "config" {

  associate_public_ip_address = true

  iam_instance_profile        = aws_iam_instance_profile.node.name

  image_id                    = data.aws_ami.eks-worker.id

  instance_type               = var.instance_type

  name_prefix                 = "my-eks-cluster"

  security_groups             = [aws_security_group.eks-node-sg.id, aws_security_group.worker_ssh.id]

  user_data_base64            = base64encode(local.tf-eks-node-userdata)

  key_name                    = var.keypair-name


  lifecycle {

    create_before_destroy = true

  }

  ebs_optimized           = true

  root_block_device {

    volume_size           = 100

    delete_on_termination = true

  }

}


resource "aws_autoscaling_group" "asg" {

  desired_capacity     = 2

  launch_configuration = aws_launch_configuration.config.id

  max_size             = 2

  min_size             = 2

  name                 = local.cluster_name

  vpc_zone_identifier  = data.aws_subnet_ids.public.ids


  tag {

    key                 = "eks-worker-nodes"

    value               = local.cluster_name

    propagate_at_launch = true

  }


  tag {

    key                 = "kubernetes.io/cluster/${aws_eks_cluster.tf_eks.name}"

    value               = "owned"

    propagate_at_launch = true

  }

}


  • Create a new file iam.tf and copy the code below

# Setup for IAM role needed to setup an EKS clusters

resource "aws_iam_role" "tf-eks-master" {

  name = "terraform-eks-cluster"


  assume_role_policy = <<POLICY

{

  "Version": "2012-10-17",

  "Statement": [

    {

      "Effect": "Allow",

      "Principal": {

        "Service": "eks.amazonaws.com",

        "AWS": "arn:aws:iam::AWS-ACCOUNT-NUMBER:user/terraform-user"

      },

      "Action": "sts:AssumeRole"

    }

  ]

}

POLICY

  lifecycle {

    create_before_destroy = true

  }

}


resource "aws_iam_role_policy_attachment" "tf-cluster-AmazonEKSClusterPolicy" {

  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"

  role       = aws_iam_role.tf-eks-master.name

}


resource "aws_iam_role_policy_attachment" "tf-cluster-AmazonEKSServicePolicy" {

  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSServicePolicy"

  role       = aws_iam_role.tf-eks-master.name

}


resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEKSWorkerNode" {

  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"

  role       = aws_iam_role.tf-eks-master.name

}


########################################################################################

# Setup IAM role & instance profile for worker nodes


resource "aws_iam_role" "tf-eks-node" {

  name = "terraform-eks-tf-eks-node"


  assume_role_policy = <<POLICY

{

  "Version": "2012-10-17",

  "Statement": [

    {

      "Effect": "Allow",

      "Principal": {

        "Service": "ec2.amazonaws.com"

      },

      "Action": "sts:AssumeRole"

    }

  ]

}

POLICY

}


resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEKSWorkerNodePolicy" {

  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"

  role       = aws_iam_role.tf-eks-node.name

}


resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEKS_CNI_Policy" {

  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"

  role       = aws_iam_role.tf-eks-node.name

}


resource "aws_iam_role_policy_attachment" "tf-eks-node-AmazonEC2ContainerRegistryReadOnly" {

  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"

  role       = aws_iam_role.tf-eks-node.name

}


resource "aws_iam_instance_profile" "node" {

  name = "terraform-eks-node"

  role = aws_iam_role.tf-eks-node.name

}


  • Create a new file kube.tf and copy the code below

########################################################################################

# Setup provider for kubernetes

# ---------------------------------------------------------------------------------------

# Get an authentication token to communicate with the EKS cluster.

# By default (before other roles are added to the Auth ConfigMap), you can authenticate to EKS cluster only by assuming the role that created the cluster.

# `aws_eks_cluster_auth` uses IAM credentials from the AWS provider to generate a temporary token.

# If the AWS provider assumes an IAM role, `aws_eks_cluster_auth` will use the same IAM role to get the auth token.

# https://www.terraform.io/docs/providers/aws/d/eks_cluster_auth.html


data "aws_eks_cluster_auth" "aws_iam_authenticator" {

  name = "${aws_eks_cluster.tf_eks.name}"

}


data "aws_iam_user" "terraform_user" {

  user_name = "terraform-user"

}


locals {

  # roles to allow kubernetes access via cli and allow ec2 nodes to join eks cluster

  configmap_roles = [{

    rolearn  = "${data.aws_iam_user.terraform_user.arn}"

    username = "{{SessionName}}"

    groups   = ["system:masters"]

  },

  {

    rolearn  =  "${aws_iam_role.tf-eks-node.arn}"

    username = "system:node:{{EC2PrivateDNSName}}"

    groups   = ["system:bootstrappers","system:nodes"]

  },

    {

    rolearn  = "${aws_iam_role.tf-eks-master.arn}"

    username = "{{SessionName}}"

    groups   = ["system:masters"]

  },]

}


# Allow worker nodes to join cluster via config map

resource "kubernetes_config_map" "aws_auth" {

  metadata {

    name = "aws-auth"

    namespace = "kube-system"

  }

 data = {

    mapRoles = yamlencode(local.configmap_roles)

  }

}




locals {

  kubeconfig = <<KUBECONFIG

apiVersion: v1

clusters:

- cluster:

    server: ${aws_eks_cluster.tf_eks.endpoint}

    certificate-authority-data: ${aws_eks_cluster.tf_eks.certificate_authority.0.data}

  name: kubernetes

contexts:

- context:

    cluster: kubernetes

    user: aws

  name: aws

current-context: aws

kind: Config

preferences: {}

users:

- name: aws

  user:

    exec:

      apiVersion: client.authentication.k8s.io/v1alpha1

      command: aws-iam-authenticator

      args:

        - "token"

        - "-i"

        - "${aws_eks_cluster.tf_eks.name}"

KUBECONFIG

}



  • Create a new file output.tf and copy the code below

output "eks_kubeconfig" {

  value = "${local.kubeconfig}"

  depends_on = [

    aws_eks_cluster.tf_eks

  ]

}
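
Once the cluster has been applied, this output can be written to a file and used directly by kubectl. A sketch, assuming Terraform 0.14+ for the -raw flag (on older versions, drop -raw and strip the surrounding quotes manually):

    terraform output -raw eks_kubeconfig > kubeconfig_eks
    export KUBECONFIG=$PWD/kubeconfig_eks
    kubectl get nodes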


  • Create a new file provider.tf and copy the code below

terraform {

backend "s3" {

      bucket = "S3-BUCKET-NAME"

      key    = "eks/terraform.tfstate"

      region = "us-east-2"

   }

}


provider "aws" {

    region     = var.region

    version    = "~> 2.0"

 }


provider "kubernetes" {

  host                      = aws_eks_cluster.tf_eks.endpoint

  cluster_ca_certificate    = base64decode(aws_eks_cluster.tf_eks.certificate_authority.0.data)

  token                     = data.aws_eks_cluster_auth.aws_iam_authenticator.token

}


provider "helm" {

  kubernetes {

  host                   = aws_eks_cluster.tf_eks.endpoint

  cluster_ca_certificate = base64decode(aws_eks_cluster.tf_eks.certificate_authority.0.data)

  token                  = data.aws_eks_cluster_auth.aws_iam_authenticator.token

  }

}
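
The Jenkins pipeline runs Terraform for you, but if you want to initialise against the S3 backend locally, a rough sketch (using the terraform-user keys as placeholders) looks like this:

    export AWS_ACCESS_KEY_ID=<terraform-user-access-key>
    export AWS_SECRET_ACCESS_KEY=<terraform-user-secret-key>
    terraform init
    terraform validate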

  • Create a new file sg-eks.tf and copy the code below

# Security group for cluster (control plane) communication with worker nodes

resource "aws_security_group" "eks-master-sg" {

    name        = "terraform-eks-cluster"

    description = "Cluster communication with worker nodes"

    vpc_id      = var.vpc_id


    egress {

        from_port   = 0

        to_port     = 0

        protocol    = "-1"

        cidr_blocks = ["0.0.0.0/0"]

    }

    

    tags = merge(

    local.common_tags,

    map(

      "Name","eks-cluster",

      "kubernetes.io/cluster/${local.cluster_name}","owned"

    )

  )

}


resource "aws_security_group" "eks-node-sg" {

        name        = "terraform-eks-node"

        description = "Security group for all nodes in the cluster"

        vpc_id      = var.vpc_id


        egress {

            from_port   = 0

            to_port     = 0

            protocol    = "-1"

            cidr_blocks = ["0.0.0.0/0"]

        }


        tags = merge(

    local.common_tags,

    map(

      "Name","eks-worker-node",

      "kubernetes.io/cluster/${aws_eks_cluster.tf_eks.name}","owned"

    )

  )

}


resource "aws_security_group" "worker_ssh" {

  name_prefix = "worker_ssh"

  vpc_id      = var.vpc_id

  egress {

    from_port   = 0

    to_port     = 0

    protocol    = "-1"

    cidr_blocks = ["0.0.0.0/0"]

  }

  ingress {

    from_port = 22

    to_port   = 22

    protocol  = "tcp"


    cidr_blocks = ["0.0.0.0/0"]

  }

  tags = merge(

    local.common_tags,

    map(

      "Name","worker_ssh",

    )

  )

}


  • Create a new file sg-rules-eks.tf and copy the code below


# Allow inbound HTTPS traffic to the Kubernetes API server.
# For tighter security, replace 0.0.0.0/0 below with your workstation's
# external IP (services like icanhazip.com can help you find it).

resource "aws_security_group_rule" "tf-eks-cluster-ingress-workstation-https" {

  cidr_blocks       = ["0.0.0.0/0"]

  description       = "Allow workstation to communicate with the cluster API Server"

  from_port         = 443

  protocol          = "tcp"

  security_group_id = aws_security_group.eks-master-sg.id

  to_port           = 443

  type              = "ingress"

}


########################################################################################

# Setup worker node security group


resource "aws_security_group_rule" "tf-eks-node-ingress-self" {

  description              = "Allow node to communicate with each other"

  from_port                = 0

  protocol                 = "-1"

  security_group_id        = aws_security_group.eks-node-sg.id

  source_security_group_id = aws_security_group.eks-node-sg.id

  to_port                  = 65535

  type                     = "ingress"

}


resource "aws_security_group_rule" "tf-eks-node-ingress-cluster" {

  description              = "Allow worker Kubelets and pods to receive communication from the cluster control plane"

  from_port                = 1025

  protocol                 = "tcp"

  security_group_id        = aws_security_group.eks-node-sg.id

  source_security_group_id = aws_security_group.eks-master-sg.id

  to_port                  = 65535

  type                     = "ingress"

}


# allow worker nodes to access EKS master

resource "aws_security_group_rule" "tf-eks-cluster-ingress-node-https" {

  description              = "Allow pods to communicate with the cluster API Server"

  from_port                = 443

  protocol                 = "tcp"

  security_group_id        = aws_security_group.eks-node-sg.id

  source_security_group_id = aws_security_group.eks-master-sg.id

  to_port                  = 443

  type                     = "ingress"

}


resource "aws_security_group_rule" "tf-eks-node-ingress-master" {

  description              = "Allow cluster control to receive communication from the worker Kubelets"

  from_port                = 443

  protocol                 = "tcp"

  security_group_id        = aws_security_group.eks-master-sg.id

  source_security_group_id = aws_security_group.eks-node-sg.id

  to_port                  = 443

  type                     = "ingress"

}


  • Create a new file variables.tf and copy the code below

# Setup data source to get amazon-provided AMI for EKS nodes

data "aws_ami" "eks-worker" {

  filter {

    name   = "name"

    values = ["amazon-eks-node-1.21-*"]

  }


  most_recent = true

  owners      = ["602401143452"] # Amazon EKS AMI Account ID

}



data "aws_subnet_ids" "public" {

  vpc_id = var.vpc_id

  

  filter {

    name   = "tag:Name"

    values = ["subnet-public-*"]

  }

}


variable "region" {

  type    = string

  default = "us-east-2"

}


variable "cluster_create_timeout" {

  description = "Timeout value when creating the EKS cluster."

  type        = string

  default     = "30m"

}


variable "cluster_delete_timeout" {

  description = "Timeout value when deleting the EKS cluster."

  type        = string

  default     = "15m"

}


variable "vpc_id" {

  type = string

  default = "PASTE-VPC-ID-HERE"

}


variable "keypair-name" {

  type = string

  default = "KEY-NAME"

}


variable "creator" {

  description = "Creator of deployed servers"

  type        = string

  default     = "YOUR-NAME"

}


variable "instance_type" {}


variable "env" {}


variable "grafana_password" {}


## Application/workspace specific inputs

variable "app" {

  description = "Name of Application"

  type        = string

  default     = "my-eks"

}


variable "kube_version" {

  type        = string

  description = "Kubernetes version for eks"

}



## Tagging naming convention

locals {

  common_tags = {

  env = var.env,

  creator  = var.creator,

  app = var.app

  }

  cluster_name = "${var.app}-${var.env}"

}
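
The variables without defaults (env, instance_type, kube_version, grafana_password) are supplied by the Jenkinsfile through TF_VAR_* environment variables. If you want to run a plan locally, a sketch of the same approach (the values shown are examples only):

    export TF_VAR_env=dev
    export TF_VAR_instance_type=m4.large
    export TF_VAR_kube_version=1.21
    export TF_VAR_grafana_password='<choose-a-password>'
    terraform plan -out tfplan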

  • Create a new folder called templates, then create a new file in the templates folder called grafana-values.yaml and copy the code below

    rbac:

      create: true

      pspEnabled: true

      pspUseAppArmor: true

      namespaced: true

      extraRoleRules: []

      # - apiGroups: []

      #   resources: []

      #   verbs: []

      extraClusterRoleRules: []

      # - apiGroups: []

      #   resources: []

      #   verbs: []

    serviceAccount:

      create: true

      name: ${GRAFANA_SERVICE_ACCOUNT}

      nameTest:

    #  annotations:


    replicas: 1


    ## See `kubectl explain poddisruptionbudget.spec` for more

    ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/

    podDisruptionBudget: {}

    #  minAvailable: 1

    #  maxUnavailable: 1


    ## See `kubectl explain deployment.spec.strategy` for more

    ## ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy

    deploymentStrategy:

      type: RollingUpdate


    readinessProbe:

      httpGet:

        path: /api/health

        port: 3000


    livenessProbe:

      httpGet:

        path: /api/health

        port: 3000

      initialDelaySeconds: 60

      timeoutSeconds: 30

      failureThreshold: 10


    ## Use an alternate scheduler, e.g. "stork".

    ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/

    ##

    # schedulerName: "default-scheduler"


    image:

      repository: grafana/grafana

      tag: 7.1.1

      sha: ""

      pullPolicy: IfNotPresent


      ## Optionally specify an array of imagePullSecrets.

      ## Secrets must be manually created in the namespace.

      ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/

      ##

      # pullSecrets:

      #   - myRegistrKeySecretName


    testFramework:

      enabled: true

      image: "bats/bats"

      tag: "v1.1.0"

      imagePullPolicy: IfNotPresent

      securityContext: {}


    securityContext:

      runAsUser: 472

      runAsGroup: 472

      fsGroup: 472



    extraConfigmapMounts: []

      # - name: certs-configmap

      #   mountPath: /etc/grafana/ssl/

      #   subPath: certificates.crt # (optional)

      #   configMap: certs-configmap

    #   readOnly: true



    extraEmptyDirMounts: []

      # - name: provisioning-notifiers

    #   mountPath: /etc/grafana/provisioning/notifiers



    ## Assign a PriorityClassName to pods if set

    # priorityClassName:


    downloadDashboardsImage:

      repository: curlimages/curl

      tag: 7.70.0

      sha: ""

      pullPolicy: IfNotPresent


    downloadDashboards:

      env: {}

      resources: {}


    ## Pod Annotations

    # podAnnotations: {}


    ## Pod Labels

    podLabels:

      app: grafana


    podPortName: grafana


    ## Deployment annotations

    # annotations: {}


    ## Expose the grafana service to be accessed from outside the cluster (LoadBalancer service).

    ## or access it from within the cluster (ClusterIP service). Set the service type and the port to serve it.

    ## ref: http://kubernetes.io/docs/user-guide/services/

    ##

    service:

      type: ClusterIP

      port: 80

      targetPort: 3000

      # targetPort: 4181 To be used with a proxy extraContainer

      annotations: {}

      labels:

        app: grafana

      portName: service


    extraExposePorts: []

      # - name: keycloak

      #   port: 8080

      #   targetPort: 8080

    #   type: ClusterIP


    # overrides pod.spec.hostAliases in the grafana deployment's pods

    hostAliases: []

      # - ip: "1.2.3.4"

      #   hostnames:

    #     - "my.host.com"


    ingress:

      enabled: false

      # Values can be templated

      #annotations: 

        #kubernetes.io/ingress.class: nginx

        #kubernetes.io/tls-acme: "true"

      labels: {}

      path: /

      hosts:

        - chart-example.local

      ## Extra paths to prepend to every host configuration. This is useful when working with annotation based services.

      extraPaths: []

      # - path: /*

      #   backend:

      #     serviceName: ssl-redirect

      #     servicePort: use-annotation

      tls: []

      #  - secretName: chart-example-tls

      #    hosts:

      #      - chart-example.local


    resources:

      limits:

        cpu: 100m

        memory: 128Mi

      requests:

        cpu: 100m

        memory: 128Mi


    ## Node labels for pod assignment

    ## ref: https://kubernetes.io/docs/user-guide/node-selection/

    #

    nodeSelector: {}


    ## Tolerations for pod assignment

    ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/

    ##

    tolerations: []


    ## Affinity for pod assignment

    ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity

    ##

    affinity: {}


    extraInitContainers: []


    ## Enable an Specify container in extraContainers. This is meant to allow adding an authentication proxy to a grafana pod

    extraContainers: |

    # - name: proxy

    #   image: quay.io/gambol99/keycloak-proxy:latest

    #   args:

    #   - -provider=github

    #   - -client-id=

    #   - -client-secret=

    #   - -github-org=<ORG_NAME>

    #   - -email-domain=*

    #   - -cookie-secret=

    #   - -http-address=http://0.0.0.0:4181

    #   - -upstream-url=http://127.0.0.1:3000

    #   ports:

    #     - name: proxy-web

    #       containerPort: 4181


    ## Volumes that can be used in init containers that will not be mounted to deployment pods

    extraContainerVolumes: []

    #  - name: volume-from-secret

    #    secret:

    #      secretName: secret-to-mount

    #  - name: empty-dir-volume

    #    emptyDir: {}


    ## Enable persistence using Persistent Volume Claims

    ## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/

    ##

    persistence:

      type: pvc

      enabled: false

      # storageClassName: default

      accessModes:

        - ReadWriteOnce

      size: 10Gi

      # annotations: {}

      finalizers:

        - kubernetes.io/pvc-protection

      # subPath: ""

      # existingClaim:


    initChownData:

      ## If false, data ownership will not be reset at startup

      ## This allows the prometheus-server to be run with an arbitrary user

      ##

      enabled: true


      ## initChownData container image

      ##

      image:

        repository: busybox

        tag: "1.31.1"

        sha: ""

        pullPolicy: IfNotPresent


      ## initChownData resource requests and limits

      ## Ref: http://kubernetes.io/docs/user-guide/compute-resources/

      ##

      resources: {}

      #  limits:

      #    cpu: 100m

      #    memory: 128Mi

      #  requests:

      #    cpu: 100m

      #    memory: 128Mi



    # Administrator credentials when not using an existing secret (see below)

    adminUser: ${GRAFANA_ADMIN_USER}

    adminPassword: ${GRAFANA_ADMIN_PASSWORD}


    ## Define command to be executed at startup by grafana container

    ## Needed if using `vault-env` to manage secrets (ref: https://banzaicloud.com/blog/inject-secrets-into-pods-vault/)

    ## Default is "run.sh" as defined in grafana's Dockerfile

    # command:

    # - "sh"

    # - "/run.sh"


    ## Use an alternate scheduler, e.g. "stork".

    ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/

    ##

    # schedulerName:


    ## Extra environment variables that will be pass onto deployment pods

    env: {}


    ## "valueFrom" environment variable references that will be added to deployment pods

    ## ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#envvarsource-v1-core

    ## Renders in container spec as:

    ##   env:

    ##     ...

    ##     - name: <key>

    ##       valueFrom:

    ##         <value rendered as YAML>

    envValueFrom: {}


    ## The name of a secret in the same kubernetes namespace which contain values to be added to the environment

    ## This can be useful for auth tokens, etc. Value is templated.

    envFromSecret: ""


    ## Sensible environment variables that will be rendered as new secret object

    ## This can be useful for auth tokens, etc

    envRenderSecret: {}


    ## Additional grafana server secret mounts

    # Defines additional mounts with secrets. Secrets must be manually created in the namespace.

    extraSecretMounts: []

      # - name: secret-files

      #   mountPath: /etc/secrets

      #   secretName: grafana-secret-files

      #   readOnly: true

    #   subPath: ""


    ## Additional grafana server volume mounts

    # Defines additional volume mounts.

    extraVolumeMounts: []

      # - name: extra-volume

      #   mountPath: /mnt/volume

      #   readOnly: true

    #   existingClaim: volume-claim


    ## Pass the plugins you want installed as a list.

    ##

    plugins: []

      # - digrich-bubblechart-panel

    # - grafana-clock-panel


    ## Configure grafana datasources

    ## ref: http://docs.grafana.org/administration/provisioning/#datasources

    ##

    datasources:

      datasources.yaml:

        apiVersion: 1

        datasources:

        - name: Prometheus

          type: prometheus

          url: http://${PROMETHEUS_SVC}.${NAMESPACE}.svc.cluster.local

          access: proxy

          isDefault: true


    ## Configure notifiers

    ## ref: http://docs.grafana.org/administration/provisioning/#alert-notification-channels

    ##

    notifiers: {}

    #  notifiers.yaml:

    #    notifiers:

    #    - name: email-notifier

    #      type: email

    #      uid: email1

    #      # either:

    #      org_id: 1

    #      # or

    #      org_name: Main Org.

    #      is_default: true

    #      settings:

    #        addresses: an_email_address@example.com

    #    delete_notifiers:


    ## Configure grafana dashboard providers

    ## ref: http://docs.grafana.org/administration/provisioning/#dashboards

    ##

    ## `path` must be /var/lib/grafana/dashboards/<provider_name>

    ##

    dashboardProviders:

      dashboardproviders.yaml:

        apiVersion: 1

        providers:

        - name: 'default'

          orgId: 1

          folder: ''

          type: file

          disableDeletion: false

          editable: true

          options:

            path: /var/lib/grafana/dashboards/default


    ## Configure grafana dashboard to import

    ## NOTE: To use dashboards you must also enable/configure dashboardProviders

    ## ref: https://grafana.com/dashboards

    ##

    ## dashboards per provider, use provider name as key.

    ##

    dashboards:

      # default:

      #   some-dashboard:

      #     json: |

      #       $RAW_JSON

      #   custom-dashboard:

      #     file: dashboards/custom-dashboard.json

      #   prometheus-stats:

      #     gnetId: 2

      #     revision: 2

      #     datasource: Prometheus

      #   local-dashboard:

      #     url: https://example.com/repository/test.json

      #   local-dashboard-base64:

      #     url: https://example.com/repository/test-b64.json

      #   b64content: true

      default:

        prometheus-stats:

          gnetId: 10000

          revision: 1

          datasource: Prometheus




    ## Reference to external ConfigMap per provider. Use provider name as key and ConfiMap name as value.

    ## A provider dashboards must be defined either by external ConfigMaps or in values.yaml, not in both.

    ## ConfigMap data example:

    ##

    ## data:

    ##   example-dashboard.json: |

    ##     RAW_JSON

    ##

    dashboardsConfigMaps: {}

    #  default: ""


    ## Grafana's primary configuration

    ## NOTE: values in map will be converted to ini format

    ## ref: http://docs.grafana.org/installation/configuration/

    ##

    grafana.ini:

      paths:

        data: /var/lib/grafana/data

        logs: /var/log/grafana

        plugins: /var/lib/grafana/plugins

        provisioning: /etc/grafana/provisioning

      analytics:

        check_for_updates: true

      log:

        mode: console

      grafana_net:

        url: https://grafana.net

          ## grafana Authentication can be enabled with the following values on grafana.ini

          # server:

        # The full public facing url you use in browser, used for redirects and emails

      #    root_url:

      # https://grafana.com/docs/grafana/latest/auth/github/#enable-github-in-grafana

      # auth.github:

      #    enabled: false

      #    allow_sign_up: false

      #    scopes: user:email,read:org

      #    auth_url: https://github.com/login/oauth/authorize

      #    token_url: https://github.com/login/oauth/access_token

      #    api_url: https://github.com/user

      #    team_ids:

      #    allowed_organizations:

      #    client_id:

      #    client_secret:

      ## LDAP Authentication can be enabled with the following values on grafana.ini

      ## NOTE: Grafana will fail to start if the value for ldap.toml is invalid

      # auth.ldap:

      #   enabled: true

      #   allow_sign_up: true

      #   config_file: /etc/grafana/ldap.toml


    ## Grafana's LDAP configuration

    ## Templated by the template in _helpers.tpl

    ## NOTE: To enable the grafana.ini must be configured with auth.ldap.enabled

    ## ref: http://docs.grafana.org/installation/configuration/#auth-ldap

    ## ref: http://docs.grafana.org/installation/ldap/#configuration

    ldap:

      enabled: false

      # `existingSecret` is a reference to an existing secret containing the ldap configuration

      # for Grafana in a key `ldap-toml`.

      existingSecret: ""

      # `config` is the content of `ldap.toml` that will be stored in the created secret

      config: ""

      # config: |-

      #   verbose_logging = true


      #   [[servers]]

      #   host = "my-ldap-server"

      #   port = 636

      #   use_ssl = true

      #   start_tls = false

      #   ssl_skip_verify = false

      #   bind_dn = "uid=%s,ou=users,dc=myorg,dc=com"


    ## Grafana's SMTP configuration

    ## NOTE: To enable, grafana.ini must be configured with smtp.enabled

    ## ref: http://docs.grafana.org/installation/configuration/#smtp

    smtp:

      # `existingSecret` is a reference to an existing secret containing the smtp configuration

      # for Grafana.

      existingSecret: ""

      userKey: "user"

      passwordKey: "password"


    ## Sidecars that collect the configmaps with specified label and stores the included files them into the respective folders

    ## Requires at least Grafana 5 to work and can't be used together with parameters dashboardProviders, datasources and dashboards

    sidecar:

      image:

        repository: kiwigrid/k8s-sidecar

        tag: 0.1.151

        sha: ""

      imagePullPolicy: IfNotPresent

      resources: {}

      #   limits:

      #     cpu: 100m

      #     memory: 100Mi

      #   requests:

      #     cpu: 50m

      #     memory: 50Mi

      # skipTlsVerify Set to true to skip tls verification for kube api calls

      # skipTlsVerify: true

      enableUniqueFilenames: false

      dashboards:

        enabled: false

        SCProvider: true

        # label that the configmaps with dashboards are marked with

        label: grafana_dashboard

        # folder in the pod that should hold the collected dashboards (unless `defaultFolderName` is set)

        folder: /tmp/dashboards

        # The default folder name, it will create a subfolder under the `folder` and put dashboards in there instead

        defaultFolderName: null

        # If specified, the sidecar will search for dashboard config-maps inside this namespace.

        # Otherwise the namespace in which the sidecar is running will be used.

        # It's also possible to specify ALL to search in all namespaces

        searchNamespace: null

        # provider configuration that lets grafana manage the dashboards

        provider:

          # name of the provider, should be unique

          name: sidecarProvider

          # orgid as configured in grafana

          orgid: 1

          # folder in which the dashboards should be imported in grafana

          folder: ''

          # type of the provider

          type: file

          # disableDelete to activate a import-only behaviour

          disableDelete: false

          # allow updating provisioned dashboards from the UI

          allowUiUpdates: false

      datasources:

        enabled: false

        # label that the configmaps with datasources are marked with

        label: grafana_datasource

        # If specified, the sidecar will search for datasource config-maps inside this namespace.

        # Otherwise the namespace in which the sidecar is running will be used.

        # It's also possible to specify ALL to search in all namespaces

        searchNamespace: null

      notifiers:

        enabled: false

        # label that the configmaps with notifiers are marked with

        label: grafana_notifier

        # If specified, the sidecar will search for notifier config-maps inside this namespace.

        # Otherwise the namespace in which the sidecar is running will be used.

        # It's also possible to specify ALL to search in all namespaces

        searchNamespace: null


    ## Override the deployment namespace

    ##

    namespaceOverride: ""

  • Create a new file helm-prometheus.tf and copy the code below

    resource "helm_release" "prometheus" {

      chart = "prometheus"

      name = "prometheus"

      namespace = "default"

      repository = "https://prometheus-community.github.io/helm-charts"


      # When you want to directly set the value of an element in a map, you need \\ to escape the dot.

      set {

        name = "podSecurityPolicy\\.enabled"

        value = true

      }


      set {

        name = "server\\.persistentVolume\\.enabled"

        value = false

      }


      set {

        name = "server\\.resources"

        # You can provide a map of values using yamlencode

        value = yamlencode({

          limits = {

            cpu = "200m"

            memory = "50Mi"

          }

          requests = {

            cpu = "100m"

            memory = "30Mi"

          }

        })

      }

    }
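
After the apply, you can sanity-check the release and the values Terraform passed to the chart (the label selector below is an assumption and may differ between chart versions):

    helm status prometheus -n default
    helm get values prometheus -n default
    kubectl get pods -n default -l app=prometheus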

  • Create a new file helm-grafana.tf and copy the code below

    data "template_file" "grafana_values" {

        template = file("templates/grafana-values.yaml")


        vars = {

          GRAFANA_SERVICE_ACCOUNT = "grafana"

          GRAFANA_ADMIN_USER = "admin"

          GRAFANA_ADMIN_PASSWORD = var.grafana_password

          PROMETHEUS_SVC = "${helm_release.prometheus.name}-server"

          NAMESPACE = "default"

        }

    }


    resource "helm_release" "grafana" {

      chart = "grafana"

      name = "grafana"

      repository = "https://grafana.github.io/helm-charts"

      namespace = "default"


      values = [

        data.template_file.grafana_values.rendered

      ]

      set {

        name  = "service.type"

        value = "LoadBalancer"

      }

    }


  • Create a new file Jenkinsfile and copy the code below

pipeline {

    agent {

      node {

        label "master"

      } 

    }


    parameters {

        choice(choices: ['dev', 'qa', 'prod'], description: 'Select Lifecycle to deploy', name: 'Environment')

        password(name: 'GrafanaPassword', description: 'Enter Grafana Password Here')

        choice(choices: ['master', 'feature_1', 'feature_2'], description: 'Select Branch to clone', name: 'Branch')

        choice(choices: ['m4.large', 'm4.xlarge', 'm4.2xlarge'], description: 'Select Instance Size', name: 'InstanceSize')

        choice(choices: ['1.18', '1.20', '1.21'], description: 'Select Kubernetes Version', name: 'KubeV')

        booleanParam(name: 'autoApprove', defaultValue: false, description: 'Automatically run apply after generating plan?')

        booleanParam(name: 'ACCEPTANCE_TESTS_LOG_TO_FILE', defaultValue: true, description: 'Should debug logs be written to a separate file?')

        choice(name: 'ACCEPTANCE_TESTS_LOG_LEVEL', choices: ['WARN', 'ERROR', 'DEBUG', 'INFO', 'TRACE'], description: 'The Terraform Debug Level')

    }



     environment {

        AWS_ACCESS_KEY_ID     = credentials('AWS_ACCESS_KEY_ID')

        AWS_SECRET_ACCESS_KEY = credentials('AWS_SECRET_ACCESS_KEY')

        TF_LOG                = "${params.ACCEPTANCE_TESTS_LOG_LEVEL}"

        TF_LOG_PATH           = "${params.ACCEPTANCE_TESTS_LOG_TO_FILE ? 'tf_log.log' : '' }"

        TF_VAR_grafana_password = "${params.GrafanaPassword}"

        TF_VAR_env = "${params.Environment}"

        TF_VAR_instance_type = "${params.InstanceSize}"

        TF_VAR_kube_version = "${params.KubeV}"

        TF_VAR_environment = "${params.Branch}"

    }



    stages {

      stage('checkout') {

        steps {

            echo "Pulling changes from the branch ${params.Branch}"

            git credentialsId: 'bitbucket', url: 'https://bitbucket.org/username/eks-sample.git' , branch: "${params.Branch}"

        }

      }


        stage('terraform plan') {

            steps {

                sh "pwd ; terraform init -input=true"

                sh "terraform validate"

                sh "terraform plan -input=true -out tfplan"

                sh 'terraform show -no-color tfplan > tfplan.txt'

            }

        }

        

        stage('terraform apply approval') {

           when {

               not {

                   equals expected: true, actual: params.autoApprove

               }

           }


           steps {

               script {

                    def plan = readFile 'tfplan.txt'

                    input message: "Do you want to apply the plan?",

                    parameters: [text(name: 'Plan', description: 'Please review the plan', defaultValue: plan)]

               }

           }

       }


        stage('terraform apply') {

            steps {

                sh "terraform apply -input=true tfplan"

            }

        }

        

        stage('terraform destroy approval') {

            steps {

                input 'Run terraform destroy?'

            }

        }

        stage('terraform destroy') {

            steps {

                sh 'terraform destroy -force'

            }

        }

    }


  }


  • Map IAM User to EKS using ClusterRole & ClusterRoleBinding 
  • Create a new file called kube-rolebinding.tf and paste the code below.
  • (This creates a Role and RoleBinding that grant the IAM user permission to perform operations in the default namespace.)

    resource "kubernetes_role" "admin_role" {

      metadata {

        name      = "eks-console-dashboard-full-access-clusterrole"

        namespace = "default"

      }


      rule {

        api_groups = ["*"]

        resources  = ["*"]

        verbs      = ["get", "list", "patch", "update", "watch"]

      }

    }


    resource "kubernetes_role_binding" "admin_role_binding" {

      metadata {

        name      = "eks-console-dashboard-full-access-binding"

        namespace = "default"

      }

      role_ref {

        api_group = "rbac.authorization.k8s.io"

        kind      = "Role"

        name      = "eks-console-dashboard-full-access-clusterrole"

      }


      subject {

        kind      = "User"

        name      = "terraform-user"

        api_group = "rbac.authorization.k8s.io"

      }

    }
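
To confirm the Role and RoleBinding were created after the apply, a quick check (the names are taken from the resources above):

    kubectl get role,rolebinding -n default
    kubectl describe rolebinding eks-console-dashboard-full-access-binding -n default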

  • Commit and push code changes to Repo via Command Line or VSCode
    
    • Run the following commands to commit code to Bitbucket:
      - git pull
      - git add *
      - git commit -m "update"
      - git push

      OR

      In VS Code, navigate to the Source Control icon in the sidebar (Note: this only works with SSH configured for Bitbucket)
    • Enter a commit message
    • Click the + icon to stage changes

    • Push the changes by clicking the sync icon (🔄 0 ⬇️ 1 ⬆️) as shown below

 

Run Pipeline Job

  • Go to eks-pipeline on Jenkins and run build 
Note: The first build will fail; that run is needed so Jenkins can register the parameters defined in the Jenkinsfile

  • The next time you run a build, you should see the parameters as shown below


  • Select dev in the Environment field
  • Enter Grafana Password 
  • Select master as the branch
  • Choose m4.large, m4.xlarge or m4.2xlarge for EKS Cluster.
  • Choose Kubernetes version 1.18, 1.20 or 1.21.
  • Check the box ACCEPTANCE_TESTS_LOG_TO_FILE to enable Terraform logging
  • Select Trace for debug logging
  • Go to Console Output to track progress
Note: You can abort the pipeline at the destroy approval step. When you later want to delete the resources, rerun the destroy stage (installing the Blue Ocean plugin on Jenkins makes it easy to restart from that stage).


  • Configure AWS Credentials

  • Open a Git Bash terminal

  • Run the following command to configure your credentials, using the access key and secret key of terraform-user:

  • aws configure


Once configured, run command vi ~/.aws/config and add the following block of code: 

[profile adminrole]

role_arn = arn:aws:iam::AWS-ACCOUNT-NUMBER:role/terraform-eks-cluster

source_profile = default 
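
To check that both profiles resolve to the identities you expect (the default profile should return terraform-user, and adminrole should return the assumed terraform-eks-cluster role):

    aws sts get-caller-identity
    aws sts get-caller-identity --profile adminrole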




  • Authenticate to the EKS Cluster
  • Open a terminal, cd ~ and log in to EKS with the following command:
  • aws eks --region us-east-2 update-kubeconfig --name eks-sample-dev
  • or, using the assumed-role profile:
  • aws eks --region us-east-2 update-kubeconfig --name eks-sample-dev --profile adminrole
  • Use the command kubectl edit configmap aws-auth -n kube-system to edit the ConfigMap and change it to the following:

data:

  mapRoles: |

    - groups:

      - system:masters

      rolearn: arn:aws:iam::AWS-ACCOUNT-NUMBER:user/terraform-user

      username: "{{SessionName}}"

    - groups:

      - system:bootstrappers

      - system:nodes

      rolearn: arn:aws:iam::AWS-ACCOUNT-NUMBER:role/terraform-eks-tf-eks-node

      username: system:node:{{EC2PrivateDNSName}}

    - groups:

      - system:masters

      rolearn: arn:aws:iam::AWS-ACCOUNT-NUMBER:role/terraform-eks-cluster

      username: "{{SessionName}}"

  mapUsers: |

    - userarn: arn:aws:iam::AWS-ACCOUNT-NUMBER:user/terraform-user

      username: terraform-user

      groups:

      - system:masters 

Once edited, type :wq! to save and quit.
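
To confirm the edited mappings were saved, you can print the ConfigMap back out:

    kubectl describe configmap aws-auth -n kube-system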


  • Verify EKS Cluster is Active and Nodes are Visible
  • Log in to the AWS Console with the terraform-user credentials
  • Navigate to EKS, select eks-sample-dev and the nodes should be visible.
  • Open a terminal, cd ~ and log in to EKS with the command:
  • aws eks --region us-east-2 update-kubeconfig --name eks-sample-dev --profile adminrole 
  • Verify you are logged in with command: kubectl get nodes or kubectl get pods --all-namespaces

  • Verify Helm Deployment
    Note: This step is not required; it only shows the list of Helm deployments.
  • Ensure you have Helm installed on your computer before running the following command:
  • - helm list

  • Access Grafana Dashboard
  • Run the command kubectl get svc and copy Grafana's EXTERNAL-IP, which is a LoadBalancer DNS name, as shown below.


  • You will be able to access Grafana from http://loadbalancer-dns
  • Use the following username and password to log in (the password is the one you set in the pipeline parameters). If you log in with default credentials, Grafana will prompt you to change the default password.

    User: admin
    Pass: YOUR-GRAFANA-PASSWORD

 

  • Access Prometheus UI
  • Now you should be able to access the Prometheus UI with port forwarding using the following command.
  • kubectl port-forward -n default svc/prometheus-server 8080:80


  • You will be able to access Prometheus at
    http://127.0.0.1:8080
