This DevOps Training Program will provide you with in-depth knowledge of various DevOps tools, including Git, Jenkins, Docker, Ansible, Puppet, Kubernetes and Nagios. The training is completely hands-on and designed to help you become a certified practitioner through best practices in Continuous Development, Continuous Testing, Configuration Management, Continuous Integration and, finally, Continuous Monitoring of software throughout its development life cycle.
Docker images are an essential component for building Docker containers. Although closely related, containers and Docker images are different things: a Docker image serves as the base of a container. Docker images are created by writing Dockerfiles – lists of instructions that are executed automatically to create a specific Docker image. When building a Docker image, you want to keep it light. Avoiding large images speeds up the build and deployment of containers, so it is critical to reduce the image size to a minimum.
Here are some basic recommended steps that will help you create smaller and more efficient Docker images.
1. USE A SMALLER BASE IMAGE
FROM ubuntu
The above instruction will set your image size to 128MB at the outset. Consider using smaller base images. Each apt-get install or yum install line you add to your Dockerfile increases the image size by the size of the library being installed. Realize that you probably don’t need many of those libraries. Identify the ones you really need and install only those.
For example, by using an Alpine base image, the base image size drops from 128MB to about 5MB.
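As a minimal sketch (the specific Alpine tag and package are illustrative assumptions, not from the original), the switch is a one-line change in the Dockerfile:
# Alpine base image is ~5MB instead of ~128MB for Ubuntu
FROM alpine:3.18
# Use apk, Alpine's package manager, to install only what you need
RUN apk add --no-cache python3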
2. DON’T INSTALL DEBUG TOOLS LIKE curl/vim/nano
Many developers install curl or vim in their Dockerfiles for later debugging inside the container. These debugging tools further increase the image size.
Note: It is recommended to install these tools only in the development Dockerfile and remove them once development is completed and the image is ready for deployment to staging or production environments.
3. MINIMIZE LAYERS
Try to minimize the number of layers used to install packages in the Dockerfile. Otherwise, each step in the build process increases the size of the image, since every RUN instruction adds a new layer.
FROM debian
RUN apt-get install -y <packageA>
RUN apt-get install -y <packageB>
Try to install all the packages on a single RUN command to reduce the number of steps in the build process and reduce the size of the image.
FROM debian
RUN apt-get install -y <packageA> <packageB>
Note: Using this method, you will need to rebuild the entire image each time you add a new package to install.
4. USE --no-install-recommends ON apt-get install
Adding --no-install-recommends to apt-get install -y can help dramatically reduce the size by avoiding installing packages that aren’t technically dependencies but are recommended to be installed alongside packages.
Note: apk add commands should have --no-cache added.
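As an illustrative sketch (the package name is an arbitrary example), the two flags look like this in a Dockerfile:
# Debian/Ubuntu: skip recommended-but-not-required packages
RUN apt-get update && apt-get install -y --no-install-recommends curl
# Alpine: do not keep the package index cache in the layer
RUN apk add --no-cache curl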
5. ADD rm -rf /var/lib/apt/lists/* TO SAME LAYER AS apt-get installs
Add rm -rf /var/lib/apt/lists/* at the end of the apt-get install -y command to clean up after installing packages. (For yum, use yum clean all.)
If you install wget or curl to download some package, remember to combine everything in one RUN statement, and at the end of that statement perform apt-get remove curl or wget once you no longer need them.
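A hedged example of the pattern (the downloaded archive and URL are placeholders, not from the original):
# Install curl, download a (placeholder) archive, then remove the download tool
# and the apt cache in the same layer so none of it persists in the final image
RUN apt-get update && apt-get install -y --no-install-recommends curl \
 && curl -fsSL https://example.com/some-package.tar.gz -o /tmp/pkg.tar.gz \
 && tar -xzf /tmp/pkg.tar.gz -C /opt \
 && apt-get remove -y curl && apt-get autoremove -y \
 && rm -rf /var/lib/apt/lists/* /tmp/pkg.tar.gz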
6. USE fromlatest.io
FromLatest will lint your Dockerfile and check for even more steps you can perform to reduce your image size.
7. MULTI-STAGE BUILDS IN DOCKER
The multi-stage build divides Dockerfile into multiple stages to pass the required artifact from one stage to another and eventually deliver the final artifact in the last stage. This way, our final image won’t have any unnecessary content except the required artifact.
CONCLUSION
The smaller the image, the better the resource utilization and the faster the operations. By carefully selecting and building image components following the recommendations in this article, you can easily save space and build efficient and reliable Docker images.
There are already many tutorials on how to dockerize applications available on the internet, so why am I writing another one?
Most of the tutorials I see are focused on a specific technology (say Java or Python), which may not cover what you need. They also do not address all the relevant aspects that are necessary to establish a well-defined contract between Dev and Ops teams (which is what containerization is all about).
I compiled the steps below based on my recent experiences and lessons learned. It is a checklist of details and things that are overlooked by the other guides you will see around.
Disclaimer: This is NOT a beginner’s guide. I recommend you learn the basics of how to set up and use Docker first, and come back here after you have created and launched a few containers.
Let’s get started.
1. Choose a base Image
There are many technology-specific base images, such as node, python, or openjdk.
2. Install the necessary packages
This is usually trivial. Some details you may be missing:
a-) You need to write apt-get update and apt-get install on the same line (the same applies if you are using apk on Alpine). This is not just a common practice; you need to do it, otherwise the “apt-get update” temporary image (layer) can be cached on its own and may not refresh the package information you need immediately after (see this discussion https://forums.docker.com/t/dockerfile-run-apt-get-install-all-packages-at-once-or-one-by-one/17191).
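In practice that means chaining both commands in a single RUN instruction (the package name is just an example):
# Both commands in one layer, so the package index is always fresh
RUN apt-get update && apt-get install -y nginx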
b-) Double check that you are installing ONLY what you really need (assuming you will run the container in production). I have seen people installing vim and other development tools inside their images.
If necessary, create a different Dockerfile for build/debugging/development time. This is not only about image size, think about security, maintainability and so on.
3. Add your custom files
A few hints to improve your Dockerfiles:
a-) Understand the difference between COPY and ADD: COPY simply copies local files and directories into the image, while ADD can also fetch remote URLs and automatically extract local tar archives. Prefer COPY unless you specifically need ADD’s extra behaviour.
b-) Define a standard location inside the image for your application files, e.g. for interpreted applications (PHP, Python), use the /usr/src folder.
c-) Check the attributes of the files you are adding. If you need execution permission, there is no need to add a new layer to your image (RUN chmod +x …). Just fix the original attributes in your code repository.
There is no excuse for that, even if you are using Windows, see:
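For example, with Git you can record the executable bit directly in the repository, so no chmod layer is needed (the file name is just an illustration):
# Mark the (illustrative) entrypoint script as executable in the repo itself
git update-index --chmod=+x docker-entrypoint.sh
git commit -m "Mark entrypoint as executable"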
4. Define which user will run your container
a-) You only need to run your container with a specific (fixed ID) user if your application needs access to the user or group tables (/etc/passwd or /etc/group).
b-) Avoid running your container as root as much as possible.
Unfortunately, it is not hard to find popular applications requiring you to run them with specific ids (e.g. Elastic Search with uid:gid = 1000:1000).
Try not to be another one…
5. Define the exposed ports
This is usually a very trivial process. Please, just don’t create the need for your container to run as root because you want it to expose a privileged low port (80). Just expose a non-privileged port (e.g. 8080) and map it during container execution.
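A rough sketch of the idea (the image name and port numbers are illustrative):
# In the Dockerfile: expose a non-privileged port
EXPOSE 8080
# At run time: map host port 80 to the container's port 8080
docker run -d -p 80:8080 my-app-image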
Every application requires some kind of parametrization. There are basically two paths you can follow:
1-) Use an application-specific configuration file: then you will need to document the format, fields, location and so on (not good if you have a complex environment, with applications spanning different technologies).
2-) Use (operating system) Environment variables: Simple and efficient.
If you think this is not a modern or recommended approach, remember that it is part of The Twelve-Factors:
This does not mean that you need to throw away your configuration files and refactor the config mechanism of your application.
Just use a simple envsubst command to replace a configuration template (inside the docker-entrypoint.sh, because it needs to be performed at run time).
This will encapsulate the application-specific configuration file, layout and details inside the container.
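A minimal sketch of that idea, assuming an nginx-style template and an entrypoint script (the file paths and the PORT variable are illustrative, not from the original):
#!/bin/sh
# docker-entrypoint.sh (illustrative): render the config template from
# environment variables at run time, then hand off to the real process
envsubst '$PORT' < /etc/nginx/templates/default.conf.template > /etc/nginx/conf.d/default.conf
exec nginx -g 'daemon off;'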
8. Externalize your data
The golden rule is: do not save any persistent data inside the container.
The container file system is supposed and intended to be temporary and ephemeral. So any user-generated content, data files or process output should be saved either on a mounted volume or on a bind mount (that is, a folder on the Base OS linked inside the container).
I honestly do not have a lot of experience with mounted volumes; I have always preferred to save data on bind mounts, using a previously created folder carefully defined using a configuration management tool (such as SaltStack).
By carefully created, I mean the following (see the sketch after this list):
I create a non privileged user (and group) on the Base OS.
All bind folders (-v) are created using this user as owner.
Permissions are given accordingly (only to this specific user and group, other users will have no access to that).
The container will be run with this user.
You will be in full control of that.
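A rough sketch of those steps on the Base OS (the user name, ids, paths and image name are assumptions for illustration only):
# Create a non-privileged user and group on the Base OS
sudo groupadd -g 1500 appdata
sudo useradd -u 1500 -g appdata -s /usr/sbin/nologin appdata
# Create the bind folder and restrict it to that user and group
sudo mkdir -p /srv/myapp/data
sudo chown appdata:appdata /srv/myapp/data
sudo chmod 770 /srv/myapp/data
# Run the container as that user, with the folder bind-mounted inside
docker run -d --user 1500:1500 -v /srv/myapp/data:/data my-app-image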
9. Make sure you handle the logs as well
I am aware that my previous “persistent data” definition is far from precise, and logs sometimes fall into the grey area. How should you handle them?
If you are creating a new app and want it to stick to docker conventions, no log files should be written at all. The application should use stdout and stderr as an event stream. Just like the environment variables recommendation, it is also one of The Twelve-Factors. See:
There are some practical cases where this is particularly difficult though. If you are running a simple nginx container, you will have at least two different types of log files:
HTTP Access Logs
Error Logs
With different structures, configurations and pre-existing implementations, it may not be trivial to pipe them to the standard output.
In this case, just handle the log files as described in the previous section, and make sure you rotate them.
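If you do want to pipe them anyway, the official nginx image shows one way of doing it, by symlinking the log files to the container's standard streams; a sketch of the same trick in a Dockerfile would be:
# Forward nginx logs to the container's stdout/stderr
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
 && ln -sf /dev/stderr /var/log/nginx/error.log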
10. Rotate logs and other append only files
If your application is writing log files or appending to any files that can grow indefinitely, you need to worry about file rotation.
This is critical to prevent the server from running out of space and to apply data retention policies (which is especially important when it comes to GDPR and other data regulations).
If you are using bind mounts, you can count on some help from the Base OS and use the same tools you would use for a local rotation configuration, that is, logrotate.
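A minimal logrotate sketch on the Base OS, assuming a bind-mounted log folder like the one from the earlier example (the path and retention values are illustrative):
# /etc/logrotate.d/myapp (illustrative)
/srv/myapp/data/logs/*.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
    copytruncate
}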
Multi-stage builds make use of one Dockerfile with multiple FROM instructions. Each FROM instruction starts a new build stage that can COPY artifacts from previous stages. By copying only the build artifact from the build stage, you eliminate all the intermediate steps such as downloading code, installing dependencies and testing. All these steps create additional layers, and you want to eliminate them from the final image.
The build stage is named by appending AS name-of-build to the FROM instruction. The name of the build stage can then be used in subsequent FROM and COPY commands, providing a convenient way to identify the source layer for files brought into the image build. The final image is produced from the last stage executed in the Dockerfile.
Try taking the example from the previous section that used more than one Dockerfile for the React application and replacing the solution with one file that uses a multistage build.
Dockerfile
# Stage 1: build the React application with Node
FROM node:12.13.0-alpine as build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Stage 2: serve the production build with nginx
FROM nginx
EXPOSE 3000
COPY ./nginx/default.conf /etc/nginx/conf.d/default.conf
COPY --from=build /app/build /usr/share/nginx/html
This Dockerfile has two FROM commands, with each one constituting a distinct build stage. These distinct commands are numbered internally, stage 0 and stage 1 respectively. However, stage 0 is given a friendly alias of build. This stage builds the application and stores it in the directory specified by the WORKDIR command. The resultant image is over 420 MB in size.
The second stage starts by pulling the official Nginx image from Docker Hub. It then copies the updated virtual server configuration to replace the default Nginx configuration. Then the COPY --from command is used to copy only the production-related application code from the image built by the previous stage. The final image is approximately 127 MB.
Case Study Lab: Dockerize an Angular Application with Nginx
In this lab, we will go through a step-by-step guide on how to write a multi-stage Dockerfile to build an Angular application using Docker and host the production-ready code in an NGINX container. We will also walk through some of the Docker commands used to build, run and monitor the status of containers.
Creating an Angular Application:
In order to proceed with this step, ensure that you have Node.js and Angular CLI installed on your EC2 instance. Installation instructions can be found below.
Step 1 – Install Node.js
First of all, you need to install Node.js on your system. If you don’t have Node.js installed, use the following set of commands to add the Node.js PPA to your Ubuntu system and install it.
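The exact commands are not reproduced above; a typical NodeSource setup on Ubuntu looks like the following (the Node.js major version is an assumption, pick the release you need):
# Add the NodeSource repository for Node.js (18.x is illustrative)
curl -sL https://deb.nodesource.com/setup_18.x | sudo -E bash -
# Install Node.js (npm is included)
sudo apt-get install -y nodejs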
Make sure you have successfully installed Node.js and npm on your system:
node --version
npm --version
# Install the "n" Node.js version manager and switch to a stable/latest release
npm install n -g
n stable
n latest
Step 2 – Install Angular/CLI
After installing Node.js and npm on your system, use the following command to install the Angular CLI tool:
npm install -g @angular/cli
Once the prerequisites are installed, you can start by executing the following commands to create an Angular application in a local directory of your preference.
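The commands themselves are not shown above; based on the build path used later in the Dockerfile (dist/sample-angular-app) and the port opened in the next step, they would presumably look like this:
# Create a new Angular application (the name is assumed from the Dockerfile)
ng new sample-angular-app
cd sample-angular-app
# Serve the app on all interfaces on port 8001
ng serve --host 0.0.0.0 --port 8001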
Go to your ipaddress:8001. Make sure port 8001 is open in your security group.
Now that we have an Angular application successfully running, let’s start writing a Dockerfile for it.
Writing a Dockerfile
Below is the Dockerfile snippet we will use to dockerize our Angular application with an NGINX server. The Dockerfile comprises a multi-stage Docker build, which is divided into the following stages:
Building the angular source code into production ready output
Serving the application using a NGINX web server
# Stage 1: Compile and Build angular codebase
# Use official node image as the base image
FROM node:latest as build
# Set the working directory
WORKDIR /usr/local/app
# Add the source code to app
COPY ./ /usr/local/app/
# Install all the dependencies
RUN npm install
# Generate the build of the application
RUN npm run build
# Stage 2: Serve app with nginx server
# Use official nginx image as the base image
FROM nginx:latest
# Copy the build output to replace the default nginx contents.
COPY --from=build /usr/local/app/dist/sample-angular-app /usr/share/nginx/html
# Expose port 80 for the nginx server
EXPOSE 80
FROM – Initializes a new build stage, and sets the latest node image from DockerHub registry as the base image for executing subsequent instructions relevant to the angular app’s configuration. The stage is arbitrarily named as build, to reference this stage in the nginx configuration stage.
WORKDIR – Sets the default working directory in which the subsequent instructions are executed. The directory is created if the path is not found. In the above snippet, an arbitrary path of /usr/local/app is chosen as the directory to move the Angular source code into.
COPY – Copies the source files from the project’s root directory on the host machine to the specified working directory’s path on the container’s filesystem.
RUN – Executes the Angular build in a new layer on top of the base node image. After this instruction is executed, the build output is stored under /usr/local/app/dist/sample-angular-app, and the compiled image will be used for the subsequent steps in the Dockerfile.
Stage 2:
FROM – Initializes a secondary build stage, and sets the latest nginx image from dockerhub registry as the base image for executing subsequent instructions relevant to nginx configuration.
COPY – Copies the build output generated in stage 1 (--from=build) to replace the default nginx contents.
EXPOSE – Informs Docker that the nginx container listens on network port 80 at runtime. By default, the nginx server runs on port 80, hence we are exposing that specific port.
Running the Docker Container
In order to build and run the docker container, open up a command prompt and navigate to the location of your Dockerfile in your project’s directory.
Execute the following command to build the docker image.
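The command itself is not reproduced above; based on the image name and tag described just below, it would presumably be:
docker build -t krish186/sample-angular-app-image:latest .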
The trailing . sets the build context (the location of the Dockerfile) to the current directory, and the -t argument tags the resulting image, where the repository name is krish186/sample-angular-app-image and the tag is latest.
After the build is successfully finished, we can check to see if it appears in the list of docker images available locally. To do so, we can execute the below command.
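The listing command itself is omitted above; it is the standard one:
docker images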
The -d option causes Docker to detach the container and run it in the background. The -p argument establishes a port mapping, which defines that port 80 of the Docker container (as specified in the Dockerfile) should be exposed to port 8080 of our host machine.
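A run command matching the options described above would presumably be:
docker run -d -p 8080:80 krish186/sample-angular-app-image:latest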
To check the details of our running container, type in the following command:
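The usual command for that is:
docker ps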
As per the above output, we see that the container is up and running. If we now head to ipaddress:8080/ we can see that the Angular application has been successfully dockerized.
Now that the application is running as expected, our next step would be to push our image to an image repository, to deploy our containers to a cloud service.
If you have a DockerHub account you can execute the following commands:
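The push commands are not shown in the original; for a Docker Hub account they would typically be:
# Log in to Docker Hub, then push the tagged image
docker login
docker push krish186/sample-angular-app-image:latest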