Docker

Authors: Rahul Rajesh

Reviewers: Monika Manuela Hengki, Wang Junming

What is Docker?

Figure 1. Docker Logo (source)

Docker is a platform that is used to develop, deploy and run applications inside “containers”.

A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another. --Docker Website

The idea of containers is not so different from that of virtual machines (VMs), which people used before the rise of Docker. A VM is an emulation of a real computer that is used to isolate an application and its dependencies into a self-contained unit that can run anywhere. For example, you could use a VM to run an application on a Linux system even though your machine runs Windows.

However, a container has some advantages over a VM. As per the Docker Documentation, a container runs natively on Linux and shares the kernel of the host machine with other containers. It runs a discrete process, taking no more memory than any other executable, making it lightweight.

Figure 2. Difference between a container and a VM (source)

By contrast, a virtual machine (VM) runs a full-blown “guest” operating system with virtual access to host resources through a hypervisor. In general, VMs provide an environment with more resources than most applications need.

The idea behind Docker is to spin up a lightweight container that can execute services quickly without the overhead of a full-blown VM. With this in mind, let us move on to the reasons to switch over to Docker. The subtleties between a container and a VM will become clearer as you read the next few sections.

Why Docker?

Docker is a powerful tool that is rapidly gaining popularity. These are some statistics for Docker:

Figure 3. Usage of Docker (source)

Many leading companies (e.g. Spotify, Nginx and ElasticSearch) are using Docker for their deployment!

If the numbers aren't enough to convince you to get started on Docker, here are some of its notable advantages, as compiled by Red Hat:

  • Rapid application deployment – containers include the minimal runtime requirements of the application, reducing their size and allowing them to be deployed quickly.
  • Portability across machines – an application and all its dependencies can be bundled into a single container that is independent from the host version of Linux kernel, platform distribution, or deployment model. This container can be transferred to another machine that runs Docker, and executed there without compatibility issues.
  • Version control and component reuse – you can track successive versions of a container, inspect differences, or roll back to previous versions. Containers reuse components from the preceding layers, which makes them noticeably lightweight.
  • Sharing – you can use a remote repository to share your container with others. Red Hat provides a registry for this purpose, and it is also possible to configure your own private repository.
  • Lightweight footprint and minimal overhead – Docker images are typically very small, which facilitates rapid delivery and reduces the time to deploy new application containers.
  • Simplified maintenance – Docker reduces effort and risk of problems with application dependencies.

As the list above shows, Docker is a powerful tool that can alleviate your deployment troubles. However, Docker is not a one-size-fits-all solution and has its limitations (refer to this article for examples). Carefully consider your use case before turning to Docker.

Now that we have an idea of how Docker works and what advantages it offers, let us take a closer look at some of its distinctive features.

Feature: Docker is Lightweight

As discussed above, Docker runs your application in a container instead of a full-fledged VM.

Figure 4 lists some of the differences between a VM and a container:

Figure 4. VM vs Container (source)

Figure 2 in the “What is Docker?” section gives a pictorial representation of the points listed in Figure 4. To summarise, a Docker container shares the kernel of the host OS and runs as a discrete process on it (much like any other application). As a result, a container has a minimal footprint and starts faster than a VM, which has to run a full guest operating system.

Feature: Docker Allows for Sharing and Reuse

Every Docker container is created from an image. An image typically corresponds to the software you want to run (e.g. a Python image or an Ubuntu image) and defines what your packaged application and its dependencies look like.

One of the reasons why Docker is so great is that it provides a shared registry known as the Docker Hub from which you can download prebuilt images. The Docker Hub has over a hundred thousand images created by the community that are readily available for use.

Figure 5. Docker Hub (source)

Hence, no matter what your use case is, there is a good chance that someone else has already built an image for it on the Docker Hub. With Docker, you do not have to spend hours thinking about how to configure your images. On top of that, you are free to augment existing images to fit your exact needs. You can then share your new image back to the Docker Hub for others to use!

Feature: Docker is Accessible

On top of the above-mentioned advantages, Docker has made it much easier for anyone, from developers to system admins, to take advantage of containers to quickly build and run applications. Docker allows anyone to package an application on their laptop, which in turn can run unmodified on any public or private cloud. Hence the mantra, “build once, run anywhere”.

Docker is able to do this through what is known as a Dockerfile. A Dockerfile is where you write the instructions to build a Docker image. Once a Dockerfile is set up, run docker build to build the image, and then docker run to start a container from that image.

An example of a simple Dockerfile is as follows:

# FROM - Base image to start building on.
FROM ubuntu:14.04

# RUN - Runs a command while the image is being built and saves the result as a new layer.
RUN echo "Hello Docker!" > /tmp/hello_docker.txt

# CMD - The command that is used by default when a container is started from the image.
CMD ["cat", "/tmp/hello_docker.txt"]

As you can see from above, a Dockerfile is a series of instructions that is used to build an image. You start with a base image (ubuntu in the example above) and then add more “layers” to the image, with each layer representing a portion of the image's file system that either adds to or replaces content in the layers below it.

In the example, the added layer is produced by a simple shell command that writes “Hello Docker!” to a file, which the container prints when it runs. A Dockerfile is able to do much more than this! It can install specific dependencies, set up configuration files, start servers, etc. There are plenty of guides available that cover the fundamentals of getting started with this (refer to the next section for some links).
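
To give a rough idea of what installing dependencies can look like, here is a minimal sketch of a Dockerfile for a small Python application. It is only an illustration: the files app.py and requirements.txt are hypothetical and assumed to sit in the same folder as the Dockerfile, and the python:3.12-slim base image is just one of the official Python images on the Docker Hub.

```dockerfile
# Sketch only: app.py and requirements.txt are hypothetical files
# assumed to sit next to this Dockerfile.

# Start from an official Python image pulled from the Docker Hub.
FROM python:3.12-slim

# Use /app inside the image as the working directory.
WORKDIR /app

# Copy the dependency list first, so the install layer below can be
# reused from the build cache as long as requirements.txt is unchanged.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code into the image.
COPY . .

# Default command when a container is started from the image.
CMD ["python", "app.py"]
```

With Docker installed, running docker build -t my-app . in that folder should build the image (my-app is an arbitrary tag chosen for this example), and docker run --rm my-app should start a container from it.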

As a result, Docker is easy to get started with. Once you have written a Dockerfile, you can reuse it anywhere to build your image and run your container! Creating a Dockerfile is also made easier by the plethora of resources available and by the Docker Hub.

Feature: Docker is Modular and Scalable

Docker also makes it much easier to deploy an application that uses a microservices-based architecture. For example, you may have a Postgres database running in one container, a Redis server in another and a Python Flask application in a third. Docker makes it simple to group these containers together and to scale or update individual components in the future.

In order to provide a little more clarity to this, let us consider a simple blog application that is running using Nginx, WordPress and MariaDB. We can organise this as follows:

Figure 6. Docker Architecture (source)

Each of the above services is encapsulated in a container using Docker. Docker provides an additional tool called docker-compose that allows you to run all the containers at once. docker-compose also has other advantages:

  • Preserves volume data when containers are created
  • Only recreates containers that have changed
  • Supports variables, which helps when moving a composition between environments

The set-up to use docker-compose involves the creation of a YAML file. Detailed information on this is available here.
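
As a hedged illustration, a docker-compose.yml for the blog example above might look roughly like the sketch below. It assumes a reasonably recent Compose release (where the top-level version key is optional) and the official nginx, wordpress and mariadb images from the Docker Hub. The passwords are placeholders, and ./nginx.conf is a hypothetical configuration file that you would have to write yourself so that Nginx forwards requests to the wordpress service.

```yaml
# Sketch only: passwords are placeholders and ./nginx.conf is a
# hypothetical file that must proxy requests to http://wordpress:80.
services:
  db:
    image: mariadb
    environment:
      MARIADB_ROOT_PASSWORD: change-me-root
      MARIADB_DATABASE: wordpress
      MARIADB_USER: wordpress
      MARIADB_PASSWORD: change-me
    volumes:
      # Named volume, so the database files outlive the container.
      - db_data:/var/lib/mysql

  wordpress:
    image: wordpress
    depends_on:
      - db
    environment:
      # "db" is the service name above; Compose makes it resolvable
      # as a hostname on the project's default network.
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_NAME: wordpress
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: change-me

  nginx:
    image: nginx
    depends_on:
      - wordpress
    ports:
      # Host port 8080 maps to port 80 in the nginx container.
      - "8080:80"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro

volumes:
  db_data:
```

From the folder containing this file, docker-compose up -d should start all three containers in the background, and docker-compose down should stop and remove them while keeping the db_data volume.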

How to use Docker?

At this point, you have seen the advantages that Docker offers and might be considering switching over to Docker for your own projects.

Here is a learning path that you can follow to pick up Docker:

  • Docker's getting started guide : Docker's official documentation is a good place to start. It gives a good overview of the fundamentals behind Docker and takes you through setting up your own Docker environment, building an image, scaling and deploying.

  • Article covering important concepts behind Docker : After reading the official documentation, this is another excellent article to look through. It covers how Docker works in detail with good examples.

  • Books covering specific use cases with Docker : Once you have a clearer picture on the fundamentals behind Docker, this resource will provide you with a collection of books that show you how to use Docker in a practical setting.

Additional Tools - Docker Swarm and Kubernetes

Once you have familiarised yourself with Docker and used it in your own projects, here are some additional tools built around Docker that you might want to look into.

Kubernetes is an open-source platform created by Google for automating the deployment and scaling of containers across clusters of hosts. It is a tool that can help you manage many Docker containers. You can read more about it in the official documentation here.

Docker Swarm is an alternative: Docker’s own native clustering solution for Docker containers. It monitors the containers spread across clusters of servers and is a way to create a clustered Docker application without additional hardware. The official Docker documentation gives more information on this.

Conclusion / Further Readings

In a nutshell, Docker is a lightweight solution to run your application in an isolated environment. Docker provides a convenient out-of-the-box setup to deploy your applications and has added functionality to deploy complex microservices-based applications.

Apart from those listed in the article, here are some further readings/references to get moving with Docker: