Docker is everywhere nowadays and it's not going away. We’ve been using it for years, but do we really understand it?
This post is meant to be an introduction to this technology for newcomers. The next post in the series will be all about how to work productively with Docker in a NodeJS development environment.
Instead of memorizing facts and APIs I like to think and explain from another angle:
What problem are we trying to solve?
And could we have invented this ourselves if we had never heard of this technology?
In the case of Docker and Kubernetes we couldn’t have written the platform ourselves. But we can at least come up with the feature spec of what we need. And suddenly the choices the developers made make much more sense.
What problem are we trying to solve?
Have you ever had something running on your local computer but it broke when you pushed it to staging/production?
Maybe some dependencies wouldn’t work because you were running a different node version?
Or a library wouldn’t build because you are running Windows and the production server Linux?
All these little issues can eat up a lot of development time. Tweaking the code, re-running all tests, pushing the changes, waiting…
And what about running two different versions of the same program? Let’s say two Apache servers with different dependencies and ports?
What we want is something that isolates those programs from the rest of the system. Like a virtual machine that contains everything that our application needs to run.
How containers solve our problems
Containers are a Linux technology based on kernel features like cgroups and namespaces, and they act like really light-weight virtual machines. That means a container contains everything it needs to run our application: our code and all its dependencies.
A single Linux kernel runs all the containers on a machine. But a container can only see its own files, processes, CPU and memory usage.
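To make that isolation concrete, here is a short sketch of running two web servers side by side (this assumes Docker is installed; the httpd image, container names, and host ports are just illustrative choices):

```shell
# Run two isolated Apache (httpd) containers from the same image.
# Each gets its own filesystem, process tree, and network namespace;
# only the host-side ports (8080 and 8081) differ.
docker run -d --name web-one -p 8080:80 httpd:2.4
docker run -d --name web-two -p 8081:80 httpd:2.4

# Each container only sees its own processes.
docker exec web-one ps aux

# Clean up.
docker rm -f web-one web-two
```

From inside `web-one`, the process list contains only Apache and the `ps` command itself, even though the host is running both containers and everything else.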
Docker was released in 2013 and, by making it simple to use, brought containers into the mainstream.
What do I need to know to use Docker productively?
If you’re a Mac or Windows user you probably want Docker Desktop. For everyone else, a quick search for up-to-date installation instructions for your distribution will serve you better than anything I could write here.
What does Docker consist of?
The heart of Docker is the Docker daemon, dockerd. This Linux process is handled by virtualization software (e.g. HyperKit on macOS) if the host system is not Linux itself. It listens to requests sent via the Docker API.
You interact with the daemon through the docker command-line tool. As a developer you will work most of the time with Docker Compose, which enhances the docker command with additional functionality.
Docker containers are based on images, which can be thought of as read-only blueprints. They define what will be baked into a container and what will run inside it.
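As an illustration, an image for a small Node app could be defined with a Dockerfile like this (the base image tag and file names are assumptions, not from this post):

```dockerfile
# Start from an official Node base image — the read-only layers we build on.
FROM node:16-alpine

# Bake our code and dependencies into the image.
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .

# Define what runs when a container is started from this image.
CMD ["node", "server.js"]
```

Running `docker build` on this file produces an image; every container started from it gets an identical copy of the code and dependencies baked in at build time.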
Images need to be stored somewhere and for this there exist registries. These can be public or private. The default registry is the Docker Hub.
An Introduction to Docker-Compose
Docker Compose is a tool consisting of a CLI and a specification file format. Together they simplify the development of multi-container environments.
Why do we need it?
Think about a relatively simple setup like the following: You have a node server running in one container and a postgres database in another. You get to work writing your Docker CLI commands: exposing some ports, mounting volumes, setting environment variables… That’s already a pretty long command that you need to remember and teach new team members.
Also you want to run your database container first, wait until it’s finished loading and then run your node server container. You could write all this into shell scripts but there’s a nicer way.
Docker Compose lets you write all this configuration declaratively into a docker-compose.yaml file and you’re done. Now everyone on the team just needs to run the CLI command docker-compose up and Docker gets to work setting up your environment exactly like you wanted it.
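For the node-plus-postgres setup described above, such a file might look like this (service names, image tags, ports, and credentials are purely illustrative):

```yaml
version: '2'

services:
  db:
    image: postgres:12
    environment:
      POSTGRES_PASSWORD: example   # illustrative credentials only
    volumes:
      - db-data:/var/lib/postgresql/data

  web:
    build: .                       # build the node server from the local Dockerfile
    ports:
      - '3000:3000'
    environment:
      DATABASE_URL: postgres://postgres:example@db:5432/postgres
    depends_on:
      - db                         # start the database container first

volumes:
  db-data:
```

Note that depends_on only controls start *order*; waiting until the database has actually finished loading still needs a readiness check, such as a healthcheck or a wait script.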
The Compose CLI (e.g. docker-compose my-command) can replace the Docker CLI (e.g. docker my-command) completely.
A Short Intro to the YAML specification
Docker Compose files are written in YAML. Think of it like JSON but with some nice additions, like comments. Many tools besides Docker in the operations space use YAML, for example Ansible, Salt and Kubernetes, so it’s worth learning.
Like JSON, YAML consists of keys and values. Values can be keys again to build a nested structure. Here is an example docker-compose YAML file:
```yaml
version: '2' # key is `version`, value is `'2'`

services:
  my-banana-service: # indented with 2 spaces
    my-list: # indented further to be part of banana service
      - 'my-first-entry' # first entry of my-list
      - 'my-second-entry' # second entry of my-list
  my-second-service: # indented with only 2 spaces, so it's not part of banana service but just of `services`
```
As you can see, colons define key-value pairs and indentation defines relationships between data layers. In our example my-list belongs to my-banana-service, which, in turn, belongs to services. Dashes define array entries.
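If the JSON parallel helps, the YAML example above corresponds to this structure (a key with no value, like my-second-service, becomes null):

```json
{
  "version": "2",
  "services": {
    "my-banana-service": {
      "my-list": ["my-first-entry", "my-second-entry"]
    },
    "my-second-service": null
  }
}
```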
Which version should I use?
At the top of your YAML file you should define your docker-compose version. Nowadays you see mostly two different versions floating around: version 2 and version 3 (as well as minor versions like ‘2.4’ and ‘3.7’). Version 3 is not a strict super-set of version 2, and thus version 2 is not obsolete!
Version 3 adds orchestration features aimed at Docker Swarm and Kubernetes but removes others. If you don’t use one of those two it’s fine to stay on version 2. Read more in the docker docs.
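Resource limits illustrate the difference well: in version 2 they live directly on the service and are honored by docker-compose, while version 3 moves them under a deploy key that only orchestrators like Swarm act on (the image and the values here are illustrative):

```yaml
# Version 2: limits directly on the service, honored by `docker-compose up`.
version: '2.4'
services:
  web:
    image: nginx:alpine
    mem_limit: 256m
```

```yaml
# Version 3: the same limit under `deploy`, which plain docker-compose ignores.
version: '3.7'
services:
  web:
    image: nginx:alpine
    deploy:
      resources:
        limits:
          memory: 256M
```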
The Docker-Compose CLI
For using the command-line interface, there are two basic commands you want to know:
docker-compose up looks for the specification file (docker-compose.yaml in the current directory by default) and creates your environment like you defined it. It will pull or build the images, create networks and volumes, and run your containers.
docker-compose down stops and deletes your networks and containers. It can also delete your volumes if you add the --volumes flag.
Other interesting commands are docker-compose ps to list services, docker-compose logs to see your logs and docker-compose exec to run commands in your containers.
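Put together, a typical day-to-day loop with these commands might look like this (the service name web is an assumption taken from an example compose file, not something the commands require):

```shell
docker-compose up -d        # build/pull images, create networks, start containers in the background
docker-compose ps           # list the running services
docker-compose logs -f web  # follow the logs of the `web` service
docker-compose exec web sh  # open a shell inside the running `web` container
docker-compose down         # stop and delete the containers and networks
```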
One caveat: don’t use the CLI in production; rather use battle-proven orchestrators like Docker Swarm or Kubernetes for that. Docker Compose should stay a development tool in your workflow.
This has been the high-level overview of Docker. In one of my next posts I will explore advanced and NodeJS-specific Docker features and finally finish with a post about the winner of the orchestration wars: Kubernetes.