Building

Dockerfile

base image

Declare the base image:

FROM ubuntu:16.04

For smaller image size, consider using Alpine as the base image.

system packages

Install packages from the distribution package manager:

RUN apt-get update && apt-get install -y make python3.5 python3.5-venv

Putting the apt-get update in the same RUN instruction as the apt-get install results in fewer layers. It also means that when a new package is added to apt-get install, the apt-get update is re-run rather than served from a cached layer, so the package lists are guaranteed to be fresh.

app directory

RUN mkdir /app
WORKDIR /app

This creates a trivial layer. Add the mkdir /app call to the RUN instruction which does apt-get update and apt-get install to avoid this.
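For example, the earlier instructions could be collapsed into a single layer (a sketch based on the packages installed above):

RUN apt-get update && \
    apt-get install -y make python3.5 python3.5-venv && \
    mkdir /app
WORKDIR /app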

Many containers will need a place for the source code from the repository. Often it is convenient to make this place the working directory. Is /app a good location for this place? It is what I use.

language specific packages

COPY requirements.txt /app/requirements.txt
RUN pip3 install -r requirements.txt

It is a good idea to give the pip install, bundle install, or npm install step its own layer so it is cached. Don't COPY all of the source code before this step; copy just the requirements.txt, Gemfile, or package.json needed to install the language specific packages. That way the layer is only rebuilt when the file listing the packages is edited.

copy source code

COPY . /app

There will probably be stuff you don't want to copy from the repository to the container. List the paths to exclude in .dockerignore.
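A minimal .dockerignore might look like this; the exact entries depend on the project and are only illustrative:

.git
.dockerignore
__pycache__/
*.pyc
data/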

By the way, it is good practice to do a fresh clone of the Git repository followed by an image build. If you have been building outside the container, which is often the case when dockerization happens late in the development process, there might be build artifacts getting copied into the image, and a docker build from a clean Git repository might fail!

check source code

RUN make check

Define a script or a build target which runs all the static code analysis checks and unit tests. Run it as part of the build process. If it fails, the final image doesn't get built.
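A sketch of such a target, assuming a Python project that uses flake8 and pytest (both hypothetical choices here):

check:
	flake8 .
	python3 -m pytest tests/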

default command

CMD ["make"]

Like the RUN instruction, the CMD instruction has a JSON (exec) form and a shell form. If the JSON form is used, the shell is bypassed and the first element of the JSON list is exec'd directly. For efficiency, use the JSON form unless shell features such as redirection and pipes are needed.

If the user provides a command and arguments to docker run, they take the place of the arguments to the CMD instruction.
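For example, assuming a hypothetical app.py entry point (only the last CMD in a Dockerfile takes effect; both forms are shown here for comparison):

# exec form: python3 is exec'd directly as PID 1, no shell involved
CMD ["python3", "app.py"]

# shell form: wrapped in /bin/sh -c, so redirection and pipes work
CMD python3 app.py > app.log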

Smaller Images

A smaller image trick

multi-stage builds
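In a multi-stage build, a later stage copies only the artifacts it needs from an earlier stage, so build tools and intermediate files stay out of the final image (this requires Docker 17.05 or later). A minimal sketch, assuming a hypothetical app whose make run leaves its output in /app/dist and which is served by a serve.py script:

FROM ubuntu:16.04 AS build
RUN apt-get update && apt-get install -y make python3.5 && mkdir /app
WORKDIR /app
COPY . /app
RUN make

FROM alpine:3.5
RUN apk update && apk add python3
WORKDIR /app
COPY --from=build /app/dist /app
CMD ["python3", "serve.py"]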

Alpine Linux vs Ubuntu Linux

I created a simple app which used python to serve a static page over HTTP.

When I used Ubuntu 16.04 as the base image, the size of the image was 213M. When I used Alpine 3.5 as the base image, the size of the image was 70M.

The name of the Alpine package manager is apk.

Here is an example of using it in a Dockerfile to install make and Python:

RUN apk update && apk add make python3

Alpine Linux uses musl as its libc implementation. Command line tools are provided by BusyBox, and the shell is not bash.

Cleaning up Images

Use the --filter flag to narrow the list to dangling images, then remove them:

$ docker images -q --filter="dangling=true" | xargs docker rmi

Running

Aliases for Building and Running

I find these aliases convenient:

alias dbuild='docker build -t IMAGE_TAG .'
alias dmake='docker run -v /foo-data:/app/data --rm IMAGE_TAG make'
alias dbash='docker run -v /foo-data:/app/data -it --rm IMAGE_TAG /bin/bash'

I put them in a .alias file at the root of the repository so they can easily be sourced.

Environment

The Twelve-Factor App defines some rules for designing apps to make them easy to deploy in the cloud.

One of the rules is to use environment variables for "external configuration", which includes information about the services the app communicates with.

Environment variables are set using the -e flag of docker run:

docker run -e PGHOST=127.0.0.1 IMAGE_TAG

If you find yourself writing code which needs to know whether it is in a container, you could establish an environment variable for this, but then the person calling docker run would have to remember to specify the environment variable.

A better approach is to check for the existence of the file /.dockerenv, which Docker creates inside every container.
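For example, in a shell script:

if [ -f /.dockerenv ]; then
    echo "inside a container"
else
    echo "on the host"
fi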

The ENV instruction creates environment variables which are available at build time, to subsequent instructions, and also in the running container.

We could use the ENV instruction to define a name for our /app directory and achieve some DRYness in the Dockerfile:

ENV HOME /app
RUN mkdir $HOME
WORKDIR $HOME

COPY requirements.txt ${HOME}/requirements.txt
RUN pip install -r requirements.txt

COPY . $HOME

SSH Keys

  • build time (e.g. pulling from git)
  • run time (e.g. tunneling); see the sketch below
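For the run-time case, one option (a sketch, not something these notes settle on; the key path and the make command are only illustrative) is to bind mount the key into the container read-only instead of baking it into the image:

$ docker run -v $HOME/.ssh/id_rsa:/root/.ssh/id_rsa:ro --rm IMAGE_TAG make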

Command Line Tool

The ENTRYPOINT and CMD instructions determine what executes when the container runs.

The default ENTRYPOINT is /bin/sh -c.

Unlike CMD, ENTRYPOINT is not overridden by arguments passed to docker run; it can only be overridden with the --entrypoint flag.
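A sketch of how the two interact, assuming a hypothetical app.py:

ENTRYPOINT ["python3"]
CMD ["app.py"]

With this Dockerfile, docker run IMAGE_TAG runs python3 app.py; docker run IMAGE_TAG other.py replaces only the CMD arguments (running python3 other.py); and docker run --entrypoint /bin/sh -it IMAGE_TAG replaces the entrypoint itself.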

TODO:

Cleaning up Containers

Remove all containers:

$ docker ps -q -a | xargs docker rm

Networking

Ports

Each container has its own namespace for networking resources, including networks and ports.

The docker engine creates three networks on the host. They are named bridge, host, and none.

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
7a8f832fea5a        bridge              bridge              local
3d7cdde1f796        host                host                local
1d082176606a        none                null                local

The docker run --network flag can be used to connect a container to one of these three networks. The bridge network is used by default.

The docker run -p HOST_PORT:CONTAINER_PORT flag can be used to publish a container port to a host port. If a process in the container binds to CONTAINER_PORT then a client can communicate with the container via HOST_PORT.

If the host has multiple IP addresses, the IP address can also be specified: docker run -p HOST_IP:HOST_PORT:CONTAINER_PORT.
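For example, to publish container port 5000 on host port 8080 (the port numbers are just illustrative), optionally bound to a single host address:

$ docker run -p 8080:5000 --rm IMAGE_TAG
$ docker run -p 192.168.1.10:8080:5000 --rm IMAGE_TAG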

exposing ports

It is possible to "expose" ports using the EXPOSE directive in a Dockerfile or the --expose flag of the docker run command.

I'm not sure what this does exactly. It doesn't do everything necessary to communicate with the container. Some recommend merely using the EXPOSE directive as a way to document the container namespace port used by the process in the container.

"connecting" a container to a host network

What does it mean to "connect" a container to a host network? It seems that two virtual ethernet devices are created, one in the root namespace of the host and one in the network namespace of the container. Something like this, which creates a network namespace netns1, creates veth0 in the root namespace, and moves veth1 into netns1:

$ sudo ip netns add netns1
$ sudo ip link add veth0 type veth peer name veth1
$ sudo ip link set veth1 netns netns1

Ethernet packets sent to veth0 are also sent to veth1 and vice versa. veth1 does not have any routing tables or firewall rules, so all that must be managed somehow.

accessing host port from inside container

Use this IP address?

hostip=$(ip route show | awk '/default/ {print $3}')

accessing host port from inside container (mac)

One way to do this is to assign an IP address to the lo0 interface of your Mac; e.g.

sudo ifconfig lo0 alias 10.200.10.1/24

Links

$ docker run --detach --name foo --rm foo
$ docker run --detach --name bar --link foo --rm bar
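The legacy --link flag makes the linked container's name resolvable as a hostname inside the linking container, so, assuming ping is available in the image, something like this should work:

$ docker exec bar ping -c 1 foo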

Networks

Try these commands:

$ docker help network

$ docker network ls
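User-defined bridge networks are an alternative to links; containers attached to the same user-defined network can reach each other by name. A sketch:

$ docker network create mynet
$ docker run --detach --name foo --network mynet --rm foo
$ docker run --detach --name bar --network mynet --rm bar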

Volumes

Each container has its own namespace for mount points.

Docker uses union file systems. OverlayFS is the default? It was added to the Linux Kernel in version 3.18.

How to get the Linux Kernel version:

$ uname -r

The container has its own file system, which is a union file system such as OverlayFS. The process can write to this file system, but it is only changing a layer on top of the file system in the image. Those changes are lost when the container is removed, but not when it is stopped and started.

Volumes allow the container to write to a part of the host file system that persists after the container is removed.

Volumes are mounted by the docker run command when the container is started. The -v flag can be used to specify the directory to be mounted.

Use this command if the host and container paths of the volume are not the same:

docker run -v HOST_PATH:CONTAINER_PATH IMAGE_TAG

The VOLUME Dockerfile directive can also be used, but the mount is still created by docker run.

If the app needs write access to a single volume, use /<REPO_NAME>-data for the host path and /data for the container path. All containers see the same data.

Repositories

Docker Hub

If you have an account named USERNAME on Docker Hub, then you can create a repository on Docker Hub and push to it.

$ docker build -t USERNAME/REPO_NAME .
$ docker login --username=USERNAME
$ docker push USERNAME/REPO_NAME

Private Docker Registry

$ docker run -d -p 5000:5000 --restart=always --name registry registry:2
$ docker tag 6ddf30a4f0d2 localhost:5000/REPO_NAME
$ docker push localhost:5000/REPO_NAME

Amazon ECR

The following command returns a docker login command which you can run to log in to the ECR registry:

$ aws ecr get-login
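One way to run the returned command directly (the region is just an example):

$ eval $(aws ecr get-login --region us-east-1)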