rohit kshirsagar
Nov 5, 2020

Demystifying Container Runtime for Beginners

Wondering what a container runtime is? There is a plethora of container runtimes available today, and they are used in different scenarios. This blog provides a bird's-eye view of what a container runtime is and demystifies the most commonly used ones; it does not cover every runtime.

To begin with, let me give you a snapshot of a container.

A container packages the application code along with its dependencies and ships it to a repository as an image. One can download the image and run the application anywhere. Containers are nothing but Linux processes isolated by underlying kernel functions such as:

  • namespaces: what the container/process can see.
  • cgroups: what the container/process can use, i.e. which resources it has access to.
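To make these two kernel features a little more concrete, here is a minimal Python sketch, assuming a Linux host with /proc mounted. The helper names (list_namespaces, parse_cgroup_lines) are made up for this illustration. Every Linux process, containerized or not, belongs to a set of namespaces and cgroups, so the sketch simply prints those of the current process.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace name: identifier} for a process.

    Each entry under /proc/<pid>/ns is a symlink naming one namespace the
    process belongs to. Returns an empty dict where /proc is unavailable.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        for name in sorted(os.listdir(ns_dir)):
            try:
                namespaces[name] = os.readlink(os.path.join(ns_dir, name))
            except OSError:
                continue  # entry not readable; skip it
    except OSError:
        pass
    return namespaces


def parse_cgroup_lines(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Each line has the form hierarchy-id:controller-list:path and becomes a
    (hierarchy_id, [controllers], path) tuple.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append(
            (hierarchy_id, controllers.split(",") if controllers else [], path)
        )
    return entries


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"namespace {name}: {ident}")
    try:
        with open("/proc/self/cgroup") as handle:
            for entry in parse_cgroup_lines(handle.read()):
                print("cgroup:", entry)
    except OSError:
        print("no /proc/self/cgroup on this system")
```

Running the script on the host and again inside a container should show different namespace identifiers and cgroup paths, which is exactly the isolation described above.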

Now, what is a container runtime?

A container runtime is software that executes containers and manages container images on a node. The most widely known is Docker, but there are others such as runc, CRI-O, kata-runtime and LXC.

Container runtimes are divided into two categories:

  1. Low-level runtime: focuses on actually running a container (e.g. runC).
  2. High-level runtime: focuses on managing and sharing images and on providing APIs (e.g. containerd, CRI-O).

To understand container runtimes in detail, one first needs to understand the Open Container Initiative (OCI), the Kubelet and the Container Runtime Interface (CRI).

OCI (Open Container Initiative) is a lightweight, open governance structure (project), formed under the auspices of the Linux Foundation, for the express purpose of creating open industry standards around container formats and runtimes. The project was launched on June 22nd, 2015 by Docker, CoreOS and other leaders in the container industry such as IBM and Google. The OCI currently contains two specifications:

  • Runtime Specification: defines the primitives used to create, start, stop and destroy containers. runC is an implementation of the runtime spec.
  • Image Specification: defines an image format, so that images can be distributed through registries (e.g. registry operations by Docker).

CRI (Container Runtime Interface): As multiple container runtimes emerged, integrating each one required code-level changes in the Kubernetes code base. The Kubernetes project considered this approach inappropriate, as it did not want to modify the code base for every new container runtime. Kubernetes 1.5 therefore introduced a plugin API named CRI to provide easy access to different container runtimes. CRI enables Kubernetes to use a variety of container runtimes without the need to recompile. In theory, Kubernetes can use any container runtime that implements CRI to manage pods, containers and container images. This means anyone can create their own container runtime and simply have it speak the CRI in order to run containers under Kubernetes.
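The plugin idea can be sketched in a few lines of Python. This is only an illustration of the design, not the real API: the actual CRI is a gRPC interface, and the class and function names below (ContainerRuntime, FakeRuntime, run_pod) are hypothetical. The method names loosely mirror real CRI calls such as RunPodSandbox, CreateContainer, StartContainer and PullImage.

```python
from abc import ABC, abstractmethod


class ContainerRuntime(ABC):
    """Hypothetical, heavily simplified stand-in for the CRI contract."""

    @abstractmethod
    def pull_image(self, image: str) -> None: ...

    @abstractmethod
    def run_pod_sandbox(self, pod_name: str) -> str: ...

    @abstractmethod
    def create_container(self, sandbox_id: str, name: str, image: str) -> str: ...

    @abstractmethod
    def start_container(self, container_id: str) -> None: ...


class FakeRuntime(ContainerRuntime):
    """In-memory runtime that only records calls (no real containers)."""

    def __init__(self, label):
        self.label = label
        self.images = set()
        self.started = []
        self._counter = 0

    def _next_id(self, prefix):
        self._counter += 1
        return f"{self.label}-{prefix}-{self._counter}"

    def pull_image(self, image):
        self.images.add(image)

    def run_pod_sandbox(self, pod_name):
        return self._next_id("sandbox")

    def create_container(self, sandbox_id, name, image):
        if image not in self.images:
            raise RuntimeError(f"image {image} has not been pulled")
        return self._next_id("ctr")

    def start_container(self, container_id):
        self.started.append(container_id)


def run_pod(runtime: ContainerRuntime, pod_name: str, containers: dict) -> list:
    """What a kubelet-like caller does: the same steps for any runtime."""
    sandbox_id = runtime.run_pod_sandbox(pod_name)
    started = []
    for name, image in containers.items():
        runtime.pull_image(image)
        container_id = runtime.create_container(sandbox_id, name, image)
        runtime.start_container(container_id)
        started.append(container_id)
    return started


if __name__ == "__main__":
    # Swapping the runtime requires no change to run_pod.
    for runtime in (FakeRuntime("containerd"), FakeRuntime("crio")):
        print(run_pod(runtime, "web", {"nginx": "nginx:1.19"}))
```

The point of the design is that run_pod never needs to know which runtime it is talking to, just as Kubernetes does not need to be recompiled for a new CRI runtime.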

Kubelet: CRI connects the Kubelet to the runtimes. The Kubelet is an agent that runs on each worker node and ensures that the containers described in a pod are running. When it comes alive, the Kubelet uses CRI to work with whatever runtime is present on that specific node. The Kubelet needs the runtime to:

  • Provide image management.
  • Prepare the environment to start the container.
  • Prepare the network for the pod.

Figure: Kubelet interaction with various runtimes

Docker Shim:

Dockershim is the CRI implementation for Docker. In general, a shim is a piece of software that sits between a container manager and a container runtime to facilitate communication and prevent integration problems that may arise. Dockershim receives CRI calls from the Kubelet and passes them on to the Docker daemon (dockerd), which runs the containers that make up the pods. Lower in the Docker stack there is a second shim, the containerd shim, which sits as the parent of the container's process to facilitate a few things.

First, it allows the runtime, i.e. runc, to exit after it starts the container. This way we don't need a long-running runtime process for every container.

Second, it keeps STDIO open for the container in case containerd, Docker or both die. Without it the container would exit.

The way to spot this shim is to inspect the process tree on a Linux host with a running Docker container. It appears as /usr/bin/docker-containerd-shim (the exact binary name varies with the Docker and containerd version).
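As a rough sketch of that inspection, the Python below scans /proc for processes whose executable name contains "containerd-shim". It assumes a Linux host, and the helper names are invented for this example. The substring match is a heuristic of mine, chosen because the binary name differs between versions; it is not an official way to identify shims.

```python
import os


def is_shim(cmdline: str) -> bool:
    """True if the command's executable name looks like a containerd shim."""
    parts = cmdline.split()
    if not parts:
        return False  # kernel threads have an empty command line
    return "containerd-shim" in os.path.basename(parts[0])


def find_shims(processes: dict) -> dict:
    """Filter a {pid: cmdline} mapping down to shim processes."""
    return {pid: cmd for pid, cmd in processes.items() if is_shim(cmd)}


def read_process_table(proc_root="/proc") -> dict:
    """Collect {pid: cmdline} from /proc; empty where /proc is unavailable."""
    table = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return table
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # process exited or is not readable
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        table[int(entry)] = raw.replace(b"\0", b" ").decode(errors="replace").strip()
    return table


if __name__ == "__main__":
    for pid, cmd in sorted(find_shims(read_process_table()).items()):
        print(pid, cmd)
```

On a host with running Docker containers, one shim line per container would be expected; on a host without containers the script prints nothing.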

Docker (dockerd):

Docker brought containers into the mainstream as an open source runtime. It was originally developed as a monolithic daemon (dockerd) that was broken up as it evolved. The low-level runtime features that were present in early versions were separated into two components: runc, a command line tool, and containerd, a daemon. Dockerd listens for Docker API requests and has features for building images, while containerd manages images and runs containers, i.e. containerd is responsible for the complete container life cycle management.

Docker is the most widely known and used runtime. Current versions of Docker package, build and run containers, and can serve as the runtime underneath Kubernetes as the orchestrator. Docker Engine is built on runc and containerd. Docker calls its product the “Docker Engine”; generically, such full container tool suites may be referred to as container engines.

Containerd:

Containerd is a high-level container runtime introduced in Docker 1.11. It is responsible for managing the container life cycle. By default containerd uses runc under the hood and acts as the control daemon for runC. It takes care of:

  • Image push and pull.
  • Storage management.
  • Container execution, by calling runc to run the containers.
  • Management of network namespaces, so that containers can join existing namespaces.

Containerd fully leverages the OCI runtime and image format specifications and the OCI reference implementation (runc). Like the rest of the container tools that originated from Docker, it is the current de facto standard among CRI runtimes, and it provides all the core functionality that CRI requires. It was initially developed by Docker, but in 2017 it was donated to the CNCF to serve as the industry standard container management daemon. Docker still uses containerd, but containerd is independent of Docker.

Pros

  • Widespread adoption.
  • Stability and performance.
  • Big community and support model.
  • Works with all OCI runtimes.
  • Work in progress: Linux containers on Windows and Windows containers on Windows.

Cons

  • Runtime coupled with the daemon, i.e. when the daemon is restarted the containers can get restarted as well (Docker's live restore option exists to avoid this).
  • Standard packaging has too many daemons (docker shim, dockerd, containerd).

runC:

runC is a universal, lightweight, low-level container runtime originally developed as part of Docker and later extracted as a separate open source tool and library.

It is a command line tool for spawning and running containers according to the OCI specification. Containerd is the control daemon for runC. High-level container runtimes like Docker and containerd normally implement functionality such as image creation and management, and use runC to handle the tasks related to running containers: creating a container, attaching a process to an existing container and so on.

Pros

  • No dependency on the rest of the Docker platform: just the container runtime and nothing else.
  • Adoption at large scale.
  • Default implementation of the OCI specs.

Cons

With runC, the developer must construct or export filesystem bundles from other systems to create a starting point for a container. They also need to put together the JSON configuration file, as the runC binary itself has a simple start, stop, pause, etc. interface with no flags for configuring the container.
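To give a feel for what "putting together the JSON configuration file" involves, here is a minimal Python sketch that lays out a bundle directory with a config.json next to a rootfs directory. The field names follow the OCI runtime specification, but the values and the helper names (build_config, write_bundle) are my own assumptions, and this reduced file is not guaranteed to be enough to start a real container. In practice, runc spec generates a complete template, and the rootfs has to be populated, for example by exporting it from an existing container.

```python
import json
import os
import tempfile


def build_config(args, hostname="demo", rootfs="rootfs"):
    """Return a small OCI-runtime-style configuration as a dict."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # the command the container will run
            "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        # Path of the root filesystem, relative to the bundle directory.
        "root": {"path": rootfs, "readonly": True},
        "hostname": hostname,
        "linux": {
            # Namespaces the runtime should create for the container.
            "namespaces": [
                {"type": t} for t in ("pid", "network", "ipc", "uts", "mount")
            ]
        },
    }


def write_bundle(bundle_dir, config):
    """Lay out a bundle: <bundle_dir>/config.json next to <bundle_dir>/rootfs/."""
    os.makedirs(os.path.join(bundle_dir, config["root"]["path"]), exist_ok=True)
    config_path = os.path.join(bundle_dir, "config.json")
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=2)
    return config_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as bundle:
        path = write_bundle(bundle, build_config(["sh", "-c", "echo hello"]))
        with open(path) as handle:
            print(handle.read())
```

Everything a higher-level runtime such as containerd does before calling runC boils down to preparing a directory like this one, with a real root filesystem inside it.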

CRI-O:

CRI-O, a higher-level runtime, is an implementation of the Kubernetes CRI. The name combines the Container Runtime Interface (CRI) and the Open Container Initiative (O).

It is a lightweight alternative to using Docker as the runtime for Kubernetes. It allows Kubernetes to use any OCI-compliant runtime as the container runtime for running pods. Today it supports runc and Kata Containers as the low-level container runtimes, but in principle any OCI-compliant runtime can be plugged in.

CRI-O has only one major task: fulfilling the Kubernetes CRI. To achieve that, it uses runc for basic container management at the back end, while a gRPC server provides the API at the front end. Everything in between is done either by CRI-O itself or by core libraries like containers/storage or containers/image.

There are two components in CRI-O:

  • crio daemon (similar to containerd): handles storage, networking and images. It also talks to the Kubelet.
  • conmon (similar to the docker shim): a small process that monitors each container.

In brief, CRI-O leverages all of the OCI standards and performs the tasks below:

  • Runs containers using OCI runtime tools, defaulting to runC.
  • Manages container images following the OCI image specification.
  • Uses OCI runtime-tools to generate the OCI runtime specification.
  • Uses CNI to set up container networking.
  • Uses containers/image to pull container images from container registries like docker.io.

Pros

  • OCI-compliant and Kubernetes-ready runtime.
  • The fast container monitoring process (conmon) makes it restart safe: when you restart CRI-O, the containers are not restarted, unlike a stack where the runtime is coupled with the daemon.
  • Extendable to other runtimes such as Kata.
  • Red Hat OpenShift uses CRI-O as its container runtime, so the adoption rate is good.
  • CRI-O removes one hop compared with Docker, whose standard packaging involves three daemons.

Cons

  • No built-in tooling for building or managing images; Buildah or Skopeo is used for this purpose.
  • It cannot run containers outside of Kubernetes. However, Red Hat has developed Podman, which fills this gap.

Kata Containers:

Kata Containers is an open source project hosted by the OpenStack Foundation. Kata adheres to the Open Container Initiative (OCI) and serves as a low-level container runtime. It supports industry standards including the OCI container format and the Kubernetes CRI, as well as legacy virtualization technologies. Kata Containers is a combination of two projects, runV and Clear Containers.

Unlike the runC runtime, the Kata Containers runtime uses a hypervisor to provide isolation when spawning containers. It creates a lightweight virtual machine and puts the containers inside. As a result, each container (or pod) runs on its own kernel rather than the host's, which removes the security limitations of sharing a kernel under the traditional runC runtime. Kata Containers plugs seamlessly into existing container orchestration platforms like Kubernetes.

Source https://medium.com/kata-containers/why-kata-containers-doesnt-replace-kubernetes-75e484679727

Pros

  • Security: runs in a dedicated kernel, providing isolation of network, I/O and memory, and can use hardware-enforced isolation through virtualization (VT) extensions.
  • Performance: delivers performance consistent with standard Linux containers, with increased isolation but without the performance tax of standard virtual machines.
  • Kata Containers are useful for stateful applications, where the storage can be attached to a virtual machine.

Cons

  • Generally needs bare metal.
  • Slower than runC.

Cri-containerd:

Cri-containerd (the CRI plugin) is an implementation of CRI for containerd. It operates on the same node as the Kubelet and containerd. Placed between Kubernetes and containerd, cri-containerd handles all CRI service requests from the Kubelet and uses containerd to manage containers and container images. Compared with dockershim, the Docker CRI implementation, cri-containerd eliminates an extra hop in the stack, making the stack more stable and efficient.

Here is a snapshot of some other container runtimes and tools that are not integrated with Kubernetes but are used in various scenarios:

LXC (Linux Containers):

LXC is a more virtual-machine-like container runtime. It boots an OS (including systemd) inside a container, which is then managed as though it were a virtual machine. It is an OS-level virtualization technology that allows us to create and run multiple isolated Linux environments on a single host. LXC leverages Linux kernel functions such as cgroups and namespaces. Even Docker used LXC as its runtime in its very first version in 2013.

Components of LXC

LXD is the daemon that manages containers and images and acts as a container hypervisor. LXD is built on top of LXC to provide a new, better user experience. Underneath, LXD uses LXC through liblxc to create and manage the containers.

Pros

  • One of the advantages of using the LXC/LXD container runtime is that it can convert a virtual machine (as is) to a container and import that container into a container runtime (e.g. the Docker runtime).
  • The “nova-lxd” project provides an OpenStack Nova plugin that seamlessly integrates system containers into a regular OpenStack deployment.
  • LXD works on any recent Linux distribution. LXD upstream directly maintains the Ubuntu packages and also publishes a snap package which can be used with most of the popular Linux distributions.

Cons

  • Not integrated with Kubernetes
  • Not an OCI compliant runtime
  • LXC/LXD has got its own image standard
  • Less adoption

Podman:

Podman is a container engine developed by Red Hat. Its commands mirror Docker's, e.g. docker run and podman run return the same result.

The main difference is that Podman, unlike Docker, does not need a daemon to work. Its developers have decentralized the components necessary for container management and split them into smaller components that are used only when necessary.

Similar to other common container engines (Docker, CRI-O, containerd), Podman relies on an OCI-compliant container runtime (runc, crun, runv, etc.) to interface with the operating system and create the running containers. Containers under the control of Podman can be run either by root or by a non-privileged user.

Podman is capable of running containers in exactly the same way Docker does, but it is also capable of running pods.

The Podman approach is simply to interact directly with the image registry, with the container and image storage, and with the Linux kernel through the runC container runtime process (not a daemon).
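Because the two command lines mirror each other, a script can treat the engine name as a parameter. The Python sketch below illustrates this; the helper names and the alpine image are assumptions chosen for the example, and the only engine options used are run and --rm, which both tools accept.

```python
import shutil
import subprocess


def build_run_command(engine, image, command, remove=True):
    """Build the argv for '<engine> run [--rm] <image> <command...>'."""
    argv = [engine, "run"]
    if remove:
        argv.append("--rm")  # delete the container once it exits
    argv.append(image)
    argv.extend(command)
    return argv


def run_with_available_engine(image, command, engines=("podman", "docker")):
    """Run the command with the first engine found on PATH.

    Returns the CompletedProcess, or None if no engine is installed.
    """
    for engine in engines:
        if shutil.which(engine):
            return subprocess.run(
                build_run_command(engine, image, command),
                capture_output=True,
                text=True,
                check=False,
            )
    return None


if __name__ == "__main__":
    print(build_run_command("docker", "alpine", ["echo", "hello"]))
    print(build_run_command("podman", "alpine", ["echo", "hello"]))
```

The two printed command lines differ only in their first word. With Docker the command is sent to the dockerd daemon, whereas Podman starts the container without one.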

Buildah:

Buildah, from Red Hat, is a command line tool for building Open Container Initiative-compatible images quickly and easily. It can act as a drop-in replacement for the docker build command, which relies on the Docker daemon.

Buildah is flexible enough to build images with the tools you prefer. It is easy to incorporate into scripts and build pipelines, and it doesn't require a running container daemon to build an image.

Buildah differs from building images with the docker command in the following ways:

  • No daemon: it bypasses the Docker daemon, so no container runtime (Docker, CRI-O or other) is needed to use Buildah.
  • Base image or scratch: it can build an image based on another container, but also from an empty image (scratch).
  • External build tools: it doesn't include build tools within the image itself. As a result, it reduces the size of the image and makes the image more secure, since the software used to build the container (like gcc, make and yum) is not part of it.

Skopeo:

Skopeo is all about working with images in remote repositories — transferring them, inspecting them, and even deleting them.

Skopeo is a command line utility that performs various operations on container images and image repositories. Skopeo does not require the user to be running as root to do most of its operations. It does not require a daemon to be running to perform its operations.

Conclusion:

There are other container runtimes available as well, but my intention was to help you understand container runtimes at a high level and the thinking behind using them. I will cover each of these in detail in another blog. For now, I hope this effort helps you in understanding the concept.

I want to sincerely thank my colleague and friend Sujith R Pillai, who inspired and helped me in writing my first blog.

References :

https://events19.linuxfoundation.org/wp-content/uploads/2017/11/How-Container-Runtime-Matters-in-Kubernetes_-OSS-Kunal-Kushwaha.pdf

https://kubernetes.io/blog/2017/11/containerd-container-runtime-options-kubernetes/#:~:text=A%20container%20runtime%20is%20software,rkt%2C%20containerd%2C%20and%20lxd.

https://www.youtube.com/watch?v=RyXL1zOa8Bw

https://www.openshift.com/blog/promoting-container-images-between-registries-with-skopeo.

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/building_running_and_managing_containers/building-container-images-with-buildah_building-running-and-managing-containers.

https://pandorafms.com/blog/what-is-podman/

https://www.youtube.com/watch?v=lHv0LVEIPk8
