Building an alpine golden image


How to build an alpine image to base all your containers on.

What’s the point?

If you start building your own container images, repetitions will sooner or later appear in your manifests, as well as divergences when you change something and forget to propagate it everywhere.

A golden image provides a common base that avoids repetition and eases updates and maintenance. Once it has proven stable enough, a base image acquires the golden status.

A simple golden image

I don’t build anything in a Containerfile (or Dockerfile) anymore: the rationale is explained in “A better way to build containers images”.

If you followed “Building / consuming alpine Linux packages inside containers and images”, you know that, in order to install your own packages with apk (the alpine package tool), you need to add a repository public key and URL system-wide, and that you need to repeat this process for each of your images that uses private packages.

musl’s default memory allocator is also not the fastest around (especially in multithreaded contexts), so you may want to switch to mimalloc for all your alpine deployments.

Well, we can write these two configuration tasks inside a Containerfile:

FROM alpine:3.17
LABEL org.opencontainers.image.authors=eric@itsufficient.me
## TAG is the alpine branch used in the private repository URL below
ARG TAG=3.17
ARG MIMALLOC_VERSION=2.0.9-r0

## add repository public key and URL
COPY domain.my-xxxxxxxx.rsa.pub /etc/apk/keys/
RUN sed -i "1ihttps://apk.domain.my/${TAG}/main" /etc/apk/repositories

## install mimalloc
RUN apk add mimalloc=${MIMALLOC_VERSION}

## use mimalloc per default
# don't forget s6 with-contenv in services to inherit variables
ENV LD_PRELOAD=/lib/libmimalloc.so.2.0
ENV MIMALLOC_LARGE_OS_PAGES=1

You just need a mimalloc package to generate this image, but I’ve got you covered: you can use this APKBUILD file to build it:

# Maintainer: Éric BURGHARD <eric@itsufficient.me>
pkgname=mimalloc
pkgver=2.0.9
pkgrel=0
pkgdesc="mimalloc is a compact general purpose allocator with excellent performance."
url="https://github.com/microsoft/mimalloc"
arch="all"
license="MIT"
makedepends="cmake"
options="!check" # No test suite
subpackages="$pkgname-doc $pkgname-dev"
source="$pkgname-$pkgver.tar.gz::https://github.com/microsoft/$pkgname/archive/v$pkgver.tar.gz"

build() {
	mkdir build && cd build
	cmake -DMI_INSTALL_TOPLEVEL=ON -DCMAKE_INSTALL_PREFIX=/usr ..
	make -j"$(nproc)"
}

package() {
	(cd build && make DESTDIR="$pkgdir" install)
	mv "$pkgdir"/usr/lib "$pkgdir"/lib
	for file in $(ls | grep -i -e license -e copying -e copyright -e changelog -e contributing -e readme -e code_of_conduct); do
		install -m644 -D -t "$pkgdir"/usr/share/doc/"$pkgname" "$file"
	done
}

dev() {
	default_dev
	find "$pkgdir"/lib/cmake/mimalloc/ -name '*.cmake' -exec install -Dm644 {} -t "$subpkgdir"/usr/share/cmake/Modules/ \;
	rm -rf "$pkgdir"/lib/cmake
}

sha512sums="
bf6945bfb600ade35dab34c7f570ee4f69a77612547ad874bbbd989a4e594a6a219c222a22c90c5e36f205aae4d5cd1a5e4651caed5433db275d414c6769bf49  mimalloc-2.0.9.tar.gz
"
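If you followed the packaging post mentioned earlier, building and publishing the package boils down to something like this (a sketch, assuming abuild and your signing key are already configured, and that the APKBUILD sits in a mimalloc/ directory):

cd mimalloc
# regenerate sha512sums after bumping pkgver
abuild checksum
# build the apk and refresh the signed index under ~/packages/
abuild -r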

We can now build a new base image with:

buildah bud -t reg.domain.my/containers/alpine:3.17

and use it as follows in a Containerfile:

FROM reg.domain.my/containers/alpine:3.17
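Before other machines or CI runners can consume it, push the image to your registry (assuming you are already authenticated with buildah login):

buildah push reg.domain.my/containers/alpine:3.17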

A more useful one

Specialization

If we want to use the same image in different contexts, we somehow need to inject the parameters that specialize the container for the workload context. This is inversion of control: the software doesn’t ask for parameters, we must provide them, and this is how Kubernetes works.

But this doesn’t work very well when something changes frequently in the workload’s context, like short-lived secrets. The explanation lies in how the container orchestrator (Kubernetes) injects parameters at run time, which can roughly be classified into 2 categories:

  1. Static ones, which require restarting the pod to be taken into account: environment variables and command-line arguments.

  2. Dynamic ones, which normally don’t require restarting the pod when something changes: ConfigMaps and Secrets mounted as volumes (see the sketch below).
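To make the distinction concrete, here is a sketch using hypothetical names (myservice, myservice-config, myservice.yml):

# static: changing an environment variable edits the pod template
# and triggers a rolling restart of the deployment
kubectl set env deployment/myservice LOG_LEVEL=debug

# dynamic: updating a ConfigMap mounted as a volume is propagated to
# running pods (after a sync delay) without restarting them
kubectl create configmap myservice-config --from-file=myservice.yml \
    --dry-run=client -o yaml | kubectl apply -f -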

If you have frequently changing secrets (rotated every hour, say), and you don’t want your pod to be killed by the orchestrator because it will likely become unresponsive once its secrets expire, then a sidecar container is a simpler solution.

A volume requires (less secure) shared access, a controller (with extended rights) to update the volume content, and a local service to watch for modifications and restart services. While this solution scales better (one controller vs n sidecars), it has broader security implications and no edge when it comes to simplifying operations.

But even the sidecar brings its fair share of:

  • Complexity: The sidecar needs to be configured separately. A mount point, and possibly the process (PID) namespace, need to be shared between the sidecar and the workload container to allow the sidecar to write configuration files and signal processes when modifications happen.

  • Resource consumption: Injecting a binary of several hundred megabytes, like the vault agent in the vault sidecar, does not scale very well when all you need to do is call a REST API to refresh tokens every hour or so. It will certainly be deprecated by Hashicorp in the near future for that reason.

Instead of a sidecar, a better solution is to use a side-process which runs in the same container and automatically has access to its files and process space. If it’s specialized enough (light on resource consumption), it can scale a lot better than a sidecar while having fewer security concerns and being simpler to operate.

We now need a process supervision suite to manage the lifecycle of the workload process and its configuration agent (the side-process), as well as their interdependencies (i.e. waiting for the first generation of the configuration before starting the main process).

A supervision suite (as long as it is lightweight) is a perfect candidate for inclusion in a golden image. It is useful even if you don’t need dynamic configuration, because it guarantees that the responsibilities that come with running as PID 1 are handled correctly (something some images don’t care about, or simply defer to dumb-init and the like).

I’m not considering systemd, which:

  • is clearly oversized (did I say bloated?) for the task,

  • doesn’t work seamlessly inside containers, and

  • is not portable without glibc.

A small supervision suite

The natural choice for process management under alpine is s6, which should at some point become its official service manager.

s6 works by running lightweight long-lived daemons that supervise other processes, and offers simple and effective signaling and readiness notification mechanisms.

  • The main entry point is s6-svscan, which should run as PID 1 (your container entry point) and is the main supervisor.

  • The services themselves are usually execline scripts organized in service directories, each supervised by a separate instance of another lightweight daemon: s6-supervise.

This is a low-level description, and even though some work is underway (s6-rc, s6-frontend) to make it more declarative and to express service interdependencies easily, it is kind of hardcore to use directly if all you want is to start a few services.

s6-overlay to the rescue

The quickest way of using s6 in your container is s6-overlay, which contains the required scripts, organized in init stages, to start your services without thinking too much about the technical details.

An official alpine package exists, but it depends on other s6 packages, and I found this subdivision very impractical for maintenance reasons: as the revisions of the dependencies are not clearly stated in its apk manifest, you can introduce runtime bugs if you have different versions of the s6 software in your private repository (to stay on the edge, or to backport a newer s6 to older alpine versions). You also end up managing 9 packages or so instead of just one.

As I never use the s6 tools separately from one another, I chose instead to make an all-in-one s6-overlay package which includes everything, statically linked. Nothing clever here: it simply relies on the s6-overlay Makefile, which already fetches and compiles all dependencies using the right (hard-coded) revisions. Feel free to grab it and adapt it to your needs.

# Maintainer: Éric BURGHARD <eric@itsufficient.me>
pkgname=s6-overlay
pkgver=3.1.3.0
pkgrel=0
_pkgdesc="s6 overlay for containers"
pkgdesc="$_pkgdesc"
url="https://github.com/just-containers/s6-overlay"
arch="all"
license="ISC"
makedepends="xz linux-headers"
source="$pkgname-$pkgver.tar.gz::https://github.com/just-containers/$pkgname/archive/v$pkgver.tar.gz"
install="$pkgname.post-install"
subpackages="$pkgname-scripts::noarch $pkgname-symlinks::noarch $pkgname-syslogd::noarch"
builddir="$srcdir/$pkgname-$pkgver"
options="!check suid"

build() {
	cd "$builddir"
	make
}

package() {
	cd "$builddir"
	mkdir -p "$pkgdir"
	tar xf output/$pkgname-$(uname -m).tar.xz -C "$pkgdir"
	# remove suid flags otherwise postcheck() fails. Add it again in post-install
	chmod -s "$pkgdir"/package/admin/s6-overlay-helpers/command/s6-overlay-suexec
}

scripts() {
	pkgdesc="$_pkgdesc - scripts"
	cd "$builddir"
	mkdir -p "$subpkgdir"
	tar xf output/$pkgname-noarch.tar.xz -C "$subpkgdir"
}

symlinks() {
	pkgdesc="$_pkgdesc - symlinks"
	cd "$builddir"
	mkdir -p "$subpkgdir"
	tar xf output/$pkgname-symlinks-noarch.tar.xz -C "$subpkgdir"
	tar xf output/$pkgname-symlinks-arch.tar.xz -C "$subpkgdir"
}

syslogd() {
	pkgdesc="$_pkgdesc - syslogd"
	cd "$builddir"
	mkdir -p "$subpkgdir"
	tar xf output/syslogd-overlay-noarch.tar.xz -C "$subpkgdir"
}

sha512sums="
30e8aa212d29ff185252d8695ffa845ef1dadafc0f133b235bce2caf73ef90cccacd4678ea5e4e72eb9092276ba47fcfa5a10ea1985568c2985e41c6841748f0  s6-overlay-3.1.3.0.tar.gz
"

Definition of our services

Going back to the side-process and configuration/secret management, I developed a small tool in rust for that purpose: rconfd. It is similar to consul-template, but smaller, faster, and with an intentionally narrower scope (vault, jsonnet, and a few backends). You can use it with Kubernetes (using the Kubernetes Auth Method) or in CI/CD (using the JWT/OIDC Auth Method).

How to use rconfd should normally be the subject of another blog post, but let’s see how to start the service and how other dependent services can wait for it.

s6 services are declared as separate directories in /etc/services.d:

└── etc
    └── services.d
        └── rconfd
            ├── notification-fd
            └── run

notification-fd contains an integer: the file descriptor number that will be used by the service to signal its readiness status. It is usually 3 as 0 to 2 are taken by the standard descriptors (stdin, stdout, stderr).

run is an executable (don’t forget execution rights), and execline is the natural choice for scripting your services.
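Concretely, the service directory can be prepared with a couple of commands before building the image (a sketch, run from the directory containing your Containerfile):

mkdir -p etc/services.d/rconfd
# readiness file descriptor, matching rconfd's -r 3 in the run script below
echo 3 > etc/services.d/rconfd/notification-fd
# once the run script below is written, don't forget the execution rights
chmod +x etc/services.d/rconfd/run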

etc/services.d/rconfd/run

#!/command/execlineb -P
with-contenv
foreground { /usr/bin/rconfd -D -j /etc/rconfd -r 3 }
importas -u ? ?
if { eltest ${?} = 0 }
s6-pause

execline is not an interpreter but a parser, although it offers the same Turing completeness as bash. Even if I use new lines to separate commands in the script above, you should read it as a one-line instruction. The commands you normally use in execline scripts are standalone executables that consume their arguments and execute into something else (much like env in shell scripts), passing along the remaining arguments (chain loading).

execline parses the script only once at startup to construct the chain of arguments, then replaces itself with the first command of the script. Only one command stays in memory at any given step. It is much more secure and efficient than a full-blown interpreter, generally subject to all kinds of parsing and injection exploits, that stays in memory until the very last instruction. It’s a perfect fit for starting services in containers: lightweight, fast, and secure, as long as the logic stays simple and the script is smaller than 4KB.

foreground is used here because rconfd doesn’t do chain loading. Like other daemons (-D argument) it normally never returns unless it encounters an error. Nonetheless, rconfd can return with a success code if it generates the configuration files and has nothing more to do (when it detects that only static secrets are used, for instance). In that case we replace rconfd with the smallest possible daemon, s6-pause. This is a nice trick (used by docker as well with pause containers) to tell s6: “it’s OK, we are (kind of) still running normally, so please don’t restart us” (s6-rc has the concept of a one-shot service).

When rconfd successfully starts, gets access to secrets and generates configuration files, it will signal on file descriptor 3 (-r 3) that it is ready. The given integer should match the one written in the notification-fd file.

Putting everything together

We now have all the required parts to assemble our image.

Here is the Containerfile I use for my golden image. I tag the image using the s6-overlay version (the --build-arg TAG=w.x.y.z argument of buildah).

FROM reg.domain.my/containers/alpine:3.17
LABEL org.opencontainers.image.authors=eric@itsufficient.me
ARG TAG
ARG RCONFD_VERSION=0.11.4-r0

## install s6
RUN apk add --no-cache \
    s6-overlay=${TAG}-r0 \
    s6-overlay-scripts=${TAG}-r0 \
    rconfd=${RCONFD_VERSION}

## add s6 configuration
ADD etc /etc

## run s6 as root
ENV TERM xterm
USER root
ENTRYPOINT ["/init"]
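Building and tagging it for s6-overlay 3.1.3.0 then looks like this (the same invocation as before, with the TAG build argument added):

buildah bud --build-arg TAG=3.1.3.0 -t reg.domain.my/containers/s6-overlay:3.1.3.0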

How to use it

Let’s say we need a k8s container image for a hypothetical myservice that needs a configuration file with secrets fetched from a vault server. This is how we could use our golden image (of course we need to have compiled/deployed a private myservice apk beforehand):

Containerfile

FROM reg.domain.my/containers/s6-overlay:3.1.3.0
LABEL org.opencontainers.image.authors=eric@itsufficient.me
ARG TAG

RUN apk add --no-cache myservice=${TAG}-r1

## add s6 configuration
ADD etc /etc
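As before, the image is built by passing the (hypothetical) version of myservice as the TAG build argument:

buildah bud --build-arg TAG=1.2.3 -t reg.domain.my/containers/myservice:1.2.3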

Now the execline script to start our service:

etc/services.d/myservice/run

#!/command/execlineb -P
with-contenv

# passively wait for configuration to be ready
foreground { s6-svwait -U /var/run/s6/legacy-services/rconfd }

# double check that everything is ok and our config file is present
importas -u ? ?
if { eltest ${?} = 0 -a -f /etc/myservice.yml }

cd /var/lib/myservice
s6-setuidgid myservice myservice

I elided the rconfd files (/etc/rconfd/*.{json,jsonnet}) on purpose, but you understood that rconfd is responsible for generating the /etc/myservice.yml referenced in the above script.

In case the service is critical and the container should stop if the service fails (the default is to restart it endlessly), just add the following script in the service directory:

etc/services.d/myservice/finish

#!/command/execlineb -S1
# ${1} is the exit code of the service (256 means it was killed by a signal)
if { eltest ${1} -ne 0 }
if { eltest ${1} -ne 256 }
/run/s6/basedir/bin/halt

That’s it.

What we have so far

Golden images hierarchy

We have a base image (alpine:3.17) that we can use in place of the official alpine one, which is able to consume packages from our private repository and changes the default memory allocator.

We have a golden image (s6-overlay:3.1.3.0), based on the latter, that offers process supervision and includes a configuration service that can fetch secrets from a vault server, (re)generate configuration files, and signal other services when something changes.

In case we don’t need process supervision, we can rely on our minimalist base image: in the picture above, we derive a build image containing some GitLab tools (build:3.17) and then define sub-images containing pre-installed packages (a tool chain) for a given language (build/rust:1.64.9) to speed up compilation jobs in CI pipelines.

By using apk in Containerfiles, we avoid duplicating work in CI (like resource-consuming compilations), and the process of creating or rebasing new images could not be faster, as it is limited to just packaging existing files together. This is a crucial point in case of a critical emergency patch.

When a new alpine version is published, we just have to rebuild all the required apks and images once in CI, and then update the tag used in the FROM instruction of each Containerfile to rebase all our images on the new version. This job can be driven in CI/CD by simply watching GitHub RSS feeds and using commits and triggers along the images’ dependency chain.

As usual, I highly appreciate feedback, comments, or alternative ways of doing the same thing in the comments area below.

Éric BURGHARD

