There are many ways of building multi-process Docker containers (by multi-process I mean containers that run multiple processes simultaneously). Over time I have found supervisord to be the easiest way to achieve good and reliable outcomes. This post describes the setup I use.
Basics
Most of my containers follow a setup similar to the one below:
Generally the long-running processes are started by the initial configuration process, once the configuration files have been adjusted.
The configuration process tends to follow the same flow:
- Check and validate environment variables
- Check if required files/directories are present
- Generate configuration files
- Verify configuration files
- Start the process
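As a rough sketch, that flow maps onto a shell script like the one below. Every name, path and the APP_DOMAIN variable are hypothetical placeholders, not taken from a real container:

```shell
#!/bin/sh
# Sketch of the five-step flow; every name and path here is a placeholder.
set -e

# Demo default so the sketch runs standalone; in a container this would
# come from `docker run -e APP_DOMAIN=...`
APP_DOMAIN="${APP_DOMAIN:-example.test}"

# 1. Check and validate environment variables
[ -n "${APP_DOMAIN}" ] || { echo "APP_DOMAIN not set, can't continue"; exit 1; }

# 2. Check that required files/directories are present
[ -d /tmp ] || { echo "/tmp is missing"; exit 1; }

# 3. Generate configuration files
echo "domain = ${APP_DOMAIN}" > /tmp/app.conf

# 4. Verify configuration files
grep -q "^domain = " /tmp/app.conf || { echo "generated config is broken"; exit 1; }

# 5. Start the process (a stand-in echo here; in a real container this
# would be a supervisorctl start, or an exec of the long-running binary)
echo "starting main process for ${APP_DOMAIN}"
```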
A real-life example
I've created a simple proxy with certbot (nginx-certbot-revproxy) that uses this technique.
Dockerfile
This container is intended to be used in only one way, hence I chose CMD to start it. The complete Dockerfile:
FROM alpine
COPY ./*.sh /
COPY ./*.py /
COPY ./configs/nginx* /configs/
COPY ./configs/supervisord.conf /etc/supervisord.conf
RUN apk add nginx certbot curl openssl supervisor && \
chmod +x /*.sh /*.py && \
mkdir -p /var/run/nginx/ && \
mkdir -p /var/letsencrypt
ADD https://letsencrypt.org/certs/lets-encrypt-x3-cross-signed.pem.txt /lets-encrypt-x3-cross-signed.pem
EXPOSE 80
EXPOSE 443
CMD [ "/usr/bin/supervisord", "-c", "/etc/supervisord.conf" ]
Not much to unpack here:
- Use alpine as the base
- Copy all shell and python scripts
- Copy configuration files
- Install the required packages, make the scripts executable and create directories
- Add a file fetched from a URL
- Expose both ports 80 and 443
- Specify supervisord (with its config file) as the command to run when the container starts
Supervisord.conf
This is where all the heavy lifting happens. Let's go through it section by section (the complete file is on GitHub).
[supervisord]
user=root
loglevel=warn
nodaemon=true
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
Firstly - it runs as root (no point changing this to another user, as this is the de facto init process). Reducing the verbosity helps with making the application logs clearer (otherwise they tend to get interspersed with supervisord logs). Making sure that it runs in the foreground is critical here - if the process exits, Docker will see that the container has stopped. The rpcinterface section is required by supervisord, but it's not supposed to be changed from the default values.
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
username=admin
password=revproxy
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
username=admin
password=revproxy
This section gives supervisorctl access to the supervisord process via the API, to start/stop processes. Authentication is set up to avoid the CRIT warning, but it's not strictly necessary since supervisord runs inside a container.
[program:initsh]
command=/init.sh
autorestart=false
startsecs=0
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
This section starts the initial configuration script. It specifies the location of the script, that the process should not be restarted when it exits, and that it is allowed to exit immediately after starting (startsecs=0). The next three lines make sure that stderr is redirected to stdout, that stdout is sent to supervisord's stdout, and that log rotation is disabled (stdout_logfile_maxbytes=0), which is required when the log target is a file descriptor like /dev/fd/1 rather than a regular file.
[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=false
startsecs=3
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
This section controls the setup for nginx. The nginx process needs to stay up for at least 3 seconds to be considered started. It should not be autostarted (I start it manually from my init.sh). It also redirects stderr and stdout as above.
[eventlistener:process_monitor]
command=/kill_supervisor.py
process_name=state_monitor
events=PROCESS_STATE
This section creates an eventlistener: a process that receives PROCESS_STATE events on its stdin and is expected to acknowledge them. I use a script that's a slight modification of the default one:
#!/usr/bin/env python
import sys
import os
import signal

def write_stdout(s):
    # only eventlistener protocol messages may be sent to stdout
    sys.stdout.write(s)
    sys.stdout.flush()

def write_stderr(s):
    sys.stderr.write(s)
    sys.stderr.flush()

def main():
    while 1:
        # transition from ACKNOWLEDGED to READY
        write_stdout('READY\n')
        # read the header line and print it to stderr
        line = sys.stdin.readline()
        write_stderr(line)
        # parse the header, then read and discard the event payload
        headers = dict([x.split(':') for x in line.split()])
        sys.stdin.read(int(headers['len']))
        if headers['eventname'] == 'PROCESS_STATE_FATAL':
            try:
                with open('/supervisord.pid', 'r') as pidfile:
                    pid = int(pidfile.readline())
                os.kill(pid, signal.SIGQUIT)
            except Exception as e:
                write_stderr('Could not kill supervisor: ' + str(e) + '\n')
        # transition from READY to ACKNOWLEDGED
        write_stdout('RESULT 2\nOK')

if __name__ == '__main__':
    main()
It checks if any of the processes got itself into the 'broken' PROCESS_STATE_FATAL state, which means that it couldn't be started and stay up for the prescribed amount of time. In that case the script reads the PID of supervisord and sends it a SIGQUIT, which stops the container (and allows Docker to restart it properly).
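For reference, the header line that supervisord writes to the listener looks roughly like the sample below (the serial numbers here are made-up example values), and the script's one-liner parses it into a dict of strings:

```python
# Parse a sample eventlistener header line the same way the script does.
# The serial/pool numbers are made-up example values.
line = ('ver:3.0 server:supervisor serial:21 pool:state_monitor '
        'poolserial:10 eventname:PROCESS_STATE_FATAL len:58\n')

headers = dict([x.split(':') for x in line.split()])

print(headers['eventname'])  # PROCESS_STATE_FATAL
print(headers['len'])        # 58
```

The `len` header tells the listener how many bytes of event payload follow, which is why the script reads exactly `int(headers['len'])` bytes before acknowledging.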
init.sh
This script checks the environment variables and carries out other preparations. There are a few important things to remember here.
if [ "${DOMAIN}x" == "x" ];
then
echo "DOMAIN not set, can't continue";
kill -SIGQUIT 1
exit
fi
If the script has to exit due to an error, it must first send a signal to supervisord to let it know to exit (and stop the container).
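That "signal supervisord, then exit" error path can be wrapped in a small helper. This is only a sketch: `die` and `SUPERVISOR_PID` are hypothetical names, and the kill is guarded so the snippet can run outside a container (inside one, the PID would simply be 1):

```shell
#!/bin/sh
# Hypothetical helper mirroring the error path of init.sh.
# Inside the container SUPERVISOR_PID would be 1; when it's unset
# (e.g. running this sketch standalone) the kill is skipped.
die() {
    echo "$1" >&2
    if [ -n "${SUPERVISOR_PID}" ]; then
        kill -QUIT "${SUPERVISOR_PID}"
    fi
    exit 1
}

DOMAIN="${DOMAIN:-example.test}"
if [ "${DOMAIN}x" = "x" ]; then
    die "DOMAIN not set, can't continue"
fi
echo "DOMAIN is ${DOMAIN}"
```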
supervisorctl start nginx
supervisorctl should be used to control the other processes defined in supervisord.conf. In most cases start, stop and restart are all that's needed.
The script is expected to exit once it's done.