Using supervisord as the init process of a Docker container
There are many ways of building a multi-process Docker container (by multi-process I mean one where several processes run simultaneously inside the container). Over time I have found supervisord to be the easiest way to achieve good and reliable results. This post describes the setup I use.
Basics
Most of my containers follow a setup similar to the one below:
Generally the long-running processes are started by the initial configuration process, once the configuration files have been adjusted.
The configuration process tends to follow the same flow (a rough sketch follows the list):
- Check and validate environmental variables
- Check if required files/directories are present
- Generate configuration files
- Verify configuration files
- Start the process
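As a rough sketch of what such a script might look like (the template path and the sed substitution below are placeholders, not the real init.sh, which is covered later in this post):
#!/bin/sh
# illustrative outline only; the real init.sh is described further down

# 1. check and validate environment variables
: "${DOMAIN:?DOMAIN not set, can't continue}"

# 2. check that required files/directories are present
[ -d /configs ] || exit 1

# 3. generate configuration files (template path is a placeholder)
sed "s/__DOMAIN__/${DOMAIN}/" /configs/nginx.conf > /etc/nginx/nginx.conf

# 4. verify configuration files
nginx -t || exit 1

# 5. start the long-running process (via supervisord, as described below)
supervisorctl start nginx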
A real-life example
I've created a simple proxy with certbot (nginx-certbot-revproxy) that uses this technique.
Dockerfile
This container is intended to be used in only one way, hence I chose CMD to start it. The complete Dockerfile:
FROM alpine
COPY ./*.sh /
COPY ./*.py /
COPY ./configs/nginx* /configs/
COPY ./configs/supervisord.conf /etc/supervisord.conf
RUN apk add nginx certbot curl openssl supervisor && \
    chmod +x /*.sh /*.py && \
    mkdir -p /var/run/nginx/ && \
    mkdir -p /var/letsencrypt
ADD https://letsencrypt.org/certs/lets-encrypt-x3-cross-signed.pem.txt /lets-encrypt-x3-cross-signed.pem
EXPOSE 80
EXPOSE 443
CMD [ "/usr/bin/supervisord", "-c", "/etc/supervisord.conf" ]
Not much to unpack here:
- Use alpine as the base
- Copy all shell and python scripts
- Copy configuration files
- Add required packages, make scripts executable and create directories
- Add the Let's Encrypt intermediate certificate, fetched from a URL
- Expose ports 80 and 443
- Run supervisord with its configuration file when the container starts
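To give an idea of how such an image might be built and run (the image name is arbitrary here, and DOMAIN is the only environment variable covered in this post, so a real deployment may need more):
docker build -t nginx-certbot-revproxy .
docker run -d -p 80:80 -p 443:443 -e DOMAIN=example.com nginx-certbot-revproxy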
supervisord.conf
This is where all the heavy lifting happens. Let's go through it section by section (the complete file is here, on GitHub).
[supervisord]
user=root
loglevel=warn
nodaemon=true
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
Firstly, it runs as root (there is no point changing this to another user, as this is the de facto init process). Reducing the verbosity helps keep the application logs clear (otherwise they tend to get interspersed with supervisord's own logs). Making sure that it runs in the foreground is critical here: if the process exits, Docker will see the container as stopped. The rpcinterface section is required by supervisord and is not supposed to be changed from the default values.
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
username=admin
password=revproxy
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
username=admin
password=revproxy
These two sections give supervisorctl access to the supervisord process via its API to start and stop processes. The credentials are only there to avoid the CRIT warning; they are not strictly needed, since supervisord runs inside a container.
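With the socket and matching [supervisorctl] credentials in the same file, you can also inspect the processes from the host. A quick example (the container name revproxy is illustrative):
docker exec -it revproxy supervisorctl -c /etc/supervisord.conf status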
[program:initsh]
command=/init.sh
autorestart=false
startsecs=0
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
This section starts the initial configuration script. It specifies the location of the script, states that the process should not be restarted when it exits, and allows it to exit immediately after starting (startsecs=0). The next three lines make sure that stderr is redirected to stdout, that stdout is sent to supervisord's own stdout (/dev/fd/1), and that log rotation is disabled (maxbytes set to 0), so the script's output appears directly in the container logs.
[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=false
startsecs=3
redirect_stderr=true
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
This section controls the setup for nginx. The nginx process needs to stay up for at least 3 seconds to be considered started. It should not be autostarted (I start it manually from my init.sh). stderr and stdout are redirected in the same way as above.
[eventlistener:process_monitor]
command=/kill_supervisor.py
process_name=state_monitor
events=PROCESS_STATE
This section creates an eventlistener: a process that receives PROCESS_STATE events on its stdin and is expected to acknowledge them. I use a script that is a slight modification of the default one:
#!/usr/bin/env python
import sys
import os
import signal

def write_stdout(s):
    # only eventlistener protocol messages may be sent to stdout
    sys.stdout.write(s)
    sys.stdout.flush()

def write_stderr(s):
    sys.stderr.write(s)
    sys.stderr.flush()

def main():
    while 1:
        # transition from ACKNOWLEDGED to READY
        write_stdout('READY\n')
        # read the header line and print it to stderr
        line = sys.stdin.readline()
        write_stderr(line)
        # parse the header and read (and discard) the event payload
        headers = dict([x.split(':') for x in line.split()])
        sys.stdin.read(int(headers['len']))
        if headers['eventname'] == 'PROCESS_STATE_FATAL':
            try:
                with open('/supervisord.pid', 'r') as pidfile:
                    pid = int(pidfile.readline())
                os.kill(pid, signal.SIGQUIT)
            except Exception as e:
                write_stderr('Could not kill supervisor: ' + str(e) + '\n')
        # transition from READY to ACKNOWLEDGED
        write_stdout('RESULT 2\nOK')

if __name__ == '__main__':
    main()
It checks whether any of the processes got itself into the 'broken' PROCESS_STATE_FATAL state, which means the process could not be started and stay up for the prescribed amount of time. In that case the script reads the PID of supervisord and sends it a SIGQUIT, which stops the container (and allows Docker to restart it properly).
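For reference, the header line the listener parses looks roughly like this (the values are illustrative; the format comes from the supervisor eventlistener protocol):
ver:3.0 server:supervisor serial:21 pool:process_monitor poolserial:10 eventname:PROCESS_STATE_FATAL len:58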
init.sh
This script checks the environment variables and carries out other preparations. There are a few important things to remember here.
if [ "${DOMAIN}x" == "x" ];
then
echo "DOMAIN not set, can't continue";
kill -SIGQUIT 1
exit
fi
If the script has to exit due to an error, it must first send a signal to supervisord to let it know to exit (and stop the container).
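If several variables are checked this way, a small helper keeps the script readable (a sketch, not part of the actual repo):
# hypothetical helper: log the error, ask supervisord (PID 1) to quit, then bail out
die() {
    echo "$1"
    kill -SIGQUIT 1
    exit 1
}

[ -n "${DOMAIN}" ] || die "DOMAIN not set, can't continue"
Once the preparations are complete, the long-running processes are started through supervisorctl: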
supervisorctl start nginx
supervisorctl should be used to control the other processes defined in supervisord.conf. In most cases start, stop and restart are all that's needed. The script is expected to exit once it's done.
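For example, if the script regenerates the nginx configuration later on (say, once certificates are in place), the same mechanism can verify and apply it (a hypothetical snippet):
# hypothetical: verify the regenerated configuration, then apply it
nginx -t && supervisorctl -c /etc/supervisord.conf restart nginx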