Building multi-process Docker containers
The Docker docs (https://docs.docker.com/config/containers/multi-service_container/) aren't as categorical as they once were about multi-process containers, but building one still comes at the cost of additional complexity and potentially decreased reliability.
Single process containers
Let's start with the Docker default case: a single-process container. Its lifecycle is very straightforward, because there is only one process to watch and that process is forked by the Docker process itself, so if the container process exits (gracefully or not) the Docker process is notified and can act accordingly. When a signal needs to be sent, Docker can send it directly to the process (or, to be more specific, to the process group).
Multi-process containers - using Bash
TL;DR Bash is not the best choice when it comes to containers with multiple processes. It can be made to work, but unless there are other reasons to use it I suggest trying a different approach.
Bash feels like an easy way of solving multiple problems here. Many good containers use a Bash script as the ENTRYPOINT. I'll use my nginx-certbot-revproxy container as an example here. I needed two functions to run inside it: the nginx process and a script to renew the certificate periodically. Initially I also needed an extra process to request the certificate if one wasn't supplied.
In the resulting process tree, some processes were expected to keep running (nginx), some were transient (both certbot runs) and some were expected to run for a longer period of time but then exit (sleep).
There are two key things to take care of:
- Keeping processes running
- Handling signals
Keeping processes running
This simply means restarting processes when they unexpectedly stop. The easiest way is to just run them in a loop:
#!/bin/bash
# Restart nginx whenever it exits, pausing for a second between attempts
while true; do nginx -g "daemon off;"; sleep 1; done
Then the code can be put into a separate script so it can be called when the container starts:
#!/bin/bash
start_nginx
That will keep restarting the process no matter what. Let's look at a real example here. After starting such a container the process tree looks like this:
PID TTY STAT TIME COMMAND
1 ? Ss 0:00 /bin/bash /entrypoint.sh
8 ? S 0:00 /bin/sh /start-nginx.sh
9 ? S 0:00 \_ nginx: master process nginx -g daemon off; -c /configs/nginx-http-only.conf
10 ? S 0:00 \_ nginx: worker process
Let's pretend that the master nginx process exited abruptly:
# kill -SIGKILL 9
And the process tree looks like this now:
PID TTY STAT TIME COMMAND
1 ? Ss 0:00 /bin/bash /entrypoint.sh
8 ? S 0:00 /bin/sh /start-nginx.sh
60 ? S 0:00 \_ nginx -g daemon off; -c /configs/nginx-http-only.conf
10 ? S 0:00 nginx: worker process
Looks healthy, doesn't it? But in reality there's an issue here that can be spotted by looking at the logs:
$ docker logs my_container1
Killed
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
nginx: [emerg] still could not bind()
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
This is because the worker process forked by the original nginx master keeps running in the background and still holds port 80, preventing the new nginx from starting. So keeping a process running is not always enough.
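One possible way to deal with leftover children in a Bash-only setup is to run the service in its own process group and signal the whole group before restarting. The following is a minimal sketch, not the approach used in the container described here. The function name run_and_reap and the LEFTOVERS variable are made up for illustration, and the sketch assumes the nginx workers stay in the master's process group.

```bash
#!/bin/bash
# Sketch: run a command, then clean up anything it left behind.
set -m   # job control on: every background job gets its own process group

# run_and_reap CMD [ARGS...]   (hypothetical helper)
# Runs CMD, waits for it to exit, then sends SIGTERM to whatever is still
# alive in its process group (for nginx that would be orphaned workers).
# Sets LEFTOVERS=1 if there was something left to signal.
run_and_reap() {
    LEFTOVERS=0
    "$@" &
    local pid=$!          # with `set -m` this is also the process group ID
    local status=0
    wait "$pid" || status=$?
    # A negative PID addresses the whole process group.
    if kill -s TERM -- "-$pid" 2>/dev/null; then
        LEFTOVERS=1
    fi
    return "$status"
}

# Possible use in place of the plain restart loop (not executed here):
#   while true; do run_and_reap nginx -g "daemon off;"; sleep 1; done
```

This only covers children that remain in the same process group. A process that moves itself to a new session or group would still escape the cleanup.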
Handling signals
There is another problem here: stopping that container is not that easy either. Bash running as the container's main process effectively ignores both SIGTERM and SIGINT by default, meaning that Docker will have to resort to SIGKILL to stop the processes. That also means that a restart will take longer.
To help with this, the signals could be trapped inside the Bash script:
trap "echo 'stopping now'; kill -SIGINT -1" INT TERM EXIT
but there's a catch: Bash only runs the trap handler once the command it is currently waiting on has finished, which is not always feasible. In my case the foreground process is never expected to finish, which means that the signal handler will not be executed at all.
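A common way around this is to start the long-running process in the background and block on the wait builtin instead, because Bash interrupts wait as soon as a trapped signal arrives. The sketch below illustrates that pattern under a few assumptions: sleep stands in for the real service, the names forward_signal, child_pid and got_signal are invented for the example, and the script sends SIGTERM to itself only to simulate what docker stop would do.

```bash
#!/bin/bash
# Sketch: forward SIGTERM/SIGINT to a background child via an interruptible wait.

child_pid=""
got_signal=""

# Hypothetical handler: remember the signal and pass it on to the child.
forward_signal() {
    got_signal=$1
    if [ -n "$child_pid" ]; then
        kill -s "$1" "$child_pid" 2>/dev/null
    fi
    return 0
}
trap 'forward_signal TERM' TERM
trap 'forward_signal INT' INT

# Stand-in for the long-running service (nginx in this article).
sleep 300 &
child_pid=$!

# Demo only: simulate `docker stop` by signalling this script after 1 second.
( sleep 1; kill -s TERM "$$" ) &

# `wait` returns with a status above 128 as soon as a trapped signal arrives,
# and the trap handler runs right away instead of after the child exits.
wait_status=0
wait "$child_pid" || wait_status=$?

# A second wait collects the child's own exit status once it has terminated.
child_status=0
wait "$child_pid" || child_status=$?

echo "signal=$got_signal wait_status=$wait_status child_status=$child_status"
```

The handler still has to decide what to do with the other processes in the container. This example only forwards the signal to a single child.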
Multi-process containers - using supervisord
Supervisord is a relatively lightweight process management framework written in Python. It provides a way to start, stop and monitor processes, and it can be configured to act as the init process inside a Docker container.
Keeping processes running
Supervisord keeps processes running by default. All that is needed is a declaration of the process:
[program:nginx]
command=/usr/sbin/nginx -g "daemon off;"
autostart=false
startsecs=3
As in the Bash case, this is not always sufficient. If the nginx master process dies suddenly, supervisord runs into the same problem as before: the child process is still around and the main one cannot be started. Supervisord does have a significant advantage here: it allows creating an eventlistener, which can respond to the PROCESS_STATE_FATAL event (emitted after the process has failed to start) and stop the whole container by stopping supervisord.
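To make the eventlistener idea more concrete, here is a minimal sketch of such a listener written in Bash. It is not the listener from the container described here. It follows supervisord's event listener protocol (print READY, read a header line, read len bytes of payload, answer with RESULT 2 and OK). The function name listen is invented for the example, and the script assumes it is registered in an [eventlistener:...] section subscribed to PROCESS_STATE_FATAL.

```bash
#!/bin/bash
# Sketch of a supervisord event listener that stops supervisord on FATAL.

# listen [ACTION...]   (hypothetical helper)
# Speaks the listener protocol on stdin/stdout and runs ACTION whenever a
# PROCESS_STATE_FATAL event arrives.
listen() {
    local header token len event payload
    while true; do
        printf 'READY\n'                    # ready to accept the next event
        IFS= read -r header || break        # header: space-separated key:value tokens
        len=0
        event=""
        for token in $header; do
            case "$token" in
                len:*)       len=${token#len:} ;;
                eventname:*) event=${token#eventname:} ;;
            esac
        done
        # The payload is exactly $len bytes and has no trailing newline.
        payload=$(dd bs=1 count="$len" 2>/dev/null)
        if [ "$event" = "PROCESS_STATE_FATAL" ]; then
            echo "fatal event: $payload" >&2
            "$@" >&2                        # stdout is reserved for the protocol
        fi
        printf 'RESULT 2\nOK'               # acknowledge the event
    done
}

# supervisord sets SUPERVISOR_ENABLED for its children, so the loop only
# starts when the script runs under supervisord. The listener is a direct
# child of supervisord, so $PPID is the process to signal.
if [ -n "${SUPERVISOR_ENABLED:-}" ]; then
    listen kill -s TERM "$PPID"
fi
```

Passing the action as arguments keeps the protocol handling separate from the decision of how to stop the container, so the same loop can be dry-run with a harmless command.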
Handling signals
Signals are automatically handled by supervisord and passed on to the processes it started. For example, stopping the container results in immediate termination of all the processes:
2018-11-19 23:25:27,429 WARN received SIGTERM indicating exit request
Summary
Running a multi-process Docker container is not difficult, but it requires taking care of some details that are usually taken care of by Docker.