Description
While troubleshooting an issue that resulted in a dockerd crash, I have found that docker-compose does not close a unix socket with the docker daemon when a container exits and restarts.
This means that every container restart leaves another socket open to dockerd. Because dockerd spins up a new thread each time a new socket is opened, its thread count keeps growing with each restart, until dockerd hits the kernel's max-threads limit and crashes.
In our case, the production workload has about 10 containers that are in an on-demand deployment style, so not all 10 need to be up and running depending on the pool of "devices" needing data processed upstream of them. Ones that aren't used will restart and re-query the database for a new endpoint to process input from.
While this approach of restarting containers is probably not the greatest, at the very least the unix sockets should be closed when finished.
From looking into this on the dockerd side, it appears that dockerd sends a notifyClosed via the socket when it exits. Does docker compose handle that?
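The leak mechanism described above can be illustrated with a toy model. This is not Docker or Compose code. It is a minimal Python sketch, assuming a server that dedicates one thread to each unix-socket connection, as this report describes for dockerd. The function name `count_lingering_handlers` and the socket path are hypothetical and exist only for this illustration.

```python
import os
import socket
import tempfile
import threading
import time


def count_lingering_handlers(restarts: int, close_client: bool) -> int:
    """Open `restarts` client connections to a toy unix-socket server and
    return how many per-connection server threads are still alive afterwards.

    Hypothetical model: each "restart" opens one new connection, and the
    server keeps one thread per connection until the peer closes it.
    """
    sock_path = os.path.join(tempfile.mkdtemp(), "toy.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen(restarts)

    handlers = []

    def handle(conn: socket.socket) -> None:
        # The handler thread only returns once the client closes its end
        # (recv() then returns b"").
        with conn:
            while conn.recv(1024):
                pass

    def accept_loop() -> None:
        for _ in range(restarts):
            conn, _addr = server.accept()
            t = threading.Thread(target=handle, args=(conn,), daemon=True)
            handlers.append(t)
            t.start()

    acceptor = threading.Thread(target=accept_loop, daemon=True)
    acceptor.start()

    open_clients = []
    for _ in range(restarts):
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(sock_path)
        if close_client:
            client.close()               # well-behaved client
        else:
            open_clients.append(client)  # leaked: never closed while "running"

    acceptor.join()

    # Give handlers up to one second in total to notice closed peers.
    deadline = time.monotonic() + 1.0
    for t in handlers:
        t.join(max(0.0, deadline - time.monotonic()))
    lingering = sum(t.is_alive() for t in handlers)

    # Clean up the demo itself.
    for client in open_clients:
        client.close()
    server.close()
    os.unlink(sock_path)
    return lingering
```

In this model, calling the function with `close_client=True` leaves no handler threads behind, while `close_client=False` leaves one lingering thread per simulated restart, which mirrors the thread growth observed under dockerd.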
Steps To Reproduce
1. Create docker-compose.yaml:

    ---
    services:
      occupied:
        image: container-used:latest
        logging:
          driver: "json-file"
          options:
            max-file: "5"
            max-size: "30m"
      surplus:
        restart: unless-stopped
        image: container-surplus:latest
        logging:
          driver: "json-file"
          options:
            max-file: "5"
            max-size: "30m"

2. Create Dockerfile:

    FROM alpine:latest AS occupied
    CMD ["sleep", "3600"]

    FROM alpine:latest AS surplus
    CMD ["sleep", "10"]

3. Build images:

    docker build --target occupied -t container-used:latest .
    docker build --target surplus -t container-surplus:latest .

4. Start project:

    docker compose -f docker-compose.yaml --scale occupied=8 --scale surplus=2 up

5. Concurrently with 4, observe total number of tasks/threads under dockerd:

    watch -n1 "ls /proc/`pgrep dockerd`/task | wc -l"

6. Concurrently with 4, observe unix socket fds under the docker-compose pid:

    watch -n1 "lsof -p `pgrep docker-compose` | grep unix"

You will observe the following:
- Docker compose will restart the exiting containers as they gracefully close.
- In step 5, the total number of threads under dockerd begins to rise.
- In step 6, more unix sockets are added to the fd list of the docker-compose process.
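The two observations from steps 5 and 6 can also be collected programmatically. The following is a rough Python sketch that reads the same Linux `/proc` entries; the helper names `count_threads` and `count_socket_fds` are made up for this example. Note one assumption that differs from the `lsof | grep unix` command: the second helper counts socket fds of every address family, not only unix sockets.

```python
import os


def count_threads(pid: int) -> int:
    """Number of tasks (threads) of a process, like `ls /proc/PID/task | wc -l`."""
    return len(os.listdir(f"/proc/{pid}/task"))


def count_socket_fds(pid: int) -> int:
    """Number of open fds of a process that refer to sockets.

    Assumption: this counts sockets of any family (unix, TCP, ...), because
    /proc/PID/fd only exposes links of the form "socket:[inode]".
    """
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            # The fd was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            count += 1
    return count
```

Sampling `count_threads` with the dockerd pid and `count_socket_fds` with the docker-compose pid once per second should show the same upward trend as the `watch` commands. Reading `/proc/PID/fd` of a process owned by another user typically requires root.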
Compose Version
Docker Compose version v2.29.1
Docker Environment
Client: Docker Engine - Community
 Version: 27.1.1
 Context: default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version: v0.16.1
    Path: /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version: v2.29.1
    Path: /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 11
  Running: 0
  Paused: 0
  Stopped: 11
 Images: 7
 Server Version: 27.1.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
 Kernel Version: 5.4.0-190-generic
 Operating System: Ubuntu 20.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.74GiB
 Name: moby48236
 ID: f5179277-2595-4df6-9ec6-44fc55c19f74
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
Anything else?
I initially thought this was a regression of an existing bug in dockerd that was fixed in 27.0.1, and opened an issue there: moby/moby#48236
However, after further observation and feedback from the devs, I discovered this to be an issue with the compose plugin. So here I am :)