Description
Not sure this is related, but in production, after a few days or weeks, dockerd starts leaking so many file descriptors for dead connections that it stops responding entirely (not even to SIGUSR1).
While trying to understand how this can happen (when it does, I can't do anything but kill the daemon), I tried to figure out how a dead connection comes about. I found a way to make dockerd leak such file descriptors, although I'm not sure it is the same issue that happens in production.
What do those leaked file descriptors look like? I run this command on Linux:
ss | grep docker.sock
and I see tons of entries, most of which look like this:
u_str ESTAB 0 0 /var/run/docker.sock 272245 * 0
u_str ESTAB 0 0 /var/run/docker.sock 279780 * 0
u_str ESTAB 0 0 /var/run/docker.sock 279118 * 0
u_str ESTAB 0 0 /var/run/docker.sock 272201 * 0
u_str ESTAB 0 0 /var/run/docker.sock 272217 * 0
You can see these are dead connections because the peer info (the * 0 at the end) points to nothing, which means the other side closed the unix domain socket but the docker daemon did not.
These sockets also take up file descriptors (as expected):
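The check described above can be automated. The following is a minimal sketch, assuming the `ss` column layout matches the output shown in this report (netid, state, two queue counters, local path, local inode, peer path, peer inode). The function names are hypothetical and not part of any Docker tooling.

```python
import subprocess

DOCKER_SOCK = "/var/run/docker.sock"


def parse_unix_socket_line(line):
    """Parse one `ss` line for a unix socket; return None if it does not match."""
    fields = line.split()
    # Assumed layout: Netid State Recv-Q Send-Q LocalPath LocalInode PeerPath PeerInode
    if len(fields) < 8 or not fields[0].startswith("u_"):
        return None
    if not (fields[5].isdigit() and fields[7].isdigit()):
        return None
    return {
        "state": fields[1],
        "path": fields[4],
        "inode": int(fields[5]),
        "peer_inode": int(fields[7]),
    }


def dead_connections(ss_output, path=DOCKER_SOCK):
    """Return local inodes of established sockets on `path` whose peer inode is 0."""
    dead = []
    for line in ss_output.splitlines():
        entry = parse_unix_socket_line(line)
        if entry is None:
            continue
        if (entry["path"] == path and entry["state"] == "ESTAB"
                and entry["peer_inode"] == 0):
            dead.append(entry["inode"])
    return dead


def current_dead_connections(path=DOCKER_SOCK):
    """Run `ss -x` (unix sockets only) and report dead connections on `path`."""
    out = subprocess.run(["ss", "-x"], capture_output=True,
                         text=True, check=True).stdout
    return dead_connections(out, path)
```

Calling `len(current_dead_connections())` periodically would give a rough count of leaked connections over time.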
ls -l /proc/$(pidof dockerd)/fd
...
lrwx------ 1 root root 64 may 31 08:37 33 -> socket:[272217]
...
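To match the inodes reported by `ss` against the daemon's open file descriptors without reading the `ls -l` output by eye, something like the following could be used. It is a sketch that assumes a Linux `/proc` filesystem and enough privileges to read the target process's `fd` directory; `socket_fds` is a hypothetical helper name.

```python
import os
import re

# Symlink targets for sockets in /proc/<pid>/fd look like "socket:[272217]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_fds(pid):
    """Return {fd: inode} for every socket file descriptor held by `pid`."""
    fd_dir = "/proc/%d/fd" % pid
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        match = SOCKET_LINK.match(target)
        if match:
            result[int(name)] = int(match.group(1))
    return result
```

Passing the dockerd pid (for example the value printed by `pidof dockerd`) returns a mapping whose values can be compared with the local inode column of the `ss` output.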
A live connection looks like this in the ss output (there is peer info):
u_str ESTAB 0 0 /var/run/docker.sock 280489 * 279381
I'm now going to show a convoluted way to cause this resource leak. Keep in mind that dockerd shouldn't leak resources regardless, and that there may well be real client code that legitimately needs to do something like this.
This also seems to happen on a variety of docker versions, from old to new.
Steps to reproduce the issue:
- docker create -it --rm python:2.7 python -c "while True: print 1;"
  <container id>
- docker start <container id>
  <container id>
- python2
  import socket
  s = socket.socket(socket.AF_UNIX)
  s.connect('/var/run/docker.sock')
  s.send('POST /v1.24/containers/<container id>/attach?logs=1&stream=0&stdout=1 HTTP/1.1\r\nHost: localhost\r\n\r\n')
  s.recv(1024)
  s.close()
  You can also try replacing logs=1&stream=0 with logs=0&stream=1.
  I don't think the v1.24 is of particular importance either.
  You can also kill the python process instead of calling s.close() (it sometimes behaves differently).
- ss | grep docker.sock
  u_str ESTAB 0 0 /var/run/docker.sock 280489 * 0
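For anyone without a python2 interpreter, the socket step above can be sketched in Python 3. This sends the same request as the original snippet; the function names and keyword arguments are hypothetical, and the container id has to be supplied by the caller.

```python
import socket


def build_attach_request(container_id, logs=True, stream=False,
                         api_version="v1.24"):
    """Build the raw HTTP attach request used in the reproduction steps."""
    query = "logs=%d&stream=%d&stdout=1" % (int(logs), int(stream))
    request = ("POST /%s/containers/%s/attach?%s HTTP/1.1\r\n"
               "Host: localhost\r\n\r\n" % (api_version, container_id, query))
    return request.encode("ascii")


def attach_and_drop(container_id, sock_path="/var/run/docker.sock", **kwargs):
    """Send the attach request, read one chunk, then close the client side."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(sock_path)
        s.sendall(build_attach_request(container_id, **kwargs))
        return s.recv(1024)
    finally:
        # Closing here is what should prompt the daemon to close its side too.
        s.close()
```

Calling `attach_and_drop("<container id>", logs=False, stream=True)` corresponds to the logs=0&stream=1 variant mentioned in the steps.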
Describe the results you received:
There is a dead unclosed unix domain socket on the docker daemon side.
Describe the results you expected:
The docker daemon should detect that the socket has been closed on the other side and close its side as well.
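To illustrate the expected behaviour at the socket level: a server holding one end of a unix stream socket can find out that the peer has gone away by polling for hang-up or by seeing an end-of-stream read. The sketch below shows this with a socket pair. dockerd itself is written in Go, so this is not its implementation, only a demonstration that the kernel does expose the information; `peer_has_closed` is a hypothetical helper.

```python
import select
import socket


def peer_has_closed(conn, timeout_ms=0):
    """Return True if the other end of `conn` has closed its side."""
    poller = select.poll()
    poller.register(conn, select.POLLIN | select.POLLHUP)
    for _fd, events in poller.poll(timeout_ms):
        if events & (select.POLLHUP | select.POLLERR):
            return True
        if events & select.POLLIN:
            # Peek so pending data is not consumed; b"" means end-of-stream.
            return conn.recv(1, socket.MSG_PEEK) == b""
    return False


def demo():
    """Show the check before and after the client end is closed."""
    server, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    before = peer_has_closed(server)
    client.close()
    after = peer_has_closed(server)
    server.close()
    return before, after
```

A server that performs a check like this (or simply keeps reading until it gets an empty read) can release its file descriptor once the client disconnects.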
Versions
Tried on multiple versions; currently both client and server are 17.12.1-ce, on Ubuntu 16.04.4 with kernel 4.13.0-36-generic and default settings.