Description
When an exec process exits at around the same time as the init process, a TaskExit event for the exec process can be sent after the TaskExit event for the init process.
When using docker, this causes the client (running docker exec) to hang.
Steps to reproduce the issue
The following test case hangs for me after at most a few iterations:
#!/usr/bin/env bash
set -x
while true; do
    # Sleep 40-50 ms so that the init process exits at about the time the exec runs.
    d=$((40000 + RANDOM % 10000))
    ctrid=$(docker run --rm --detach alpine usleep "$d")
    docker exec "$ctrid" true
done
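The range of d (in microseconds, as passed to usleep) may need to be adjusted so that the container exits at about the same time as the exec process.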
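Because the failure mode is a hang, the loop above never reports it on its own. A minimal sketch of a helper that puts a deadline on a command is shown here, assuming coreutils `timeout` is installed. The name `timed_out` and the 30-second deadline in the commented driver loop are hypothetical and not part of the original report.

```shell
#!/usr/bin/env bash
# timed_out SECONDS CMD...: run CMD under coreutils `timeout`.
# Returns 0 if the deadline expired (the command hung), 1 otherwise.
# Caveat: a command that itself exits with status 124 is also
# reported as a timeout.
timed_out() {
    local limit=$1
    shift
    timeout "$limit" "$@"
    # coreutils timeout exits with 124 when it had to stop the command.
    [ "$?" -eq 124 ]
}

# Hypothetical driver loop (commented out; needs a running docker daemon):
#   while true; do
#       d=$((40000 + RANDOM % 10000))
#       ctrid=$(docker run --rm --detach alpine usleep "$d")
#       if timed_out 30 docker exec "$ctrid" true; then
#           echo "docker exec hung for container $ctrid"
#           break
#       fi
#   done
```

A failing `docker exec` (for example because the container has already exited) is not treated as a hang; only the expired deadline is.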
Describe the results you received and expected
Expected: the test case runs indefinitely.
Actual: the docker client (docker exec) hangs after a few iterations.
TaskExit is sent for the exec process (40004) after the init process (39954):
2024-01-31 08:36:37.082622552 +0000 UTC moby /tasks/start {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","pid":39954}
2024-01-31 08:36:37.108081891 +0000 UTC moby /tasks/exec-added {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","exec_id":"bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151"}
2024-01-31 08:36:37.150185466 +0000 UTC moby /tasks/exec-started {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","exec_id":"bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151","pid":40004}
2024-01-31 08:36:37.151223505 +0000 UTC moby /tasks/exit {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","pid":39954,"exited_at":"2024-01-31T08:36:37.151180595Z"}
2024-01-31 08:36:37.151675639 +0000 UTC moby /tasks/exit {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","id":"bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151","pid":40004,"exit_status":137,"exited_at":"2024-01-31T08:36:37.151196343Z"}
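The ordering problem in the log above can be checked mechanically. The following is a rough sketch, assuming event lines in exactly the format shown (timestamp, namespace, topic, JSON payload) arrive on stdin. It treats a /tasks/exit event whose "id" equals its "container_id" as the init exit, which matches the two exit events above but is otherwise an assumption. The function name `find_late_exec_exits` is hypothetical, and the JSON handling is a naive pattern match rather than a real parser.

```shell
#!/usr/bin/env bash
# find_late_exec_exits: read event lines on stdin and print the id of every
# exec whose /tasks/exit event arrives after the init exit of its container.
find_late_exec_exits() {
    awk '
    # Return the string value of a top-level key (naive: no escaped quotes).
    function field(json, key,    pat, s) {
        pat = "[{,]\"" key "\":\"[^\"]*\""
        if (!match(json, pat)) return ""
        s = substr(json, RSTART, RLENGTH)
        sub(/^[{,]"[^"]*":"/, "", s)
        sub(/"$/, "", s)
        return s
    }
    # Field 6 is the topic: date, time, zone offset, zone name, namespace, topic.
    $6 == "/tasks/exit" {
        json = substr($0, index($0, "{"))
        cid = field(json, "container_id")
        id = field(json, "id")
        if (id == cid) {
            init_exited[cid] = 1      # init process of this container exited
        } else if (cid in init_exited) {
            print id                  # exec exit seen after the init exit
        }
    }'
}
```

Fed the five event lines above, this would flag the exec with pid 40004, since its exit event follows the exit event of the init process for the same container.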
The following errors are seen from docker:
Jan 31 08:36:37 lima-docker dockerd[37321]: time="2024-01-31T08:36:37.294825280Z" level=error msg="failed to process event" container=57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e error="could not find container 57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e: No such container: 57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e" event=exit event-info="{57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151 40004 137 2024-01-31 08:36:37.151196343 +0000 UTC false <nil>}" module=libcontainerd namespace=moby
Jan 31 08:36:37 lima-docker dockerd[37321]: time="2024-01-31T08:36:37.295188596Z" level=error msg="exit event" container=57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e error="no such container" module=libcontainerd namespace=moby process=bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151
I see no guarantee in runtime/v2/README.md about the ordering of the TaskExit event for the init process relative to the TaskExit events for execs.
Is this a bug in containerd, or an omission in the runtime v2 API, or a bug in docker?
What version of containerd are you using?
containerd containerd.io 1.6.27 a149601
Any other relevant information
Docker version:
Client: Docker Engine - Community
 Version:           25.0.0
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        e758fe5
 Built:             Thu Jan 18 17:10:21 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          25.0.1
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       71fa3ab
  Built:            Tue Jan 23 23:09:55 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.27
  GitCommit:        a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc:
  Version:          1.1.11
  GitCommit:        v1.1.11-0-g4bccb38
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
The hang is also seen on containerd 1.6.22, but not on containerd 1.6.21.
#8617 seems a likely candidate for the change that introduced it, although I have not confirmed this via cherry-pick.