TaskExit event can be sent for an exec process after TaskExit is sent for the init process #9719

@tjohnes

Description

When an exec process exits at around the same time as the init process, a TaskExit event for the exec process can be sent after the TaskExit event for the init process.

When using docker, this causes the client (running docker exec) to hang.

Steps to reproduce the issue

The following test case hangs for me after at most a few iterations:

#!/usr/bin/env bash
set -x
while true; do
    # Container lifetime in microseconds (40-50 ms), so init exits while the exec is in flight.
    d=$((40000 + RANDOM % 10000))
    ctrid=$(docker run --rm --detach alpine usleep "$d")
    docker exec "$ctrid" true
done

The range of d (the usleep duration, in microseconds) may need to be adjusted so that the container exits while docker exec is still running.
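
Because the failure mode is a hang, the loop above never reports it on its own. The following is a minimal sketch of the same loop with a watchdog, not part of the original report. The helper names `hangs` and `repro_loop` and the 10-second default limit are assumptions made for illustration, and it relies on GNU coreutils `timeout` exiting with status 124 when the limit expires.

```shell
#!/usr/bin/env bash
# Sketch: the reproducer with a watchdog so that a hung `docker exec` ends the loop.
# Assumes GNU coreutils `timeout`, which exits with status 124 on expiry.

# hangs LIMIT CMD...: succeed only if CMD was still running after LIMIT seconds.
hangs() {
    local limit=$1 status=0
    shift
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -eq 124 ]
}

# repro_loop [LIMIT]: run the original loop until an exec exceeds LIMIT seconds.
# The 10-second default is an arbitrary assumption.
repro_loop() {
    local limit=${1:-10} d ctrid n=0
    while true; do
        n=$((n + 1))
        d=$((40000 + RANDOM % 10000))
        ctrid=$(docker run --rm --detach alpine usleep "$d") || return 2
        if hangs "$limit" docker exec "$ctrid" true; then
            echo "iteration $n: docker exec hung for container $ctrid" >&2
            return 1
        fi
    done
}
```

Calling `repro_loop` (or `repro_loop 30` for a longer limit) runs until the first hang and prints the iteration count and container ID. Only a timeout counts as a hang here: a `docker exec` that fails quickly because the container has already exited is treated as a normal iteration.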

Describe the results you received and expected

Expected: the test case runs indefinitely.

Actual: the docker client (docker exec) hangs after a few iterations.

TaskExit is sent for the exec process (pid 40004) after TaskExit for the init process (pid 39954):

2024-01-31 08:36:37.082622552 +0000 UTC moby /tasks/start {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","pid":39954}
2024-01-31 08:36:37.108081891 +0000 UTC moby /tasks/exec-added {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","exec_id":"bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151"}
2024-01-31 08:36:37.150185466 +0000 UTC moby /tasks/exec-started {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","exec_id":"bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151","pid":40004}
2024-01-31 08:36:37.151223505 +0000 UTC moby /tasks/exit {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","pid":39954,"exited_at":"2024-01-31T08:36:37.151180595Z"}
2024-01-31 08:36:37.151675639 +0000 UTC moby /tasks/exit {"container_id":"57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e","id":"bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151","pid":40004,"exit_status":137,"exited_at":"2024-01-31T08:36:37.151196343Z"}
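
To check a captured event stream for this ordering, something like the following sketch could be used. It is not part of the original report: the function name `late_exec_exits` is hypothetical, and it assumes the line layout shown above (topic in the sixth whitespace-separated field, a JSON payload without spaces in the seventh) and that, as in this log, the init process is the one whose `id` equals its `container_id`.

```shell
#!/usr/bin/env bash
# late_exec_exits: read event lines on stdin and print every exec /tasks/exit
# that arrives after the /tasks/exit of the init process of the same container.
# Exit status is 0 if at least one such event was found, 1 otherwise.
late_exec_exits() {
    awk '
    # Return the string value of a JSON key, or "" if the key is absent.
    function field(json, key,    pat, s) {
        pat = "\"" key "\":\"[^\"]*\""
        if (!match(json, pat)) return ""
        s = substr(json, RSTART, RLENGTH)
        sub(/^"[^"]*":"/, "", s)
        sub(/"$/, "", s)
        return s
    }
    $6 == "/tasks/exit" {
        cid = field($7, "container_id")
        id  = field($7, "id")
        if (id == cid) {
            init_exited[cid] = 1    # init process: id equals container_id
        } else if (cid in init_exited) {
            print "late exec exit: container=" cid " exec=" id
            found = 1
        }
    }
    END { exit (found ? 0 : 1) }
    '
}
```

For example, with the events saved to a file (here a hypothetical `events.log`), `late_exec_exits < events.log` would print one line for the exec `bf5b57cf…` in the capture above, since its exit event follows the exit event of init.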

The following errors are seen from docker:

Jan 31 08:36:37 lima-docker dockerd[37321]: time="2024-01-31T08:36:37.294825280Z" level=error msg="failed to process event" container=57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e error="could not find container 57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e: No such container: 57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e" event=exit event-info="{57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151 40004 137 2024-01-31 08:36:37.151196343 +0000 UTC false <nil>}" module=libcontainerd namespace=moby
Jan 31 08:36:37 lima-docker dockerd[37321]: time="2024-01-31T08:36:37.295188596Z" level=error msg="exit event" container=57aab64157c14148b12b9bda4e35588f4d534b50ef78ffab14c7ea280493592e error="no such container" module=libcontainerd namespace=moby process=bf5b57cffb8ca1f7aa054e03bf9aa706c265543d14adeed39152c36020e86151

As far as I can see, runtime/v2/README.md makes no guarantees about the ordering of TaskExit for the init process relative to TaskExit for execs.

Is this a bug in containerd, or an omission in the runtime v2 API, or a bug in docker?

What version of containerd are you using?

containerd containerd.io 1.6.27 a149601

Any other relevant information

Docker version:

Client: Docker Engine - Community
 Version:           25.0.0
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        e758fe5
 Built:             Thu Jan 18 17:10:21 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          25.0.1
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       71fa3ab
  Built:            Tue Jan 23 23:09:55 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.27
  GitCommit:        a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc:
  Version:          1.1.11
  GitCommit:        v1.1.11-0-g4bccb38
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Seen on containerd 1.6.22. Not seen on containerd 1.6.21.

#8617 seems a likely candidate, although I have not confirmed this via cherry-pick.

Show configuration if it is related to CRI plugin.

No response
