Description
- Shares a root cause with dockerd panic: runtime error at monitor.go:189 #45770
When an exec fails to start, the daemon sets the exec's exit code according to the error from containerd
Lines 286 to 293 in eb76c93
```go
ec.Process, err = tsk.Exec(ctx, ec.ID, p, cStdin != nil, ec.InitializeStdio)
// the exec context should be ready, or error happened.
// close the chan to notify readiness
close(ec.Started)
if err != nil {
	defer ec.Unlock()
	return setExitCodeFromError(ec.SetExitCode, err)
}
```
but then clobbers it in a deferred block, overwriting it with the value 126.
Lines 183 to 195 in eb76c93
```go
defer func() {
	if err != nil {
		ec.Lock()
		ec.Container.ExecCommands.Delete(ec.ID)
		ec.Running = false
		exitCode := 126
		ec.ExitCode = &exitCode
		if err := ec.CloseStreams(); err != nil {
			logrus.Errorf("failed to cleanup exec %s streams: %s", ec.Container.ID, err)
		}
		ec.Unlock()
	}
}()
```
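The overwrite can be reproduced in isolation. The following is a minimal sketch, not the daemon's code: `execConfig`, `startExec`, and `errStart` are hypothetical stand-ins, and 127 is an arbitrary placeholder for whatever exit code the containerd error would map to.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// execConfig is a hypothetical, stripped-down stand-in for the exec config.
type execConfig struct {
	sync.Mutex
	ExitCode *int
}

// SetExitCode records the exit code derived from the start error.
func (ec *execConfig) SetExitCode(code int) { ec.ExitCode = &code }

// errStart is a hypothetical error standing in for a failed start.
var errStart = errors.New("exec failed to start")

// startExec mirrors the shape of the two snippets above: the error path
// sets an exit code, and the deferred block then overwrites it with 126.
func startExec(ec *execConfig, startErr error) (err error) {
	defer func() {
		if err != nil {
			ec.Lock() // relocks after the inner deferred Unlock has run
			exitCode := 126
			ec.ExitCode = &exitCode
			ec.Unlock()
		}
	}()

	ec.Lock()
	if startErr != nil {
		defer ec.Unlock() // runs before the cleanup block (defers are LIFO)
		ec.SetExitCode(127) // placeholder for the error-derived exit code
		return startErr
	}
	ec.Unlock()
	return nil
}

func main() {
	ec := &execConfig{}
	err := startExec(ec, errStart)
	fmt.Printf("error: %v, exit code: %d\n", err, *ec.ExitCode)
}
```

In this sketch the exit code reported after the failed start is 126, not the 127 set on the error path. The mutex is released and reacquired between the two writes, which is the window the next paragraph describes.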
If the exec fails in such a way that containerd also publishes an exit event, daemon.ProcessEvent handles that event concurrently. Because the mutex on the exec config is repeatedly unlocked and relocked, daemon.ContainerExecStart races daemon.ProcessEvent to delete the exec from the container's ExecStore and to set the exec's exit code. Depending on whether ContainerExecStart deletes the exec from the ExecStore before or after ProcessEvent handles the exit event, an "exec_die" container event is logged with the "exitCode" attribute set to either 127 or the actual exit code reported by containerd, respectively. Either way, the exec's exit code is set to 126.
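The two outcomes can be illustrated by playing the steps in each order deterministically. This is a sketch under stated assumptions, not the daemon's implementation: `execStore`, `handleExit`, `cleanupAfterFailedStart`, and `run` are hypothetical names, the fallback to 127 when the exec is missing from the store is inferred from the description above, and the reported exit code 2 is arbitrary.

```go
package main

import (
	"fmt"
	"strconv"
)

// execStore is a hypothetical stand-in for the container's ExecStore,
// mapping exec IDs to their exit-code slots.
type execStore map[string]*int

// handleExit models the event-handler side and returns the "exitCode"
// attribute logged with "exec_die". Falling back to 127 when the exec is
// not in the store is an assumption drawn from the description.
func handleExit(store execStore, id string, reported int) string {
	exitCode := 127
	if slot, ok := store[id]; ok {
		*slot = reported
		exitCode = reported
	}
	return strconv.Itoa(exitCode)
}

// cleanupAfterFailedStart models the deferred block: it deletes the exec
// from the store and forces its exit code to 126.
func cleanupAfterFailedStart(store execStore, id string, slot *int) {
	delete(store, id)
	*slot = 126
}

// run plays both steps in the chosen order and returns the logged
// attribute together with the exec's final exit code.
func run(cleanupFirst bool, reported int) (logged string, final int) {
	slot := new(int)
	store := execStore{"exec1": slot}
	if cleanupFirst {
		cleanupAfterFailedStart(store, "exec1", slot)
		logged = handleExit(store, "exec1", reported)
	} else {
		logged = handleExit(store, "exec1", reported)
		cleanupAfterFailedStart(store, "exec1", slot)
	}
	return logged, *slot
}

func main() {
	for _, cleanupFirst := range []bool{true, false} {
		logged, final := run(cleanupFirst, 2)
		fmt.Printf("cleanupFirst=%v logged=%s final=%d\n", cleanupFirst, logged, final)
	}
}
```

With cleanup first, the logged attribute is 127; with the exit event handled first, it is the reported code. The final exit code is 126 in both orderings.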