Description
While running the containerd tests, we wanted to capture the containerd-shim output to figure out how the shim works. When we did so, the tests started to hang while the test runner waits for the containerd daemon (you can easily reproduce this by running TestDaemonRestart).
It turned out that tests hang while trying to Wait for the containerd daemon to stop/restart. This is due to the implementation of (*Cmd).Wait in the Go standard library, which will "wait for any copying to stdin or copying from stdout or stderr to complete".
Enabling the plugin.linux/shim_debug flag in the containerd configuration makes containerd wire the shim's stdout and stderr to the system ones, i.e. the daemon's own stdout and stderr.
Given the shim is designed to survive the killing of the server, we believe the following is happening:
- the tests try to kill the containerd daemon;
- the daemon dies, but the shim survives;
- the tests hang while waiting for the shim's stdout/stderr to be flushed.
We have confirmed that kill -9ing the shim unlocks the tests.
To demonstrate the issue, we have updated the TestDaemonRestart test in a containerd fork. After killing the shim process, some containerd leftovers remain, which can be cleaned up using this cleanup script.
Steps to reproduce the issue:
- Clone https://github.com/gcapizzi/containerd
- Run the test: sudo -E go test -run "TestDaemonRestart\b" -test.root=true
Describe the results you received:
Test times out
Describe the results you expected:
Test passes
Output of containerd --version:
containerd github.com/containerd/containerd v1.2.0-beta.2-27-gacced5d5.m acced5d58f61f342ef012643d6c5d6405f709f26.m
cc @gcapizzi