TestDaemonRestart hangs if shim_debug is enabled #2606

@danail-branekov

Description

While running the containerd tests, we wanted to capture the containerd-shim output to understand how the shim works. With the shim output captured, the tests hang at the point where the test runner waits for the containerd daemon (this is easy to reproduce by running TestDaemonRestart).

It turned out that tests hang while trying to Wait for the containerd daemon to stop/restart. This is due to the implementation of (*Cmd).Wait in the Go standard library, which will "wait for any copying to stdin or copying from stdout or stderr to complete".

Enabling the plugin.linux/shim_debug flag in the containerd configuration makes containerd wire the shim's stdout and stderr to the system ones, so the shim ends up holding the same output streams the test runner is waiting on.

Given the shim is designed to survive the killing of the server, we believe the following is happening:

  • the tests try to kill the containerd daemon;
  • the daemon dies, but the shim survives;
  • the tests hang while waiting for the shim's stdout/stderr to be flushed.

We have confirmed that killing the shim with kill -9 unblocks the tests.

To demonstrate the issue, we have updated the TestDaemonRestart test in a containerd fork. After killing the shim process, some containerd leftovers remain, which can be cleaned up using this cleanup script.

Steps to reproduce the issue:

  1. Clone https://github.com/gcapizzi/containerd
  2. Run the test - sudo -E go test -run "TestDaemonRestart\b" -test.root=true

Describe the results you received:
Test times out

Describe the results you expected:
Test passes

Output of containerd --version:

containerd github.com/containerd/containerd v1.2.0-beta.2-27-gacced5d5.m acced5d58f61f342ef012643d6c5d6405f709f26.m

cc @gcapizzi
