Zombie reaping not occurring as expected #39326

@cwpenhale

I'm working on an issue for which I have a fairly simple test harness that reproduces it reliably, and I think I've tracked the source down to containerd. The test fails on every Docker version after 17.09.1. I think this is because of a change introduced in containerd 1.0.0. The test is essentially this:

  1. create a container called bad_parent (we're using ubuntu:xenial)
    docker run --init -itd --name bad_parent bad_parent /bin/bash
  2. run docker exec bad_parent date on a loop to show that the container is responsive
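  3. create a zombie pid with the following script:
    #!/usr/bin/env python
    import os
    import subprocess
    import time


    pid = os.fork()
    if pid == 0:  # child
        # time.sleep(100)
        pid2 = os.fork()
        if pid2 != 0:  # child: stays alive but never waits on the grandchild
            while True:
                print('The zombie pid will be: {}'.format(pid2))
                time.sleep(30)
        # grandchild (pid2 == 0): falls through and exits, leaving a zombie
    else:  # parent
        # os.waitpid(pid, 0)
        # show the process tree, then exit without waiting on the child
        subprocess.check_call(('ps', 'xawuf'))
        time.sleep(5)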
  4. observe that the container stops responding
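
The loop in step 2 and the final observation can be sketched as a small probe that runs a command with a timeout and reports when it stops returning. This is only an illustration, not the actual harness: the helper names `probe` and `watch`, the 10 second timeout, and the 1 second interval are assumptions, and it needs Python 3.5+ for `subprocess.run`.

```python
import subprocess
import sys
import time


def probe(cmd, timeout=10.0):
    """Return True if cmd exits within timeout seconds, False if it hangs.

    Only "did it return" is checked; a non-zero exit status still counts
    as a response.
    """
    try:
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        return False


def watch(cmd, interval=1.0, timeout=10.0):
    """Probe cmd repeatedly and return how many probes returned before the first hang."""
    ok = 0
    while probe(cmd, timeout):
        ok += 1
        print("probe {} ok".format(ok), file=sys.stderr)
        time.sleep(interval)
    return ok
```

With the container from step 1, calling `watch(["docker", "exec", "bad_parent", "date"])` would keep probing until an exec call fails to return within the timeout, and then return the number of probes that completed.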

I've tried the latest containerd (compiled from git source). I've also tried 18.09, but I haven't been able to build RPMs for 18.10 from master because of build errors. Any thoughts on where to go with this?
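
To confirm whether the zombie is ever reaped, a check along these lines could be run inside the container while it still responds. It is a sketch that assumes a Linux `/proc` filesystem; `parse_state` and `find_zombies` are hypothetical helper names and not part of the test harness.

```python
import os


def parse_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the last ')'.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def find_zombies(proc_root="/proc"):
    """Return the sorted PIDs under proc_root whose state is 'Z' (zombie)."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                if parse_state(f.read()) == "Z":
                    zombies.append(int(entry))
        except OSError:
            # The process exited between listdir() and open().
            continue
    return sorted(zombies)
```

If the PID printed by the reproduction script keeps showing up in `find_zombies()`, the zombie has not been reaped.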

Here are the debug-level log files for the issue: https://gist.github.com/cwpenhale/2eb23ffcb2abd2b4fccc509a50f2fd0a

Testing with the following host, virtualized in VirtualBox:

[root@localhost docker_exec_hang_issue]# uname -a
Linux localhost.localdomain 5.1.5-1.el7.elrepo.x86_64 #1 SMP Sat May 25 16:10:51 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost docker_exec_hang_issue]# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 
