Fix container pid race condition.#3857
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3857 +/- ##
=======================================
Coverage 42.34% 42.34%
=======================================
Files 130 130
Lines 14683 14683
=======================================
Hits 6217 6217
Misses 7540 7540
Partials 926 926
Continue to review full report at Codecov.
|
|
Build succeeded.
|
|
Can we have a test? |
Done. To make it easy to reproduce the race condition, I added a Here is the test result with containerd v1.3.0: Here is the test result with this patch: BTW, I tried to add retry in the test to help reproducing the race condition, but it is hard. However, it can be easily reproduced by running |
8050209 to
8c8e830
Compare
Signed-off-by: Lantao Liu <lantaol@google.com>
8c8e830 to
a6b6097
Compare
|
Build succeeded.
|
|
LGTM |
containerd 1.3.2 Welcome to the v1.3.2 release of containerd! The second patch release for `containerd` 1.3 includes a fix for a race condition related to the reported pid on exit when called from Docker. ### Runtime * Fix containerd pid race condition [containerd#3857](containerd#3857) * Use cached process state to reduce exec cost [containerd#3711](containerd#3711) ### CRI * Added `insecure_skip_verify` option in the registry tls config to allow skipping registry certificate verification [containerd#3847](containerd#3847) Please try out the release binaries and report any issues at https://github.com/containerd/containerd/issues. ### Contributors * Lantao Liu * Derek McGowan * Michael Crosby * Phil Estes * Maksym Pavlenko ### Changes * [`ff48f57fc8`](containerd@ff48f57) Merge pull request [containerd#3866](containerd#3866) from dmcgowan/prepare-1.3.2 * [`99005c2647`](containerd@99005c2) Add release notes for v1.3.2 * [`e987ea3cac`](containerd@e987ea3) Merge pull request [containerd#3864](containerd#3864) from Random-Liu/update-cri-release-1.3 * [`8bbd71e560`](containerd@8bbd71e) Update cri to b1bef15fbeb6c6f0569b67322acfa74ca3597755. * [`070c27205c`](containerd@070c272) Merge pull request [containerd#3863](containerd#3863) from Random-Liu/cherrypick-#3857-release-1.3 * [`306d6d4b55`](containerd@306d6d4) Fix container pid. * [`5028701f1a`](containerd@5028701) Merge pull request [containerd#3753](containerd#3753) from thaJeztah/1.3_backport_avoid_unnecessary_runc_state * [`e3d2b608cc`](containerd@e3d2b60) Merge pull request [containerd#3855](containerd#3855) from fuweid/cp-3853 * [`04fbb97ad0`](containerd@04fbb97) Fix cleanup error on content client test * [`be24f39454`](containerd@be24f39) Merge pull request [containerd#3840](containerd#3840) from ameyag/1.3_backport_log_file_win * [`333679343b`](containerd@3336793) Add `--log-file` flag for windows service. * [`9abfc70043`](containerd@9abfc70) Use cached state instead of `runc state`. ### Changes from containerd/cri * [`b1bef15f`](containerd/cri@b1bef15) Merge pull request [containerd#1346](containerd/cri#1346) from Random-Liu/cherrypick-#1345-release-1.3 * [`27edfa06`](containerd/cri@27edfa0) Add insecure_skip_verify option. * [`e5dd8053`](containerd/cri@e5dd805) Merge pull request [containerd#1322](containerd/cri#1322) from Random-Liu/cherrypick-#1319-release-1.3 * [`c0dee957`](containerd/cri@c0dee95) Fix containerd build, use `libbtrfs-dev` when available. * [`1fb415d2`](containerd/cri@1fb415d) Merge pull request [containerd#1314](containerd/cri#1314) from Random-Liu/cherrypick-#1312-release-1.3 * [`0973459d`](containerd/cri@0973459) Update based on default xenial distro. * [`a46f6baf`](containerd/cri@a46f6ba) Configure golangci-lint ### Dependency Changes Previous release can be found at [v1.3.1](https://github.com/containerd/containerd/releases/tag/v1.3.1) * **github.com/containerd/cri** 5d49e7e51b43e36a6b9c4386257c7d08c602237f -> b1bef15fbeb6c6f0569b67322acfa74ca3597755 # gpg: directory '/c/Users/juterry/.gnupg' created # gpg: keybox '/c/Users/juterry/.gnupg/pubring.kbx' created # gpg: Signature made Tue Dec 3 11:09:10 2019 PST # gpg: using RSA key 8C7A111C21105794B0E8A27BF58C5D0A4405ACDB # gpg: Can't check signature: No public key
If you try docker with containerd 1.3, and run
docker run hello-world, it will stuck.After #3366, we change pid after container/exec process created, this causes various issues:
task.Startmight be-1if the container is a short running container, becauseStartandStateare not called in a transaction: https://github.com/containerd/containerd/blob/master/services/tasks/local.go#L230.uint, returning-1will eventually make it4294967295, which is not very user friendly:This PR partially reverts #3366, and use
exitedto figure out whether the process has already exited. I tested the change with Docker, anddocker run hello-worldworks now.Signed-off-by: Lantao Liu lantaol@google.com