Conversation
b3019e3 to
5e80dad
Compare
Codecov Report
@@ Coverage Diff @@
## master #3206 +/- ##
==========================================
- Coverage 44.63% 39.87% -4.77%
==========================================
Files 113 76 -37
Lines 12164 10106 -2058
==========================================
- Hits 5430 4030 -1400
+ Misses 5899 5510 -389
+ Partials 835 566 -269
Continue to review full report at Codecov.
|
5e80dad to
c92a4dc
Compare
|
Note that we send out
I can move the events into |
|
@Random-Liu - In general this is a great change. We have noticed this as well if the shim panic's for any reason CRI/Containerd struggle to clean up this state. I appreciate the change here! One thing I am concerned about but don't quite know how to fix is when we have a multi-container shim we need to send exits for all containers in the shim. But |
c92a4dc to
b010dc4
Compare
I think the cleanup is done container by container, so it should no matter it is a multi-container shim or per-container shim. Note that we don't try to kill shim here like #3226, because we know there might be multi-container shim. One downside is that if the shim ttrpc server stops working, but the shim process is still running, containerd will cleanup all the containers, but leave the shim process running. We have no solution for that yet. |
|
@crosbymichael Can you take a look at this? Without this, once shim process exits, container will be left running forever. |
Signed-off-by: Lantao Liu <lantaol@google.com>
b010dc4 to
5c9811d
Compare
|
LGTM |
|
ping @containerd/containerd-maintainers |
Fixes #3199
Depends on containerd/ttrpc#35.
When containerd-shim is killed when containerd is running:
When containerd-shim is killed when containerd is down, after containerd restarts:
I'll add integration test in the CRI plugin for the above 2 cases. Not sure whether we want these kind of tests in the containerd integration test, given that we may need to
pgrep containerd-shimandpkill containerd-shim.Signed-off-by: Lantao Liu lantaol@google.com