-
Notifications
You must be signed in to change notification settings - Fork 3.8k
cri: fix lost container exit events if they arrive before info is cached #11579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @ningmingxiao. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
aac07bc to
0abdf93
Compare
9eea9ec to
62a5780
Compare
0a62d75 to
74a9943
Compare
|
/ok-to-test |
|
/retest |
74a9943 to
55c91c1
Compare
f7d24fa to
b5590c4
Compare
|
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm :-)
mikebrow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the tightening of the exit monitor window on restart..
That said, I still think we need to clean up the containers on restart that are not restartable. For example, can't load metadata.. those containers need to be cleaned up.
|
That cleanup should create another pr. |
b5590c4 to
4a18570
Compare
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
4a18570 to
ead5c1e
Compare
|
/retest |
279819a to
ead5c1e
Compare
|
@mikebrow is this PR critical enough to warrant the cherry pick? |
|
|
I'd like to see the test added first, and possibly another deeper fix. This does not eliminate the window of opportunity. We might have to change our init process up a bit to remove this possibility entirely. Should be a way to ensure the shim waits to notify the exit event or it is cached until we have properly completed our restart. |
|
@samuelkarp: new pull request created: #11657 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@mikebrow That's fair. I'll remove the cherry-pick label and close the PR it just made. |
|
or at least before we merge the cherry into 2.0.. let's get some legs under this issue |
if you'd like to "test" it on 2.0 that won't hurt :-) |
|
@ningmingxiao were you planning to work on it? |
|
I want to add some test, I can add some sleep in load container and then and then kill pidof container. but can't find a good way to add some sleep in testcae.I have to change containerd code. |
fix container events sometimes lost when restart containerd
@mikebrow @AkihiroSuda @mxpv