-
Notifications
You must be signed in to change notification settings - Fork 3.8k
runc-shim: check/kill init's children even if skipping init exit #10608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runc-shim: check/kill init's children even if skipping init exit #10608
Conversation
|
Skipping CI for Draft Pull Request. |
|
exec process may have child process. if father process exited, process 1 will be child process' mother.It may be not be killed. |
|
Sorry, I don't quite understand what you mean. This PR doesn't change anything about which processes are killed, just when. We have a check – However, this is only called when we're about to send the exit event for the init process – which is now delayed if there are pending execs. We shouldn't delay this – we can run this logic as soon as we receive the exit for the init process, regardless of whether we're going to delay sending that exit or not. |
8db9224 to
1c4fd15
Compare
1c4fd15 to
47f0829
Compare
crictl exec -it containerdA sh sleep will not be killed by ip.KillAll |
|
That might be true, but that's not a new feature of this PR. Also, wouldn't |
47f0829 to
1c8283a
Compare
|
I try this pr, issue still exist. for k8s just set shareProcessNamespace =true (share pause pid ns) in pod yaml. (if we don't use crictl create pods) |
|
As I mentioned before, I unfortunately can't repro – could you share containerd logs that you get when you reproduce the issue? If possible, with some extra logs such as 511ed30 |
If this issue occurs when sharing the PID namespace, then we have no guarantee that by the time we receive the init's exit, we have/will soon receive the exec's exits. In the meantime, I think (need to confirm) that other execs could "start" execution/increment I don't think we guaranteed ordering before all the recent changes when sharing a pid namespace. Maybe we shouldn't skip processing the init exit in that case. |
a1751c7 to
1c8283a
Compare
If a container isn't running in it's own pid namespace/it's init process is not the init process of the pid namespace, it's possible for an exec started just before the init process exited/was reaped to start successfully/not early exit. If we previously skipped sending an init processes exit due to pending execs, we need to emit it regardless of whether the pending exec was an early exit or not, otherwise this exit won't get emitted at all. Signed-off-by: Laura Brehm <laurabrehm@hey.com>
1c8283a to
8543b69
Compare
|
Superseded by #10647 |
This PR moves the init exit processing in
s.handleStartedout of theif exitedblock – regardless of whether the exec was an early exit or not, if it was the last pending exec we should process it's exit. We've already removedcfroms.pendingExecs, and this being the last pending exec means that after this we'll never process the exit if it doesn't happen here.This is important to ensure correctness in the shared-pid namespace scenario (together with the changes from #10634) – with these changes we: