Skip to content

Conversation

@dcantah
Copy link
Member

@dcantah dcantah commented Sep 22, 2023

When a shim process is unexpectedly killed in a way that was not initiated through containerd - containerd reports the pod as not ready but the containers as running. This results in kubelet repeatedly sending container kill requests that fail since containerd cannot connect to the shim.

Changes:

  • In the container exit handler, treat err: Unavailable as if the container has already exited out
  • When attempting to get a connection to the shim, if the controller isn't available assume that the shim has been killed (needs to be done since we have a separate exit handler that cleans up the reference to the shim controller - before kubelet has the chance to call StopPodSandbox)

(cherry picked from commit 729c97c)

@dmcgowan
Copy link
Member

/retest

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

When a shim process is unexpectedly killed in a way that was not initiated through containerd - containerd reports the pod as not ready but the containers as running. This results in kubelet repeatedly sending container kill requests that fail since containerd cannot connect to the shim.

Changes:

- In the container exit handler, treat `err: Unavailable` as if the container has already exited out
- When attempting to get a connection to the shim, if the controller isn't available assume that the shim has been killed (needs to be done since we have a separate exit handler that cleans up the reference to the shim controller - before kubelet has the chance to call StopPodSandbox)

Signed-off-by: Aditya Ramani <a_ramani@apple.com>
(cherry picked from commit 729c97c)
Signed-off-by: Danny Canter <danny@dcantah.dev>
@dcantah
Copy link
Member Author

dcantah commented Sep 30, 2023

Rebased after the Windows mingw fixes. Thanks!

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid fuweid merged commit 7a75a30 into containerd:release/1.7 Sep 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants