Skip to content

Handle teardown failure to avoid blocking cleanup#10839

Merged
estesp merged 1 commit intocontainerd:mainfrom
matteopulcini6:sandbox-deferring-teardown-main
Oct 16, 2024
Merged

Handle teardown failure to avoid blocking cleanup#10839
estesp merged 1 commit intocontainerd:mainfrom
matteopulcini6:sandbox-deferring-teardown-main

Conversation

@matteopulcini6
Copy link
Copy Markdown

Rebased main branch #10800

@k8s-ci-robot
Copy link
Copy Markdown

Hi @matteopulcini6. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@dosubot dosubot bot added the area/cri Container Runtime Interface (CRI) label Oct 15, 2024
@mikebrow
Copy link
Copy Markdown
Member

/ok-to-test

@matteopulcini6
Copy link
Copy Markdown
Author

Note - Ran a ./retest as there were several OOM containers

@matteopulcini6
Copy link
Copy Markdown
Author

matteopulcini6 commented Oct 15, 2024

Trying again, as I believe the container didn't start [OOM error] due to memory limits likely related to limited memory on GH Action, trying again

@matteopulcini6
Copy link
Copy Markdown
Author

/retest

@matteopulcini6 matteopulcini6 force-pushed the sandbox-deferring-teardown-main branch from 691e735 to 5d49f2e Compare October 15, 2024 22:20
Signed-off-by: Matteo Pulcini <Matteo.Pulcini@ibm.com>
@matteopulcini6
Copy link
Copy Markdown
Author

/ok-to-test

@matteopulcini6
Copy link
Copy Markdown
Author

/retest

@estesp estesp changed the title Handle teardown failure to avoid blocking cleanup -- offshoot [#10800] Handle teardown failure to avoid blocking cleanup Oct 16, 2024
@estesp estesp added cherry-picked/1.6.x PR commits are cherry-picked into release/1.6 branch cherry-picked/1.7.x PR commits are cherry-picked into release/1.7 branch labels Oct 16, 2024
@estesp
Copy link
Copy Markdown
Member

estesp commented Oct 16, 2024

This is (pre)cherry picked to #10770 (for release/1.7) and #10778 (for release/1.6). @mikebrow I assume you are still LGTM on this re-based PR for main?

Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mikebrow mikebrow added this pull request to the merge queue Oct 16, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 16, 2024
@estesp estesp added this pull request to the merge queue Oct 16, 2024
Merged via the queue into containerd:main with commit 0abada6 Oct 16, 2024
@matteopulcini6 matteopulcini6 deleted the sandbox-deferring-teardown-main branch October 16, 2024 17:12
MrHohn added a commit to MrHohn/containerd that referenced this pull request Jul 21, 2025
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illusrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and leads to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally ignoring the teardown network when error is related to CNI
initialization, while allowing teardown retry to happen when it makes
sense.
MrHohn added a commit to MrHohn/containerd that referenced this pull request Jul 21, 2025
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally ignoring the teardown network when error is related to CNI
initialization, while allowing teardown retry to happen when it makes
sense.
MrHohn added a commit to MrHohn/containerd that referenced this pull request Jul 21, 2025
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally ignoring the teardown network when error is related to CNI
initialization, while allowing teardown retry to happen when it makes
sense.

Signed-off-by: Zihong Zheng <zihongz@google.com>
MrHohn added a commit to MrHohn/containerd that referenced this pull request Jul 25, 2025
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally ignoring the teardown network when error is related to CNI
initialization, while allowing teardown retry to happen when it makes
sense.

Signed-off-by: Zihong Zheng <zihongz@google.com>
MrHohn added a commit to MrHohn/containerd that referenced this pull request Aug 4, 2025
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally skipping the teardown network only when error is related
to CNI initialization, while allowing teardown retry to happen if setup
already allocated network resoruce partially.

Signed-off-by: Zihong Zheng <zihongz@google.com>
MrHohn added a commit to MrHohn/containerd that referenced this pull request Aug 4, 2025
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally skipping the teardown network only when error is related
to CNI initialization, while allowing teardown retry to happen if setup
already allocated network resoruce partially.

Signed-off-by: Zihong Zheng <zihongz@google.com>
mikebrow pushed a commit to MrHohn/containerd that referenced this pull request Jan 30, 2026
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally skipping the teardown network only when error is related
to CNI initialization, while allowing teardown retry to happen if setup
already allocated network resoruce partially.

Signed-off-by: Zihong Zheng <zihongz@google.com>
mikebrow pushed a commit to MrHohn/containerd that referenced this pull request Feb 3, 2026
This is a follow up of both containerd#10744 and containerd#10839, where teardownPodNetwork
may be skipped if setup networks failed in the first place.

In the situation illustrated on containerd#12130, for chained CNI pattern the
skipping logic may be problematic and can lead to network resource
leakage (e.g. Pod IP).

This PR attempts to make the sandbox network handling more robust by
conditionally skipping the teardown network only when error is related
to CNI initialization, while allowing teardown retry to happen if setup
already allocated network resoruce partially.

Signed-off-by: Zihong Zheng <zihongz@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/cri Container Runtime Interface (CRI) cherry-picked/1.6.x PR commits are cherry-picked into release/1.6 branch cherry-picked/1.7.x PR commits are cherry-picked into release/1.7 branch ok-to-test size/XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants