Containerd IP leakage #5768

@Random-Liu

Description

We see a problem in production where containerd may leak pod IPs on the node.

Steps to reproduce the issue:

  • When pod network setup is slow, RunPodSandbox may time out or fail;
  • Once RunPodSandbox fails, it tries to tear down the pod network in a defer;
  • However, because CNI is slow, the teardown also fails;
  • At this point, the pod sandbox is gone, but the network has not been properly torn down, so the IP stays allocated.
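The failure sequence above can be modeled with a small, self-contained Go sketch. Everything here is hypothetical and for illustration only: `fakeCNI`, `errTimeout`, the `store` map, and this `runPodSandbox` are stand-ins, not containerd's actual types or code. The sketch assumes the plugin reserves the IP before the caller sees the timeout.

```go
package main

import "errors"

// fakeCNI is a hypothetical stand-in for a slow CNI plugin.
type fakeCNI struct {
	allocated map[string]bool // sandbox ID -> IP still reserved
	slow      bool            // when true, every call reports a timeout
}

var errTimeout = errors.New("cni: timed out")

// setup reserves an IP, then reports a timeout if the plugin is slow.
// The reservation persists even though the caller sees an error.
func (c *fakeCNI) setup(id string) error {
	c.allocated[id] = true
	if c.slow {
		return errTimeout
	}
	return nil
}

// teardown releases the IP unless the plugin is slow.
func (c *fakeCNI) teardown(id string) error {
	if c.slow {
		return errTimeout
	}
	delete(c.allocated, id)
	return nil
}

// runPodSandbox models the ordering described in the steps: network setup
// first, best-effort teardown in a defer, and no sandbox record on failure.
func runPodSandbox(c *fakeCNI, store map[string]bool, id string) (err error) {
	defer func() {
		if err != nil {
			_ = c.teardown(id) // a failure here is dropped; nothing remembers id
		}
	}()
	if err = c.setup(id); err != nil {
		return err
	}
	store[id] = true // the sandbox is recorded only on success
	return nil
}
```

Calling `runPodSandbox` with `slow: true` returns an error, leaves `store` empty, and leaves the ID in `allocated`: no sandbox record remains that a later cleanup pass could use to release the IP.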

Proposed solution
We should probably change how RunPodSandbox works.

It should:

  1. Create the sandbox container first;
  2. Set up the network for the sandbox container;
  3. Create the sandbox container task.

This way, if anything goes wrong in RunPodSandbox, we can still try to clean up in the defer. If any cleanup step fails, however, the sandbox container on disk still represents the sandbox, and kubelet will try to ensure it is eventually cleaned up properly.
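A minimal Go sketch of the proposed ordering follows. All names (`fakeCNI`, `sandboxState`, `store`, `reconcile`) are hypothetical and chosen for illustration; this is not containerd's implementation. In particular, `reconcile` only approximates the kubelet-driven retry, and the on-disk sandbox container is modeled as an entry in a map.

```go
package main

import "errors"

type sandboxState int

const (
	stateCreated sandboxState = iota // container record exists, task not started
	stateReady                       // network is set up and task is created
)

// fakeCNI is a hypothetical CNI stand-in whose calls can be made to fail.
type fakeCNI struct {
	allocated    map[string]bool
	failSetup    bool
	failTeardown bool
}

func (c *fakeCNI) setup(id string) error {
	c.allocated[id] = true // the IP is reserved even if the call then fails
	if c.failSetup {
		return errors.New("cni: setup timed out")
	}
	return nil
}

func (c *fakeCNI) teardown(id string) error {
	if c.failTeardown {
		return errors.New("cni: teardown timed out")
	}
	delete(c.allocated, id)
	return nil
}

func runPodSandbox(c *fakeCNI, store map[string]sandboxState, id string) (err error) {
	// 1. Create the sandbox container record first.
	store[id] = stateCreated
	defer func() {
		if err == nil {
			return
		}
		// Drop the record only if the network was actually torn down.
		if c.teardown(id) == nil {
			delete(store, id)
		}
	}()
	// 2. Set up the network for the sandbox container.
	if err = c.setup(id); err != nil {
		return err
	}
	// 3. Create the sandbox container task (modeled as a state change).
	store[id] = stateReady
	return nil
}

// reconcile models a later cleanup pass over sandboxes that never became ready.
func reconcile(c *fakeCNI, store map[string]sandboxState) {
	for id, st := range store {
		if st == stateCreated && c.teardown(id) == nil {
			delete(store, id)
		}
	}
}
```

When both setup and teardown fail, the record stays in `store` in `stateCreated`, so a later `reconcile` pass can retry the teardown and release the IP once CNI recovers.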

Labels

area/cri, kind/bug
