IP Address Leak with Chained CNI Plugins When Teardown Fails #12130
Description
When using chained CNI plugins, if pod creation fails because of an error in a later plugin in the chain, an IP address allocated by an earlier plugin can be leaked on every iteration of sandbox creation. As a consequence, a single pod creation can allocate a large number of IP addresses that are never cleaned up in a relatively short amount of time, until no IP addresses remain available.
Steps to reproduce the issue
This occurs when the following sequence of events happens:
- During sandbox creation, `CNI ADD` is called for the CNI plugin chain.
- The first plugin (e.g., `host-local` IPAM) successfully allocates an IP address.
- A subsequent plugin in the chain fails the `CNI ADD` operation. This causes the overall `setupPodNetwork` step to fail.
- Containerd then attempts to clean up the failed sandbox.
- During cleanup, `teardownPodNetwork` is called, which invokes `CNI DEL` on the plugin chain.
- The same CNI plugin that failed on `ADD` also fails on `DEL` (in practice this might be transient, e.g. the CNI provider may happen to be loading and unable to process the request).
- Because the teardown failed, and the initial setup also failed (`CNIResult` was never populated), containerd ignores the teardown error and considers network cleanup finished.
- As a result, the `CNI DEL` command is never sent to the first plugin, and the IP address it allocated is permanently leaked. Kubelet retries creating the pod, leading to multiple leaked IPs for a single pod creation attempt.
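The sequence above can be sketched as a minimal simulation. All names here (`simulate`, the IP literal, the boolean flags) are illustrative stand-ins, not containerd's actual types or code; the point is only the error-swallowing pattern:

```go
package main

import "fmt"

// simulate models one sandbox-creation attempt against a two-plugin CNI
// chain and returns how many IPs are left allocated (leaked) afterwards.
// This is a hypothetical sketch of the reported pattern, not real
// containerd code.
func simulate() int {
	allocated := map[string]bool{}
	var cniResult interface{} // stays nil: setupPodNetwork never completed

	// CNI ADD: plugin 1 (e.g. host-local IPAM) allocates an IP,
	// then plugin 2 fails, so the overall setup fails.
	allocated["10.0.0.5"] = true
	setupFailed := true

	// CNI DEL during cleanup: plugin 2 fails again, so DEL never
	// reaches plugin 1 to release the IP.
	teardownFailed := true

	// Buggy pattern: the teardown error is swallowed because the
	// setup result was never populated, and cleanup is declared done.
	if teardownFailed && setupFailed && cniResult == nil {
		// error ignored; delete(allocated, "10.0.0.5") is never called
	}

	return len(allocated) // the IP from plugin 1 remains allocated
}

func main() {
	fmt.Println("leaked IPs:", simulate())
}
```

Each kubelet retry repeats this attempt, so the leaked count grows by one per iteration.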
Describe the results you received and expected
In the case illustrated above, a Kubernetes node experiences IP leakage and, in the worst case, exhausts all available IPs on the node in a short amount of time.
More generally, containerd appears fragile to churn/errors in CNI plugins, especially ones used in a chained configuration.
Ask
The logic that swallows the `cleanupErr` should be reconsidered. Even if the initial setup failed, a teardown error indicates that some resources might be left behind, and cleanup should be retried or handled more robustly.
What version of containerd are you using?
Containerd v1.7.22+ or v1.6.37+.
Any other relevant information
The skipping behavior for `teardownPodNetwork` was added in #10744 and was cherry-picked into the 1.7 and 1.6 branches.
Show configuration if it is related to CRI plugin.
N/A