
[v1.18] Keep non-serving terminating backends#42708

Merged
joamaki merged 3 commits into cilium:v1.18 from joamaki:pr/joamaki/v1.18-keep-terminating-not-serving
Nov 17, 2025


@joamaki
Contributor

@joamaki joamaki commented Nov 11, 2025

This backports #40969, #41234 and #42170 to v1.18. It fixes the loss of connectivity to terminating endpoints that are marked unready due to pod readiness checks.

Kubernetes endpoints that are terminating are retained in the backends BPF state regardless of the "serving" condition to avoid connection disruptions when a pod no longer signals readiness to process new connections.

[ upstream commit 6f41c98 ]
[ backporter's notes: adapted ParseEndpointSliceV1Beta & tests ]

A terminating pod that no longer signals readiness is seen as an
endpoint with {ready: false, serving: false, terminating: true}. Such
endpoints were skipped, so the backend was removed from the backends map,
which caused connection disruptions.

Fix this by keeping non-serving terminating backends in the backends BPF
map, but not using them as fallback backends if only terminating backends exist.

This is implemented by:
- Including all conditions in [k8s.Backend], not just Terminating.
- Adding a new backend state (TerminatingNotServing) used with Terminating&!Serving conditions.
- Not considering TerminatingNotServing backends when selecting the backends.
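
The three steps above can be illustrated with a minimal Go sketch. It is not the Cilium implementation: the `Conditions`, `State`, `Backend`, `stateFor` and `selectBackends` names are hypothetical, and the fallback to serving terminating backends when no active backend exists is an assumption drawn from the wording of the commit message.

```go
package main

import "fmt"

// Conditions holds the three endpoint conditions named in the commit message.
// Hypothetical type for this sketch only.
type Conditions struct {
	Ready, Serving, Terminating bool
}

// State is a hypothetical backend state enum.
type State int

const (
	StateActive State = iota
	StateTerminating
	StateTerminatingNotServing
)

// Backend is a hypothetical backend record.
type Backend struct {
	Addr  string
	State State
}

// stateFor maps conditions to a state. Combinations this sketch does not
// cover (not ready and not terminating) return ok == false.
func stateFor(c Conditions) (State, bool) {
	switch {
	case c.Terminating && !c.Serving:
		return StateTerminatingNotServing, true
	case c.Terminating:
		return StateTerminating, true
	case c.Ready:
		return StateActive, true
	}
	return 0, false
}

// selectBackends returns the backends eligible for new connections: active
// ones if any exist, otherwise serving terminating ones as a fallback.
// TerminatingNotServing backends are never selected.
func selectBackends(all []Backend) []Backend {
	var active, terminating []Backend
	for _, b := range all {
		switch b.State {
		case StateActive:
			active = append(active, b)
		case StateTerminating:
			terminating = append(terminating, b)
		}
	}
	if len(active) > 0 {
		return active
	}
	return terminating
}

func main() {
	all := []Backend{
		{Addr: "10.0.0.1", State: StateTerminating},
		{Addr: "10.0.0.2", State: StateTerminatingNotServing},
	}
	// Both backends stay in `all` (the stand-in for the backends map), but
	// only the serving one is offered to new connections.
	fmt.Println(selectBackends(all))
}
```

In this sketch the full backend list plays the role of the backends BPF map: a `StateTerminatingNotServing` entry stays in it so that existing connections can still resolve the backend, while `selectBackends` keeps it out of the pool for new connections.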

Signed-off-by: Jussi Maki <jussi@isovalent.com>

[ upstream commit af7dde3 ]

Skip endpoints in endpoint slices that have no conditions set, as those
are not yet ready to serve traffic. Previously they were marked quarantined,
which is unnecessary and caused issues: the quarantined state was restored
and not cleared until the backend was health checked (and we usually don't have a health checker).
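
A minimal Go sketch of the skip described above follows. It uses a local `EndpointConditions` stand-in with pointer-typed fields, and it assumes that "no conditions set" means all three fields are nil. The `Endpoint`, `hasNoConditions` and `usableEndpoints` names are hypothetical and do not come from the Cilium code base.

```go
package main

import "fmt"

// EndpointConditions is a local stand-in with optional (pointer) conditions.
// Assumption: an unset condition is represented by a nil pointer.
type EndpointConditions struct {
	Ready, Serving, Terminating *bool
}

// Endpoint is a hypothetical, simplified endpoint slice entry.
type Endpoint struct {
	Addr       string
	Conditions EndpointConditions
}

// hasNoConditions reports whether none of the three conditions is set.
func hasNoConditions(c EndpointConditions) bool {
	return c.Ready == nil && c.Serving == nil && c.Terminating == nil
}

// usableEndpoints drops endpoints without any conditions instead of
// assigning them a special (for example quarantined) state.
func usableEndpoints(eps []Endpoint) []Endpoint {
	var out []Endpoint
	for _, ep := range eps {
		if hasNoConditions(ep.Conditions) {
			continue
		}
		out = append(out, ep)
	}
	return out
}

func boolPtr(b bool) *bool { return &b }

func main() {
	eps := []Endpoint{
		{Addr: "10.0.0.1"}, // no conditions yet: skipped
		{Addr: "10.0.0.2", Conditions: EndpointConditions{Ready: boolPtr(true)}},
	}
	for _, ep := range usableEndpoints(eps) {
		fmt.Println(ep.Addr)
	}
}
```

Skipping, rather than storing the endpoint in a special state, means there is nothing to restore later, so no stale state can linger when no health checker is running.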

Fixes: cilium#41194
Fixes: 6f41c98 ("loadbalancer: Keep non-serving terminating backends")
Signed-off-by: Jussi Maki <jussi@isovalent.com>

[ upstream commit 73812ed ]

To avoid disrupting connections to backends that are flapping on their
readiness state, keep the backends in the backends BPF map, but not in
the services map.
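
The split between the two maps can be sketched in Go as follows. This is an illustration under stated assumptions, not the Cilium data path: plain Go maps stand in for the backends and services BPF maps, and the `LB`, `NewLB` and `SetReady` names are hypothetical.

```go
package main

import "fmt"

// LB is a hypothetical in-memory stand-in for the two BPF maps.
type LB struct {
	// backends stands in for the backends map: every known backend, which
	// established connections rely on.
	backends map[string]struct{}
	// services stands in for the services map: per service, the backends
	// that new connections may be sent to.
	services map[string]map[string]struct{}
}

func NewLB() *LB {
	return &LB{
		backends: map[string]struct{}{},
		services: map[string]map[string]struct{}{},
	}
}

// SetReady records a readiness change for a backend of a service. The
// backend always stays in the backends map; only its membership in the
// services map follows the readiness state.
func (lb *LB) SetReady(svc, addr string, ready bool) {
	lb.backends[addr] = struct{}{}
	if lb.services[svc] == nil {
		lb.services[svc] = map[string]struct{}{}
	}
	if ready {
		lb.services[svc][addr] = struct{}{}
	} else {
		delete(lb.services[svc], addr)
	}
}

func main() {
	lb := NewLB()
	lb.SetReady("web", "10.0.0.1", true)
	lb.SetReady("web", "10.0.0.1", false) // readiness flaps to false

	_, known := lb.backends["10.0.0.1"]
	_, selectable := lb.services["web"]["10.0.0.1"]
	fmt.Println("known:", known, "selectable:", selectable)
}
```

Because the backend entry is never deleted on a readiness change, a backend that flaps back to ready only needs to be re-added to the service's selectable set.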

This in effect reverts the temporary workaround in cilium#41234.

Fixes: cilium#41244
Signed-off-by: Jussi Maki <jussi@isovalent.com>

@maintainer-s-little-helper maintainer-s-little-helper bot added the backport/1.18 (This PR represents a backport for Cilium 1.18.x of a PR that was merged to main.) and kind/backports (This PR provides functionality previously merged into master.) labels Nov 11, 2025
@joamaki
Contributor Author

joamaki commented Nov 11, 2025

/test

@joamaki joamaki marked this pull request as ready for review November 12, 2025 13:15
@joamaki joamaki requested review from a team as code owners November 12, 2025 13:15
@joamaki joamaki requested a review from squeed November 12, 2025 13:15
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge (This PR has passed all tests and received consensus from code owners to merge.) label Nov 13, 2025
@joamaki joamaki added this pull request to the merge queue Nov 17, 2025
Merged via the queue into cilium:v1.18 with commit 2ded558 Nov 17, 2025
77 checks passed
@joamaki joamaki deleted the pr/joamaki/v1.18-keep-terminating-not-serving branch November 17, 2025 16:34

3 participants