Skip to content

bpf: Don't resolve remote cluster NodePort services at the source#41337

Merged
pchaigno merged 2 commits intomainfrom
pr/pchaigno/fix-clustermesh-bug-duplicate-svc
Sep 8, 2025
Merged

bpf: Don't resolve remote cluster NodePort services at the source#41337
pchaigno merged 2 commits intomainfrom
pr/pchaigno/fix-clustermesh-bug-duplicate-svc

Conversation

@pchaigno
Copy link
Copy Markdown
Member

@pchaigno pchaigno commented Aug 22, 2025

This pull request fixes a bug in clustermesh mode that causes us to redirect requests to the wrong NodePort service when the same NodePort service exists on a remote and the local clusters. The first commit adds a new ipcache flag to indicate that a node is in a remote cluster. The second commit uses the new flag to skip load balancing at the source for this case. See commits for details.

Fixes: #24692.

Fix a bug that would cause NodePort requests to be sent to the wrong backends when using KPR and Clustermesh with two identical, non-global NodePort services on different clusters.

@pchaigno pchaigno added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Aug 22, 2025
@pchaigno pchaigno force-pushed the pr/pchaigno/fix-clustermesh-bug-duplicate-svc branch 8 times, most recently from e91ed1a to 288cfb2 Compare August 25, 2025 07:58
@pchaigno pchaigno added area/clustermesh Relates to multi-cluster routing functionality in Cilium. needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Aug 25, 2025
@pchaigno pchaigno force-pushed the pr/pchaigno/fix-clustermesh-bug-duplicate-svc branch 2 times, most recently from 35a5274 to 4d616e8 Compare August 29, 2025 13:51
In the subsequent commit, we will need to know if a remote node is in
the local or not. This commit therefore adds a new flag to the ipcache
value, to indicate whether a node IP address belongs to a remote cluster
node.

This flag is only defined for nodes at the moment and remains unset for
remote cluster pods.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
We currently have a bug when using socket-level load balancing (part of
KPR) with Clustermesh. When we have the same non-global service existing
on two clusters A and B, if we try to reach the B version from cluster
A, we end up with an answer from the A version.

We don't have a way to distinguish between remote node IPs belonging to
the local or to a remote cluster. Hence, when the VIP is a remote node
IP, we perform a wildcard lookup (0.0.0.0) and find the local service as
a result.

The previous commit introduces a way to distinguish between local and
remote cluster node IP address, via the ipcache. This commit therefore
uses it to only perform a wildcard lookup if the VIP is remote node IP
belonging to the local cluster.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
@pchaigno pchaigno force-pushed the pr/pchaigno/fix-clustermesh-bug-duplicate-svc branch from 4d616e8 to 84d8782 Compare August 29, 2025 14:22
@pchaigno pchaigno added area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. feature/socket-lb Impacts the Socket-LB part of Cilium's kube-proxy replacement. labels Aug 29, 2025
@pchaigno pchaigno changed the title Test ipc bpf: Don't resolve remote cluster NodePort services at the source Aug 29, 2025
@pchaigno pchaigno marked this pull request as ready for review September 2, 2025 11:45
@pchaigno pchaigno requested review from a team as code owners September 2, 2025 11:45
@pchaigno pchaigno enabled auto-merge September 2, 2025 11:45
@pchaigno pchaigno mentioned this pull request Sep 2, 2025
Copy link
Copy Markdown
Member

@YutaroHayakawa YutaroHayakawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, it would be nice if we could solve this at control-plane-level, but yeah I don't come up with any better way. This makes sense to me.

Copy link
Copy Markdown
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Could you please also revert 9e83a6f?

@pchaigno pchaigno added this pull request to the merge queue Sep 8, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 8, 2025
Merged via the queue into main with commit 12acdb5 Sep 8, 2025
442 of 447 checks passed
@pchaigno pchaigno deleted the pr/pchaigno/fix-clustermesh-bug-duplicate-svc branch September 8, 2025 16:58
@pchaigno
Copy link
Copy Markdown
Member Author

pchaigno commented Sep 8, 2025

Thanks! Could you please also revert 9e83a6f?

@giorio94 I didn't get to it before it got merged, but will definitely follow up on this.

@tklauser tklauser mentioned this pull request Sep 9, 2025
14 tasks
@tklauser tklauser added backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. and removed needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Sep 9, 2025
@github-actions github-actions bot added backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. and removed backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. labels Sep 10, 2025
@julianwiedmann julianwiedmann added the affects/v1.17 This issue affects v1.17 branch label Oct 8, 2025
@cilium-release-bot cilium-release-bot bot moved this to Released in cilium v1.19.0 Feb 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects/v1.17 This issue affects v1.17 branch area/clustermesh Relates to multi-cluster routing functionality in Cilium. area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. feature/socket-lb Impacts the Socket-LB part of Cilium's kube-proxy replacement. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.

Projects

No open projects
Status: Released

Development

Successfully merging this pull request may close these issues.

Strange behavior when using NodePort with clustermesh

7 participants