datapath: Fix BPF masquerade IP selection with multiple IPs per interface #43996
Commit 1d86d3f does not match "(?m)^Signed-off-by:". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
…face

When a network device has multiple IP addresses (both public and private), BPF masquerading was incorrectly selecting the Kubernetes Node IP even when it was a private address and a public address was available on the same interface.

The issue was introduced in PR cilium#33629, which added K8s Node IP prioritization. The code was setting both ipv4PublicIndex and ipv4PrivateIndex to the K8s Node IP index, effectively forcing it to be selected regardless of public/private status. This broke the documented "prefer public over private" logic for Primary address selection used by BPF masquerading.

The fix ensures that K8s Node IP prioritization only applies within its own category (public or private):
- If K8s Node IP is public, it takes precedence over other public IPs
- If K8s Node IP is private, it takes precedence over other private IPs
- But public IPs still take precedence over private IPs for masquerading

This restores the correct behavior where egress traffic is masqueraded using the public IP address when available, which is required for proper routing in environments with both public and private IPs on the same interface.

Fixes: cilium#41866
Signed-off-by: Jonathan Siegel <248302+usiegj00@users.noreply.github.com>
1d86d3f to 0dddd39
joamaki
left a comment
Thanks! Please add a release note label to the PR description to be included in release notes.
/test
Thank you @joamaki -- let me know if there's anything else needed. We were suffering from this in our own cluster and are glad to see it get accepted!
Not yet working. Have to wait for cilium/cilium#43996 to be released.
Are there plans to backport this to v1.16, v1.17 and v1.18?
@julianwiedmann Do you know if this fix will be added to the next patch releases of 1.16, 1.17 and 1.18? I saw it's been backported to 1.19 but we're currently stuck on 1.15 and we need it in all the intermediate cilium versions in order to upgrade to 1.19. Thank you!
You can look at the backport policy here. Imho this shouldn't go back further; those branches are meant to be fairly stable and not have big behavioral changes at this point in their lifetime. If you strictly need this change for your upgrade and don't have any other workarounds, I'd suggest building your own v1.16 / v1.17 / v1.18 images and applying a backport.
Summary
When a network device has multiple IP addresses (both public and private), BPF masquerading was incorrectly selecting the Kubernetes Node IP even when it was a private address and a public address was available on the same interface.
The issue was introduced in PR #33629, which added K8s Node IP prioritization. The code was setting both ipv4PublicIndex and ipv4PrivateIndex to the K8s Node IP index, effectively forcing it to be selected regardless of public/private status. This broke the documented "prefer public over private" logic for Primary address selection used by BPF masquerading.

The fix ensures that K8s Node IP prioritization only applies within its own category (public or private):
- If K8s Node IP is public, it takes precedence over other public IPs
- If K8s Node IP is private, it takes precedence over other private IPs
- But public IPs still take precedence over private IPs for masquerading
This restores the correct behavior where egress traffic is masqueraded using the public IP address when available, which is required for proper routing in environments with both public and private IPs on the same interface.
Fixes: #41866
Test plan