Skip to content

Exclude Node-IPAM VIPs from LB wildcard entries (v2)#43565

Merged
aanm merged 8 commits intocilium:mainfrom
ajmmm:pr/nodeipam-wildcard-fix-v2
Jan 12, 2026
Merged

Exclude Node-IPAM VIPs from LB wildcard entries (v2)#43565
aanm merged 8 commits intocilium:mainfrom
ajmmm:pr/nodeipam-wildcard-fix-v2

Conversation

@ajmmm
Copy link
Copy Markdown
Member

@ajmmm ajmmm commented Jan 5, 2026

At a high level, this PR:

  • Repositions both LB-IPAM Config and Node-IPAM Config into their own reusable packages
  • Updates the LoadBalancer BPF Reconciler so that VIPs allocated by Node-IPAM are NOT candidates for wildcard service entries

This work is to fix regression in v1.19-dev that causes node instability when using Node-IPAM in PR: #40684

Fixes: #43206

Fix bug in LoadBalancer that would cause node connectivity faults if Service LoadBalancer VIPs are allocated with Node-IPAM.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 5, 2026
@ajmmm ajmmm force-pushed the pr/nodeipam-wildcard-fix-v2 branch from be3ab08 to 91ba93f Compare January 6, 2026 15:38
@ajmmm
Copy link
Copy Markdown
Member Author

ajmmm commented Jan 6, 2026

/test

@ajmmm
Copy link
Copy Markdown
Member Author

ajmmm commented Jan 6, 2026

/ci-l3-l4

@ajmmm
Copy link
Copy Markdown
Member Author

ajmmm commented Jan 6, 2026

/ci-l7

@joestringer joestringer added the dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs label Jan 8, 2026
@julianwiedmann julianwiedmann added release-blocker/1.19 This issue will prevent the release of the next version of Cilium. and removed dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs labels Jan 9, 2026
@julianwiedmann
Copy link
Copy Markdown
Member

tentatively marking as blocker for v1.19, since this addresses a regression

ajmmm added 8 commits January 9, 2026 16:53
This commit fixes the helm documentation for defaultLBServiceIPAM,
specifically the schema metadata, to avoid it leaking into human-
readable documentation.

Fixes: dfc7752 ("lbipam: nodeipam: introduce a common config to set the default ipam")
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
Split configuration out of NodeIPAM into its own cell so that other
packages can pull in NodeIPAM configuration.

The rationale to position this outside the operator is to avoid
shared packages building dependencies into the operator.

Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit moves NodeSvcLBClass and NodeSvcLBMatchLabelsAnnotation
strings out of the nodeipam package and into the nodeipamconfig package,
such that the strings are reusable in other areas of the codebase.

Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
Split configuration out of LBIPAM into its own cell so that other
packages can pull in LBIPAM configuration.

The rationale to position this outside the operator is to avoid
shared packages	building dependencies into the operator.

Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
The following commit split out LB-IPAM configuration structures into
their own package, but this still expressed two configuration structs
which seemed unnecessary.

This commit consolidates the properties of lbipamconfig.SharedConfig
into the lbipamconfig.lbipamConfig structure to simplify this a little
while also exporting an interface rather than structure members directly.

Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
This commit refactors the NewExternalConfig() function to accept an
input params structure, rather than a long list of input parameters, so
the following commit can extend this with additional configs.

Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
Introduce additional values in the loadbalancer ExternalConfig struct
so that the loadbalancer is aware of LB-IPAM and Node-IPAM configs, which
are required by the wildcard generation logic in the BPF reconciler.

This commit also updates various test infrastructure to properly include
the new dependencies, as well as documentation.

Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
Currently, the loadbalancer's BPF reconciler will create "wildcard"
service entries in the Cilium datapath for any externally facing VIP,
to ensure the datapath does not punt IP traffic towards VIPs on ports
that are not actually mapped to a service.

This works as intended with LB-IPAM, but if VIPs are allocated with
Node-IPAM, the result is that Node IPs are used instead. This leads to
wildcard entries being programmed with Node IPs, which disrupts control
traffic within a cluster.

This commit introduces additional logic in the loadbalancer's BPF
reconciler to only program wildcard service entries if either of
the following cases are true.

1. There is no loadBalancerClass defined on the service, and the
   default LB service IPAM is LB-IPAM.

2. There is a defined loadBalancerClass defined on the service, and
   it is set to either of the following strings:
   - BGPLoadBalancerClass ("io.cilium/bgp-control-plane")
   - L2AnnounceLoadBalancerClass ("io.cilium/l2-announcer")

This ensures the loadbalancer does not program wildcard service entries
for Node-IPAM allocated VIPs.

Example with service allocated via LB-IPAM:

  $ kubectl get svc netshoot-service -o json | jq '{loadBalancerClass: .spec.loadBalancerClass, ingress: .status.loadBalancer.ingress}'
  {
    "loadBalancerClass": null,
    "ingress": [
      {
        "ip": "10.244.255.200",
        "ipMode": "VIP"
      },
      {
        "ip": "fd00:10:244:ff:ffff:ffff:ffff:ff00",
        "ipMode": "VIP"
      }
    ]
  }

Note presence of wildcard "ANY" entry in datapath:

  # cilium bpf lb list |  grep -E '10.244.255.200.*LoadBalancer'
  10.244.255.200:8000/TCP (0)                         0.0.0.0:0 (19) (0) [LoadBalancer]
  10.244.255.200:8000/UDP (0)                         0.0.0.0:0 (20) (0) [LoadBalancer]
  10.244.255.200:0/ANY (0)                            0.0.0.0:0 (0) (0) [LoadBalancer, non-routable]

Example with service allocated via Node-IPAM:

  $ kubectl get svc netshoot-service2 -o json | jq '{loadBalancerClass: .spec.loadBalancerClass, ingress: .status.loadBalancer.ingress}'
  {
    "loadBalancerClass": "io.cilium/node",
    "ingress": [
      {
        "ip": "172.18.0.3",
        "ipMode": "VIP"
      },
      {
        "ip": "fc00:c111::3",
        "ipMode": "VIP"
      }
    ]
  }

Note absence of wildcard "ANY" entry in datapath:

  # cilium bpf lb list | grep -E '172.18.0.3.*LoadBalancer'
  172.18.0.3:8001/TCP (0)                             0.0.0.0:0 (35) (0) [LoadBalancer]
  172.18.0.3:8001/UDP (0)                             0.0.0.0:0 (36) (0) [LoadBalancer]

Fixes: b3b0fc8 ("loadbalancer: Add logic to program wildcard (drop) entries in eBPF.")
Signed-off-by: Alasdair McWilliam <alasdair.mcwilliam@isovalent.com>
@ajmmm ajmmm force-pushed the pr/nodeipam-wildcard-fix-v2 branch from 91ba93f to fbcaef8 Compare January 9, 2026 17:14
@ajmmm
Copy link
Copy Markdown
Member Author

ajmmm commented Jan 9, 2026

/test

@ajmmm ajmmm marked this pull request as ready for review January 9, 2026 17:15
@ajmmm ajmmm requested review from a team as code owners January 9, 2026 17:15
@ajmmm ajmmm requested review from Artyop, derailed and tklauser January 9, 2026 17:15
Copy link
Copy Markdown
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

envoy ✔️

@aanm aanm enabled auto-merge January 12, 2026 13:14
@aanm aanm added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Jan 12, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Jan 12, 2026
@aanm aanm added this pull request to the merge queue Jan 12, 2026
Merged via the queue into cilium:main with commit c4d6072 Jan 12, 2026
110 of 112 checks passed
@github-project-automation github-project-automation bot moved this from Proposed to Done in Release blockers Jan 12, 2026
@ajmmm ajmmm deleted the pr/nodeipam-wildcard-fix-v2 branch January 12, 2026 13:42
@julianwiedmann julianwiedmann added area/loadbalancing Impacts load-balancing and Kubernetes service implementations area/kpr Anything related to our kube-proxy replacement. labels Jan 12, 2026
@qmonnet
Copy link
Copy Markdown
Member

qmonnet commented Jan 12, 2026

This PR introduced some new packages that don't have an owner in CODEOWNERS. @ajmmm Could you please open a PR to fix it the CODEOWNERS file? I don't have all the context, but on a quick look it seems that @cilium/sig-ipam would be a good candidate.

image

@ajmmm
Copy link
Copy Markdown
Member Author

ajmmm commented Jan 12, 2026

This PR introduced some new packages that don't have an owner in CODEOWNERS. @ajmmm Could you please open a PR to fix it the CODEOWNERS file?

@aanm kindly fixed it already under: #43692

@qmonnet
Copy link
Copy Markdown
Member

qmonnet commented Jan 12, 2026

Oh I missed it, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/kpr Anything related to our kube-proxy replacement. area/loadbalancing Impacts load-balancing and Kubernetes service implementations kind/bug This is a bug in the Cilium logic. release-blocker/1.19 This issue will prevent the release of the next version of Cilium. release-note/bug This PR fixes an issue in a previous release of Cilium.

Projects

Archived in project
Status: Released

Development

Successfully merging this pull request may close these issues.

KPR misbehaving when a kubernetes service of type LB advertise a node IP in its status