Node IPAM LB and GatewayClass parameters leads to Reconciler error #42890

@shallot

Is there an existing issue for this?

  • I have searched the existing issues

Version

equal or higher than v1.18.4 and lower than v1.19.0

What happened?

I set up three VMs with kubeadm, each with a public IP NAT'd to its private IP, which I want to use to access HTTP/S services in k8s via DNS round-robin.

I tried installing Cilium with the Gateway API support based on the docs at
https://docs.cilium.io/en/stable/network/node-ipam/ and https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/parameterized-gatewayclass/

I used these Helm chart values:

kubeSystemNamespace: cilium

ipam:
  mode: "cluster-pool"

cluster:
  pool:
    ipv4:
      cidr: "10.242.0.0/16"

kubeProxyReplacement: true

aws:
  enabled: true

nodeIPAM:
  enabled: true

defaultLBServiceIPAM: nodeipam

gatewayAPI:
  enabled: true
  hostNetwork:
    enabled: true

envoy:
  enabled: true
  securityContext:
    capabilities:
      keepCapNetBindService: true
      envoy:
        # defaults for eBPF/networking
        - NET_ADMIN
        - SYS_ADMIN
        # to bind to port 80/443 on the host
        - NET_BIND_SERVICE

Without the nodeIPAM options, this got me as far as my Gateway serving traffic, but it never became Programmed; it stayed pending with the message "Address not ready yet".

So then I tried adding this:

apiVersion: cilium.io/v2alpha1
kind: CiliumGatewayClassConfig
metadata:
  name: node-ipam-lb
  namespace: default
spec:
  service:
    type: LoadBalancer
    loadBalancerClass: io.cilium/node

This gets accepted. Then I tried adding this:

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: node-ipam-lb-cilium
spec:
  controllerName: io.cilium/gateway-controller
  description: The Cilium GatewayClass for Node IPAM LB
  parametersRef:
    group: cilium.io
    kind: CiliumGatewayClassConfig
    name: node-ipam-lb
    namespace: default

Now that gets stuck with "Invalid GatewayClass", "InvalidParameters". The logs of cilium-operator say:

2025-11-20T10:44:49.334886475Z time=2025-11-20T10:44:49.334748078Z level=info msg="Reconciling GatewayClass" module=operator.operator-controlplane.leader-lifecycle.gateway-api resource=/node-ipam-lb-cilium
2025-11-20T10:44:49.343058447Z time=2025-11-20T10:44:49.342925548Z level=info msg="Reconciler error" module=operator.operator-controlplane.leader-lifecycle.controller-runtime controller=gatewayclass controllerGroup=gateway.networking.k8s.io controllerKind=GatewayClass GatewayClass.name=node-ipam-lb-cilium namespace="" name=node-ipam-lb-cilium reconcileID=8f656355-cefb-4cad-947c-a8c44f5b61a7 error="CiliumGatewayClassConfig.cilium.io "node-ipam-lb" not found"

I have no idea why it's "not found" when I can see it:

% kubectl get CiliumGatewayClassConfig.cilium.io -o yaml
apiVersion: v1
items:
- apiVersion: cilium.io/v2alpha1
  kind: CiliumGatewayClassConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"cilium.io/v2alpha1","kind":"CiliumGatewayClassConfig","metadata":{"annotations":{},"name":"node-ipam-lb","namespace":"default"},"spec":{"service":{"loadBalancerClass":"io.cilium/node","type":"LoadBalancer"}}}
    creationTimestamp: "2025-11-20T10:10:20Z"
    generation: 1
    name: node-ipam-lb
    namespace: default
    resourceVersion: "23924931"
    uid: fa25e60b-c516-42d9-bd0f-0e62b8fe8319
  spec:
    service:
      externalTrafficPolicy: Cluster
      loadBalancerClass: io.cilium/node
      loadBalancerSourceRangesPolicy: Allow
      type: LoadBalancer
  status:
    conditions:
    - lastTransitionTime: "2025-11-20T10:10:20Z"
      message: Valid GatewayClassConfig
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
kind: List
metadata:
  resourceVersion: ""

I thought it might have been some sort of an implicit dependency on having to be in the cilium namespace, so I tried moving it there, but it didn't help, still the same message:

2025-11-20T10:48:47.563926662Z time=2025-11-20T10:48:47.56376508Z level=info msg="Reconciler error" module=operator.operator-controlplane.leader-lifecycle.controller-runtime controller=gatewayclass controllerGroup=gateway.networking.k8s.io controllerKind=GatewayClass GatewayClass.name=node-ipam-lb-cilium namespace="" name=node-ipam-lb-cilium reconcileID=dba40eee-13c2-487c-8d47-424371ebb30c error="CiliumGatewayClassConfig.cilium.io "node-ipam-lb" not found"

Yet:

% kubectl get CiliumGatewayClassConfig.cilium.io -o yaml -n cilium
apiVersion: v1
items:
- apiVersion: cilium.io/v2alpha1
  kind: CiliumGatewayClassConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"cilium.io/v2alpha1","kind":"CiliumGatewayClassConfig","metadata":{"annotations":{},"name":"node-ipam-lb","namespace":"cilium"},"spec":{"service":{"loadBalancerClass":"io.cilium/node","type":"LoadBalancer"}}}
    creationTimestamp: "2025-11-20T10:47:53Z"
    generation: 1
    name: node-ipam-lb
    namespace: cilium
    resourceVersion: "23933257"
    uid: 2a8b974c-c125-4cf3-bdad-41326724f346
  spec:
    service:
      externalTrafficPolicy: Cluster
      loadBalancerClass: io.cilium/node
      loadBalancerSourceRangesPolicy: Allow
      type: LoadBalancer
  status:
    conditions:
    - lastTransitionTime: "2025-11-20T10:47:53Z"
      message: Valid GatewayClassConfig
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
kind: List
metadata:
  resourceVersion: ""

Also, the same thing happens if I apply the example from the docs, https://raw.githubusercontent.com/cilium/cilium/1.18.4/examples/kubernetes/gateway/gateway-with-parameters.yaml

I'm not sure where to look next. operator/pkg/gateway-api/gatewayclass_reconcile.go in the code seems like it would explain its own errors, and this 'not found' is coming from r.Client.Get(ctx, req.NamespacedName, original)?

Please help. TIA.

How can we reproduce the issue?

The nodes I used are AWS EC2 arm64 instances within a VPC. Let me know if you need more information to reproduce.

Cilium Version

1.18.3 and 1.18.4

Kernel Version

Linux myhostnames 6.1.0-41-cloud-arm64 #1 SMP Debian 6.1.158-1 (2025-11-09) aarch64 GNU/Linux

Kubernetes Version

Client Version: v1.33.2
Kustomize Version: v5.6.0
Server Version: v1.33.3

Regression

No response

Sysdump

No response

Relevant log output

Anything else?

No response

Cilium Users Document

  • Are you a user of Cilium? Please add yourself to the Users doc

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

    Labels

    area/agent: Cilium agent related.
    area/servicemesh: GH issues or PRs regarding servicemesh
    feature/k8s-gateway-api
    kind/bug: This is a bug in the Cilium logic.
    kind/community-report: This was reported by a user in the Cilium community, eg via Slack.
    needs/triage: This issue requires triaging to establish severity and next steps.
