Cilium LoadBalancer w/Local and BGP drops traffic on no endpoints #40438
Closed
Labels
area/loadbalancing: Impacts load-balancing and Kubernetes service implementations
kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
Description
Is there an existing issue for this?
- I have searched the existing issues
Version
equal or higher than v1.16.11 and lower than v1.17.0
What happened?
Hi,
When a LoadBalancer service has externalTrafficPolicy: Local and no endpoints (for example because health checks are failing), and BGP is in use (anycast with ECMP at the ToR/router level), Cilium drops all traffic sent to that service IP from within the k8s cluster.
How can we reproduce the issue?
- install Cilium
- set up BGP with switch/router
- Create a cilium loadbalancer with BGP
- Create a svc with no existing endpoints, like this:
apiVersion: v1
kind: Service
metadata:
  annotations:
    io.cilium/lb-ipam-ips: 10.229.8.15
    io.cilium/lb-ipam-sharing-key: ashley1.oa
    lbipam.cilium.io/ips: 10.229.8.15
    lbipam.cilium.io/sharing-key: ashley1.oa
  labels:
    bgp: yespls
  name: apiserver-test
spec:
  allocateLoadBalancerNodePorts: false
  externalTrafficPolicy: Local
  internalTrafficPolicy: Cluster
  loadBalancerClass: io.cilium/bgp-control-plane
  ports:
  - name: apiserver
    port: 6443
    protocol: TCP
    targetPort: 6443
  selector:
    app: apiservers
  sessionAffinity: None
  type: LoadBalancer
- have NO endpoints working
- try curling the 10.229.8.15 IP from within the cluster
- observe:
root@worker-az2-b8f5bfb5-l6h4q:/# curl -v http://10.229.8.15:6443
* Trying 10.229.8.15:6443...
* Immediate connect fail for 10.229.8.15: Operation not permitted
* Closing connection 0
curl: (7) Couldn't connect to server
- Observe that traffic to the IP is dropped somewhere in the BPF code instead of being let through via BGP, which defeats the purpose of using BGP and anycast
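For anyone trying to reproduce this without curl, the connect check above can be sketched as a small Python probe. This is only an illustration, not part of the original report: the function names `classify` and `probe` and the outcome labels are made up for this example. The one fact it relies on is that on Linux `connect()` fails with EPERM ("Operation not permitted", as in the curl output above) when the connection is rejected locally rather than by the remote side. Whether that local rejection comes from Cilium's datapath is an assumption the probe cannot confirm.

```python
import errno
import socket


def classify(err):
    """Map a connect() errno to a short, hypothetical outcome label."""
    if err == errno.EPERM:
        # Same error curl printed as "Operation not permitted":
        # the connection was rejected on the local host.
        return "rejected-locally"
    if err == errno.ECONNREFUSED:
        # Something answered with a TCP reset, so packets left the node.
        return "refused"
    if err in (errno.EHOSTUNREACH, errno.ENETUNREACH):
        return "unreachable"
    return "other-error"


def probe(host, port, timeout=3.0):
    """Try one TCP connection and return an outcome label."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        return classify(exc.errno)


if __name__ == "__main__":
    # IP and port taken from the Service manifest in this issue.
    print(probe("10.229.8.15", 6443))
```

Run from a node or pod inside the cluster, a result of `rejected-locally` would match the behaviour reported here, while `refused`, `timeout`, or `connected` would suggest the traffic actually left the node.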
Cilium Version
tested with 1.15.x (multiple versions) and 1.16.8
Kernel Version
6.11.0 branch, Ubuntu
Kubernetes Version
tested 1.32.x and 1.33.x
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
No response
Cilium Users Document
- Are you a user of Cilium? Please add yourself to the Users doc
Code of Conduct
- I agree to follow this project's Code of Conduct