Cluster external routing on EKS in ENI mode w/ disabled masquerading does not work for Pods on secondary ENIs #23933
Closed as not planned
Labels
kind/bug: This is a bug in the Cilium logic.
kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
needs/triage: This issue requires triaging to establish severity and next steps.
stale: The stale bot thinks this issue is old. Add the "pinned" label to prevent this from becoming stale.
Description
Is there an existing issue for this?
- I have searched the existing issues
What happened?
When deploying Cilium to an EKS cluster on AWS in ENI mode with masquerading disabled, any traffic leaving the VPC of the EKS cluster does not work for pods whose IP was allocated from a secondary ENI. The following provides a standalone example to reproduce the problem.
Configuration files:
```sh
# EKS cluster configuration
cat <<EOF >eks.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cilium-test
  region: eu-central-1
  version: "1.24"
vpc:
  cidr: 10.0.0.0/16
managedNodeGroups:
  - name: ng-1
    instanceType: t3.large
    desiredCapacity: 2
    privateNetworking: true
EOF

# Cilium Helm configuration
cat <<EOF >values.yaml
enableIPv4Masquerade: false
tunnel: disabled
ipv4:
  enabled: true
ipv6:
  enabled: false
bpf:
  masquerade: false
ipam:
  mode: eni
eni:
  enabled: true
EOF
```
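Once the agent is running, it is worth confirming that the options above actually took effect, since both `enableIPv4Masquerade` and `bpf.masquerade` must be off for this reproduction. A hedged sketch: the heredoc below stands in for real agent status output (the sample lines are illustrative, not captured from this cluster); in the cluster you would pipe `kubectl -n kube-system exec ds/cilium -- cilium status` into the grep instead.

```sh
# Filter the agent status for the masquerading setting.
# The heredoc is a hypothetical stand-in for real `cilium status` output.
grep Masquerading <<'EOF'
KubeProxyReplacement:   Disabled
Masquerading:           Disabled
EOF
```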
```sh
# K8s deployment configuration for testing the routing
cat <<EOF >app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: quay.io/cilium/json-mock:v1.3.2
          readinessProbe:
            timeoutSeconds: 7
            exec:
              command:
                - curl
                - -sS
                - --fail
                - --connect-timeout
                - "5"
                - -o
                - /dev/null
                - https://httpbin.org/status/200
EOF
```
Reproducing the problem:
```sh
# create the cluster and connect to it
eksctl create cluster -f eks.yaml
aws eks --region eu-central-1 update-kubeconfig --name cilium-test

# delete the AWS CNI before installing Cilium and wait for it to be completely deleted
kubectl -n kube-system delete daemonset aws-node
kubectl -n kube-system wait --for=delete pod -l app.kubernetes.io/name=aws-node

# install Cilium and wait for it to become ready
helm upgrade -i cilium cilium --repo https://helm.cilium.io --version 1.12.7 -f values.yaml -n kube-system
cilium status --wait

# deploy the test app and scale it up to ensure that multiple ENIs are allocated for the cluster nodes
kubectl apply -f app.yaml
kubectl scale deployment/myapp --replicas 40

# watch the pods to see that some of them (around half) never become "Ready" due to the routing problem
watch kubectl get pods
```
Analyzing the problem:
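A first step is to list the non-Ready pods together with their IPs, so they can be compared against the IP ranges of each node's ENIs. A hedged sketch: the awk filter below runs against sample `kubectl get pods -o wide` output embedded as a heredoc (the pod names and IPs are made up); in the cluster you would pipe the real command output in instead.

```sh
# Print name and IP of every pod that is not fully Ready
# (columns 1 and 6 of `kubectl get pods -o wide`).
# The heredoc is hypothetical sample data.
awk 'NR > 1 && $2 != "1/1" { print $1, $6 }' <<'EOF'
NAME        READY   STATUS    RESTARTS   AGE   IP            NODE
myapp-aaa   1/1     Running   0          5m    10.0.138.20   ip-10-0-138-15.eu-central-1.compute.internal
myapp-bbb   0/1     Running   0          5m    10.0.145.10   ip-10-0-182-84.eu-central-1.compute.internal
EOF
```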
```sh
# list the Cilium pods and nodes (preparation for the next step)
kubectl -n kube-system get pods -l k8s-app=cilium
kubectl get nodes

# check the IP rules of any Cilium pod to see that routing through different ENIs is configured
kubectl exec -it -n kube-system ds/cilium -- ip rule

# check the traffic on any secondary ENI (replace the Cilium pod name w/ your own)
#
# you will see that there is traffic whose source IP belongs to one of the cluster nodes.
# the IP of a cluster node always belongs to the eth0 interface, but the traffic is routed
# through some secondary ENI (in this case eth1), which is not correct. it actually looks
# like Cilium is performing some masquerading even though masquerading is completely
# turned off.
kubectl debug -it cilium-294p4 -n kube-system --image itsthenetwork/alpine-tcpdump -- tcpdump -i eth1 -vvv host httpbin.org
```
Detected issues:
- Cilium seems to route traffic from a node IP on eth0 through some other ENI (eth1, eth2, ...).
- Cilium seems to apply some masquerading even though it is not supposed to do so.
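The two symptoms above are consistent with the VPC's per-ENI anti-spoofing (source/destination) check: AWS only forwards a packet leaving an ENI if its source address is one of the addresses assigned to that ENI, unless the check is disabled. A minimal model of that check, with made-up addresses in the style of this cluster, shows why a packet carrying the node's eth0 IP is dropped when it leaves eth1:

```python
# Hedged model (not Cilium or AWS code) of the per-ENI source-address check.
# All addresses below are hypothetical examples matching the setup above.
from ipaddress import ip_address

# IPs assigned to each ENI of one node
eth0_ips = {ip_address("10.0.182.84")}  # node primary IP lives on eth0
eth1_ips = {ip_address("10.0.145.10"), ip_address("10.0.145.11")}  # pod IPs on the secondary ENI

def vpc_forwards(src: str, eni_ips: set) -> bool:
    """The VPC forwards egress traffic only if the source address
    is assigned to the ENI the packet leaves on."""
    return ip_address(src) in eni_ips

# The tcpdump above shows the node IP as source on eth1: dropped by the VPC.
print(vpc_forwards("10.0.182.84", eth1_ips))  # → False
# Traffic sourced from a pod IP that belongs to eth1 would pass.
print(vpc_forwards("10.0.145.10", eth1_ips))  # → True
```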
A short excerpt of the TCP dump traffic is shown below, along with the list of nodes where the traffic originates.
Cilium Version
1.12.7 (1.13.0 seems to have the same issue)
Kernel Version
Amazon Linux 5.10.165-143.735.amzn2.x86_64 (the same issue occurs w/ other AWS distributions like Bottlerocket)
Kubernetes Version
1.24
Sysdump
cilium-sysdump-20230222-095406.zip
Relevant log output
```
> kubectl get nodes
NAME                                           STATUS   ROLES    AGE   VERSION
ip-10-0-138-15.eu-central-1.compute.internal   Ready    <none>   94m   v1.24.10-eks-48e63af
ip-10-0-182-84.eu-central-1.compute.internal   Ready    <none>   94m   v1.24.10-eks-48e63af

> kubectl debug -it cilium-294p4 -n kube-system --image itsthenetwork/alpine-tcpdump -- tcpdump -i eth1 -vvv host httpbin.org
Defaulting debug container name to debugger-brxfv.
If you don't see a command prompt, try pressing enter.
09:55:06.724743 IP (tos 0x0, ttl 63, id 53548, offset 0, flags [DF], proto TCP (6), length 60)
    ip-10-0-182-84.eu-central-1.compute.internal.16930 > ec2-34-224-50-110.compute-1.amazonaws.com.443: Flags [S], cksum 0x15d1 (incorrect -> 0x15f7), seq 3754796598, win 62727, options [mss 8961,sackOK,TS val 158348992 ecr 0,nop,wscale 7], length 0
09:55:06.828237 IP (tos 0x0, ttl 63, id 10864, offset 0, flags [DF], proto TCP (6), length 60)
    ip-10-0-182-84.eu-central-1.compute.internal.49232 > ec2-52-1-93-201.compute-1.amazonaws.com.443: Flags [S], cksum 0x524d (incorrect -> 0xa70b), seq 3823517694, win 62727, options [mss 8961,sackOK,TS val 540423770 ecr 0,nop,wscale 7], length 0
09:55:06.831772 IP (tos 0x0, ttl 63, id 24570, offset 0, flags [DF], proto TCP (6), length 60)
    ip-10-0-182-84.eu-central-1.compute.internal.2401 > ec2-52-200-117-68.compute-1.amazonaws.com.443: Flags [S], cksum 0x6a8f (incorrect -> 0x452a), seq 1142144685, win 62727, options [mss 8961,sackOK,TS val 3394940392 ecr 0,nop,wscale 7], length 0
```
Anything else?
No response
Code of Conduct
- I agree to follow this project's Code of Conduct