CFP: Add extra logic for cilium ENI mode for picking the right subnet #32052

@liyihuang

Description

Cilium Feature Proposal

Previously, when Cilium ran in ENI mode, it picked a subnet more or less at random, which caused issues (#23933) where users had to SNAT the pod IP address to the node IP. In most cases this is unnecessary in an AWS ENI environment: SNAT should be handled by the AWS NAT gateway, provided users follow the AWS practice of placing the k8s nodes in a private subnet and using an AWS NAT gateway for outgoing Internet traffic.

PR #22000 addressed this, but I still think the logic can be improved. The current implementation is in cilium/pkg/aws/eni/node.go, lines 842 to 866 at commit 0007e35:

    // findSuitableSubnet attempts to find a subnet to allocate an ENI in according to the following heuristic.
    //  0. In general, the subnet has to be in the same VPC and match the availability zone of the
    //     node. If there are multiple candidates, we choose the subnet with the most addresses
    //     available.
    //  1. If we have explicit ID or tag constraints, choose a matching subnet. ID constraints take
    //     precedence.
    //  2. If we have no explicit constraints, try to use the subnet the first ENI of the node was
    //     created in, to avoid putting the ENI in a surprising subnet if possible.
    //  3. If none of these work, fall back to just choosing the subnet with the most addresses
    //     available.
    func (n *Node) findSuitableSubnet(spec eniTypes.ENISpec, limits ipamTypes.Limits) *ipamTypes.Subnet {
        var subnet *ipamTypes.Subnet
        if len(spec.SubnetIDs) > 0 {
            return n.manager.FindSubnetByIDs(spec.VpcID, spec.AvailabilityZone, spec.SubnetIDs)
        } else if len(spec.SubnetTags) > 0 {
            return n.manager.FindSubnetByTags(spec.VpcID, spec.AvailabilityZone, spec.SubnetTags)
        }

        subnet = n.manager.GetSubnet(spec.NodeSubnetID)
        if subnet != nil && subnet.AvailableAddresses >= limits.IPv4 {
            return subnet
        }

        return n.manager.FindSubnetByTags(spec.VpcID, spec.AvailabilityZone, nil)
    }

After discussing with @gandro, I think we can add a step between 2 and 3 that picks a subnet associated with the same route table as the primary subnet, so there are fewer routing surprises on the AWS side. We should also gradually remove the current step 3, so that users do not end up in a situation where some pods can reach the Internet and others cannot (I have troubleshot this, and it is really confusing).
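To make the proposed extra step more concrete, here is a minimal, self-contained sketch of what "pick a subnet in the same route table as the primary subnet" could look like. It is not the Cilium implementation: `Subnet`, `RouteTable`, and `findSubnetInSameRouteTable` are hypothetical stand-ins invented for illustration, and the sketch assumes the route table associations have already been fetched from AWS. It also only considers explicit associations; in AWS, subnets without an explicit association implicitly use the VPC's main route table, which a real implementation would need to handle.

```go
package main

// Subnet is a simplified, hypothetical stand-in for ipamTypes.Subnet.
type Subnet struct {
	ID                 string
	VpcID              string
	AvailabilityZone   string
	AvailableAddresses int
}

// RouteTable is a hypothetical type: it records which subnet IDs are
// explicitly associated with a given route table.
type RouteTable struct {
	ID      string
	VpcID   string
	Subnets map[string]struct{}
}

// findSubnetInSameRouteTable returns the subnet with the most available
// addresses among the subnets that share a route table with nodeSubnetID
// and that are in the given VPC and availability zone. It returns nil if
// no such subnet exists, so the caller can decide how to fall back.
func findSubnetInSameRouteTable(subnets []Subnet, routeTables []RouteTable, vpcID, az, nodeSubnetID string) *Subnet {
	// Collect every subnet ID that sits in a route table together with
	// the node's subnet.
	peers := map[string]struct{}{}
	for _, rt := range routeTables {
		if rt.VpcID != vpcID {
			continue
		}
		if _, ok := rt.Subnets[nodeSubnetID]; !ok {
			continue
		}
		for id := range rt.Subnets {
			peers[id] = struct{}{}
		}
	}

	// Among those peers, keep the same-VPC, same-AZ subnet with the most
	// free addresses, mirroring heuristic 0 of findSuitableSubnet.
	var best *Subnet
	for i := range subnets {
		s := &subnets[i]
		if s.VpcID != vpcID || s.AvailabilityZone != az {
			continue
		}
		if _, ok := peers[s.ID]; !ok {
			continue
		}
		if s.AvailableAddresses <= 0 {
			continue
		}
		if best == nil || s.AvailableAddresses > best.AvailableAddresses {
			best = s
		}
	}
	return best
}
```

In this sketch, a public subnet with more free addresses is never chosen for a node in a private subnet unless both share a route table, which is the routing surprise the proposal is trying to avoid. Returning nil rather than falling through to "most addresses available" leaves room for eventually dropping the current step 3.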

Metadata

Labels

area/eni: Impacts ENI based IPAM.
kind/cfp: Cilium Feature Proposal
kind/enhancement: This would improve or streamline existing functionality.
kind/feature: This introduces new functionality.
