Problem Description
Our goal is to assign static public IPs to pods running in AWS, GCP, and Azure. To achieve this, we would like Cilium to assign static public IPs to nodes and then use masquerading for pods. Masquerading is already a GA feature of Cilium; however, static IP assignment to nodes is not yet supported.
Note that in our case, we want these IPs to be public but the implementation will be the same regardless of whether they are public or private.
Also note that this functionality is not covered by Cilium Egress Gateway as it is now. Having this functionality would actually simplify the operations for Egress Gateway users. According to Egress Gateway docs:
Cilium must make use of network-facing interfaces and IP addresses present on the designated gateway nodes. These interfaces and IP addresses must be provisioned and configured by the operator based on their networking environment. The process is highly-dependent on said networking environment.
For example, in AWS/EKS, and depending on the requirements, this may mean creating one or more Elastic Network Interfaces with one or more IP addresses and attaching them to instances that serve as gateway nodes so that AWS can adequately route traffic flowing from and to the instances. Other cloud providers have similar networking requirements and constructs.
Our aim is to automate this IP address provisioning and configuration. It is true that "the process is highly-dependent on [...] networking environment", however when it comes to Cloud Providers, they have fairly similar and fairly straightforward mechanisms for managing IPs (see examples below). Moreover, Cilium already supports Cloud Provider-specific code for various IPAM operations so that functionality could be built on top.
Proposal
AWS
In AWS, we will have to leverage Elastic IPs. The first version of the implementation could look something like this:
- We manage the allocation of EIP pools "manually" (for example via Terraform)
- We then let the Cilium Operator associate the IPs with instances or ENIs based on user-specified EIP tags
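For the "manual" allocation step, a Terraform sketch could look like the following (the resource name, pool size, and provider version details are illustrative, not prescriptive):

```hcl
# Pre-allocate a pool of EIPs for cluster "a"; Cilium would later pick
# unassociated addresses from this pool by tag.
resource "aws_eip" "cluster_a" {
  count  = 4
  domain = "vpc"

  tags = {
    kubernetes_cluster = "a"
  }
}
```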
The simplest solution is to let users specify the desired EIP tags through the ENI CNI spec. Additionally, users can pass a boolean switch to choose whether the EIP is associated with the instance or the ENI.
Example
We allocate 2 sets of EIPs: one tagged with kubernetes_cluster:a and another one tagged with kubernetes_cluster:b.
Then in the CNI spec for nodes running in Cluster A, we set something like this:
{
  "name": "cilium",
  "type": "cilium-cni",
  "eni": {
    "eip-tags": {
      "kubernetes_cluster": "a"
    },
    "eip-associate-with-instance": "true",
    ...
  }
}
Then, when the Operator creates the ENI in Node.CreateInterface(), it would do something like this:
if resource.Spec.ENI.EIPTags != nil {
	// TODO: handle the error returned by AssociateEIP
	n.manager.api.AssociateEIP(ctx, eniID, n.node.InstanceID(), resource.Spec.ENI.EIPTags, resource.Spec.ENI.EIPAssociateWithInstance)
}
The AssociateEIP function will call the relevant AWS APIs to associate the first EIP corresponding to the tags. A first draft of the function could look like this:
func (c *Client) AssociateEIP(ctx context.Context, eniID, instanceID string, eipTags ipamTypes.Tags, eipAssociateWithInstance bool) error {
	filters := make([]ec2_types.Filter, 0, len(eipTags))
	for k, v := range eipTags {
		filters = append(filters, ec2_types.Filter{
			Name:   aws.String(fmt.Sprintf("tag:%s", k)),
			Values: []string{v},
		})
	}
	describeAddressesInput := &ec2.DescribeAddressesInput{
		Filters: filters,
	}
	// TODO rate-limiting
	addresses, err := c.ec2Client.DescribeAddresses(ctx, describeAddressesInput)
	// TODO metrics
	if err != nil {
		return err
	}
	log.Infof("Found %d EIPs corresponding to tags %v", len(addresses.Addresses), eipTags)
	for _, address := range addresses.Addresses {
		// Only pick unassociated EIPs
		if address.AssociationId == nil {
			associateAddressInput := &ec2.AssociateAddressInput{
				AllocationId:       address.AllocationId,
				AllowReassociation: aws.Bool(false),
			}
			if eipAssociateWithInstance {
				associateAddressInput.InstanceId = aws.String(instanceID)
			} else {
				associateAddressInput.NetworkInterfaceId = aws.String(eniID)
			}
			// TODO rate-limiting
			association, err := c.ec2Client.AssociateAddress(ctx, associateAddressInput)
			// TODO metrics
			if err != nil {
				// TODO some errors can probably be skipped and the next EIP can be tried
				return err
			}
			log.Infof("Associated EIP %s with Instance %s (association ID: %s)", *address.PublicIp, instanceID, *association.AssociationId)
			return nil
		}
	}
	return fmt.Errorf("no unassociated EIPs found for tags %v", eipTags)
}
There are a fair number of edge cases that need to be addressed here, but this draft gives an idea of the desired functionality.
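One such edge case is the TODO about skippable errors: between DescribeAddresses and AssociateAddress, another node's Operator may grab the same EIP, in which case the sensible reaction is to try the next candidate rather than abort. A minimal sketch of that classification, assuming the EC2 error codes below apply (they should be verified against the EC2 API reference):

```go
package main

import "fmt"

// isSkippableAssociationError reports whether an AssociateAddress failure
// should be treated as "try the next candidate EIP" rather than a hard error.
// The error codes here are assumptions and should be checked against the
// EC2 API documentation.
func isSkippableAssociationError(code string) bool {
	switch code {
	case "Resource.AlreadyAssociated", // another node grabbed the EIP in the meantime
		"InvalidAllocationID.NotFound": // the EIP was released in the meantime
		return true
	}
	return false
}

func main() {
	fmt.Println(isSkippableAssociationError("Resource.AlreadyAssociated"))
	fmt.Println(isSkippableAssociationError("AuthFailure"))
}
```

The AssociateEIP loop above would then `continue` to the next address on a skippable error instead of returning early.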
As a next step, Cilium could not only manage the association of EIPs but also manage their allocation by provisioning them on-demand.
The downside of this implementation is that it is very AWS-specific and doesn't really use any abstraction which would be useful for Azure or GCP (see abstraction ideas in the GCP section).
Azure
In Azure, the static IP assignment mechanism is very similar: Public IP addresses can be created (and tagged) and then added to VM network interfaces.
With the ipam: azure mode, we could either add a new CNI parameter (similar to what was described for AWS) or, even better, use a cloud-agnostic abstraction (see the GCP section below).
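For reference, the manual provisioning that Cilium would automate in Azure looks roughly like the following az commands (resource and NIC names are placeholders; flags should be double-checked against the az CLI reference):

```shell
# Reserve a tagged public IP
az network public-ip create \
  --resource-group my-rg \
  --name cluster-a-egress-ip \
  --sku Standard \
  --tags kubernetes_cluster=a

# Attach it to an ipconfig on the node's NIC
az network nic ip-config update \
  --resource-group my-rg \
  --nic-name node-1-nic \
  --name ipconfig1 \
  --public-ip-address cluster-a-egress-ip
```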
GCP (and cloud-agnostic abstraction ideas)
In GCP, we will have to leverage Static External IP Addresses. The main idea is once again the same as for AWS and Azure: we will reserve static IPs "manually" (Terraform or something else) and will let Cilium assign the IP to the interface of the instance based on user-specified IP labels.
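The corresponding manual steps in GCP, which Cilium would automate, look roughly like this (names, region, and zone are placeholders; label-related flags should be verified against the gcloud reference):

```shell
# Reserve a regional static external IP and label it for selection
gcloud compute addresses create cluster-a-egress-ip \
    --region=europe-west1
gcloud compute addresses update cluster-a-egress-ip \
    --region=europe-west1 \
    --update-labels=kubernetes_cluster=a

# Swap the instance's ephemeral external IP for the reserved one
gcloud compute instances delete-access-config node-1 \
    --zone=europe-west1-b --access-config-name="External NAT"
gcloud compute instances add-access-config node-1 \
    --zone=europe-west1-b --address=<RESERVED_IP>
```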
The difference with AWS and Azure is that Cilium runs in ipam:kubernetes mode in GCP. This means that we can't have a setup similar to what was described in previous sections. There are a couple of other possibilities:
- Add new fields in the CiliumNode CRD spec (under .spec.ipam?). The most important field would be something like .spec.ipam.static-ip.tags. There could be other fields for cloud-provider-specific customizations. The Operator would watch the CiliumNode objects and call the relevant Cloud Provider APIs when it sees the static-ip fields being set.
- Add a new annotation on the nodes, similarly to what Calico has for pods: https://docs.tigera.io/calico/latest/networking/ipam/use-specific-ip. The Operator will then act on the annotation.
- (I also considered integrating Cloud Provider static IPs with the Multi-Pool IPAM feature; however, it is only used for pod IPAM, so it's not directly relevant here.)
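For the annotation-based option, the Operator-side parsing could be as simple as the sketch below. The annotation key is purely hypothetical (Cilium defines no such annotation today):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// staticIPTagsAnnotation is a hypothetical annotation key used only for
// illustration; Cilium does not define it today.
const staticIPTagsAnnotation = "network.cilium.io/static-ip-tags"

// staticIPTagsFromAnnotations extracts the desired static IP tags from a
// node's annotations. It returns nil if the annotation is absent.
func staticIPTagsFromAnnotations(annotations map[string]string) (map[string]string, error) {
	raw, ok := annotations[staticIPTagsAnnotation]
	if !ok {
		return nil, nil
	}
	tags := map[string]string{}
	if err := json.Unmarshal([]byte(raw), &tags); err != nil {
		return nil, fmt.Errorf("invalid %s annotation: %w", staticIPTagsAnnotation, err)
	}
	return tags, nil
}

func main() {
	tags, err := staticIPTagsFromAnnotations(map[string]string{
		staticIPTagsAnnotation: `{"kubernetes_cluster":"a"}`,
	})
	fmt.Println(tags, err)
}
```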
Using the first two ideas we could build an abstraction which would work for all three main Cloud Providers.
I know this proposal is a bit rough, but I would love to get some preliminary feedback from the maintainers and the community on whether this feature makes sense and what the best implementation plan would be.