AWS ENI IPAM: Reduce API calls during ENI creation when using IP prefix delegation#44154

Merged
julianwiedmann merged 1 commit into cilium:main from sh1un:eni/optimize-prefix-delegation-allocation
Feb 25, 2026

Conversation

@sh1un
Contributor

@sh1un sh1un commented Feb 3, 2026

Background

Hi Cilium team! I'm an engineer at an e-commerce company that uses Cilium with AWS ENI IPAM.

We are currently experiencing slow pod scaling issues in our production environment. Our goal is to reduce pod scale-out latency, which is constrained by AWS API rate limits when using ENI IPAM.

To achieve faster scaling while staying within AWS rate limits, we want to minimize the number of API calls required to reach the expected state.
For example:

  • Expected state: A worker node's CiliumNode should have 48 available IPs
  • Goal: Reach this state with fewer API calls to reduce the risk of AWS throttling

Problem

When using AWS ENI IPAM with IP prefix delegation mode (e.g., min-allocate: 45):

  • The operator was making multiple API calls to allocate prefixes
  • CreateNetworkInterface only requested a limited number of prefixes (capped by limits.IPv4-1)

cilium/pkg/aws/eni/node.go

Lines 595 to 596 in 75421ca

// Must allocate secondary ENI IPs as needed, up to ENI instance limit - 1 (reserve 1 for primary IP)
toAllocate := min(allocation.IPv4.MaxIPsToAllocate, limits.IPv4-1)

  • Additional AssignPrivateIpAddresses calls were needed for remaining prefixes
  • This increases AWS API usage and susceptibility to rate limiting
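The effect of the cap can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the operator's actual code: `prefixCeil` and `prefixesAtCreate` are hypothetical helpers, the per-ENI IPv4 limit of 15 is an assumed value for the example, and each delegated IPv4 prefix is taken to be a /28 (16 addresses).

```go
package main

import "fmt"

// prefixSize is the number of addresses in one delegated /28 IPv4 prefix.
const prefixSize = 16

// prefixCeil returns how many prefixes are needed to cover n addresses.
// Hypothetical helper for this sketch.
func prefixCeil(n, size int) int {
	return (n + size - 1) / size
}

// prefixesAtCreate returns how many prefixes the initial
// CreateNetworkInterface call would request. When capBySecondaryLimit is
// true, the IP count is first capped at eniIPv4Limit-1, mirroring the
// snippet quoted above. Hypothetical helper, not Cilium's API.
func prefixesAtCreate(maxIPsToAllocate, eniIPv4Limit int, capBySecondaryLimit bool) int {
	toAllocate := maxIPsToAllocate
	if capBySecondaryLimit && toAllocate > eniIPv4Limit-1 {
		toAllocate = eniIPv4Limit - 1
	}
	return prefixCeil(toAllocate, prefixSize)
}

func main() {
	const needed = 45   // min-allocate from the example above
	const eniLimit = 15 // assumed per-ENI IPv4 limit, for illustration only

	fmt.Println("prefixes needed in total:", prefixCeil(needed, prefixSize))
	fmt.Println("requested at create (capped):", prefixesAtCreate(needed, eniLimit, true))
	fmt.Println("requested at create (uncapped):", prefixesAtCreate(needed, eniLimit, false))
}
```

Under these assumptions the capped path covers only part of the need at creation time, so the remaining prefixes have to come from a follow-up call.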
image

Solution

Remove the per-ENI secondary IP limit for prefix delegation mode, allowing all needed prefixes to be requested in the initial CreateNetworkInterface call.

Before (with min-allocate: 45):

  1. CreateNetworkInterface with ipv4PrefixCount=1
  2. AssignPrivateIpAddresses with ipv4PrefixCount=2
    → 2 API calls

After (with this change):

  1. CreateNetworkInterface with ipv4PrefixCount=3
    → 1 API call
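The before/after sequences can be modelled as a small call plan. This is a sketch only: the `call` type and `planCalls` function are hypothetical names introduced for illustration, and the prefix counts are the ones from the example above.

```go
package main

import "fmt"

// call describes one AWS API request in the plan. Hypothetical type.
type call struct {
	API         string
	PrefixCount int
}

// planCalls returns the API calls needed to reach totalPrefixes when the
// initial CreateNetworkInterface call requests createPrefixes of them.
// Any remainder is covered by one AssignPrivateIpAddresses call.
func planCalls(totalPrefixes, createPrefixes int) []call {
	calls := []call{{API: "CreateNetworkInterface", PrefixCount: createPrefixes}}
	if rem := totalPrefixes - createPrefixes; rem > 0 {
		calls = append(calls, call{API: "AssignPrivateIpAddresses", PrefixCount: rem})
	}
	return calls
}

func main() {
	before := planCalls(3, 1) // create asks for 1 prefix, 2 assigned afterwards
	after := planCalls(3, 3)  // create asks for all 3 prefixes

	fmt.Println("before:", len(before), "calls", before)
	fmt.Println("after: ", len(after), "calls", after)
}
```

The saving is one call per new ENI, which adds up when many nodes scale out at the same time.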
image

Testing

  • Built and deployed a custom operator image to an EKS cluster
  • Verified via AWS CloudTrail that CreateNetworkInterface now requests the correct number of prefixes in a single call
  • Observed reduced AWS API call volume during node scaling
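One way to quantify the last two points is to count events by name in a CloudTrail log file. The sketch below assumes the standard log file layout (a top-level `Records` array whose entries carry an `eventName` field); the sample payload is made up for illustration and is not data from this PR, and `countEvents` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// trailFile models only the fields of a CloudTrail log file that this
// sketch needs.
type trailFile struct {
	Records []struct {
		EventName string `json:"eventName"`
	} `json:"Records"`
}

// countEvents returns the number of records per event name.
func countEvents(data []byte) (map[string]int, error) {
	var f trailFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, r := range f.Records {
		counts[r.EventName]++
	}
	return counts, nil
}

func main() {
	// Made-up sample payload, for illustration only.
	sample := []byte(`{"Records":[
		{"eventName":"CreateNetworkInterface"},
		{"eventName":"AssignPrivateIpAddresses"},
		{"eventName":"CreateNetworkInterface"}
	]}`)

	counts, err := countEvents(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("CreateNetworkInterface:", counts["CreateNetworkInterface"])
	fmt.Println("AssignPrivateIpAddresses:", counts["AssignPrivateIpAddresses"])
}
```

Comparing these counts for the same scale-out scenario before and after the change gives a rough measure of the reduction.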
AWS ENI IPAM: Reduce API calls during ENI creation when using prefix delegation

@sh1un sh1un requested a review from a team as a code owner February 3, 2026 15:35
@sh1un sh1un requested a review from liyihuang February 3, 2026 15:35
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 3, 2026
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Feb 3, 2026
@HadrienPatte HadrienPatte added the area/eni Impacts ENI based IPAM. label Feb 4, 2026
@liyihuang
Contributor

/test

@liyihuang
Contributor

Thanks for this contribution.

Which version are you using? You might benefit from #42529 and #41783, which should significantly reduce the number of AWS operations.

@liyihuang liyihuang added the release-note/misc This PR makes changes that have no direct user impact. label Feb 5, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 5, 2026
@sh1un
Contributor Author

sh1un commented Feb 5, 2026

Hi @liyihuang,

Thanks for the review :)

We're on v1.17.5 right now and planning to upgrade to 1.18.x soon.

I checked out the PRs you mentioned; they look great and should help us a lot!
Are #42529 and #41783 available in 1.18+, or do I need a specific version?

Once we upgrade and pick up those fixes too, we hope to see our AWS API call volume drop even further!

Thanks for pointing those out!


More context on this PR

Our main pain point was AssignPrivateIpAddresses getting throttled pretty badly during pod scale-out:

image

So I made this change to reduce the total number of API calls when creating new ENIs. I tested it in our UAT env and saw our pod scale-out P99 latency go from 30 mins to 5 mins, with no more throttling.

[ img 1 - before ]
image

[ img 2 - after ]

image

@liyihuang
Contributor

Oh, they are in 1.19. But I also have to admit that they won't resolve your issue if you are hitting the rate limit for assigning IP addresses.

@sh1un sh1un requested a review from liyihuang February 6, 2026 14:01
@sh1un sh1un force-pushed the eni/optimize-prefix-delegation-allocation branch 2 times, most recently from ada5692 to 788365f Compare February 11, 2026 10:47
@liyihuang
Contributor

liyihuang commented Feb 11, 2026

@sh1un sh1un force-pushed the eni/optimize-prefix-delegation-allocation branch from 788365f to 852ae99 Compare February 12, 2026 12:18
@sh1un
Contributor Author

sh1un commented Feb 12, 2026

@sh1un you need to check out and fix https://github.com/cilium/cilium/actions/runs/21902051015/job/63233593561?pr=44154

@liyihuang
Fixed the commit subject length. Sorry for the noise.

@HadrienPatte
Member

/test

When using prefix delegation mode, the CreateInterface function was
limiting the number of IPs to allocate based on the per-ENI secondary
IP limit (limits.IPv4-1). This caused the operator to make additional
API calls to AssignPrivateIpAddresses to allocate remaining prefixes.

This change removes the secondary IP limit restriction for prefix
delegation mode, allowing CreateNetworkInterface to request all needed
prefixes in a single API call. This reduces API calls and potential
race conditions during ENI creation.

Signed-off-by: Shiun Chiu <shiun.chiu@shopline.com>
@liyihuang liyihuang force-pushed the eni/optimize-prefix-delegation-allocation branch from 852ae99 to 046a1da Compare February 16, 2026 20:57
@liyihuang
Contributor

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 21, 2026
@julianwiedmann julianwiedmann added this pull request to the merge queue Feb 25, 2026
@julianwiedmann julianwiedmann added the area/ipam IP address management, including cloud IPAM label Feb 25, 2026
Merged via the queue into cilium:main with commit 4054d73 Feb 25, 2026
79 of 81 checks passed
Labels

  • area/eni: Impacts ENI based IPAM.
  • area/ipam: IP address management, including cloud IPAM
  • kind/community-contribution: This was a contribution made by a community member.
  • ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
  • release-note/misc: This PR makes changes that have no direct user impact.
