ipam: Operator dual-write Spec.IPAM.Pools.Allocated for ENI mode#45110

Merged
pippolo84 merged 1 commit into main from pr/HadrienPatte/eni-ipam-dualwrite
Apr 3, 2026

@HadrienPatte (Member) commented Apr 1, 2026

In syncToAPIServer, populate Spec.IPAM.Pools.Allocated alongside Spec.IPAM.Pool for nodes with ENI status. This enables the ENI multi-pool migration (cilium/design-cfps#87): new agents (1.20) will read CIDRs from Pools.Allocated (with the multipool allocator) while old agents (1.19) will continue reading from Pool (with the CRD allocator).

Note: this PR only includes the dual-write logic on the operator side; a follow-up PR will change the read path on the agent side.

For secondary IP mode, each IP is written as a /32 CIDR. For prefix delegation mode, each /28 prefix is written directly alongside the secondary IP /32s. All CIDRs are placed under the "default" pool.
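The conversion described above can be sketched in Go as follows. This is a minimal illustration, not the operator's code: the `allocatedPool` type and the `buildAllocated` function are hypothetical names, and sorting the CIDRs is an assumption made only to keep the output deterministic (the sample output in this PR is not sorted).

```go
package main

import (
	"fmt"
	"net/netip"
	"sort"
)

// allocatedPool mirrors the shape of one entry under spec.ipam.pools.allocated
// as shown in the PR's test output. The Go type itself is hypothetical.
type allocatedPool struct {
	Pool  string
	CIDRs []string
}

// buildAllocated turns ENI secondary IPs and delegated prefixes into a single
// "default" pool entry. Secondary IPs become single-address CIDRs (/32 for
// IPv4); delegated prefixes are kept as-is. Invalid input is reported instead
// of being silently dropped.
func buildAllocated(secondaryIPs, prefixes []string) (allocatedPool, error) {
	cidrs := make([]string, 0, len(secondaryIPs)+len(prefixes))
	for _, p := range prefixes {
		pfx, err := netip.ParsePrefix(p)
		if err != nil {
			return allocatedPool{}, fmt.Errorf("invalid prefix %q: %w", p, err)
		}
		cidrs = append(cidrs, pfx.Masked().String())
	}
	for _, ip := range secondaryIPs {
		addr, err := netip.ParseAddr(ip)
		if err != nil {
			return allocatedPool{}, fmt.Errorf("invalid IP %q: %w", ip, err)
		}
		// BitLen is 32 for IPv4, so this yields "<ip>/32".
		cidrs = append(cidrs, netip.PrefixFrom(addr, addr.BitLen()).String())
	}
	// Assumption: sort for stable output; the real ordering is not specified here.
	sort.Strings(cidrs)
	return allocatedPool{Pool: "default", CIDRs: cidrs}, nil
}

func main() {
	entry, err := buildAllocated(
		[]string{"100.120.134.75"},
		[]string{"100.121.95.96/28"},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(entry.Pool, entry.CIDRs)
}
```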

The dual-write is low-frequency (it fires on ENI capacity changes, not on every pod event) and will be removed in 1.21. It is also atomic: both Pool and Pools.Allocated are written in the same CiliumNode update operation, so they are guaranteed to stay in sync.
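The atomicity argument can be illustrated with a small sketch: both fields are filled on one in-memory object, which is then passed to a single update call. The `nodeIPAM` type, the `syncBoth` function, and the `update` callback are hypothetical stand-ins and do not reflect the real CiliumNode types or client API.

```go
package main

import "fmt"

// nodeIPAM is a hypothetical, trimmed-down stand-in for the IPAM part of a
// CiliumNode spec. It is not the real Cilium type.
type nodeIPAM struct {
	Pool      map[string]string // IP -> ENI resource ID (legacy field)
	Allocated []string          // CIDRs for the "default" pool (new field)
}

// syncBoth fills both fields on the same object and issues exactly one update,
// so a reader never observes one field without the other.
func syncBoth(ips []string, eniID string, update func(nodeIPAM) error) error {
	spec := nodeIPAM{Pool: make(map[string]string, len(ips))}
	for _, ip := range ips {
		spec.Pool[ip] = eniID
		spec.Allocated = append(spec.Allocated, ip+"/32")
	}
	return update(spec)
}

func main() {
	err := syncBoth([]string{"100.120.134.75"}, "eni-0e979576c285fd4d5",
		func(s nodeIPAM) error {
			fmt.Println(s.Pool, s.Allocated)
			return nil
		})
	if err != nil {
		panic(err)
	}
}
```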

Relates to cilium/design-cfps#87

Testing

Here's what this looks like (with kubectl get ciliumnode $node -ojson | jq '.spec.ipam'):

  • On a node without prefix delegation:
{
  "min-allocate": 3,
  "pool": {
    "100.120.134.75": {
      "resource": "eni-0e979576c285fd4d5"
    },
    "100.120.137.182": {
      "resource": "eni-0e979576c285fd4d5"
    },
    "100.120.248.50": {
      "resource": "eni-0e979576c285fd4d5"
    }
  },
  "pools": {
    "allocated": [
      {
        "cidrs": [
          "100.120.134.75/32",
          "100.120.248.50/32",
          "100.120.137.182/32"
        ],
        "pool": "default"
      }
    ]
  },
  "pre-allocate": 1
}
  • On a node with prefix delegation:
{
  "min-allocate": 3,
  "pool": {
    "100.121.95.100": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.101": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.102": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.103": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.104": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.105": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.106": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.107": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.108": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.109": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.110": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.111": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.96": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.97": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.98": {
      "resource": "eni-0f05f2edf01f99001"
    },
    "100.121.95.99": {
      "resource": "eni-0f05f2edf01f99001"
    }
  },
  "pools": {
    "allocated": [
      {
        "cidrs": [
          "100.121.95.96/28"
        ],
        "pool": "default"
      }
    ]
  },
  "pre-allocate": 1
}
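One way to sanity-check output like the above is to confirm that every key of `pool` falls inside some CIDR listed under `pools.allocated`. The Go sketch below does that for the JSON printed by the `jq` command; the `ipamSpec` type and the `uncovered` function are hypothetical helpers written for this illustration, and only the JSON field names come from the output shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/netip"
	"sort"
)

// ipamSpec decodes only the fields needed for the check. Unknown fields such
// as "min-allocate" and "pre-allocate" are ignored by encoding/json.
type ipamSpec struct {
	Pool  map[string]json.RawMessage `json:"pool"`
	Pools struct {
		Allocated []struct {
			CIDRs []string `json:"cidrs"`
			Pool  string   `json:"pool"`
		} `json:"allocated"`
	} `json:"pools"`
}

// uncovered returns the IPs from spec.ipam.pool that are not contained in any
// CIDR of spec.ipam.pools.allocated, sorted for stable output.
func uncovered(raw []byte) ([]string, error) {
	var spec ipamSpec
	if err := json.Unmarshal(raw, &spec); err != nil {
		return nil, err
	}
	var prefixes []netip.Prefix
	for _, p := range spec.Pools.Allocated {
		for _, c := range p.CIDRs {
			pfx, err := netip.ParsePrefix(c)
			if err != nil {
				return nil, fmt.Errorf("invalid CIDR %q: %w", c, err)
			}
			prefixes = append(prefixes, pfx)
		}
	}
	var missing []string
	for ip := range spec.Pool {
		addr, err := netip.ParseAddr(ip)
		if err != nil {
			return nil, fmt.Errorf("invalid IP %q: %w", ip, err)
		}
		covered := false
		for _, pfx := range prefixes {
			if pfx.Contains(addr) {
				covered = true
				break
			}
		}
		if !covered {
			missing = append(missing, ip)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	raw := []byte(`{
	  "pool": {"100.121.95.100": {"resource": "eni-0f05f2edf01f99001"}},
	  "pools": {"allocated": [{"cidrs": ["100.121.95.96/28"], "pool": "default"}]}
	}`)
	missing, err := uncovered(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("uncovered IPs:", missing)
}
```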

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
@maintainer-s-little-helper bot added the dont-merge/needs-release-note-label label (The author needs to describe the release impact of these changes.) Apr 1, 2026
@HadrienPatte added the area/operator, area/eni, release-note/misc, and area/ipam labels Apr 1, 2026
@maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label label Apr 1, 2026
@HadrienPatte (Member, Author) commented

/test

@HadrienPatte HadrienPatte marked this pull request as ready for review April 1, 2026 13:48
@HadrienPatte HadrienPatte requested a review from a team as a code owner April 1, 2026 13:48
@HadrienPatte HadrienPatte requested a review from pippolo84 April 1, 2026 13:48
HadrienPatte added a commit that referenced this pull request Apr 1, 2026
This PR is the "double read" equivalent to #45110.

In the operator's `recalculate()`, detect whether the agent is using the
multi-pool or CRD allocator by checking `Spec.IPAM.Pools.Requested`
entries and `Status.IPAM.Used`:
* 1.20 agents write their total desired IP count to `Pools.Requested` and
  stop writing `Status.IPAM.Used`.
* 1.19 agents only write `Status.IPAM.Used`.

The dual check handles the downgrade case: if a 1.20 agent wrote
`Pools.Requested` and was then rolled back to 1.19, the `CiliumNode`
keeps a stale `Pools.Requested` from its time under the 1.20 agent. The
operator handles that case by detecting that the agent is populating
`Status.IPAM.Used` again, and ignores `Pools.Requested`.
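The detection rule above can be sketched as a small decision function. This is an illustration under assumptions, not the operator's `recalculate()` code: `allocatorMode` and `detectMode` are hypothetical names, the inputs are reduced to entry counts, and falling back to the CRD allocator when both fields are empty is an assumption of this sketch.

```go
package main

import "fmt"

type allocatorMode string

const (
	modeCRD       allocatorMode = "crd"
	modeMultiPool allocatorMode = "multi-pool"
)

// detectMode guesses which allocator the agent runs from two signals:
// requestedEntries is the number of entries in Spec.IPAM.Pools.Requested,
// usedEntries is the number of entries in Status.IPAM.Used.
func detectMode(requestedEntries, usedEntries int) allocatorMode {
	switch {
	case usedEntries > 0:
		// The agent is writing Status.IPAM.Used, so it is a CRD-allocator
		// agent. Any Pools.Requested left behind by a newer agent is stale.
		return modeCRD
	case requestedEntries > 0:
		return modeMultiPool
	default:
		// Assumption: with no signal at all, keep the legacy behavior.
		return modeCRD
	}
}

func main() {
	fmt.Println(detectMode(1, 0)) // upgraded agent
	fmt.Println(detectMode(1, 4)) // downgraded agent with stale Requested
	fmt.Println(detectMode(0, 0)) // no signal yet
}
```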

One observable difference between the two modes is that under the CRD
allocator, the agents communicate their IP usage, so the operator has to
account for pre-allocation buffers and watermarks on top of that to get
the number of IPs needed. Under the multipool allocator, the agents
handle all those computations and directly communicate the resulting
number of IPs requested.

A side effect of that difference is that under multipool, the operator
does not have direct access to the number of used IPs on a node. We
now apply the reverse of the agent's "IP used to IP needed" computation
to infer an approximation of the used IP count and emit it as a
metric (`ipam.used_ips`).
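As a rough illustration of that reverse computation, the sketch below assumes the agent's demand is simply `inUse + preAllocate`, which is the linear formula mentioned later in this thread for ENI mode. The `inferUsedIPs` name and the clamping at zero are assumptions of this sketch, not the operator's code.

```go
package main

import "fmt"

// inferUsedIPs approximates the number of IPs in use on a node from the
// agent's requested count, assuming requested = inUse + preAllocate.
// The result is clamped at zero so that a node requesting fewer IPs than its
// pre-allocation buffer never reports a negative usage.
func inferUsedIPs(requested, preAllocate int) int {
	used := requested - preAllocate
	if used < 0 {
		return 0
	}
	return used
}

func main() {
	fmt.Println(inferUsedIPs(13, 8))
	fmt.Println(inferUsedIPs(3, 8))
}
```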

Relates to cilium/design-cfps#87

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
@pippolo84 (Member) left a comment

💯

@pippolo84 pippolo84 added this pull request to the merge queue Apr 3, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 3, 2026
Merged via the queue into main with commit 93e7759 Apr 3, 2026
485 of 489 checks passed
@pippolo84 pippolo84 deleted the pr/HadrienPatte/eni-ipam-dualwrite branch April 3, 2026 13:54
HadrienPatte added a commit to DataDog/cilium that referenced this pull request Apr 3, 2026
Replace the CRD allocator with the multi-pool allocator for ENI IPAM mode
on the agent side. Previous PRs cilium#45110 and cilium#45124 ensure the operator
already supports this new agent setup.

The new `eniMultiPoolAllocator` is a light wrapper on the standard
`multiPoolAllocator` that enriches `AllocationResult` with ENI-specific
required metadata via `buildENIAllocationResult` (see cilium#45089).

Key differences from the standard multi-pool allocator:
* `AllowFirstLastIPs` is enabled so /28 prefix delegation ranges are
  fully allocatable (see cilium#45025 and cilium#45082).
* `LinearPreAlloc` uses a simple `inUse + preAlloc` formula for demand
  computation instead of `neededIPCeil` rounding. This matches the CRD
  allocator's `calculateNeededIPs` semantics and is necessary to ensure
  the operator can recover the exact IP usage from the demand signal
  (requested - preAllocate) (see cilium#45124).
* No dependency on `CiliumPodIPPool` CRDs: pools are managed by the
  operator via `Spec.IPAM.Pools.Allocated` (see cilium#45110).
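The `LinearPreAlloc` point in the list above can be illustrated by comparing a linear demand formula with a rounding one. In the Go sketch below, `demandLinear` follows the `inUse + preAlloc` formula from the commit message, while `demandRounded` is a hypothetical round-up-to-a-multiple formula used only as a stand-in; the exact `neededIPCeil` formula is not shown in this thread and is not reproduced here.

```go
package main

import "fmt"

// demandLinear is the formula named in the commit message: inUse + preAlloc.
// Each distinct inUse value maps to a distinct demand, so the mapping can be
// inverted by subtracting preAlloc.
func demandLinear(inUse, preAlloc int) int {
	return inUse + preAlloc
}

// demandRounded is a hypothetical stand-in for a rounding formula: it rounds
// inUse + preAlloc up to the next multiple of preAlloc. Several inUse values
// collapse onto the same demand, so the usage cannot be recovered exactly.
func demandRounded(inUse, preAlloc int) int {
	if preAlloc <= 0 {
		return inUse
	}
	total := inUse + preAlloc
	return ((total + preAlloc - 1) / preAlloc) * preAlloc
}

func main() {
	const preAlloc = 8
	for _, inUse := range []int{1, 5, 8} {
		fmt.Printf("inUse=%d linear=%d rounded=%d\n",
			inUse, demandLinear(inUse, preAlloc), demandRounded(inUse, preAlloc))
	}
}
```

With a pre-allocation of 8, the rounded variant reports the same demand for 1, 5, and 8 IPs in use, whereas the linear variant keeps them apart, which is what lets the operator compute `requested - preAllocate`.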

The agents now read CIDRs from `Spec.IPAM.Pools.Allocated`, allocate IPs
locally, and write aggregate demand to `Spec.IPAM.Pools.Requested`. They
no longer write per-IP usage to `Status.IPAM.Used`, which reduces
Kubernetes API pressure.
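To make the effect of `AllowFirstLastIPs` on local allocation concrete, the sketch below enumerates the addresses an allocator could hand out from a small IPv4 prefix, with and without the first and last address. The `usableAddrs` function is a hypothetical helper, not part of the multi-pool allocator, and returning very small prefixes unchanged is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"net/netip"
)

// usableAddrs lists the addresses of a small IPv4 prefix that a local
// allocator could hand out. With allowFirstLast set to false, the first and
// last address of the prefix are skipped.
func usableAddrs(cidr string, allowFirstLast bool) ([]netip.Addr, error) {
	pfx, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	pfx = pfx.Masked()
	// Guard: this sketch enumerates every address, so keep prefixes small.
	if !pfx.Addr().Is4() || pfx.Bits() < 24 {
		return nil, fmt.Errorf("sketch only handles IPv4 prefixes of /24 or smaller, got %s", pfx)
	}
	var all []netip.Addr
	for a := pfx.Addr(); a.IsValid() && pfx.Contains(a); a = a.Next() {
		all = append(all, a)
	}
	// Assumption: prefixes with one or two addresses are returned whole.
	if allowFirstLast || len(all) <= 2 {
		return all, nil
	}
	return all[1 : len(all)-1], nil
}

func main() {
	for _, allow := range []bool{true, false} {
		addrs, err := usableAddrs("100.121.95.96/28", allow)
		if err != nil {
			panic(err)
		}
		fmt.Printf("allowFirstLast=%v usable=%d first=%s last=%s\n",
			allow, len(addrs), addrs[0], addrs[len(addrs)-1])
	}
}
```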

Relates to cilium/design-cfps#87

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
HadrienPatte added a commit that referenced this pull request Apr 4, 2026
In #45110 I added logic to have the operator write
`Spec.IPAM.Pools.Allocated` alongside `Spec.IPAM.Pool`, but this is
actually not correct as this `Allocated` field is supposed to be written
to by the agent.

This commit removes the logic writing `Allocated` from the operator and
moves and adapts the supporting functions so they can be used by the agent
in a following commit.

Signed-off-by: Hadrien Patte <hadrien.patte@datadoghq.com>
Labels

area/eni: Impacts ENI based IPAM.
area/ipam: IP address management, including cloud IPAM
area/operator: Impacts the cilium-operator component
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
release-note/misc: This PR makes changes that have no direct user impact.
