Skip to content

daemon/multipool: Streamline implementation and improve reusability#44030

Merged
pippolo84 merged 11 commits intocilium:mainfrom
pippolo84:pr/pippolo84/agent-multipool-impro
Jan 30, 2026
Merged

daemon/multipool: Streamline implementation and improve reusability#44030
pippolo84 merged 11 commits intocilium:mainfrom
pippolo84:pr/pippolo84/agent-multipool-impro

Conversation

@pippolo84
Copy link
Copy Markdown
Member

@pippolo84 pippolo84 commented Jan 27, 2026

Multi-pool IPAM has been proposed as the default IPAM mode to allocate IP addresses to DRA resources in the upcoming Cilium Network Driver.
In the context of the Network Driver, the core logic of the multi-pool mode (i.e: the ability to assign and release addresses from specific pools on demand) will be kept unchanged and reused through an additional instance of the multi-pool manager in the Driver. From the point of view of the codebase, this allows to have a single package that will be included in both pod IPAM and DRA resources IPAM. An example can be seen in this branch, where parts of the multi-pool IPAM code is duplicated and adapted to support DRA resources IPAM.

To achieve this goal, this PR aims to improve the multi-pool reusability on the agent side, moving all the parts specific to pod IPAM in the Allocator type and leaving the manager implementation as generic as possible. Specifically:

  • remove direct dependency from the daemon configuration
  • extract the parts related to the local CiliumNode update specifically tied to pod IPAM

Since we are at it, the manager is also modernized using hive-provided jobs instead of controllers.

The PR does not intend to change anything regarding the current behavior of multi pool IPAM.

Note to reviewers: please review each commit individually

Related: #43937

@pippolo84 pippolo84 added area/daemon Impacts operation of the Cilium daemon. dont-merge/preview-only Only for preview or testing, don't merge it. release-note/misc This PR makes changes that have no direct user impact. area/ipam IP address management, including cloud IPAM area/multipool Affects Multi-Pool IPAM labels Jan 27, 2026
@pippolo84
Copy link
Copy Markdown
Member Author

/ci-multi-pool

@pippolo84 pippolo84 force-pushed the pr/pippolo84/agent-multipool-impro branch from dd493b9 to 868d530 Compare January 27, 2026 13:57
@pippolo84
Copy link
Copy Markdown
Member Author

/ci-multi-pool

@pippolo84 pippolo84 force-pushed the pr/pippolo84/agent-multipool-impro branch 2 times, most recently from 2213cac to 4ce36e6 Compare January 27, 2026 17:43
@pippolo84
Copy link
Copy Markdown
Member Author

/ci-multi-pool

@pippolo84 pippolo84 force-pushed the pr/pippolo84/agent-multipool-impro branch from 4ce36e6 to 0013804 Compare January 28, 2026 17:53
@pippolo84
Copy link
Copy Markdown
Member Author

/ci-multi-pool

@pippolo84 pippolo84 force-pushed the pr/pippolo84/agent-multipool-impro branch 3 times, most recently from 10f326b to 19d6015 Compare January 29, 2026 11:41
Since IPAM has been modularized, we can rely on the dependency injected
job.Group provided by the hive. Therefore, modernize the background
operations in the multi-pool manager (namely, the handling of local
CiliumNode events and the update of the pools in the local CiliumNode).

Regarding the streams of local CiliumNode events, there is no need to
explicitly disable retries in the resource: events are always acked as
successful. In case of errors, the timer job will retry another local
node update later.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The nodeUpdater interface was meant to reduce the methods needed to
implement a fake CiliumNode client in multi-pool unit tests. For the
sake of simplicity, let's remove the interface and use the typed client
directly, just like the majority of the other Cilium packages. Moreover,
now that the multi-pool manager is being modernized, the unit tests will
be reworked lated to rely on the hive provided fake clientset, removing
the need for a custom fake CiliumNode client altogether.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In preparation for reusing the multi-pool manager for the upcoming
Cilium Network driver IPAM, let's remove the direct dependency of the
manager from the daemon configuration. Instead, just pass each
configuration option as a function parameter. This allows to easily
instantiate an additional manager instance feeding it with a separate
set of options.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
There is no need to create the CiliumNode resource in the multi-pool
manager initializer, since for cluster pool and multi pool modes it is
already done during the daemon initial configuration (see
configureDaemon). Therefore, remove initial CiliumNode update and the
dependency from node discovery.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In multi-pool IPAM mode, after a restart the manager is in charge of
propagating an initial update of the local CiliumNode. This should be
done once the information about the requested and allocated pools in the
k8s resource have been received, so that the node discovery does not
propagate spurious update the external kvstore.  Since this operation is
needed only for pod IPAM, in order to reuse the multi-pool code in other
contexts (e.g: multi-pool IPAM for DRA resources) let's move the local
node initial update outside of the manager code. Since we are at it,
rename some internal fields to improve readability.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
The multi-pool manager is in charge of propagating the local node
allocation CIDRs updates to the local node store that, in turn,
propagates this information to all the other cluster's nodes for
inter-node routing.

Since this logic is specific to the context of pod IPAM, let's move it
to a separate job that is started then the daemon bootstraps the IPAM
subsystem. Thanks to the resource internal implementation, opening an
additional events stream for the local node does not have any additional
overhead (i.e: the same informer is used to feed the additional
subscriber).

Also, fix unit tests accordingly.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In order to improve the reusability of the multi-pool package, let's
clearly separate the parts specific for pod IPAM, like the Allocator and
the local node allocations CIDRs synchronization, and the ones that
implement the core logic, like the manager.

This allows to reuse the multi-pool IPAM core logic in the context of
the upcoming Cilium Network driver.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Since multi-pool package is about to be used in contexts other than the
pod IPAM (like DRA resources IPAM), let's remove all the explicit
reference to pods in the multi-pool code and instead mention only
generic "CIDRs".

This commit does not intend to change anything in the multi-pool
behavior.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
In the context of pod IPAM, at startup, the multi-pool manager should
wait for all preallocated CiliumPodIPPools to be loaded into stateDB.
This is needed to check the existing annotations on the pools later
during addresses assignment.

In order to keep all pod IPAM specific logic into the allocator and away
from the core manager, let's move this logic into the pod IPAM
allocator.  The timeout for this wait is limited to 1 minute since the
pools should be inserted immediately at startup by the k8s reflector.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
Move the skip masquerade feature to the pod IPAM Allocator. The
multi-pool manager exposes a function parameter that, given a pool name,
returns a boolean to understand if masquerading for pool addresses
should be skipped or not. Doing this allows to keep the logic tied to
CiliumPodIPPool, specific for pod IPAM, outside of the core manager.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
With the upcoming multi-pool resources IPAM for the Cilium Network
Driver, the multi-pool manager should be able to read pools information
from different CiliumNode fields (e.g: Spec.IPAM.Pools for pods and
Spec.IPAM.ResourcePools for DRA resources). Similarly, the manager
should update the allocations status using different fields in the
CiliumNode. Therefore, add a function parameter to the manager to
extract the proper pools reference from the CiliumNode object. This
allows the customization of the manager instance based on the needs of
the specific context.

Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com>
@pippolo84 pippolo84 force-pushed the pr/pippolo84/agent-multipool-impro branch from 19d6015 to 12d9e15 Compare January 29, 2026 19:35
@pippolo84 pippolo84 removed the dont-merge/preview-only Only for preview or testing, don't merge it. label Jan 29, 2026
@pippolo84
Copy link
Copy Markdown
Member Author

/test

@pippolo84 pippolo84 marked this pull request as ready for review January 30, 2026 09:31
@pippolo84 pippolo84 requested review from a team as code owners January 30, 2026 09:31
Copy link
Copy Markdown
Contributor

@squeed squeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved just for my (trivial) set of files.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 30, 2026
@pippolo84 pippolo84 added this pull request to the merge queue Jan 30, 2026
Merged via the queue into cilium:main with commit 24fcf4f Jan 30, 2026
88 checks passed
@pippolo84 pippolo84 deleted the pr/pippolo84/agent-multipool-impro branch January 30, 2026 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/daemon Impacts operation of the Cilium daemon. area/ipam IP address management, including cloud IPAM area/multipool Affects Multi-Pool IPAM ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants