
CES: add option to create CES directly from pods#38388

Merged
giorio94 merged 18 commits into cilium:main from jshr-w:jshr/podtoces
Jan 28, 2026

Conversation

@jshr-w
Contributor

@jshr-w jshr-w commented Mar 21, 2025

This PR adds an option to create CiliumEndpointSlices (CES) directly from pods instead of from CiliumEndpoints (CEP). This is a step towards eliminating the need for CEPs when CES are present, which will reduce the pressure that the CEP CRD puts on the kube-apiserver at high scale.

Descriptions of changes are in each commit.

Add option to create CiliumEndpointSlices directly from pods instead of CiliumEndpoints

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 21, 2025
@jshr-w jshr-w force-pushed the jshr/podtoces branch 5 times, most recently from 3bc44bf to 42456bb Compare March 25, 2025 02:45
@jshr-w jshr-w changed the title [WIP] ces: create ces from pods CES: create CES directly from pods Mar 25, 2025
@jshr-w
Contributor Author

jshr-w commented Mar 25, 2025

/test

@jshr-w jshr-w marked this pull request as ready for review March 25, 2025 03:15
@jshr-w jshr-w requested review from a team as code owners March 25, 2025 03:15
@jshr-w
Contributor Author

jshr-w commented Mar 25, 2025

cc @dlapcevic @ovidiutirla

@dlapcevic
Contributor

cc @sypakine

Member

@qmonnet qmonnet left a comment


Looks OK for doc change, thanks!

@qmonnet qmonnet added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. feature/ces Impacts the Cilium Endpoint Slice logic. labels Mar 25, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 25, 2025
Member

@giorio94 giorio94 left a comment


I took a first look at the PR, and left a couple of initial comments inline. I still need to fully digest all the logic (which is fairly new to me) though, as well as look at the tests.

This PR doesn't yet allow disabling the creation of CEPs (due to [1]), but once it does (which I assume is a follow-up step), CES would become incompatible with the clustermesh-apiserver. That component currently relies only on CEP information (which is fine, it being only a handful of replicas at most). Essentially, this logic [2] should be extended to work with CES as well, starting either one or the other based on the Cilium configuration. IMO that should be addressed before moving further, to avoid introducing yet another incompatibility between features.

FYI, one of the integration tests failures looks legitimate.

[1]:

cilium/pkg/option/config.go

Lines 3242 to 3248 in 81666d2

// Ensure CiliumEndpointSlice is enabled only if CiliumEndpointCRD is enabled too.
c.EnableCiliumEndpointSlice = vp.GetBool(EnableCiliumEndpointSlice)
if c.EnableCiliumEndpointSlice && c.DisableCiliumEndpointCRD {
	log.Fatalf("Running Cilium with %s=%t requires %s set to false to enable CiliumEndpoint CRDs.",
		EnableCiliumEndpointSlice, c.EnableCiliumEndpointSlice, DisableCiliumEndpointCRDName)
}

[2]:
go synchronize(ctx, resources.CiliumSlimEndpoints, newEndpointSynchronizer(ctx, logger, cinfo, backend, factory, syncState.WaitForResource()))

@jshr-w
Contributor Author

jshr-w commented Mar 26, 2025

Thanks for taking a look @giorio94! I will open a PR to ensure compatibility between the clustermesh apiserver and CES is retained with these changes.

Member

@pippolo84 pippolo84 left a comment


Thanks for the PR! 💯

Spotted a couple of leftovers and left a comment about a minor nit inline, but overall LGTM.

Export WireGuard --enable-wireguard flag for use in Operator.
CES controller requires --enable-wireguard to set the encryption of the
CES.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Add config option for `ces-controller-mode`. The CES controller can run
in 'default' mode, as it functions today, or the new 'slim' mode, where
CES completely replace CEPs in the cluster and CEPs are no longer
created.

In default mode, the DefaultController is used, which watches CEPs and
creates CES based on those changes.
In slim mode, the new SlimController is used instead.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
The CES manager must behave differently in 'default' and 'slim' mode, as
each mode stores different state. Default mode keeps the existing
ceptocesmap as its datastore, while slim mode introduces a new CESCache
as its datastore, since it must correlate pod, identity, and node
information itself when it cannot rely on CEPs.

We separate the managers for these modes into a DefaultManager and a
SlimManager, both implementing the Manager interface. The reconciler
depends only on the Manager interface.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
The CES controller in slim mode watches CiliumNodes to process updates,
particularly to the EncryptionKey that is stored in CESs, and to handle
node deletes. We track the set of CEPs on each node so we know which
CESs need to be reconciled on a node update.

Adds unit tests for node state.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
The slim CES controller processes CiliumNode updates with an extra
handler that passes CiliumNode encryption key updates to the local CES
cache, so that the affected CESs can be reconciled on changes.

Adds unit tests for CES manager node upsert/delete.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
We extend the CESCache to handle updates to CIDs. We need to associate
pods with their identities in order to create CES.

The SecIds data structure contains a selected ID, which is used when
multiple IDs are associated with a label set, to minimize churn.

Adds unit tests for CID state.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Add a flow where the controller processes CID updates and updates the
relevant mappings in the manager and local CES cache.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Allow the local CES datastore to store CES and namespace information.

We also introduce the CESCacher interface, to allow the default and slim
CES managers to handle their own local cache types.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Add store for pod state to CESCache, along with necessary helper
functions.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
We split the reconciler into default and slim variants, to pass the
appropriate parameters depending on the configured mode.

The reconciler requires two new interfaces to handle the default/slim
modes on reconciliation of CES.

The slim reconciler reconciles based on pod state and local CES cache
state instead of getting data from the CEP.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
@jshr-w
Contributor Author

jshr-w commented Dec 26, 2025

Leaving a note: I've followed up with most of the changes requested here. There are a few I will kick to a follow-up PR, such as adding more tests; I've noted them in the comments here and added them to the associated issue to ensure they are tracked. This feature is currently behind a hidden flag and partially implemented at this point.

Thanks @giorio94 for your patience and help through the entire review process for this super long piece :)

When a pod is added/updated or deleted, we insert it into the cache. We
only attempt to add it to the CES cache if we have sufficient state,
otherwise we wait for the necessary info to be ready on the pod.
We will add the pod to the cache whether or not it has been assigned an
identity -- in the case it has not, we will receive an update when the
identity is created and associate it with the pod later.

Add unit tests for cescache and for slim manager flow.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
On initialization of the operator and CES controller, we sync CES from
the store to our local cache before the controller runs. To do this, we
read the existing store state and create the existing CES in our local
cache.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
Add start flow for SlimController to run.

Modify test helpers for slim mode to be tested.

Signed-off-by: jshr-w <shjayaraman@microsoft.com>
@jshr-w
Contributor Author

jshr-w commented Dec 30, 2025

/test

@jshr-w
Contributor Author

jshr-w commented Jan 6, 2026

/ci-eks

@giorio94
Member

giorio94 commented Jan 7, 2026

@cilium/operator Gentle ping 🙏

I've also marked this PR as blocked for the moment, due to feature freeze.

@jshr-w
Contributor Author

jshr-w commented Jan 20, 2026

Hello, gentle ping again for a review from @cilium/operator

Member

@pippolo84 pippolo84 left a comment


@joestringer
Member

Shreya pointed out that this was previously reviewed by @pippolo84 for the @cilium/operator group, but he has stepped down from the group so GitHub no longer recognizes that - hence it was stuck waiting for operator review despite the interface not explicitly showing this in the top right corner. I re-requested review from the group to nudge it forward.

@joestringer
Member

I'm seeing that @giorio94 reviewed the PR overall in detail. I'm glad he did, though it does feel like we should have pushed back more for incremental changes. >1K LoC is generally not an acceptable size for a change. I can tell from the comments in Marco's review that he feels obligated not to block the PR due to the sheer size and effort you've put into the PR. The functionality is useful. We definitely appreciate the work. On the other hand, this is how we end up with code we're not happy with in the repo because no-one wants to iterate further on the PR. How do we track and resolve the concerns about the operator getting into a state where it can't recover and properly sync back? Should we have an issue for it, and can we concretely describe what it would look like to resolve those concerns?

@jshr-w
Contributor Author

jshr-w commented Jan 26, 2026

Hi Joe, that's fair to say. My understanding based on previous comments was that it would be okay to break this into smaller commits, but I would have also been willing to (and we can also do this now) break this into smaller PRs if requested.

It is not showing as linked, but I have an issue #38507 where I am tracking the other changes needed for this -- I will flesh some of these items out further.

@joestringer
Member

As long as we're tracking those concerns somehow and reviewers are otherwise happy that this change improves overall code health, we should probably move forward to merge this. Making progress is a healthy step towards improving the code overall (taking inspiration from Google's code review guidelines). @giorio94 does that sound good to you, merge then iterate in the tree?

@giorio94
Member

giorio94 commented Jan 28, 2026

Thanks Joe (and obviously @jshr-w for all the work you put in here), I'll then go ahead and merge this PR. I think it sets reasonable foundations for the feature (which is currently gated behind a hidden flag), and it seems easier to me to keep iterating afterwards. My suggestion in that respect would be to prioritize the scaffolding of the script tests infrastructure and the introduction of a few initial script tests. That should be of great help in figuring out some of the possible rough edges and increasing confidence, as well as allowing future improvements and fixes to be validated incrementally.

@joestringer
Member

Thanks for your guidance on this @giorio94 . And congrats @jshr-w for landing this useful feature 🎉

@jshr-w
Contributor Author

jshr-w commented Jan 28, 2026

Thanks Marco again for all your guidance through this PR! I will make sure the updates to testing are handled as the first priority to follow this PR. And thanks Joe for your help :)

@smagnani96 smagnani96 mentioned this pull request Mar 16, 2026