clustermesh: fix endpointslicesync cleanup race condition #44503

Merged
MrFreezeex merged 1 commit into cilium:main from MrFreezeex:fix-endpointslicesync-rc
Feb 24, 2026

@MrFreezeex
Member

The cleanup logic relied on the informer to list the EndpointSlices to delete. This is racy because the informer is updated asynchronously through its own watch rather than acting as a write-through cache.

This commit changes the cleanup logic, which is invoked on cluster removal, to list directly through the client instead. The race was actually reproducible with the existing endpointslicesync test at a small rate (~0.1%).

clustermesh: fix a race condition where EndpointSlices created just before a cluster is removed could be left uncleaned

@MrFreezeex MrFreezeex requested a review from a team as a code owner February 23, 2026 17:33
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 23, 2026
@MrFreezeex MrFreezeex added release-note/bug This PR fixes an issue in a previous release of Cilium. area/clustermesh Relates to multi-cluster routing functionality in Cilium. needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Feb 23, 2026
@MrFreezeex
Member Author

/test

@MrFreezeex MrFreezeex force-pushed the fix-endpointslicesync-rc branch from 9d045ef to 304e26b Compare February 23, 2026 17:36
@MrFreezeex MrFreezeex changed the title clustermesh: fix cleanup race condition clustermesh: fix endpointslicesync cleanup race condition Feb 23, 2026
The cleanup logic relied on the informer to list the EndpointSlices to
delete. This is racy because the informer is updated asynchronously
through its own watch rather than acting as a write-through cache.

This commit changes the cleanup logic, which is invoked on cluster
removal, to list directly through the client instead. The race was
actually reproducible with the existing endpointslicesync test at a
small rate (~0.1%).

Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
@MrFreezeex MrFreezeex force-pushed the fix-endpointslicesync-rc branch from 304e26b to e74bff2 Compare February 23, 2026 17:39
@MrFreezeex
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 24, 2026
@jrajahalme jrajahalme added this pull request to the merge queue Feb 24, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 24, 2026
@MrFreezeex MrFreezeex added this pull request to the merge queue Feb 24, 2026
Merged via the queue into cilium:main with commit c9fa728 Feb 24, 2026
77 checks passed
@MrFreezeex MrFreezeex deleted the fix-endpointslicesync-rc branch February 24, 2026 14:36
@YutaroHayakawa YutaroHayakawa mentioned this pull request Feb 24, 2026
21 tasks
@YutaroHayakawa YutaroHayakawa added backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. and removed needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch labels Feb 24, 2026
@github-actions github-actions bot added backport-done/1.19 The backport for Cilium 1.19.x for this PR is done. and removed backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. labels Mar 2, 2026
Labels

area/clustermesh: Relates to multi-cluster routing functionality in Cilium.
backport-done/1.19: The backport for Cilium 1.19.x for this PR is done.
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
release-note/bug: This PR fixes an issue in a previous release of Cilium.

3 participants