
clustermesh: fix a goroutine leak in epslicesync #44444

Merged
giorio94 merged 1 commit into cilium:main from MrFreezeex:fix-endpointslicesync-goroutineleak
Feb 23, 2026

Conversation

@MrFreezeex

@MrFreezeex MrFreezeex commented Feb 19, 2026

The receive function could block indefinitely on the send to the unbuffered result channel, which prevented its goroutine from terminating. This commit fixes the leak by following the same approach as StreamWatcher (which inspired the meshEndpointSliceWatcher).
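To illustrate the failure mode and the fix described above, here is a minimal, self-contained sketch. It is not the code from this PR: the `watcher` type, its fields, and `stoppedCleanly` are hypothetical names chosen for the example. It assumes the "StreamWatcher approach" means guarding the send on the result channel with a `select` on a `done` channel that `Stop` closes.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// watcher is a hypothetical stand-in for a watcher that forwards events
// to a consumer over an unbuffered result channel.
type watcher struct {
	source chan string   // upstream events (assumption for this sketch)
	result chan string   // unbuffered channel read by the consumer
	done   chan struct{} // closed by Stop to unblock receive
	exited chan struct{} // closed when receive returns (demo only)
	once   sync.Once
}

func newWatcher(source chan string) *watcher {
	w := &watcher{
		source: source,
		result: make(chan string),
		done:   make(chan struct{}),
		exited: make(chan struct{}),
	}
	go w.receive()
	return w
}

// receive forwards events until Stop is called.
func (w *watcher) receive() {
	defer close(w.exited)
	for {
		select {
		case <-w.done:
			return
		case ev := <-w.source:
			// A bare `w.result <- ev` here would block forever once the
			// consumer stops reading, leaking this goroutine.
			select {
			case <-w.done:
				return
			case w.result <- ev:
			}
		}
	}
}

// Stop is safe to call more than once.
func (w *watcher) Stop() { w.once.Do(func() { close(w.done) }) }

// stoppedCleanly reports whether receive exits after Stop even though
// nobody ever reads from the result channel.
func stoppedCleanly() bool {
	source := make(chan string, 1)
	w := newWatcher(source)
	source <- "update"                // receive picks this up and tries to send
	time.Sleep(10 * time.Millisecond) // give receive time to reach the send
	w.Stop()
	select {
	case <-w.exited:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("receive goroutine exited:", stoppedCleanly())
}
```

With the inner `select` replaced by a plain send, `stoppedCleanly` would time out in this sketch, which is the kind of leftover goroutine that leak detectors in tests report.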

Fixes #44421

clustermesh: fix a goroutine leak related to EndpointSliceSync when removing a cluster

@MrFreezeex MrFreezeex requested a review from a team as a code owner February 19, 2026 16:06
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 19, 2026
@MrFreezeex MrFreezeex added release-note/bug This PR fixes an issue in a previous release of Cilium. area/clustermesh Relates to multi-cluster routing functionality in Cilium. needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Feb 19, 2026
@MrFreezeex

/test

@MrFreezeex MrFreezeex force-pushed the fix-endpointslicesync-goroutineleak branch from 051f725 to 9a3a2da Compare February 19, 2026 16:12
@MrFreezeex MrFreezeex marked this pull request as draft February 19, 2026 16:15
@MrFreezeex MrFreezeex marked this pull request as ready for review February 19, 2026 16:18
@MrFreezeex

/test

@MrFreezeex

So the error no longer appears when using the stress command, but something else seems to fail there on very rare occasions (~0.12% locally). At least I don't get the goroutine leaks anymore, so this should be fine on its own, and I will check the other failure later on 👀

The receive function could block indefinitely on the send to the unbuffered
result channel, which prevented its goroutine from terminating. This commit
fixes the leak by following the same approach as StreamWatcher (which inspired
the meshEndpointSliceWatcher).

Reported-by: Tobias Klauser <tobias@cilium.io>
Signed-off-by: Arthur Outhenin-Chalandre <git@mrfreezeex.fr>
@MrFreezeex MrFreezeex force-pushed the fix-endpointslicesync-goroutineleak branch from 9a3a2da to f7f0ac5 Compare February 19, 2026 18:41
@MrFreezeex

/test


@giorio94 giorio94 left a comment


Thanks!

@giorio94 giorio94 added this pull request to the merge queue Feb 23, 2026
Merged via the queue into cilium:main with commit af3d557 Feb 23, 2026
77 checks passed
@YutaroHayakawa YutaroHayakawa mentioned this pull request Feb 24, 2026
21 tasks
@YutaroHayakawa YutaroHayakawa added backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. and removed needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch labels Feb 24, 2026
@github-actions github-actions bot added backport-done/1.19 The backport for Cilium 1.19.x for this PR is done. and removed backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. labels Mar 2, 2026

Labels

area/clustermesh: Relates to multi-cluster routing functionality in Cilium.
backport-done/1.19: The backport for Cilium 1.19.x for this PR is done.
release-note/bug: This PR fixes an issue in a previous release of Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CI: pkg/clustermesh/endpointslicesync: found unexpected goroutines

3 participants