Skip to content

policy: fix ResourceID leak in policy importer during policy churn#44724

Merged
fristonio merged 2 commits intocilium:mainfrom
odinuge:odinuge/resource-id-leak
Mar 12, 2026
Merged

policy: fix ResourceID leak in policy importer during policy churn#44724
fristonio merged 2 commits intocilium:mainfrom
odinuge:odinuge/resource-id-leak

Conversation

@odinuge
Copy link
Copy Markdown
Member

@odinuge odinuge commented Mar 11, 2026

This fixes a slow leak where the ResourceID of a given policy is leaked
in the prefixesByResource map when a policy without a single CIDR rule
is removed.

This is a rather slow leak, where our testing shows that churning ~10
policies that all leak, results in ~75MiB of leaked memory after 24
hours. For testing purposes, we did this with both namespaces and names
of policies being UUIDs to get consistent results. This affects all
agents in the cluster, no matter if the policies select endpoints on the
node or not.

This affects cilium v1.17, and given the code has not changed, it most
likely also affects all stable versions. A following test update shows
this is indeed the case. This code was introduced in the first mentioned
commit below, but it was initially unused. Then, the last referenced
commit is the one that made policy imports use this code.

This code does deserve some rewrite to make it simpler to avoid this,
but keeping this as a simple fix to ease backports.

Fixes: 783465b ("policy: add policy import cell")
Fixes: b762c3f ("policy/k8s: switch to policy importer cell")

Fixes: #issue-number

Fix memory leak triggered by policies being created and deleted

This fixes a slow leak where the ResourceID of a given policy is leaked
in the prefixesByResource map when a policy without a single CIDR rule
is removed.

This is a rather slow leak, where our testing shows that churning ~10
policies that all leak, results in ~75MiB of leaked memory after 24
hours. For testing purposes, we did this with both namespaces and names
of policies being UUIDs to get consistent results. This affects all
agents in the cluster, no matter if the policies select endpoints on the
node or not.

This affects cilium v1.17, and given the code has not changed, it most
likely also affects all stable versions. A following test update shows
this is indeed the case. This code was introduced in the first mentioned
commit below, but it was initially unused. Then, the last referenced
commit is the one that made policy imports use this code.

This code does deserve some rewrite to make it simpler to avoid this,
but keeping this as a simple fix to ease backports.

Fixes: 783465b ("policy: add policy import cell")
Fixes: b762c3f ("policy/k8s: switch to policy importer cell")
Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 11, 2026
@github-actions github-actions bot added the sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. label Mar 11, 2026
This adds some more cases we want to test to ensure we don't leak old
resourceIDs when a policy is removed.

Signed-off-by: Odin Ugedal <odin@uged.al>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
@odinuge odinuge force-pushed the odinuge/resource-id-leak branch from 24799f9 to aef9fbe Compare March 11, 2026 08:12
@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 11, 2026

/test

1 similar comment
@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 11, 2026

/test

@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 11, 2026

/ci-clustermesh

@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 11, 2026

/ci-l3-l4

@odinuge odinuge marked this pull request as ready for review March 11, 2026 10:24
@odinuge odinuge requested a review from a team as a code owner March 11, 2026 10:24
@odinuge odinuge requested a review from fristonio March 11, 2026 10:24
@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 11, 2026

cc @squeed

Copy link
Copy Markdown
Member

@fristonio fristonio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks ❤️

@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 12, 2026

Thanks @fristonio! Can you also add the backport labels for all supported releases now? Given the severity of this leak, and the simple fix, I think it deserves being backported to v1.17, v1.18 and v1.19.

@fristonio fristonio added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Mar 12, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Mar 12, 2026
@fristonio fristonio added dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Mar 12, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Mar 12, 2026
@fristonio fristonio added this pull request to the merge queue Mar 12, 2026
Merged via the queue into cilium:main with commit a9acd58 Mar 12, 2026
83 checks passed
@odinuge
Copy link
Copy Markdown
Member Author

odinuge commented Mar 13, 2026

Thanks @fristonio 🙏

@odinuge odinuge deleted the odinuge/resource-id-leak branch March 13, 2026 07:05
@smagnani96 smagnani96 mentioned this pull request Mar 16, 2026
4 tasks
@smagnani96 smagnani96 added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Mar 16, 2026
@smagnani96 smagnani96 mentioned this pull request Mar 16, 2026
4 tasks
@smagnani96 smagnani96 added backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. and removed needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Mar 16, 2026
@smagnani96 smagnani96 mentioned this pull request Mar 16, 2026
10 tasks
@smagnani96 smagnani96 added backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. and removed needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch labels Mar 16, 2026
@github-actions github-actions bot added backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. backport-done/1.19 The backport for Cilium 1.19.x for this PR is done. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. and removed backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. labels Mar 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. backport-done/1.19 The backport for Cilium 1.19.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants