Skip to content

envoy: Handle pending completions for deleted policies#44754

Merged
jrajahalme merged 4 commits intocilium:mainfrom
jrajahalme:completion-delete-resource-name-on-policy-remove
Mar 13, 2026
Merged

envoy: Handle pending completions for deleted policies#44754
jrajahalme merged 4 commits intocilium:mainfrom
jrajahalme:completion-delete-resource-name-on-policy-remove

Conversation

@jrajahalme
Copy link
Copy Markdown
Member

@jrajahalme jrajahalme commented Mar 12, 2026

When a network policy is deleted, the resource names of them need to be deleted from any pending completions to allow the Wait on the WaitGroup to return when the ACK is received from Envoy.

Fixes: #44543
Fixes: #44714

Fixed an issue where policy update ack is never completed after endpoint deletion.

Metrics mock had nack and ack reversed, fix it.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Add a unit test for case where a completion is created for resource
upsert, but the resource is Deleted before the ACK is processed. This is
currently failing, fix is in the following commit.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme requested a review from a team as a code owner March 12, 2026 13:53
@jrajahalme jrajahalme added the kind/bug This is a bug in the Cilium logic. label Mar 12, 2026
@jrajahalme jrajahalme requested a review from sayboras March 12, 2026 13:53
@jrajahalme jrajahalme added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. affects/v1.16 This issue affects v1.16 branch needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch labels Mar 12, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 12, 2026
@jrajahalme jrajahalme added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Mar 12, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 12, 2026
When a network policy is deleted, the resource names of them need to be
deleted from any pending completions to allow the Wait on the WaitGroup
to return when the ACK is received from Envoy.

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
@jrajahalme jrajahalme force-pushed the completion-delete-resource-name-on-policy-remove branch from f7cf5f3 to 93306f3 Compare March 12, 2026 14:07
@jrajahalme
Copy link
Copy Markdown
Member Author

Fixed a Go lint finding.

@jrajahalme
Copy link
Copy Markdown
Member Author

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 13, 2026
@jrajahalme jrajahalme added the backport/author The backport will be carried out by the author of the PR. label Mar 13, 2026
@jrajahalme jrajahalme added this pull request to the merge queue Mar 13, 2026
Merged via the queue into cilium:main with commit 2448120 Mar 13, 2026
79 checks passed
@jrajahalme jrajahalme deleted the completion-delete-resource-name-on-policy-remove branch March 13, 2026 16:31
@riuvshyn
Copy link
Copy Markdown

Hey @jrajahalme thanks for working on this! Is this change going to be backported to 1.18.x ? 🙏🏽

}
if pending.typeURL == typeURL {
for _, resourceNames := range pending.remainingNodesResources {
// resourceNames map is left in place even if empty, so that
Copy link
Copy Markdown
Member

@christarazi christarazi Mar 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This comment is useful here, but it might also be worth adding a corresponding note on the other end, inside HandleResourceVersionAck where len(remainingResourceNames) == 0 is checked. That's where the empty map is actually consumed, and a future reader seeing it might not realize it was intentionally emptied by Delete. A short comment there pointing back here would close this loop IMO.

@jrajahalme jrajahalme added backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. and removed needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch labels Mar 16, 2026
@jrajahalme jrajahalme mentioned this pull request Mar 16, 2026
1 task
@jrajahalme jrajahalme added backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. and removed needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch labels Mar 16, 2026
@github-actions github-actions bot added backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. backport-done/1.19 The backport for Cilium 1.19.x for this PR is done. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. and removed backport-pending/1.18 The backport for Cilium 1.18.x for this PR is in progress. backport-pending/1.19 The backport for Cilium 1.19.x for this PR is in progress. backport-pending/1.17 The backport for Cilium 1.17.x for this PR is in progress. labels Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects/v1.16 This issue affects v1.16 branch area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport/author The backport will be carried out by the author of the PR. backport-done/1.17 The backport for Cilium 1.17.x for this PR is done. backport-done/1.18 The backport for Cilium 1.18.x for this PR is done. backport-done/1.19 The backport for Cilium 1.19.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Yet another issue with unexpected fqdn egress dropped traffic identity updater is wedged forever when missing NPDS ack from envoy

7 participants