Skip to content

envoy: Fix xds server npds listeners accounting#44830

Merged
jrajahalme merged 3 commits intocilium:mainfrom
fristonio:pr/fristonio/fix/npds-listeners-accounting
Mar 26, 2026
Merged

envoy: Fix xds server npds listeners accounting#44830
jrajahalme merged 3 commits intocilium:mainfrom
fristonio:pr/fristonio/fix/npds-listeners-accounting

Conversation

@fristonio
Copy link
Copy Markdown
Member

@fristonio fristonio commented Mar 16, 2026

See commit message for details.

This affects v1.17 where if a toFQDNs rule is compounded with a CEC listener(and no other proxy listeners), the DNS response to client would skip waiting for Envoy network policy configuration. This leads to a race, where envoy
network policy update happens after client resolves the DNS and makes the request causing connection failure. For cilium version >= v1.18 we preallocate identities for FQDN selectors, so envoy doesn't need network policy update on DNS lookup.

@fristonio fristonio added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch labels Mar 16, 2026
@github-actions github-actions bot added the cilium-cli This PR contains changes related with cilium-cli label Mar 16, 2026
@fristonio fristonio force-pushed the pr/fristonio/fix/npds-listeners-accounting branch 2 times, most recently from cb8a6a6 to bc0513c Compare March 17, 2026 00:43
@fristonio
Copy link
Copy Markdown
Member Author

/ci-l7

@fristonio fristonio force-pushed the pr/fristonio/fix/npds-listeners-accounting branch from bc0513c to 26c0b07 Compare March 17, 2026 03:10
@fristonio
Copy link
Copy Markdown
Member Author

/ci-l7

3 similar comments
@cilium-ariane
Copy link
Copy Markdown

cilium-ariane bot commented Mar 17, 2026

/ci-l7

@fristonio
Copy link
Copy Markdown
Member Author

/ci-l7

@cilium-ariane
Copy link
Copy Markdown

cilium-ariane bot commented Mar 17, 2026

/ci-l7

@fristonio fristonio force-pushed the pr/fristonio/fix/npds-listeners-accounting branch from 80441f1 to 37d9247 Compare March 17, 2026 05:34
@fristonio
Copy link
Copy Markdown
Member Author

/ci-l7

1 similar comment
@cilium-ariane
Copy link
Copy Markdown

cilium-ariane bot commented Mar 17, 2026

/ci-l7

@fristonio fristonio force-pushed the pr/fristonio/fix/npds-listeners-accounting branch from 37d9247 to 44c8201 Compare March 17, 2026 06:09
@fristonio
Copy link
Copy Markdown
Member Author

/ci-l7

1 similar comment
@cilium-ariane
Copy link
Copy Markdown

cilium-ariane bot commented Mar 17, 2026

/ci-l7

@fristonio
Copy link
Copy Markdown
Member Author

fristonio commented Mar 17, 2026

/test
Edit: EKS Cluster creation failures for multiple workflows, retriggered.

@fristonio fristonio marked this pull request as ready for review March 17, 2026 16:51
@fristonio fristonio requested review from a team as code owners March 17, 2026 16:51
@fristonio fristonio requested review from christarazi, derailed and nezdolik and removed request for christarazi March 17, 2026 16:51
@fristonio fristonio requested a review from jrajahalme March 17, 2026 16:51
Copy link
Copy Markdown
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small fix needed to the CancelCompletions call

@nezdolik nezdolik removed their request for review March 20, 2026 15:11
@fristonio fristonio force-pushed the pr/fristonio/fix/npds-listeners-accounting branch from 44c8201 to 922c3e2 Compare March 22, 2026 22:26
When a network policy is updated, the decision to wait on ACK
from envoy depends on weather or not there is any listener filter that
requires NPDS(presence of `cilium.bpf_metadata` filter). Before this
change, only internal proxy listeners were accounted. This caused
NetworkPolicy update wait to be skipped if **only** the listeners configured
through CiliumEnvoyConfig were present.

Given that all CEC listeners get injected with `cilium.bpf_metadata`
filter for L3 policy enforcement, incorrect accounting leads to
a race condition where BPF policy map redirects can be configured before
envoy receives the up-to-date policy information. This can cause traffic
interruption due to policy denials in envoy.

This commit fixes this issue by keeping track of `cilium.bpf_metadata`
filter across all envoy listeners to determine the need of NPDS
subscription.

Signed-off-by: Deepesh Pathak <deepeshpathak09@gmail.com>
This commit adds a new connectivity test to validate toFQDN rules
compounded with a CEC listener for L7 processing.

The new test creates a CNP that redirects http traffic destined for
external target to envoy, where a listener configured through CCEC
proxies the request to the requested origin.

Signed-off-by: Deepesh Pathak <deepeshpathak09@gmail.com>
This commit downgrades the severity of "Missing proxy redirect" message
in incremental policy update path to INFO for redirects with listener
references.

Signed-off-by: Deepesh Pathak <deepeshpathak09@gmail.com>
@fristonio fristonio force-pushed the pr/fristonio/fix/npds-listeners-accounting branch from 922c3e2 to 6ae9e3d Compare March 22, 2026 23:16
@fristonio
Copy link
Copy Markdown
Member Author

/test

1 similar comment
@cilium-ariane
Copy link
Copy Markdown

cilium-ariane bot commented Mar 22, 2026

/test

@fristonio fristonio requested a review from jrajahalme March 23, 2026 00:43
Copy link
Copy Markdown
Contributor

@derailed derailed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fristonio Nice work!

@jrajahalme jrajahalme requested a review from christarazi March 25, 2026 17:11
@jrajahalme
Copy link
Copy Markdown
Member

@christarazi This needs a ci-structure review, could you help out?

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 25, 2026
@jrajahalme jrajahalme enabled auto-merge March 26, 2026 09:13
@jrajahalme jrajahalme added this pull request to the merge queue Mar 26, 2026
Merged via the queue into cilium:main with commit b5bdf10 Mar 26, 2026
83 checks passed
@viktor-kurchenko viktor-kurchenko added the backport/author The backport will be carried out by the author of the PR. label Mar 30, 2026
@viktor-kurchenko
Copy link
Copy Markdown
Contributor

@fristonio I've labeled the PR with backport/author, because it depends on #44597 which is marked as backport/author.

Is it ok?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. backport/author The backport will be carried out by the author of the PR. cilium-cli This PR contains changes related with cilium-cli needs-backport/1.17 This PR / issue needs backporting to the v1.17 branch needs-backport/1.18 This PR / issue needs backporting to the v1.18 branch needs-backport/1.19 This PR / issue needs backporting to the v1.19 branch ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants