fix(policy): guard against deserialization errors in watches#14844

Merged
adleong merged 4 commits into main from alex/guards-deserialize-him
Jan 12, 2026

@adleong adleong commented Jan 10, 2026

Fixes #14741

When the policy controller watches a resource type and encounters a resource that it cannot deserialize, the entire watch fails and needs to be restarted. On restart, the problematic resource is encountered again, leading to an infinite loop of watch restarts. This can happen when a resource has an enum variant that is not present in Linkerd's client bindings, such as the CORS filter in HTTPRoute described in #14741.

We add a DeserializeGuard to all of the resource watches in the policy controller so that when a resource cannot be deserialized, that event is logged and the event is skipped, allowing the watch to continue.
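
To illustrate the difference, here is a minimal std-only sketch, not the code in this PR. It assumes a hypothetical `Filter` enum and `parse_filter` function in place of the real client bindings and serde. `decode_strict` mirrors the old behavior, where one unknown variant rejects the whole list. `decode_guarded` mirrors the idea behind the guard, where each item carries its own result and failures are skipped.

```rust
use std::fmt;

// Hypothetical stand-in for the filter kinds known to the client bindings.
#[derive(Debug, Clone, PartialEq)]
enum Filter {
    RequestHeaderModifier,
    RequestRedirect,
}

// Hypothetical error type; the real error comes from the deserializer.
#[derive(Debug, Clone, PartialEq)]
struct UnknownVariant(String);

impl fmt::Display for UnknownVariant {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown variant {}", self.0)
    }
}

// Parses one raw filter name, failing on names the enum does not know.
fn parse_filter(raw: &str) -> Result<Filter, UnknownVariant> {
    match raw {
        "RequestHeaderModifier" => Ok(Filter::RequestHeaderModifier),
        "RequestRedirect" => Ok(Filter::RequestRedirect),
        other => Err(UnknownVariant(other.to_string())),
    }
}

// Strict decoding: the first bad item fails the whole list, the way an
// unguarded initial list fails the whole watch.
fn decode_strict(raw: &[&str]) -> Result<Vec<Filter>, UnknownVariant> {
    raw.iter().map(|r| parse_filter(r)).collect()
}

// Guarded decoding: each item keeps its own result; failures are recorded
// and skipped so the valid items still get through.
fn decode_guarded(raw: &[&str]) -> (Vec<Filter>, Vec<UnknownVariant>) {
    let mut ok = Vec::new();
    let mut skipped = Vec::new();
    for r in raw {
        match parse_filter(r) {
            Ok(f) => ok.push(f),
            Err(e) => skipped.push(e),
        }
    }
    (ok, skipped)
}

fn main() {
    let raw = ["RequestRedirect", "CORS", "RequestHeaderModifier"];
    match decode_strict(&raw) {
        Ok(filters) => println!("strict: {:?}", filters),
        Err(e) => println!("strict: whole list rejected: {}", e),
    }
    let (ok, skipped) = decode_guarded(&raw);
    println!("guarded: kept {:?}, skipped {:?}", ok, skipped);
}
```

In the strict path, retrying cannot help, because the same input fails the same way every time. That is the restart loop described above.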

Prior to this fix, the policy controller would log a repeating stream of this message when such a resource was encountered:

2025-11-20T13:09:10.140731Z INFO httproutes.gateway.networking.k8s.io: kubert::errors: stream failed error=failed to perform initial object list: Error deserializing response: unknown variant CORS, expected one of RequestHeaderModifier, ResponseHeaderModifier, RequestMirror, RequestRedirect, URLRewrite, ExtensionRef at line 1 column 3019

After this fix, it now logs this message once:

2026-01-10T00:59:50.265920Z  WARN httproutes.gateway.networking.k8s.io: linkerd_policy_controller_runtime::args: skipping invalid HTTPRoute resource gateway-conformance-infra/cors-allow-credentials: Unknown variant CORS. Expected one of RequestHeaderModifier, ResponseHeaderModifier, RequestMirror, RequestRedirect, URLRewrite, ExtensionRef

Signed-off-by: Alex Leong <alex@buoyant.io>
@adleong adleong requested a review from a team as a code owner January 10, 2026 01:01

@zaharidichev zaharidichev left a comment

This is great. LGTM, pending comments + CI


// A watch that uses DeserializeGuard to skip resources which fail to deserialize.
// Any deserialization errors are logged as warnings and the event is skipped.
fn guarded_watch<T, R>(

TIOLI: This can be simplified to:

  • avoid writing the logging logic for each event
  • drop the T param of the runtime, which seems to always be Option<Bound>

// A watch that uses DeserializeGuard to skip resources which fail to deserialize.
// Any deserialization errors are logged as warnings and the event is skipped.
fn guarded_watch<R>(
    runtime: &mut kubert::Runtime<Option<Bound>>,
    watcher_config: watcher::Config,
) -> impl Stream<Item = watcher::Event<R>>
where
    R: Resource + DeserializeOwned + Clone + Debug + Send + 'static,
    R::DynamicType: Default,
{
    runtime
        .watch_all::<DeserializeGuard<R>>(watcher_config)
        .filter_map(async |item| {
            let log_warning = |t: &DeserializeGuard<R>| {
                if let Err(ref err) = t.0 {
                    tracing::warn!(
                        "skipping invalid {} resource {}/{}: {}",
                        R::kind(&Default::default()),
                        t.namespace().unwrap_or_else(|| "<cluster>".to_string()),
                        t.name_any(),
                        err
                    );
                }
            };

            match item {
                watcher::Event::Apply(t) => {
                    log_warning(&t);
                    t.0.ok().map(watcher::Event::Apply)
                }
                watcher::Event::Delete(t) => {
                    log_warning(&t);
                    t.0.ok().map(watcher::Event::Delete)
                }
                watcher::Event::InitApply(t) => {
                    log_warning(&t);
                    t.0.ok().map(watcher::Event::InitApply)
                }
                watcher::Event::Init => Some(watcher::Event::Init),
                watcher::Event::InitDone => Some(watcher::Event::InitDone),
            }
        })
}
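
The shape of this suggestion can be tried without a cluster. The following is a rough std-only sketch under stated assumptions: `Event`, `Guarded`, and `unguard` are hypothetical stand-ins for `watcher::Event`, `DeserializeGuard`, and the `filter_map` closure, and warnings are pushed into a `Vec` instead of going through `tracing`. It is not the kube API.

```rust
// Hypothetical stand-in for watcher::Event; not the kube type.
#[derive(Debug, PartialEq)]
enum Event<T> {
    Init,
    InitApply(T),
    InitDone,
    Apply(T),
    Delete(T),
}

// Hypothetical stand-in for DeserializeGuard: a named resource whose
// payload either decoded or failed with a message.
struct Guarded<T> {
    name: String,
    inner: Result<T, String>,
}

// Unwraps a guarded event. Invalid payloads produce one warning and are
// dropped; Init and InitDone carry no payload and pass through unchanged.
fn unguard<T>(event: Event<Guarded<T>>, warnings: &mut Vec<String>) -> Option<Event<T>> {
    // Shared by all three payload-carrying variants so the logging logic
    // is written once.
    let mut check = |g: Guarded<T>| match g.inner {
        Ok(value) => Some(value),
        Err(err) => {
            warnings.push(format!("skipping invalid resource {}: {}", g.name, err));
            None
        }
    };
    match event {
        Event::Init => Some(Event::Init),
        Event::InitDone => Some(Event::InitDone),
        Event::InitApply(g) => check(g).map(Event::InitApply),
        Event::Apply(g) => check(g).map(Event::Apply),
        Event::Delete(g) => check(g).map(Event::Delete),
    }
}

fn main() {
    let events = vec![
        Event::Init,
        Event::InitApply(Guarded { name: "ns/good".to_string(), inner: Ok(1u32) }),
        Event::InitApply(Guarded {
            name: "ns/cors".to_string(),
            inner: Err("unknown variant CORS".to_string()),
        }),
        Event::InitDone,
    ];
    let mut warnings = Vec::new();
    let kept: Vec<Event<u32>> = events
        .into_iter()
        .filter_map(|e| unguard(e, &mut warnings))
        .collect();
    println!("kept {:?}", kept);
    println!("warnings {:?}", warnings);
}
```

Passing `Init` and `InitDone` through untouched matters: a consumer that waits for `InitDone` still sees the initial list complete even when some of its items were skipped.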

@adleong adleong merged commit 5dc0e25 into main Jan 12, 2026
98 of 103 checks passed
@adleong adleong deleted the alex/guards-deserialize-him branch January 12, 2026 21:49

Development

Successfully merging this pull request may close these issues.

linkerd control plane - policy controller high cpu usage when idle