Skip to content

Conversation

@linkvt
Copy link
Contributor

@linkvt linkvt commented Oct 23, 2025

Hi,

when running kubectl rollout restart deployment, Kubernetes adds the kubectl.kubernetes.io/restartedAt annotation to the deployment's pod template to trigger a rolling restart. However, Knative's reconciler was removing this annotation by completely replacing the deployment spec, causing new pods to be immediately terminated instead of completing the rollout.

This change preserves externally-added metadata at both the deployment level and template level during reconciliation:

  • Deployment-level labels (was already the case) and annotations (new)
  • Pod template labels and annotations (e.g., kubectl.kubernetes.io/restartedAt)

Fixes #14705

Proposed Changes

  • Preserve deployment and template annotations and labels during reconcile
    • knative labels and annotations have priority and will overwrite other labels/annotations

Release Note

Preserve deployment and template annotations and labels during reconcile

/kind bug

Thanks for the review!

@knative-prow knative-prow bot added kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 23, 2025
@linkvt linkvt force-pushed the preserve-deployment-metadata branch from f079377 to 7632241 Compare October 23, 2025 12:03
@codecov
Copy link

codecov bot commented Oct 23, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.06%. Comparing base (60b1c39) to head (dff5d18).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #16199   +/-   ##
=======================================
  Coverage   80.05%   80.06%           
=======================================
  Files         215      215           
  Lines       13322    13330    +8     
=======================================
+ Hits        10665    10672    +7     
+ Misses       2297     2296    -1     
- Partials      360      362    +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@linkvt linkvt requested a review from Fedosin October 23, 2025 12:14
@Fedosin
Copy link
Contributor

Fedosin commented Oct 23, 2025

LGTM, but I'd like @dprotaso to take a look as well.

@linkvt
Copy link
Contributor Author

linkvt commented Oct 23, 2025

/remove-approve

Was added automatically as I'm co-release lead.

@knative-prow knative-prow bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 23, 2025
@linkvt
Copy link
Contributor Author

linkvt commented Oct 23, 2025

/ok-to-test

@knative-prow knative-prow bot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Oct 23, 2025
@dprotaso
Copy link
Member

/hold

Will revisit after the release next Tuesday

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 23, 2025
@linkvt
Copy link
Contributor Author

linkvt commented Nov 12, 2025

Hi @dprotaso , please take another look, thanks!

@aslammmuhammed
Copy link

@linkvt @dprotaso can we expect this bug fix in the next release ?

Comment on lines 79 to 80
desiredDeployment.Spec.Template.Labels = kmeta.UnionMaps(deployment.Spec.Template.Labels, have.Spec.Template.Labels)
desiredDeployment.Spec.Template.Annotations = kmeta.UnionMaps(deployment.Spec.Template.Annotations, have.Spec.Template.Annotations)
Copy link
Member

@dprotaso dprotaso Nov 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I learned from working on - https://github.com/knative/serving/pull/16186/files

If we union the have map with the desired one then we end up losing the ability to delete things.

Thus we should filter out labels/annotations we control from the have map.

I think we can just remove *.knative.dev and we also have an app label (though I don't expect us to remove that)

Everything else should be preserved. See my other PR as a starting point - there we only care about specific autoscaling labels/annotations for the pod autoscaler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!
I also thought about that but couldn't come up with a scenario where that would happen, as we (AFAIK) basically always create new deployments/revisions on changes.
It still makes sense though to be bit more defensive, I'll update the code.

@linkvt linkvt force-pushed the preserve-deployment-metadata branch from 7632241 to 1376cb1 Compare November 28, 2025 12:10
@knative-prow knative-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 28, 2025
@linkvt
Copy link
Contributor Author

linkvt commented Nov 28, 2025

I just rebased the PR, PTAL 🫶

@linkvt
Copy link
Contributor Author

linkvt commented Nov 28, 2025

/retest

When running `kubectl rollout restart deployment`, Kubernetes adds the
`kubectl.kubernetes.io/restartedAt` annotation to the deployment's pod
template to trigger a rolling restart. However, Knative's reconciler was
removing this annotation by completely replacing the deployment spec,
causing new pods to be immediately terminated instead of completing the
rollout.

This change preserves externally-added metadata at both the deployment
level and template level during reconciliation by filtering Knative-controlled
keys (*.knative.dev and app label) from the current state before merging with
the desired state. This ensures Knative maintains full control over its own
metadata while preserving external additions.
@linkvt linkvt force-pushed the preserve-deployment-metadata branch from 1376cb1 to dff5d18 Compare November 28, 2025 13:35
@dprotaso
Copy link
Member

dprotaso commented Dec 3, 2025

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 3, 2025
@dprotaso
Copy link
Member

dprotaso commented Dec 3, 2025

/lgtm
/approve

thanks for the change @linkvt 🎉

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Dec 3, 2025
@knative-prow
Copy link

knative-prow bot commented Dec 3, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 3, 2025
@knative-prow knative-prow bot merged commit bad0a6b into knative:main Dec 3, 2025
91 checks passed
@linkvt linkvt deleted the preserve-deployment-metadata branch December 4, 2025 07:50
@aslammmuhammed
Copy link

Is this change released or planning to be released soon ? Asking since i cant see new releases in https://github.com/knative/operator/releases or https://knative.dev/docs/reference/relnotes/

@linkvt
Copy link
Contributor Author

linkvt commented Dec 12, 2025

Right now it is only included in main which means it would get released in knative 1.21.

I'll trigger a cherry pick to the 1.20 release and let Dave decide whether it is OK to be included.

/cherrypick 1.20

@knative-prow-robot
Copy link
Contributor

@linkvt: cannot checkout 1.20: error checking out "1.20": exit status 1 error: pathspec '1.20' did not match any file(s) known to git

Details

In response to this:

Right now it is only included in main which means it would get released in knative 1.21.

I'll trigger a cherry pick to the 1.20 release and let Dave decide whether it is OK to be included.

/cherrypick 1.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@linkvt
Copy link
Contributor Author

linkvt commented Dec 12, 2025

/cherrypick release-1.20

Thats the correct one :)

@knative-prow-robot
Copy link
Contributor

@linkvt: new pull request created: #16303

Details

In response to this:

/cherrypick release-1.20

Thats the correct one :)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

During rollout restart of a deployment the new pod gets terminated within seconds

5 participants