Move to a structured status for dynamic kubelet config #63314
k8s-github-robot merged 1 commit into kubernetes:master
Conversation
pkg/registry/core/node/strategy.go
seems like the whole config field should disappear in that case
also need to strip this field (from old and new object) in PrepareForUpdate if the feature is off
Good catch, I changed it to a pointer type midway through working on this and forgot to update this.
updated PrepareForUpdate as well
pkg/apis/core/types.go
clarify whether the presence of Error means the Assigned config source will be ignored until changes are made to it
pkg/apis/core/types.go
clarify this only applies if dynamic config feature is used and a ConfigSource is specified for the node?
Technically it also applies when dynamic config is enabled, but you've set a nil configSource.
I thought you wanted resource version and uid reflected in the status, to know whether the node had observed a particular version of the source
I suppose one possibility is we've seen the assigned reference and are getting an error fetching it, in which case we wouldn't have a uid or resourceVersion
> I thought you wanted resource version and uid reflected in the status, to know whether the node had observed a particular version of the source
For Active and LastKnownGood, yes.
Assigned is just an ack, so you know what the Kubelet thinks you told it to do. When we start responding to ConfigMap mutations, the meaning of ConfigSource, assigned in spec, will become "I'm assigning you the latest config at this namespace/name, invariant to UID and ResourceVersion," and I want the meaning of Assigned in status to be "I recognize that I should use the latest config at that namespace/name, invariant to UID and ResourceVersion."
> I suppose one possibility is we've seen the assigned reference and are getting an error fetching it, in which case we wouldn't have a uid or resourceVersion
Exactly, Assigned gets acknowledged before the Kubelet tries to download the config, and if an error is encountered while downloading, Error gets set subsequently.
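For reference, the structured status under discussion has roughly this shape. This is a simplified sketch, not the exact definitions (which live in pkg/apis/core/types.go); field sets are illustrative:

```go
package main

import "fmt"

// NodeConfigSource identifies a source of node configuration; at this
// point it is a ConfigMap reference (fields simplified for illustration).
type NodeConfigSource struct {
	Namespace, Name, UID, ResourceVersion string
}

// NodeConfigStatus is roughly the structured status this PR introduces:
// Assigned acks the config the Kubelet thinks it was told to use, Active
// is the config in use at runtime, LastKnownGood is the fallback, and
// Error reports problems with the Assigned config.
type NodeConfigStatus struct {
	Assigned      *NodeConfigSource
	Active        *NodeConfigSource
	LastKnownGood *NodeConfigSource
	Error         string
}

func main() {
	// Assigned is acked before download; on a download failure, Error is
	// set while UID/ResourceVersion may remain unknown.
	status := NodeConfigStatus{
		Assigned: &NodeConfigSource{Namespace: "kube-system", Name: "node-cfg"},
		Error:    "failed to download config",
	}
	fmt.Println(status.Assigned.UID == "" && status.Error != "")
}
```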
even if we don't track mutations, we should still have the actual configmap in hand when we report status, right? why can't we report the resourceVersion we're using?
We set status when we bootstrap the controller, when an active config becomes the new last-known-good, and when we ack the assigned config. The last of those three doesn't need UID/ResourceVersion because it's a reflection of spec; but during the first two we only have metadata from SerializedNodeConfigSources. Until I made the changes in #63221 to respect ConfigMap updates, we weren't setting ResourceVersion in the SerializedNodeConfigSources (UID was the only identifier that mattered for resolving a checkpoint), so we didn't know it at bootstrap/lkg-update time.
It wouldn't be too hard to port that fix back to this PR, though.
/retest

3ac818a to 4460b42 (force-push)

/assign
[MILESTONE NOTIFIER] Milestone Pull Request: Up-to-date for process @dashpole @dchen1107 @liggitt @mtaufen
pkg/apis/core/types.go
From our discussion offline: I think having transition times, and heartbeat times on each of these, showing when they were last updated, and when it last changed would be useful. It would allow distinguishing between roll-backs to LKG and when the node config hasn't yet taken effect because if assigned has transitioned more recently than active, it hasn't taken effect, and if the opposite, it was rolled back. Even though we have well defined errors, this seems a better way to transmit this information. It can also help with correlating config changes with other, non-fatal behavior changes (e.g. lots of evictions start happening after enabling local storage). Or, for example if the kubelet wasn't updating LKG after 10 minutes, or if the kubelet was in an infinite loop during loading of config, and heartbeat hadn't been updated in a couple minutes, these would be obvious. I agree that having well defined objects in the status is helpful for this, but I think we should model this as a "typed condition", and keep the heartbeat and transition times for each of these Sources.
> distinguishing between roll-backs to LKG and when the node config hasn't yet taken effect
To clarify, this is the scenario where we're on LKG, then ConfigSource is updated by a user, then Assigned is updated by the Kubelet, but we haven't tried the new config yet, so the status looks inconsistent (and it's unclear if Error refers to the new Assigned or the previous Assigned)?
And the transition times would clarify this by allowing you to compare the transition time for Assigned with the transition time for Error?
> It can also help with correlating config changes with other, non-fatal behavior changes
I wonder if some of these debugging scenarios aren't better covered by monitoring events or timeseries derived from metrics. We could send events at every status transition, rather than just when the Kubelet restarts to try a new config.
I want to be careful with heartbeats, as they do impact scalability (every update, including heartbeats, requires a full rewrite of the Node object in etcd). But I think transition times could provide some value.
> if the kubelet was in an infinite loop during loading of config
I think you'd already get a NodeNotReady in this case.
I spent the afternoon drawing up a state machine diagram, to help clarify what we should think about for status reporting: https://docs.google.com/drawings/d/1Xk_uiDFY0Y3pN6gualoy9wDPnja9BAuT-i5JYuAZ6wE/edit?usp=sharing
I'm thinking that rather than having Assigned in the status be an early acknowledgement of ConfigSource, it might be clearer to make it a direct reflection of the fully explicit on-disk record of assigned config (like LKG is), and then ensure the error messages clearly differentiate between errors related to the Spec.ConfigSource vs errors related to the already-downloaded config payload.
In general, it's probably clearer if the status simply maps to the state machine.
And if we do report timestamps, the simplest option might just be to report the modification times of the on-disk records for Assigned and LastKnownGood. (Active is a little trickier, since this is determined at Kubelet startup and only applies to the runtime; though the NodeReady heartbeat might be a decent proxy for whether Active is up to date or not).
@dashpole and I had an offline meeting and decided to leave timestamps out for now, and potentially add them later if a controller can justify that it needs them to reason about the state of the world.
cmd/kubelet/app/server.go
nit: I'm generally not a fan of giving nil an implicit meaning. I would rather have an explicit return value indicating if we should use local or remote.
I'm not sure I agree in this case.
The controller bootstrap returns the dynamic config if it exists.
If there's no dynamic config to use it returns nil (nil => Nothing is a common idiom in Go).
In the case that there's no dynamic config to use, we just move on and use the local.
I think adding a fourth return value to tag the result is unnecessary, given the ubiquity of that idiom.
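A minimal sketch of the idiom under discussion. Names and signatures here are illustrative, not the actual server.go code: the bootstrap returns nil when there is no dynamic config, and the caller falls back to the local config without needing an extra tag value.

```go
package main

import "fmt"

// KubeletConfiguration stands in for the real config type.
type KubeletConfiguration struct{ Source string }

// bootstrapDynamicConfig is a hypothetical stand-in for the controller
// bootstrap: it returns the dynamic config if one exists, or nil when the
// caller should fall back to local config (the nil => Nothing idiom).
func bootstrapDynamicConfig(haveDynamic bool) (*KubeletConfiguration, error) {
	if !haveDynamic {
		return nil, nil // no dynamic config assigned; not an error
	}
	return &KubeletConfiguration{Source: "dynamic"}, nil
}

// loadConfig uses the dynamic config if the bootstrap returned one,
// otherwise it just moves on and uses the local config.
func loadConfig(haveDynamic bool) (*KubeletConfiguration, error) {
	local := &KubeletConfiguration{Source: "local"}
	dynamic, err := bootstrapDynamicConfig(haveDynamic)
	if err != nil {
		return nil, err
	}
	if dynamic != nil {
		return dynamic, nil
	}
	return local, nil
}

func main() {
	cfg, _ := loadConfig(false)
	fmt.Println(cfg.Source) // local
	cfg, _ = loadConfig(true)
	fmt.Println(cfg.Source) // dynamic
}
```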
pkg/apis/core/types.go
TODO: Think through this case more:
- API server is upgraded to version w/ new source subtype, Kubelets not
- User sets new source type
- Kubelet sets `Assigned` in runtime status manager, intending to ack, but had unmarshaled a config with all nil subfields, as far as it could see. So the status sync will fail API server validation.
- Kubelet sees `AllNilSubfieldsError` when it tries to produce a config source a little after updating runtime status manager, and updates the status manager with this error. But since `Assigned` was already set to an invalid value in the status manager, all status sync attempts will fail API server validation until this is corrected.
Now that we report the checkpointed intent in Assigned, rather than using it as an early ack, this isn't a concern. AllNilSubfieldsError would be encountered prior to checkpointing the record of Assigned config.
a56f36b to 2c60104 (force-push)
how does this get set in the status? It looks like we ignore syncError unless len() > 0.
When the status is constructed, SyncError dominates Error (e.g. Node.Status.Config.Error = nodeConfigStatus.syncError || nodeConfigStatus.status.Error).
For example:
- Config fails to validate, you see a `ValidateError`.
- You update config source, but you get `AllNilSubfieldsError`; this is a syncError (overlay), but the Kubelet is still internally aware of the `ValidateError`.
- You revert config source, Kubelet knows it doesn't need a restart, so it just clears the syncError, and you see the ongoing `ValidateError` again.
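The overlay behavior described above can be sketched as follows. This is a simplified illustration with assumed field and method names, not the actual status-manager code:

```go
package main

import "fmt"

// nodeConfigStatus is a simplified stand-in for the Kubelet's internal
// status-manager state; field names here are illustrative.
type nodeConfigStatus struct {
	err       string // base error, e.g. a ValidateError from loading assigned config
	syncError string // transient overlay error, e.g. AllNilSubfieldsError
}

// reportedError shows how the overlay dominates when constructing
// Node.Status.Config.Error: syncError wins if non-empty, else the base error.
func (s *nodeConfigStatus) reportedError() string {
	if len(s.syncError) > 0 {
		return s.syncError
	}
	return s.err
}

func main() {
	s := &nodeConfigStatus{err: "ValidateError"}
	fmt.Println(s.reportedError()) // ValidateError

	// A sync failure overlays the base error...
	s.syncError = "AllNilSubfieldsError"
	fmt.Println(s.reportedError()) // AllNilSubfieldsError

	// ...and clearing it reveals the ongoing base error again.
	s.syncError = ""
	fmt.Println(s.reportedError()) // ValidateError
}
```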
pkg/registry/core/node/strategy.go
since I am not familiar with the pkg/registry code-base, why is this required? Same question for the addition in PrepareForUpdate.
We strip alpha fields when the corresponding feature gate is turned off, so that they can't be written unless the feature gate is turned on. See similar code in func (nodeStrategy) PrepareForCreate/PrepareForUpdate, similarly see DropDisabledAlphaFields in pkg/api/pod/util.go.
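A minimal sketch of the field-stripping pattern being asked about, under simplified stand-in types (the real code checks the feature gate via the utilfeature package and applies the same stripping to both old and new objects in PrepareForUpdate):

```go
package main

import "fmt"

// Simplified stand-ins for the real API types.
type NodeConfigSource struct{ Name string }

type NodeSpec struct {
	ConfigSource *NodeConfigSource
}

type Node struct {
	Spec NodeSpec
}

// dynamicKubeletConfigEnabled stands in for a feature-gate check like
// utilfeature.DefaultFeatureGate.Enabled(features.DynamicKubeletConfig).
var dynamicKubeletConfigEnabled = false

// dropDisabledAlphaFields clears alpha fields when the corresponding
// feature gate is off, so they cannot be written unless the gate is on.
func dropDisabledAlphaFields(node *Node) {
	if !dynamicKubeletConfigEnabled {
		node.Spec.ConfigSource = nil
	}
}

func main() {
	n := &Node{Spec: NodeSpec{ConfigSource: &NodeConfigSource{Name: "my-configmap"}}}
	dropDisabledAlphaFields(n)
	fmt.Println(n.Spec.ConfigSource == nil) // true: field stripped while gate is off
}
```

This mirrors the intent of DropDisabledAlphaFields in pkg/api/pod/util.go: a disabled alpha feature's fields are silently cleared on create/update so they never persist.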
I approve the PR but rely on dashpole@ for a thorough review.

/approve
2c60104 to 4eb3059 (force-push)
I would prefer having Error and SyncError functions behave similarly, and I prefer qualifying the error, i.e.:
- SetLoadError
- ClearLoadError
- SetSyncError
- ClearSyncError

Or just Set...Error functions.
The fact that we store the Load error in the status, and override it with the sync error, is an implementation detail. (You don't need to call it load error. It is just what I picked since it comes from loadConfig(assigned).)
I could see renaming SetSyncError to something like SetTmpError. I don't think the status manager should assume anything about the context of the base error, which is why I just stuck with SetError.
Updated naming, and simplified to just SetError and SetErrorOverride.
Updates dynamic Kubelet config to use a structured status, rather than a node condition. This makes the status machine-readable, and thus more useful for config orchestration. Fixes: kubernetes#56896
/lgtm

[APPROVAL NOTIFIER] This PR is APPROVED. This pull-request has been approved by: dashpole, dchen1107, mtaufen. The full list of commands accepted by this bot can be found here. The pull request process is described here.

/test all

[submit-queue is verifying that this PR is safe to merge]

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.