Skip to content

KEP-3960: Introducing Sleep Action for PreStop Hook#3961

Merged
k8s-ci-robot merged 11 commits into
kubernetes:masterfrom
AxeZhan:sleepAction
Jun 16, 2023
Merged

KEP-3960: Introducing Sleep Action for PreStop Hook#3961
k8s-ci-robot merged 11 commits into
kubernetes:masterfrom
AxeZhan:sleepAction

Conversation

@AxeZhan

@AxeZhan AxeZhan commented Apr 22, 2023

Copy link
Copy Markdown
Member
  • One-line PR description: This KEP proposes the addition of a new sleep action for the PreStop lifecycle hook in Kubernetes, allowing containers to pause for a specified duration before termination.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Apr 22, 2023
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 22, 2023
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/kep.yaml
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/kep.yaml Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
@bart0sh

bart0sh commented Apr 25, 2023

Copy link
Copy Markdown
Contributor

/assign @SergeyKanzhelev @mrunalp

Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/kep.yaml
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/kep.yaml Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/kep.yaml Outdated

@SergeyKanzhelev SergeyKanzhelev left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2023

@jeremyrickard jeremyrickard left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 Giving this a PRR review as a shadow this time around, have a few questions to clarify.


###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

The feature can be disabled in Alpha and Beta versions by restarting kube-apiserver with the feature-gate off. In terms of Stable versions, users can choose to opt-out by not setting the sleep field.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If users have set the sleep field and then the feature is disabled, it will just be ignored and the old behavior (i.e. no sleep) would apply?

@charles-chenzz charles-chenzz Jun 14, 2023

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then feature disabled

IIUC, you mean enabled first then setup the sleep field, then restart/disabled it? if that was the case, we want it be ignored and old behavior apply

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If users have set the sleep field and then the feature is disabled, it will just be ignored and the old behavior (i.e. no sleep) would apply?

Yes, in that case, the prestop hook will not take effect.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For most API changes, the feature gate controls the admission of new uses to the system, and not the actual implementation logic.

In other words, if you enable the feature, use the feature, then disable the feature - it keeps working, bt no NEW uses of the feature work. This is not universally true, we actually do not have a consistent rule here.

Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated

###### How can a rollout or rollback fail? Can it impact already running workloads?

The change is opt-in, it doesn't impact already running workloads. But problems with the updated validation logic may cause crashes in the apiserver.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the user determine this is the cause of crashes in the apiserver? Will there be any tests that help prevent this from making it into the release?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is misleading (crash), let me update later. If a pod with sleepAction was created and the featue is disabled. And this pod is recreated/updated by a user, the pod's yaml won't pass the validation.
In this case, an error will occur to point out the wrong field, instead of the "crash in the apiserver"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a hard rule that previously accepted objects must not later fail validation. When it comes to actual API review, we will ensure that :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a hard rule that previously accepted objects must not later fail validation. When it comes to actual API review, we will ensure that :)

Then I think, we can safely say that this feature will not impact already running workloads?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 14, 2023

@wojtek-t wojtek-t left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more comments about PRR on top of previously added by @jeremyrickard

Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated

### Version Skew Strategy

N/A

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree - it actually matters. It's definitely possible [even more, it will happen for sure at least for a moment] that the FG will be enabled only in one component and not the other.

Ideally, I would like to see matrix of:

  • kubeapiserver enabled/disable
  • kubelet enabled/disable
    with description of what exactly we expect

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, but I'm not sure what will happen in the scenario only the kube-apiserver enable this feature, will the creating request pass the validation and successsfully processed or will it be rejected?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtek-t this goes to the discussion about alpha/beta that I started and have not followed up on :)

Should the kubelet even have the gate? We are not consistent.

IOW:

  1. The feature gate only exists to prevent use of the field in the API. Once accepted on a pod, the feature is on for that pod. Kubelet treats the field like it is GA.

  2. The feature gate prevents new use of the field in the API and nullifies the effect on existing uses. If gate is disabled, Kubelet will see the field and ignore it.

What do you think makes more sense?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semantic that I believe works best is close to (2) [although I didn't fully understand "and nullifies ..." part]
I believe what we should target is:

If FG is disabled:
(a) any attempt to set the field for an object doesn't succeed - the field is silently dropped - the "dropFields" strategy: https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/core/pod/strategy.go#LL89C10-L89C31
(b) if the field was set once the FG was enabled, it stays to be set
[the above is what kube-apiserver is doing]

(c) if the FG is disabled, even if Kubelet (or in general any other component) observes the field it ignores its existence

[for (c) there are exceptions - the nice example appeared in Sidecar KEP and computing resources in scheduler, but I personally treat them as exceptions - by default I consider the above the desired semantic]

@thockin - do you have any concerns about this?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this specific field, that seems somewhat acceptable because it only matters exactly once (when pod is deleted), although it is pretty surprising that the API says it is on but its really not.

But imagine a field which is used in real-time on a long-lived resource like Service or Deployment. The object was admitted, the feature was used, then the gate was disabled, the API stil says the feature is on, but suddenly it stopped working. Worse - it could stop working on some components and not others (e.g. disabled on some kube-proxy and noth others).

That makes testing MUCH more complicated.

Now imagine an enum sort of field or a loosening of validation, where a new value was allowed by the gate, and then the gate is disabled, and the value is no longer allowable. Do we fall back on some next-closest value without updating the API? That seems terrible.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the real-time example - I would actually say that the "stops working" (or "partially stops working") is still what I want. If I'm disabling a FG, I'm doing it on purpose, so I actually want this feature to really stop working, as this is presumably causing some troubles.
I definitely agree that the current support for it is poor, but I still think it's what I want.
And eventually the FeatureGate KEP will address the issue [you register a hook for disabling FG that is clearing this field, or doing whatever else is better in a given case]

Re enum example - this is harder one. The only ones that I've seen where effectively "yet another type of optional behavior", so disabling meant "you no longer have this, so you don't get anything". Which is kind of the same case as the above.
If it won't be an "optional behavior", but rather "some behavior is required" that would become super tricky and that may justify an exception. But I guess it depends on the specific case.

So I guess the summary of my answer is: if I'm requesting disabling a FG, I actually expect the feature to become disabled. And I acknowledge the fact that if someone was using it, they may got affected/broken, but I'm disabling FG for a reason.

@deads2k @johnbelamaric - FYI - this is an interesting discussion

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpbetz also :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, in this kep, should the kubelet have the gate?

if yes:

  1. only apiserver enable the FG
  • validation will pass, but kubelet will ignore this field when pod is terminating?
  1. only kubelet enable the FG
  • new pod will fail the validation, but the existing pods will exec the sleepAction.

if no:

  • FG only controls the validation, once a container with sleepAction is set, it will always sleep before terminating regardless of whether FG still holds

Am I understanding this correct?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THis is correct.

And what I'm saying is that we should go with the first option.
I know it has drawbacks, but I think the gains from it outweigh those drawbacks. It seems that Tim is far from being convinced on the generic case, but he seems to be ok for this particular case, so let's go for it here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md Outdated
@thockin

thockin commented Jun 15, 2023

Copy link
Copy Markdown
Member

I remain LGTM on the design, not sure if this will make the deadline or not, but don't block it on me. IMO the KEP is substantially well understood and should proceeed to implementation, even if we argue a bit more about testing.

@mrunalp

mrunalp commented Jun 15, 2023

Copy link
Copy Markdown
Contributor

/approve

Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
Comment thread keps/sig-node/3960-pod-lifecycle-sleep-action/README.md
@wojtek-t

Copy link
Copy Markdown
Member

I had two more comments, but given they are quite minor I don't want to block my approval given deadline.
So I'm approving and holding the PR - please fix them and then anyone can re-lgtm.

/approve PRR
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2023
@k8s-ci-robot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AxeZhan, mrunalp, SergeyKanzhelev, thockin, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 15, 2023
@SergeyKanzhelev

Copy link
Copy Markdown
Member

/unhold

let's address as a follow up, only 30 minnutes left

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 16, 2023
@SergeyKanzhelev

Copy link
Copy Markdown
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 16, 2023
@k8s-ci-robot k8s-ci-robot merged commit 6ce7493 into kubernetes:master Jun 16, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Jun 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.

Projects

Development

Successfully merging this pull request may close these issues.