prevent mutation of deletion options during delete collection #100101

Merged: k8s-ci-robot merged 1 commit into kubernetes:master from deads2k:mutated-options on Jul 9, 2021
@deads2k

@deads2k deads2k commented Mar 10, 2021

Contributor

DeepCopy the deletion options, because individual graceful deleters communicate changes via a mutating function in the delete strategy that is called from the delete method. While that is always ugly, it works when making a single call. When making multiple calls via delete collection, the mutation applied to pod/A can change the options ultimately used for pod/B. This can result in pods being terminated non-gracefully.

The mutation happens here: https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/core/pod/strategy.go#L159. It is documented as the proper way for graceful deleters to indicate new values.
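To make the failure mode concrete, here is a minimal, self-contained sketch of how a shared options pointer can leak one pod's grace period into the next. Everything in it is an assumption for illustration: `DeleteOptions`, `Pod`, `checkGracefulDelete`, and `deleteCollection` are simplified stand-ins, not the actual Kubernetes types or the code changed in this PR.

```go
package main

import "fmt"

// DeleteOptions is a hypothetical, simplified stand-in for metav1.DeleteOptions.
type DeleteOptions struct {
	GracePeriodSeconds *int64
}

// DeepCopy returns a copy that shares no pointers with the receiver.
func (o *DeleteOptions) DeepCopy() *DeleteOptions {
	if o == nil {
		return nil
	}
	out := &DeleteOptions{}
	if o.GracePeriodSeconds != nil {
		v := *o.GracePeriodSeconds
		out.GracePeriodSeconds = &v
	}
	return out
}

// Pod is a hypothetical pod with only the fields this sketch needs.
type Pod struct {
	Name                          string
	Scheduled                     bool
	TerminationGracePeriodSeconds int64
}

// checkGracefulDelete mimics a per-pod strategy hook: it decides the grace
// period for one pod and writes the result back into opts.
func checkGracefulDelete(p Pod, opts *DeleteOptions) {
	period := p.TerminationGracePeriodSeconds
	if opts.GracePeriodSeconds != nil {
		period = *opts.GracePeriodSeconds // a value already in opts wins
	}
	if !p.Scheduled {
		period = 0 // assumed rule: unscheduled pods are deleted immediately
	}
	opts.GracePeriodSeconds = &period // the mutation that can leak
}

// deleteCollection returns the grace period applied to each pod. With
// copyPerItem=false every pod sees the same opts pointer.
func deleteCollection(pods []Pod, opts *DeleteOptions, copyPerItem bool) map[string]int64 {
	applied := make(map[string]int64, len(pods))
	for _, p := range pods {
		itemOpts := opts
		if copyPerItem {
			itemOpts = opts.DeepCopy()
		}
		checkGracefulDelete(p, itemOpts)
		applied[p.Name] = *itemOpts.GracePeriodSeconds
	}
	return applied
}

func main() {
	pods := []Pod{
		{Name: "a", Scheduled: false, TerminationGracePeriodSeconds: 30},
		{Name: "b", Scheduled: true, TerminationGracePeriodSeconds: 30},
	}
	fmt.Println("shared:", deleteCollection(pods, &DeleteOptions{}, false))
	fmt.Println("copied:", deleteCollection(pods, &DeleteOptions{}, true))
}
```

In this sketch, the shared run applies a grace period of 0 to pod b because pod a wrote 0 into the options first. The copied run leaves pod b with its own 30 seconds.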

/kind bug
/priority important-soon
@kubernetes/sig-node-bugs

graceful termination will now be honored when deleting a collection of pods.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Mar 10, 2021
@k8s-ci-robot k8s-ci-robot requested review from ncdc and thockin March 10, 2021 20:43
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 10, 2021
@smarterclayton smarterclayton added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Mar 10, 2021
@deads2k

deads2k commented Mar 10, 2021

Contributor Author

/retest

1 similar comment
@yangjunmyfm192085

Contributor

/retest

@yangjunmyfm192085

Contributor

/test pull-kubernetes-integration

@tkashem

tkashem commented Mar 11, 2021

Contributor

/retest

@tkashem

tkashem commented Mar 11, 2021

Contributor

/test pull-kubernetes-integration

@deads2k

deads2k commented Mar 11, 2021

Contributor Author

/hold

this would be one of the stranger bug triggers I've seen, but let's see

/test pull-kubernetes-integration

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2021
@deads2k

deads2k commented Mar 11, 2021

Contributor Author

yep, found it

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2021
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 11, 2021
Comment thread test/integration/scheduler/util.go Outdated

Contributor Author

this helper only has one callsite and it was already relying on termination with no grace.

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Mar 11, 2021
@caesarxuchao

Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 11, 2021
@smarterclayton

Contributor

Is there an explicit unit test we can add that guarantees the DELETECOLLECTION call doesn't do this?
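One possible shape for such a check, sketched under stated assumptions: snapshot the caller's options, run the collection delete, and compare. The types and functions here (`DeleteOptions`, `mutatingHook`, `leakyDeleteCollection`, `safeDeleteCollection`, `optionsMutated`) are hypothetical stand-ins, not the test that was added to the PR. A real test would use `testing.T` against the actual store.

```go
package main

import (
	"fmt"
	"reflect"
)

// DeleteOptions is a hypothetical, simplified stand-in for metav1.DeleteOptions.
type DeleteOptions struct {
	GracePeriodSeconds *int64
}

// DeepCopy returns a copy that shares no pointers with the receiver.
func (o *DeleteOptions) DeepCopy() *DeleteOptions {
	if o == nil {
		return nil
	}
	out := &DeleteOptions{}
	if o.GracePeriodSeconds != nil {
		v := *o.GracePeriodSeconds
		out.GracePeriodSeconds = &v
	}
	return out
}

// mutatingHook stands in for a per-object strategy hook that always writes
// a grace period back into the options it receives.
func mutatingHook(opts *DeleteOptions) {
	zero := int64(0)
	opts.GracePeriodSeconds = &zero
}

// leakyDeleteCollection hands the caller's options to the hook for every item.
func leakyDeleteCollection(items int, opts *DeleteOptions) {
	for i := 0; i < items; i++ {
		mutatingHook(opts)
	}
}

// safeDeleteCollection hands each item its own deep copy.
func safeDeleteCollection(items int, opts *DeleteOptions) {
	for i := 0; i < items; i++ {
		mutatingHook(opts.DeepCopy())
	}
}

// optionsMutated snapshots opts, runs fn, and reports whether opts changed.
func optionsMutated(opts *DeleteOptions, fn func(*DeleteOptions)) bool {
	before := opts.DeepCopy()
	fn(opts)
	return !reflect.DeepEqual(before, opts)
}

func main() {
	leaky := optionsMutated(&DeleteOptions{}, func(o *DeleteOptions) { leakyDeleteCollection(3, o) })
	safe := optionsMutated(&DeleteOptions{}, func(o *DeleteOptions) { safeDeleteCollection(3, o) })
	fmt.Println("leaky mutated options:", leaky)
	fmt.Println("safe mutated options:", safe)
}
```

The comparison uses `reflect.DeepEqual` on the snapshot and the live options, so a nil grace period that becomes non-nil is also reported as a mutation.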

@ehashman

Member

/assign @mrunalp

@DangerOnTheRanger

Contributor

Why is the mutation of GracePeriodSeconds on https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/core/pod/strategy.go#L159 kept if options is getting deepcopied? It looks like there's a usage in apiserver/pkg/registry/rest/delete.go, but I couldn't tell if the mutation was used there.

@ehashman

Member

/hold

@deads2k can you add a unit test per @smarterclayton?

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 26, 2021
@smarterclayton

Contributor

> Why is the mutation of GracePeriodSeconds on https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/core/pod/strategy.go#L159 kept if options is getting deepcopied? It looks like there's a usage in apiserver/pkg/registry/rest/delete.go, but I couldn't tell if the mutation was used there.

CheckGracefulDelete is "per pod". A copy has to be used any time you invoke a "per pod" function from a "per list" operation. Delete collection calling it "per list" is wrong. Note that there are several other PRs in flight dealing with the problems around what CheckGracefulDelete is doing (triggered by this thread originally, since we started looking and saw that things were burning): #102025, #102344, and #98866.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 22, 2021
@dims

dims commented Jul 8, 2021

Member

@deads2k do we want this in 1.22, given that code freeze is today and this is marked priority/critical-urgent?

@deads2k deads2k force-pushed the mutated-options branch from b3c50c0 to ddfb8b1 Compare July 8, 2021 17:55
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2021
@deads2k deads2k force-pushed the mutated-options branch from ddfb8b1 to 649b87a Compare July 8, 2021 19:36
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 8, 2021
@deads2k

deads2k commented Jul 8, 2021

Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 8, 2021
@smarterclayton

Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 8, 2021
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ehashman

ehashman commented Jul 8, 2021

Member

/milestone v1.22

@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone Jul 8, 2021
@deads2k

deads2k commented Jul 8, 2021

Contributor Author

/retest

@fejta-bot


/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 3ccfe94 into kubernetes:master Jul 9, 2021

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
