[Kubelet] Delay deletion of pod from the API server until volumes are deleted by dashpole · Pull Request #41095 · kubernetes/kubernetes

dashpole · 2017-02-07T21:30:41Z

Previous PR that was reverted: #40239.

To summarize the conclusion of the previous PR after reverting:

The status manager has the most up-to-date status, but the volume manager uses the status from the pod manager, which is only as up to date as the API server.
Because of this, the previous change required an additional round trip between the kubelet and the API server.
When few pods are being added or deleted, this is only a minor issue. However, under heavy load, the QPS limit on requests to the API server causes this round trip to take ~60 seconds, which is an unacceptable increase in latency. See the graphs in #40239 (Delay deletion of pod from the API server until volumes are deleted) for the effect of QPS changes on timing.
To remedy this, the volume manager now looks at the status from the status manager, which eliminates the round trip.
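
The difference between the two status sources can be sketched as follows. This is a minimal, self-contained illustration and not the kubelet's actual code: `PodStatus`, `statusProvider`, `mapProvider`, and `isPodTerminated` are hypothetical names chosen for the example. It shows how a teardown decision based on the API-server-backed view lags one based on the kubelet's local view.

```go
package main

import "fmt"

// PodStatus is a hypothetical, stripped-down stand-in for a pod's status.
type PodStatus struct {
	Phase string // e.g. "Running", "Failed", "Succeeded"
}

// statusProvider is a hypothetical interface for anything that can report
// a pod's status by UID.
type statusProvider interface {
	GetPodStatus(uid string) (PodStatus, bool)
}

// mapProvider is a trivial in-memory provider used only for illustration.
type mapProvider map[string]PodStatus

func (m mapProvider) GetPodStatus(uid string) (PodStatus, bool) {
	s, ok := m[uid]
	return s, ok
}

// isPodTerminated reports whether the provider considers the pod finished,
// which is the precondition for tearing down its volumes in this sketch.
func isPodTerminated(p statusProvider, uid string) bool {
	s, ok := p.GetPodStatus(uid)
	if !ok {
		return false
	}
	return s.Phase == "Failed" || s.Phase == "Succeeded"
}

func main() {
	// The API-server-backed view still shows the pod as running because the
	// status update has not completed its round trip yet.
	apiView := mapProvider{"pod-1": {Phase: "Running"}}
	// The kubelet-local view already knows the containers have exited.
	localView := mapProvider{"pod-1": {Phase: "Failed"}}

	fmt.Println("teardown allowed (API view):  ", isPodTerminated(apiView, "pod-1"))
	fmt.Println("teardown allowed (local view):", isPodTerminated(localView, "pod-1"))
}
```

Swapping which provider is handed to the check is the whole point of the sketch: the decision logic stays the same and only the freshness of its input changes.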

cc: @vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc

dashpole · 2017-02-07T22:35:59Z

Dark Grey = Kubelet receives deletion request in syncloop
Red = Kubelet processes container exited PLEG event
Light Grey = Volume is torn down
Purple = Pod is deleted from the API server

Notice that there is now very little delay between red and light grey, as a result of the change in the volume manager in this PR.
The gradual slope of the purple line is due to QPS limits, and is expected.

dashpole · 2017-02-07T23:14:23Z

I also just ran all node-e2e tests on this PR on all images that we test on, and it passed all tests without any flakes.

dashpole · 2017-02-07T23:27:33Z

And just for reference, this is what the profile looked like before this change. The difference is that previously volume teardown (light grey) occurred after deletion finished (purple). Now volume teardown happens immediately after the containers are terminated.

Note: the grey line should be completely after the purple line in this graph. Where it is not, that is due to one of the pods having more than one volume (#40934).

k8s-ci-robot · 2017-02-07T23:40:28Z

@dashpole: The following test(s) failed:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| Jenkins Bazel Build | `bf7a3ee` | link | `@k8s-bot bazel test this` |
| Jenkins GCE Node e2e | `bf7a3ee` | link | `@k8s-bot node e2e test this` |
| Jenkins verification | `bf7a3ee` | link | `@k8s-bot verify test this` |
| Jenkins GCE etcd3 e2e | `c130ecd` | link | `@k8s-bot gce etcd3 e2e test this` |

dashpole · 2017-02-07T23:45:52Z

@k8s-bot gce etcd3 e2e test this

vishh · 2017-02-08T00:08:10Z

Can you squash the last two commits as it will make reverts easy via git?

vishh · 2017-02-08T00:09:08Z

/lgtm

vishh · 2017-02-08T00:09:19Z

squash the commits and I will approve it

dashpole · 2017-02-08T00:57:16Z

wanted to make it easier to review, as the last two commits are the only changes from the previous PR

vishh · 2017-02-08T01:03:56Z

/LGTM
/approve

k8s-github-robot · 2017-02-08T15:36:13Z

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: dashpole, vishh

Needs approval from an approver in each of these OWNERS Files:

~~pkg/kubelet/OWNERS~~ [vishh]
~~pkg/kubelet/volumemanager/OWNERS~~ [vishh]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

dashpole · 2017-02-08T15:38:09Z

@vishh rebasing removed the lgtm...

k8s-github-robot · 2017-02-08T18:45:16Z

Automatic merge from submit-queue

ncdc · 2017-02-09T01:49:19Z

@dashpole I'm not sure if this is the culprit, but since this PR merged, 5 out of the 6 runs of https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet/ have failed waiting for a pod to be deleted. WDYT?

ncdc · 2017-02-09T01:49:47Z

Or is that what #41157 is trying to address?

dashpole · 2017-02-09T01:49:56Z

Yes, that is what #41157 is trying to address.

dashpole · 2017-02-09T02:07:07Z

@ncdc with regard to the node-kubelet tests, I am surprised that they failed so often on those runs compared with the node-e2e testgrid runs, which have only flaked occasionally...

Automatic merge from submit-queue

Fix bug in status manager TerminatePod

In TerminatePod, we previously passed pod.Status to updateStatusInternal. This is a bug, since that is the original status we were given. Not only did it skip updates made to container statuses, but in some cases it reverted the pod's status to an earlier version, since it was being passed a stale status initially.

This was the case in #40239 and #41095. As shown in #40239, the pod's status is set to running after it is set to failed, occasionally causing very long delays in pod deletion, since we have to wait for this to be corrected.

This PR fixes the bug, adds some helpful debugging statements, and adds a unit test for TerminatePod (which for some reason didn't exist before?).

@kubernetes/sig-node-bugs @vish @Random-Liu
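
The bug described above can be sketched in a few lines. This is a simplified, hypothetical model rather than the kubelet's status manager: the `Pod`, `PodStatus`, and `manager` types and the `terminatePodBuggy` and `terminatePodFixed` methods are invented for illustration. It shows how storing the caller's original status discards the updates just made to a copy and can overwrite a newer cached status.

```go
package main

import "fmt"

// PodStatus is a hypothetical, reduced pod status.
type PodStatus struct {
	Phase      string
	Containers []string // per-container state, e.g. "running" or "terminated"
}

// Pod carries the status as seen by the caller, which may be stale.
type Pod struct {
	UID    string
	Status PodStatus
}

// manager caches the most recent status it knows for each pod.
type manager struct {
	cache map[string]PodStatus
}

// markTerminated returns a copy of s with every container marked terminated.
func markTerminated(s PodStatus) PodStatus {
	out := PodStatus{Phase: s.Phase, Containers: make([]string, len(s.Containers))}
	for i := range out.Containers {
		out.Containers[i] = "terminated"
	}
	return out
}

// latest prefers the cached status over the one carried by the pod argument.
func (m *manager) latest(pod Pod) PodStatus {
	if s, ok := m.cache[pod.UID]; ok {
		return s
	}
	return pod.Status
}

// terminatePodBuggy computes the updated status but then stores the caller's
// original status, which mirrors the mistake described in the text.
func (m *manager) terminatePodBuggy(pod Pod) {
	_ = markTerminated(m.latest(pod)) // computed, then dropped
	m.cache[pod.UID] = pod.Status
}

// terminatePodFixed stores the updated copy instead.
func (m *manager) terminatePodFixed(pod Pod) {
	m.cache[pod.UID] = markTerminated(m.latest(pod))
}

func main() {
	stale := Pod{UID: "pod-1", Status: PodStatus{Phase: "Running", Containers: []string{"running"}}}
	newer := PodStatus{Phase: "Failed", Containers: []string{"running"}}

	buggy := &manager{cache: map[string]PodStatus{"pod-1": newer}}
	buggy.terminatePodBuggy(stale)
	fmt.Printf("buggy: %+v\n", buggy.cache["pod-1"])

	fixed := &manager{cache: map[string]PodStatus{"pod-1": newer}}
	fixed.terminatePodFixed(stale)
	fmt.Printf("fixed: %+v\n", fixed.cache["pod-1"])
}
```

In the buggy path the cached phase falls back from failed to running, which is the kind of reversion that delays deletion until a later update corrects it.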

Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416)

Delay Deletion of a Pod until volumes are cleaned up

#41436 fixed the bug that caused #41095 and #40239 to have to be reverted. Now that the bug is fixed, this shouldn't cause problems.

@vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc

k8s-ci-robot added the cncf-cla: yes label Feb 7, 2017

k8s-github-robot assigned dims Feb 7, 2017

k8s-github-robot added size/L release-note-label-needed labels Feb 7, 2017

vishh assigned vishh and unassigned dims Feb 8, 2017

vishh added release-note and removed release-note-label-needed labels Feb 8, 2017

vishh changed the title ~~Reinstate: Delay deletion of pod from the API server until volumes are deleted~~ [Kubelet] Delay deletion of pod from the API server until volumes are deleted Feb 8, 2017

k8s-ci-robot added the lgtm label Feb 8, 2017

vishh added the approved label Feb 8, 2017

dashpole force-pushed the deletion_pod_lifecycle branch from c130ecd to 285706b February 8, 2017 01:00

k8s-github-robot removed the lgtm label Feb 8, 2017

k8s-ci-robot added the lgtm label Feb 8, 2017

k8s-github-robot added the needs-rebase label Feb 8, 2017

delete volumes before pod deletion

67cb270

dashpole force-pushed the deletion_pod_lifecycle branch from 285706b to 67cb270 February 8, 2017 15:35

k8s-github-robot removed the lgtm label Feb 8, 2017

k8s-github-robot removed the needs-rebase label Feb 8, 2017

vishh added the lgtm label Feb 8, 2017

k8s-github-robot merged commit 1be74e3 into kubernetes:master Feb 8, 2017

dashpole deleted the deletion_pod_lifecycle branch February 8, 2017 18:45

dashpole mentioned this pull request Feb 8, 2017

lengthen pod deletion timeout to prevent flakes #41157

Merged

This was referenced Feb 9, 2017

Revert "[Kubelet] Delay deletion of pod from the API server until volumes are deleted" #41202

Merged

Fix bug in status manager TerminatePod #41436

Merged

dashpole mentioned this pull request Feb 15, 2017

Delay Deletion of a Pod until volumes are cleaned up #41456

Merged

dashpole mentioned this pull request Mar 7, 2017

Add Pod Deletion Prometheus Metric #42673

Closed
