Skip to content

[Kubelet] Delay deletion of pod from the API server until volumes are deleted#41095

Merged
k8s-github-robot merged 1 commit into
kubernetes:masterfrom
dashpole:deletion_pod_lifecycle
Feb 8, 2017
Merged

[Kubelet] Delay deletion of pod from the API server until volumes are deleted#41095
k8s-github-robot merged 1 commit into
kubernetes:masterfrom
dashpole:deletion_pod_lifecycle

Conversation

@dashpole

@dashpole dashpole commented Feb 7, 2017

Copy link
Copy Markdown
Contributor

Previous PR that was reverted: #40239.

To summarize the conclusion of the previous PR after reverting:

  • The status manager has the most up-to-date status, but the volume manager uses the status from the pod manager, which only is as up-to-date as the API server.
  • Because of this, the previous change required an additional round trip between the kubelet and API server.
  • When few pods are being added or deleted, this is only a minor issue. However, when under heavy load, the QPS limit to the API server causes this round trip to take ~60 seconds, which is an unacceptable increase in latency. Take a look at the graphs in Delay deletion of pod from the API server until volumes are deleted #40239 to see the effect of QPS changes on timing.
  • To remedy this, the volume manager looks at the status from the status manager, which eliminates the round trip.

cc: @vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 7, 2017
@k8s-reviewable

Copy link
Copy Markdown

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Feb 7, 2017
@dashpole

dashpole commented Feb 7, 2017

Copy link
Copy Markdown
Contributor Author

Dark Grey = Kubelet receives deletion request in syncloop
Red = Kubelet processes container exited PLEG event
Light Grey = Volume is torn down
Purple = Pod is deleted from the API server
screenshot from 2017-02-07 14 31 42

Notice that there is now very little delay between red and light grey, as a result of the change in the volume manager in this PR.
The gradual slope of the purple line is due to QPS limits, and is expected.

@dashpole

dashpole commented Feb 7, 2017

Copy link
Copy Markdown
Contributor Author

I also just ran all node-e2e tests on this PR on all images that we test on, and it passed all tests without any flakes.

@dashpole

dashpole commented Feb 7, 2017

Copy link
Copy Markdown
Contributor Author

And just for reference, this is what the profile looked like before this change. The difference is that previously volume teardown (light grey) occurred after deletion finished (purple). Now volume teardown happens immediately after the containers are terminated.
screenshot from 2017-02-07 15 24 51

note: the grey line should be completely after the purple line in this graph. This is due to one of the pods having more than 1 volume. #40934

@k8s-ci-robot

Copy link
Copy Markdown
Contributor

@dashpole: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins Bazel Build bf7a3ee link @k8s-bot bazel test this
Jenkins GCE Node e2e bf7a3ee link @k8s-bot node e2e test this
Jenkins verification bf7a3ee link @k8s-bot verify test this
Jenkins GCE etcd3 e2e c130ecd link @k8s-bot gce etcd3 e2e test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@dashpole

dashpole commented Feb 7, 2017

Copy link
Copy Markdown
Contributor Author

@k8s-bot gce etcd3 e2e test this

@vishh

vishh commented Feb 8, 2017

Copy link
Copy Markdown
Contributor

Can you squash the last two commits as it will make reverts easy via git?

@vishh vishh assigned vishh and unassigned dims Feb 8, 2017
@vishh vishh added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Feb 8, 2017
@vishh vishh changed the title Reinstate: Delay deletion of pod from the API server until volumes are deleted [Kubelet] Delay deletion of pod from the API server until volumes are deleted Feb 8, 2017
@vishh

vishh commented Feb 8, 2017

Copy link
Copy Markdown
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2017
@vishh

vishh commented Feb 8, 2017

Copy link
Copy Markdown
Contributor

squash the commits and I will approve it

@vishh vishh added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 8, 2017
@dashpole

dashpole commented Feb 8, 2017

Copy link
Copy Markdown
Contributor Author

wanted to make it easier to review, as the last two commits are the only changes from the previous PR

@dashpole dashpole force-pushed the deletion_pod_lifecycle branch from c130ecd to 285706b Compare February 8, 2017 01:00
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2017
@vishh

vishh commented Feb 8, 2017

Copy link
Copy Markdown
Contributor

/LGTM
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 8, 2017
@dashpole dashpole force-pushed the deletion_pod_lifecycle branch from 285706b to 67cb270 Compare February 8, 2017 15:35
@k8s-github-robot

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: dashpole, vishh

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 8, 2017
@dashpole

dashpole commented Feb 8, 2017

Copy link
Copy Markdown
Contributor Author

@vishh rebasing removed the lgtm...

@vishh vishh added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2017
@k8s-github-robot

Copy link
Copy Markdown

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 1be74e3 into kubernetes:master Feb 8, 2017
@dashpole dashpole deleted the deletion_pod_lifecycle branch February 8, 2017 18:45
@ncdc

ncdc commented Feb 9, 2017

Copy link
Copy Markdown
Member

@dashpole I'm not sure if this is the culprit, but since this PR merged, 5 out of the 6 runs of https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/logs/ci-kubernetes-node-kubelet/ have failed waiting for a pod to be deleted. WDYT?

@ncdc

ncdc commented Feb 9, 2017

Copy link
Copy Markdown
Member

Or is that what #41157 is trying to address?

@dashpole

dashpole commented Feb 9, 2017

Copy link
Copy Markdown
Contributor Author

Yes, that is what #41157 is trying to address.

@dashpole

dashpole commented Feb 9, 2017

Copy link
Copy Markdown
Contributor Author

@ncdc with regards to the node-kubelet tests, I am surprised that it failed so often on those runs compared with the node-e2e testgrid runs, which have only flaked occasionally...

k8s-github-robot pushed a commit that referenced this pull request Feb 15, 2017
Automatic merge from submit-queue

Fix bug in status manager TerminatePod

In TerminatePod, we previously pass pod.Status to updateStatusInternal.  This is a bug, since it is the original status that we are given.  Not only does it skip updates made to container statuses, but in some cases it reverted the pod's status to an earlier version, since it was being passed a stale status initially.

This was the case in #40239 and #41095.  As shown in #40239, the pod's status is set to running after it is set to failed, occasionally causing very long delays in pod deletion since we have to wait for this to be corrected.

This PR fixes the bug, adds some helpful debugging statements, and adds a unit test for TerminatePod (which for some reason didnt exist before?).

@kubernetes/sig-node-bugs @vish @Random-Liu
k8s-github-robot pushed a commit that referenced this pull request Feb 16, 2017
Automatic merge from submit-queue (batch tested with PRs 41466, 41456, 41550, 41238, 41416)

Delay Deletion of a Pod until volumes are cleaned up

#41436 fixed the bug that caused #41095 and #40239 to have to be reverted.  Now that the bug is fixed, this shouldn't cause problems.

 @vishh @derekwaynecarr @sjenning @jingxu97 @kubernetes/sig-storage-misc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants