kubelet: fix duplicated status updates at pod cleanup by yujuhong · Pull Request #21438 · kubernetes/kubernetes

yujuhong · 2016-02-18T00:29:10Z

cleanupTerminatedPods is responsible for checking whether a pod has been
terminated and forcing a status update to trigger the pod deletion. However, this
function is called in the periodic cleanup routine, which runs every 2 seconds.
In other words, it forces a status update for each non-running (and not yet
deleted in the apiserver) pod on every run. When batch deleting tens of pods, the
rate of new updates surpasses what the status manager can handle, causing numerous
redundant requests (and filling the status channel).

This change forces a status update only when the DeletionTimestamp is detected to
be set for a terminated pod. Note that for other, non-terminated pods, the pod
workers should be responsible for setting the correct status after killing all
the containers.

yujuhong · 2016-02-18T00:29:35Z

/cc @smarterclayton

k8s-github-robot · 2016-02-18T00:35:10Z

Labelling this PR as size/M

yujuhong · 2016-02-18T00:52:46Z

FYI, when deleting 100 pods on a node, the status update channel was completely flooded.
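
To illustrate what a "flooded" channel means here, the following is a rough sketch that assumes updates are queued on a bounded channel with a non-blocking send. The `trySend` helper and the buffer size are hypothetical and are not taken from the status manager's code.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether the update was
// queued. When the buffer is full, the update is dropped instead of blocking.
func trySend(ch chan<- string, update string) bool {
	select {
	case ch <- update:
		return true
	default:
		return false
	}
}

func main() {
	// Hypothetical small buffer; nothing drains it while updates arrive.
	ch := make(chan string, 3)
	dropped := 0
	for i := 0; i < 10; i++ {
		if !trySend(ch, fmt.Sprintf("pod-%d", i)) {
			dropped++
		}
	}
	fmt.Printf("queued=%d dropped=%d\n", len(ch), dropped)
}
```

If the producer (a cleanup loop running every 2 seconds over every non-running pod) outpaces the consumer, most sends hit the `default` branch, which is the situation the PR tries to avoid by sending fewer updates.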

timstclair · 2016-02-18T00:55:15Z

If the cached status is not found, oldStatus will be nil here, which I think will lead to a nil pointer dereference in copyStatus?

ah I meant to set that to the pod.Status, but lost it during rebasing. Will fix it.
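
The fix discussed here can be sketched as follows, assuming a cache keyed by pod UID. The `PodStatus` type and the `statusOrFallback` helper are hypothetical; the sketch only shows the idea of falling back to the pod's own status when no cached status exists, so that nothing dereferences a nil pointer.

```go
package main

import "fmt"

// PodStatus is a hypothetical, minimal stand-in for the real status type.
type PodStatus struct {
	Phase string
}

// statusOrFallback returns a copy of the cached status for uid, or the
// fallback (for example pod.Status) when the cache has no usable entry.
func statusOrFallback(cache map[string]*PodStatus, uid string, fallback PodStatus) PodStatus {
	if cached, ok := cache[uid]; ok && cached != nil {
		return *cached
	}
	return fallback
}

func main() {
	cache := map[string]*PodStatus{"uid-1": {Phase: "Running"}}
	podStatus := PodStatus{Phase: "Pending"}

	fmt.Println(statusOrFallback(cache, "uid-1", podStatus).Phase) // cached entry
	fmt.Println(statusOrFallback(cache, "uid-2", podStatus).Phase) // falls back
}
```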

k8s-bot · 2016-02-18T00:59:41Z

GCE e2e test build/test passed for commit d531fce733d2416870192e03ac685b872097a7d5.

timstclair · 2016-02-18T01:03:47Z

LGTM once the nil pointer is fixed.

yujuhong · 2016-02-18T01:05:49Z

LGTM once the nil pointer is fixed.

Thanks for the review! It's fixed.

k8s-bot · 2016-02-18T01:28:28Z

GCE e2e test build/test passed for commit 58087f835b69b6427415ea5da6a185c1e36d1923.

k8s-github-robot · 2016-02-18T18:44:44Z

PR needs rebase


k8s-github-robot · 2016-02-18T19:01:19Z

PR changed after LGTM, removing LGTM.

yujuhong · 2016-02-18T19:01:48Z

Trivial rebase (only one line of comments was touched). Re-applying lgtm.

k8s-bot · 2016-02-18T19:30:41Z

GCE e2e build/test failed for commit 386453a.

yujuhong · 2016-02-18T19:56:57Z

@k8s-bot test this: issue #21467

k8s-bot · 2016-02-18T20:29:51Z

GCE e2e test build/test passed for commit 386453a.

k8s-github-robot · 2016-02-20T20:28:17Z

@k8s-bot test this

Tests are more than 48 hours old. Re-running tests.

k8s-bot · 2016-02-20T20:55:56Z

GCE e2e test build/test passed for commit 386453a.

k8s-github-robot · 2016-02-22T06:15:37Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-bot · 2016-02-22T06:43:24Z

GCE e2e test build/test passed for commit 386453a.

k8s-github-robot · 2016-02-22T10:35:23Z

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

k8s-bot · 2016-02-22T11:04:34Z

GCE e2e test build/test passed for commit 386453a.

k8s-github-robot · 2016-02-22T11:05:01Z

Automatic merge from submit-queue

Auto commit by PR queue bot

wojtek-t · 2016-02-22T13:27:47Z

It seems that this PR somehow broke kubemark - the last two runs of kubemark are failing with "time out on deleting replication controllers", which seems clearly related to this PR ...
I'm going to revert this PR.

wojtek-t · 2016-02-22T13:39:08Z

@yujuhong - just to give you some evidence, these are the logs from the failure:

04:34:59 Feb 22 04:34:59.981: INFO: Deleting RC density3000-d7aa0532-d95f-11e5-9238-42010af01555 took: 2m32.169705578s
04:37:02 Feb 22 04:37:02.293: INFO: Terminating RC density3000-d7aa0532-d95f-11e5-9238-42010af01555 pods took: 2m2.312479847s
04:37:02 Feb 22 04:37:02.293: INFO: Unexpected error occurred: timed out waiting for the condition

Note that the "Terminating RC ..." phase took ~2s before your change; now it times out after 2 minutes.
I didn't have time to debug it deeply, but it clearly seems to be related to your change.

BTW - this is happening only on kubemark, something strange is now happening in SimpleKubelet? @gmarek can potentially give you more details on what kubemark is using

wojtek-t · 2016-02-22T14:53:40Z

And to confirm - reverting this PR fixed the problem.

yujuhong · 2016-02-22T18:48:06Z

BTW - this is happening only on kubemark, something strange is now happening in SimpleKubelet? @gmarek can potentially give you more details on what kubemark is using

I suspect the (non-)graceful termination in kubemark doesn't work as expected, because this PR only removes redundant pod deletion requests (which are a huge burden when batch deleting 100 pods). @Random-Liu has agreed to take a look and see what difference between kubemark and kubelet would cause pods not to be terminated.

wojtek-t · 2016-02-22T19:39:09Z

@yujuhong - we are running exactly the same test (the same number of nodes, the same number of pods, etc.) in both: kubemark and real cluster.
Only the former was failing, so it doesn't seem to be related to batching or anything like that. I really think it is some difference between kubelet and SimpleKubelet.

yujuhong · 2016-02-22T19:43:27Z

@yujuhong - we are running exactly the same test (the same number of nodes, the same number of pods, etc.) in both: kubemark and real cluster.

It has nothing to do with batching as you said :) I think the SimpleKubelet may be too simple and not doing things right -- in which case the duplicated, redundant deletion requests in the periodic cleanup routine may have just saved the day.

wojtek-t · 2016-02-22T19:45:51Z

Yeah - that's definitely possible.

yujuhong · 2016-02-22T23:38:44Z

Alright, I believe I found the cause. Graceful termination does not work correctly in kubemark because of the fake docker client. After "stopping" a container, kubelet was unable to detect that the container had been stopped and thus did not consider the pod fully terminated. We should fix the fake docker client.

E.g., repeatedly killing the same container:

I0222 23:16:36.861465       7 manager.go:1307] Killing container "/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc" with 30 second grace period
I0222 23:16:37.102736       7 manager.go:1339] Container "/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc" exited after 241.188219ms
W0222 23:16:37.102901       7 manager.go:1345] No ref for pod '"/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc"'
I0222 23:16:52.861760       7 manager.go:270] Ignoring same status for pod "density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg(24fef989-d9ba-11e5-bfa6-42010af00007)", status: {Phase:Running Conditions:[{Type:Ready Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-02-22 23:15:37 +0000 UTC Reason: Message:}] Message: Reason: HostIP:10.245.1.5 PodIP:2.3.4.5 StartTime:2016-02-22 23:15:21 +0000 UTC ContainerStatuses:[{Name:density60-246e5adf-d9ba-11e5-b19c-c4346b658736 State:{Waiting:<nil> Running:0xc2089477c0 Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:true RestartCount:0 Image:gcr.io/google_containers/pause:2.0 ImageID:docker:// ContainerID:docker:///k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857}]}
I0222 23:16:52.861940       7 manager.go:1307] Killing container "/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc" with 30 second grace period
I0222 23:16:53.192525       7 manager.go:1339] Container "/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc" exited after 330.523712ms
W0222 23:16:53.192595       7 manager.go:1345] No ref for pod '"/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc"'
I0222 23:17:06.861037       7 manager.go:270] Ignoring same status for pod "density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg(24fef989-d9ba-11e5-bfa6-42010af00007)", status: {Phase:Running Conditions:[{Type:Ready Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-02-22 23:15:37 +0000 UTC Reason: Message:}] Message: Reason: HostIP:10.245.1.5 PodIP:2.3.4.5 StartTime:2016-02-22 23:15:21 +0000 UTC ContainerStatuses:[{Name:density60-246e5adf-d9ba-11e5-b19c-c4346b658736 State:{Waiting:<nil> Running:0xc2082c7780 Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:true RestartCount:0 Image:gcr.io/google_containers/pause:2.0 ImageID:docker:// ContainerID:docker:///k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857}]}
I0222 23:17:06.861188       7 manager.go:1307] Killing container "/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc" with 30 second grace period
I0222 23:17:07.089368       7 manager.go:1339] Container "/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc" exited after 228.148632ms
W0222 23:17:07.089424       7 manager.go:1345] No ref for pod '"/k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857 density60-246e5adf-d9ba-11e5-b19c-c4346b658736 e2e-tests-density-c5hwg/density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc"'
I0222 23:17:20.861106       7 manager.go:270] Ignoring same status for pod "density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg(24fef989-d9ba-11e5-bfa6-42010af00007)", status: {Phase:Running Conditions:[{Type:Ready Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-02-22 23:15:37 +0000 UTC Reason: Message:}] Message: Reason: HostIP:10.245.1.5 PodIP:2.3.4.5 StartTime:2016-02-22 23:15:21 +0000 UTC ContainerStatuses:[{Name:density60-246e5adf-d9ba-11e5-b19c-c4346b658736 State:{Waiting:<nil> Running:0xc208bdc780 Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:true RestartCount:0 Image:gcr.io/google_containers/pause:2.0 ImageID:docker:// ContainerID:docker:///k8s_density60-246e5adf-d9ba-11e5-b19c-c4346b658736.a45f144c_density60-246e5adf-d9ba-11e5-b19c-c4346b658736-6hzbc_e2e-tests-density-c5hwg_24fef989-d9ba-11e5-bfa6-42010af00007_52741857}]}

Random-Liu · 2016-02-23T01:36:40Z

@yujuhong Thanks!
I found the root cause. In the function updateContainerStatus(), only statuses in ContainerList were updated. But if we stop a container, it will be moved to ExitedContainerList first, so its status won't be updated to "Exited".
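
A rough sketch of the bug described above, under the assumption that the fake client keeps running and exited containers in two separate slices. The `fakeClient` type and its `stop` and `setStatus` methods are hypothetical and do not mirror the real fake docker client; the sketch only shows why a status update has to look in both lists.

```go
package main

import "fmt"

type container struct {
	ID     string
	Status string
}

// fakeClient is a hypothetical fake that tracks two lists, loosely modeled on
// the ContainerList / ExitedContainerList split mentioned in the comment.
type fakeClient struct {
	containerList       []container
	exitedContainerList []container
}

// stop moves the container from the running list to the exited list.
func (f *fakeClient) stop(id string) {
	for i, c := range f.containerList {
		if c.ID == id {
			f.containerList = append(f.containerList[:i], f.containerList[i+1:]...)
			f.exitedContainerList = append(f.exitedContainerList, c)
			return
		}
	}
}

// setStatus searches BOTH lists. Searching only containerList would miss a
// container that stop has already moved, leaving its old status in place.
func (f *fakeClient) setStatus(id, status string) bool {
	for _, list := range [][]container{f.containerList, f.exitedContainerList} {
		for i := range list {
			if list[i].ID == id {
				list[i].Status = status
				return true
			}
		}
	}
	return false
}

func main() {
	f := &fakeClient{containerList: []container{{ID: "c1", Status: "Up"}}}
	f.stop("c1")
	fmt.Println(f.setStatus("c1", "Exited"), f.exitedContainerList[0].Status)
}
```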

gmarek · 2016-02-23T06:49:33Z

cc @fejta

yujuhong added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Feb 18, 2016

googlebot added the cla: yes label Feb 18, 2016

yujuhong assigned timstclair Feb 18, 2016

k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 18, 2016

timstclair reviewed Feb 18, 2016

yujuhong mentioned this pull request Feb 18, 2016

e2e: start 100 pods in the density test #21443

Merged

yujuhong force-pushed the terminated_pods branch from d531fce to 58087f8 Compare February 18, 2016 00:58

timstclair added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2016

k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 18, 2016

yujuhong force-pushed the terminated_pods branch from 58087f8 to 386453a Compare February 18, 2016 19:01

k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 18, 2016

yujuhong added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 18, 2016

k8s-github-robot pushed a commit that referenced this pull request Feb 22, 2016

Merge pull request #21438 from yujuhong/terminated_pods

6716344

Auto commit by PR queue bot

k8s-github-robot merged commit 6716344 into kubernetes:master Feb 22, 2016

wojtek-t mentioned this pull request Feb 22, 2016

Revert "kubelet: fix duplicated status updates at pod cleanup" #21676

Merged

Random-Liu mentioned this pull request Feb 23, 2016

kubelet: fix duplicated status updates at pod cleanup #21732

Merged

This was referenced Feb 25, 2016

kubelet: fix duplicated status updates at pod cleanup #21959

Merged

kubelet: fix duplicated status updates at pod cleanup #22155

Merged

yujuhong deleted the terminated_pods branch November 7, 2016 21:39


9 participants