Skip to content

apiserver: add --shutdown-delay-duration to keep serving until LBs stop sending traffic#74416

Merged
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
sttts:sttts-apiserver-minimum-shutdown-duration
Jul 12, 2019
Merged

apiserver: add --shutdown-delay-duration to keep serving until LBs stop sending traffic#74416
k8s-ci-robot merged 1 commit intokubernetes:masterfrom
sttts:sttts-apiserver-minimum-shutdown-duration

Conversation

@sttts
Copy link
Copy Markdown
Contributor

@sttts sttts commented Feb 22, 2019

This is meant to delay the apiserver shutdown for a defined time duration in order to give the SDN a chance to update changed endpoints.

The reconciler is part of the "master controller", also called "bootstrap controller". It has a pre shutdown hook triggered by the stopCh. We delay the internalStopCh being closed which triggers to stop serving.

Add --shutdown-delay-duration to kube-apiserver in order to delay a graceful shutdown. `/healthz` will keep returning success during this time and requests are normally served, but `/readyz` will return faillure immediately. This delay can be used to allow the SDN to update iptables on all nodes and stop sending traffic.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 22, 2019
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Feb 22, 2019

/assign @deads2k

@sttts sttts added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 22, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-kind Indicates a PR lacks a `kind/foo` label and requires one. label Feb 22, 2019
@sttts sttts added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Feb 22, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 22, 2019
@sttts sttts added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 22, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Feb 22, 2019
@sttts sttts removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 22, 2019
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver labels Feb 22, 2019
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Feb 22, 2019

/assign @stewart-yu

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can can remove "" at the end of line 184?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just following the style in this file.

@sttts sttts force-pushed the sttts-apiserver-minimum-shutdown-duration branch from 1ec3faf to 6418b0c Compare February 25, 2019 14:51
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 25, 2019
@deads2k
Copy link
Copy Markdown
Contributor

deads2k commented Feb 25, 2019

This lgtm. It helps limit an unnecessary race. It's not foolproof because we cannot be aware of all of our consumers, but it makes it possible to avoid unnecessary dead endpoints.

@kubernetes/sig-api-machinery-misc

@lavalamp
Copy link
Copy Markdown
Contributor

/hold

not that I necessarily disagree, but I think this might be a big enough addition to the surface area that I want to think about it for a second.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 25, 2019
@sttts sttts force-pushed the sttts-apiserver-minimum-shutdown-duration branch from 77bb860 to bd1f77a Compare July 5, 2019 12:05
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 5, 2019
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Jul 5, 2019

Rebased.

@lavalamp @logicalhan please cancel the hold here.

@sttts sttts added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 5, 2019
@sttts
Copy link
Copy Markdown
Contributor Author

sttts commented Jul 9, 2019

/retest

@sttts sttts force-pushed the sttts-apiserver-minimum-shutdown-duration branch from bd1f77a to e0d6b98 Compare July 9, 2019 20:00
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 9, 2019
@sttts sttts force-pushed the sttts-apiserver-minimum-shutdown-duration branch from e0d6b98 to 408f36b Compare July 9, 2019 20:09
@logicalhan
Copy link
Copy Markdown

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 9, 2019
@logicalhan
Copy link
Copy Markdown

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 11, 2019
@fejta-bot
Copy link
Copy Markdown

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment
@fejta-bot
Copy link
Copy Markdown

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 7e17aeb into kubernetes:master Jul 12, 2019
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@sttts: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-gce-100-performance 408f36b link /test pull-kubernetes-e2e-gce-100-performance

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

rfranzke added a commit to rfranzke/gardener that referenced this pull request Sep 19, 2019
We set the shutdown delay to `20s` which will lead to the fact that
kube-apiserver will serve requests normally and `/healthz` returns
success, but `/readyz` will immediately return `false`. Graceful
termination starts after this delay has elapsed.
We are using this to allow load balancers to stop sending traffic to
this server (the SDN has time to update the iptables on all nodes and
stop sending traffic). Previously, the kube-apiserver was stopping
servering requests while it may still got sent traffic.

See kubernetes/kubernetes#74416
rfranzke added a commit to gardener/gardener that referenced this pull request Sep 20, 2019
We set the shutdown delay to `20s` which will lead to the fact that
kube-apiserver will serve requests normally and `/healthz` returns
success, but `/readyz` will immediately return `false`. Graceful
termination starts after this delay has elapsed.
We are using this to allow load balancers to stop sending traffic to
this server (the SDN has time to update the iptables on all nodes and
stop sending traffic). Previously, the kube-apiserver was stopping
servering requests while it may still got sent traffic.

See kubernetes/kubernetes#74416
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants