Skip to content

Burstable QoS cgroup has cpu shares assigned#41753

Merged
k8s-github-robot merged 1 commit into
kubernetes:masterfrom
derekwaynecarr:burstable-cpu-shares
Mar 1, 2017
Merged

Burstable QoS cgroup has cpu shares assigned#41753
k8s-github-robot merged 1 commit into
kubernetes:masterfrom
derekwaynecarr:burstable-cpu-shares

Conversation

@derekwaynecarr

@derekwaynecarr derekwaynecarr commented Feb 20, 2017

Copy link
Copy Markdown
Member

What this PR does / why we need it:
This PR sets the Burstable QoS cgroup cpu shares value to the sum of the pods cpu requests in that tier. We need it for proper evaluation of CPU shares in the new QoS hierarchy.

Special notes for your reviewer:
It builds against the framework proposed for #41833

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 20, 2017
@k8s-reviewable

Copy link
Copy Markdown

This change is Reviewable

@derekwaynecarr

Copy link
Copy Markdown
Member Author

fyi @sjenning

@k8s-github-robot k8s-github-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note-label-needed labels Feb 20, 2017
@derekwaynecarr

Copy link
Copy Markdown
Member Author

FYI @vishh @dashpole @sjenning

@derekwaynecarr derekwaynecarr assigned vishh and unassigned ncdc Feb 20, 2017
Comment thread hack/local-up-cluster.sh
FIRST_SERVICE_CLUSTER_IP=${FIRST_SERVICE_CLUSTER_IP:-10.0.0.1}
# if enabled, must set CGROUP_ROOT
CGROUPS_PER_QOS=${CGROUPS_PER_QOS:-false}
CGROUPS_PER_QOS=${CGROUPS_PER_QOS:-true}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you meaning to change these default? or did you not mean to commit this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didnt mean to commit, will remove when not WIP

func (m *qosContainerManagerImpl) setCPUReserve(configs map[v1.PodQOSClass]*CgroupConfig) error {
pods := m.getPods()
burstablePodCPURequest := int64(0)
for i := range pods {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for _, pod := range pods? then you don't need the assignment on below

},
}

if err := m.setCPUReserve(qosConfigs); err != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that this call would be made from the switch statement below. I see that you are wanting to set MinShares on BE regardless of whether or not any cpu= option was provided to --experimental-qos-reserve-limits, which kinda works against the flow of the code here.

What would you think about having a function that set MinShares for BE from Start() since that won't ever need updating, and then have a separate function that updates the Burstable shares according to the reserve percentage the the users specifies, which currently isn't being used to calculate the Burstable share right now (why?).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjenning -- we need to run a reconciliation loop to ensure that BestEffort is always 2 shares in case anyone else ever mutates it on the host w/o kubelet knowledge.

@vishh

vishh commented Feb 20, 2017

Copy link
Copy Markdown
Contributor

@derekwaynecarr I'm swamped with v1.6 PRs atm. Based on the roadmap the following PRs are still pending:

  1. Node Allocatable proposal and implementation
  2. Guaranteed scheduling for Critical pods
  3. Pod Cgroups rollout. I'm yet to identify key metrics to manage rollout.
  4. Support for multiple GPUs.
  5. Eviction improvements - pod lifecycle changes and avoiding deleting multiple pods.

I don't know if I will be able to review the QoS related PRs before code freeze. I talked to @dchen1107 and she felt overwhelmed too. One of us definitely wish to review this.

Is QoS level enforcement critical/urgent for you and @sjenning?
Let's sync up on the priority to avoid confusion after the code freeze.

@derekwaynecarr

Copy link
Copy Markdown
Member Author

Qos level enforcement is needed for CPU shares to actually get enforced properly for Burstable tier relative to Guaranteed pods. We can synch tomorrow.

@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/old-docs size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 21, 2017
@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/old-docs size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 21, 2017

@vishh vishh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment. Otherwise this PR LGTM

}
for _, container := range pod.Spec.Containers {
if request, found := container.Resources.Requests[v1.ResourceCPU]; found {
burstablePodCPURequest += request.MilliValue()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@derekwaynecarr

Copy link
Copy Markdown
Member Author

fixed up my error, i am marking this do-not-merge until seth's pr lands.

k8s-github-robot pushed a commit that referenced this pull request Feb 25, 2017
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515)

Disable cgroups-per-qos pending Burstable/cpu.shares being set

Disable cgroups-per-qos to allow kubemark problems to still be resolved.

Re-enable it once the following merge:
#41753
#41644
#41621

Enabling it before cpu.shares is set on qos tiers can cause regressions since Burstable and BestEffort pods are given equal time.
@derekwaynecarr

Copy link
Copy Markdown
Member Author

@k8s-bot test this

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 27, 2017
@derekwaynecarr

Copy link
Copy Markdown
Member Author

i am waiting to update this pr until all pertinent prereqs are merged.

i am tagging it lgtm and do-not-merge in the interim per:
#41753 (review)

@derekwaynecarr derekwaynecarr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 27, 2017
@derekwaynecarr derekwaynecarr added this to the v1.6 milestone Feb 27, 2017
@k8s-github-robot k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 28, 2017
@derekwaynecarr

Copy link
Copy Markdown
Member Author

i rebased this pr, marking do not merge still while seth's pr lands.

@derekwaynecarr derekwaynecarr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2017
@derekwaynecarr

Copy link
Copy Markdown
Member Author

requisite pr has merged, will look to rebase shortly....

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 1, 2017
@derekwaynecarr

Copy link
Copy Markdown
Member Author

@vishh -- just pushed the rebased code, retagging.

@derekwaynecarr derekwaynecarr added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. labels Mar 1, 2017
@k8s-github-robot

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: derekwaynecarr, k8s-merge-robot

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@vishh

vishh commented Mar 1, 2017

Copy link
Copy Markdown
Contributor

@derekwaynecarr ack. Now we wait for the merge queue to get to this PR.

@k8s-github-robot

Copy link
Copy Markdown

Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212)

@k8s-github-robot k8s-github-robot merged commit dfe05e0 into kubernetes:master Mar 1, 2017
k8s-github-robot pushed a commit that referenced this pull request Mar 4, 2017
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285)

enable cgroups tiers and node allocatable enforcement on pods by default.

```release-note
Pods are launched in a separate cgroup hierarchy than system services.
```
Depends on #41753

cc @derekwaynecarr
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants