Burstable QoS cgroup has cpu shares assigned#41753
Conversation
|
fyi @sjenning |
921e76f to
98c73e9
Compare
| FIRST_SERVICE_CLUSTER_IP=${FIRST_SERVICE_CLUSTER_IP:-10.0.0.1} | ||
| # if enabled, must set CGROUP_ROOT | ||
| CGROUPS_PER_QOS=${CGROUPS_PER_QOS:-false} | ||
| CGROUPS_PER_QOS=${CGROUPS_PER_QOS:-true} |
There was a problem hiding this comment.
are you meaning to change these default? or did you not mean to commit this?
There was a problem hiding this comment.
didnt mean to commit, will remove when not WIP
| func (m *qosContainerManagerImpl) setCPUReserve(configs map[v1.PodQOSClass]*CgroupConfig) error { | ||
| pods := m.getPods() | ||
| burstablePodCPURequest := int64(0) | ||
| for i := range pods { |
There was a problem hiding this comment.
nit: for _, pod := range pods? then you don't need the assignment on below
| }, | ||
| } | ||
|
|
||
| if err := m.setCPUReserve(qosConfigs); err != nil { |
There was a problem hiding this comment.
The idea was that this call would be made from the switch statement below. I see that you are wanting to set MinShares on BE regardless of whether or not any cpu= option was provided to --experimental-qos-reserve-limits, which kinda works against the flow of the code here.
What would you think about having a function that set MinShares for BE from Start() since that won't ever need updating, and then have a separate function that updates the Burstable shares according to the reserve percentage the the users specifies, which currently isn't being used to calculate the Burstable share right now (why?).
There was a problem hiding this comment.
@sjenning -- we need to run a reconciliation loop to ensure that BestEffort is always 2 shares in case anyone else ever mutates it on the host w/o kubelet knowledge.
|
@derekwaynecarr I'm swamped with v1.6 PRs atm. Based on the roadmap the following PRs are still pending:
I don't know if I will be able to review the QoS related PRs before code freeze. I talked to @dchen1107 and she felt overwhelmed too. One of us definitely wish to review this. Is QoS level enforcement critical/urgent for you and @sjenning? |
|
Qos level enforcement is needed for CPU shares to actually get enforced properly for Burstable tier relative to Guaranteed pods. We can synch tomorrow. |
98c73e9 to
a0e074a
Compare
a0e074a to
f147a73
Compare
vishh
left a comment
There was a problem hiding this comment.
Just one comment. Otherwise this PR LGTM
| } | ||
| for _, container := range pod.Spec.Containers { | ||
| if request, found := container.Resources.Requests[v1.ResourceCPU]; found { | ||
| burstablePodCPURequest += request.MilliValue() |
There was a problem hiding this comment.
Use v1. PodRequestsAndLimits() instead https://github.com/kubernetes/kubernetes/blob/master/pkg/api/v1/resource_helpers.go#L185
|
fixed up my error, i am marking this do-not-merge until seth's pr lands. |
Automatic merge from submit-queue (batch tested with PRs 41714, 41510, 42052, 41918, 31515) Disable cgroups-per-qos pending Burstable/cpu.shares being set Disable cgroups-per-qos to allow kubemark problems to still be resolved. Re-enable it once the following merge: #41753 #41644 #41621 Enabling it before cpu.shares is set on qos tiers can cause regressions since Burstable and BestEffort pods are given equal time.
|
@k8s-bot test this |
|
i am waiting to update this pr until all pertinent prereqs are merged. i am tagging it lgtm and do-not-merge in the interim per: |
801b704 to
f70fdb9
Compare
|
i rebased this pr, marking do not merge still while seth's pr lands. |
|
requisite pr has merged, will look to rebase shortly.... |
f70fdb9 to
1947e76
Compare
|
@vishh -- just pushed the rebased code, retagging. |
|
[APPROVALNOTIFIER] This PR is APPROVED The following people have approved this PR: derekwaynecarr, k8s-merge-robot Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
|
@derekwaynecarr ack. Now we wait for the merge queue to get to this PR. |
|
Automatic merge from submit-queue (batch tested with PRs 41644, 42020, 41753, 42206, 42212) |
Automatic merge from submit-queue (batch tested with PRs 41919, 41149, 42350, 42351, 42285) enable cgroups tiers and node allocatable enforcement on pods by default. ```release-note Pods are launched in a separate cgroup hierarchy than system services. ``` Depends on #41753 cc @derekwaynecarr
What this PR does / why we need it:
This PR sets the Burstable QoS cgroup cpu shares value to the sum of the pods cpu requests in that tier. We need it for proper evaluation of CPU shares in the new QoS hierarchy.
Special notes for your reviewer:
It builds against the framework proposed for #41833