node: cpumanager: add options to reject non SMT-aligned workload#2626
Conversation
|
On Apr 13 sig-node meeting we agreed this proposal need more discussion, hence I posted this PR to foster that discussion, but for the same reason I'm not adding approver yet. |
|
There are some discussion about external cpu manager policies. I think you may be interesting: https://mail.google.com/mail/u/0?ik=4e2f4a7bba&view=om&permmsgid=msg-f%3A1696952803188250466 |
7dddd86 to
50bd260
Compare
Summary to be filled once WIP is removed Signed-off-by: Francesco Romani <fromani@redhat.com>
50bd260 to
bfa538b
Compare
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Capture implementation details of enabling smtaware CPU Manager policy
yes, we need to agree in the community about which way is the best to extend the cpumanager in the first place. This is a very important point I'll make sure to mention in the sig-node meeting. This KEP describes the behaviour we want to achieve in the end and the PoC implementation we have: to reimplement it using external policies is a perfectly fine approach. |
Signed-off-by: Francesco Romani <fromani@redhat.com>
6a2667c to
1c26db7
Compare
…policy - Fix minor formatting and typos - Disambiguate the text in smtaware policy to explain when SMTAlignmentError occurs. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Minor formatting/typo fixes and disambiguate explanation of smtaware policy
Signed-off-by: Francesco Romani <fromani@redhat.com>
|
overhaul and update after sig-node discussion on April 20, 2021. Link to the PR TBD. Will cc relevant people ASAP. |
…ementation Recent identification of a bug in the resourceAdmitHandlers Admit handler, led to changes in the implementation whereby ContainerManagerAdmissionError is no longer returned. This patch ensures that implementation strategy captured in the KEP matches the latest smtaware CPUManager implementation. Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
Ensure Implementation stategy captured in the KEP matches latest implementation
Signed-off-by: Francesco Romani <fromani@redhat.com>
|
PRR looks good, will await SIG approval before approving |
kad
left a comment
There was a problem hiding this comment.
Few nits in wordings...
In overall: I'd suggest to re-consider to have dedicated policy name that would re-use as much as possible code from static policy, instead of creating new flag/option field in kubelet config. Reasons for that is that there are other components that can consume settings of Kubelet for some reasons (e.g. NFD or third party operators). Adding new field that alter behaviour of static policy means that all of those needs to be changed.
But beside that, the whole cause of adding SMT aligned workload placement is good thing to add.
|
|
||
| ### Proposed Change | ||
|
|
||
| We propose to add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option. If the value of this option is specified to be `reject-non-smt-aligned`, it results in further refinements of the existing static policy. |
There was a problem hiding this comment.
Frankly speaking, I'd prefer as separate policy name "smt-aligned-static" instead of option field to existing policy.
There are might be other components in the wild that reads kubelet config to determine node state and configured TopologyManager/CPUManager/... states. Instead of trying to teach those new field to parse (policy=static, option=xxx), might be better to have different policy names for different behaviours (policy=a vs. policy=b).
There was a problem hiding this comment.
We had a discussion on sig-node slack channel. The main issue of the options approach is that options can alter the behaviour of the static policy, but if legacy (= not updated) consumers/unaware operators looking only at the policy name can fail to detect this and run with wrong assumptions about the kubelet behaviour.
So, we will add a new policy name (static-with-options? static-enhanced?) which will be an alias of static, but will enable the consumption of the options. This means that to enable this new behaviour we are proposing, a operator would need to:
- enable the feature gate (needed anyway)
- change the policy name
- actually provide options ( :) )
There was a problem hiding this comment.
For more context: https://kubernetes.slack.com/archives/C0BP8PW9G/p1620737012440700
f90b869 to
af201ed
Compare
|
/assign @derekwaynecarr |
af201ed to
def5c71
Compare
During the review it was pointed out that existing operators (software component or humans) make logic on the policy name. Options may change the behaviour of the policy in a non-backward-compatible way[1], thus we need a way to signal this to consumers; we agreed to add an alias for the static policy whose sole purpose is to make this change evident. +++ [1] the change we are proposing in this KEP is not, but the cpumanager infra change we are proposing which in turn makes the change possible would enable these future options. Signed-off-by: Francesco Romani <fromani@redhat.com>
def5c71 to
7dd48d6
Compare
|
looking for a more descriptive name, changed from |
|
/lgtm Let’s get this merged and we can always come back and tweak naming details like this after we’ve iterated on the implementation a bit. |
|
/lgtm from my side as well. Thanks @fromanirh for updates. |
derekwaynecarr
left a comment
There was a problem hiding this comment.
please remove the configurable alias and then I think this is good to proceed.
| - add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option. | ||
| - finally, add a new cpu manager option called `reject-non-smt-aligned`; if present, this option will enable further refinements of the existing static policy. | ||
|
|
||
| The addition of an alias for the `static` policy is motivated by the fact the new CPUManager options can alter the behaviour of the policy in a backward incompatible way. |
There was a problem hiding this comment.
The static policy as understood today is a baseline. I am not sure I understand the configurable alias as that alias has the same meaning as the existing static policy. At the moment, I think its fine to just have the policy options bit because 'configurable' has no meaning yet until such time we enumerate each behavior as named policy tokens.
There was a problem hiding this comment.
basically, let's remove configurable alias at the moment until a future time lets you declare all attributes as policy options that inform a policy.
| ### Implementation strategy of reject-non-smt-aligned CPU Manager policy option | ||
|
|
||
| - In order to introduce the SMT-alignment check in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `reject-non-smt-aligned` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload. | ||
| - We add a new policy called `configurable`. Only if this policy is selected the options, if given, will be considered. Otherwise, they will be ignored. |
There was a problem hiding this comment.
please remove configurable until a future time.
| - [X] Feature gate (also fill in values in `kep.yaml`). | ||
| - Feature gate name: `CPUManagerPolicyOptions`. | ||
| - Components depending on the feature gate: kubelet | ||
| - [X] Change the kubelet configuration to set the CPUManager policy to `configurable` |
There was a problem hiding this comment.
remove this and just set 'static' as existing policy.
| - [X] Change the kubelet configuration to set the CPUManager policy to `configurable` | ||
| - [X] Change the kubelet configuration adding the CPUManager policy option to `reject-non-smt-aligned` | ||
| * **Does enabling the feature change any default behavior?** | ||
| - Yes, it makes the behaviour of the CPUManager static policy more restrictive and can lead to pod admission rejection. |
There was a problem hiding this comment.
note for others that cpu manager static policy can already reject pods, this just adds another reason for said rejection.
Address review comments Signed-off-by: Francesco Romani <fromani@redhat.com>
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: derekwaynecarr, ehashman, fromanirh The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
We propose a change in cpumanager to make the behaviour of latency-sensitive applications more predictable when running on SMT-enabled systems. The existing
staticpolicy is allowed to allocate single cpus (aka virtual threads on SMT-enabled systems), and this can cause physical core sharing which in turn is a major cause for noisy neighbours, which harms the behaviour of latency-sensitive applications.We add extra constraints to the aformenentioned policy, while preserving all its properties, to avoid physical core sharing.
Implementation PR link: kubernetes/kubernetes#101432
Signed-off-by: Francesco Romani fromani@redhat.com