KEP-2625: Promote the CPUManager Policy Option full-pcpus-only to GA #5108
Conversation
/sig node
cosmetic changes, no changes in content besides the bare minimum needed to adjust to the new template. Signed-off-by: Francesco Romani <fromani@redhat.com>
catch up with the updates since beta graduation Signed-off-by: Francesco Romani <fromani@redhat.com>
5bde288 to eb7974c
/retitle KEP-2625: Promote CPUManager Policy Options to GA
| ##### e2e tests |
| For all these reasons we postponed this work to a later date. |
| TBD |
/hold
We should probably have this filled out before merging.
| The beta-quality options are visible by default, and the feature gate allows a positive acknowledgement that non stable features are being used, and also allows to optionally turn them off. |
| Based on the graduation criteria described below, a policy option will graduate from a group to the other (alpha to beta). |
| We plan to removete the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta. |
| We plan to remove the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta. |
It feels like we won't ever remove CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions.
We have uncorecache that is now an alpha policy option.
Is our plan really to remove this and then add it back when people want a new CPUManagerPolicy?
Or do we want to have dedicated feature gates for each policy option now?
Fair point, I need to rephrase to convey the meaning here
We still have feature gates guarding groups of options (and we do have a bunch of alpha options, not just uncorecache). Arguably the removal strategy of these FGs is a bit underspecified. We didn't touch this subject much after the KEP was first discussed. I'm open to suggestions here.
Yeah, we didn't explicitly discuss the removal strategy for feature gates. Removing a feature gate after a single cycle without new policy options seems a bit aggressive, but keeping feature gates indefinitely when they no longer guard any options isn’t ideal either.
Once all policy options have graduated to GA, it might be reasonable to wait at least two cycles before removing the feature gates if no new options are planned? Let's have this discussion during the PRR review of this PR.
Additionally, as new policy options become less frequent, we should revisit whether to reintroduce CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions when new options are proposed after the feature gates have been removed (this would help avoid feature gate proliferation, the main reason we moved away from a feature gate per policy option), or determine whether we should revert to the standard approach of gating each policy option individually.
I'm fine with this alternative, but the problem here is which KEP is supposed to track this work.
This KEP wanted to introduce a single new option, and we added the CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions along the way, as an outcome of the conversation.
Good point. All policy option KEPs already enumerate the feature gates that guard them. If that is not the case, it should be rectified. So, when all options have graduated to GA, the responsibility of cleaning up these feature gates should fall on the last policy option reaching GA.
Based on our discussion in SIG Node yesterday, it would make sense to couple this with other cleanup work to be done after GA graduation. The recommended timeline here is three releases, so IMO we should follow the same approach.
I tend to agree. Either that or a new minimal KEP to remove the gates once we reach idle time.
Once PRR team has had a chance to weigh in on this, I would capture this discussion in the KEP or in the issue so that the plan is properly documented for future reference and doesn't get lost in github comments.
Yes, that makes sense. Once we are done with most options we can retire the gates. If later we add an option here or there, we can probably give them their own gates (or not, we can decide then).
Perfect, further rephrased to capture this conversation.
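For reference, the interaction between the group feature gates and the per-option settings can be sketched as a kubelet configuration fragment. This is an illustrative sketch, not taken from the KEP; field names follow the KubeletConfiguration API, and the option shown is just one example of a beta-quality option.

```yaml
# KubeletConfiguration fragment (illustrative sketch, not a complete config).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Group gate: a positive acknowledgement that non-stable (beta) options
  # are in use. Setting it to false turns all beta-quality options off at once.
  CPUManagerPolicyBetaOptions: true
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  # Individual option; guarded by the group gate above until it reaches GA.
  full-pcpus-only: "true"
```

Once every option in a group graduates to stable, the corresponding group gate no longer guards anything, which is exactly the removal question discussed above.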
| N/A. |
| ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? |
I don't think these are SLIs.
https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md
I see what you mean, but at the same time this is what the template is suggesting https://github.com/kubernetes/enhancements/blob/master/keps/NNNN-kep-template/README.md#what-are-the-slis-service-level-indicators-an-operator-can-use-to-determine-the-health-of-the-service and I followed that. What else should be set here?
Are the "errors" here system or user error? If they are system errors, then the error rate (errors/requests) is an SLI. If they are user errors, it's not really. Latency for configuring this could be another SLI, but it is encompassed by the general Pod provisioning SLI and I don't think we need to track this separately.
Since there are no SLOs the SLIs don't mean much. But if this is a system error, we probably should use it to specify an SLO as well.
I am OK leaving these in here for documentation, regardless (or they could move to the troubleshooting section)
cpu_manager_pinning_errors_total are system errors. Meaning: the system was asked to pin CPUs because the workload requested them, but failed to do so. cpu_manager_pinning_requests_total tracks the number of workloads (containers) managed by the kubelet which requested cpu pinning (the usual cpumanager requirement: guaranteed QoS pods, integral CPUs requested).
AFAICT I agree there are no SLOs, but these metrics may still be helpful to troubleshoot nodes.
TL;DR: AFAIU I can leave this part unchanged, so I'm doing that. Please let me know if I need to change something after the clarification above.
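For troubleshooting, the two counters mentioned above can be combined into a pinning failure ratio. A sketch in PromQL, assuming the kubelet's /metrics endpoint is scraped by Prometheus (the grouping label depends on your scrape config; `instance` is typical):

```promql
# Fraction of CPU pinning requests that failed, per kubelet, over the last 15m.
# A persistently non-zero ratio points at alignment/allocation problems.
sum by (instance) (rate(cpu_manager_pinning_errors_total[15m]))
/
sum by (instance) (rate(cpu_manager_pinning_requests_total[15m]))
```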
| ###### Are there any missing metrics that would be useful to have to improve observability of this feature? |
| We can detail the pinning errors total with a new metric like `cpu_manager_errors_count` or |
For GA, I would expect that this metric is there already.
Fair enough, we can add this as GA requirement.
Depends on: kubernetes/kubernetes#129529
So if it depends on that issue, then are we not promoting this to GA in this release? and adding a new metric as a new beta?
While we can wait for another cycle, adding metrics in the context of graduation is something we did repeatedly in the past and AFAIK is not controversial or uncommon.
I can also redo the work I did on 129529, it's just a bit wasteful and I'd prefer not to, but I can if this simplifies the flow.
> While we can wait for another cycle, adding metrics in the context of graduation is something we did repeatedly in the past and AFAIK is not controversial or uncommon. I can also redo the work I did on 129529, it's just a bit wasteful and I'd prefer not to, but I can if this simplifies the flow.
Actually no, we don't need to depend on this. On second look the redo is minimal. Will file a new PR. Thanks for the comment!
Adding them in a separate PR.
eb7974c to a439e40
| In order to make the resource reporting consistent, and to avoid cascading changes in the system, we enforce the request constraints at admission time. |
| This approach follows what the Topology Manager already does. |
| ### Alternatives |
Why remove the alternatives? I think we should keep them for future reference and context.
Moved to the bottom per (AFAIU) the new KEP template.
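To illustrate the admission-time enforcement quoted above: with the static policy and full-pcpus-only enabled on an SMT-2 node, a Guaranteed pod requesting an odd number of CPUs cannot be satisfied with whole physical cores, so the kubelet rejects it at admission (with an SMT alignment error) instead of reporting inconsistent resources later. A hypothetical manifest (pod and container names are illustrative):

```yaml
# Hypothetical Guaranteed QoS pod: integral CPU request triggers pinning.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-app   # illustrative name
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resources:
      # requests == limits with integral CPUs => Guaranteed QoS,
      # so the CPU Manager pins this container.
      # cpu: 3 is odd: on an SMT-2 node it cannot cover full physical
      # cores, so admission fails when full-pcpus-only is set.
      requests:
        cpu: "3"
        memory: "256Mi"
      limits:
        cpu: "3"
        memory: "256Mi"
```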
swatisehgal
left a comment
/lgtm
Looks good from node-perspective!
a439e40 to 507487e
@swatisehgal fully agree, the outcome must be recorded in the KEP itself. Perhaps other sig-arch members should be involved in the convo. I captured your comments and moved them to a separate section, PTAL again when you have time.
johnbelamaric
left a comment
A few points in the comments I sent individually, once those updates are done, PRR should be good to go.
Fill missing content Signed-off-by: Francesco Romani <fromani@redhat.com>
507487e to da6db82
thanks @johnbelamaric ! I think I addressed all the review comments.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ffromani, johnbelamaric, mrunalp. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Details: Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing
Add metrics to report alignment allocation failures See: kubernetes/enhancements#5108 Signed-off-by: Francesco Romani <fromani@redhat.com>
`full-pcpus-only` 1.33 GA: the `full-pcpus-only` option will graduate to GA; the `CPUManagerPolicyOptions` master FG will be locked to true, and removed 3 versions later.