Conversation
Add three tests for handling invalid metrics when scaling on multiple metrics - one for scaling up successfully (new behaviour) and two for ensuring we don't scale down (existing behaviour).
Handle a case in the Horizontal Pod Autoscaler Controller when scaling on multiple metrics and one or more is missing or invalid. If all metrics are missing - return an error and leave the isScalingActive condition as that for the last invalid metric. If some metrics are missing/invalid and some are valid and found - if a scale up would be triggered by the valid metrics ignore the missing metrics and scale up, if a scale down would be triggered, return an error and leave the isScalingActive condition as that for the last invalid metric.
|
Hi @gjtempleton. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
cc: @josephburnett |
|
/lgtm I'm not yet part of the Kubernetes org, so I can't add /ok-to-test. @mwielgus can you please. |
|
@josephburnett: changing LGTM is restricted to assignees, and only kubernetes/kubernetes repo collaborators may be assigned issues. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/ok-to-test |
|
Thanks! |
|
/test pull-kubernetes-bazel-build |
|
@MaciekPytel @mwielgus Can one of you approve? |
|
/test pull-kubernetes-bazel-build |
|
/assign @mwielgus |
|
/assign @MaciekPytel |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gjtempleton, mwielgus The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Maybe this is worthy of a new issue but regarding this merged feature. Its great that the hpa can scale up with an unknown metric, but is there a flag that is set-able to allow the HPA to scale down for unknown metrics? We have issues where the cluster scales up then a custom metric becomes unknown and the cluster stays way scaled up costing us a fortune. All because of this:
|
|
@aguckenber-chwy that's definitely best discussed as a new issue. Can definitely understand it as a use case users may want, but there's probably a number of different ways we could expose it as a feature, so likely to be some back on forth on that if nothing else. |
Quick question, where is the best spot to put feature requests? The |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Currently the HPA controller will fail to scale up targets if they are scaling on multiple metrics and one or more of these metrics is invalid/unavailable, even if one of the metrics is available and indicates a scale up should happen.
This PR allows a scale up to happen in this case, whilst still ensuring that it fails safe and refuses to scale down if one or more metrics are unavailable.
Which issue(s) this PR fixes:
Fixes #61007
Special notes for your reviewer:
I wasn't 100% sure whether to go for the first or last invalid metric to be reported as the error in the case of refusing to scale up, however going with the last simplified the changes required massively, meaning no need for passing around conditions.
Credit to @bskiba as a large amount of the changes come from her previous PR #61423
Does this PR introduce a user-facing change?:
/sig autoscaling
/priority important-longterm