feature(scheduler): implement plugin_evaluation_total metric#115082
feature(scheduler): implement plugin_evaluation_total metric#115082k8s-ci-robot merged 2 commits intokubernetes:masterfrom
Conversation
|
/cc @alculquicondor |
There was a problem hiding this comment.
counters should always end in _total https://prometheus.io/docs/practices/naming/#metric-names
plugin_evaluation_total? This is a low level metric, so I think it's fair to have the word plugin in it.
There was a problem hiding this comment.
Renamed as suggested, thanks.
|
I'll be back tomorrow. |
13a515b to
4f77178
Compare
|
Fixed based on the suggestion, please take a look. 🙏 |
By evaluating the plug being called times, what can we get from this? I mean in what scenario we need this metrics. Not opposing it, something I haven't figured out. |
|
And sorry for let this sitting a long time. Busy with works. |
This is a common question in PRR for any new features: how can an admin know if a feature is being used? This PR moves the needle in that direction. |
| if state.SkipFilterPlugins.Has(pl.Name()) { | ||
| continue | ||
| } | ||
| metrics.PluginEvaluationTotal.WithLabelValues(pl.Name(), Filter).Inc() |
There was a problem hiding this comment.
Let's add PreFilter too? Just in case some plugins only implement PreFilter.
There was a problem hiding this comment.
Or we can add to all extension points for counting plugin called times. Starting with filter is also OK.
There was a problem hiding this comment.
Let's start with preFilter/filter in this PR.
@kidddddddddddddddddddddd will work on the prescore/score.
#110643 (comment)
I'm not sure if it's worth implementing on other extension points for now because it's other extension points don't have Skip feature. I'd say we can support other extension points when we support Skip on them.
|
/label sig-instrumentation |
|
@alculquicondor: The label(s) DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/sig instrumentation |
Got it, before without the support of Skip, all enabled filter plugins' being called times should be equal, now they're not very likely, this can be grabbed by metrics. Same to new plugins added. |
| if state.SkipFilterPlugins.Has(pl.Name()) { | ||
| continue | ||
| } | ||
| metrics.PluginEvaluationTotal.WithLabelValues(pl.Name(), Filter).Inc() |
There was a problem hiding this comment.
Or we can add to all extension points for counting plugin called times. Starting with filter is also OK.
| Name: "plugin_evaluation_total", | ||
| Help: "Number of attempts to schedule pods by each plugin and the extension point (only Filter is supported now.).", | ||
| StabilityLevel: metrics.ALPHA, | ||
| }, []string{"plugin", "extension_point"}) |
There was a problem hiding this comment.
Maybe profile here is also needed?
|
/assign |
|
Thanks @kerthcet @alculquicondor. |
6fb7b49 to
608f480
Compare
|
@kerthcet @alculquicondor Sorry for being late on it. Fixed based on the suggestions:
|
|
| if state.SkipFilterPlugins.Has(pl.Name()) { | ||
| continue | ||
| } | ||
| metrics.PluginEvaluationTotal.WithLabelValues(pl.Name(), Filter, f.profileName).Inc() |
There was a problem hiding this comment.
🤔 why is the Filter value exported, but not preFilter? But this is outside of the scope of this PR.
There was a problem hiding this comment.
Seems Filter value is used in another package. (schedule_one.go)
https://github.com/sanposhiho/kubernetes/blob/799444a689da80470ad1aa87153f4feb5134b3db/pkg/scheduler/schedule_one.go#L533
It may be better to move those variables to metrics package? since they are used in the metric value's label.
There was a problem hiding this comment.
yes, that sounds like a better location.
|
LGTM label has been added. DetailsGit tree hash: 95cdc134b79f33807c3ee8847b54a32b64167d1d |
| Name: "plugin_evaluation_total", | ||
| Help: "Number of attempts to schedule pods by each plugin and the extension point (available only in PreFilter and Filter.).", | ||
| StabilityLevel: metrics.ALPHA, | ||
| }, []string{"plugin", "extension_point", "profile"}) |
There was a problem hiding this comment.
What's the cardinality of these labels?
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, sanposhiho The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
modified the release not a bit |
What type of PR is this?
/kind feature
What this PR does / why we need it:
New
"schedule_attempts""plugin_evaluation_total" metric is added to the scheduler. This metric counts how many times the specific plugin affects the scheduling result while taking Skip status into consideration.It's now only supporting the Filter status. After #114827, we can support "Score" as well.
Which issue(s) this PR fixes:
Part of #110643
Special notes for your reviewer:
Not sure if "schedule_attempts" is the most suitable name. Tell me if any other good idea on naming.
EDIT: renamed to
plugin_evaluation_totalas suggested in #115082 (comment).Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: