mgr/prometheus: introduce metric for collection time #36298
tchaikov merged 3 commits into ceph:master
I think storing a method's last run time may work in this particular case, because the methods are called together with (before) get_collect_time_metrics. But still, I would prefer counters similar to other "time" counters such as latency. I.e. it would be nice if profile_method collected not the time of the last run, but the accumulated time and the number of runs. I would also prefer a 'method' label instead of 'subsection'. It is just my opinion though.
jan--f
left a comment
I agree with @trociny's change requests.
I.e. we should measure the duration with a summary/simple histogram metric type (https://prometheus.io/docs/practices/histograms/#count-and-sum-of-observations).
Also 👍 for the s/subsection/method/ change.
I've updated the PR according to the comments and changed the counter metric to be a summary. Please review.
jan--f
left a comment
See inline comment. One nit from me: moving the MetricCollectionThread is a bit confusing and somewhat hides the actual change, but I could live with that.
requested changes have been implemented
jenkins retest this please
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved.
Introduces metric `prometheus_collect_duration_seconds` for the time it
takes the Prometheus manager module to collect all the metrics.
```
ceph_prometheus_collect_duration_seconds_sum{method="get_health"} 0.0002613067626953125
ceph_prometheus_collect_duration_seconds_sum{method="get_pool_stats"} 0.0018298625946044922
ceph_prometheus_collect_duration_seconds_sum{method="get_df"} 0.0005767345428466797
ceph_prometheus_collect_duration_seconds_sum{method="get_fs"} 0.0010402202606201172
ceph_prometheus_collect_duration_seconds_sum{method="get_quorum_status"} 0.0007524490356445312
ceph_prometheus_collect_duration_seconds_sum{method="get_mgr_status"} 0.0035364627838134766
ceph_prometheus_collect_duration_seconds_sum{method="get_pg_status"} 0.00021266937255859375
ceph_prometheus_collect_duration_seconds_sum{method="get_osd_stats"} 0.0018737316131591797
ceph_prometheus_collect_duration_seconds_sum{method="get_metadata_and_osd_status"} 0.0032796859741210938
ceph_prometheus_collect_duration_seconds_sum{method="get_num_objects"} 0.00011086463928222656
ceph_prometheus_collect_duration_seconds_sum{method="get_rbd_stats"} 0.00036144256591796875
ceph_prometheus_collect_duration_seconds_count{method="get_health"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_pool_stats"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_df"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_fs"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_quorum_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_mgr_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_pg_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_osd_stats"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_metadata_and_osd_status"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_num_objects"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_rbd_stats"} 1.0
```
Fixes: https://tracker.ceph.com/issues/46703
Signed-off-by: Patrick Seidensal <pseidensal@suse.com>
Introduces metric `prometheus_collect_duration_seconds` summary (`_sum` and `_count`) for the time it takes the Prometheus manager module to collect all the metrics.
Fixes: https://tracker.ceph.com/issues/46703
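
As a rough illustration of how the `_sum` and `_count` series can be consumed, the sketch below parses exposition lines of the shape shown in the commit message and derives a mean duration per method. The helper name `mean_durations` and the regular expression are assumptions made for this example; the sample lines are copied from the output above.

```python
import re

SAMPLE = '''\
ceph_prometheus_collect_duration_seconds_sum{method="get_health"} 0.0002613067626953125
ceph_prometheus_collect_duration_seconds_sum{method="get_df"} 0.0005767345428466797
ceph_prometheus_collect_duration_seconds_count{method="get_health"} 1.0
ceph_prometheus_collect_duration_seconds_count{method="get_df"} 1.0
'''

# Matches the _sum and _count lines of the summary and captures
# the series kind, the method label value, and the sample value.
LINE_RE = re.compile(
    r'^ceph_prometheus_collect_duration_seconds_(sum|count)'
    r'\{method="([^"]+)"\}\s+(\S+)$'
)


def mean_durations(text):
    """Return {method: mean seconds per call} from summary sum/count lines."""
    sums, counts = {}, {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip comments, blank lines, and unrelated metrics
        kind, method, value = match.groups()
        (sums if kind == "sum" else counts)[method] = float(value)
    # Only report methods that have a non-zero count.
    return {m: sums[m] / counts[m] for m in sums if counts.get(m)}


if __name__ == "__main__":
    for method, mean in mean_durations(SAMPLE).items():
        print(f"{method}: {mean:.6f} s per call")
```

In a Prometheus server the same ratio is normally computed over a time window from the rates of the two series rather than from raw totals.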