Bug #64321
mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance)
Description
Description of problem
The ceph-mixins allow for dashboards and alerts to be made compatible with metrics of multiple Ceph clusters being stored in the same Prometheus instance. This can be achieved via the settings
clusterLabel: 'cluster',
showMultiCluster: true,
inside of https://github.com/ceph/ceph/blob/main/monitoring/ceph-mixin/config.libsonnet and then recompiling the dashboards and alerts.
Environment
- ceph version string: ceph version 18.2.1 (7fe91d5d5842e04be3b4f514d6dd990c54b29c76) reef (stable)
- Platform (OS/distro/release): Ubuntu 20.04 (Jammy)
How reproducible
Steps:
- set showMultiCluster to true
- run make generate
- check whether dashboards_out and prometheus_alerts.yml use the cluster label consistently, so that individual clusters can be targeted and metrics for multiple clusters can be stored in the same Prometheus instance
- some tests (make test) also seem to fail when the showMultiCluster option is enabled. Maybe testing of this option is not properly implemented at all?
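The third step can be partly automated. The following is a minimal sketch, not part of ceph-mixin: it assumes the generated dashboards are Grafana JSON files whose queries are stored under "expr" keys, and it uses a plain text heuristic (the label name followed by a match operator) instead of a real PromQL parser. The function names are hypothetical.

```python
import json
import re
from pathlib import Path


def iter_exprs(node):
    """Yield every string stored under an "expr" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "expr" and isinstance(value, str):
                yield value
            else:
                yield from iter_exprs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_exprs(item)


def exprs_missing_label(dashboard, label="cluster"):
    """Return expressions that contain no matcher on `label`.

    Heuristic only: it looks for `label` followed by =, !=, =~ or !~
    anywhere in the expression text.
    """
    matcher = re.compile(r"\b" + re.escape(label) + r"\s*(=~|!~|!=|=)")
    return [e for e in iter_exprs(dashboard) if not matcher.search(e)]


def scan_directory(path, label="cluster"):
    """Map each dashboard file name to its expressions lacking the label."""
    report = {}
    for file in sorted(Path(path).glob("*.json")):
        missing = exprs_missing_label(json.loads(file.read_text()), label)
        if missing:
            report[file.name] = missing
    return report
```

Calling scan_directory("dashboards_out") after make generate would list candidate queries to review. Because the check is textual, a join in which only one side filters on the cluster label still passes, so the result is a lower bound on the affected queries.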
Actual results
Some queries don't filter on the cluster label, so metrics of multiple clusters are returned. This results in dashboards showing metrics of multiple clusters in the same graphs or, in the case of joins, in label collisions, because the same label and value, e.g. ceph_daemon="osd.0", is present multiple times (from different clusters). For alerts using joins, these collisions cause the alerts to not be evaluated. The cluster name is also not mentioned consistently in the alert description or summary.
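The collision described above can be reproduced with a simplified model of one-to-one vector matching. This sketch is not Prometheus code: join_one_to_one is a hypothetical helper, the cluster names "a" and "b" and the hostnames are made up, and real PromQL matching has more rules than shown here. It only illustrates why matching on ceph_daemon alone is ambiguous once two clusters each have an osd.0.

```python
def join_one_to_one(left, right, on):
    """Pair series from both sides that agree on the `on` labels.

    Raises ValueError when several series on one side share the same
    values for the `on` labels, mimicking the ambiguity that a
    one-to-one join cannot resolve.
    """
    def index(series_list, side):
        table = {}
        for labels in series_list:
            key = tuple(labels.get(name, "") for name in on)
            if key in table:
                raise ValueError(f"duplicate series on {side} side for {key}")
            table[key] = labels
        return table

    left_index = index(left, "left")
    right_index = index(right, "right")
    return [(left_index[k], right_index[k]) for k in left_index if k in right_index]


# Two hypothetical clusters, each with a daemon named osd.0.
osd_up = [
    {"cluster": "a", "ceph_daemon": "osd.0"},
    {"cluster": "b", "ceph_daemon": "osd.0"},
]
osd_metadata = [
    {"cluster": "a", "ceph_daemon": "osd.0", "hostname": "host-a1"},
    {"cluster": "b", "ceph_daemon": "osd.0", "hostname": "host-b1"},
]

# Matching on ceph_daemon alone is ambiguous across clusters.
try:
    join_one_to_one(osd_up, osd_metadata, on=["ceph_daemon"])
    collided = False
except ValueError:
    collided = True

# Including the cluster label in the match makes every key unique again.
pairs = join_one_to_one(osd_up, osd_metadata, on=["cluster", "ceph_daemon"])
```

In this model, the join on ceph_daemon alone fails, while the join on cluster and ceph_daemon returns one pair per cluster, which is the behaviour the multi-cluster dashboards and alerts need.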
Expected results
After selecting a cluster in the template, only metrics for the selected Ceph cluster are shown in Grafana.
For alerts, I expect them to work with a single Prometheus instance hosting the metrics for multiple Ceph clusters.
Additional info
There also seem to be some inconsistencies in the "style" of dealing with the instance label (vs. hostname).
I raised a separate bug about that in general: https://tracker.ceph.com/issues/64288
Updated by Christian Rohmann about 2 years ago
I pushed a PR - https://github.com/ceph/ceph/pull/55495
Updated by Aashish Sharma almost 2 years ago
- Status changed from New to Pending Backport
- Backport set to squid,reef
- Pull request ID set to 55495
Updated by Aashish Sharma almost 2 years ago
- Status changed from Pending Backport to Fix Under Review
Updated by Aashish Sharma almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Aashish Sharma almost 2 years ago
- Status changed from Pending Backport to Fix Under Review
- Assignee set to Christian Rohmann
Updated by Aashish Sharma almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Pedro González Gómez almost 2 years ago
- Status changed from Pending Backport to New
- Backport changed from squid,reef to squid, reef
Updated by Pedro González Gómez almost 2 years ago
- Status changed from New to Pending Backport
Updated by Pedro González Gómez almost 2 years ago
- Status changed from Pending Backport to Resolved
Updated by Pedro González Gómez almost 2 years ago
- Status changed from Resolved to Pending Backport
Updated by Aashish Sharma almost 2 years ago
- Status changed from Pending Backport to Fix Under Review
- Assignee changed from Christian Rohmann to Aashish Sharma
Updated by Aashish Sharma almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Aashish Sharma almost 2 years ago
- Copied to Backport #65838: squid: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance) added
Updated by Aashish Sharma almost 2 years ago
- Copied to Backport #65840: reef: mgr/dashboard: dashboards and alerts from ceph-mixins not fully compatible with showMultiCluster=true (multiple Ceph clusters some Prometheus instance) added
Updated by Aashish Sharma almost 2 years ago
- Assignee changed from Aashish Sharma to Christian Rohmann
Updated by Upkeep Bot over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Upkeep Bot 9 months ago
- Merge Commit set to a8d01fff0050c90c634d1f645d10698a4d92ca47
- Fixed In set to v19.3.0-1906-ga8d01fff005
- Upkeep Timestamp set to 2025-07-09T14:05:26+00:00
Updated by Upkeep Bot 9 months ago
- Fixed In changed from v19.3.0-1906-ga8d01fff005 to v19.3.0-1906-ga8d01fff00
- Upkeep Timestamp changed from 2025-07-09T14:05:26+00:00 to 2025-07-14T17:41:24+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2997
- Upkeep Timestamp changed from 2025-07-14T17:41:24+00:00 to 2025-11-01T00:58:10+00:00