Bug #72011
closed
mgr/prometheus module at ceph_cluster is unreachable
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In: v20.3.0-1609-ge1c6a7d524
Released In:
Upkeep Timestamp: 2025-07-15T13:36:57+00:00
Description
/a/skanta-2025-07-04_23:32:34-rados-wip-bharath13-t[…]2025-07-04-0559-distro-default-smithi/8370622
Error in teuthology.log:
2025-07-05T03:25:49.688 INFO:teuthology.orchestra.run.smithi062.stderr:+ curl -s http://172.21.15.142:9095/api/v1/alerts
2025-07-05T03:25:49.692 INFO:teuthology.orchestra.run.smithi062.stderr:+ curl -s http://172.21.15.142:9095/api/v1/alerts
2025-07-05T03:25:49.692 INFO:teuthology.orchestra.run.smithi062.stderr:+ jq -e '.data | .alerts | .[] | select(.labels | .alertname == "CephMonDown") | .state == "firing"'
2025-07-05T03:25:50.273 DEBUG:teuthology.orchestra.run:got remote process result: 4
2025-07-05T03:25:50.274 INFO:teuthology.orchestra.run.smithi062.stdout:{"status":"success","data":{"alerts":[{"labels":{"alertname":"CephMgrPrometheusModuleInactive","cluster":"fa9941a0-594d-11f0-8720-adfe0268badd","instance":"ceph_cluster","job":"ceph","oid":"1.3.6.1.4.1.50495.1.2.1.6.2","severity":"critical","type":"ceph_default"},"annotations":{"description":"The mgr/prometheus module at ceph_cluster is unreachable. This could mean that the module has been disabled or the mgr daemon itself is down. Without the mgr/prometheus module metrics and alerts will no longer function. Open a shell to an admin node or toolbox pod and use 'ceph -s' to to determine whether the mgr is active. If the mgr is not active, restart it, otherwise you can determine module status with 'ceph mgr module ls'. If it is not listed as enabled, enable it with 'ceph mgr module enable prometheus'.","summary":"The mgr/prometheus module is not available"},"state":"firing","activeAt":"2025-07-05T03:22:03.245200013Z","value":"0e+00"}]}}
2025-07-05T03:25:50.275 ERROR:teuthology.run_tasks:Saw exception from tasks.
The same error was also observed in this nightly run: https://pulpito.ceph.com/teuthology-2025-07-06_20:00:21-rados-main-distro-default-smithi/, but not in the one before it: https://pulpito.ceph.com/teuthology-2025-06-29_20:00:18-rados-main-distro-default-smithi/. Of the PRs merged between these two runs, https://github.com/ceph/ceph/pull/61468 looks like the probable origin of the error.
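For anyone reading the log excerpt above: the "got remote process result: 4" line is consistent with how `jq -e` reports its exit status (0 if the last output is truthy, 1 if it is false or null, 4 if the filter produced no output at all). The response contains only a CephMgrPrometheusModuleInactive alert, so the `select(... "CephMonDown")` filter matches nothing. The Python sketch below mimics that filter; the function name `jq_e_status` and the abbreviated sample payload are illustrative assumptions, not part of the test suite.

```python
import json


def jq_e_status(response_text, alertname="CephMonDown"):
    """Approximate the exit status of:
    jq -e '.data | .alerts | .[] | select(.labels | .alertname == NAME) | .state == "firing"'
    Hypothetical helper for illustration only.
    """
    alerts = json.loads(response_text)["data"]["alerts"]
    # One boolean per alert whose alertname label matches.
    results = [
        alert.get("state") == "firing"
        for alert in alerts
        if alert.get("labels", {}).get("alertname") == alertname
    ]
    if not results:
        return 4  # jq -e: no valid result was ever produced
    return 0 if results[-1] else 1  # jq -e looks at the last output value


# Abbreviated version of the response in the log above (most labels omitted).
sample = json.dumps({
    "status": "success",
    "data": {"alerts": [{
        "labels": {
            "alertname": "CephMgrPrometheusModuleInactive",
            "instance": "ceph_cluster",
        },
        "state": "firing",
    }]},
})

print(jq_e_status(sample))
```

Under these assumptions the check returns 4 for the logged response, which suggests the test failed because the expected CephMonDown alert was absent, not because it was present in a non-firing state.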
Updated by Shraddha Agrawal 9 months ago
- Related to Bug #72012: Test failure: test_standby (tasks.mgr.test_prometheus.TestPrometheus) added
Updated by Shraddha Agrawal 9 months ago
/a/skanta-2025-07-04_23:32:34-rados-wip-bharath13-testing-2025-07-04-0559-distro-default-smithi/8370622
Updated by Nizamudeen A 9 months ago
- Status changed from New to Fix Under Review
- Assignee set to Nizamudeen A
- Pull request ID set to 64385
Updated by Laura Flores 9 months ago
- Status changed from Fix Under Review to Duplicate
Updated by Laura Flores 9 months ago
- Related to deleted (Bug #72012: Test failure: test_standby (tasks.mgr.test_prometheus.TestPrometheus))
Updated by Laura Flores 9 months ago
- Is duplicate of Bug #72012: Test failure: test_standby (tasks.mgr.test_prometheus.TestPrometheus) added
Updated by Upkeep Bot 8 months ago
- Merge Commit set to e1c6a7d5243069e841a41eee5f23181dade125a7
- Fixed In set to v20.3.0-1609-ge1c6a7d524
- Upkeep Timestamp set to 2025-07-15T13:36:57+00:00
Updated by Naveen Naidu 16 days ago
/a/yuriw-2026-03-02_18:34:01-rados-wip-yuri3-testing-2026-03-02-1622-distro-default-trial/76688