Bug #53448
closed
cephadm: agent failures double reported by two health checks
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In: v17.0.0-9913-gdc8f3bedbb
Released In: v17.2.0~198
Upkeep Timestamp: 2025-07-14T15:50:32+00:00
Description
When agents are down, they are reported in both the agent down and the failed daemon health checks.
It's only really necessary to have them in one, and the double reporting can be confusing: the criteria for agent down are different from those for failed daemon (not reporting in time vs. systemd status), yet being put in the former automatically puts an agent in the latter.
For example, almost all the "failed cephadm daemon(s)" reported here are just repeat reports of the agents marked down:
  cluster:
    id:     f148c330-47c9-11ec-9f19-1dfe2cdc6a6d
    health: HEALTH_ERR
            126 Cephadm Agent(s) are not reporting. Hosts may be offline
            Kernel Security Module (SELinux/AppArmor) is inconsistent for 19 hosts
            131 failed cephadm daemon(s)
            failed to probe daemons or devices
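To illustrate the behavior being asked for, here is a minimal sketch of keeping each down agent in only one health check. It is not the cephadm code: `DaemonStatus`, `build_health_checks`, and the `agent_down` / `failed_daemon` keys are hypothetical names made up for this example, and the two boolean fields stand in for the two criteria described above (report overdue vs. systemd status).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DaemonStatus:
    """Hypothetical, simplified view of one daemon's state."""
    name: str             # e.g. "agent.host1" or "osd.3"
    daemon_type: str      # e.g. "agent", "osd", "mon"
    systemd_failed: bool  # criterion for the failed daemon check
    report_overdue: bool  # criterion for the agent down check


def build_health_checks(daemons: List[DaemonStatus]) -> Dict[str, List[str]]:
    """Return health check entries with each down agent listed only once."""
    # Agents that have not reported in time go into the agent down check.
    agents_down = [
        d.name for d in daemons
        if d.daemon_type == "agent" and d.report_overdue
    ]
    already_reported = set(agents_down)

    # Everything else with a failed systemd unit goes into the failed
    # daemon check, skipping agents already covered above.
    failed = [
        d.name for d in daemons
        if d.systemd_failed and d.name not in already_reported
    ]

    checks: Dict[str, List[str]] = {}
    if agents_down:
        checks["agent_down"] = agents_down
    if failed:
        checks["failed_daemon"] = failed
    return checks
```

With this kind of filtering, an agent that is both overdue and failed in systemd would show up under the agent down check only, so the failed daemon count would reflect non-agent daemons (plus any agent that failed without being marked down).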
Updated by Laura Flores about 4 years ago
@Adam King would you say that https://tracker.ceph.com/issues/53723 is related to this Tracker?
Updated by Sebastian Wagner about 4 years ago
- Related to Bug #53723: Cephadm agent fails to report and causes a health timeout added
Updated by Sebastian Wagner about 4 years ago
- Status changed from In Progress to Resolved
Updated by Laura Flores about 4 years ago
- Related to deleted (Bug #53723: Cephadm agent fails to report and causes a health timeout)
Updated by Laura Flores about 4 years ago
- Related to Bug #53723: Cephadm agent fails to report and causes a health timeout added
Updated by Laura Flores about 4 years ago
Accidentally deleted the related issue; ignore.
Updated by Upkeep Bot 8 months ago
- Merge Commit set to dc8f3bedbb1b946716242ecd888c90430ab3bec6
- Fixed In set to v17.0.0-9913-gdc8f3bedbb
- Released In set to v17.2.0~198
- Upkeep Timestamp set to 2025-07-14T15:50:32+00:00