Bug #65728
Daemon managed by cephadm in an unknown state (CEPHADM_FAILED_DAEMON)
Status: open
Description
/a/yuriw-2024-04-20_15:32:38-rados-wip-yuriw-testing-20240419.185239-main-distro-default-smithi/7664960/remote/smithi045/log/8d9a18e8-ff41-11ee-bc93-c7b262605968/ceph-mon.a.log.gz
2024-04-20T18:19:18.046+0000 7f3e74eae700 20 mon.a@0(leader).mgrstat health checks:
{
    "CEPHADM_FAILED_DAEMON": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "1 failed cephadm daemon(s)",
            "count": 1
        },
        "detail": [
            {
                "message": "daemon alertmanager.smithi104 on smithi104 is in unknown state"
            }
        ]
    }
}
The warning cleared on its own about 19 seconds later:
2024-04-20T18:19:00.723654+0000 mon.a (mon.0) 774 : cluster [WRN] Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)
2024-04-20T18:19:01.777510+0000 mgr.a (mgr.14427) 39 : cluster [DBG] pgmap v16: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:02.024168+0000 mgr.a (mgr.14427) 40 : cluster [DBG] pgmap v17: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:02.024389+0000 mgr.a (mgr.14427) 41 : cluster [DBG] pgmap v18: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:04.024929+0000 mgr.a (mgr.14427) 42 : cluster [DBG] pgmap v19: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:06.025320+0000 mgr.a (mgr.14427) 43 : cluster [DBG] pgmap v20: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:08.025932+0000 mgr.a (mgr.14427) 44 : cluster [DBG] pgmap v21: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:10.026479+0000 mgr.a (mgr.14427) 45 : cluster [DBG] pgmap v22: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:12.026820+0000 mgr.a (mgr.14427) 46 : cluster [DBG] pgmap v23: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:12.046177+0000 mgr.a (mgr.14427) 47 : cluster [DBG] pgmap v24: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:12.046297+0000 mgr.a (mgr.14427) 48 : cluster [DBG] pgmap v25: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:14.046559+0000 mgr.a (mgr.14427) 49 : cluster [DBG] pgmap v26: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:16.047048+0000 mgr.a (mgr.14427) 50 : cluster [DBG] pgmap v27: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:18.047381+0000 mgr.a (mgr.14427) 51 : cluster [DBG] pgmap v28: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:18.047535+0000 mgr.a (mgr.14427) 52 : cluster [DBG] pgmap v29: 1 pgs: 1 active+clean; 577 KiB data, 85 MiB used, 268 GiB / 268 GiB avail
2024-04-20T18:19:19.045348+0000 mon.a (mon.0) 792 : cluster [INF] Health check cleared: CEPHADM_FAILED_DAEMON (was: 1 failed cephadm daemon(s))
2024-04-20T18:19:19.045384+0000 mon.a (mon.0) 793 : cluster [INF] Cluster is now healthy
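Since the warning is expected and self-clearing, the approach tracked in the related Bug #65521 is to ignorelist it in the affected suites rather than treat it as a failure. For completeness, the same health payload the mon dumps above can be read back from the CLI; a minimal sketch, assuming a reachable cluster and the JSON layout shown above (the helper name is my own):

#!/usr/bin/env python3
# Sketch only: pull CEPHADM_FAILED_DAEMON detail messages out of
# `ceph health detail --format json`. Assumes the payload matches the
# mgrstat dump above ({"checks": {"CEPHADM_FAILED_DAEMON": ...}}).
import json
import subprocess

def failed_daemon_messages():
    out = subprocess.check_output(
        ["ceph", "health", "detail", "--format", "json"])
    check = json.loads(out).get("checks", {}).get("CEPHADM_FAILED_DAEMON")
    if check is None:
        return []  # check not currently raised
    return [d["message"] for d in check.get("detail", [])]

if __name__ == "__main__":
    for msg in failed_daemon_messages():
        # e.g. "daemon alertmanager.smithi104 on smithi104 is in unknown state"
        print(msg)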
Updated by Laura Flores almost 2 years ago
- Related to Bug #65521: Add expected warnings in cluster log to ignorelists added
Updated by Laura Flores almost 2 years ago
- Subject changed from Alertmanager in an unknown state to Daemon managed by cephadm in an unknown state (CEPHADM_FAILED_DAEMON)
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652483
2024-04-12T01:05:08.125+0000 7f5bbef19700 20 mon.a@0(leader).mgrstat health checks:
{
    "CEPHADM_FAILED_DAEMON": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "1 failed cephadm daemon(s)",
            "count": 1
        },
        "detail": [
            {
                "message": "daemon osd.0 on smithi073 is in unknown state"
            }
        ]
    }
}
Updated by Laura Flores almost 2 years ago
/a/yuriw-2024-04-11_17:03:54-rados-wip-yuri6-testing-2024-04-02-1310-distro-default-smithi/7652484
Updated by Matan Breizman almost 2 years ago · Edited
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707712
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707734
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707751
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707848
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707867
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707905
/a/yuriw-2024-05-15_21:09:29-rados-wip-yuri5-testing-2024-05-15-0804-distro-default-smithi/7707972
Updated by Laura Flores over 1 year ago
- Tags set to main-failures
/a/yuriw-2024-07-05_14:04:08-rados-wip-yuri3-testing-2024-07-01-1610-distro-default-smithi/7788683
Updated by Kamoltat (Junior) Sirivadhna over 1 year ago
/a/yuriw-2024-07-04_14:11:56-rados-wip-yuri4-testing-2024-07-02-0909-distro-default-smithi/7787184/
Updated by Laura Flores over 1 year ago
/a/yuriw-2024-07-17_13:32:02-rados-wip-yuri12-testing-2024-07-16-1122-distro-default-smithi/7805728
Updated by Aishwarya Mathuria over 1 year ago
/a/yuriw-2024-07-16_01:05:51-rados-wip-yuri6-testing-2024-07-15-1335-distro-default-smithi/7803318
Updated by Nitzan Mordechai over 1 year ago
/a/yuriw-2024-07-31_14:27:44-rados-wip-yuri7-testing-2024-07-30-0859-distro-default-smithi/7828635
/a/yuriw-2024-07-31_14:27:44-rados-wip-yuri7-testing-2024-07-30-0859-distro-default-smithi/7828639
/a/yuriw-2024-07-31_14:27:44-rados-wip-yuri7-testing-2024-07-30-0859-distro-default-smithi/7828628
Updated by Laura Flores over 1 year ago
/a/skanta-2024-09-27_06:56:34-rados-wip-bharath14-testing-2024-09-26-2119-squid-distro-default-smithi/7921638
Updated by Sridhar Seshasayee about 1 year ago
/a/skanta-2025-01-26_15:56:13-rados-wip-bharath13-testing-2025-01-25-2124-squid-distro-default-smithi/8094286
The test failure was caused by benign cluster warnings generated during OSD creation.
The osd.1 creation sequence, with the warnings it produced:
2025-01-26T17:07:18.441+0000 7fa418064640 10 mon.smithi046@0(leader).osd e10 prepare_command_osd_new found id 1 to use
2025-01-26T17:07:18.441+0000 7fa418064640 10 mon.smithi046@0(leader).osd e10 prepare_command_osd_new id 1 uuid 0e3eb9b1-b671-44c0-bf55-b227d82229cc
2025-01-26T17:07:18.441+0000 7fa418064640 10 mon.smithi046@0(leader).osd e10 prepare_command_osd_new has lockbox 0 dmcrypt 0
2025-01-26T17:07:18.441+0000 7fa418064640 10 mon.smithi046@0(leader).osd e10 prepare_command_osd_new validate secrets using osd id 1
2025-01-26T17:07:18.441+0000 7fa418064640 10 mon.smithi046@0(leader).auth v7 validate_osd_new osd.1 uuid 0e3eb9b1-b671-44c0-bf55-b227d82229cc
...
2025-01-26T17:07:18.441+0000 7fa418064640 2 mon.smithi046@0(leader).osd e10 osd.1 IN
...
2025-01-26T17:07:19.102+0000 7fa418064640 10 mon.smithi046@0(leader).config refresh_config crush_location for remote_host smithi046 is {root=default}
2025-01-26T17:07:19.102+0000 7fa418064640 20 mon.smithi046@0(leader).config refresh_config osd.1 crush {root=default} device_class
...
2025-01-26T17:07:29.855+0000 7fa418064640 10 mon.smithi046@0(leader).log v121 logging 2025-01-26T17:07:29.095219+0000 mgr.smithi046.itzjxc (mgr.14210) 42 : cephadm [INF] Deploying daemon osd.1 on smithi046
...
2025-01-26T17:07:37.088+0000 7fa41a869640 20 mon.smithi046@0(leader).osd e11 osd.1 laggy halflife 3600 decay_k -0.000192541 down for 18.646313 decay 0.996416
...
2025-01-26T17:07:38.235+0000 7fa418064640 20 mon.smithi046@0(leader) e1 entity_name osd.1 global_id 14228 (new_ok) caps allow profile osd
...
2025-01-26T17:07:38.516+0000 7fa418064640 20 mon.smithi046@0(leader).mgrstat health checks:
{
    "CEPHADM_FAILED_DAEMON": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "1 failed cephadm daemon(s)",
            "count": 1
        },
        "detail": [
            {
                "message": "daemon osd.1 on smithi046 is in unknown state"
            }
        ]
    }
}
...
2025-01-26T17:07:39.510+0000 7fa41a869640 0 log_channel(cluster) log [WRN] : Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)
...
2025-01-26T17:07:51.049+0000 7fa41a869640 0 log_channel(cluster) log [INF] : Health check cleared: CEPHADM_FAILED_DAEMON (was: 1 failed cephadm daemon(s))
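The [WRN] line itself is what fails the run: the log scraper flags cluster-log warnings that no ignorelist entry matches, so adding CEPHADM_FAILED_DAEMON to the suite's ignorelist (the Bug #65521 approach) makes this benign window tolerable. A minimal sketch of that style of matching, my own simplification rather than teuthology's actual scraper:

import re

# Ignorelist entries behave as regexes matched against cluster log lines
# (simplified; names and structure here are my own, not teuthology's).
IGNORELIST = [re.compile(p) for p in [
    r"CEPHADM_FAILED_DAEMON",
]]

def unexpected_warnings(cluster_log_lines):
    """Yield [WRN]/[ERR] lines that no ignorelist pattern matches."""
    for line in cluster_log_lines:
        if "[WRN]" not in line and "[ERR]" not in line:
            continue
        if not any(p.search(line) for p in IGNORELIST):
            yield line

# The benign warning from this run would be filtered out:
wrn = ("2025-01-26T17:07:39.510+0000 ... [WRN] : Health check failed: "
       "1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON)")
assert list(unexpected_warnings([wrn])) == []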
Updated by Laura Flores about 1 year ago
- Tags changed from main-failures to main-failures, cluster-log-warning
Updated by Sridhar Seshasayee 12 months ago
/a/skanta-2025-03-27_08:02:07-rados-wip-bharath10-testing-2025-03-27-0430-distro-default-smithi/8212866
Updated by Laura Flores 11 months ago · Edited
/a/skanta-2025-04-03_15:46:29-rados-wip-bharath5-testing-2025-04-03-1526-reef-distro-default-smithi/8222954
2025-04-03T18:31:27.755+0000 7fdd52e26640 10 --2- [v2:172.21.15.28:3300/0,v1:172.21.15.28:6789/0] >> [v2:172.21.15.136:3300/0,v1:172.21.15.136:6789/0] conn(0x55f46b5f5c00 0x55f46b7f4100 secure :-1 s=THROTTLE_DONE pgs=16 cs=0 l=0 rev1=1 crypto rx=0x55f46c53ce70 tx=0x55f46c86b800 comp rx=0 tx=0).handle_read_frame_dispatch tag=17
2025-04-03T18:31:27.755+0000 7fdd52e26640 5 --2- [v2:172.21.15.28:3300/0,v1:172.21.15.28:6789/0] >> [v2:172.21.15.136:3300/0,v1:172.21.15.136:6789/0] conn(0x55f46b5f5c00 0x55f46b7f4100 secure :-1 s=THROTTLE_DONE pgs=16 cs=0 l=0 rev1=1 crypto rx=0x55f46c53ce70 tx=0x55f46c86b800 comp rx=0 tx=0).handle_message got 1836 + 0 + 0 byte message. envelope type=46 src mon.1 off 0
2025-04-03T18:31:27.755+0000 7fdd52625640 20 mon.a@0(leader).mgrstat health checks:
{
    "CEPHADM_FAILED_DAEMON": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "1 failed cephadm daemon(s)",
            "count": 1
        },
        "detail": [
            {
                "message": "daemon mon.a on smithi028 is in unknown state"
            }
        ]
    }
}
2025-04-03T18:31:33.666159+0000 mon.a (mon.0) 19 : cluster [INF] mon.a calling monitor election
2025-04-03T18:31:33.688281+0000 mon.a (mon.0) 20 : cluster [INF] mon.a is new leader, mons a,b in quorum (ranks 0,1)
2025-04-03T18:31:33.698290+0000 mon.a (mon.0) 21 : cluster [DBG] monmap e2: 2 mons at {a=[v2:172.21.15.28:3300/0,v1:172.21.15.28:6789/0],b=[v2:172.21.15.136:3300/0,v1:172.21.15.136:6789/0]} removed_ranks: {} disallowed_leaders: {}
2025-04-03T18:31:33.718150+0000 mon.a (mon.0) 22 : cluster [DBG] fsmap
2025-04-03T18:31:33.718202+0000 mon.a (mon.0) 23 : cluster [DBG] osdmap e14: 2 total, 2 up, 2 in
2025-04-03T18:31:33.718662+0000 mon.a (mon.0) 24 : cluster [DBG] mgrmap e14: a(active, since 5m), standbys: b
2025-04-03T18:31:33.718901+0000 mon.a (mon.0) 25 : cluster [WRN] Health detail: HEALTH_WARN 1 failed cephadm daemon(s)
2025-04-03T18:31:33.718931+0000 mon.a (mon.0) 26 : cluster [WRN] [WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
2025-04-03T18:31:33.718953+0000 mon.a (mon.0) 27 : cluster [WRN] daemon mon.a on smithi028 is in unknown state
2025-04-03T18:31:33.750674+0000 mgr.a (mgr.14150) 238 : cluster [DBG] pgmap v180: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:33.751049+0000 mgr.a (mgr.14150) 239 : cluster [DBG] pgmap v181: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:35.751526+0000 mgr.a (mgr.14150) 240 : cluster [DBG] pgmap v182: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:37.751900+0000 mgr.a (mgr.14150) 241 : cluster [DBG] pgmap v183: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:39.752368+0000 mgr.a (mgr.14150) 242 : cluster [DBG] pgmap v184: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:41.752882+0000 mgr.a (mgr.14150) 243 : cluster [DBG] pgmap v185: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:41.991448+0000 mgr.a (mgr.14150) 244 : cluster [DBG] pgmap v186: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:41.991753+0000 mgr.a (mgr.14150) 245 : cluster [DBG] pgmap v187: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:43.992068+0000 mgr.a (mgr.14150) 246 : cluster [DBG] pgmap v188: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:43.995251+0000 mgr.a (mgr.14150) 247 : cluster [DBG] pgmap v189: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:43.995405+0000 mgr.a (mgr.14150) 248 : cluster [DBG] pgmap v190: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:45.995867+0000 mgr.a (mgr.14150) 249 : cluster [DBG] pgmap v191: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:46.676834+0000 mgr.a (mgr.14150) 250 : cluster [DBG] pgmap v192: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:46.677282+0000 mgr.a (mgr.14150) 251 : cluster [DBG] pgmap v193: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:48.455586+0000 mgr.a (mgr.14150) 252 : cluster [DBG] pgmap v194: 0 pgs: ; 0 B data, 57 MiB used, 179 GiB / 179 GiB avail
2025-04-03T18:31:49.455932+0000 mon.a (mon.0) 51 : cluster [INF] Health check cleared: CEPHADM_FAILED_DAEMON (was: 1 failed cephadm daemon(s))
2025-04-03T18:31:49.455977+0000 mon.a (mon.0) 52 : cluster [INF] Cluster is now healthy
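As in the earlier instances, the check clears on its own (here within about 20 seconds) once cephadm next refreshes the daemon's state, so a consumer that races the refresh can simply wait for the check to drop out of `ceph health`. A hedged sketch; the function name, timeout, and interval are my own choices:

import json
import subprocess
import time

def wait_for_check_to_clear(check="CEPHADM_FAILED_DAEMON",
                            timeout=60.0, interval=5.0):
    """Poll `ceph health --format json` until `check` is gone (sketch only)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        out = subprocess.check_output(["ceph", "health", "--format", "json"])
        if check not in json.loads(out).get("checks", {}):
            return True  # health check no longer raised
        time.sleep(interval)
    return False  # still raised after the timeout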
Updated by Laura Flores 11 months ago
- Related to Bug #67869: qa: cluster [WRN] Health check failed: 1 failed cephadm daemon(s) (CEPHADM_FAILED_DAEMON) with quiesce and fs/misc added
Updated by Laura Flores 6 months ago
/a/yuriw-2025-09-12_19:42:42-rados-wip-yuri3-testing-2025-09-12-0906-distro-default-smithi/8496826
Updated by Laura Flores 4 months ago
/a/lflores-2025-12-02_17:29:40-rados-wip-lflores-testing-4-2025-12-01-1527-distro-default-smithi/8636035
Updated by Laura Flores about 1 month ago
/a/yaarit-2026-02-05_17:05:15-rados:cephadm-wip-rocky10-branch-of-the-day-2026-02-03-1770151121-distro-default-trial/37003
Updated by Nitzan Mordechai 25 days ago
/a/nmordech-2026-02-25_11:35:39-rados:cephadm-wip-rocky10-branch-of-the-day-2026-02-24-1771941190-distro-default-trial/69855
Updated by Laura Flores 23 days ago
/a/nmordech-2026-02-25_11:36:23-rados-wip-rocky10-branch-of-the-day-2026-02-24-1771941190-distro-default-trial/70007
Updated by Nitzan Mordechai 20 days ago
/a/yaarit-2026-02-26_20:19:38-rados:cephadm-wip-rocky10-branch-of-the-day-2026-02-26-1772108951-distro-default-trial/72747
Updated by Nitzan Mordechai 20 days ago
/a/yaarit-2026-02-26_20:20:34-rados-wip-rocky10-branch-of-the-day-2026-02-26-1772108951-distro-default-trial/
6 jobs: ['73027', '72858', '72899', '72998', '73007', '72766']
Updated by Nitzan Mordechai 18 days ago
/a/yaarit-2026-03-04_01:18:18-rados:cephadm-wip-rocky10-branch-of-the-day-2026-03-03-1772558532-distro-default-trial/79890
Updated by Laura Flores 16 days ago
/a/yaarit-2026-03-05_02:43:32-rados-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial/86361
Updated by Laura Flores 12 days ago
Description: upgrade/tentacle-x/stress-split/{0-distro/ubuntu_22.04 0-roles 1-start 2-first-half-tasks/snaps-few-objects 3-stress-tasks/{radosbench rbd-cls rbd-import-export rbd_api readwrite snaps-few-objects} 4-second-half-tasks/rbd-import-export mon_election/connectivity overrides/ignorelist_health}
/a/yuriw-2026-03-06_21:35:20-upgrade-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial/91875
Updated by Nitzan Mordechai 11 days ago
/a/yuriw-2026-03-09_21:03:52-rados-wip-rocky10-branch-of-the-day-2026-03-09-1773079353-tentacle-distro-default-trial/
2 jobs: ['96534', '96721']
Updated by Jaya Prakash 6 days ago
/a/jayaprakash-2026-03-06_10:20:34-rados-wip-jaya-bs-testing-06-03-2025-distro-default-trial/
2 jobs: ['90364', '90340']