Bug #67018

open

Destroyed osds warn as stray daemons

Added by Kevin Fox over 1 year ago. Updated 10 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

If you run:
ceph orch osd rm 0 --replace --zap

it warns of stray daemons even though none are running.
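
cephadm's stray check essentially compares the daemons the cluster itself reports (e.g. via OSD metadata / `ceph node ls`) against the daemons cephadm manages; a destroyed-but-not-yet-replaced OSD stays in the first set after its daemon is removed from the second. A minimal Python sketch of that set difference (function and sample names are hypothetical, not the actual cephadm code):

```python
# Hypothetical sketch (not the actual cephadm code) of the stray-daemon
# check: the cluster still lists a destroyed OSD in its metadata, while
# cephadm no longer manages a daemon for it, so the difference of the
# two sets surfaces it as "stray".

def find_strays(cluster_daemons, managed_daemons):
    """Daemons the cluster reports but cephadm does not manage."""
    return sorted(set(cluster_daemons) - set(managed_daemons))

# After `ceph orch osd rm 0 --replace --zap`, osd.0 is marked destroyed
# (kept in the CRUSH map for replacement) but its daemon is removed:
cluster = ["osd.0", "osd.1", "osd.2"]   # e.g. from `ceph node ls`
managed = ["osd.1", "osd.2"]            # e.g. from `ceph orch ps`

print(find_strays(cluster, managed))  # ['osd.0'] -> CEPHADM_STRAY_DAEMON
```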

Actions #1

Updated by Frédéric NASS over 1 year ago

Same here on Pacific. Seems odd that it warns about a stray daemon when there's none.

$ ceph orch osd rm 220 --replace

$ ceph orch osd rm status
No OSD remove/replace operations reported

$ ceph osd tree | grep osd.220
220 hdd 7.27699 osd.220 destroyed 0 1.00000

$ ceph -s
cluster:
id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
health: HEALTH_WARN
1 stray daemon(s) not managed by cephadm
...

$ ceph osd find 220
{
"osd": 220,
"addrs": {
"addrvec": [ {
"type": "v2",
"addr": "xxx.xx.xxx.x:6844",
"nonce": 3835037143
}, {
"type": "v1",
"addr": "xxx.xx.xxx.x:6845",
"nonce": 3835037143
}
]
},
"osd_fsid": "00000000-0000-0000-0000-000000000000",
"host": "mom02h07",
"crush_location": {
"datacenter": "mo",
"host": "mom02h07",
"rack": "xxxxx-xxx",
"region": "xxxxx",
"root": "default"
}
}

$ ceph node ls
...
"mom02h07": [
216,
217,
218,
219,
220, <--- still here
221,
222,
223,
224,
225,
226,
227,
228,
229,
230,
231
],

$ ceph orch ps | grep osd.220
<--- but not there

$ ssh root@mom02h07

[mom02h07] $ find /var/lib/ceph/$(ceph fsid) -iname osd.220*
/var/lib/ceph/c1fb64c3-b309-4b98-b24f-5b14c3e21ce1/custom_config_files/osd.220
/var/lib/ceph/c1fb64c3-b309-4b98-b24f-5b14c3e21ce1/removed/osd.220_2024-11-07T04:58:09.368886Z

[mom02h07] $ ps aux | grep -v grep | grep 'ceph-osd -n osd.220'

[mom02h07] $ systemctl -l | grep osd.220

We have to mute the warning with 'ceph health mute CEPHADM_STRAY_DAEMON 1w' while waiting for the delivery and replacement of the hard drive.

Actions #2

Updated by Frédéric NASS over 1 year ago

Discussed with Adam in the past: https://www.spinics.net/lists/ceph-users/msg74945.html

Actions #3

Updated by Frédéric NASS over 1 year ago

More to investigate

$ zgrep -h "osd.220" /var/log/ceph/$(ceph fsid)/* | sort -n -k1,1 -t ' '
2024-11-01T17:13:16.520874+0000 osd.220 (osd.220) 269 : cluster [DBG] 224.130 scrub starts
2024-11-01T17:16:15.681609+0000 osd.220 (osd.220) 270 : cluster [DBG] 224.130 scrub ok
2024-11-01T18:51:41.300382+0000 osd.220 (osd.220) 271 : cluster [DBG] 226.154 scrub starts
2024-11-01T18:56:49.617239+0000 osd.220 (osd.220) 272 : cluster [DBG] 226.154 scrub ok
2024-11-02T03:57:58.450355+0000 osd.220 (osd.220) 273 : cluster [DBG] 137.ec deep-scrub starts
2024-11-02T04:08:57.260480+0000 osd.220 (osd.220) 274 : cluster [DBG] 137.ec deep-scrub ok, 0 fixed
2024-11-02T04:57:10.699801+0000 osd.220 (osd.220) 275 : cluster [DBG] 224.13a deep-scrub starts
2024-11-02T05:32:11.283380+0000 osd.220 (osd.220) 276 : cluster [DBG] 224.13a deep-scrub ok, 0 fixed
2024-11-02T22:47:08.481623+0000 osd.220 (osd.220) 277 : cluster [DBG] purged_snaps scrub starts
2024-11-02T22:47:10.227685+0000 osd.220 (osd.220) 278 : cluster [DBG] purged_snaps scrub ok
2024-11-03T01:17:43.140670+0000 osd.220 (osd.220) 279 : cluster [DBG] 118.17f deep-scrub starts
2024-11-03T02:00:48.181229+0000 osd.220 (osd.220) 280 : cluster [DBG] 118.17f deep-scrub ok, 0 fixed
2024-11-03T02:00:48.992618+0000 osd.220 (osd.220) 281 : cluster [DBG] 235.697 scrub starts
2024-11-03T02:00:49.528254+0000 osd.220 (osd.220) 282 : cluster [DBG] 235.697 scrub ok
2024-11-03T18:39:17.847769+0000 osd.220 (osd.220) 283 : cluster [DBG] 137.127 deep-scrub starts
2024-11-03T18:50:00.611056+0000 osd.220 (osd.220) 284 : cluster [DBG] 137.127 deep-scrub ok, 0 fixed
2024-11-05T03:58:27.646021+0000 osd.220 (osd.220) 285 : cluster [DBG] 167.220 scrub starts
2024-11-05T04:07:14.217804+0000 osd.220 (osd.220) 286 : cluster [DBG] 167.220 scrub ok
2024-11-05T04:54:44.978414+0000 osd.220 (osd.220) 287 : cluster [DBG] 88.3f2 deep-scrub starts
2024-11-05T05:08:12.376371+0000 osd.220 (osd.220) 288 : cluster [DBG] 88.3f2 deep-scrub ok, 0 fixed
2024-11-05T17:10:11.905577+0000 osd.220 (osd.220) 289 : cluster [DBG] 175.3f7 scrub starts
2024-11-05T17:20:17.602916+0000 osd.220 (osd.220) 290 : cluster [DBG] 175.3f7 scrub ok
2024-11-05T17:42:11.987710+0000 osd.220 (osd.220) 291 : cluster [DBG] 224.403 scrub starts
2024-11-05T17:45:11.340401+0000 osd.220 (osd.220) 292 : cluster [DBG] 224.403 scrub ok
2024-11-05T17:45:14.917423+0000 osd.220 (osd.220) 293 : cluster [DBG] 235.237 scrub starts
2024-11-05T17:45:15.445544+0000 osd.220 (osd.220) 294 : cluster [DBG] 235.237 scrub ok
2024-11-06T03:45:03.583853+0000 osd.220 (osd.220) 295 : cluster [DBG] 88.646 scrub starts
2024-11-06T03:45:34.718407+0000 osd.220 (osd.220) 296 : cluster [DBG] 88.646 scrub ok
2024-11-06T08:22:38.694579+0000 mon.mem04h00 (mon.0) 3857989 : cluster [INF] Client mgr.mom02h00 marked osd.220 out, while it was still marked up
2024-11-06T08:22:40.007820+0000 mgr.mom02h00 (mgr.701563981) 1223008 : cephadm [INF] osd.220 now out
2024-11-06T08:22:42.890168+0000 osd.220 (osd.220) 297 : cluster [DBG] 118.60es0 starting backfill to osd.218(0) from (0'0,0'0] MAX to 2414155'6948424
2024-11-06T08:22:42.890214+0000 osd.220 (osd.220) 298 : cluster [DBG] 224.1a0 starting backfill to osd.345 from (0'0,0'0] MAX to 2414155'91264649
2024-11-06T08:22:42.890587+0000 osd.220 (osd.220) 299 : cluster [DBG] 175.7c4 starting backfill to osd.538 from (0'0,0'0] MAX to 2414155'2487286
2024-11-06T08:22:42.891153+0000 osd.220 (osd.220) 300 : cluster [DBG] 167.220s0 starting backfill to osd.227(0) from (0'0,0'0] MAX to 2414155'1462311
2024-11-06T08:22:42.902185+0000 osd.220 (osd.220) 301 : cluster [DBG] 224.1a0 starting backfill to osd.548 from (0'0,0'0] MAX to 2414155'91264649
2024-11-06T08:22:42.906261+0000 osd.220 (osd.220) 302 : cluster [DBG] 118.17fs0 starting backfill to osd.230(0) from (0'0,0'0] MAX to 2414155'6930385
2024-11-06T08:22:42.908652+0000 osd.220 (osd.220) 303 : cluster [DBG] 137.ecs0 starting backfill to osd.394(0) from (0'0,0'0] MAX to 2414155'3290481
2024-11-06T08:22:42.909734+0000 osd.220 (osd.220) 304 : cluster [DBG] 137.127s0 starting backfill to osd.230(0) from (0'0,0'0] MAX to 2414155'3296313
2024-11-06T08:22:42.912692+0000 osd.220 (osd.220) 305 : cluster [DBG] 224.13a starting backfill to osd.234 from (0'0,0'0] MAX to 2414155'90688315
2024-11-06T08:22:42.930009+0000 osd.220 (osd.220) 306 : cluster [DBG] 88.12a starting backfill to osd.541 from (0'0,0'0] MAX to 2414155'106678179
2024-11-06T08:22:42.941933+0000 osd.220 (osd.220) 307 : cluster [DBG] 88.3f2 starting backfill to osd.531 from (0'0,0'0] MAX to 2414155'44307655
2024-11-06T08:22:42.955770+0000 osd.220 (osd.220) 308 : cluster [DBG] 224.479 starting backfill to osd.549 from (0'0,0'0] MAX to 2414155'91604841
2024-11-06T08:22:43.006466+0000 osd.220 (osd.220) 309 : cluster [DBG] 226.154 starting backfill to osd.509 from (0'0,0'0] MAX to 2414155'10326988
2024-11-07T04:58:05.291896+0000 mgr.mom02h00 (mgr.701563981) 1338927 : cephadm [INF] osd.220 now down
2024-11-07T04:58:05.293177+0000 mgr.mom02h00 (mgr.701563981) 1338928 : cephadm [INF] Removing daemon osd.220 from mom02h07
2024-11-07T04:58:09.392358+0000 mgr.mom02h00 (mgr.701563981) 1338933 : cephadm [INF] Removing key for osd.220
2024-11-07T04:58:09.392672+0000 mon.mem04h00 (mon.0) 4062715 : audit [INF] from='mgr.701563981 100.74.190.125:0/157014789' entity='mgr.mom02h00' cmd=[{"prefix": "auth rm", "entity": "osd.220"}]: dispatch
2024-11-07T04:58:09.429979+0000 mon.mem04h00 (mon.0) 4062717 : audit [INF] from='mgr.701563981 100.74.190.125:0/157014789' entity='mgr.mom02h00' cmd='[{"prefix": "auth rm", "entity": "osd.220"}]': finished
2024-11-07T04:58:09.430677+0000 mgr.mom02h00 (mgr.701563981) 1338934 : cephadm [INF] Successfully removed osd.220 on mom02h07
2024-11-07T04:58:10.059639+0000 mgr.mom02h00 (mgr.701563981) 1338937 : cephadm [INF] Successfully destroyed old osd.220 on mom02h07; ready for replacement
2024-11-07T05:00:00.000125+0000 mon.mem04h00 (mon.0) 4062835 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-07T05:10:00.000143+0000 mon.mem04h00 (mon.0) 4063502 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-07T06:10:00.000119+0000 mon.mem04h00 (mon.0) 4067187 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-07T15:40:00.000124+0000 mon.mem04h00 (mon.0) 4109938 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-07T21:40:00.000141+0000 mon.mem04h00 (mon.0) 4131662 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-07T23:10:00.000146+0000 mon.mem04h00 (mon.0) 4136587 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-08T00:10:00.000118+0000 mon.mem04h00 (mon.0) 4139757 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm
2024-11-08T00:12:36.876+0000 7fa4613af700  0 [devicehealth ERROR root] Fail to parse JSON result from daemon osd.220 ()
2024-11-08T01:20:00.000127+0000 mon.mem04h00 (mon.0) 4143519 : cluster [WRN]     stray daemon osd.220 on host mom02h07 not managed by cephadm

From ceph-volume.log.20241108.gz, at the time the osd.220 daemon was removed:

[2024-11-07 04:34:52,014][ceph_volume.devices.raw.list][INFO  ] device /dev/mapper/vg0-var does not have BlueStore information
[2024-11-07 04:58:08,717][ceph_volume.main][INFO  ] Running command: ceph-volume  lvm deactivate 220 55f0c5e5-91da-4b31-ba17-b84caa3ce331
[2024-11-07 04:58:08,722][ceph_volume.util.system][INFO  ] Executable lvs found on the host, will use /sbin/lvs
[2024-11-07 04:58:08,723][ceph_volume.process][INFO  ] Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/
uts /sbin/lvs --noheadings --separator=";" -a -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size -S lv_tags={ceph.osd_fsid=55f0c5e5-91da-4b31-ba17-b84caa3ce331}
[2024-11-07 04:58:08,833][ceph_volume.process][INFO  ] stdout ceph.block_device=/dev/ceph-data-220/data-220,ceph.block_uuid=y0hKbM-NoM0-8WfA-SUXD-p1IW-ukf8-NqzYS5,ceph.cephx_lockbox_secret=,
ceph.cluster_fsid=c1fb64c3-b309-4b98-b24f-5b14c3e21ce1,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/ceph-db-sdr/db-220,ceph.db_uuid=00QOEB-p9fW-iaFx-CvSv-HujE-zmqE
-30ruGE,ceph.encrypted=0,ceph.osd_fsid=55f0c5e5-91da-4b31-ba17-b84caa3ce331,ceph.osd_id=220,ceph.type=block,ceph.vdo=0";"/dev/ceph-data-220/data-220";"data-220";"ceph-data-220";"y0hKbM-NoM0-
8WfA-SUXD-p1IW-ukf8-NqzYS5";"<7.28t
[2024-11-07 04:58:08,834][ceph_volume.process][INFO  ] stdout ceph.block_device=/dev/ceph-data-220/data-220,ceph.block_uuid=y0hKbM-NoM0-8WfA-SUXD-p1IW-ukf8-NqzYS5,ceph.cephx_lockbox_secret=,
ceph.cluster_fsid=c1fb64c3-b309-4b98-b24f-5b14c3e21ce1,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/ceph-db-sdr/db-220,ceph.db_uuid=00QOEB-p9fW-iaFx-CvSv-HujE-zmqE
-30ruGE,ceph.encrypted=0,ceph.osd_fsid=55f0c5e5-91da-4b31-ba17-b84caa3ce331,ceph.osd_id=220,ceph.type=db,ceph.vdo=0";"/dev/ceph-db-sdr/db-220";"db-220";"ceph-db-sdr";"00QOEB-p9fW-iaFx-CvSv-H
ujE-zmqE-30ruGE";"357.57g
[2024-11-07 04:58:08,834][ceph_volume.devices.lvm.deactivate][DEBUG ] Found cluster name ceph
[2024-11-07 04:58:08,834][ceph_volume.process][INFO  ] Running command: /usr/bin/findmnt -t tmpfs -M /var/lib/ceph/osd/ceph-220
[2024-11-07 04:58:08,840][ceph_volume.util.system][INFO  ] /var/lib/ceph/osd/ceph-220 does not appear to be a tmpfs mount
[2024-11-07 05:05:13,681][ceph_volume.main][INFO  ] Running command: ceph-volume  inventory --format=json-pretty --filter-for-batch
[2024-11-07 05:05:13,685][ceph_volume.util.system][INFO  ] Executable lvs found on the host, will use /sbin/lvs
[2024-11-07 05:05:13,685][ceph_volume.process][INFO  ] Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/lvs --noheadings --readonly --separator=";" -a --units=b --nosuffix -S  -o lv_tags,lv_path,lv_name,vg_name,lv_uuid,lv_size

Last logs recorded by osd.220 (nothing after that):

2024-11-07T04:58:05.724+0000 7eff0030e700 -1 osd.220 2414479 *** Got signal Terminated ***
2024-11-07T04:58:05.725+0000 7eff0030e700 -1 osd.220 2414479 *** Immediate shutdown (osd_fast_shutdown=true) ***
2024-11-07T04:58:05.725+0000 7eff0030e700  0 osd.220 2414479 prepare_to_stop telling mon we are shutting down and dead 
2024-11-07T04:58:05.762+0000 7efef00dd700  0 osd.220 2414479 got_stop_ack starting shutdown
2024-11-07T04:58:05.762+0000 7eff0030e700  0 osd.220 2414479 prepare_to_stop starting shutdown

Looking for process or container activity, nothing's happening:

watch -n 0.3 "podman ps | grep osd.220 ; ps aux | grep osd.220" | grep -v grep

There's no container for osd.220:

podman container ls | grep osd.220

It seems odd that the stray daemon warning message appears at 5 a.m. sharp.

Could this be related to the 60-second limit in `_check_for_strays(self)`, which may not be long enough?

https://github.com/ceph/ceph/blob/f55fc4599a6c0da0f4bd2f3ecd2122e603ad94dd/src/pybind/mgr/cephadm/serve.py#L480

        # clear recently altered daemons that were created/removed more than 60 seconds ago
        self.mgr.recently_altered_daemons = {
            d: t for (d, t) in self.mgr.recently_altered_daemons.items()
            if ((datetime_now() - t).total_seconds() < 60)
        }
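
For illustration, the pruning rule above can be reproduced standalone (a simplified sketch, not the actual mgr code): once a removed daemon ages out of the 60-second window, it is no longer exempt from the stray check, even if the cluster metadata still lists it.

```python
from datetime import datetime, timedelta, timezone

def datetime_now():
    return datetime.now(timezone.utc)

# Same pruning rule as in serve.py: forget daemons altered > 60s ago.
def prune_recently_altered(recently_altered_daemons, window=60):
    now = datetime_now()
    return {
        d: t for d, t in recently_altered_daemons.items()
        if (now - t).total_seconds() < window
    }

recent = {
    "osd.220": datetime_now() - timedelta(seconds=90),  # removed 90s ago
    "osd.221": datetime_now() - timedelta(seconds=10),  # removed 10s ago
}
kept = prune_recently_altered(recent)
print(sorted(kept))  # ['osd.221'] -- osd.220 is no longer exempted
```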
Actions #4

Updated by Frédéric NASS over 1 year ago

Also, muting the alert with 'ceph health mute CEPHADM_STRAY_DAEMON 1w' doesn't persist for more than a couple of minutes or hours.

Actions #5

Updated by Frédéric NASS over 1 year ago

This is Pacific v16.2.10 (RHCS 5.3z7)

Actions #6

Updated by Michel Jouvin 11 months ago

Hi,

I experience the same issue on 18.2.2. After running `ceph orch osd rm --replace` for 2 OSDs (underlying disks with HW errors), these 2 OSDs are reported as stray daemons (present in `ceph node ls` output) even though the daemons no longer exist (checked with `podman ps` and `cephadm ls` on the OSD server). Restarting the mgr doesn't help. I also rebooted the OSD server, without any effect.

After removing the 2 OSDs, I got a message like the following one for each OSD every 10s:

```
2025-04-26T12:25:43.348867+0200 mgr.idr-mon1.glfwin [INF] osd.253 now down
2025-04-26T12:25:43.349498+0200 mgr.idr-mon1.glfwin [INF] Daemon osd.253 on dig-osd4 was already removed
2025-04-26T12:25:43.350393+0200 mgr.idr-mon1.glfwin [INF] Successfully destroyed old osd.253 on dig-osd4; ready for replacement
2025-04-26T12:25:43.350533+0200 mgr.idr-mon1.glfwin [INF] Zapping devices for osd.253 on dig-osd4
```

After a `ceph mgr fail`, the messages were no longer reported for one of the OSDs, which is still reported as a stray daemon. But they never stopped for the other OSD, despite forcing the mgr failover several times (3 times, as we have 3 mgr instances).

I don't know whether it matters, but I think I didn't add `--zap` when I removed the OSDs (the main reason being that the disks are dead).

Also worth noting: I experienced the side issue reported by Frédéric about the health mute lasting only 1 hour despite specifying `1w` (1 week).

This problem seems to be the reason the upgrade to 18.2.6 is stuck (reported as in progress but nothing happening, without any related messages that I have been able to spot), and I have not found any workaround to get the upgrade done.

Actions #7

Updated by Frédéric NASS 11 months ago

Actions #8

Updated by chaojie xu 10 months ago

Hi, I experience the same issue; my version is v18.2.4. I also checked the source code and found that the stray daemon comes from this code, which uses the 'osd metadata' command to enumerate OSDs.

// src/mgr/Mgr.cc
void Mgr::load_all_metadata()
{
  ceph_assert(ceph_mutex_is_locked_by_me(lock));

  JSONCommand mds_cmd;
  mds_cmd.run(monc, "{\"prefix\": \"mds metadata\"}");
  JSONCommand osd_cmd;
  osd_cmd.run(monc, "{\"prefix\": \"osd metadata\"}");
  JSONCommand mon_cmd;
  mon_cmd.run(monc, "{\"prefix\": \"mon metadata\"}");
  lock.unlock();
  mds_cmd.wait();
  osd_cmd.wait();
  mon_cmd.wait();
  lock.lock();
  ...
  for (auto &metadata_val : mds_cmd.json_result.get_array()) {
    ...
    daemon_state.insert(dm);
  }
  for (auto &metadata_val : mon_cmd.json_result.get_array()) {
    ...
    daemon_state.insert(dm);
  }
  for (auto &osd_metadata_val : osd_cmd.json_result.get_array()) {
    ...
    daemon_state.insert(dm);
  }
}

We also ran 'ceph osd metadata' manually and found that the destroyed OSD was in the output.

Should we modify 'ceph osd metadata' to filter out destroyed OSDs?
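
For illustration only, here is how such a filter might look, sketched in Python over osdmap-style data (a hypothetical sketch; the actual change would be in the C++ 'osd metadata' handler): skip OSDs whose osdmap status is 'destroyed' before treating them as live daemons.

```python
# Hypothetical sketch of the proposed filter: before treating an OSD from
# `ceph osd metadata` as a live daemon, check its status in the osdmap
# (as reported by e.g. `ceph osd tree -f json`) and skip destroyed entries.

def live_osd_ids(osd_tree_nodes):
    """Return OSD ids whose osdmap status is not 'destroyed'."""
    return [
        n["id"] for n in osd_tree_nodes
        if n.get("type") == "osd" and n.get("status") != "destroyed"
    ]

nodes = [
    {"id": 219, "type": "osd", "status": "up"},
    {"id": 220, "type": "osd", "status": "destroyed"},  # awaiting replacement
    {"id": -1, "type": "root", "name": "default"},      # non-OSD bucket
]
print(live_osd_ids(nodes))  # [219] -- osd.220 would no longer be reported
```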
