mgr/cephadm: fix and improve osd draining by liewegas · Pull Request #39536 · ceph/ceph

liewegas · 2021-02-17T21:40:23Z

When purging an OSD, the OSD was removed from the osdmap before the daemon was removed, so the daemon never got cleaned up.
When purging, we should set the CRUSH weight to 0 instead of marking the OSD out (fixes https://tracker.ceph.com/issues/49339).
Clean up the log output (chattiness and formatting).

When adding an osd daemon explicitly, there is no created timestamp for the spec, and we should never not apply it.

Fixes: b129c13
Signed-off-by: Sage Weil <sage@newdream.net>

If we are replacing an OSD, we should mark it out and then back in again when a new device shows up. However, if we are going to destroy an OSD, we should just weight it to 0 in crush, so that data doesn't move again once the OSD is purged.

Signed-off-by: Sage Weil <sage@newdream.net>
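The distinction in this commit message can be sketched as a small decision function. This is a minimal illustration, not the code from the pull request: `drain_command` is a hypothetical helper, and it returns plain `ceph` CLI arguments (`ceph osd out` and `ceph osd crush reweight`) rather than whatever internal calls cephadm actually makes.

```python
from typing import List


def drain_command(osd_id: int, replace: bool) -> List[str]:
    """Return the ceph CLI argv that starts draining an OSD (hypothetical helper).

    replace=True:  the OSD will come back on a new device, so mark it out;
                   it can be marked in again later.
    replace=False: the OSD will be destroyed, so set its CRUSH weight to 0,
                   which avoids a second data movement when it is purged.
    """
    if replace:
        return ["ceph", "osd", "out", str(osd_id)]
    return ["ceph", "osd", "crush", "reweight", f"osd.{osd_id}", "0"]


if __name__ == "__main__":
    print(" ".join(drain_command(3, replace=True)))
    print(" ".join(drain_command(3, replace=False)))
```

Returning the argv instead of running it keeps the sketch safe to execute on a machine without a cluster.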

Otherwise it doesn't work! Drop the fullname property: it is always "osd.{self.osd_id}".

Signed-off-by: Sage Weil <sage@newdream.net>

sebastian-philipp · 2021-02-24T14:30:01Z

https://pulpito.ceph.com/swagner-2021-02-22_13:51:42-rados:cephadm-wip-swagner-testing-2021-02-22-1115-distro-basic-smithi/

sebastian-philipp · 2021-02-24T14:43:40Z

oops. this broke the upgrade test:

 [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
[the same dispatch/finished pair repeats 14 more times]

https://pulpito.ceph.com/swagner-2021-02-22_13:54:50-rados:cephadm-wip-swagner4-testing-2021-02-22-1119-distro-basic-smithi/

liewegas · 2021-02-24T18:43:35Z

1.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.2

I don't think this is related to my pull request. The process_removal_queue() method is called at the top of the serve loop and it unconditionally saves the list. (also, that run was a pass)
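The behaviour described here can be modelled with a toy loop. This is a sketch under stated assumptions, not cephadm's implementation: `FakeStore`, `RemovalQueue`, and `serve` are hypothetical stand-ins, and only the key name `mgr/cephadm/osd_remove_queue` is taken from the audit log above. It shows why an empty queue still produces one `config-key set` per loop iteration when the save is unconditional.

```python
import json
from typing import Dict, List


class FakeStore:
    """Hypothetical stand-in for the config-key store; records every write."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.writes: List[str] = []

    def set(self, key: str, val: str) -> None:
        self.data[key] = val
        self.writes.append(key)


class RemovalQueue:
    """Toy removal queue that saves itself on every pass, changed or not."""

    KEY = "mgr/cephadm/osd_remove_queue"

    def __init__(self, store: FakeStore) -> None:
        self.store = store
        self.osds: List[int] = []

    def process(self) -> None:
        # Real draining work would happen here; the queue is empty in this model.
        self.store.set(self.KEY, json.dumps(self.osds))


def serve(queue: RemovalQueue, iterations: int) -> None:
    """Toy serve loop: the queue is processed at the top of every iteration."""
    for _ in range(iterations):
        queue.process()


if __name__ == "__main__":
    store = FakeStore()
    serve(RemovalQueue(store), 3)
    print(len(store.writes), store.data[RemovalQueue.KEY])
```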

strenuous-life · 2021-07-05T02:48:54Z

@sebastian-philipp @liewegas Will this PR be backported to octopus? In issue #49339, the backport field lists 'pacific,octopus'.

sebastian-philipp · 2021-07-06T12:51:41Z

do you want to do the backport? That's mainly a git cherry-pick -x for all the commits here
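As a rough illustration of that suggestion, the cherry-picks could be generated like this. The helper `cherry_pick_commands` is hypothetical; the hashes in the usage example are the five abbreviated commit ids shown on this page (the page reports six commits, so the list is incomplete), and checking out the target release branch beforehand is assumed rather than shown.

```python
from typing import Iterable, List


def cherry_pick_commands(commits: Iterable[str]) -> List[List[str]]:
    """Build one `git cherry-pick -x <sha>` argv per commit, in the given order.

    The -x flag appends a "(cherry picked from commit ...)" line to each
    new commit message, which records where the backport came from.
    """
    return [["git", "cherry-pick", "-x", sha] for sha in commits]


if __name__ == "__main__":
    # Abbreviated hashes as listed on this page, oldest first.
    shas = ["e864327", "e2f0e56", "ca4050b", "b5eab0d", "a1ff3a9"]
    for argv in cherry_pick_commands(shas):
        print(" ".join(argv))
```

Applying the commits in their original order reduces the chance of conflicts, since later commits in the series may touch lines changed by earlier ones.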

liewegas added 6 commits February 17, 2021 15:32

mgr/cephadm: fix 'orch daemon add osd ...'

e864327

When adding an osd daemon explicitly, there is no created timestamp for the spec, and we should never not apply it.

Fixes: b129c13
Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: less log noise from osd drain code

e2f0e56

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: simplify OSD __str__ for drain

ca4050b

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: remove daemon before osd destroy/purge

b5eab0d

Otherwise it doesn't work! Drop the fullname property: it is always "osd.{self.osd_id}".

Signed-off-by: Sage Weil <sage@newdream.net>

mgr/cephadm: fix up the strings reporting osd ids

a1ff3a9

Signed-off-by: Sage Weil <sage@newdream.net>

liewegas requested a review from a team as a code owner February 17, 2021 21:40

github-actions bot added cephadm pybind labels Feb 17, 2021

sebastian-philipp approved these changes Feb 18, 2021

sebastian-philipp added the wip-swagner-testing My Teuthology tests label Feb 18, 2021

liewegas mentioned this pull request Feb 21, 2021

mgr/cephadm: It need to remove osd daemon after ceph orch rm osd. #39589

Closed

liewegas added the wip-sage-testing label Feb 22, 2021

sebastian-philipp merged commit 6913065 into ceph:master Feb 24, 2021

liewegas deleted the cephadm-drain-weight branch February 24, 2021 18:43

sebastian-philipp mentioned this pull request Mar 3, 2021

pacific: cephadm: Batch backport March (1) #39807

Merged

This was referenced Jun 13, 2022

octopus: mgr/cephadm: fix and improve osd draining #46645

Merged

octopus: mgr/cephadm: fix and improve osd draining SUSE/ceph#484

Merged

