mgr/cephadm: fix and improve osd draining #39536

Merged
sebastian-philipp merged 6 commits into ceph:master from liewegas:cephadm-drain-weight
Feb 24, 2021

Conversation

@liewegas
Member

@liewegas liewegas commented Feb 17, 2021

  • when purging an osd, the osd was removed from the osdmap before the daemon was removed, resulting in the daemon not getting cleaned up.
  • when purging, we should set the crush weight to 0 instead of marking the osd out (fixes https://tracker.ceph.com/issues/49339)
  • clean up the log output (chattiness and formatting)
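
The ordering fix in the first bullet can be illustrated with a minimal sketch. Everything here is hypothetical: `FakeCluster`, `remove_daemon`, `purge_from_osdmap`, and `purge_osd` are stand-ins invented for illustration and are not the cephadm API. The sketch only shows the intended order, in which the daemon is removed while the OSD is still in the osdmap.

```python
from typing import Iterable, List, Set


class FakeCluster:
    """Hypothetical stand-in for the cluster state touched by a purge."""

    def __init__(self, osd_ids: Iterable[int]) -> None:
        ids = list(osd_ids)
        self.osdmap: Set[int] = set(ids)   # OSD ids known to the osdmap
        self.daemons: Set[int] = set(ids)  # OSD daemons deployed on hosts
        self.log: List[str] = []

    def remove_daemon(self, osd_id: int) -> None:
        self.daemons.discard(osd_id)
        self.log.append(f"removed daemon osd.{osd_id}")

    def purge_from_osdmap(self, osd_id: int) -> None:
        self.osdmap.discard(osd_id)
        self.log.append(f"purged osd.{osd_id} from osdmap")


def purge_osd(cluster: FakeCluster, osd_id: int) -> None:
    # Remove the daemon first, while the OSD is still in the osdmap,
    # and only then purge the OSD from the osdmap.
    cluster.remove_daemon(osd_id)
    cluster.purge_from_osdmap(osd_id)


cluster = FakeCluster([0, 1, 2])
purge_osd(cluster, 1)
print(cluster.log)
```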

When adding an osd daemon explicitly, there is no created timestamp
for the spec, and we should always apply it.

Fixes: b129c13
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
If we are replacing an OSD, we should mark it out and then back in
again when a new device shows up.  However, if we are going to
destroy an OSD, we should just weight it to 0 in crush, so that data
doesn't move again once the OSD is purged.

Signed-off-by: Sage Weil <sage@newdream.net>
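
The replace-versus-destroy distinction in the commit message above can be sketched as a choice between two mon commands. This is an illustration under stated assumptions, not the code from this pull request: the function name `drain_command` and its `replace` flag are hypothetical, and the JSON shapes are assumed to mirror the `ceph osd out` and `ceph osd crush reweight` CLI commands.

```python
from typing import Any, Dict


def drain_command(osd_id: int, replace: bool) -> Dict[str, Any]:
    """Build the (assumed) mon command that starts draining an OSD.

    replace=True:  mark the OSD out; it can be marked in again when a
                   new device shows up.
    replace=False: set its CRUSH weight to 0 so that data does not move
                   a second time once the OSD is purged.
    """
    if replace:
        return {"prefix": "osd out", "ids": [str(osd_id)]}
    return {
        "prefix": "osd crush reweight",
        "name": f"osd.{osd_id}",
        "weight": 0.0,
    }


print(drain_command(3, replace=True))
print(drain_command(3, replace=False))
```

Returning a plain dict keeps the sketch independent of any particular client; in a real manager module the dict would be handed to whatever mechanism sends mon commands.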
Signed-off-by: Sage Weil <sage@newdream.net>
Otherwise it doesn't work!

Drop the fullname property: it is always "osd.{self.osd_id}".

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
@liewegas liewegas requested a review from a team as a code owner February 17, 2021 21:40
@sebastian-philipp
Contributor

@sebastian-philipp sebastian-philipp merged commit 6913065 into ceph:master Feb 24, 2021
@sebastian-philipp
Contributor

oops. this broke the upgrade test:

 [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished

https://pulpito.ceph.com/swagner-2021-02-22_13:54:50-rados:cephadm-wip-swagner4-testing-2021-02-22-1119-distro-basic-smithi/

@liewegas
Member Author

liewegas commented Feb 24, 2021

1.15.120:0/2476104201' entity='mgr.x' cmd='[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]': finished
audit [INF] from='mgr.14646 172.21.15.120:0/2476104201' entity='mgr.x' cmd=[{"prefix":"config-key set","key":"mgr/cephadm/osd_remove_queue","val":"[]"}]: dispatch
audit [INF] from='mgr.14646 172.2

I don't think this is related to my pull request. The process_removal_queue() method is called at the top of the serve loop and it unconditionally saves the list. (also, that run was a pass)
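
To illustrate why the repeated config-key set lines would appear even with an empty queue, here is a minimal sketch of a loop that saves the queue unconditionally on every pass. `FakeStore` and the loop are hypothetical stand-ins; only the method name `process_removal_queue` and the key `mgr/cephadm/osd_remove_queue` come from this conversation, and the body is not the actual cephadm implementation.

```python
import json
from typing import Dict, List


class FakeStore:
    """Hypothetical key/value store that records every write."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.writes: List[str] = []

    def set(self, key: str, val: str) -> None:
        self.data[key] = val
        self.writes.append(f"{key}={val}")


def process_removal_queue(queue: List[int], store: FakeStore) -> None:
    # Work on queued OSDs would happen here (omitted in this sketch).
    # The queue is then saved unconditionally, even when it is empty.
    store.set("mgr/cephadm/osd_remove_queue", json.dumps(queue))


store = FakeStore()
for _ in range(3):  # three passes of a stand-in serve loop
    process_removal_queue([], store)
print(store.writes)
```

Each pass produces one write of "[]", which matches the pattern of repeated dispatch/finished pairs in the audit log quoted above.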

@liewegas liewegas deleted the cephadm-drain-weight branch February 24, 2021 18:43
@strenuous-life

strenuous-life commented Jul 5, 2021

@sebastian-philipp @liewegas Will this PR be backported to octopus? In issue #49339, the backport field is set to 'pacific,octopus'.

@sebastian-philipp
Contributor

Do you want to do the backport? It's mainly a git cherry-pick -x for each of the commits here.
