
[wip] mgr/cephadm: Add host draining#34617

Closed
sebastian-philipp wants to merge 2 commits intoceph:masterfrom
sebastian-philipp:cephadm-drain-host

Conversation

@sebastian-philipp sebastian-philipp commented Apr 17, 2020

Blocked by:


Workflow is something like:

```bash
ceph orch host rm myhost  # fails with daemons still running
ceph orch drain myhost    # removes crash and node-exporter
ceph orch rm daemon osd.7
ceph orch rm daemon mon.a
ceph orch host rm myhost  # succeeds
```

Also:

  • host rm: Assert host is empty (+tests + cleanup of the tests)
  • Added Inventory class. Later discovered that I can simply use a label for this.
  • Moved HostAssignment to a new module

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


```python
names = ', '.join(daemons_on_host.keys())
s = f'Daemons still on host `{host}`: {names}\n'
s += 'Please remove all daemons from this host, before removing the host.\n'
s += '(Add --force to ignore this message)'
```
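The quoted message-building could sit inside a small guard helper; a minimal sketch of such a check (the `check_host_empty` name, the `daemons_on_host` mapping shape, and raising `RuntimeError` are assumptions for illustration, not the PR's actual code):

```python
# Sketch: refuse to remove a host while daemons are still deployed on it.
# `daemons_on_host` maps daemon name -> daemon info, e.g. {'osd.7': ..., 'mon.a': ...}.

def check_host_empty(host: str, daemons_on_host: dict, force: bool = False) -> None:
    """Raise if `host` still runs daemons and --force was not given (hypothetical helper)."""
    if force or not daemons_on_host:
        return
    names = ', '.join(daemons_on_host.keys())
    raise RuntimeError(
        f'Daemons still on host `{host}`: {names}\n'
        'Please remove all daemons from this host, before removing the host.\n'
        '(Add --force to ignore this message)'
    )
```

With this shape, `host rm` without `--force` fails on a non-empty host, while `--force` skips the check entirely, which is exactly what the review comment below is concerned about.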
Contributor

There would still be daemons running on the host that are reporting to the cluster. Forcefully removing the host takes away your ability to remove these daemons. Not sure if that's the right move.
A --force should also imply a de-registration of these daemons from the cluster.

Not sure how this is done with the tools we have, but I wanted to raise awareness of this.

Contributor Author

Yeah, my idea was to add a `ceph orch drain host0` command that marks the host as `_drain` and then lets `serve()` drain the host.
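That idea, mark-then-drain-in-the-background, could be sketched roughly as follows (everything here is an assumption for illustration: the `_drain` label, the `MiniOrchestrator` shape, and `remove_daemon` are not the actual cephadm API):

```python
# Rough sketch of drain-via-serve(): `ceph orch host drain` only marks the host,
# and the background serve() loop performs the actual daemon removal.

DRAIN_LABEL = '_drain'  # hypothetical label marking a host as draining


class MiniOrchestrator:
    def __init__(self) -> None:
        self.host_labels: dict = {}  # hostname -> set of labels
        self.daemons: dict = {}      # hostname -> list of daemon names

    def drain_host(self, host: str) -> None:
        """Handler for a `ceph orch host drain <host>` command: just mark the host."""
        self.host_labels.setdefault(host, set()).add(DRAIN_LABEL)

    def serve_once(self) -> None:
        """One iteration of the serve() loop: remove daemons from draining hosts."""
        for host, labels in self.host_labels.items():
            if DRAIN_LABEL in labels:
                for daemon in list(self.daemons.get(host, [])):
                    self.remove_daemon(host, daemon)

    def remove_daemon(self, host: str, daemon: str) -> None:
        """Stand-in for the real daemon-removal machinery."""
        self.daemons[host].remove(daemon)
```

The point of the split is that the CLI command stays cheap and idempotent, while the long-running removal work happens asynchronously in the orchestrator's serve loop.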

```python
if any(host == h.hostname for h in placement.hosts):
    s = f'Service {name} still specifies its placement to be on host `{host}`\n'
    s += 'Please alter the placement specification, before removing the host.\n'
    s += '(Add --force to ignore this message)'
```
Contributor

I see why we want to have a way to skip this, but here again, this may lead to chaos.

Contributor Author

Maybe a hint pointing to `ceph orch host drain` would be helpful?

@sebastian-philipp sebastian-philipp changed the title mgr/cephadm: Add host draining [blocked] mgr/cephadm: Add host draining May 7, 2020
@ricardoasmarques
Contributor

Some notes for better usability:

  • Users should be able to see whether a host is being drained (e.g. by reflecting it in the ceph orch host ls output).
  • ceph orch apply should fail when referencing hosts that are being drained (similar to what happens when referencing a host that does not exist).
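The second point could be enforced with a simple validation step during apply; a sketch under the assumption that drained hosts are tracked in a set (the `validate_placement` name and signature are hypothetical, not cephadm code):

```python
# Sketch: reject apply specs that place services on draining hosts.

def validate_placement(placement_hosts: list, draining_hosts: set) -> None:
    """Fail early if the placement references a host that is being drained (hypothetical check)."""
    bad = [h for h in placement_hosts if h in draining_hosts]
    if bad:
        raise ValueError(
            f"Cannot place services on draining host(s): {', '.join(bad)}"
        )
```

This mirrors the existing behavior for unknown hosts: the spec is rejected up front instead of silently scheduling daemons onto a host that is on its way out.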

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>

Signed-off-by: Sebastian Wagner <sebastian.wagner@suse.com>
@sebastian-philipp sebastian-philipp changed the title [blocked] mgr/cephadm: Add host draining [wip] mgr/cephadm: Add host draining Jun 9, 2020
@sebastian-philipp
Contributor Author

Closing this for now. We need this, but right now, the scheduler is not there yet.

3 participants