mgr/cephadm: add count-per-host to PlacementSpec, and add support to scheduler #39979
liewegas merged 17 commits into ceph:master from
Conversation
sebastian-philipp
left a comment
nothing in here that raises red flags.
src/pybind/mgr/cephadm/schedule.py (Outdated)

```python
# If none of the above and also no <count>
if self.spec.placement.count is None:
elif self.spec.placement.count is not None:
# backward compatibility: consider an empty placements to be the same pattern = *
```
The comments are outdated now
Suggested change:

```diff
-# If none of the above and also no <count>
-if self.spec.placement.count is None:
-elif self.spec.placement.count is not None:
-    # backward compatibility: consider an empty placements to be the same pattern = *
+elif self.spec.placement.count is not None:
+    # consider an empty placements to be the same pattern = *
```
fixed, along with bare count-per-host:N handling
```python
self.filter_new_host = filter_new_host
self.service_name = spec.service_name()
self.daemons = daemons
self.allow_colo = allow_colo
```
Suggested change:

```diff
-self.allow_colo = allow_colo
+self.allow_colo = self.mgr.cephadm_services[spec.service_type].allow_colo()
```

Avoids a new parameter?
there's no self.mgr here, AFAICS
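A minimal sketch of what the reviewer's suggestion would look like: the colocation policy lives on the per-service-type class and is looked up from the service type, rather than being threaded through as a constructor parameter. `CephadmService`, `MdsService`, and `cephadm_services` here are simplified stand-ins for the real cephadm classes, not their actual definitions.

```python
# Hypothetical sketch: derive allow_colo from the service type
# instead of passing it in as a new scheduler parameter.

class CephadmService:
    def allow_colo(self) -> bool:
        # default: at most one daemon of a given service per host
        return False

class MdsService(CephadmService):
    def allow_colo(self) -> bool:
        # mds (like rgw and rbd-mirror) may run multiple daemons per host
        return True

# stand-in for the mgr's service-type registry
cephadm_services = {'mon': CephadmService(), 'mds': MdsService()}

def allow_colo_for(service_type: str) -> bool:
    # what self.mgr.cephadm_services[spec.service_type].allow_colo()
    # would evaluate to in the suggestion above
    return cephadm_services[service_type].allow_colo()
```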
```python
# gen seed off of self.spec to make shuffling deterministic
seed = hash(self.spec.service_name())
# shuffle for pseudo random selection
random.Random(seed).shuffle(host_pool)
```
Can you remove the SimpleScheduler class please?
- It's now a super complicated way of calling `host_pool[:count]`
- SimpleScheduler's docstring is now misleading.

Alternative would be to leave it as it is, as it doesn't really harm anyone by this point.
yeah, i was about to remove it, but i think there is still possibly a role for pluggable schedulers, but closer to the get_placement() stage. it's silly as-is, though!
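The seeded shuffle from the snippet above can be sketched in isolation. Note that Python's `hash()` of a string is stable within one interpreter run but randomized across runs (PYTHONHASHSEED), so this gives each service a consistent pseudo-random host order for the lifetime of the mgr process. `deterministic_shuffle` is an illustrative name, not the real scheduler API.

```python
import random

def deterministic_shuffle(hosts, service_name):
    # seed off the service name so the same service always sees the
    # same pseudo-random host order within this process
    pool = list(hosts)
    random.Random(hash(service_name)).shuffle(pool)
    return pool

a = deterministic_shuffle(['h1', 'h2', 'h3', 'h4'], 'mds.cephfs')
b = deterministic_shuffle(['h1', 'h2', 'h3', 'h4'], 'mds.cephfs')
# same seed -> identical order both times
```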
ac1c68b to 4990342
jmolmo
left a comment
The new "count-per-host" attribute needs to be documented in the placement specification section.
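To give a sense of what such documentation would cover, here is a toy sketch of parsing a placement string containing the new token. The real parser is `PlacementSpec.from_string()` and handles many more tokens (explicit hosts, networks, bare counts, `*`) plus validation such as rejecting `count-per-host` combined with `count`; this simplified version is illustrative only.

```python
# Illustrative only: a minimal parser for placement strings like
# 'count-per-host:2 label:mds'. Not the real PlacementSpec parser.

def parse_placement(s: str) -> dict:
    spec = {'count': None, 'count_per_host': None, 'label': None, 'pattern': None}
    for token in s.split():
        if token.startswith('count-per-host:'):
            spec['count_per_host'] = int(token.split(':', 1)[1])
        elif token.startswith('label:'):
            spec['label'] = token.split(':', 1)[1]
        elif token == '*':
            spec['pattern'] = '*'
        elif token.isdigit():
            spec['count'] = int(token)
    return spec
```

For example, `parse_placement('count-per-host:2 label:mds')` yields `count_per_host=2` on the hosts carrying the `mds` label.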
Ceph works just fine with an even number of monitors. Upside: more copies of critical cluster data. Downside: we can tolerate the same number of down mons as N-1, and now we are slightly more likely to have a failing mon because we have 1 more that might fail. On balance it's not clear that having one fewer mon is any better, so avoid the confusion and complexity of second-guessing ourselves. Signed-off-by: Sage Weil <sage@newdream.net>
Cannot be combined with 'count'. Signed-off-by: Sage Weil <sage@newdream.net>
If we are told to deploy multiple instances on a host, do it. Signed-off-by: Sage Weil <sage@newdream.net>
In the no-count cases, our job is simple: we have a set of hosts specified via some other means (label, filter, explicit list) and simply need to do N instances for each of those hosts. Signed-off-by: Sage Weil <sage@newdream.net>
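The no-count expansion this commit describes can be sketched as follows: take the candidate hosts (however they were selected) and emit `count_per_host` placement slots on each. The function name and slot tuples are illustrative, not the real scheduler's data structures.

```python
# Sketch of the no-count case: N instances on each selected host.
# A slot index distinguishes colocated daemons on the same host.

def expand_no_count(hosts, count_per_host=1):
    return [(host, slot) for host in hosts for slot in range(count_per_host)]

expand_no_count(['host1', 'host2'], count_per_host=2)
# -> [('host1', 0), ('host1', 1), ('host2', 0), ('host2', 1)]
```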
…xplicit placement At least for now we don't have a reasonable way to stamp out N placements when we're also constraining the name/network/ip--not in the general case. Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
- simpler
- many/most callers already have the daemon list, so we save ourselves duplicated effort

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
Otherwise we may end up randomly doubling up on some hosts and none on others (when we have more than one placement per host). Signed-off-by: Sage Weil <sage@newdream.net>
For certain daemon types, we can deploy more than one per host (mds, rbd-mirror, rgw). Signed-off-by: Sage Weil <sage@newdream.net>
We already have to examine existing daemons to choose the placements. There is no reason to make the caller call another method to (re)calculate what the net additions and removals are. Signed-off-by: Sage Weil <sage@newdream.net>
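The idea in this commit can be sketched with a multiset diff: since the scheduler already sees the existing daemons, it can hand back the net additions and removals directly. Comparing hosts as multisets (`Counter`) makes colocated daemons count correctly. Names are illustrative, not the real scheduler API.

```python
from collections import Counter

# Sketch: compute (to_add, to_remove) from desired vs existing
# placements, treating host lists as multisets so that two daemons
# on the same host are handled correctly.

def diff_placements(desired_hosts, existing_hosts):
    want, have = Counter(desired_hosts), Counter(existing_hosts)
    to_add = sorted((want - have).elements())
    to_remove = sorted((have - want).elements())
    return to_add, to_remove

diff_placements(['h1', 'h1', 'h2'], ['h1', 'h3'])
# -> (['h1', 'h2'], ['h3'])
```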
It no longer does anything except slice the array. Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
… host_pattern I think this is better for the same reason we made PlacementSpec() not mean 'all hosts' by default. If you really want N daemons for every host in the cluster, be specific with 'count-per-host:2 *'. Signed-off-by: Sage Weil <sage@newdream.net>
- Rename get_host_selection_size() to get_target_size() since the host part of the name was a bit misleading - Take count-per-host into consideration. Signed-off-by: Sage Weil <sage@newdream.net>
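The renamed target-size calculation might look like the sketch below: an explicit count wins, otherwise the target is `count_per_host` daemons on every candidate host. This is an illustrative signature under those assumptions, not the real method.

```python
# Sketch of get_target_size() taking count-per-host into account.

def get_target_size(candidate_hosts, count=None, count_per_host=1):
    if count is not None:
        return count
    return len(candidate_hosts) * count_per_host

get_target_size(['h1', 'h2', 'h3'], count_per_host=2)  # -> 6
```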
4990342 to 2904c5e
not sure if it's related. i also ran into the failure of
The next step is to handle port assignments...
Fixes: https://tracker.ceph.com/issues/47145