Improve mon location handling for stretch clusters by gregsfortytwo · Pull Request #40483 · ceph/ceph

gregsfortytwo · 2021-03-30T01:11:40Z

This PR has two parts, one a short bug fix and one handling a hole
in location handling. The bug fix is yet another invalid assert on structure
encoding triggered by kernel clients, but after an audit session it's the
last one. The hole in location handling is caused by monitors being able
to auto-join a cluster if they have the right keys, even if stretch mode
is engaged and there's no provided location. This PR closes the hole and
provides a mechanism for the daemons and admins/orchestrator to set
that location even when using this auto-bootstrap functionality instead
of creating the monitor in the monmap ahead of time.

Checklist

References tracker ticket
Updates documentation if necessary
Includes tests for new functionality or reproducer for bug

Show available Jenkins commands

jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox

We can adopt new monmaps while bootstrapping, or in election messages, in addition to MonmapMonitor::update_from_paxos. Since we use the notification to update our election strategy and such, we need to notify from these locations as well! Fixes: https://tracker.ceph.com/issues/47654 Signed-off-by: Greg Farnum <gfarnum@redhat.com>

We blocked off the other routes to add location-less monitors, but if you turn on a monitor with the right keys it can auto-join via the MMonJoin functionality. Block that off! Signed-off-by: Greg Farnum <gfarnum@redhat.com>

gregsfortytwo · 2021-03-30T01:32:05Z

Drat this is running into the boost build issue I've seen referenced elsewhere. I opened a pacific PR for it as well since I imagine that one will build boost correctly... #40484

tchaikov · 2021-03-30T01:57:06Z

jenkins test make check

tchaikov · 2021-03-30T03:00:43Z

jenkins test api

gregsfortytwo · 2021-03-30T16:54:31Z

jenkins test api

gregsfortytwo · 2021-03-30T17:20:57Z

jenkins test api

gregsfortytwo · 2021-03-30T17:43:57Z

I made a bug for the API tests; that seems to be happening everywhere https://tracker.ceph.com/issues/50058

gregsfortytwo · 2021-03-30T18:36:59Z

https://pulpito.ceph.com/gregf-2021-03-30_16:53:55-rados:monthrash-wip-stretch-mon-location-329-distro-basic-smithi/

monthrash passed with 2 failures in ansible.

Scheduled singleton-bluestore per Neha's request as a reproducer for the election mode bug: https://pulpito.ceph.com/gregf-2021-03-30_18:35:54-rados:singleton-bluestore-wip-stretch-mon-location-329-distro-basic-smithi/

gregsfortytwo · 2021-03-30T19:41:08Z

https://pulpito.ceph.com/gregf-2021-03-30_18:35:54-rados:singleton-bluestore-wip-stretch-mon-location-329-distro-basic-smithi/ passed 5 tests

athanatos · 2021-03-30T23:00:13Z

src/messages/MMonJoin.h

+  /* The location members are for stretch mode. crush_loc is the location
+   * (generally just a "datacenter=<foo>" statement) of the monitor. The
+   * force_loc is whether the mon cluster should replace a previously-known
+   * location. Geenrally the monitor will force an update if it's given a


athanatos · 2021-03-30T23:00:33Z

src/messages/MMonJoin.h

  std::string_view get_type_name() const override { return "mon_join"; }
  void print(std::ostream& o) const override {
-    o << "mon_join(" << name << " " << addrs << ")";
+    o << "mon_join(" << name << " " << addrs << crush_loc << ")";


<< " " << crush_loc ?

athanatos · 2021-03-30T23:02:59Z

src/mon/Monitor.cc

-    if (monmap->contains(name) &&
-        !monmap->get_addrs(name).front().is_blank_ip()) {
+    bool in_map = false;
+    const auto& my_info = monmap->mon_info.find(name);


This looks like undefined behaviour, my_info will end up as a ref to a free'd temporary since find returns an interator by value?

athanatos · 2021-03-30T23:04:06Z

src/mon/Monitor.cc

-        !monmap->get_addrs(name).front().is_blank_ip()) {
+    bool in_map = false;
+    const auto& my_info = monmap->mon_info.find(name);
+    const map<string,string> *map_crush_loc;


Initialize to nullptr. Also, can you simply use map_crush_loc being non-null to indicate in_map and remove the second variable?

athanatos

I made a few minor comments. LGTM, though I don't know this code particularly well.

tchaikov · 2021-03-31T01:00:52Z

jenkins test api

…join This will let newly-created monitors auto-join on startup in stretch mode, by providing the needed location. Signed-off-by: Greg Farnum <gfarnum@redhat.com>

Go to some effort to look at our location in the monmap and update it the same way we update names or IP addresses when necessary. Let users pass in the location on the CLI via "--set-crush-location". Signed-off-by: Greg Farnum <gfarnum@redhat.com>

As in dd63a3e for the OSDMap, this caused crashes when encoding for kernel clients, and is unnecessary for servers because they are separately gated. I did a full audit of every instance of "assert" I added to the codebase to make sure this is the very last one of these issues. Signed-off-by: Greg Farnum <gfarnum@redhat.com>

gregsfortytwo · 2021-03-31T07:58:17Z

Pushed updates for Sam's comments and will schedule upgrade suite against that.

https://pulpito.ceph.com/gregf-2021-03-30_20:35:40-rados-wip-stretch-mon-location-329-distro-basic-smithi/ looks good
6011982: looks to be tracker.ceph.com/issues/49868
6011990: selinux errors and a lot of cephadm tracebacks trying to find podman or docker
6012049: timed out on a "ceph pg dump" for two hours, but the cluster looks healthy that whole time. Maybe messenger transmit failures aren't getting retried on certain things again?
6012100: 1 PG went degraded during the upgrade, but nothing failed during the test run and there are no crashes or anything.
6011766 (dead): ...something...happened while trying to install new packages after previous tests had already completed successfully.

gregsfortytwo · 2021-03-31T15:26:06Z

jenkins test make check

gregsfortytwo · 2021-03-31T15:50:21Z

https://pulpito.ceph.com/gregf-2021-03-31_10:04:02-upgrade:octopus-x-wip-stretch-mon-location-331-distro-basic-smithi/
3 failures but looks to have passed.
6013418: rgw multisite failure "failed meta checkpoint for zone=test-zone2"
6013426: wait_until_healthy timed out, but the cluster WAS healthy. There's a bunch of other Tracebacks in the log that look like cephadm failing, too
6013432: raw multisite failure "failed meta checkpoint for zone=test-zone2"

gregsfortytwo added 2 commits March 22, 2021 21:40

gregsfortytwo added core needs-qa needs-review labels Mar 30, 2021

gregsfortytwo requested review from athanatos, jdurgin, neha-ojha and tchaikov March 30, 2021 01:11

github-actions bot added the mon label Mar 30, 2021

gregsfortytwo mentioned this pull request Mar 30, 2021

pacific: Improve mon location handling for stretch clusters #40484

Merged

3 tasks

athanatos reviewed Mar 30, 2021

View reviewed changes

athanatos self-requested a review March 30, 2021 23:05

athanatos approved these changes Mar 30, 2021

View reviewed changes

gregsfortytwo added 3 commits March 31, 2021 07:14

mon: messages: Extend MMonJoin so it can provide a crush_location on …

9c0e32d

…join This will let newly-created monitors auto-join on startup in stretch mode, by providing the needed location. Signed-off-by: Greg Farnum <gfarnum@redhat.com>

gregsfortytwo force-pushed the wip-stretch-mon-location branch from 807b16e to 589de8b Compare March 31, 2021 07:20

gregsfortytwo merged commit fdb2aad into ceph:master Mar 31, 2021

gregsfortytwo deleted the wip-stretch-mon-location branch February 3, 2022 17:43

Conversation

gregsfortytwo commented Mar 30, 2021

Checklist

Uh oh!

gregsfortytwo commented Mar 30, 2021

Uh oh!

tchaikov commented Mar 30, 2021

Uh oh!

tchaikov commented Mar 30, 2021

Uh oh!

gregsfortytwo commented Mar 30, 2021

Uh oh!

gregsfortytwo commented Mar 30, 2021

Uh oh!

gregsfortytwo commented Mar 30, 2021

Uh oh!

gregsfortytwo commented Mar 30, 2021

Uh oh!

gregsfortytwo commented Mar 30, 2021

Uh oh!

athanatos Mar 30, 2021

Choose a reason for hiding this comment

Uh oh!

athanatos Mar 30, 2021

Choose a reason for hiding this comment

Uh oh!

athanatos Mar 30, 2021

Choose a reason for hiding this comment

Uh oh!

athanatos Mar 30, 2021

Choose a reason for hiding this comment

Uh oh!

athanatos left a comment

Choose a reason for hiding this comment

Uh oh!

tchaikov commented Mar 31, 2021

Uh oh!

gregsfortytwo commented Mar 31, 2021

Uh oh!

gregsfortytwo commented Mar 31, 2021

Uh oh!

gregsfortytwo commented Mar 31, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants