Skip to content

Improve mon location handling for stretch clusters#40483

Merged
gregsfortytwo merged 5 commits intoceph:masterfrom
gregsfortytwo:wip-stretch-mon-location
Mar 31, 2021
Merged

Improve mon location handling for stretch clusters#40483
gregsfortytwo merged 5 commits intoceph:masterfrom
gregsfortytwo:wip-stretch-mon-location

Conversation

@gregsfortytwo
Copy link
Member

This PR has two parts, one a short bug fix and one handling a hole
in location handling. The bug fix is yet another invalid assert on structure
encoding triggered by kernel clients, but after an audit session it's the
last one. The hole in location handling is caused by monitors being able
to auto-join a cluster if they have the right keys, even if stretch mode
is engaged and there's no provided location. This PR closes the hole and
provides a mechanism for the daemons and admins/orchestrator to set
that location even when using this auto-bootstrap functionality instead
of creating the monitor in the monmap ahead of time.

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

We can adopt new monmaps while bootstrapping, or in election messages, in
addition to MonmapMonitor::update_from_paxos. Since we use the
notification to update our election strategy and such, we need to notify
from these locations as well!

Fixes: https://tracker.ceph.com/issues/47654

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
We blocked off the other routes to add location-less monitors, but if you
turn on a monitor with the right keys it can auto-join via the MMonJoin
functionality. Block that off!

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
@gregsfortytwo
Copy link
Member Author

Drat this is running into the boost build issue I've seen referenced elsewhere. I opened a pacific PR for it as well since I imagine that one will build boost correctly... #40484

@tchaikov
Copy link
Contributor

jenkins test make check

@tchaikov
Copy link
Contributor

jenkins test api

2 similar comments
@gregsfortytwo
Copy link
Member Author

jenkins test api

@gregsfortytwo
Copy link
Member Author

jenkins test api

@gregsfortytwo
Copy link
Member Author

I made a bug for the API tests; that seems to be happening everywhere https://tracker.ceph.com/issues/50058

@gregsfortytwo
Copy link
Member Author

@gregsfortytwo
Copy link
Member Author

/* The location members are for stretch mode. crush_loc is the location
* (generally just a "datacenter=<foo>" statement) of the monitor. The
* force_loc is whether the mon cluster should replace a previously-known
* location. Geenrally the monitor will force an update if it's given a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Geenrally

std::string_view get_type_name() const override { return "mon_join"; }
void print(std::ostream& o) const override {
o << "mon_join(" << name << " " << addrs << ")";
o << "mon_join(" << name << " " << addrs << crush_loc << ")";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<< " " << crush_loc ?

if (monmap->contains(name) &&
!monmap->get_addrs(name).front().is_blank_ip()) {
bool in_map = false;
const auto& my_info = monmap->mon_info.find(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like undefined behaviour, my_info will end up as a ref to a free'd temporary since find returns an interator by value?

!monmap->get_addrs(name).front().is_blank_ip()) {
bool in_map = false;
const auto& my_info = monmap->mon_info.find(name);
const map<string,string> *map_crush_loc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialize to nullptr. Also, can you simply use map_crush_loc being non-null to indicate in_map and remove the second variable?

@athanatos athanatos self-requested a review March 30, 2021 23:05
Copy link
Contributor

@athanatos athanatos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few minor comments. LGTM, though I don't know this code particularly well.

@tchaikov
Copy link
Contributor

jenkins test api

…join

This will let newly-created monitors auto-join on startup in stretch mode, by
providing the needed location.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
Go to some effort to look at our location in the monmap and update it the
same way we update names or IP addresses when necessary. Let users
pass in the location on the CLI via "--set-crush-location".

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
As in dd63a3e for the OSDMap, this
caused crashes when encoding for kernel clients, and is unnecessary
for servers because they are separately gated.

I did a full audit of every instance of "assert" I added to the codebase
to make sure this is the very last one of these issues.

Signed-off-by: Greg Farnum <gfarnum@redhat.com>
@gregsfortytwo gregsfortytwo force-pushed the wip-stretch-mon-location branch from 807b16e to 589de8b Compare March 31, 2021 07:20
@gregsfortytwo
Copy link
Member Author

Pushed updates for Sam's comments and will schedule upgrade suite against that.

https://pulpito.ceph.com/gregf-2021-03-30_20:35:40-rados-wip-stretch-mon-location-329-distro-basic-smithi/ looks good
6011982: looks to be tracker.ceph.com/issues/49868
6011990: selinux errors and a lot of cephadm tracebacks trying to find podman or docker
6012049: timed out on a "ceph pg dump" for two hours, but the cluster looks healthy that whole time. Maybe messenger transmit failures aren't getting retried on certain things again?
6012100: 1 PG went degraded during the upgrade, but nothing failed during the test run and there are no crashes or anything.
6011766 (dead): ...something...happened while trying to install new packages after previous tests had already completed successfully.

@gregsfortytwo
Copy link
Member Author

jenkins test make check

@gregsfortytwo
Copy link
Member Author

https://pulpito.ceph.com/gregf-2021-03-31_10:04:02-upgrade:octopus-x-wip-stretch-mon-location-331-distro-basic-smithi/
3 failures but looks to have passed.
6013418: rgw multisite failure "failed meta checkpoint for zone=test-zone2"
6013426: wait_until_healthy timed out, but the cluster WAS healthy. There's a bunch of other Tracebacks in the log that look like cephadm failing, too
6013432: raw multisite failure "failed meta checkpoint for zone=test-zone2"

@gregsfortytwo gregsfortytwo merged commit fdb2aad into ceph:master Mar 31, 2021
@gregsfortytwo gregsfortytwo deleted the wip-stretch-mon-location branch February 3, 2022 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants