mon,mds: map mds daemons to a particular fs #32015

Merged
batrick merged 6 commits into ceph:master from liewegas:wip-ssh-mds-fs on Jan 6, 2020

Conversation

@liewegas
Member

@liewegas liewegas commented Dec 4, 2019

  • add mds_fs option to map an mds to a particular fs
  • report it via the beacons
  • update fsmap and mon logic to respect the setting
  • update cephadm to set this for each set of mds daemons it deploys

@liewegas liewegas requested a review from a team as a code owner December 4, 2019 21:11
@liewegas liewegas requested a review from batrick December 4, 2019 21:11
@liewegas liewegas added the cephfs Ceph File System label Dec 4, 2019
@sebastian-philipp
Contributor

On the other hand, would it work to treat the mds as something unrelated to the fs? Similar to the relationship between osds and pools?

@liewegas
Member Author

liewegas commented Dec 5, 2019

> On the other hand, would it work to treat the mds as something unrelated to the fs? Similar to the relationship between osds and pools?

I think that would work if the MDS could multiplex multiple file systems and manage resources accordingly, but it can't do that. Today, daemons for different volumes will be given different amounts of memory and may be tuned differently, have different settings applied, etc.

@liewegas
Member Author

liewegas commented Dec 5, 2019

retest this please

@sebastian-philipp
Contributor

> > On the other hand, would it work to treat the mds as something unrelated to the fs? Similar to the relationship between osds and pools?

> I think that would work if the MDS could multiplex multiple file systems and manage resources accordingly, but it can't do that. Today, daemons for different volumes will be given different amounts of memory and may be tuned differently, have different settings applied, etc.

Then, 👍 for pinning to a particular mds

@batrick
Member

batrick commented Dec 10, 2019

> Also, this approach maps standby to an fscid. This has the unfortunate side-effect that we lose the preference if the mds comes up before the fs is created, since the mds requests an fs by name. We could modify the standby_daemons_fscid map to use an fs name instead... WDYT?

I think the fscid mapping is right. The fix should be that the periodic beacons may update standby_daemon_fscid. Either the fs is now available (or maybe deleted and recreated) or the mds_fs config was changed at runtime.

@batrick
Member

batrick commented Dec 10, 2019

For posterity, the CDM etherpad: https://pad.ceph.com/p/mds-affinity

.set_flag(Option::FLAG_NO_MON_UPDATE)
.set_description("path to MDS data and keyring"),

Option("mds_fs", Option::TYPE_STR, Option::LEVEL_ADVANCED)

Member

I think this option name is not clear enough. What about mds_affinity_fs?

Member Author

mds_prefer_fs? mds_force_fs? That way it reads as verb+noun and not a string of 3 nouns.

Member

How about mds_join_fs which is consistent with other names we use like the joinable fs flag.

Member

?

Member Author

ah, just saw this. works for me!

void set_health(const MDSHealth &h) { health = h; }

const string& get_fs() const { return fs; }
void set_fs(const string& s) { fs = s; }

Member

void set_fs(std::string_view s)
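
As a minimal sketch of the suggested signature: the class below is a hypothetical stand-in (`BeaconSketch` is not the real beacon message type) that models only the `fs` field. It shows that a `std::string_view` parameter accepts a `std::string`, a literal, or a substring without the caller constructing a temporary `std::string`.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the beacon message; only the fs field is modeled.
class BeaconSketch {
public:
  const std::string& get_fs() const { return fs; }

  // std::string_view binds to std::string, string literals, and substrings
  // alike; the single copy happens here, when the member is assigned.
  void set_fs(std::string_view s) { fs = std::string(s); }

private:
  std::string fs;
};
```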

@liewegas
Member Author

repushed

  • the mon can now update the standby fscid if it changes (and the mds is still standby)
  • if the mds asks for an fs that doesn't exist, we register it as NONE (meaning, don't use this mds for anything). if there is no entry in the map, the mds has no preference and can be used for anything.

...except we don't actually factor this in yet when assigning the standby. I didn't rename the option yet either, what say ye?
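
The registration rule described in the two bullets above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `StandbyPrefs`, `on_beacon`, `preference`, and the `FS_NONE` sentinel are hypothetical names, and treating an empty fs name as "no preference" is an assumption.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

using fs_cluster_id_t = int32_t;            // assumed id type for this sketch
constexpr fs_cluster_id_t FS_NONE = -1;     // hypothetical "use for nothing" sentinel

struct StandbyPrefs {
  std::map<std::string, fs_cluster_id_t> fs_by_name;   // existing file systems
  std::map<uint64_t, fs_cluster_id_t> standby_fscid;   // standby gid -> preference

  // Called for each beacon from a daemon that is still standby, so the
  // preference follows fs creation/deletion and runtime config changes.
  void on_beacon(uint64_t gid, const std::string& wanted_fs) {
    if (wanted_fs.empty()) {          // assumption: empty name = no preference
      standby_fscid.erase(gid);
      return;
    }
    auto it = fs_by_name.find(wanted_fs);
    // Unknown fs name: register NONE rather than dropping the entry.
    standby_fscid[gid] = (it == fs_by_name.end()) ? FS_NONE : it->second;
  }

  // nullopt means "no entry": the daemon may be used for any fs.
  std::optional<fs_cluster_id_t> preference(uint64_t gid) const {
    auto it = standby_fscid.find(gid);
    if (it == standby_fscid.end()) return std::nullopt;
    return it->second;
  }
};
```

Re-resolving the name on every beacon is what keeps the preference from being lost when the mds starts before its fs exists: the entry sits at NONE until a later beacon finds the fs.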

@liewegas liewegas changed the title from "RFC: map mds daemons to a particular fs" to "mon,mds: map mds daemons to a particular fs" Dec 18, 2019
@liewegas liewegas requested a review from batrick December 18, 2019 18:44
@liewegas
Member Author

@batrick this should now be complete, ready for review

Member

@batrick batrick left a comment

One missing piece I see is the forced replacement of actives if a standby is available for that fscid. Are you planning to do that in this PR or save it for a followup?

.set_flag(Option::FLAG_NO_MON_UPDATE)
.set_description("path to MDS data and keyring"),

Option("mds_fs", Option::TYPE_STR, Option::LEVEL_ADVANCED)

Member

?

@liewegas
Member Author

> One missing piece I see is the forced replacement of actives if a standby is available for that fscid. Are you planning to do that in this PR or save it for a followup?

Can you point me to the path you mean?

Signed-off-by: Sage Weil <sage@redhat.com>
@batrick
Member

batrick commented Dec 18, 2019

> > One missing piece I see is the forced replacement of actives if a standby is available for that fscid. Are you planning to do that in this PR or save it for a followup?

> Can you point me to the path you mean?

I'm referring to this proposal in the etherpad:

> If new standby becomes available with stronger affinity, replacement occurs.

This PR simply prefers standbys with mds_join_fs == fscid during replacement only. I think we should also do the replacement if a standby has stronger affinity than a current active rank (but not degraded rank).
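
For readers following along, the "prefer matching standbys during replacement" behavior might look roughly like the sketch below. It is a hedged illustration, not the merged MDSMonitor/FSMap logic: `Standby` and `pick_standby` are hypothetical names, and the fallback to standbys with no preference is inferred from the earlier comment in this thread.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

using fs_cluster_id_t = int32_t;   // assumed id type for this sketch

struct Standby {
  uint64_t gid;
  // nullopt = no preference. A daemon pinned to another fs, or registered
  // as NONE, holds a value that never equals the requesting fscid.
  std::optional<fs_cluster_id_t> join_fscid;
};

// Pick a standby to take over a failed rank in `fscid`.
// Returns the chosen gid, or nullopt if no standby is eligible.
std::optional<uint64_t> pick_standby(const std::vector<Standby>& standbys,
                                     fs_cluster_id_t fscid) {
  std::optional<uint64_t> fallback;
  for (const auto& s : standbys) {
    if (s.join_fscid == fscid) {
      return s.gid;                   // exact affinity match wins immediately
    }
    if (!s.join_fscid && !fallback) {
      fallback = s.gid;               // first unpinned standby is the backup
    }
  }
  return fallback;
}
```

Note that this only runs when a rank needs a replacement; preempting a healthy active in favor of a better-matched standby is the separate step proposed here.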

Signed-off-by: Sage Weil <sage@redhat.com>
@batrick
Member

batrick commented Dec 18, 2019

This PR should also add a pending release note and small blurb in doc/cephfs/standby.rst

Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas
Member Author

Updated. Leaving the preemption of active MDSs for later.

batrick added a commit to batrick/ceph that referenced this pull request Jan 4, 2020
* refs/pull/32015/head:
	doc/cephfs/standby: document mds_join_fs
	mgr/cephadm: map mds daemons to a particular fs
	mon/MDSMonitor: respect mfs fscid preference
	mon/MDSMonitor: assign standbys to their preferred fscid
	mds/FSMap: track preferred fscid for standby daemons
	mds: add mds_join_fs option; pass via MMDSBeacon

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
batrick added a commit that referenced this pull request Jan 6, 2020
* refs/pull/32015/head:
	doc/cephfs/standby: document mds_join_fs
	mgr/cephadm: map mds daemons to a particular fs
	mon/MDSMonitor: respect mfs fscid preference
	mon/MDSMonitor: assign standbys to their preferred fscid
	mds/FSMap: track preferred fscid for standby daemons
	mds: add mds_join_fs option; pass via MMDSBeacon

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
@batrick batrick merged commit 9aa25d7 into ceph:master Jan 6, 2020