mon: fix fs set down to adjust max_mds only when cluster is not down #58582

Merged
batrick merged 2 commits into ceph:main from littlebees:fix-set-down-false
Aug 6, 2024

Conversation

@littlebees

@littlebees littlebees commented Jul 15, 2024

Applying fs set down false on an up cluster will set the cluster's max_mds to 1, regardless of the cluster's current max_mds.

fs set down false should only change max_mds when the cluster was set to down. Otherwise, the cluster should remain unchanged.
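To illustrate the intended behavior, here is a simplified, hypothetical C++ sketch of the two MDSMap fields involved. This is not the actual FSCommands.cc code; `MDSMapModel` and `set_down` are invented names for illustration only:

```cpp
// Hypothetical, simplified model of the two MDSMap fields involved.
// Not the real Ceph code: MDSMapModel and set_down are invented names.
struct MDSMapModel {
  int max_mds = 1;      // number of active ranks the fs may run
  int old_max_mds = 0;  // saved by "down true"; 0 on a fresh fs
};

// "fs set <fs> down <flag>" with the guard this PR proposes:
// max_mds is only restored when the fs was actually taken down.
void set_down(MDSMapModel& m, bool is_down) {
  if (is_down) {
    if (m.max_mds > 0) {          // not already down
      m.old_max_mds = m.max_mds;  // preserve the current value
      m.max_mds = 0;              // stop all ranks
    }
  } else if (m.max_mds == 0) {    // only restore if the fs is down
    m.max_mds = m.old_max_mds ? m.old_max_mds : 1;
  }                               // already up: leave max_mds alone
}
```

With this guard, `down false` on an up file system is a no-op, while a real `down true` / `down false` cycle still restores the saved max_mds.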

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@littlebees littlebees requested a review from a team as a code owner July 15, 2024 02:35
@github-actions github-actions bot added cephfs Ceph File System core mon labels Jul 15, 2024
@littlebees littlebees force-pushed the fix-set-down-false branch 2 times, most recently from 91e42fd to 57709f6 Compare July 16, 2024 01:37
@vshankar vshankar requested a review from a team July 23, 2024 12:51

ss << fsp->get_mds_map().get_fs_name();

if (!is_down && fsp->get_mds_map().get_old_max_mds() == 0) {
Member


Suggested change
if (!is_down && fsp->get_mds_map().get_old_max_mds() == 0) {
if (!is_down && fsp->get_mds_map().get_max_mds() > 0) {

would work, no?

Contributor


+1

@dparmar18
Contributor

This looks to be intentional, cephfs-admin-docs say:

This will also restore the previous value of max_mds. MDS daemons are brought down in a way such that journals are flushed to the metadata pool and all client I/O is stopped.

If this behaviour is to be changed then the docs would also need to be updated. 4e9ffff brought in this documentation.

@dparmar18
Contributor

dparmar18 commented Jul 24, 2024

So, when a CephFS cluster is taken down using ceph fs set <fs_name> down true, it does this:

ceph/src/mon/FSCommands.cc

Lines 604 to 608 in 5969c11

if (is_down) {
  if (fs.get_mds_map().get_max_mds() > 0) {
    fs.get_mds_map().set_old_max_mds();
    fs.get_mds_map().set_max_mds(0);
  } /* else already down! */

It first records the old max_mds val using:

void set_old_max_mds() { old_max_mds = max_mds; }

and then max_mds is set to 0. Which means first the max_mds is preserved and then reset to 0. Looks correct.

So, now when ceph fs set <fs_name> down false is called:

ceph/src/mon/FSCommands.cc

Lines 609 to 612 in 5969c11

} else {
  mds_rank_t oldmax = fs.get_mds_map().get_old_max_mds();
  fs.get_mds_map().set_max_mds(oldmax ? oldmax : 1);
}

The old max_mds val is fetched; if it is 0, i.e. the old max_mds was 0 (huh?), it sets the new max_mds to 1. Maybe the max_mds was always 1, OR some cmd was run between ceph fs set <fs_name> down true and ceph fs set <fs_name> down false which reset the max_mds to 1?

It looks to me that this functionality is correct, it doesn't always set the max_mds to 1 but follows what the last recorded value is and sets it.

EDIT: Please discard this; the issue here is when ceph fs set <fs> down false is executed without a prior ceph fs set a down true. That makes fs.get_mds_map().set_max_mds(oldmax ? oldmax : 1); set max_mds to 1, since a fresh cluster that never ran ceph fs set a down true will have mds_rank_t old_max_mds = 0;
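The fresh-cluster case described above can be demonstrated with a small model of the pre-fix logic. This is a hypothetical sketch whose body mirrors the FSCommands.cc lines quoted earlier; `MDSMapSketch` and `set_down_prefix` are invented names:

```cpp
// Hypothetical sketch of the pre-fix logic from src/mon/FSCommands.cc;
// MDSMapSketch and set_down_prefix are invented names for illustration.
struct MDSMapSketch {
  int max_mds = 1;
  int old_max_mds = 0;  // never set on a cluster that has not run "down true"
};

void set_down_prefix(MDSMapSketch& m, bool is_down) {
  if (is_down) {
    if (m.max_mds > 0) {
      m.old_max_mds = m.max_mds;
      m.max_mds = 0;
    } /* else already down! */
  } else {
    // Unconditional restore: on a fresh cluster old_max_mds is still 0,
    // so "down false" clobbers whatever max_mds was with 1.
    int oldmax = m.old_max_mds;
    m.max_mds = oldmax ? oldmax : 1;
  }
}
```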

@dparmar18
Contributor

dparmar18 commented Jul 24, 2024

So, when a CephFS cluster is taken down using ceph fs set <fs_name> down true, it does this:
...
...
...
It looks to me that this functionality is correct, it doesn't always set the max_mds to 1 but follows what the last recorded value is and sets it.

@batrick any thoughts?

Contributor

@rishabh-d-dave rishabh-d-dave left a comment


I couldn't reproduce the bug this PR is trying to fix; it doesn't exist. ceph fs set a down true successfully brought down the 2 active MDSs the FS had, and ceph fs set a down false successfully brought the FS up with the same number of active MDSs.

EDIT
@littlebees Was this bug seen in one of the release branches?

EDIT 2
The bug is reproducible on a new cluster when ceph fs set a down false is run without running ceph fs set a down true in the past.

@rishabh-d-dave
Contributor

It looks to me that this functionality is correct, it doesn't always set the max_mds to 1 but follows what the last recorded value is and sets it.

Ugh! Only seeing Dhairya's comment now, since the page was refreshed while I was posting my review.

Agreed +1.

@dparmar18 @batrick

@vshankar
Contributor

This looks to be intentional, cephfs-admin-docs say:

This will also restore the previous value of max_mds. MDS daemons are brought down in a way such that journals are flushed to the metadata pool and all client I/O is stopped.

If this behaviour is to be changed then the docs would also need to be updated. 4e9ffff brought in this documentation.

I think the bug that this change is fixing is that running fs set <> down false when the file system is online causes max_mds to get reset to one (1).

@dparmar18
Contributor

dparmar18 commented Jul 25, 2024

This looks to be intentional, cephfs-admin-docs say:

This will also restore the previous value of max_mds. MDS daemons are brought down in a way such that journals are flushed to the metadata pool and all client I/O is stopped.

If this behaviour is to be changed then the docs would also need to be updated. 4e9ffff brought in this documentation.

I think the bug that this change is fixing is that running fs set <> down false when the file system is online causes max_mds to get reset to one (1).

The code never resets max_mds to one unless the last recorded max_mds is 0; check #58582 (comment)

@dparmar18
Contributor

This looks to be intentional, cephfs-admin-docs say:

This will also restore the previous value of max_mds. MDS daemons are brought down in a way such that journals are flushed to the metadata pool and all client I/O is stopped.

If this behaviour is to be changed then the docs would also need to be updated. 4e9ffff brought in this documentation.

I think the bug that this change is fixing is that running fs set <> down false when the file system is online causes max_mds to get reset to one (1).

My bad, I understand this now. You're right.

@dparmar18
Contributor

dparmar18 commented Jul 25, 2024

This can be easily reproduced; better to have a test case included as well.

$ ./bin/ceph fs set a max_mds 2
$ ./bin/ceph fs get a | grep "max_mds"
max_mds	2
$ ./bin/ceph fs set a down false
a marked up, max_mds = 1
$ ./bin/ceph fs get a | grep "max_mds"
max_mds	1

@rishabh-d-dave rishabh-d-dave self-requested a review July 25, 2024 12:53
@littlebees littlebees force-pushed the fix-set-down-false branch from 57709f6 to 860153b Compare July 25, 2024 12:56
Contributor

@dparmar18 dparmar18 left a comment


This can be easily reproduced; it'd be great to have a test case.

@littlebees littlebees force-pushed the fix-set-down-false branch from 860153b to 257d6a5 Compare July 25, 2024 13:52
@github-actions github-actions bot added the tests label Jul 25, 2024
@littlebees littlebees requested a review from batrick July 25, 2024 14:02
@rishabh-d-dave
Contributor

jenkins test api

@rishabh-d-dave
Contributor

jenkins test make check

Member

@batrick batrick left a comment


Sorry, I disagree with the error message change. We shouldn't suggest the operator bring the fs down just so that this command will succeed. That doesn't make sense.

Applying `fs set down false` on an up cluster will set the cluster's max_mds to 1, regardless of the cluster's current max_mds.
`fs set down false` should only change max_mds when the cluster was set to down. Otherwise, the cluster should remain unchanged.

Fixes: https://tracker.ceph.com/issues/66960
Signed-off-by: chungfengz <chungfengz@synology.com>
Fixes: https://tracker.ceph.com/issues/66960
Signed-off-by: chungfengz <chungfengz@synology.com>
@littlebees littlebees force-pushed the fix-set-down-false branch from 257d6a5 to f8becaa Compare July 26, 2024 14:41
@batrick
Member

batrick commented Jul 26, 2024

This PR is under test in https://tracker.ceph.com/issues/67214.
