mds/FSMap: fix join_fscid being incorrectly reset for active MDS during filesystem removal #65640
Merged
rishabh-d-dave merged 1 commit into ceph:main on Oct 3, 2025
Conversation
mds/FSMap: fix join_fscid being incorrectly reset for active MDS during filesystem removal
Fix a bug where active MDS daemons in the remaining filesystems incorrectly
have their join_fscid cleared to FS_CLUSTER_ID_NONE when any other
filesystem is removed.
The issue was caused by variable-name shadowing in erase_filesystem(),
where the loop variable 'fscid' shadowed the function parameter 'fscid':
inside the loop, if (info.join_fscid == fscid) compared against the
loop variable (a remaining FS ID) instead of the parameter (the removed FS ID).
Renamed the loop variable to 'remaining_fscid' to eliminate the shadowing
and ensure the comparison uses the correct filesystem ID.
Reproducer:
../src/vstart.sh --new -x --localhost --bluestore
FS=b
./bin/ceph osd pool create cephfs.${FS}.meta 64 64 replicated
./bin/ceph osd pool create cephfs.${FS}.data 64 64 replicated
./bin/ceph fs new ${FS} cephfs.${FS}.meta cephfs.${FS}.data
./bin/ceph config set mds.a mds_join_fs a
./bin/ceph config set mds.b mds_join_fs a
./bin/ceph fs fail ${FS}
./bin/ceph fs rm ${FS} --yes-i-really-mean-it
Then, from ./bin/ceph fs dump,
we can see that join_fscid is reset for all active MDS daemons in filesystem 'a'.
Since there are standby MDS daemons with join_fscid=1,
MDSMonitor thinks they have better affinity and triggers a switchover.
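For context, a rough sketch of why the switchover happens, reusing the simplified types from the sketch above (standby_has_better_affinity is a hypothetical helper; the real preference logic lives in MDSMonitor and is more involved):

// Hypothetical illustration of the affinity comparison.
bool standby_has_better_affinity(const MDSInfo& active,
                                 const MDSInfo& standby,
                                 fs_cluster_id_t fscid)
{
  // A daemon pinned to this filesystem (join_fscid == fscid) is preferred.
  bool active_pinned  = (active.join_fscid  == fscid);
  bool standby_pinned = (standby.join_fscid == fscid);
  // After the bug, the active daemon's join_fscid has been cleared, so a
  // standby still pinned to the filesystem looks strictly better and is promoted.
  return standby_pinned && !active_pinned;
}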
Fixes: https://tracker.ceph.com/issues/73183
Signed-off-by: ethanwu <ethanwu@synology.com>
Contributor
This PR is under test in https://tracker.ceph.com/issues/73311.
rishabh-d-dave
approved these changes
Oct 3, 2025
Contributor
QA run was successful - https://tracker.ceph.com/projects/cephfs/wiki/QA_main_2025#wip-rishabh-testing-20250929182116
This is an automated message by src/script/redmine-upkeep.py. I have resolved the following tracker ticket due to the merge of this PR: No backports are pending for the ticket. If this is incorrect, please update the tracker. Update Log: https://github.com/ceph/ceph/actions/runs/18220800786