squid: mds/FSMap: fix join_fscid being incorrectly reset for active MDS during filesystem removal#65822

Open
joscollin wants to merge 1 commit into ceph:squid from joscollin:wip-73349-squid
@joscollin
Member

backport tracker: https://tracker.ceph.com/issues/73349


backport of #65640
parent tracker: https://tracker.ceph.com/issues/73183

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

…ng filesystem removal

Fix a bug where active MDS daemons in the remaining filesystems have
their join_fscid incorrectly reset to FS_CLUSTER_ID_NONE whenever any
other filesystem is removed.

The cause is variable shadowing in erase_filesystem(): the loop variable
'fscid' shadows the function parameter 'fscid'. Inside the loop, the
check
  if (info.join_fscid == fscid)
therefore compares against the loop variable (the ID of a remaining
filesystem) instead of the parameter (the ID of the removed filesystem).

Renamed the loop variable to 'remaining_fscid' to eliminate the shadowing
and ensure the comparison uses the removed filesystem's ID.
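
The shadowing described above can be reproduced in a few lines outside of Ceph. The following is a minimal sketch, not the actual FSMap code: the types MdsInfo, Filesystem and FsTable, the alias fs_cluster_id_t, the value -1 for FS_CLUSTER_ID_NONE, and both function names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real Ceph types (assumptions, not FSMap code).
using fs_cluster_id_t = std::int32_t;
constexpr fs_cluster_id_t FS_CLUSTER_ID_NONE = -1;  // assumed sentinel value

struct MdsInfo {
  std::string name;
  fs_cluster_id_t join_fscid;
};

struct Filesystem {
  std::vector<MdsInfo> active;  // active MDS daemons of this filesystem
};

using FsTable = std::map<fs_cluster_id_t, Filesystem>;

// Buggy variant: the loop-local 'fscid' hides the parameter 'fscid', so the
// comparison tests against the ID of each REMAINING filesystem.
void erase_filesystem_shadowed(FsTable& filesystems, fs_cluster_id_t fscid) {
  assert(filesystems.count(fscid) == 1);
  filesystems.erase(fscid);
  for (auto& entry : filesystems) {
    const fs_cluster_id_t fscid = entry.first;  // shadows the parameter
    for (auto& info : entry.second.active) {
      if (info.join_fscid == fscid) {
        info.join_fscid = FS_CLUSTER_ID_NONE;
      }
    }
  }
}

// Fixed variant: the loop variable has its own name, so the comparison
// tests against the ID of the REMOVED filesystem.
void erase_filesystem_fixed(FsTable& filesystems, fs_cluster_id_t fscid) {
  assert(filesystems.count(fscid) == 1);
  filesystems.erase(fscid);
  for (auto& entry : filesystems) {
    const fs_cluster_id_t remaining_fscid = entry.first;
    (void)remaining_fscid;  // kept only to mirror the rename; unused here
    for (auto& info : entry.second.active) {
      if (info.join_fscid == fscid) {
        info.join_fscid = FS_CLUSTER_ID_NONE;
      }
    }
  }
}
```

With two filesystems (IDs 1 and 2) and an active MDS in filesystem 1 whose join_fscid is 1, removing filesystem 2 through the shadowed variant resets that daemon's join_fscid, while the fixed variant leaves it untouched. Compilers can flag this class of bug; GCC and Clang both accept -Wshadow.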

Reproducer:
../src/vstart.sh --new -x --localhost --bluestore
FS=b
./bin/ceph osd pool create cephfs.${FS}.meta 64 64 replicated
./bin/ceph osd pool create cephfs.${FS}.data 64 64 replicated
./bin/ceph fs new ${FS} cephfs.${FS}.meta cephfs.${FS}.data
./bin/ceph config set mds.a mds_join_fs a
./bin/ceph config set mds.b mds_join_fs a
./bin/ceph fs fail ${FS}
./bin/ceph fs rm ${FS} --yes-i-really-mean-it

Then, in the output of ./bin/ceph fs dump, join_fscid is reset for all
active MDS daemons in filesystem 'a'. Because there are standby MDS
daemons with join_fscid=1, MDSMonitor considers them to have better
affinity and triggers a switchover.
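
The switchover described above follows from a simple comparison. The sketch below is a toy model of that reasoning only. It is not MDSMonitor's implementation: the predicate name, its parameters, and the value -1 for FS_CLUSTER_ID_NONE are all assumptions.

```cpp
#include <cstdint>

using fs_cluster_id_t = std::int32_t;               // assumed alias
constexpr fs_cluster_id_t FS_CLUSTER_ID_NONE = -1;  // assumed sentinel value

// Hypothetical predicate: a standby is "better" for filesystem 'fs' when it
// has joined 'fs' and the current active daemon has not.
constexpr bool standby_has_better_affinity(fs_cluster_id_t fs,
                                           fs_cluster_id_t active_join_fscid,
                                           fs_cluster_id_t standby_join_fscid) {
  return standby_join_fscid == fs && active_join_fscid != fs;
}

// After the buggy reset the active daemon holds FS_CLUSTER_ID_NONE, so a
// standby with join_fscid=1 wins for filesystem 1.
static_assert(standby_has_better_affinity(1, FS_CLUSTER_ID_NONE, 1),
              "reset active loses to a matching standby");
// Without the reset both daemons match, so there is no reason to switch.
static_assert(!standby_has_better_affinity(1, 1, 1),
              "matching active is not replaced");
```

Under this model, the unintended reset is what makes an otherwise equivalent standby look preferable.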

Fixes: https://tracker.ceph.com/issues/73183
Signed-off-by: ethanwu <ethanwu@synology.com>
(cherry picked from commit cfecf7c)
@joscollin joscollin added this to the squid milestone Oct 8, 2025
@joscollin joscollin added the cephfs Ceph File System label Oct 8, 2025
@joscollin
Member Author

jenkins test make check

@joscollin
Member Author

This PR is under test in https://tracker.ceph.com/issues/73560.

@batrick batrick modified the milestones: squid, v19.2.4 Mar 18, 2026