mds: Fix readdir when osd is full. #64663
Merged
Contributor
Author
jenkins test make check
Contributor
Author
jenkins test make check arm64
batrick requested changes on Jul 29, 2025
vshankar reviewed on Aug 4, 2025
Problem:
readdir does not list all the entries in a directory when the OSD is full and rstats is enabled.
Cause:
The issue occurs only in a multi-MDS CephFS cluster. When rstats is enabled, readdir requests the 'Fa' cap on every dentry in order to fetch the sizes of the directories. Note that 'Fa' is CEPH_CAP_GWREXTEND, which maps to CEPH_CAP_FILE_WREXTEND and is used by CEPH_STAT_RSTAT.
The cap is requested through a getattr call, which normally need not go to the auth MDS. When rstats is enabled, however, the getattr carries the mask CEPH_STAT_RSTAT, which requires the auth MDS in 'handle_client_getattr', so a non-auth MDS forwards the request to the auth MDS. But when the OSD is full, 'dispatch_client_request' fetches the inode before calling the handler function of the respective op, in order to check FULL cap access for certain metadata write operations. If the inode doesn't exist, ESTALE is returned. This is wrong for operations like getattr, where the inode might not be in memory on the non-auth MDS; returning ESTALE is confusing and the client does not retry. The behavior was introduced by commit 6db81d8, which fixes subvolume deletion when the OSD is full.
Fix:
Fetch the inode required for the FULL cap access check only for the relevant operations in the OSD-full scenario. This makes sense because those operations are mostly preceded by a lookup, which loads the inode into memory, or they handle ESTALE gracefully.
Fixes: https://tracker.ceph.com/issues/72260
Introduced-by: 6db81d8
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Contributor
Author
The test validates the fix. Please check the following teuthology runs with and without the fix. Without the fix, run against the main branch:
batrick approved these changes on Aug 7, 2025
batrick (Member) left a comment:
Good work!
(Please also address the comment request in #64663 (comment).)
vshankar approved these changes on Aug 8, 2025
vshankar (Contributor) left a comment:
Conditionally approved (pending: #64663 (review))
Fixes: https://tracker.ceph.com/issues/72260
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Contributor
Author
Thank you. Done. See ceph/qa/workunits/fs/full/subvolume_ls.sh, line 61 in 8547e57.
Contributor
This PR is under test in https://tracker.ceph.com/issues/72684.
vshankar approved these changes on Sep 1, 2025
This was referenced Sep 3, 2025
Problem:
readdir does not list all the entries in a directory when the OSD is full and rstats is enabled.
Cause:
The issue occurs only in a multi-MDS CephFS cluster. When rstats is enabled, readdir requests the 'Fa' cap on every dentry in order to fetch the sizes of the directories. Note that 'Fa' is CEPH_CAP_GWREXTEND, which maps to CEPH_CAP_FILE_WREXTEND and is used by CEPH_STAT_RSTAT.
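As a rough illustration of the cap mapping mentioned above, the sketch below composes the 'Fa' bit in Python. The numeric values are an assumption, written as I recall them from Ceph's include/ceph_fs.h, and should be checked against the tree; `wants_rstat` is a hypothetical helper, not a Ceph function.

```python
# Assumed values, recalled from include/ceph_fs.h; verify against your Ceph tree.
CEPH_CAP_GWREXTEND = 64  # generic bit: client can extend EOF (the 'a' in 'Fa')
CEPH_CAP_SFILE = 8       # per-lock shift for file caps (the 'F' in 'Fa')

# 'Fa' is the generic bit shifted into the file-cap position.
CEPH_CAP_FILE_WREXTEND = CEPH_CAP_GWREXTEND << CEPH_CAP_SFILE

# The rstat mask reuses the same bit.
CEPH_STAT_RSTAT = CEPH_CAP_FILE_WREXTEND


def wants_rstat(mask: int) -> bool:
    """Hypothetical helper: does a getattr mask ask for rstats?"""
    return bool(mask & CEPH_STAT_RSTAT)
```

A getattr whose mask satisfies `wants_rstat` is the kind of request that, per the description, has to be served by the auth MDS.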
The cap is requested through a getattr call, which normally need not go to the auth MDS. When rstats is enabled, however, the getattr carries the mask CEPH_STAT_RSTAT, which requires the auth MDS in 'handle_client_getattr', so a non-auth MDS forwards the request to the auth MDS. But when the OSD is full, 'dispatch_client_request' fetches the inode before calling the handler function of the respective op, in order to check FULL cap access for certain metadata write operations. If the inode doesn't exist, ESTALE is returned. This is wrong for operations like getattr, where the inode might not be in memory on the non-auth MDS; returning ESTALE is confusing and the client does not retry. The behavior was introduced by commit 6db81d8, which fixes subvolume deletion when the OSD is full.
Fix:
Fetch the inode required for the FULL cap access check only for the relevant operations in the OSD-full scenario. This makes sense because those operations are mostly preceded by a lookup, which loads the inode into memory, or they handle ESTALE gracefully.
Fixes: https://tracker.ceph.com/issues/72260
Introduced-by: 6db81d8
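The behavior change described above can be modelled with a small Python sketch. This is a toy model under stated assumptions, not the MDS code: the `Mds` class, its methods, and the contents of `FULL_CHECK_OPS` are all hypothetical, and the real set of operations that need the FULL cap access check is defined by the patch itself.

```python
import errno
from dataclasses import dataclass, field

# Hypothetical stand-in for the ops that need the FULL cap access check.
FULL_CHECK_OPS = {"unlink", "rmdir"}


@dataclass
class Mds:
    """Toy model of one MDS rank; `cache` holds inode numbers in memory."""
    is_full: bool
    cache: set = field(default_factory=set)

    def handle(self, op, ino):
        # Simplified handler: a rank without the inode forwards to the auth MDS.
        return "ok" if ino in self.cache else "forward"

    def dispatch_before_fix(self, op, ino):
        # Old behavior: when full, every op needs the inode up front.
        if self.is_full and ino not in self.cache:
            return -errno.ESTALE
        return self.handle(op, ino)

    def dispatch_after_fix(self, op, ino):
        # New behavior: only ops needing the FULL check fetch the inode.
        if self.is_full and op in FULL_CHECK_OPS and ino not in self.cache:
            return -errno.ESTALE
        return self.handle(op, ino)


non_auth = Mds(is_full=True)  # inode 0x100 is not in this rank's cache
before = non_auth.dispatch_before_fix("getattr", 0x100)  # ESTALE, no retry
after = non_auth.dispatch_after_fix("getattr", 0x100)    # reaches the handler
```

In this model, a getattr on a full, non-auth rank fails early before the fix and is forwarded after it, while an op in `FULL_CHECK_OPS` still gets ESTALE when its inode is missing.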