Skip to content

client: use path supplied in statfs#64864

Merged
chrisphoffman merged 6 commits intoceph:mainfrom
chrisphoffman:wip-choffman-72355
Aug 19, 2025
Merged

client: use path supplied in statfs#64864
chrisphoffman merged 6 commits intoceph:mainfrom
chrisphoffman:wip-choffman-72355

Conversation

@chrisphoffman
Copy link
Contributor

If path provided, use in statfs. Replumb internal statfs
for internal only to allow for use in ll_statfs and statfs

In statfs, get quota_root for inode provided. This will allow for
quotas not set at root to be viewed.

Fixes: https://tracker.ceph.com/issues/72355
Signed-off-by: Christopher Hoffman choffman@redhat.com

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

If path provided, use in statfs. Replumb internal statfs
for internal only to allow for use in ll_statfs and statfs

Fixes: https://tracker.ceph.com/issues/72355
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
@github-actions github-actions bot added cephfs Ceph File System tests labels Aug 6, 2025
@chrisphoffman chrisphoffman requested a review from a team August 7, 2025 19:40
Copy link
Contributor

@ffilz ffilz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but overall this looks good. Nice that it ended up being a simple and concise change.

stbuf->f_files = total_files_on_fs;
stbuf->f_ffree = -1;
stbuf->f_favail = -1;
stbuf->f_fsid = -1; // ??
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So outside a sub-volume f_fsid will be -1? I guess that will work.

I do hope using an inode number is sufficiently unique if multiple filesystems are involved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible to run into collisions if multiple FSes are used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I'm not sure we have to deal with multiple FSes. Is there any way to make more unique?

@@ -12973,7 +12974,21 @@ int Client::ll_statfs(Inode *in, struct statvfs *stbuf, const UserPerm& perms)
/* Since the only thing this does is wrap a call to statfs, and
statfs takes a lock, it doesn't seem we have a need to split it
out. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And those look great, so nice addition of statfs method with path.

Although Ganesha won't use it, you might want to add a path based ll_statfs_path and a ceph_ll_statfs_path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean.

I'm more of a minimalist for adding/changing apis. I'm not aware of a hard requirement for those since the other statfs calls were added in 2013. Perhaps if it's required we add them later?

stbuf->f_fsid = -1; // ??
if (in->snaprealm) {
// this is subvolume root ino
stbuf->f_fsid = in->snaprealm->ino;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this would be an unexpected divergence in behavior for f_fsid. It should remain the same for any CephFS file system. (It also should definitely not be -1.)

I would suggest the proper fix here is to use the last 4 bytes of ceph_fsid (the Ceph cluster UUID) as the most significant 4 bytes of stbuf->f_fsid and the file system ID as the least significant 4 bytes of stbuf->f_fsid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ffilz do you really need this to be unique for each subvolume? That's not typical for any quota enabled file system I'm aware of.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Linux NFS client, when a user does df, it will do a statfs on each filesystem mounted. The Linux NFS client creates additional mounts whenever it sees a new filesystem (NFS fsid attribute changes). I'm pretty sure statfs will also translate into a statfs on the root inode of such filesystems.

So if we don't show the fsid changing, the Linux NFS client won't understand that there are different statfs attributes (the sub-volume quota for ceph).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, ideally, the fsid Ganesha gives to the NFS client should be whatever CephFS gives.

I'm concerned about making subvolumes a "file system" abstraction even though they masquerade as such in various ways (e.g. no external links). At the very least, this fsid needs to include parts of the cluster id and CephFS FSID. I'm not sure how to consolidate all of that into a 16 byte field that will work as you intend. The subvolume inode number itself can be 8 bytes and the CEphFS FSID is probably safely considered a 4 byte int.

Copy link
Contributor

@mattbenjamin mattbenjamin Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concatenate the bytes and hash (e.g., xxhash, or even a cryptographic hash)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is if we want the client to understand that a sub-volume is something different. The documented way to clue in the client is to change fsid which in NFSv4 is 2 uint64_t so 16 bytes. In NFSv3 it's 2 uint32_t so only 8 bytes.

Hmm, but it looks like Cephfs allows a quota to be set on any directory. Does that create a sub-volume? Or is quota root something different from sub-volumes (with the no external links and such)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is if we want the client to understand that a sub-volume is something different. The documented way to clue in the client is to change fsid which in NFSv4 is 2 uint64_t so 16 bytes. In NFSv3 it's 2 uint32_t so only 8 bytes.

So I think the better question is: can CephFS provide enough information to Ganesha using libcephfs that it's dealing with a subvolume root? Would NFS-Ganesha be able to communicate that to the NFS client in some standard way that doesn't require encoding it as a 16 byte (or 8 byte for NFSv3) identifier?

Hmm, but it looks like Cephfs allows a quota to be set on any directory. Does that create a sub-volume? Or is quota root something different from sub-volumes (with the no external links and such)?

Setting a quota does not create a subvolume, no. In practice, we expect that clients will not have the necessary CephFS ("mds") capabilities to create/set quotas for subvolumes. Those quotas would be set when the subvolume is created.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it really sounds like for Ganesha and given the Linux kernel behavior, we need to fake up a per-subvolume FSID to get any sensible behavior.
If Rook/Kubernetes had the same experience, would that break things? Since they're sharing a single mount, but have a bunch of sub volumes? Would diverging between these clients for this API call be a problem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I don't see f_fsid changes in the recent revision. Did we decide not to do that? If yes, can someone update why is that bit is not required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was removed as we didn't want to hold up the changes with statfs using path/inode provided. Discussion on f_fsid wasn't finalized or fully agreed on. We broke out this requirement here: https://tracker.ceph.com/issues/72575

stbuf->f_files = total_files_on_fs;
stbuf->f_ffree = -1;
stbuf->f_favail = -1;
stbuf->f_fsid = -1; // ??
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I'm not sure we have to deal with multiple FSes. Is there any way to make more unique?

@@ -12529,24 +12529,25 @@ int Client::getcwd(string& dir, const UserPerm& perms)
int Client::_statfs(Inode *in, struct statvfs *stbuf,
const UserPerm& perms)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the changes in this patch, perhaps more description in the commit message?

In statfs, get quota_root for inode provided. Check if a quota
is directly applied to inode. If not, reverse tree walk up and
maybe find a quota set higher up the tree.

Fixes: https://tracker.ceph.com/issues/72355
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good work!

@batrick
Copy link
Member

batrick commented Aug 13, 2025

	
CTest Failure

3 tests failed out of 318

Total Test time (real) = 3579.92 sec

The following tests FAILED:
	209 - unittest_bluefs (SEGFAULT)
	210 - unittest_bluefs_ex (Failed)
	212 - unittest_bdev (SEGFAULT)

@batrick
Copy link
Member

batrick commented Aug 13, 2025

jenkins test make check

@batrick
Copy link
Member

batrick commented Aug 13, 2025

jenkins test make check arm64

@batrick
Copy link
Member

batrick commented Aug 13, 2025

jenkins test windows

@batrick batrick removed their assignment Aug 13, 2025
@batrick
Copy link
Member

batrick commented Aug 13, 2025

@chrisphoffman please open a ticket for discussion on f_fsid and summarize the slack discussion.

Signed-off-by: Christopher Hoffman <choffman@redhat.com>
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
Signed-off-by: Christopher Hoffman <choffman@redhat.com>
@chrisphoffman
Copy link
Contributor Author

I did one more minor repush.
What changed:
-Removed the fsid test previously included.
-Added more description to a commit message.

Copy link
Contributor

@ffilz ffilz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now.

@chrisphoffman
Copy link
Contributor Author

@chrisphoffman
Copy link
Contributor Author

During test_df_for_invalid_directory, path_walk is now called.
Use a more general error message as more errnos can be returned
and this will be a better catch all.

Signed-off-by: Christopher Hoffman <choffman@redhat.com>
@chrisphoffman
Copy link
Contributor Author

chrisphoffman commented Aug 18, 2025

@chrisphoffman
Copy link
Contributor Author

@chrisphoffman chrisphoffman merged commit 7b5fe62 into ceph:main Aug 19, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants