
mds/client: check the cephx mds auth access in client side #48027

Merged
rishabh-d-dave merged 21 commits into ceph:main from lxbsz:wip-57154 on Sep 22, 2023

Conversation

@lxbsz (Member) commented Sep 9, 2022

Currently setattr may buffer the changes locally if the required caps are issued, and immediately report success for the setattr request. But later, when flushing the changes back to the MDS via a cap update request, the MDS may reject them when checking the cephx mds auth access. This causes metadata/data loss when the mountpoint is remounted, which is bad.

The same applies to open, which may open the inode locally and only send a check_caps() to the MDS.

This PR passes the parsed mds auth caps back to clients when opening sessions, so a client can perform the same access check locally when deciding whether to buffer the changes or to send a sync request to the MDS.

After this I will make the same change in the kernel client, which needs much more work than this, because we cannot reuse the mds-auth-related code there.
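The client-side check described above could look roughly like this (a minimal sketch; the struct and function names here are hypothetical, not the actual Ceph client API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical representation of a parsed MDS auth cap grant, as the
// MDS could hand it back to the client at session open.
struct MDSCapGrant {
  std::string path;          // path prefix the grant applies to, e.g. "/volume1"
  bool readable = false;
  bool writable = false;
  bool root_squash = false;
};

// Does `grant` cover `path`? A grant on "/volume1" covers "/volume1/a/b".
inline bool grant_covers(const MDSCapGrant& grant, const std::string& path) {
  if (grant.path == "/" || grant.path == path) return true;
  return path.compare(0, grant.path.size() + 1, grant.path + "/") == 0;
}

// Client-side mirror of the check the MDS would do on cap flush: may we
// buffer a write/setattr locally, or must we send a sync request (which
// the MDS can then reject up front, instead of silently losing data)?
inline bool may_buffer_write(const std::vector<MDSCapGrant>& grants,
                             const std::string& path, int uid) {
  for (const auto& g : grants) {
    if (!grant_covers(g, path) || !g.writable) continue;
    if (g.root_squash && uid == 0) continue;  // squashed root can't write
    return true;  // grants are ORed: any matching grant suffices
  }
  return false;
}
```

A setattr would call such a check before buffering; if it returns false, the client falls back to a synchronous MDS request so the failure surfaces immediately.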

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li xiubli@redhat.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@mchangir (Contributor) left a comment:

nit: please rephrase the commit message:

mds: add ceph_subsys_mds_auth subsys for MDSAuthCaps.cc

On the client side we also need to be able to debug it without debug_mds enabled.

@batrick (Member) commented Sep 13, 2022

> Currently setattr may buffer the changes locally if the required caps are issued, and immediately report success for the setattr request. But later, when flushing the changes back to the MDS via a cap update request, the MDS may reject them when checking the cephx mds auth access. This causes metadata/data loss when the mountpoint is remounted, which is bad.

Circling back to my comment in #47506 (comment). Why are we issuing Fw caps for a client with root_squash? Also, any fix should begin with a test case.

@lxbsz (Member, Author) commented Sep 13, 2022

> ...
>
> Circling back to my comment in #47506 (comment). Why are we issuing Fw caps for a client with root_squash?

Once a permission check passes for a non-root user, the MDS will successfully open the inode and issue the caps to the corresponding client. We shouldn't refuse to issue the Fw caps when root_squash is enabled just to block the root user's potential modifications, right? Otherwise even a non-root user couldn't write to the file any more.

A root user's access could come at any time, so we can't predict it here.

I am thinking maybe we should always set the cap update request's caller_uid/caller_gid to the calling user's uid/gid instead of -1/-1, in case a malicious client could abuse this?

> Also, any fix should begin with a test case.

Sure. Will do it. Thanks!

@github-actions github-actions bot added the tests label Sep 14, 2022
@lxbsz lxbsz force-pushed the wip-57154 branch 2 times, most recently from 8e486e9 to 2d479a6 Compare September 14, 2022 14:16
@vshankar (Contributor) commented:

> ...
>
> > Circling back to my comment in #47506 (comment). Why are we issuing Fw caps for a client with root_squash?
>
> Once a permission check passes for a non-root user, the MDS will successfully open the inode and issue the caps to the corresponding client. ...
>
> I am thinking maybe we should always set the cap update request's caller_uid/caller_gid to the calling user's uid/gid instead of -1/-1, in case a malicious client could abuse this?

Do we know why the kclient always uses 0/0 for uid/gid rather than the caller's uid/gid? (I understand we do not track uid/gid per cap flush.)

@lxbsz (Member, Author) commented Sep 15, 2022

> ...
>
> Do we know why the kclient always uses 0/0 for uid/gid rather than the caller's uid/gid? (I understand we do not track uid/gid per cap flush.)

This was introduced by Sage's commit in userspace client 7 years ago:

commit 1957aeddbf05f2ecf3be0a760ff5b5c313370eea
Author: Sage Weil <sweil@redhat.com>
Date:   Wed Aug 19 09:48:16 2015 -0400

    client: do sync setattr when caller != last cap dirtier
    
    This way we can still do cap writeback in general when the caller is not
    the same as the mount_uid/gid, but we flip to a sync setattr when we have
    to because the dirty caps have a different uid/gid than the current
    caller.
    
    Signed-off-by: Sage Weil <sage@redhat.com>

Before this commit the userspace client also set it to 0/0, and I am not sure why the kclient never picked up the fix.

There are also some other cases like this where the fix is in the kclient but not in the userspace client. I have already fixed many of them.

@vshankar (Contributor) commented:

> ...
>
> This PR passes the parsed mds auth caps back to clients when opening sessions, so a client can perform the same access check locally when deciding whether to buffer the changes or to send a sync request to the MDS.

I presume this is related to the issue with root squash. Is there any other benefit of passing the auth caps back to the client, rather than just having the mds hint to the client that it has root squash, which the client then uses to deny write/update operations from 0/0?

> After this I will make the same change in the kernel client, which needs much more work than this, because we cannot reuse the mds-auth-related code there.

@lxbsz (Member, Author) commented Sep 15, 2022

> ...
>
> I presume this is related to the issue with root squash.

Yeah. Also, in the kclient the async dirops could benefit from this, which is unrelated to root_squash. For example:

$ ceph fs authorize a client.test_a / rw /volume1 r

Currently the MDS will issue Frw caps to the kclient for /volume1, so the kclient may do an async directory or file create, which then fails once the create request is sent to the MDS. The userspace app sees the create succeed, but operating on the newly created dir or file returns EIO.

> Is there any other benefit of passing the auth caps back to the client, rather than just having the mds hint to the client that it has root squash, which the client then uses to deny write/update operations from 0/0?

root_squash may only affect some paths, for example:

$ ceph fs authorize a client.test_a / rw root_squash /volume1 rw /volume1/dir2 rw root_squash

The above enables root_squash for / and /volume1/dir2 but not for everything else under /volume1/. It may be even more complex than this.

Then how could we handle this case? Maybe we could add one map to each CInode and pass it back to clients, but different client IDs may have different keyrings and different mds caps, so we would need to record all of them in the CInode, something like:

$ ceph fs authorize a client.test_a / rw root_squash /volume1 rw /volume1/dir2 rw root_squash
$ ceph fs authorize a client.test_b / rws /volume1 rw root_squash /volume1/dir2 rw
...
$ ceph fs authorize a client.test_x / rw root_squash /volume1/dir2 rw

For example, for / the CInode::root_squash_map might be:

client.test_a <--> root_squash: true, mds_caps: rw
client.test_b <--> root_squash: false, mds_caps: rws
...
client.test_x <--> root_squash: true, mds_caps: rw

This would be more complicated than the fix currently in this PR, and hard to maintain on the MDS side.

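The per-CInode bookkeeping described above might be sketched like this (purely illustrative; the struct and field names are invented for this sketch, not actual MDS code):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-client auth info an inode would have to carry.
struct PerClientAuth {
  bool root_squash;
  std::string mds_caps;  // e.g. "rw", "rws"
};

// Every CInode would need one entry per client identity, which is the
// maintenance burden discussed above: it must be populated for every
// inode and kept in sync whenever keyrings or caps change.
struct CInodeAuthInfo {
  std::map<std::string, PerClientAuth> root_squash_map;  // entity name -> auth
};

// Build the example table from the discussion for the "/" inode.
inline CInodeAuthInfo example_root_inode() {
  CInodeAuthInfo root;
  root.root_squash_map["client.test_a"] = {true, "rw"};
  root.root_squash_map["client.test_b"] = {false, "rws"};
  root.root_squash_map["client.test_x"] = {true, "rw"};
  return root;
}
```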

@vshankar (Contributor) commented Sep 20, 2022

> ...
>
> This would be more complicated than the fix currently in this PR, and hard to maintain on the MDS side.

That's a valid point @lxbsz; however, passing mds auth caps to the client makes me a bit uncomfortable, although it's nothing to do with security aspects as such.


@lxbsz (Member, Author) commented Oct 8, 2022

> ...
>
> > This would be more complicated than the fix currently in this PR, and hard to maintain on the MDS side.
>
> That's a valid point @lxbsz; however, passing mds auth caps to the client makes me a bit uncomfortable, although it's nothing to do with security aspects as such.

This is not only for the root_squash case; I also want to use it to fix a similar issue with the kclient's async dirops in the non-root_squash case. Maybe we can build a hierarchy just like a snaprealm does? For example:

$ ceph fs authorize a client.test_b / rws /volume1 rw root_squash /volume1/dir2 rw

The simple graph:

                             '/'(rws)
                  /        ...             \
            volume0          volume1(rw root_squash) 
                                     /           \
                                 dir1       dir2(rw)
                                               /    \
                                             d0      d1

Then, for example, on the client side the inode of every subdir under dir2 would have a pointer to dir2's (rw) auth caps.

This should simplify it. But I'm still not sure whether there is any potential issue with this approach that we haven't foreseen yet.

@vshankar Any idea ?
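The snaprealm-style hierarchy above could be resolved with a nearest-annotated-ancestor walk, roughly like this (an illustrative sketch; the names are hypothetical, not actual client code):

```cpp
#include <cassert>
#include <map>
#include <string>

// Only directories that appear in the cap spec carry an auth record;
// every other inode resolves to its nearest annotated ancestor, the
// same way an inode resolves its containing snaprealm.
struct AuthRealm {
  std::string caps;         // e.g. "rws", "rw"
  bool root_squash = false;
};

// path -> realm, e.g. for "/ rws /volume1 rw root_squash /volume1/dir2 rw"
using AuthRealmMap = std::map<std::string, AuthRealm>;

// Walk up the tree until we hit an annotated directory. The spec always
// contains an entry for "/", so the walk terminates there.
inline const AuthRealm& resolve_realm(const AuthRealmMap& realms,
                                      std::string path) {
  while (true) {
    auto it = realms.find(path);
    if (it != realms.end()) return it->second;
    auto slash = path.find_last_of('/');
    path = (slash == 0 || slash == std::string::npos)
               ? "/" : path.substr(0, slash);
  }
}
```

So /volume1/dir2/d0 resolves to the dir2 realm, /volume1/dir1 to the volume1 realm, and /volume0 falls all the way back to the root realm.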

@vshankar (Contributor) commented:

> ...
>
> This is not only for the root_squash case; I also want to use it to fix a similar issue with the kclient's async dirops in the non-root_squash case.

Which issue?

> Maybe we can build a hierarchy just like a snaprealm does?
>
> ...
>
> Then, for example, on the client side the inode of every subdir under dir2 would have a pointer to dir2's (rw) auth caps. This should simplify it. But I'm still not sure whether there is any potential issue with this approach that we haven't foreseen yet.

Are you talking about a different approach than #48027 (comment)?

@lxbsz (Member, Author) commented Oct 12, 2022

> ...
>
> > This is not only for the root_squash case; I also want to use it to fix a similar issue with the kclient's async dirops in the non-root_squash case.
>
> Which issue?

The async dirops support in the kclient allows users to cache file or directory creates locally, without the MDS getting a chance to do the cephx check first. That means users will see the files/directories created successfully, but will later get an -EIO error when accessing them.

We can use this to check access before caching the create.
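That gating might be sketched as follows (hypothetical names, not the real kclient/libcephfs API): only allow the async, locally cached create when some grant makes the parent directory writable, otherwise fall back to a sync MDS request so the failure is reported immediately rather than as a later EIO.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parsed grant: the path prefix it covers and whether it
// allows writes there.
struct Grant {
  std::string path;
  bool writable;
};

// A grant on "/data" covers "/data" itself and everything below it.
inline bool covered(const Grant& g, const std::string& p) {
  return g.path == "/" || p == g.path ||
         p.compare(0, g.path.size() + 1, g.path + "/") == 0;
}

// Only cache the create locally (async) when some grant makes the
// parent directory writable; otherwise force a sync MDS request.
inline bool allow_async_create(const std::vector<Grant>& grants,
                               const std::string& parent_dir) {
  for (const auto& g : grants)
    if (covered(g, parent_dir) && g.writable) return true;
  return false;
}
```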

> [...]
>
> > Then, for example, on the client side the inode of every subdir under dir2 would have a pointer to dir2's (rw) auth caps. This should simplify it.
>
> Are you talking about a different approach than #48027 (comment)?

Yeah. Just a little different, but mostly the same. There is no need to record the following info in each CInode:

client.test_a <--> root_squash: true, mds_caps: rw
client.test_b <--> root_squash: false, mds_caps: rws
...
client.test_x <--> root_squash: true, mds_caps: rw

But that would make the client-side code more complicated than this PR. Currently this PR is still the easiest and simplest way to fix this bug.

@vshankar (Contributor) commented:

> ...
>
> > Which issue?
>
> The async dirops support in the kclient allows users to cache file or directory creates locally, without the MDS getting a chance to do the cephx check first. That means users will see the files/directories created successfully, but will later get an -EIO error when accessing them.

Oh yeah, I recall us discussing this.

> [...]
>
> Yeah. Just a little different, but mostly the same. There is no need to record the following info in each CInode:
>
> ...
>
> But that would make the client-side code more complicated than this PR. Currently this PR is still the easiest and simplest way to fix this bug.

Fair enough. I'll go through it in detail.

@ajarr (Contributor) commented Oct 13, 2022

> ...
>
> Then, for example, on the client side the inode of every subdir under dir2 would have a pointer to dir2's (rw) auth caps.

@lxbsz I want to make sure that we're on the same page. The cephx MDS auth caps are ORed. Since cephx ID test_b has MDS auth caps rws assigned to / (the root of the file system), a client using the test_b ID and mounting path /volume1/ will have rws MDS access, not rw root_squash MDS access, to the subtree.

$ sudo ./bin/ceph-fuse --id test_b -r /volume1 /mnt/cephfs
$ pushd /mnt/cephfs
$ # client can still create and snapshot directory as root within "/volume1" as it has "rws" access
$ sudo mkdir testdir00
$ sudo mkdir testdir00/.snap/snap00
$ popd

Similarly, a client using test_b ID and mounting path /volume1/dir2 will have rws access and not rw access to the subtree.

$ sudo ./bin/ceph-fuse --id test_b -r /volume1/dir2 /mnt/cephfs1
$ pushd /mnt/cephfs1
$ # client can still create snapshots within "/volume1/dir2" as it has "rws" access
$ mkdir testdir00
$ mkdir testdir00/.snap/snap00

I think a more useful path based restriction of MDS caps for a cephx ID would look like,

$ ceph fs authorize a client.test_b / r /volume1 rw root_squash /volume1/dir2 rw

Anyway, I think your current approach of passing all of the cephx MDS caps to the userspace FS client makes sense to me.

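The OR semantics can be illustrated with a small sketch (not the actual MDSAuthCaps implementation; the names are invented): an operation is allowed if any grant covering the path permits it, so the "/ rws" grant keeps granting snapshot and root write access under /volume1 even though the "/volume1 rw root_squash" grant alone would not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical parsed grant: covered path prefix, cap letters, squash flag.
struct CapGrant {
  std::string path;
  std::string caps;         // e.g. "rws"
  bool root_squash = false;
};

inline bool covers(const CapGrant& g, const std::string& p) {
  return g.path == "/" || p == g.path ||
         p.compare(0, g.path.size() + 1, g.path + "/") == 0;
}

// ORed evaluation: the operation (one cap letter, e.g. 'w' or 's') is
// allowed if ANY covering grant permits it for this uid.
inline bool allowed(const std::vector<CapGrant>& grants,
                    const std::string& path, char need, int uid) {
  for (const auto& g : grants) {
    if (!covers(g, path)) continue;
    if (g.root_squash && uid == 0 && need == 'w') continue;  // squashed
    if (g.caps.find(need) != std::string::npos) return true;
  }
  return false;
}
```

With test_b's grants from the example, both a snapshot and a root write under /volume1 succeed via the root grant, exactly as the ceph-fuse demonstrations above show.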

@lxbsz (Member, Author) commented Oct 14, 2022

> > Then, for example, on the client side the inode of every subdir under dir2 would have a pointer to dir2's (rw) auth caps.
>
> @lxbsz I want to make sure that we're on the same page. The cephx MDS auth caps are ORed.

Okay, thanks for this. I had never tested that; mine was just an example, and yours below is the correct one.

> ...
>
> Anyway, I think your current approach of passing all of the cephx MDS caps to the userspace FS client makes sense to me.

Yeah, this is also the simplest approach. I'm still waiting for @vshankar's reply on it, and after that I will address your comments.

Thanks!

lxbsz and others added 11 commits September 11, 2023 09:29
If we couldn't get the absolute path string, we need to force a sync setattr.

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
This feature bit could be used to distinguish old and new clients.

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Since setattr now checks the cephx mds auth access before buffering the changes, it no longer makes sense to have the cap update check the access in the MDS again.

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Before opening the file locally we need to check the cephx access.

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
as it's too late. Session access authorization already happens
before new caps are issued.

Fixes: https://tracker.ceph.com/issues/56067
Signed-off-by: Ramana Raja <rraja@redhat.com>
... MDS auth caps but don't have CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK
feature bit (i.e., can't check the auth caps sent back to it by the
MDS) from establishing a session. Do this in
Server::handle_client_session(), and Server::handle_client_reconnect(),
where old clients try to reconnect to MDS servers after an upgrade.

If the client doesn't have the ability to authorize session access
based on the MDS auth caps send back to it by the MDS, then the
client may buffer changes locally during open and setattr operations
when it's not supposed to, e.g., when enforcing root_squash MDS auth
caps.

Fixes: https://tracker.ceph.com/issues/56067
Signed-off-by: Ramana Raja <rraja@redhat.com>
The test.cc will be included in ceph_test_libcephfs; there is no need to also include it in access.

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Test 'chown' and 'truncate', which call setattr, and 'cat', which opens
the files. Before each test, open the file as a non-root user and keep
it open to make sure the Fxw caps are issued, then use 'sudo' to run
the tests, which sets the uid/gid to 0/0.

Fixes: https://tracker.ceph.com/issues/57154
Signed-off-by: Xiubo Li <xiubli@redhat.com>
…6067

A kernel CephFS client with MDS root_squash caps is able to write to a
file as a non-root user. However, the data written is lost after clearing
the kernel client cache or re-mounting the client. This issue is not
observed with a FUSE CephFS client.

Signed-off-by: Ramana Raja <rraja@redhat.com>
The kclient doesn't have the CEPHFS_FEATURE_MDS_AUTH_CAPS feature required
to enforce root_squash. Run the root_squash tests only for the FUSE client.

Signed-off-by: Ramana Raja <rraja@redhat.com>
@lxbsz (Member, Author) commented Sep 11, 2023

> This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Rebased it and resolved the conflicts with #41779.

@vshankar (Contributor) commented:

> Rebased it and resolved the conflicts with #41779.

@rishabh-d-dave Please run this through tests on prio.

@vshankar (Contributor) commented:

jenkins test api

@vshankar (Contributor) commented:

> > Rebased it and resolved the conflicts with #41779.
>
> @rishabh-d-dave Please run this through tests on prio.

Or let me know if you are tied up this week; I can plan to get this run through the fs suite.

@rishabh-d-dave (Contributor) left a review.

@rishabh-d-dave (Contributor) commented:

Venky is also okay with merging this PR, so I am proceeding.

@gregsfortytwo (Member) commented:

I'm looking at a small part of this code for unrelated reasons (checking paths live within a share on a shared Ganesha Client), and I don't think that the make_path_string() path-checking logic works right. In particular:

Inode::make_path_string() returns without setting anything if (client->_get_root_ino(false) == ino).

But this root ino is not the full CephFS filesystem root, it's the mount point of this particular Client. So if the client mounts at /volumes/foo, and there is a file at /volumes/foo/bar/baz, this function will actually return "/bar/baz". And that means none of the path checking will work.

Am I missing something about this? I see some tests for pieces of the PR, but not for this functionality — is there some that I'm missing? @vshankar @batrick @lxbsz

@gregsfortytwo (Member) commented:

I also see that if the dentry doesn't have a parent, it's happy to just stick "???" into the directory, which... maybe works?

An Inode conveniently just returns false, meaning it didn't succeed in constructing a path, if it doesn't have a dentry. But I don't see any way to fix that if we need to get a full path. Right now we will just return EPERM, which is a problem for NFS file handle lookups, where we need to be able to get them back at any arbitrary time.

@lxbsz (Member, Author) commented Oct 9, 2023

> ...
>
> But this root ino is not the full CephFS filesystem root, it's the mount point of this particular Client. So if the client mounts at /volumes/foo, and there is a file at /volumes/foo/bar/baz, this function will actually return "/bar/baz". And that means none of the path checking will work.

Yeah, this is buggy. BTW, is it safe to just pass the mountpoint's full path to the client? If so, we can pass it to clients and then build the full path from the fs root.

@gregsfortytwo @batrick @vshankar ?

@lxbsz (Member, Author) commented Oct 9, 2023

> I also see that if the dentry doesn't have a parent, it's happy to just stick "???" into the directory, which... maybe works?

The dentry should always have a dir, so that shouldn't happen here. Maybe we should emit a warning in this case instead of returning "???"?

> An Inode conveniently just returns false, meaning it didn't succeed in constructing a path, if it doesn't have a dentry. But I don't see any way to fix that if we need to get a full path. Right now we will just return EPERM, which is a problem for NFS file handle lookups, where we need to be able to get them back at any arbitrary time.

In this case we should issue a lookup first. I will fix it.

@lxbsz (Member, Author) commented Oct 9, 2023

> > But this root ino is not the full CephFS filesystem root, it's the mount point of this particular Client. So if the client mounts at /volumes/foo, and there is a file at /volumes/foo/bar/baz, this function will actually return "/bar/baz". And that means none of the path checking will work.
>
> Yeah, this is buggy. BTW, is it safe to just pass the mountpoint's full path to the client? If so, we can pass it to clients and then build the full path from the fs root.
>
> @gregsfortytwo @batrick @vshankar ?

Okay, we already record the mount position on the client side in metadata["root"], so we can just prepend that to the full path.
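A small sketch of the bug and the proposed fix (illustrative only; the helper names are invented): a path built relative to the client's mount root fails prefix matching against grants expressed from the filesystem root, and prepending the mount point recorded in metadata["root"] restores the match.

```cpp
#include <cassert>
#include <string>

// Prefix match of a path against a grant path anchored at the fs root.
inline bool grant_matches(const std::string& grant_path,
                          const std::string& path) {
  return grant_path == "/" || path == grant_path ||
         path.compare(0, grant_path.size() + 1, grant_path + "/") == 0;
}

// Proposed fix: prepend the mount point (metadata["root"]) so the
// check sees a path anchored at the filesystem root. With a mount at
// "/volumes/foo", the relative "/bar/baz" becomes "/volumes/foo/bar/baz".
inline std::string absolutize(const std::string& metadata_root,
                              const std::string& relative_path) {
  if (metadata_root == "/") return relative_path;
  return metadata_root + relative_path;
}
```

The relative path "/bar/baz" never matches a grant on "/volumes/foo", which is exactly the make_path_string() problem above; after absolutize() the match succeeds.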

@lxbsz (Member, Author) commented Oct 9, 2023

> > I also see that if the dentry doesn't have a parent, it's happy to just stick "???" into the directory, which... maybe works?
>
> The dentry should always have a dir, so that shouldn't happen here. Maybe we should emit a warning in this case instead of returning "???"?
>
> > An Inode conveniently just returns false, meaning it didn't succeed in constructing a path, if it doesn't have a dentry. [...]
>
> In this case we should issue a lookup first. I will fix it.

I think I have already handled this, please see https://github.com/ceph/ceph/blob/main/src/client/Client.cc#L8041.

It will fall back to a sync setattr/open if make_path_string() returns false, which happens when the dentries are empty.
