
mds/Locker: remove session check access when doing cap updates #47506

Closed
ajarr wants to merge 16 commits into ceph:main from ajarr:fix-56067

Conversation


ajarr (Contributor) commented Aug 9, 2022

Author: Ramana Raja <rraja@redhat.com>
Date: Fri Jul 29 17:51:14 2022 -0400

mds/Locker: remove session check access when doing cap updates

It's too late to check there: session access authorization already
happens before new caps are issued.

Fixes: https://tracker.ceph.com/issues/56067
Signed-off-by: Ramana Raja <rraja@redhat.com>
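The ordering this commit message relies on can be sketched in toy form (a Python sketch for illustration only; the real code is the MDS's C++ Locker/Server paths, and every name below is invented):

```python
class Session:
    """Toy stand-in for an MDS client session (not the real C++ type)."""
    def __init__(self, allowed_paths):
        self.allowed_paths = set(allowed_paths)

    def check_access(self, path):
        return path in self.allowed_paths


class Inode:
    def __init__(self, path):
        self.path = path
        self.caps = {}   # client -> issued cap string
        self.size = 0


def issue_caps(session, inode, client):
    # Session access authorization happens here, *before* caps are issued.
    if not session.check_access(inode.path):
        raise PermissionError(inode.path)
    inode.caps[client] = "Fw"


def handle_cap_flush(inode, client, new_size):
    # After the fix there is no session access re-check at flush time:
    # a client can only flush dirty state under caps it was already issued.
    assert client in inode.caps
    inode.size = new_size
```

With this shape, re-checking access in `handle_cap_flush()` would be redundant at best, and at worst (as with the kclient's hard-coded 0:0 caller ids discussed below) it rejects flushes of data the client was legitimately allowed to write.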

Author: Ramana Raja <rraja@redhat.com>
Date: Tue Nov 15 14:00:24 2022 -0500

mds/Server: disallow clients that have root_squash

... MDS auth caps but don't have the CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK
feature bit (i.e., can't check the auth caps sent back to them by the
MDS) from establishing a session. Do this in
Server::handle_client_session() and Server::handle_client_reconnect(),
where old clients try to reconnect to MDS servers after an upgrade.

If the client doesn't have the ability to authorize session access
based on the MDS auth caps sent back to it by the MDS, then the
client may buffer changes locally during open and setattr operations
when it's not supposed to, e.g., when enforcing root_squash MDS auth
caps.

Fixes: https://tracker.ceph.com/issues/56067
Signed-off-by: Ramana Raja <rraja@redhat.com>
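The gating described here can be sketched as follows (a minimal Python sketch: CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK is the real feature bit name, but the helper, its signature, and the string matching are invented for illustration):

```python
CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK = "mds_auth_caps_check"

def may_open_session(client_features, mds_auth_caps):
    # A client that cannot evaluate the MDS auth caps sent back to it
    # must not establish a session when root_squash is among its caps;
    # otherwise it may buffer changes it was never entitled to make.
    if "root_squash" in mds_auth_caps:
        return CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK in client_features
    return True
```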

Author: Ramana Raja <rraja@redhat.com>
Date: Mon Aug 8 14:33:06 2022 -0400

qa/tasks/cephfs: Add reproducer for https://tracker.ceph.com/issues/56067

A kernel CephFS client with MDS root_squash caps is able to write to a
file as a non-root user. However, the data written is lost after clearing
the kernel client cache, or re-mounting the client. This issue is not
observed with a FUSE CephFS client.

Signed-off-by: Ramana Raja <rraja@redhat.com>

Author: Ramana Raja <rraja@redhat.com>
Date: Thu Nov 24 15:48:27 2022 -0500

qa/tasks/cephfs/test_admin: run root_squash tests only for FUSE client

The kclient doesn't have the CEPHFS_FEATURE_MDS_AUTH_CAPS feature bit
required to enforce root_squash. Run the root_squash tests only for the
FUSE client.

Signed-off-by: Ramana Raja <rraja@redhat.com>
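A sketch of what "run only for the FUSE client" could look like in a test (a hypothetical shape: the real qa/tasks/cephfs harness API differs, and CLIENT_TYPE here is an invented stand-in for the harness's mount-type knob):

```python
import unittest

class TestRootSquash(unittest.TestCase):
    CLIENT_TYPE = "fuse"  # would come from the test harness configuration

    def test_root_squash_denies_root(self):
        if self.CLIENT_TYPE != "fuse":
            self.skipTest("kclient lacks the feature bit needed to "
                          "enforce root_squash")
        # ... the actual root_squash assertions would go here ...
```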


@ajarr ajarr added the cephfs Ceph File System label Aug 9, 2022
@github-actions github-actions bot added the tests label Aug 9, 2022
@ajarr ajarr added the DNM label Aug 9, 2022
@ajarr ajarr marked this pull request as draft August 9, 2022 02:44
@ajarr ajarr requested review from batrick and vshankar August 9, 2022 02:45

ajarr commented Aug 9, 2022

With this fix, a FUSE client with root_squash MDS caps worked as expected (as before). When using a kclient with root_squash MDS caps, the data written by a non-root user wasn't lost after remounting the client or clearing the client's cache. A root user wasn't able to create, remove, or truncate a file, but was able to write to the file (sudo vim foo.txt). I expected the write to fail for the root user.

$ ./bin/ceph fs authorize a client.test_a / rw root_squash
$ sudo ./bin/mount.ceph test_a@.a=/ /mnt/cephfs1/
$ cd /mnt/cephfs1/

$ date > kernel-root-squash-100.txt
$ su -c 'echo 3 > /proc/sys/vm/drop_caches'
$ cat kernel-root-squash-100.txt
Wed Jul 27 08:26:47 PM EDT 2022
$ # non-root user writes are not lost anymore when using the kclient with root_squash MDS caps

$ sudo touch kernel-root-squash-100.txt
touch: setting times of 'kernel-root-squash-100.txt': Permission denied

$ sudo rm kernel-root-squash-100.txt
rm: cannot remove 'kernel-root-squash-100.txt': Permission denied

$ sudo truncate -s 0 kernel-root-squash-100.txt
truncate: failed to truncate 'kernel-root-squash-100.txt' at 0 bytes: Permission denied
$ # truncate as root failed too!

$ sudo vim kernel-root-squash-100.txt
$ # Able to edit the file as the root user, which I didn't expect. I don't see this behavior with the FUSE client.
$ su -c 'echo 3 > /proc/sys/vm/drop_caches'
$ cat kernel-root-squash-100.txt
Hello World! Wed Jul 27 08:26:47 PM EDT 2022

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

2022-07-27T20:28:24.100-0400 7fd0192a2640  1 -- [v2:10.0.0.148:6810/3875245833,v1:10.0.0.148:6811/3875245833] <== osd.0 v2:10.0.0.148:6802/2003141 18 ==== osd_op_reply(181 200.00000000 [writefull 0~90 [fadvise_dontneed]] v31'129 uv129 ondisk = 0) v8 ==== 156+0+0 (crc 0 0 0) 0x5600b617ab40 con 0x5600b6395000
2022-07-27T20:28:24.102-0400 7fd015a9b640  1 -- [v2:10.0.0.148:6810/3875245833,v1:10.0.0.148:6811/3875245833] <== client.4163 10.0.0.148:0/3662841394 34266 ==== client_caps(flush ino 0x100000001fb 12 seq 8 tid 46 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=pAsxXsxFsxcrwb follows 1 size 45/0 mtime 2022-07-27T20:28:24.099525-0400 tws 1 xattrs(v=18446637042079226624 l=0)) v10 ==== 236+0+0 (crc 0 0 0) 0x5600b6b00000 con 0x5600b625d400
2022-07-27T20:28:24.102-0400 7fd015a9b640  7 mds.0.locker handle_client_caps  on 0x100000001fb tid 46 follows 1 op flush flags 0x1
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.3 get_session have 0x5600b62f0000 client.4163 10.0.0.148:0/3662841394 state open
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.locker  head inode [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.locker  follows 1 retains pAsxLsXsxFsxcrwb dirty Fw on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640  7 mds.0.locker  flush client.4163 dirty Fw seq 8 on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.locker _do_cap_update dirty Fw issued pAsxLsXsxFsxcrwb wanted pAsxXsxFsxcrwb on [inode 0x100000001fb [2,head] /kernel-root-squash-100.txt auth v140 DIRTYPARENT s=32 n(v0 rc2022-07-27T20:28:24.079525-0400 b32 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4163=0-4194304@1} caps={4163=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@8},l=4163 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x5600b62ae680]
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.locker inode is file
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.locker client has write caps; m->get_max_size=0; old_max=4194304
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 mds.0.3 get_session have 0x5600b62f0000 client.4163 10.0.0.148:0/3662841394 state open
2022-07-27T20:28:24.102-0400 7fd015a9b640 20 Session check_access path /kernel-root-squash-100.txt
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 MDSAuthCap is_capable inode(path /kernel-root-squash-100.txt owner 1000:1000 mode 0100664) by caller 4294967295:4294967295 mask 2 new 0:0 cap: MDSAuthCaps[allow rw fsname=a root_squash]
2022-07-27T20:28:24.102-0400 7fd015a9b640 15 mds.0.cache.ino(0x100000001fb) project_inode 0x100000001fb
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache.dir(0x1) pre_dirty 142
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache.den(0x1 kernel-root-squash-100.txt) pre_dirty [dentry #0x1/kernel-root-squash-100.txt [2,head] auth (dn sync l) (dversion lock) pv=142 v=140 ino=0x100000001fb state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 clientlease=1 0x5600b6414000]
2022-07-27T20:28:24.102-0400 7fd015a9b640 10 mds.0.cache.ino(0x100000001fb) pre_dirty 142 (current v 140)
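One detail worth calling out in the MDSAuthCap line above: the caller is logged as 4294967295:4294967295, which is the unsigned 32-bit rendering of uid/gid -1, i.e. an anonymous caller rather than root. A quick check:

```python
import ctypes

# 4294967295 == 0xffffffff is (uint32_t)-1: the "anonymous" caller id,
# which root_squash does not squash (it only targets uid/gid 0).
assert 4294967295 == (1 << 32) - 1 == 0xFFFFFFFF
assert ctypes.c_uint32(-1).value == 4294967295
```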

Writing as a root user from a FUSE client with root_squash MDS caps fails as expected. I'm trying to compare the FUSE client's and kclient's calls to the MDS when trying to write to a file.

@batrick (Member) left a comment

otherwise, lgtm

@gregsfortytwo (Member) left a comment

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

This seems not okay.


lxbsz commented Aug 18, 2022

When using root_squash MDS auth caps, a kernel CephFS client is able to write data as a non-root user. However, after it is remounted or forced to drop its cache, the data is lost. When the kclient tries to flush 'Fw' caps, the cap update is dropped by the MDS. This is because the kclient always sends cap updates with caller_uid and caller_gid as zero (root), and the MDS Locker disallows them for clients with root_squash MDS auth caps.

In the case of FUSE client, the cap update messages are sent to the MDS with caller_uid and caller_gid as -1. It does not hit the data loss issue faced by the kclient.

This is not always true.

There are cases that will set them. For example, when truncating a file to a larger size, the client will set caller_uid/caller_gid to those of the calling process.

Since it's not easy for the clients to keep track of the caller that dirtied the caps, change the MDS Locker code to always check access for caller_uid and caller_gid as -1 when doing cap updates.

The logic differs between the user space client and the kclient: the kclient will always use the inode->i_uid/i_gid for the cap update request, while the user space client does not.
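The asymmetry discussed in this thread (the kclient flushes caps as caller 0:0 and gets dropped under root_squash, while the FUSE client flushes as -1:-1 and passes) can be sketched like so. This is an illustrative Python reduction, not the actual MDSAuthCaps::is_capable() logic:

```python
MAY_WRITE = 2          # the "mask 2" seen in the MDS log lines above
ANON = 0xFFFFFFFF      # (uid_t)-1, rendered as 4294967295 in the logs

def root_squash_allows(caller_uid, mask):
    # root_squash denies write-ish operations from uid 0; an anonymous
    # caller (-1) is not uid 0, so it is not squashed.
    return not (mask & MAY_WRITE and caller_uid == 0)

# kclient cap flush: caller_uid hard-coded to 0 -> dropped, dirty data lost
assert not root_squash_allows(0, MAY_WRITE)
# FUSE cap flush: caller_uid -1 -> allowed
assert root_squash_allows(ANON, MAY_WRITE)
```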


lxbsz commented Aug 18, 2022

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

This seems not okay.

IMO it should fail when opening the file with WR mode, not at the write stage. Once the client gets the Fw caps it will be allowed to write and won't do the permission check in the write path.


lxbsz commented Aug 18, 2022

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

This seems not okay.

IMO it should fail when opening the file with WR mode, not at the write stage. Once the client gets the Fw caps it will be allowed to write and won't do the permission check in the write path.

Please take care of the async create path. You can try disabling ASYNC_DIROPS and then try it.


ajarr commented Aug 22, 2022

When using root_squash MDS auth caps, a kernel CephFS client is able to write data as a non-root user. However, after it is remounted or forced to drop its cache, the data is lost. When the kclient tries to flush 'Fw' caps, the cap update is dropped by the MDS. This is because the kclient always sends cap updates with caller_uid and caller_gid as zero (root), and the MDS Locker disallows them for clients with root_squash MDS auth caps.
In the case of FUSE client, the cap update messages are sent to the MDS with caller_uid and caller_gid as -1. It does not hit the data loss issue faced by the kclient.

This is not always true.

There are cases that will set them. For example, when truncating a file to a larger size, the client will set caller_uid/caller_gid to those of the calling process.

@lxbsz Are you saying it's not always the case that the user space client sends cap updates with m->caller_uid:m->caller_gid as -1:-1, as mentioned in the commit message? I will fix the commit message. If we consider the example that you mention, truncating a file using a FUSE client, would the change I made in the MDSLocker code affect the existing behavior when truncating a file?

Since it's not easy for the clients to keep track of the caller that dirtied the caps, change the MDS Locker code to always check access for caller_uid and caller_gid as -1 when doing cap updates.

The logic differs between the user space client and the kclient: the kclient will always use the inode->i_uid/i_gid for the cap update request, while the user space client does not.

Sorry, I'm unable to understand the above statement. Are you saying that kclient uses inode->i_uid/i_gid to track caller_uid and caller_gid when requesting cap updates?

        /*
         * caller_uid/caller_gid (version 7)
         *
         * Currently, we don't properly track which caller dirtied the caps
         * last, and force a flush of them when there is a conflict. For now,
         * just set this to 0:0, to emulate how the MDS has worked up to now.
         */
        ceph_encode_32(&p, 0);
        ceph_encode_32(&p, 0);

Based on the above kclient code, fs/ceph/caps.c, I meant to say that caller_uid and caller_gid are always set as 0:0 when updating caps.


ajarr commented Aug 22, 2022

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

This seems not okay.

IMO it should fail when opening the file with WR mode, not at the write stage. Once the client gets the Fw caps it will be allowed to write and won't do the permission check in the write path.

Got it.

With this PR, when using a FUSE client with root_squash MDS auth caps, I cannot write to a file as a root user (sudo vim foo.txt fails to edit the file), which is the expected behavior. But when using root_squash MDS auth caps with a kclient, I'm able to write to a file (sudo vim foo.txt is able to edit the file). @lxbsz would you know why the FUSE client behaves differently from the kclient here?


lxbsz commented Aug 22, 2022

When using root_squash MDS auth caps, a kernel CephFS client is able to write data as a non-root user. However, after it is remounted or forced to drop its cache, the data is lost. When the kclient tries to flush 'Fw' caps, the cap update is dropped by the MDS. This is because the kclient always sends cap updates with caller_uid and caller_gid as zero (root), and the MDS Locker disallows them for clients with root_squash MDS auth caps.
In the case of FUSE client, the cap update messages are sent to the MDS with caller_uid and caller_gid as -1. It does not hit the data loss issue faced by the kclient.

This is not always true.
There are cases that will set them. For example, when truncating a file to a larger size, the client will set caller_uid/caller_gid to those of the calling process.

@lxbsz Are you saying it's not always the case that the user space client sends cap updates with m->caller_uid:m->caller_gid as -1:-1, as mentioned in the commit message? I will fix the commit message.

Yeah, correct.

If we consider the example that you mention, truncating a file using a FUSE client, would the change I made in the MDSLocker code affect the existing behavior when truncating a file?

I think it's okay. In that case the client will buffer the changes and just assume the operation is successful. IMO we shouldn't defer the permission check to the MDS at cap-flush time; it should fail before buffering the changes instead.

Since it's not easy for the clients to keep track of the caller that dirtied the caps, change the MDS Locker code to always check access for caller_uid and caller_gid as -1 when doing cap updates.

The logic differs between the user space client and the kclient: the kclient will always use the inode->i_uid/i_gid for the cap update request, while the user space client does not.

Sorry, I'm unable to understand the above statement. Are you saying that kclient uses inode->i_uid/i_gid to track caller_uid and caller_gid when requesting cap updates?

        /*
         * caller_uid/caller_gid (version 7)
         *
         * Currently, we don't properly track which caller dirtied the caps
         * last, and force a flush of them when there is a conflict. For now,
         * just set this to 0:0, to emulate how the MDS has worked up to now.
         */
        ceph_encode_32(&p, 0);
        ceph_encode_32(&p, 0);

Based on the above kclient code, fs/ceph/caps.c, I meant to say that caller_uid and caller_gid are always set as 0:0 when updating caps.

You are right. I think I was checking another code path for kclient.

lxbsz commented Aug 22, 2022

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

This seems not okay.

IMO it should fail when opening the file with WR mode, not at the write stage. Once the client gets the Fw caps it will be allowed to write and won't do the permission check in the write path.

Got it.

With this PR, when using a FUSE client with root_squash MDS auth caps, I cannot write to a file as a root user (sudo vim foo.txt fails to edit the file), which is the expected behavior. But when using root_squash MDS auth caps with a kclient, I'm able to write to a file (sudo vim foo.txt is able to edit the file). @lxbsz would you know why the FUSE client behaves differently from the kclient here?

Still not sure about this yet. If you have logs, it will be very easy to find the root cause.


ajarr commented Aug 23, 2022

During write as a root user from the kclient, I see the following in MDS log, where I suspect the write is allowed to succeed.

This seems not okay.

IMO it should fail when opening the file with WR mode, not at the write stage. Once the client gets the Fw caps it will be allowed to write and won't do the permission check in the write path.

Got it.
With this PR, when using a FUSE client with root_squash MDS auth caps, I cannot write to a file as a root user (sudo vim foo.txt fails to edit the file), which is the expected behavior. But when using root_squash MDS auth caps with a kclient, I'm able to write to a file (sudo vim foo.txt is able to edit the file). @lxbsz would you know why the FUSE client behaves differently from the kclient here?

Still not sure about this yet. If you have logs, it will be very easy to find the root cause.

@lxbsz , I've added the logs below. Please let me know if you need more details.

Built Ceph (main branch) including the change in this PR. Mounted a CephFS volume using a FUSE client with root_squash MDS caps. I was unable to write to an existing file as a root user (sudo vim <file>), which is the expected behavior.

$ MON=1 MGR=1 OSD=1 MDS=1 ../src/vstart.sh -d -n -x
$ ./bin/ceph fs authorize a client.test_a / rw root_squash
$ sudo ./bin/ceph-fuse --id=test_a /mnt/cephfs
$ cd /mnt/cephfs/
$ date > test-fuse-client.txt
$ sudo vim test-fuse-client.txt
$ # I was unable to make changes to "test-fuse-client.txt" as root

The relevant MDS log during "sudo vim test-fuse-client.txt" can be downloaded from:
https://pastebin.com/FcyJBpnV
or
ceph-post-file: 6d54a359-d961-45cb-a14f-15f4398efdf3

Mounted the same CephFS volume using a kernel client with root_squash MDS caps. I was able to write to an existing file as a root user (sudo vim <file>), which I did not expect to happen.

$ sudo ./bin/mount.ceph test_a@.a=/ /mnt/cephfs1/
$ cd /mnt/cephfs1/
$ date > test-kclient.txt
$ su -c 'echo 3 > /proc/sys/vm/drop_caches'
$ sudo vim test-kclient.txt
$ # I was able to make changes to "test-kclient.txt" as a root user

The relevant MDS log during "sudo vim test-kclient.txt" can be downloaded from:
https://pastebin.com/8vWY1g0k
or
ceph-post-file: 377af32c-7906-485f-abeb-63303269fb32

I grepped the MDS logs (shared above) for 'reply_client_request' to understand how the FUSE client's requests differ from the kclient's requests during sudo vim <foo>.

The MDS replies to the FUSE client requests:

$ grep 'reply_client_request' aug-22-fuse-client-mds.log
2022-08-22T16:25:02.871-0400 7f72ef5fb640  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4165:183 lookup #0x1/test-fuse-client.txt 2022-08-22T16:25:02.871197-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.897-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:184 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.898237-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.898-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:185 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.898644-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.898-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:186 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.899091-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.898-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:187 create #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.899380-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.899-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:188 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.899949-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.899-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:189 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.900240-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.899-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:190 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.900533-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.900-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:191 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.900817-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.900-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:192 lookup #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.901119-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.900-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:193 create #0x1/.test-fuse-client.txt.swp 2022-08-22T16:25:02.901343-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:02.901-0400 7f72ef5fb640  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4165:194 lookup #0x1/test-fuse-client.txt 2022-08-22T16:25:02.901794-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.253-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:195 lookup #0x1/4913 2022-08-22T16:25:08.253066-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.254-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:196 lookup #0x1/4913 2022-08-22T16:25:08.254981-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.256-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:197 lookup #0x1/4913 2022-08-22T16:25:08.256636-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.257-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:198 create #0x1/4913 2022-08-22T16:25:08.257968-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.258-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:199 lookup #0x1/test-fuse-client.txt~ 2022-08-22T16:25:08.259433-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.259-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:200 lookup #0x1/test-fuse-client.txt~ 2022-08-22T16:25:08.259923-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.259-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:201 lookup #0x1/test-fuse-client.txt~ 2022-08-22T16:25:08.260422-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.260-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4165:202 lookup #0x1/test-fuse-client.txt~ 2022-08-22T16:25:08.260837-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.260-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:203 create #0x1/test-fuse-client.txt~ 2022-08-22T16:25:08.261143-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.262-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:204 open #0x10000000001 2022-08-22T16:25:08.262671-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:10.805-0400 7f72ef5fb640  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4165:205 getattr pAsLsXsFs #0x1 2022-08-22T16:25:10.804942-0400 caller_uid=1000, caller_gid=1000{10,1000,}) v6

The MDS replies to the kclient requests during sudo vim test-kclient.txt:

$ grep 'reply_client_request' aug-22-kclient-mds.log
2022-08-22T17:52:12.820-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4171:18 open #0x1/.test-kclient.txt.swp 2022-08-22T17:52:12.819697-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:12.820-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:19 create #0x1/.test-kclient.txt.swp 2022-08-22T17:52:12.820697-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:12.821-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4171:20 lookup #0x1/.test-kclient.txt.swp 2022-08-22T17:52:12.820697-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:12.821-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:21 create #0x1/.test-kclient.txt.swp 2022-08-22T17:52:12.821697-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:12.822-0400 7f72ef5fb640  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4171:22 lookup #0x1/test-kclient.txt 2022-08-22T17:52:12.821697-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:16.789-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4171:23 lookup #0x1/4913 2022-08-22T17:52:16.788692-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:16.791-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:24 create #0x1/4913 2022-08-22T17:52:16.789692-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:16.793-0400 7f72ef5fb640  7 mds.0.server reply_client_request -2 ((2) No such file or directory) client_request(client.4171:25 lookup #0x1/test-kclient.txt~ 2022-08-22T17:52:16.792692-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:16.795-0400 7f72ef5fb640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:26 create #0x1/test-kclient.txt~ 2022-08-22T17:52:16.793692-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:16.820-0400 7f72e95ef640  7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:27 setattr size=0 mtime=2022-08-22T17:52:16.797692-0400 #0x100000003e9 2022-08-22T17:52:16.797692-0400 caller_uid=0, caller_gid=0{0,}) v4

From the above you can see that the last setattr requested by the kclient is denied by the MDS. Later in the MDS log, I see the following:

2022-08-22T17:52:16.825-0400 7f72ef5fb640  1 -- [v2:10.0.0.148:6810/3984361122,v1:10.0.0.148:6811/3984361122] <== client.4171 10.0.0.148:0/1435379467 267 ==== client_caps(flush ino 0x100000003e9 10 seq 6 tid 4 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=pAsxXsxFsxcrwb follows 1 size 63/0 mtime 2022-08-22T17:52:16.821692-0400 xattrs(v=18446630721976908800 l=0)) v10 ==== 236+0+0 (crc 0 0 0) 0x55634f72f180 con 0x55634f612000
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.objecter ms_dispatch 0x55634a71a000 client_caps(flush ino 0x100000003e9 10 seq 6 tid 4 caps=pAsxLsXsxFsxcrwb dirty=Fw wanted=pAsxXsxFsxcrwb follows 1 size 63/0 mtime 2022-08-22T17:52:16.821692-0400 xattrs(v=18446630721976908800 l=0)) v10
2022-08-22T17:52:16.825-0400 7f72ef5fb640  7 mds.0.locker handle_client_caps  on 0x100000003e9 tid 4 follows 1 op flush flags 0x1
2022-08-22T17:52:16.825-0400 7f72ef5fb640 20 mds.0.3 get_session have 0x55634f5c5400 client.4171 10.0.0.148:0/1435379467 state open
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.locker  head inode [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.cache pick_inode_snap follows 1 on [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.locker  follows 1 retains pAsxLsXsxFsxcrwb dirty Fw on [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640  7 mds.0.locker  flush client.4171 dirty Fw seq 6 on [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.locker _do_cap_update dirty Fw issued pAsxLsXsxFsxcrwb wanted pAsxXsxFsxcrwb on [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640 20 mds.0.locker inode is file
2022-08-22T17:52:16.825-0400 7f72ef5fb640 20 mds.0.locker client has write caps; m->get_max_size=0; old_max=4194304
2022-08-22T17:52:16.825-0400 7f72ef5fb640 20 mds.0.3 get_session have 0x55634f5c5400 client.4171 10.0.0.148:0/1435379467 state open
2022-08-22T17:52:16.825-0400 7f72ef5fb640 20 Session check_access path /test-kclient.txt
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 MDSAuthCap is_capable inode(path /test-kclient.txt owner 1000:1000 mode 0100664) by caller 4294967295:4294967295 mask 2 new 0:0 cap: MDSAuthCaps[allow rw fsname=a root_squash]
2022-08-22T17:52:16.825-0400 7f72ef5fb640 15 mds.0.cache.ino(0x100000003e9) project_inode 0x100000003e9
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.cache.dir(0x1) pre_dirty 36
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.cache.den(0x1 test-kclient.txt) pre_dirty [dentry #0x1/test-kclient.txt [2,head] auth (dn sync l) (dversion lock) pv=36 v=34 ino=0x100000003e9 state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 clientlease=1 0x55634f5baf00]
2022-08-22T17:52:16.825-0400 7f72ef5fb640 10 mds.0.cache.ino(0x100000003e9) pre_dirty 36 (current v 34)
2022-08-22T17:52:16.825-0400 7f72ef5fb640  7 mds.0.locker   ctime 2022-08-22T17:49:56.023874-0400 -> 2022-08-22T17:52:16.821692-0400 for [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 pv36 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0)->n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640  7 mds.0.locker   change_attr 3 -> 4 for [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 pv36 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0)->n(v0 rc2022-08-22T17:52:16.821692-0400 b54 1=1+0)/n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
2022-08-22T17:52:16.825-0400 7f72ef5fb640  7 mds.0.locker   mtime 2022-08-22T17:49:56.020874-0400 -> 2022-08-22T17:52:16.821692-0400 for [inode 0x100000003e9 [2,head] /test-kclient.txt auth v34 pv36 DIRTYPARENT s=54 n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0)->n(v0 rc2022-08-22T17:52:16.821692-0400 b54 1=1+0)/n(v0 rc2022-08-22T17:49:56.023874-0400 b54 1=1+0) (iauth excl) (ifile excl) (ixattr excl) (iversion lock) cr={4171=0-4194304@1} caps={4171=pAsxLsXsxFsxcrwb/pAsxXsxFsxcrwb@6},l=4171 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55634f592680]
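The `MDSAuthCap is_capable` line in the log above shows the cap flush being checked as caller 4294967295:4294967295, i.e. (uint32_t)-1, an anonymous caller, so the root_squash restriction never applies at flush time. A minimal Python sketch of that evaluation (the constants, function shape, and semantics are simplified illustrations, not the actual MDSAuthCaps code):

```python
# Hedged sketch of a root_squash MDS auth cap evaluating a write check.
# Names and values are illustrative, not the real Locker/MDSAuthCaps code.

MAY_READ, MAY_WRITE = 1, 2
ANONYMOUS = 0xFFFFFFFF  # 4294967295, i.e. (uint32_t)-1 in the log above

def is_capable(caller_uid, mask, allow_rw=True, root_squash=True):
    """Return True if the caller passes the cap grant."""
    if root_squash and caller_uid == 0 and (mask & MAY_WRITE):
        return False  # root is squashed: no writes as uid 0
    if (mask & MAY_WRITE) and not allow_rw:
        return False
    return True

# The flush in the log is checked as the anonymous caller, so
# root_squash does not trigger and the cap update is allowed:
assert is_capable(ANONYMOUS, MAY_WRITE) is True
# The same check performed for uid 0 would be denied:
assert is_capable(0, MAY_WRITE) is False
```

This is why checking at cap-update time is too late: by then the caller identity from the original operation is gone.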

@vshankar vshankar requested a review from a team August 24, 2022 12:07
@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@lxbsz
Member

lxbsz commented Sep 1, 2022

@ajarr This looks good to me. Please rebase the PR and remove the draft.
Thanks!

@vshankar vshankar self-assigned this Sep 1, 2022
@lxbsz
Member

lxbsz commented Sep 1, 2022

@ajarr This looks good to me. Please rebase the PR and remove the draft. Thanks!

Wait. I checked it again. As we discussed above, in the setattr case, if the client buffers the changes it won't check the permission; it defers the check until the caps are flushed, as is happening here. So we still need to make sure this won't introduce issues where a non-permitted user could update the file.

As mentioned above, maybe we need to check the permission before buffering?

@lxbsz
Member

lxbsz commented Sep 1, 2022

For fuse client:

2022-08-22T16:25:08.260-0400 7f72ef5fb640 7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:203 create #0x1/test-fuse-client.txt~ 2022-08-22T16:25:08.261143-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:08.262-0400 7f72ef5fb640 7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4165:204 open #0x10000000001 2022-08-22T16:25:08.262671-0400 caller_uid=0, caller_gid=0{0,}) v6
2022-08-22T16:25:10.805-0400 7f72ef5fb640 7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4165:205 getattr pAsLsXsFs #0x1 2022-08-22T16:25:10.804942-0400 caller_uid=1000, caller_gid=1000{10,1000,}) v6

vim first tries to create the backup file, which is suffixed with ~ (test-fuse-client.txt~). That fails, but it doesn't stop vim from continuing. vim then tries to open the test-fuse-client.txt file itself (inode #0x10000000001), and that open was denied with permission denied.

While for the kernel client there was no open operation:

2022-08-22T17:52:16.795-0400 7f72ef5fb640 7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:26 create #0x1/test-kclient.txt~ 2022-08-22T17:52:16.793692-0400 caller_uid=0, caller_gid=0{0,}) v4
2022-08-22T17:52:16.820-0400 7f72e95ef640 7 mds.0.server reply_client_request -13 ((13) Permission denied) client_request(client.4171:27 setattr size=0 mtime=2022-08-22T17:52:16.797692-0400 #0x100000003e9 2022-08-22T17:52:16.797692-0400 caller_uid=0, caller_gid=0{0,}) v4

It seems the open succeeded directly on the client side. I need to check the kernel code later to see why. The backup file also failed to create, but vim could still continue to open and write to the file (inode #0x100000003e9).

@vshankar
Contributor

vshankar commented Sep 1, 2022

[...]

As mentioned above we may need to check the permission before buffering it ?

That sounds like the correct approach IMO.

@lxbsz
Member

lxbsz commented Sep 1, 2022

[...]

That sounds like the correct approach IMO.

If so, we need to do the same check on the client side anyway?

@lxbsz
Member

lxbsz commented Sep 1, 2022

[...]

When opening an existing file whose inode is cached on the client side:

For the fuse client:

1. Try to open the inode and check the issued caps to see whether the wanted Fw cap has been issued.
2. It hasn't, so the client sends an open request to the MDS.
3. The MDS rejects it with permission denied.

For the kclient:

1. Try to open the inode and check the issued caps to see whether the wanted Fw cap has been issued.
2. It hasn't, but the kclient sends a cap update to get the wanted Fw cap instead of sending an open request to the MDS.

This is why the kclient can succeed while the fuse client can't.

We need to find a way to force this check in the kernel.
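The two open paths described above can be caricatured in a short, hedged Python sketch (the `FW` bit, function names, and return strings are all illustrative, not actual client code):

```python
# Illustrative contrast between the ceph-fuse and kernel-client open
# paths for a cached inode that lacks the wanted Fw cap. This is a
# sketch of the described behavior, not real client code.

FW = 0x8  # made-up file-write cap bit, not the real CEPH_CAP_FILE_WR

def fuse_open_for_write(issued, mds_grants_open):
    """fuse path: missing Fw -> explicit open request to the MDS."""
    if not issued & FW:
        if not mds_grants_open:
            # The MDS re-checks auth caps and replies -EACCES,
            # matching the -13 reply in the fuse log above.
            raise PermissionError("MDS denied the open")
    return "open ok"

def kclient_open_for_write(issued):
    """kclient path: missing Fw -> cap update, no MDS open request,
    so no server-side permission check happens here."""
    if not issued & FW:
        return "open ok (requested Fw via cap update)"
    return "open ok"

# With root_squash, the MDS denies the fuse open outright:
try:
    fuse_open_for_write(issued=0, mds_grants_open=False)
except PermissionError:
    pass
# while the kclient open appears to succeed locally:
assert kclient_open_for_write(issued=0).startswith("open ok")
```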

@lxbsz
Member

lxbsz commented Sep 1, 2022

[...]

We need to find a way to force check this in kernel.

The kernel patch fixed it.

@vshankar
Contributor

vshankar commented Sep 2, 2022

[...]

This is why the kclient could succeed, while the fuse client won't.

That's one issue. The other (probably a bit more important) thing is that I expected the flush/sync to error out, but that doesn't happen - do we know why?

We need to find a way to force check this in kernel.

@lxbsz
Member

lxbsz commented Sep 2, 2022

...

When open a existing file and if the inode is cached in client side:
For the fuse client:
1, Try to open the inode and check the caps issued to see whether the Fw wanted caps is issued. 2, It's not and it will issue a open request to MDS 3, The mds just rejected it with permission denied.
For the kclient:
1, Try to open the inode and check the caps issued to see whether the Fw wanted caps is issued. 2, It's not and then it will issue a caps update to get the wanted Fw caps instead of sending a open request to MDS.
This is why the kclient could succeed, while the fuse client won't.

That's one issue. The other (probably a bit more important) thing is that I expected the flush/sync to error out, but that doesn't happen - do we know why?

Without this PR, the changed data will be lost after a remount. Actually, the contents have already been written to the data pool; it's just the metadata that is lost in the MDS when flushing the caps.

With this PR, I can see it was written normally.

@ajarr
Contributor Author

ajarr commented Nov 22, 2022

As we discussed in the meeting today:

  • I don't think this check belongs in _do_cap_update. We check authorization in cap issuance, not during update (when it's too late).

I've removed the session check access in Locker::_do_cap_update().

  • Audit this method to ensure that a client can only mutate metadata it has been given permission by the MDS to modify (i.e. what caps has it been issued). We should issue some kind of debug warning (level -1) when the MDS drops a cap update as that's really not good. Probably some kind of cluster warning too.

I went through Locker::_do_cap_update() and Locker::_update_cap_fields(), where inode metadata is actually updated.

I couldn't find anything that stood out to me in Locker::_do_cap_update(), except
https://github.com/ceph/ceph/blob/v17.2.5/src/mds/Locker.cc#L3794
I don't completely understand cap handling. I was wondering if the condition should be

if (cap && (cap->issued() & CEPH_CAP_ANY_FILE_WR))

instead of

if (cap && ((cap->issued() | cap->wanted()) & CEPH_CAP_ANY_FILE_WR))
  • Create tests which send synthetic cap updates for a client to the MDS which should trip on your new permission checks based on the issued caps.

I'm not sure how to do this.

Some other tests to consider: two users on a root_squashed mount, one root (uid=0) and another non-root (uid=1?), both editing a file. The non-root user obtained full (all rights) caps on the file first. What should happen if the root uid tries to modify it on that client mount?

I think this is what @lxbsz 's commit is trying to test here,
b5c7952#diff-9998476d9d02f4136e9f6b4b512be93a77a5911a30027bec6a120706a1a024f3R205
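The difference between the two conditions ajarr compares can be illustrated with a small sketch (the cap bit values below are made up for illustration; they are not the real CEPH_CAP_* layout):

```python
# Illustrative comparison of the issued-only check versus the
# issued|wanted check on client_ranges. Bit values are invented.

CEPH_CAP_FILE_WR = 0x1
CEPH_CAP_FILE_BUFFER = 0x2
CEPH_CAP_ANY_FILE_WR = CEPH_CAP_FILE_WR | CEPH_CAP_FILE_BUFFER

def update_range_issued_only(issued, wanted):
    """Condition ajarr proposes: only issued caps count."""
    return bool(issued & CEPH_CAP_ANY_FILE_WR)

def update_range_issued_or_wanted(issued, wanted):
    """Condition currently in Locker.cc: issued OR wanted counts."""
    return bool((issued | wanted) & CEPH_CAP_ANY_FILE_WR)

# A client that merely *wants* Fw but has not been issued it still
# keeps its client_ranges updated under the issued|wanted form:
issued, wanted = 0, CEPH_CAP_FILE_WR
assert update_range_issued_only(issued, wanted) is False
assert update_range_issued_or_wanted(issued, wanted) is True
```

So the behavioral gap is exactly the set of clients that want write caps without having been issued them, which is the case Sage's 2009 commit (quoted below) deliberately included.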

@lxbsz
Member

lxbsz commented Nov 23, 2022

As we discussed in the meeting today:

  • I don't think this check belongs in _do_cap_update. We check authorization in cap issuance, not during update (when it's too late).

Yeah. Agree.

[...]

if (cap && (cap->issued() & CEPH_CAP_ANY_FILE_WR))

instead of

if (cap && ((cap->issued() | cap->wanted()) & CEPH_CAP_ANY_FILE_WR))

Sage has one commit to add the wanted caps:

commit 0b454296a3a5ce21b7414c9248a364c0017afa81
Author: Sage Weil <sweil@redhat.com>
Date:   Mon Jul 13 11:31:31 2009 -0700

    mds: base client_ranges update on issued|wanted, not just issued

But didn't mention why.

... MDS auth caps but don't have CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK
feature bit (i.e., can't check the auth caps sent back to it by the
MDS) from establishing a session. Do this in
Server::handle_client_session(), and Server::handle_client_reconnect(),
where old clients try to reconnect to MDS servers after an upgrade.

If the client doesn't have the ability to authorize session access
based on the MDS auth caps sent back to it by the MDS, then the
client may buffer changes locally during open and setattr operations
when it's not supposed to, e.g., when enforcing root_squash MDS auth
caps.

Fixes: https://tracker.ceph.com/issues/56067
Signed-off-by: Ramana Raja <rraja@redhat.com>
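The session-time gate this commit message describes can be sketched roughly as follows (a Python sketch with an illustrative feature-bit value; the real check lives in Server::handle_client_session() and Server::handle_client_reconnect() in C++):

```python
# Hedged sketch of rejecting a session from a client whose feature
# bitset lacks the auth-caps-check bit while its MDS caps contain
# root_squash. The bit position is illustrative, not the real value.

CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK = 1 << 21  # illustrative

def may_open_session(client_feature_bits, caps_have_root_squash):
    """Decide at session open/reconnect time, before any caps are
    issued, whether this client can be trusted with root_squash."""
    if caps_have_root_squash and not (
            client_feature_bits & CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK):
        return False  # old client can't enforce root_squash locally
    return True

assert may_open_session(CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK, True) is True
assert may_open_session(0, True) is False
assert may_open_session(0, False) is True  # no root_squash: old client is fine
```

Gating at session establishment, rather than at cap update, keeps the authorization decision where the caller identity and client capabilities are both known.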
@lxbsz
Member

lxbsz commented Nov 25, 2022

IMO this approach is nice. And the changes LGTM.

…6067

A kernel CephFS client with MDS root_squash caps is able to write to a
file as non-root user. However, the data written is lost after clearing
the kernel client cache, or re-mounting the client. This issue is not
observed with a FUSE CephFS client.

Signed-off-by: Ramana Raja <rraja@redhat.com>
kclient doesn't have CEPHFS_FEATURE_MDS_AUTH_CAPS required to
enforce root_squash. Run root_squash tests only for FUSE client.

Signed-off-by: Ramana Raja <rraja@redhat.com>
@ajarr
Contributor Author

ajarr commented Nov 28, 2022

[...]

Sage has one commit to add the wanted caps:

commit 0b454296a3a5ce21b7414c9248a364c0017afa81
Author: Sage Weil <sweil@redhat.com>
Date:   Mon Jul 13 11:31:31 2009 -0700

    mds: base client_ranges update on issued|wanted, not just issued

But didn't mention why.

@batrick @gregsfortytwo would you know why?

@ajarr ajarr marked this pull request as ready for review November 28, 2022 19:51
@ajarr ajarr requested a review from a team as a code owner November 28, 2022 19:51
std::map<std::string, std::string> metadata;
feature_bitset_t supported_features;
metric_spec_t metric_spec;
MDSAuthCaps auth_caps;
Member

Writing down a discussion I had with Ramana today:

I do not believe this is the right approach. While it appears to be future-proof because now the Client "knows" its auth caps and can be relied on to implement new authorizations (like root_squash), it has serious issues:

  • Old clients will not understand new MDSAuthCaps versions. So a new client feature flag will be required for every change which requires new authorization handling in the Client.
  • Clients have to know how to parse these caps. This is trivial for ceph-fuse but not for the kernel client. Parsing grammars in the kernel is something we should avoid.

Because we still need feature bits for any new authorization types, I think this should be handled as new fields (gid/uid/root_squash) in the MClientSession message with corresponding client feature flags. The server tells the client about these authorization restrictions. It's up to the clients to implement them. If the operator wants to enforce them, they can set the required client features [1].

[1] https://docs.ceph.com/en/latest/cephfs/administration/#required-client-features

Contributor Author

Thanks, @batrick. This discussion is more appropriate for PR #48027; I'm going to link it there. This PR is based on that one, and adds changes to the MDS to disallow old clients that lack the feature to parse MDS auth caps.

Member

  • Clients have to know how to parse these caps. This is trivial for ceph-fuse but not for the kernel client. Parsing grammars in the kernel is something we should avoid.

Yeah, for kclient it's true.

Because we still need feature bits for any new authorization types, I think this should be handled as new fields (gid/uid/root_squash) in the MClientSession message with corresponding client feature flags. The server tells the client about these authorization restrictions. It's up to the clients to implement them. If the operator wants to enforce them, they can set the required client features [1].

Let me check how to do this.
Thanks @batrick

[1] https://docs.ceph.com/en/latest/cephfs/administration/#required-client-features

Member

Because we still need feature bits for any new authorization types, I think this should be handled as new fields (gid/uid/root_squash) in the MClientSession message with corresponding client feature flags. The server tells the client about these authorization restrictions. It's up to the clients to implement them. If the operator wants to enforce them, they can set the required client features [1].

I think for something security-focused like this, we can't rely on the operator setting required flags. What we can do is have the MDS refuse to mount a client which doesn't understand cephx caps that involve client-side enforcement. Does that work?

Contributor

Because we still need feature bits for any new authorization types, I think this should be handled as new fields (gid/uid/root_squash) in the MClientSession message with corresponding client feature flags. The server tells the client about these authorization restrictions. It's up to the clients to implement them. If the operator wants to enforce them, they can set the required client features [1].

@batrick, I remember we discussed a bit about the above approach a while back, and recall that @lxbsz was intending to use MDSAuthCaps to fix a set of async dirops bugs in the kclient. Are those still valid @lxbsz? FWIW, the alternate approach to this was discussed here - #48027 (comment).

Member

I think for something security-focused like this, we can't rely on the operator setting required flags. What we can do is have the MDS refuse to mount a client which doesn't understand cephx caps that involve client-side enforcement. Does that work?

We can make it the default for the mons to automatically add these bits to the required client features bitset but that would probably be very surprising to have all existing client mounts get evicted on upgrade.

Member

Because we still need feature bits for any new authorization types, I think this should be handled as new fields (gid/uid/root_squash) in the MClientSession message with corresponding client feature flags. The server tells the client about these authorization restrictions. It's up to the clients to implement them. If the operator wants to enforce them, they can set the required client features [1].

@batrick, I remember we discussed a bit about the above approach a while back and recall that @lxbsz was intending to use MDSAuthCaps to fix a set of async dirops bugs in the kclient. Are those still valid @lxbsz? FWIW, The alternate approach to this was discussed here - #48027 (comment).

@vshankar Yeah, it should be enough for the async case. Currently I think the cephx caps are enough for async dirops. This could really reduce a lot of coding work in the kclient.

Contributor

Thanks @lxbsz. I'll have a look at the changes this week (trying to clear merge backlog right now).

@lxbsz
Member

lxbsz commented Jan 9, 2023

I have pulled @ajarr's new commits in this PR into my #48027.

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@github-actions

github-actions bot commented May 8, 2023

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label May 8, 2023
@ajarr ajarr closed this May 25, 2023