mon, cephfs: Add auth caps for CephFS fsids by rishabh-d-dave · Pull Request #32581 · ceph/ceph

rishabh-d-dave · 2020-01-09T17:20:45Z

Fixes: https://tracker.ceph.com/issues/15070
The first 3 commits are a rebased and modified version of PR #26855.

- Add new MON and MDS auth caps to restrict access based on fsnames.
- Allow passing fsnames and paths in the same cap.
- Make the fs authorize subcommand assign a MON cap specific to that FS.
- Update the client-auth doc for the changes above and improve it.
- Make changes to qa/ code to support multi-FS tests:
  - Create new CephFSMount attributes for the client's keyring, the CephFS name, and the mountpoints on the host FS and on CephFS.
  - Update all mount creation calls to use keyword arguments.
  - Allow reusing mount objects.
  - Add a remount method to remount CephFS in a single call.
  - Allow not aborting when the mount command fails (useful for negative tests).
  - Improve filesystem.py:
    - Add a method to destroy an FS.
    - Modify Filesystem.recreate() and mds_cluster.delete_all_filesystems().
  - Add helper methods to read/write files from CephFS mounts.
  - Improve setup/teardown for CephFS tests in general.
- Add tests for multifs_auth and for the fs authorize subcommand.
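
As a rough illustration of FS-specific caps, the Python sketch below builds the cap strings that an fs authorize style call might assign for one file system. The function name `build_fs_caps` is hypothetical, and the `fsname=<name>` clause layout is an assumption modelled on the `fscid=<fscid>` form floated in this PR's discussion; it is not copied from the Ceph source.

```python
def build_fs_caps(fsname, path="/", perm="rw"):
    """Return hypothetical MON and MDS cap strings scoped to one file system."""
    if perm not in ("r", "rw"):
        raise ValueError("perm must be 'r' or 'rw'")
    # Assumed format: the MON cap is read-only and tied to the FS name.
    mon_cap = f"allow r fsname={fsname}"
    # Assumed format: the MDS cap carries the FS name and, optionally, a path,
    # so that the fsname and the path sit in the same cap.
    mds_cap = f"allow {perm} fsname={fsname}"
    if path != "/":
        mds_cap += f" path={path}"
    return {"mon": mon_cap, "mds": mds_cap}


caps = build_fs_caps("cephfs_a", path="/dir1", perm="rw")
print(caps["mon"])
print(caps["mds"])
```

The point of the sketch is only that the FS restriction sits in the same clause as the permission, for both the MON and the MDS cap.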

Checklist

References tracker ticket
Updates documentation if necessary
Includes tests for new functionality or reproducer for bug

src/mds/Server.cc

rishabh-d-dave · 2020-01-09T17:31:33Z

@batrick and @fullerdj if the PR looks fine, what next? Anything besides adding tests? Some of the changes requested on #26855 are still outstanding; I'll go ahead with those right now.

fullerdj · 2020-01-09T21:14:04Z

Hi Rishabh,

This will need a test associated with it.

batrick

Also need to address especially: #26855 (comment)

@gregsfortytwo @liewegas do you have a suggestion on how that would look? mon allow r fscid=<fscid>?

src/mds/Server.cc

gregsfortytwo · 2020-01-20T22:46:08Z

> Also need to address especially: #26855 (comment)
>
> @gregsfortytwo @liewegas do you have a suggestion on how that would look? mon allow r fscid=<fscid>?

Yes, that looks right to me if you're trying to restrict a user to reading only from fscid 1. As you note in the previous comment, an "allow rw" would supersede any separate "allow fscid=N" clause, so they'd need to be stuck together in any case.
Was that the whole question?

batrick · 2020-01-21T01:50:46Z

> > Also need to address especially: #26855 (comment)
> > @gregsfortytwo @liewegas do you have a suggestion on how that would look? mon allow r fscid=<fscid>?
>
> Yes, that looks right to me if you're trying to restrict a user to reading only from fscid 1. As you note in the previous comment, an "allow rw" would supersede any separate "allow fscid=N" clause, so they'd need to be stuck together in any case.
> Was that the whole question?

Yes, I think so. Thanks!
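
The point about "allow rw" superseding a separate "allow fscid=N" clause can be illustrated with a toy model. This is not Ceph's cap parser: the `Grant` type and the `is_allowed` function are hypothetical, and the only rule modelled is the assumption that a request is allowed when any single grant covers it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Grant:
    perms: str                   # e.g. "r" or "rw"
    fscid: Optional[int] = None  # None means the grant is not tied to any FS


def is_allowed(grants: List[Grant], want: str, fscid: int) -> bool:
    """Allow the request if any one grant covers both the permission and the FS."""
    for g in grants:
        perm_ok = all(p in g.perms for p in want)
        fs_ok = g.fscid is None or g.fscid == fscid
        if perm_ok and fs_ok:
            return True
    return False


# Permission and restriction in the same clause: only fscid 1 is readable.
combined = [Grant("r", fscid=1)]

# A separate, unrestricted "allow rw" matches every FS on its own,
# so the fscid-restricted clause next to it restricts nothing.
separate = [Grant("rw"), Grant("r", fscid=1)]

print(is_allowed(combined, "r", 2))
print(is_allowed(separate, "rw", 2))
```

Under this model the restriction only has an effect when it is attached to the clause that grants the permission, which is why the two need to be "stuck together".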

batrick · 2020-01-30T12:31:31Z

Needs rebase

rishabh-d-dave · 2020-02-12T04:51:42Z

Testing the bits of my patch that apply to qa and can't be tested with vstart_runner.py.

rishabh-d-dave · 2020-02-19T19:13:43Z

@batrick

Teuthology testing fails with both the kernel and the FUSE client. Here are the links to the tests -
http://pulpito.ceph.com/rishabh-2020-02-14_12:07:11-fs-wip-rishabh-wip-djf-15070-distro-basic-smithi/
http://pulpito.ceph.com/rishabh-2020-02-18_05:31:37-kcephfs-wip-rishabh-wip-djf-15070-distro-basic-smithi/

Here's the failure of kernel client -

======
2020-02-18T13:23:43.531 INFO:tasks.cephfs_test_runner:FAIL: test_write (tasks.cephfs.test_multifs.TestClientsWithOutAuth)
2020-02-18T13:23:43.531 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2020-02-18T13:23:43.531 INFO:tasks.cephfs_test_runner:Only have 1 clients, require 2
2020-02-18T13:23:43.531 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2020-02-18T13:23:43.531 INFO:tasks.cephfs_test_runner:Ran 5 tests in 5.179s
2020-02-18T13:23:43.532 INFO:tasks.cephfs_test_runner:

Looks like I need to modify some file in qa/suites to provision 2 clients instead of 1.

And, here's the failure for FUSE client -

Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 89, in run_tasks
    manager.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-rishabh-wip-djf-15070/qa/tasks/ceph_fuse.py", line 137, in task
    mount.mount()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-rishabh-wip-djf-15070/qa/tasks/cephfs/fuse_mount.py", line 39, in mount
    return self._mount(mount_path, mount_fs_name)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-rishabh-wip-djf-15070/qa/tasks/cephfs/fuse_mount.py", line 144, in _mount
    waited
RuntimeError: Fuse mount failed to populate /sys/ after 31 seconds

Looks like the command to mount was executed successfully but the mount itself failed for some reason. I've got no idea what the reason behind this is. This remounting differs from other remounting in that this time the mount happens with a different client ID and keyring.

FYI: These tests ran successfully when I tested with vstart_runner.py.

This commit introduces the following two sets of changes.

First, make the client keyring path, the mountpoints on the host FS and on CephFS, and the CephFS name attributes of the object representing the mount, and update all the mount object creation calls accordingly. Also, rewrite all the mount object creation calls to use keyword arguments instead of positional arguments to avoid mistakes, especially since a new argument was added in this commit.

Second, add a remount method to mount.py so that it's possible to unmount safely, modify the attributes of the object representing the mount, and mount again based on the new state of the object *in a single call*. The method is placed in mount.py to avoid duplication. This change leads to two more changes: upgrading the interface of mount() and mount_wait(), and upgrading the testsuites to adapt to these changes.

Signed-off-by: Rishabh Dave <ridave@redhat.com>
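
The remount idea described in this commit message can be sketched as follows. `SketchMount` is a hypothetical stand-in, not the class in qa/tasks/cephfs/mount.py; the attribute names and the `log` list are assumptions made so the example runs without a cluster.

```python
class SketchMount:
    """Hypothetical stand-in for a CephFS mount object; not the qa/ class."""

    def __init__(self, client_id, client_keyring_path, hostfs_mntpt,
                 cephfs_name=None, cephfs_mntpt="/"):
        self.client_id = client_id
        self.client_keyring_path = client_keyring_path
        self.hostfs_mntpt = hostfs_mntpt
        self.cephfs_name = cephfs_name
        self.cephfs_mntpt = cephfs_mntpt
        self.mounted = False
        self.log = []  # records what a real mount object would execute

    def mount(self):
        self.log.append(("mount", self.client_id, self.cephfs_name,
                         self.cephfs_mntpt))
        self.mounted = True

    def umount_wait(self):
        if self.mounted:
            self.log.append(("umount", self.client_id))
            self.mounted = False

    def remount(self, **new_attrs):
        """Unmount, update the given attributes, and mount again in one call."""
        unknown = [k for k in new_attrs if not hasattr(self, k)]
        if unknown:
            raise AttributeError(f"unknown mount attributes: {unknown}")
        self.umount_wait()
        for name, value in new_attrs.items():
            setattr(self, name, value)
        self.mount()


# Keyword arguments make it hard to mix up the growing list of parameters.
m = SketchMount(client_id="0",
                client_keyring_path="/tmp/client.0.keyring",
                hostfs_mntpt="/mnt/sketch0",
                cephfs_name="a")
m.mount()
m.remount(client_id="testuser",
          client_keyring_path="/tmp/testuser.keyring",
          cephfs_name="b")
```

After the `remount()` call the same object represents a mount of file system "b" as a different client, which is the reuse pattern the multi-FS tests rely on.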

This commit adds a new argument, check_status, to the mount methods of KernelMount, FuseMount, LocalKernelMount and LocalFuseMount. When the value of this argument is False, these methods catch the CommandFailedError exception and return a tuple consisting of the exception itself and the stdout and stderr of the mount command. This allows reusing these mount methods while running negative tests for commands. The name "check_status" was chosen because teuthology's run() and vstart_runner's run() use a variable with the same name for the very same purpose. Signed-off-by: Rishabh Dave <ridave@redhat.com>
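
A minimal sketch of the check_status behaviour described above, under stated assumptions: `CommandFailedError` here is a local stand-in for the exception of the same name, while `fake_run` and this `mount` function are hypothetical and only imitate a command runner so that the example is self-contained.

```python
class CommandFailedError(Exception):
    """Local stand-in for the error raised when a command exits non-zero."""

    def __init__(self, command, exitstatus):
        super().__init__(f"{command!r} exited with status {exitstatus}")
        self.command = command
        self.exitstatus = exitstatus


def fake_run(command, succeed, stdout, stderr):
    """Hypothetical runner: appends output to the given lists, raises on failure."""
    if succeed:
        stdout.append("mounted\n")
        return
    stderr.append("mount error: permission denied\n")
    raise CommandFailedError(command, 1)


def mount(command, succeed=True, check_status=True):
    stdout, stderr = [], []
    try:
        fake_run(command, succeed, stdout, stderr)
    except CommandFailedError as e:
        if check_status:
            raise
        # Negative-test path: hand the failure back instead of aborting.
        return e, "".join(stdout), "".join(stderr)
    return None


# A negative test can inspect the failure rather than crash on it.
result = mount("hypothetical-mount-command", succeed=False, check_status=False)
```

With the default `check_status=True` the same failure propagates as an exception, so existing callers keep their behaviour.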

rishabh-d-dave · 2020-09-10T16:23:03Z

Added DNM to test the changes that fix the ceph API tests.

rishabh-d-dave · 2020-09-10T18:28:06Z

ceph API tests passed - https://jenkins.ceph.com/job/ceph-api/2802/

rishabh-d-dave · 2020-09-11T04:03:39Z

jenkins test make check

rishabh-d-dave · 2020-09-11T04:32:49Z

jenkins render docs

ceph-jenkins · 2020-09-11T04:59:15Z

Doc render available at http://docs.ceph.com/ceph-prs/32581/

rishabh-d-dave · 2020-09-11T09:35:39Z

Tests ran successfully with teuthology -
multifs, fuse client: https://pulpito.ceph.com/rishabh-2020-09-11_06:19:54-fs-wip-rishabh-15070-distro-basic-smithi/
~~Missed the kernel client, triggered tests for it now~~ Kernel tests failed, I suppose because error messages can differ across kernel versions - https://pulpito.ceph.com/rishabh-2020-09-11_09:30:41-kcephfs-wip-rishabh-15070-distro-basic-smithi/

client recovery: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:39-fs-wip-rishabh-15070-distro-basic-smithi/
strays: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:34-fs-wip-rishabh-15070-distro-basic-smithi/
quota: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:27-fs-wip-rishabh-15070-distro-basic-smithi/
alternate pool: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:22-fs-wip-rishabh-15070-distro-basic-smithi/
volume client: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:17-fs-wip-rishabh-15070-distro-basic-smithi/
sessionmap: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:12-fs-wip-rishabh-15070-distro-basic-smithi/
admin: https://pulpito.ceph.com/rishabh-2020-09-11_06:20:06-fs-wip-rishabh-15070-distro-basic-smithi/

Quota tests had 4 jobs, 2 on CentOS 8 and 2 on RHEL 8. The RHEL 8 jobs failed, apparently for an unrelated reason. I'm taking a deeper look into the logs...

qa/tasks/cephfs/test_multifs_auth.py

Add a testsuite for testing authorization on a Ceph cluster with multiple file systems and make it executable with the Teuthology framework. Also add the helper methods required to set up the test environment for multi-FS tests. Signed-off-by: Rishabh Dave <ridave@redhat.com>

Right now, only client IDs are stashed and restored, but with the recent changes (specifically, the addition of more attributes to mount objects) this is not enough. Saving these details before tests and restoring them afterwards ensures that mount commands run smoothly. Not doing this typically leads to a mount command failure for the second test in the testsuite under execution, since the client IDs are saved and restored in CephFSTestCase.setUp and CephFSTestCase.tearDown respectively but the rest of the details are not. Signed-off-by: Rishabh Dave <ridave@redhat.com>
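
The stash-and-restore pattern described in this commit message can be sketched with plain unittest. `FakeMount` and `SketchTestCase` are hypothetical; the real logic lives in CephFSTestCase.setUp and CephFSTestCase.tearDown and may differ in detail.

```python
import copy
import unittest


class FakeMount:
    """Hypothetical mount object carrying the attributes a test may change."""

    def __init__(self):
        self.client_id = "0"
        self.client_keyring_path = "/tmp/client.0.keyring"
        self.cephfs_name = "a"


class SketchTestCase(unittest.TestCase):
    mount = FakeMount()  # shared across tests, like a reused mount object

    def setUp(self):
        # Stash every attribute, not only client_id.
        self._saved_attrs = copy.deepcopy(vars(self.mount))

    def tearDown(self):
        # Restore the stashed state so the next test mounts as expected.
        vars(self.mount).clear()
        vars(self.mount).update(self._saved_attrs)

    def test_1_changes_attrs(self):
        self.mount.client_id = "testuser"
        self.mount.cephfs_name = "b"

    def test_2_sees_original_attrs(self):
        self.assertEqual(self.mount.client_id, "0")
        self.assertEqual(self.mount.cephfs_name, "a")
```

Without the restore in `tearDown`, the second test would start with the client ID and file system name left behind by the first, which is the failure mode the commit message describes.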

qa/suites/fs/multifs/tasks/multifs-auth.yaml

rishabh-d-dave · 2020-09-11T15:05:49Z

Last tests ran successfully with teuth -

https://pulpito.ceph.com/rishabh-2020-09-11_12:37:30-fs-wip-rishabh-15070-distro-basic-smithi/
https://pulpito.ceph.com/rishabh-2020-09-11_12:37:25-kcephfs-wip-rishabh-15070-distro-basic-smithi/

rishabh-d-dave · 2020-09-11T15:27:08Z

jenkins test make check

rishabh-d-dave · 2020-09-11T16:33:07Z

quota - https://pulpito.ceph.com/rishabh-2020-09-11_15:22:09-fs-wip-rishabh-15070-distro-basic-smithi
alternate pool - https://pulpito.ceph.com/rishabh-2020-09-11_15:22:04-fs-wip-rishabh-15070-distro-basic-smithi/
volume client - https://pulpito.ceph.com/rishabh-2020-09-11_15:21:59-fs-wip-rishabh-15070-distro-basic-smithi/
sessionmap - https://pulpito.ceph.com/rishabh-2020-09-11_15:21:54-fs-wip-rishabh-15070-distro-basic-smithi/
client recovery - https://pulpito.ceph.com/rishabh-2020-09-11_15:22:22-fs-wip-rishabh-15070-distro-basic-smithi/
strays - https://pulpito.ceph.com/rishabh-2020-09-11_15:22:16-fs-wip-rishabh-15070-distro-basic-smithi/
failover - https://pulpito.ceph.com/rishabh-2020-09-11_17:24:36-fs-wip-rishabh-15070-distro-basic-smithi/

admin - https://pulpito.ceph.com/rishabh-2020-09-11_15:21:49-fs-wip-rishabh-15070-distro-basic-smithi/
multifs kcephfs - https://pulpito.ceph.com/rishabh-2020-09-11_16:15:00-kcephfs-wip-rishabh-15070-distro-basic-smithi/
multifs fuse - https://pulpito.ceph.com/rishabh-2020-09-11_16:31:01-fs-wip-rishabh-15070-distro-basic-smithi/

There are a few failures on quota, just like in the last batch of tests (the failures aren't related). multifs kcephfs has failures too, but those are not due to the tests; it looks like a few smithi machines didn't respond in time.

ajarr · 2020-09-11T18:31:15Z

Thanks, Rishabh!

rishabh-d-dave · 2020-09-11T18:31:58Z

@ajarr @batrick Thanks for all the review and help. :D

rishabh-d-dave requested review from batrick and fullerdj January 9, 2020 17:21

rishabh-d-dave added cephfs Ceph File System feature mon needs-test labels Jan 9, 2020

rishabh-d-dave force-pushed the wip-djf-15070 branch from 87c6368 to 99e732c Compare January 9, 2020 17:25

rishabh-d-dave commented Jan 9, 2020 on src/mds/Server.cc

fullerdj mentioned this pull request Jan 13, 2020

[DNM] mon, cephfs: Add auth caps for CephFS fsids #26855

Closed

batrick requested changes Jan 16, 2020 on src/mds/Server.cc

batrick added this to the octopus milestone Jan 24, 2020

rishabh-d-dave force-pushed the wip-djf-15070 branch from 99e732c to ac535aa Compare January 27, 2020 12:45

rishabh-d-dave force-pushed the wip-djf-15070 branch 3 times, most recently from 78c6eb8 to 5a6df79 Compare February 11, 2020 17:38

rishabh-d-dave added the wip-rishabh-testing Rishabh's testing label label Feb 12, 2020

rishabh-d-dave removed the wip-rishabh-testing Rishabh's testing label label Feb 12, 2020

rishabh-d-dave force-pushed the wip-djf-15070 branch 5 times, most recently from a3d13e4 to 1b02da6 Compare February 17, 2020 15:07

batrick removed this from the octopus milestone Feb 19, 2020

rishabh-d-dave added 4 commits September 10, 2020 17:10

qa/cephfs: add a method to destroy fs in filesystem.py

ee6e229

Also add a reset_obj_attrs parameter to it so that the caller of the method can choose to destroy the Ceph FS represented by the object without disturbing the object attributes. Signed-off-by: Rishabh Dave <ridave@redhat.com>

qa/cephfs: modify recreate() in filesystem.py

a7eaec9

Modify the cephfs.filesystem.Filesystem.recreate() method to delete only the FS represented by the object instead of deleting every FS on the Ceph cluster. Signed-off-by: Rishabh Dave <ridave@redhat.com>
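
The two commits above (a destroy method with reset_obj_attrs, and a recreate() that touches only its own FS) can be sketched together. `SketchCluster` and `SketchFilesystem` are hypothetical in-memory stand-ins, not the classes in qa/tasks/cephfs/filesystem.py.

```python
class SketchCluster:
    """Hypothetical in-memory stand-in for the file systems on a cluster."""

    def __init__(self):
        self.filesystems = {}  # name -> fscid
        self._next_id = 1

    def create(self, name):
        self.filesystems[name] = self._next_id
        self._next_id += 1
        return self.filesystems[name]

    def remove(self, name):
        del self.filesystems[name]


class SketchFilesystem:
    def __init__(self, cluster, name):
        self.cluster = cluster
        self.name = name
        self.id = cluster.create(name)

    def destroy(self, reset_obj_attrs=True):
        """Remove only this FS; optionally keep the object's attributes."""
        self.cluster.remove(self.name)
        if reset_obj_attrs:
            self.id = None
            self.name = None

    def recreate(self):
        """Delete and re-create this FS without touching any other FS."""
        self.destroy(reset_obj_attrs=False)
        self.id = self.cluster.create(self.name)


cluster = SketchCluster()
fs_a = SketchFilesystem(cluster, "a")
fs_b = SketchFilesystem(cluster, "b")
fs_a.recreate()  # "b" must survive this call
```

Passing `reset_obj_attrs=False` is what lets `recreate()` reuse the object's name after the FS itself is gone.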

rishabh-d-dave added 2 commits September 10, 2020 23:56

qa/cephfs: modify delete_all_filesystems() in filesystem.py

04ed58f

Modify filesystem.Filesystem.delete_all_filesystems() method to make it more succinct, move it to class MDSCluster instead and update every call to it accordingly. Signed-off-by: Rishabh Dave <ridave@redhat.com>

qa/cephfs: add methods to read/write on CephFS mounts

3f0284f

Signed-off-by: Rishabh Dave <ridave@redhat.com>

rishabh-d-dave commented Sep 11, 2020 on qa/tasks/cephfs/test_multifs_auth.py

rishabh-d-dave added 3 commits September 11, 2020 18:02

qa/cephfs: add tests for "fs authorize" subcommand

995c736

Making caps FS-specific affects the "fs authorize" subcommand. Let's add a few tests to verify its behaviour. Signed-off-by: Rishabh Dave <ridave@redhat.com>

rishabh-d-dave commented Sep 11, 2020 on qa/suites/fs/multifs/tasks/multifs-auth.yaml

tchaikov mentioned this pull request Sep 12, 2020

Revert "mon, cephfs: Add auth caps for CephFS fsids" #37122

Closed

This was referenced Feb 4, 2021

nautilus: mgr/volume: subvolume auth_id management and few bug fixes #39292

Merged

octopus: mgr/volume: subvolume auth_id management and few bug fixes #39390

Merged

rishabh-d-dave mentioned this pull request May 9, 2022

qa/cephfs: fix minor bug in caps_helper.py's run_mon_cap_tests() #46168

Merged

14 tasks

9 participants
