
mds: prevent clients from exceeding the xattrs key/value limits #46357

Merged
vshankar merged 5 commits into ceph:main from luis-henrix:impose-xattrs-limits
Mar 8, 2023

Conversation

@luis-henrix

A client is allowed to set as many xattrs on an inode as it wants, as long as it holds CAP_XATTR_EXCL. This allows it to buffer the xattrs and send them to the MDS later on, which will eventually crash the MDS (and possibly also require recovering the filesystem?).

There was an attempt a long time ago to fix this in https://tracker.ceph.com/issues/19033, which resulted in commit eb915d0 ("cephfs: fix write_buf's _len overflow problem"). That fix added a maximum size for the xattrs (keys + values), but the maximum is only enforced when a client uses the synchronous MDS_OP_SETXATTR operation.

Here's a small test case:

```
# dd if=/dev/zero bs=1 count=65536 2> /dev/null | attr -s myattr /mnt/myfile
attr_set: No space left on device
Could not set "myattr" for /mnt/myfile
# dd if=/dev/zero bs=1 count=65536 2> /dev/null | attr -s myattr /mnt/myfile
Attribute "myattr" set to a 65536 byte value for /mnt/myfile:
#
```

Note that this PR is just an RFC; I'm really not sure what's the best way to fix this issue: a user that sets an attribute on a file doesn't get any error at all. So maybe we should simply not buffer xattrs on the client. Also, I think the MDS will need to take some further actions after detecting the problem: besides dropping the xattrs, it probably needs to force the revocation of caps from the client so that it can get an update.

Fixes: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques lhenriques@suse.de
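As an aside, the size accounting the MDS needs on every xattr-mutating path (both the sync MDS_OP_SETXATTR handler and the flush of client-buffered xattrs) can be sketched as below. This is a minimal illustration only: the function names and the 64 KiB default are invented here, not Ceph's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Illustrative stand-in for the mdsmap's max_xattr_size knob.
constexpr std::size_t kMaxXattrSize = 64 * 1024;

// Total bytes consumed by an inode's xattr blob (keys + values).
std::size_t xattr_blob_size(const std::map<std::string, std::string>& xattrs) {
  std::size_t total = 0;
  for (const auto& [key, value] : xattrs)
    total += key.size() + value.size();
  return total;
}

// Returns true if setting (key, value) keeps the blob within the limit;
// replacing an existing key only accounts for the size delta.
bool xattr_set_allowed(const std::map<std::string, std::string>& xattrs,
                       const std::string& key, const std::string& value,
                       std::size_t max_size = kMaxXattrSize) {
  std::size_t cur = xattr_blob_size(xattrs);
  auto it = xattrs.find(key);
  if (it != xattrs.end())
    cur -= it->first.size() + it->second.size();
  return cur + key.size() + value.size() <= max_size;
}
```

The essence of the fix is running this kind of check on the buffered (CAP_XATTR_EXCL) path too, not only on the sync path.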

@luis-henrix luis-henrix marked this pull request as draft May 20, 2022 11:49
@github-actions github-actions bot added the cephfs Ceph File System label May 20, 2022
@lxbsz lxbsz requested a review from a team May 23, 2022 01:30
@vshankar vshankar requested a review from a team May 24, 2022 07:41
@luis-henrix luis-henrix force-pushed the impose-xattrs-limits branch from 741f8ba to 4fbe62b Compare May 25, 2022 17:17
@luis-henrix
Author

I've just pushed another iteration of this PR. I think I've implemented all the bits. The thing I'm almost sure isn't 100% right is how to avoid granting 'Xx' caps to clients that don't support the new feature. It seems to be working, but the MDS code isn't really something I feel comfortable changing. More knowledgeable people are welcome to jump in.

I'll send out the companion kernel client RFC patch shortly.
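For what it's worth, the "don't grant 'Xx' to old clients" idea boils down to masking a cap bit on a client-feature check. A minimal sketch follows; the feature bit and cap constants here are invented for illustration and do not match Ceph's real definitions:

```cpp
#include <cassert>
#include <cstdint>

// Invented constants for illustration only.
constexpr uint64_t FEATURE_MAX_XATTR_LIMIT = 1ull << 18;  // hypothetical client feature bit
constexpr uint32_t CAP_XATTR_SHARED = 1u << 4;            // hypothetical "Xs"
constexpr uint32_t CAP_XATTR_EXCL   = 1u << 5;            // hypothetical "Xx"

// Mask out the exclusive-xattr cap for clients that don't advertise the
// feature, so they can never buffer xattrs past the new limit.
uint32_t filter_caps(uint32_t wanted, uint64_t client_features) {
  if (!(client_features & FEATURE_MAX_XATTR_LIMIT))
    wanted &= ~CAP_XATTR_EXCL;
  return wanted;
}
```

An old client then falls back to sync setxattr operations, where the limit was already enforced.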

@luis-henrix luis-henrix requested review from lxbsz and vshankar May 25, 2022 17:18
@djgalloway djgalloway changed the base branch from master to main May 25, 2022 19:57
@luis-henrix
Author

Thanks for your review, @lxbsz. I'll update the code accordingly. Eventually, the new feature bit may be dropped and the mdsmap may be used instead, as suggested on the mailing list.

@luis-henrix
Author

Ok, I've pushed yet another draft to fix this issue. It now uses the mdsmap to share the xattrs max size knob with the clients, and this knob is now per-filesystem (not per-MDS).
The issue of silently dropping any xattrs that old clients may have added is still present.

```cpp
}
if (n < MDS_MAX_XATTR_SIZE) {
  ss << var << " must at least " << MDS_MAX_XATTR_SIZE;
  return -ERANGE;
```
Member

Why fail it here? Shouldn't MDS_MAX_XATTR_SIZE just be a default value? Users should be allowed to set any value larger or smaller than it, right?

Author

Sure, I can drop this check. I'm not really sure in which situations admins should really be playing with this setting anyway.

@luis-henrix luis-henrix force-pushed the impose-xattrs-limits branch from 16fb361 to 56a4c74 Compare June 2, 2022 14:30
@luis-henrix luis-henrix marked this pull request as ready for review June 2, 2022 14:30
@luis-henrix luis-henrix requested a review from a team as a code owner June 2, 2022 14:30
@luis-henrix
Author

Pushed a revised version and added a note to the PendingReleaseNotes file, as suggested by @lxbsz. Also opened this PR for review (no longer a "draft").

@lxbsz
Member

lxbsz commented Jun 3, 2022

Hi @luis-henrix , BTW, shouldn't we fix the same issue in the Client.cc in this PR as you did in https://patchwork.kernel.org/project/ceph-devel/list/?series=646916 ?

@luis-henrix
Author

luis-henrix commented Jun 3, 2022

> Hi @luis-henrix , BTW, shouldn't we fix the same issue in the Client.cc in this PR as you did in https://patchwork.kernel.org/project/ceph-devel/list/?series=646916 ?

Good point! I remember trying to reproduce this issue earlier using the fuse client and I wasn't able to, so I assumed it wasn't affected (i.e. I assumed it always did sync setxattr operations). But a quick look at the code seems to indicate otherwise. Let me take a closer look. (And figure out how to debug the fuse client again, as I can never remember how to do that!)

UPDATE: No, I was right in the first place. It looks like the fuse client always uses the CEPH_MDS_OP_SETXATTR (sync) operation and never buffers anything in `in->xattrs`. So, @lxbsz, I guess there's nothing to do in Client.cc.

@github-actions

github-actions bot commented Jun 8, 2022

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@vshankar
Contributor

jenkins test make check

@vshankar
Contributor

jenkins test dashboard cephadm

@github-actions

github-actions bot commented Mar 1, 2023

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@vshankar
Contributor

vshankar commented Mar 1, 2023

@luis-henrix I hope this is the last rebase you have to push for this PR. Tests are running here: https://pulpito.ceph.com/vshankar-2023-02-24_04:31:46-fs-wip-vshankar-testing-20230222.044949-testing-default-smithi/

Luís Henriques added 5 commits March 1, 2023 10:35
This new configuration option allows defining the maximum size for a
filesystem's xattrs blob.  This is a filesystem-wide knob that replaces
the per-MDS mds_max_xattr_pairs_size option.

Note:

The kernel client patch to handle this new configuration was merged before
the corresponding ceph-side pull request.  This was unfortunate because in
the meantime PR ceph#43284 was merged and the encoding/decoding of
'bal_rank_mask' got in between.  Hence 'max_xattr_size' is being
encoded/decoded before 'bal_rank_mask'.

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Commit eb915d0 ("cephfs: fix write_buf's _len overflow problem") added
a limit to the total size of xattrs.  This limit is respected by clients
doing a "sync" operation, i.e. MDS_OP_SETXATTR.  However, clients with
CAP_XATTR_EXCL can still buffer these operations and ignore the limit.

This patch prevents clients from crashing the MDSs by imposing the xattr
limits even when clients hold the Xx caps, and replaces the per-MDS knob
"max_xattr_pairs_size" with a new mdsmap setting that the clients can
access.

Unfortunately, clients that misbehave, i.e. old clients that don't respect
this xattrs limit and buffer their xattrs, will see them vanish.

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <lhenriques@suse.de>
When doing a sync setxattr (MDS_OP_SETXATTR) to set the first xattr on an
inode, it is possible for the client to set a huge xattr key/value that
would exceed the limit.  Prevent this from happening by checking the size
against the maximum allowed size.

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <lhenriques@suse.de>
…p schema

Let the mdsmap schema know about the new field 'max_xattr_size'.  This prevents
the following error:

tasks.mgr.dashboard.helper._ValError: \
  In `input['fs_map']['filesystems'][0]['mdsmap']`: unknown keys: {'max_xattr_size'}

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <lhenriques@suse.de>
…ield

Signed-off-by: Luís Henriques <lhenriques@suse.de>
@luis-henrix
Author

luis-henrix commented Mar 1, 2023 via email

@luis-henrix
Author

I've no idea what these failures are. I've just received an email with this cryptic information:

```
[ceph/ceph] Pull Request Triage workflow run

Repository: ceph/ceph
Workflow: Pull Request Triage
Duration: 20.0 seconds
Finished: 2023-03-01 10:40:04 UTC

View results: https://github.com/ceph/ceph/actions/runs/4302633946

Jobs:

  • pr-triage failed (1 annotation)
```

Do I need to do something here? Or is this still some infrastructure problem?

@lxbsz
Member

lxbsz commented Mar 6, 2023

jenkins test docs

Contributor

@vshankar vshankar left a comment

Tests ran successfully: https://pulpito.ceph.com/vshankar-2023-03-02_09:21:58-fs-wip-vshankar-testing-20230222.044949-testing-default-smithi/

Failures are not related to this PR. This should get merged soon.

@vshankar
Contributor

vshankar commented Mar 6, 2023

jenkins test make check arm64

1 similar comment
@vshankar
Contributor

vshankar commented Mar 7, 2023

jenkins test make check arm64

Contributor

@vshankar vshankar left a comment

@vshankar vshankar merged commit 9623dd2 into ceph:main Mar 8, 2023
@vshankar vshankar mentioned this pull request Mar 8, 2023
vshankar added a commit to vshankar/ceph that referenced this pull request Dec 1, 2023
This is required for a reef client to work with a higher revision
MDS, since essentially, this happens:

reef(client):

```cpp
if (version >= 17) {
    decode(bal_rank_mask, p);
}
```

and higher-revision MDS (say, upcoming squid):

```cpp
version = 17
encode(version, bl);
...
...
encode(max_xattr_size, bl);
encode(bal_rank_mask, bl);
```

The client incorrectly decodes max_xattr_size (type: uint64_t) into
bal_rank_mask (type: string).

This situation ended up due to a couple of reasons:

* the kclient patchset handling `max_xattr_size` was merged early on,
  and another MDS-side change that bumped the MDSMap encoding version
  to 17 (PR ceph#43284) got merged in the meantime. Details in comment:

        ceph#46357 (comment)

* The reef backport for PR ceph#46357 got delayed (and, meanwhile, reef
  branched out).

Which means reef (18.2.0) user-space clients are broken with higher-version
MDSs.

Fixes: https://tracker.ceph.com/issues/63713
Signed-off-by: Venky Shankar <vshankar@redhat.com>
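The decode mismatch described above can be reproduced with a toy little-endian codec. The buffer format here is invented for illustration (Ceph's real encoding uses its bufferlist machinery), but the lesson is the same: the decoder must consume fields in exactly the order the encoder wrote them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Buffer = std::vector<uint8_t>;

void encode_u64(Buffer& bl, uint64_t v) {
  for (int i = 0; i < 8; ++i) bl.push_back(uint8_t((v >> (8 * i)) & 0xff));
}

uint64_t decode_u64(const Buffer& bl, std::size_t& p) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i) v |= uint64_t(bl[p++]) << (8 * i);
  return v;
}

void encode_str(Buffer& bl, const std::string& s) {
  encode_u64(bl, s.size());  // length prefix
  bl.insert(bl.end(), s.begin(), s.end());
}

std::string decode_str(const Buffer& bl, std::size_t& p) {
  uint64_t len = decode_u64(bl, p);
  std::string s(bl.begin() + std::ptrdiff_t(p), bl.begin() + std::ptrdiff_t(p + len));
  p += len;
  return s;
}

// MDS side: max_xattr_size is encoded *before* bal_rank_mask.
Buffer mds_encode(uint64_t max_xattr_size, const std::string& bal_rank_mask) {
  Buffer bl;
  encode_u64(bl, max_xattr_size);
  encode_str(bl, bal_rank_mask);
  return bl;
}
```

A client that decodes bal_rank_mask first misreads the 8 bytes of max_xattr_size as the string's length prefix and then tries to read far past the end of the buffer, which is exactly the reef-client failure mode.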
trociny pushed a commit to trociny/ceph that referenced this pull request Dec 14, 2023
(cherry picked from commit 36ee8e7)
vshankar added a commit to ceph/ceph-ci that referenced this pull request Dec 19, 2023
(cherry picked from commit 36ee8e7)