
Revert "quincy: ceph_fs.h: add separate owner_{u,g}id fields"#54108

Merged
yuriw merged 1 commit into quincy from revert-53139-wip-62571-quincy
Oct 23, 2023
Conversation

@vshankar
Contributor

@vshankar vshankar commented Oct 19, 2023

Reverts #53139

Revert "quincy: ceph_fs.h: add separate owner_{u,g}id fields"

This seems to be causing test failures:

        https://pulpito.ceph.com/yuriw-2023-10-19_16:09:31-smoke-quincy-release-distro-default-smithi/7432399

Where the MDS crashes with

In function 'CInode* Server::prepare_new_inode(MDRequestRef&, CDir*, inodeno_t, unsigned int, const file_layout_t*)'
3449: FAILED ceph_assert(_inode->gid != (unsigned)-1)

1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x7f991631c89e]
2: /usr/lib/ceph/libceph-common.so.2(+0x27eab0) [0x7f991631cab0]
3: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t const*)+0x2791) [0x55563bec4831]
4: (Server::handle_client_mkdir(boost::intrusive_ptr<MDRequestImpl>&)+0x1ec) [0x55563bede88c]

Revert this change till the failure is fixed.

@vshankar vshankar added the cephfs Ceph File System label Oct 19, 2023
@vshankar vshankar requested a review from a team October 19, 2023 17:09
@github-actions github-actions bot added this to the quincy milestone Oct 19, 2023
@vshankar vshankar force-pushed the revert-53139-wip-62571-quincy branch from df24f7a to cca2217 on October 19, 2023 17:12
This seems to be causing test failures:

        https://pulpito.ceph.com/yuriw-2023-10-19_16:09:31-smoke-quincy-release-distro-default-smithi/7432399

Where the MDS crashes with

In function 'CInode* Server::prepare_new_inode(MDRequestRef&, CDir*, inodeno_t, unsigned int, const file_layout_t*)'
3449: FAILED ceph_assert(_inode->gid != (unsigned)-1)

1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x7f991631c89e]
2: /usr/lib/ceph/libceph-common.so.2(+0x27eab0) [0x7f991631cab0]
3: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t const*)+0x2791) [0x55563bec4831]
4: (Server::handle_client_mkdir(boost::intrusive_ptr<MDRequestImpl>&)+0x1ec) [0x55563bede88c]

Revert this change till the failure is fixed.

Signed-off-by: Venky Shankar <vshankar@redhat.com>
@vshankar
Contributor Author

Test runs in ~3hrs.

@mihalicyn
Contributor

Test runs in ~3hrs.

Ugh. So this failure is hard to reproduce and shows up only after many iterations, right?

This assertion was added in my PR. Before that we had no analogous assertion in this place. I will take a closer look at the test code and try to understand what happens here.

@vshankar
Contributor Author

jenkins test make check

@vshankar
Contributor Author

Test runs in ~3hrs.

Ugh. So this failure is hard to reproduce and shows up only after many iterations, right?

Not really. We saw this in the smoke tests and not in the fs suite. So there is something the smoke tests exercise that the fs suite tests don't - I'm going to audit that. We probably didn't see this in the main branch run since smoke tests are not run there :/

This assertion was added in my PR. Before that we had no analogous assertion in this place. I will take a closer look at the test code and try to understand what happens here.

ACK. Thx!

@mihalicyn
Contributor

Ok, I have analyzed the ceph code and the test code too. So far I haven't found anything suspicious that could make gid == -1.

But there is one serious concern about this ceph_assert. During review it was proposed to add a sanity check for the uid/gid values (https://github.com/ceph/ceph/pull/52575/files#r1285297994), and I added ceph_assert(_inode->uid != (unsigned)-1);. I now think adding an assert here was a bad idea, because an assert, AFAIU, kills the whole MDS server, right? At the same time, these uid/gid fields are fully under the client's control, which means a malicious client can send -1 on purpose and crash the entire MDS. My suggestion is to remove these assertions from the request-handling code, as they are unsafe.
@lxbsz Xiubo, what do you think about it?

Probably we need to revert #53138 for now too.

Speaking of the smoke test failure, I wasn't able to understand how -1 ends up in the owner_gid field. Is it possible to rerun these tests with request dump/debug logging enabled for the MDS?

Judging from the test logs, we have 5.4.0-165-generic (an Ubuntu kernel?) and the cephfs kernel client. That means the cephfs client is old (without the new feature that adds owner_uid/gid field support).

@vshankar
Contributor Author

Ok, I have analyzed the ceph code and the test code too. So far I haven't found anything suspicious that could make gid == -1.

But there is one serious concern about this ceph_assert. During review it was proposed to add a sanity check for the uid/gid values (https://github.com/ceph/ceph/pull/52575/files#r1285297994), and I added ceph_assert(_inode->uid != (unsigned)-1);. I now think adding an assert here was a bad idea, because an assert, AFAIU, kills the whole MDS server, right?

Correct.

At the same time, these uid/gid fields are fully under the client's control, which means a malicious client can send -1 on purpose and crash the entire MDS. My suggestion is to remove these assertions from the request-handling code, as they are unsafe. @lxbsz Xiubo, what do you think about it?

Probably we need to revert #53138 for now too.

See my comment below - this failure isn't seen in the main branch smoke tests.

Speaking of the smoke test failure, I wasn't able to understand how -1 ends up in the owner_gid field. Is it possible to rerun these tests with request dump/debug logging enabled for the MDS?

We have a core. With some effort we can setup an env and gdb the core to examine frames.

Judging from the test logs, we have 5.4.0-165-generic (an Ubuntu kernel?) and the cephfs kernel client. That means the cephfs client is old (without the new feature that adds owner_uid/gid field support).

@mihalicyn I haven't looked at the failures closely, but the very recent smoke run on the main branch doesn't have this failure - https://pulpito.ceph.com/?branch=wip-vshankar-testing-20231018.065603 - it's possible that it's something in quincy that's tripping it off...

Just FYI - I'm on vacation next week, so won't be able to look much into this.

@mihalicyn
Contributor

Ok, I have analyzed the ceph code and the test code too. So far I haven't found anything suspicious that could make gid == -1.
But there is one serious concern about this ceph_assert. During review it was proposed to add a sanity check for the uid/gid values (https://github.com/ceph/ceph/pull/52575/files#r1285297994), and I added ceph_assert(_inode->uid != (unsigned)-1);. I now think adding an assert here was a bad idea, because an assert, AFAIU, kills the whole MDS server, right?

Correct.

Ok, then I need to send a new PR to fix that and replace ceph_assert with something like this (an existing example from the prepare_new_inode function):

  if (useino && useino != _inode->ino) {
    dout(0) << "WARNING: client specified " << useino << " and i allocated " << _inode->ino << dendl;
    mds->clog->error() << mdr->client_request->get_source()
       << " specified ino " << useino
       << " but mds." << mds->get_nodeid() << " allocated " << _inode->ino;
    //ceph_abort(); // just for now.
  }

So we can replace ceph_assert(_inode->gid != (unsigned)-1) with:

  if (_inode->gid == (unsigned)-1) {
    dout(0) << "WARNING: client specified gid = " << _inode->gid << " for ino " << _inode->ino << dendl;
    mds->clog->error() << mdr->client_request->get_source()
       << " specified gid = " << _inode->gid << " for ino " << _inode->ino;
  }

Do I need to create a new issue in the ceph tracker, or can I send the PR and reuse the old https://tracker.ceph.com/issues/62217 ?
I'll also need to create backport PRs for it after it is merged to the main branch.

Speaking of the smoke test failure, I wasn't able to understand how -1 ends up in the owner_gid field. Is it possible to rerun these tests with request dump/debug logging enabled for the MDS?

We have a core. With some effort we can setup an env and gdb the core to examine frames.

Great! It would be awesome to analyze the core.

I haven't looked at the failures closely, but the very recent smoke run on the main branch doesn't have this failure -

That's very interesting information...

it's possible that it's something in quincy that's tripping it off...

Yep, it's possible. A buffer overflow somewhere, or something else. Ugh. :(

By the way, how many smoke test runs have we had for the quincy branch? Only one? Or more?
If more, the question is whether all of the runs failed on this assertion.

Just FYI - I'm on vacation next week, so won't be able to look much into this.

Have a great vacation, Venky!

I'll start by replacing the unsafe ceph_assert with something safer then.

mihalicyn added a commit to mihalicyn/ceph that referenced this pull request Oct 20, 2023
Having ceph_assert calls in Server::prepare_new_inode is unsafe
when the assertion depends on client-provided data.
A malicious client can easily crash the MDS on purpose.

Let's replace the two ceph_assert calls for the owner_{u,g}id fields
with log prints.

See also:
ceph#54108

Fixes: commit 46cb244 ("ceph_fs.h: add separate owner_{u,g}id fields")
Fixes: https://tracker.ceph.com/issues/62217

Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
@vshankar
Contributor Author

Ok, then I need to send a new PR to fix that and replace ceph_assert with something like this (an existing example from the prepare_new_inode function):

I meant to say that the assert will obviously crash the MDS, but I'm not sure whether the assert was incorrectly placed - that would require carefully looking at the original PR. I can do that when I'm back.

@vshankar
Contributor Author

I'll start by replacing the unsafe ceph_assert with something safer then.

You could run this through the smoke test in quincy and see what shows up - that would be interesting.

By the way, how many smoke test runs have we had for the quincy branch? Only one? Or more?
If more, the question is whether all of the runs failed on this assertion.

The automatic cron jobs that trigger smoke runs were disabled; the run was triggered manually, and that's when we saw this. Those cron jobs have been re-enabled now, so we can see if the crash is reproducible.

@vshankar
Contributor Author

Reviewed-by: Venky Shankar vshankar@redhat.com

@yuriw yuriw requested a review from mchangir October 21, 2023 14:28
Contributor

@mchangir mchangir left a comment


Reviewed-by: Milind Changire mchangir@redhat.com

@mihalicyn
Contributor

mihalicyn commented Oct 23, 2023

I think I found where the issue is: the two recent protocol extensions cbd7e30 and 46cb244 do not handle very old ceph clients.

Let me show that. We have (https://github.com/ceph/ceph/blob/c672d7aab0240cefa8a3decdffd498e0ade3d5be/src/messages/MClientRequest.h#L230C13-L230C13):

  void decode_payload() override { // decoder for messages coming from the client
    using ceph::decode;
    auto p = payload.cbegin();

    if (header.version >= 4) { // relatively new ceph clients (after the "version" field was added to struct ceph_mds_request_head)
      decode(head, p); // this path checks h.version and correctly handles the 32-bit ext_num_fwd and owner_uid/gid fields
    } else { // old clients that send struct ceph_mds_request_head_legacy, which does *not* contain the "version" field
      struct ceph_mds_request_head_legacy old_mds_head;

      decode(old_mds_head, p); // <<< here we have no fallback code that fills the new fields ext_num_retry, ext_num_fwd, owner_uid, owner_gid with data from the old fields num_retry, num_fwd, caller_uid, caller_gid
      copy_from_legacy_head(&head, &old_mds_head);
      head.version = 0;

      /* Can't set the btime from the legacy struct */
      if (head.op == CEPH_MDS_OP_SETATTR) {
        int localmask = head.args.setattr.mask;

        localmask &= ~CEPH_SETATTR_BTIME;

        head.args.setattr.btime = { ceph_le32(0), ceph_le32(0) };
        head.args.setattr.mask = localmask;
      }
    }

cc @lxbsz

I'm ready to prepare fixes for both the CEPHFS_FEATURE_32BITS_RETRY_FWD and CEPHFS_FEATURE_HAS_OWNER_UIDGID features. The only reason it failed on CEPHFS_FEATURE_HAS_OWNER_UIDGID is that we added an assertion in the code. We don't have any analogous assertions for CEPHFS_FEATURE_32BITS_RETRY_FWD, so it's hard to notice any issues there.

The reproducer in the smoke test uses Linux kernel 5.4, which speaks the very old cephfs protocol with struct ceph_mds_request_head_legacy.

@mihalicyn
Contributor

mihalicyn commented Oct 23, 2023

I was able to reproduce this issue with Linux kernel 5.4.240 and the ceph version from the main branch.
I'm working on a fix and will create a tracker issue.

We also need to fix it for the reef branch.

cc @yuriw @batrick @mchangir

@mihalicyn
Contributor

Fixed in #54149
