Skip to content

mds: rdlock_path_xlock_dentry supports returning auth target inode#49691

Merged
vshankar merged 2 commits intoceph:mainfrom
zhsgao:file_open_crash_mds
May 18, 2023
Merged

mds: rdlock_path_xlock_dentry supports returning auth target inode#49691
vshankar merged 2 commits intoceph:mainfrom
zhsgao:file_open_crash_mds

Conversation

@zhsgao
Copy link
Contributor

@zhsgao zhsgao commented Jan 10, 2023

Server::handle_client_openc will call Server::handle_client_open if the target inode exists. Server::handle_client_open requires the target inode to be auth if client sets O_WRONLY or O_TRUNC, but Server::rdlock_path_xlock_dentry can not ensure that. The assertion that the target inode is auth in Server::handle_client_open may fail and it will crash mds. So add a parameter to to support returning auth target inode if it exists and forward the request if necessary. The tail dentry may not be auth anymore when an auth target inode is provided as required by the caller, because we can't require both of them to be auth in one mds.
This bug will only be triggered in kernel client, because ceph-fuse client will try to lookup the target before sending a create request to mds and it can send an open request instead if the target exists. But kernel client will send the create request to mds without lookup first because atomic open is supported to improve performance.

Fixes: https://tracker.ceph.com/issues/58411
Signed-off-by: Zhansong Gao zhsgao@hotmail.com

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@github-actions github-actions bot added the cephfs Ceph File System label Jan 10, 2023
@zhsgao zhsgao force-pushed the file_open_crash_mds branch from 99274eb to 97b4f1a Compare January 10, 2023 14:29
@zhsgao zhsgao force-pushed the file_open_crash_mds branch from 97b4f1a to 109b698 Compare January 18, 2023 05:06
@vshankar vshankar requested a review from a team January 19, 2023 13:13
@vshankar
Copy link
Contributor

Going through the changes now, but this requires a test case - please add one.

@zhsgao zhsgao force-pushed the file_open_crash_mds branch from 109b698 to c2c8c42 Compare January 20, 2023 06:23
if (pdnvec)
pdnvec->clear(); // do not confuse likes of rdlock_path_pin_ref();
if (pin)
*pin = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a related change to the bug? If not, please have this as a separate commit (and explain why resetting *pin is correct or if it does not matter).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a related change to the bug? If not, please have this as a separate commit (and explain why resetting *pin is correct or if it does not matter).

pdnvec and pin are used to return values to the caller and it is better to clear them when an error needs to be returned. pdnvec is cleared but pin is not in some places where errors need to be returned, so I add them. It is not related to this bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK - thanks for explaining. Better to have this as a separate commit for clarity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK - thanks for explaining. Better to have this as a separate commit for clarity.

OK, I have deleted them.

Comment on lines +8451 to +8452
// do not xlock the tail dentry if target inode exists and caller wants it
if (xlock_dentry && (dnl->is_null() || !want_inode) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't the dentry linkage need to be valid here as well ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't the dentry linkage need to be valid here as well ?

The tail dentry can be null and we can also create one if it doesn't exist. The caller gets a null tail dentry and then links it with an inode, a file or hard link can be created. And we need to provide a xlocked auth tail dentry to the caller because it needs to be modified.

@vshankar
Copy link
Contributor

@zhsgao Mind adding a test case for the fix?

@github-actions github-actions bot added the tests label Feb 3, 2023
@zhsgao
Copy link
Contributor Author

zhsgao commented Feb 3, 2023

@zhsgao Mind adding a test case for the fix?

I have added it in a new commit. This bug will only be triggered in kernel client and ceph-fuse client can handle it normally, so add --kclient when testing.

@zhsgao zhsgao force-pushed the file_open_crash_mds branch from 6b6cd5d to 44b85b9 Compare February 5, 2023 08:18
@zhsgao
Copy link
Contributor Author

zhsgao commented Feb 6, 2023

A bug in the test framework of kernel client make the test unable to run, use PR #50001 to fix it.

@zhsgao zhsgao force-pushed the file_open_crash_mds branch 5 times, most recently from f4210c2 to e5cf3f5 Compare February 13, 2023 03:50
@vshankar
Copy link
Contributor

Looks fine. Added this is my test backlog for integration test runs.

@vshankar vshankar added the wip-vshankar-backlog Backlog of CephFS PRs to pick for testing label Feb 15, 2023
@vshankar vshankar added wip-vshankar-testing3 and removed wip-vshankar-backlog Backlog of CephFS PRs to pick for testing labels Feb 28, 2023
@vshankar
Copy link
Contributor

@vshankar
Copy link
Contributor

https://pulpito.ceph.com/vshankar-2023-03-16_06:43:29-fs-wip-vshankar-testing-20230315.085747-testing-default-smithi/

@zhsgao there are a couple of new failures with this change. I'll debug those. The good thing is that most other tests run fine.

@zhsgao
Copy link
Contributor Author

zhsgao commented Mar 27, 2023

https://pulpito.ceph.com/vshankar-2023-03-16_06:43:29-fs-wip-vshankar-testing-20230315.085747-testing-default-smithi/

@zhsgao there are a couple of new failures with this change. I'll debug those. The good thing is that most other tests run fine.

OK, I'll also take a look at these failures. Thanks!

@vshankar
Copy link
Contributor

@zhsgao Looks like this change is causing this test failure: https://pulpito.ceph.com/vshankar-2023-03-16_06:43:29-fs-wip-vshankar-testing-20230315.085747-testing-default-smithi/7210002/

The MDS crashes with the following backtrace:

    -1> 2023-03-23T13:41:14.860+0000 7f304a650700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-3054-gf194b10e/
rpm/el8/BUILD/ceph-18.0.0-3054-gf194b10e/src/mds/Mutation.cc: In function 'void MutationImpl::drop_local_auth_pins()' thread 7f304a650700 time 2023-03-23T13:41:14.858732+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-3054-gf194b10e/rpm/el8/BUILD/ceph-18.0.0-3054-gf194b10e/src/mds/Muta
tion.cc: 195: FAILED ceph_assert(p.first->is_auth())

 ceph version 18.0.0-3054-gf194b10e (f194b10e62aa44419e63d7e7068b1c9dc1e255b5) reef (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f3055f08dbf]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9f85) [0x7f3055f08f85]
 3: ceph-mds(+0x2b547f) [0x55c6c450147f]
 4: (MDCache::request_cleanup(boost::intrusive_ptr<MDRequestImpl>&)+0xe6) [0x55c6c4569ef6]
 5: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x7aa) [0x55c6c447843a]
 6: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x350) [0x55c6c447ac30]
 7: (Server::handle_client_getattr(boost::intrusive_ptr<MDRequestImpl>&, bool)+0x6e6) [0x55c6c4494586]
 8: (MDSContext::complete(int)+0x5f) [0x55c6c475d45f]
 9: (MDSRank::_advance_queues()+0xaa) [0x55c6c440ff3a]
 10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1ce) [0x55c6c4410a2e]
 11: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x5c) [0x55c6c441141c]
 12: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1bf) [0x55c6c43fadcf]
 13: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x478) [0x7f3056181d78]
 14: (DispatchQueue::entry()+0x50f) [0x7f305617ef1f]
 15: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f3056245511]
 16: /lib64/libpthread.so.0(+0x81cf) [0x7f3054cb11cf]
 17: clone()

Seems to be related to this change as the assert failure is related to an auth check:

src/mds/Mutation.cc: 195: FAILED ceph_assert(p.first->is_auth())

@vshankar vshankar added the DNM label Mar 30, 2023
@vshankar
Copy link
Contributor

zhsgao added 2 commits March 31, 2023 18:29
`Server::handle_client_openc` will call `Server::handle_client_open`
if the target inode exists. `Server::handle_client_open requires`
the target inode to be auth if client sets `O_WRONLY` or `O_TRUNC`,
but `Server::rdlock_path_xlock_dentry` can not ensure that. The assertion
that the target inode is auth in `Server::handle_client_open` may fail
and it will crash mds. So add a parameter to to support returning
auth target inode if it exists and forward the request if necessary.
The tail dentry may not be auth anymore when an auth target inode
is provided as required by the caller, because we can't require
both of them to be auth in one mds.
This bug will only be triggered in kernel client, because ceph-fuse
client will try to lookup the target before sending a create request
to mds and it can send an open request instead if the target exists.
But kernel client will send the create request to mds without lookup
first because `atomic open` is supported to improve performance.

Fixes: https://tracker.ceph.com/issues/58411
Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
…me mds as the inode

Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
@zhsgao zhsgao force-pushed the file_open_crash_mds branch from e5cf3f5 to 9945f3b Compare March 31, 2023 10:31
@zhsgao
Copy link
Contributor Author

zhsgao commented Mar 31, 2023

@zhsgao Looks like this change is causing this test failure: https://pulpito.ceph.com/vshankar-2023-03-16_06:43:29-fs-wip-vshankar-testing-20230315.085747-testing-default-smithi/7210002/

The MDS crashes with the following backtrace:

    -1> 2023-03-23T13:41:14.860+0000 7f304a650700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-3054-gf194b10e/
rpm/el8/BUILD/ceph-18.0.0-3054-gf194b10e/src/mds/Mutation.cc: In function 'void MutationImpl::drop_local_auth_pins()' thread 7f304a650700 time 2023-03-23T13:41:14.858732+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-3054-gf194b10e/rpm/el8/BUILD/ceph-18.0.0-3054-gf194b10e/src/mds/Muta
tion.cc: 195: FAILED ceph_assert(p.first->is_auth())

 ceph version 18.0.0-3054-gf194b10e (f194b10e62aa44419e63d7e7068b1c9dc1e255b5) reef (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f3055f08dbf]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9f85) [0x7f3055f08f85]
 3: ceph-mds(+0x2b547f) [0x55c6c450147f]
 4: (MDCache::request_cleanup(boost::intrusive_ptr<MDRequestImpl>&)+0xe6) [0x55c6c4569ef6]
 5: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x7aa) [0x55c6c447843a]
 6: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x350) [0x55c6c447ac30]
 7: (Server::handle_client_getattr(boost::intrusive_ptr<MDRequestImpl>&, bool)+0x6e6) [0x55c6c4494586]
 8: (MDSContext::complete(int)+0x5f) [0x55c6c475d45f]
 9: (MDSRank::_advance_queues()+0xaa) [0x55c6c440ff3a]
 10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x1ce) [0x55c6c4410a2e]
 11: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x5c) [0x55c6c441141c]
 12: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1bf) [0x55c6c43fadcf]
 13: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x478) [0x7f3056181d78]
 14: (DispatchQueue::entry()+0x50f) [0x7f305617ef1f]
 15: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f3056245511]
 16: /lib64/libpthread.so.0(+0x81cf) [0x7f3054cb11cf]
 17: clone()

Seems to be related to this change as the assert failure is related to an auth check:

src/mds/Mutation.cc: 195: FAILED ceph_assert(p.first->is_auth())

I have fixed it. I didn't check the code logic related to snapshot before. Thanks very much!

@vshankar
Copy link
Contributor

I have fixed it. I didn't check the code logic related to snapshot before. Thanks very much!

Could you also explain a bit about the bug and/or the commit diff that fixes this? (makes it easier to review rather than going through large parts of changes over again)

@zhsgao
Copy link
Contributor Author

zhsgao commented Mar 31, 2023

I have fixed it. I didn't check the code logic related to snapshot before. Thanks very much!

Could you also explain a bit about the bug and/or the commit diff that fixes this? (makes it easier to review rather than going through large parts of changes over again)

The crash occurs because MDCache::path_traverse does not forward the request as the caller expects when the target is a non-auth snapshot, so the auth assertion after will fail. target_inode needs to be set when the target exists but is not actually set if target is a snapshot, so the request will not be forwarded.

@vshankar
Copy link
Contributor

jenkins test make check arm64

@vshankar
Copy link
Contributor

vshankar commented May 9, 2023

@vshankar
Copy link
Contributor

Here are the test runs: https://pulpito.ceph.com/?branch=wip-vshankar-testing-20230509.090020

There are a bit more failures in the fs suite than we normally run into (known issues, teuthology infra issues, etc..). I'm going to debug the new failures (mostly workunit failures and dead jobs due to timeout).

@vshankar
Copy link
Contributor

@zhsgao I debugged the failed test and I think there is a bug in path_traverse but I need to confirm. Will update soon...

@zhsgao
Copy link
Contributor Author

zhsgao commented May 15, 2023

@zhsgao I debugged the failed test and I think there is a bug in path_traverse but I need to confirm. Will update soon...

OK! If the bug is confirmed, I will try to fix it as soon as possible. Thanks!

@vshankar
Copy link
Contributor

@zhsgao I debugged the failed test and I think there is a bug in path_traverse but I need to confirm. Will update soon...

OK! If the bug is confirmed, I will try to fix it as soon as possible. Thanks!

Looks like this change was not causing the failures. I was wrong. Will update in a while.

@vshankar
Copy link
Contributor

@zhsgao I debugged the failed test and I think there is a bug in path_traverse but I need to confirm. Will update soon...

OK! If the bug is confirmed, I will try to fix it as soon as possible. Thanks!

Looks like this change was not causing the failures. I was wrong. Will update in a while.

False alarm. No related failures found with this change.

Copy link
Contributor

@vshankar vshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vshankar vshankar merged commit 7b29685 into ceph:main May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cephfs Ceph File System tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants