mds/MDSDaemon: unlock mds_lock while shutting down Beacon and others #60326

Merged: vshankar merged 1 commit into ceph:main from MaxKellermann:MDSDaemon_suicide_unlock on Aug 4, 2025

Conversation

@MaxKellermann (Member) commented Oct 15, 2024

This fixes a deadlock bug during MDS shutdown:

  • the "signal_handler" thread receives the shutdown signal and invokes
    MDSDaemon::suicide() while holding mds_lock

  • MDSDaemon::suicide() invokes Beacon::send_and_wait() while still
    holding mds_lock

  • meanwhile, all "ms_dispatch" threads get stuck waiting for
    mds_lock, for example in MDCache::upkeep_main() or
    MDSDaemon::ms_dispatch2()

  • Beacon::send_and_wait() waits for a MSG_MDS_BEACON packet to be
    dispatched (via cvar with a timeout)

At this point, even if a MSG_MDS_BEACON packet is received by one of
the worker threads, it will be put in the DispatchQueue, but no
dispatcher thread will be able to handle it because they are all
stuck. The cvar.wait_for() call in Beacon::send_and_wait() will
therefore time out and the MSG_MDS_BEACON will never be processed.

The proper solution is to unlock mds_lock so that the dispatchers do
not get stuck. In general, a lock should be held only for as long as
it is strictly needed, and blocking calls should never be made while
holding one.
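
The deadlock and its fix can be reduced to a small standalone sketch. All names here (mds_lock, beacon_lock, dispatcher(), suicide_and_wait()) are hypothetical simplifications of the ones above, not the actual Ceph code: the shutdown path releases the big lock before its blocking wait, so the dispatcher thread that delivers the acknowledgment can still acquire it.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex mds_lock;                // stand-in for the big daemon lock
std::mutex beacon_lock;             // stand-in for Beacon::lock
std::condition_variable beacon_cv;  // stand-in for the cvar in send_and_wait()
bool beacon_acked = false;

// A dispatcher thread must take mds_lock before it can deliver the
// beacon acknowledgment; if the shutdown path still held mds_lock
// here, this thread would block and the ack would never arrive.
void dispatcher() {
  std::lock_guard<std::mutex> l(mds_lock);
  std::lock_guard<std::mutex> b(beacon_lock);
  beacon_acked = true;
  beacon_cv.notify_all();
}

// Models the shutdown path: drop mds_lock *before* the blocking wait
// so the dispatcher can make progress. Returns true if the ack arrived.
bool suicide_and_wait() {
  std::unique_lock<std::mutex> l(mds_lock);
  // ... state changes that need mds_lock happen here ...
  l.unlock();  // the fix: release the big lock before blocking

  std::thread t(dispatcher);
  bool ok;
  {
    std::unique_lock<std::mutex> b(beacon_lock);
    // With mds_lock held across this wait, the timeout branch would be
    // taken every time -- that is the deadlock symptom described above.
    ok = beacon_cv.wait_for(b, std::chrono::seconds(5),
                            [] { return beacon_acked; });
  }
  t.join();
  return ok;
}
```

With the unlock in place the wait completes immediately; commenting out `l.unlock()` reproduces the timeout.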

Fixes: https://tracker.ceph.com/issues/68760

This used to be part of #60320

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

@github-actions bot added the cephfs (Ceph File System) label on Oct 15, 2024
@batrick (Member) left a comment:

Needs a tracker ticket.


@MaxKellermann force-pushed the MDSDaemon_suicide_unlock branch from 60b532c to 4211071 on October 29, 2024 at 20:38
@MaxKellermann force-pushed the MDSDaemon_suicide_unlock branch from 4211071 to c1df8f3 on October 31, 2024 at 10:09
@MaxKellermann (Member, Author) commented:

> Note: MDSRank::damaged needs adjusted too.

True. What a weird piece of code! (And [[noreturn]] is missing.)
Fixing this one would require passing a unique_lock to it and propagating it through all callers, which is a bigger effort.
(Or just call unlock() manually, which would be dirty but perhaps acceptable for this one method?)

@batrick (Member) commented Nov 4, 2024

> Note: MDSRank::damaged needs adjusted too.

> True. What a weird piece of code! (and [[noreturn]] is missing) Fixing this one would require passing a unique_lock to it, propagating it through all callers. That's a bigger effort. (Or just call unlock() manually. Which would be dirty. But acceptable for this one method?)

I think just calling unlock is fine with an appropriate comment. We don't expect the MDS to continue normally so it's fine.

I am wondering about the wisdom of letting other threads in the MDS (like MDCache::upkeeper) continue; not all threads are gated by the MDSDaemon::ms_dispatch2 _DNE check. Probably all points that acquire mds_lock should also check for some "terminate" flag and sit quietly (until the beacon reply is processed and the MDS exits).

@MaxKellermann (Member, Author) commented:

@batrick, it's been a month. Is this going to be merged?

@MaxKellermann (Member, Author) commented:

@batrick, it's been two months now.

@batrick (Member) commented Jan 5, 2025

> @batrick, it's been a month. Is this going to be merged?

Sorry, I think I was waiting to see if you were going to adjust the code to:

  • only unlock after setting the beacon state to _DNE
  • add a comment with explanation

@MaxKellermann (Member, Author) commented:

> Sorry, I think I was waiting to see if you were going to adjust the code to:
>
> * only unlock after setting the beacon to state to _DNE

I don't think that makes sense, because the beacon state is protected by Beacon::lock, not mds_lock. The proper check you're looking for is the stopping flag, which is protected by mds_lock and checked by ms_dispatch2() right before the beacon/DNE check.

> * add a comment with explanation

I added a comment to the unlock() call.
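
The stopping-flag check described above can be reduced to a standalone sketch; dispatch_accepts_work() and begin_shutdown() are hypothetical stand-ins, not the actual ms_dispatch2()/suicide() code:

```cpp
#include <mutex>

std::mutex mds_lock;    // stand-in for the big daemon lock
bool stopping = false;  // set under mds_lock when shutdown begins

// Models the entry check in the dispatcher: take mds_lock and bail out
// once the stopping flag (also protected by mds_lock) has been set.
bool dispatch_accepts_work() {
  std::lock_guard<std::mutex> l(mds_lock);
  if (stopping)
    return false;  // shutdown in progress: drop the message
  // ... normal message handling would happen here ...
  return true;
}

// Models the shutdown path setting the flag under the same lock, so
// dispatchers never observe a half-finished shutdown.
void begin_shutdown() {
  std::lock_guard<std::mutex> l(mds_lock);
  stopping = true;  // after this, dispatchers reject new work
}
```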

@MaxKellermann (Member, Author) commented:

@batrick, do you agree or do you still want me to move the unlock?

@ceph-jenkins (Collaborator) commented:

Can one of the admins verify this patch?

@vshankar requested a review from batrick on February 10, 2025 at 13:22
@MaxKellermann (Member, Author) commented:

It's been nearly half a year since I submitted this PR, and it fixes a very annoying Ceph bug. I'm running out of time to get my Ceph improvements merged.

@batrick (Member) commented May 22, 2025

Sorry again for the wait.

> Sorry, I think I was waiting to see if you were going to adjust the code to:
>
> * only unlock after setting the beacon to state to _DNE

> I don't think that makes sense because the beacon state is protected by Beacon::lock not mds_lock. The proper check you're looking for is the stopping flag; protected by mds_lock and checked by ms_dispatch2() right before the beacon/DNE check.

The concern I have is the race which would exist between unlocking the mds_lock and acquiring the Beacon::mutex. That would allow dispatch threads to come in, check the beacon state is not _DNE and do some work (like process client requests).

We already have a mds_lock -> Beacon::mutex lock dependency and routinely check the beacon with the mds lock held (see MDSDaemon::ms_dispatch2).

> * add a comment with explanation

> I added a comment to the unlock() call.

Thanks.

If you can please just move the unlock to after `beacon.set_want_state(*mdsmap, MDSMap::STATE_DNE);`, I will be happy to approve this.
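
The requested ordering can be sketched as follows. This is a hypothetical simplification, not the MDSDaemon code: the steps vector merely records the order of operations, the point being that the daemon's desired state is marked DNE while mds_lock is still held, and only then is the lock released before the blocking beacon wait.

```cpp
#include <mutex>
#include <string>
#include <vector>

std::mutex mds_lock;             // stand-in for the big daemon lock
std::vector<std::string> steps;  // records the order of the three steps

void suicide() {
  std::unique_lock<std::mutex> l(mds_lock);
  steps.push_back("set_want_state(STATE_DNE)");  // still under mds_lock:
                                                 // no dispatcher can see a
                                                 // live state after unlock
  l.unlock();
  steps.push_back("unlock(mds_lock)");           // dispatchers may now run
  steps.push_back("send_and_wait()");            // blocking wait, lock free
}
```

This closes the race discussed above: there is no window in which a dispatch thread holds mds_lock, sees a non-DNE state, and starts new work after shutdown has begun.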

@MaxKellermann force-pushed the MDSDaemon_suicide_unlock branch from 0695334 to aed19e4 on May 22, 2025 at 13:00
@MaxKellermann (Member, Author) commented:

> If you can please just move the unlock after the beacon.set_want_state(*mdsmap, MDSMap::STATE_DNE); I will be happy to approve this.

Done.

@batrick removed their assignment on May 22, 2025
@batrick (Member) commented May 22, 2025

jenkins test api

@github-actions bot commented:
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions bot added the stale label on Jul 21, 2025
@batrick removed the stale label on Jul 22, 2025
@batrick (Member) commented Jul 22, 2025

@vshankar @mchangir @kotreshhr for your next batch, please.

@vshankar (Contributor) commented:

This PR is under test in https://tracker.ceph.com/issues/72244.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
@MaxKellermann force-pushed the MDSDaemon_suicide_unlock branch from aed19e4 to c0fedb5 on July 30, 2025 at 06:25
@vshankar (Contributor) commented:

putting this to retest

@vshankar (Contributor) left a review comment.

@vshankar (Contributor) commented Aug 4, 2025

jenkins make check arm64

@vshankar (Contributor) commented Aug 4, 2025

jenkins test make check arm64


Labels: cephfs (Ceph File System), needs-qa

Projects: None yet

4 participants