Fix MDS shutdown deadlocks and timeouts #60320

Closed
MaxKellermann wants to merge 4 commits into ceph:main from MaxKellermann:mds_shutdown

Conversation

@MaxKellermann
Member

This PR fixes several bugs that delay the MDS process shutdown by several seconds. With these changes, the MDS process exits almost instantly.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
This eliminates the `wait_for()` delay and speeds up MDS shutdown.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
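The diff for the `wait_for()` commit is not shown in this conversation, so the following is only a generic sketch of how a timed wait can hold up shutdown and how waking the waiter avoids that. `PeriodicSender` and all of its members are hypothetical names, not Ceph code. The sketch assumes a worker thread that sleeps on a `std::condition_variable` between periodic sends.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a periodic sender thread; this is not Ceph's Beacon class.
class PeriodicSender {
public:
  explicit PeriodicSender(std::chrono::milliseconds interval)
    : interval_(interval) {}

  ~PeriodicSender() { shutdown(); }

  void start() {
    worker_ = std::thread([this] {
      std::unique_lock<std::mutex> lock(mutex_);
      while (!stopping_) {
        ++ticks_;  // placeholder for "send one periodic message"
        // Sleeps for up to interval_, but returns early once stopping_ is
        // set and the condition variable is notified.
        cond_.wait_for(lock, interval_, [this] { return stopping_; });
      }
    });
  }

  void shutdown() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    // Without this notification, join() would wait out the full interval.
    cond_.notify_all();
    if (worker_.joinable())
      worker_.join();
  }

  int ticks() {
    std::lock_guard<std::mutex> lock(mutex_);
    return ticks_;
  }

private:
  std::chrono::milliseconds interval_;
  std::mutex mutex_;
  std::condition_variable cond_;
  bool stopping_ = false;
  int ticks_ = 0;
  std::thread worker_;
};
```

Even with a 30-second interval, `shutdown()` returns promptly here, because the predicate overload of `wait_for()` re-checks `stopping_` as soon as it is notified.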
This fixes a deadlock bug during MDS shutdown.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
During shutdown, the MDS sends a `MSG_MDS_BEACON` with
`MDSMap::STATE_DNE` (in `MDSDaemon::suicide()`) and then waits for a
`MSG_MDS_BEACON` reply from the MON.

The MON, however, suppresses replies to `STATE_DNE`; in
`MDSMonitor::preprocess_beacon()`, it returns early on `STATE_DNE` and
`MDSMonitor::prepare_beacon()` silently evicts the dying MDS without
any reply.

This delays the MDS shutdown until the MDS times out.

Since `MDSDaemon::suicide()` has code to wait for a beacon reply, I
figure that the MON reply was suppressed accidentally; therefore I
suggest adding it.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
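The beacon exchange described in the last commit message can be modelled as a pair of toy functions. This is a sketch under assumptions: `Beacon`, `Reply`, `handle_beacon_without_ack()`, `handle_beacon_with_ack()` and `exits_without_timeout()` are hypothetical names invented for illustration, and none of this mirrors the real `MDSMonitor` or `MDSDaemon` code.

```cpp
// Toy model only: these types are hypothetical and do not mirror Ceph's classes.
enum class State { Active, DNE };

struct Beacon {
  State state;
  unsigned seq;
};

struct Reply {
  bool sent;     // false means the monitor stayed silent
  unsigned seq;  // sequence number being acknowledged (valid only if sent)
};

// Models the behaviour described in the commit message:
// a beacon carrying the DNE state is handled without any reply.
Reply handle_beacon_without_ack(const Beacon& b) {
  if (b.state == State::DNE)
    return Reply{false, 0};  // daemon is removed silently
  return Reply{true, b.seq};
}

// Models the proposed change: the final beacon is acknowledged as well.
Reply handle_beacon_with_ack(const Beacon& b) {
  return Reply{true, b.seq};
}

// What the shutting-down daemon experiences while waiting for the reply:
// true if it can exit right away, false if it has to wait for its timeout.
bool exits_without_timeout(const Reply& reply, unsigned sent_seq) {
  return reply.sent && reply.seq == sent_seq;
}
```

In this model, a final beacon with `State::DNE` only lets the daemon exit without hitting its timeout when the handler that acknowledges it is used.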
@MaxKellermann MaxKellermann requested a review from a team as a code owner October 15, 2024 13:01
@github-actions github-actions bot added cephfs Ceph File System core mon labels Oct 15, 2024
Member

@batrick batrick left a comment

Pretty much every commit should be a separate PR. If appropriate for backport, each PR should have a tracker ticket:

Please create a tracker issue so that this issue can be tracked for backporting.

Then, please annotate the commit which fixes/resolves a Ceph tracker issue with:

Fixes: http://tracker.ceph.com/issues/...

This is essential when examining the history of the repository (this commit fixes what) and helps merge scripts identify issues that have been resolved by a merge. See this article on GitHub on how to amend commits and update your pull request.

_notify_mdsmap(mdsmap);

sender = std::thread([this]() {
  ceph_pthread_setname(pthread_self(), "beacon");
Member

is this just a cleanup patch or does it have some useful side-effect?

Member Author

Both. The useful side effect is that top/gdb/etc. show a proper thread name. No effect for the process itself.
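As a rough illustration of the thread-name effect mentioned in this reply, the sketch below assumes that `ceph_pthread_setname` behaves like the Linux/glibc call `pthread_setname_np()`. `run_named_thread()` is a hypothetical helper written for this example, and the code is Linux-specific.

```cpp
#include <pthread.h>
#include <string>
#include <thread>

// Linux/glibc only: starts a thread that names itself and reports the
// name the kernel now stores for it.
std::string run_named_thread(const char* name) {
  std::string seen;
  std::thread t([&seen, name] {
    // Thread names are limited to 15 characters plus the terminating NUL.
    pthread_setname_np(pthread_self(), name);
    char buf[16] = {0};
    pthread_getname_np(pthread_self(), buf, sizeof(buf));
    seen = buf;
  });
  t.join();  // join() makes the write to `seen` visible to the caller
  return seen;
}
```

The stored name is what tools such as `top -H` or gdb's `info threads` display for the thread; it does not change how the thread runs.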

@MaxKellermann
Member Author

> Pretty much every commit should be a separate PR.
> [...]
> This is essential when examining the history of the repository (this commit fixes what) and helps merge scripts identify issues that have been resolved by a merge

Please help me understand this policy. Even if this is a single PR, there are still 4 distinct commits, and each commit can be identified to fix one thing. What does splitting into 4 PRs change for that?

@MaxKellermann
Member Author

I split this PR into 4 separate PRs, each with one commit. Not my taste, but if you prefer it that way...

@MaxKellermann MaxKellermann deleted the mds_shutdown branch October 15, 2024 14:50
@batrick
Member

batrick commented Oct 16, 2024

> > Pretty much every commit should be a separate PR.
> > [...]
> > This is essential when examining the history of the repository (this commit fixes what) and helps merge scripts identify issues that have been resolved by a merge
>
> Please help me understand this policy. Even if this is a single PR, there are still 4 distinct commits, and each commit can be identified to fix one thing. What does splitting into 4 PRs change for that?

It simplifies backporting. We don't necessarily want to backport all of these changes.

@MaxKellermann
Member Author

I don't get that - you can easily backport single commits with `git cherry-pick`. I forward-port individual patches from reef to main using `stg pick`, which is mostly the same as `git cherry-pick` but for stgit. (My stgit stack has grown to 250 patches. I can't submit more PRs right now due to dependencies on other unmerged PRs.)

@MaxKellermann MaxKellermann mentioned this pull request Dec 9, 2024
14 tasks

Labels

cephfs Ceph File System core mon

2 participants