mds: use regular dispatch for processing beacons#57469
Conversation
leonid-s-usov
left a comment
There was a problem hiding this comment.
Is there a flow where the dispatchers are added dynamically, which would require a priority queue? Until now I was under the impression that dispatchers are added in a predictable static initialization flow, meaning that simply ordering the add_dispatcher_* calls should provide enough control over the invocation order.
If we have to go for the new priority argument, I think we should define standard high/default/low priorities, evenly spaced in the priority range. The default should be in the middle of the priority range, not on the lowest side.
Dispatchers can be added dynamically but it is uncommon. Adding dispatchers to the head/tail of the list is just too clumsy; there's no reason not to just define the order. It's rather important to get right because some dispatchers acquire highly contentious locks (mds_lock being one example).
Sure. |
leonid-s-usov
left a comment
There was a problem hiding this comment.
OK then let's go with the priorities.
However, we should use the priority constants instead of literal magic numbers and try our best to leave calls unchanged where the default priority should apply.
d9794d4 to
8defa5d
Compare
|
I've switched
|
src/mds/MDSRank.cc
Outdated
| dout(10) << __func__ << ": initializing metrics handler" << dendl; | ||
| metrics_handler.init(); | ||
| messenger->add_dispatcher_tail(&metrics_handler); | ||
| messenger->add_dispatcher_head(&metrics_handler, 20); // very high but behind Beacon |
There was a problem hiding this comment.
Still magic numbers. Now, one needs to go find where beacon handler is added and check that number. And if that number changes, how do they know to update this place?
If this should be high but behind beacon, this should be
messenger->add_dispatcher_tail(&metrics_handler, ::priority_high);
and beacon should stay
messenger->add_dispatcher_head(&beacon, ::priority_high);
For order checking. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
So we can ensure that e.g. MDSRank::ms_dispatch is lowest priority so that we do not acquire the mds_lock when looking at beacons. This change maintains the current behavior when the priority is unset: the use of std::stable_sort will ensure that the add_dispatcher_head and add_dispatcher_tail calls will preserve order when dispatcher priorities are equal. Fixes: 7fc04be Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Similar to the issue with MClientMetrics, beacons should also not be handled via fast dispatch because it's necessary to acquire Beacon::mutex. This is a big no-no as it may block one of the Messenger threads leading to improbable deadlocks or DoS. Instead, use the normal dispatch where acquiring locks is okay to do. Fixes: 7fc04be See-also: linux.git/f7c2f4f6ce16fb58f7d024f3e1b40023c4b43ff9 Fixes: https://tracker.ceph.com/issues/65658 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This tries to preserve existing order but uses priorities to make it explicit and robust to future dispatchers being added. Except: - The beacon and metrics dispatcher have the highest priorities. This is to ensure we process these messages before trying to acquire any expensive locks (like mds_lock). - The monc dispatcher also has a relatively high priority for the same reasons. This change affects other daemons which may have ordered a dispatcher ahead of the monc but I cannot think of a legitimate reason to nor do I see an instance of it. Fixes: 7fc04be Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
|
This PR is under test in https://tracker.ceph.com/issues/66086. |
|
@batrick does the fs suite run under valgrind against ubuntu? in https://tracker.ceph.com/issues/66336, the rgw suite is tracking i'm stumped by the actual valgrind issue though; the new/delete seem to be internal to stable_sort() in the standard library, and only effect ubuntu runs |
Yes, see
Ya, I'm not seeing an issue with the code. It's either a bug in glibc++ or valgrind. |
ok, thanks. i scanned through some recent fs suite results and did spot several occurrances of this (pulpito summarizes the failure as |
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e