mds: use regular dispatch for processing metrics #57081
Conversation
jenkins test api
jenkins test make check arm64
vshankar left a comment
This change is fine - I guess we underestimated the issues with using fast dispatch in the original PR, especially around locking - the whole spinlock-vs-mutex discussion. I recall that we switched to a mutex because the code was doing memory allocation (under a spinlock, which is a big no-no). Maybe it's time to revisit that.
The MDS uses fast dispatch for beacon (reply) messages, which aren't received too frequently, and there's never a flood of them at any given point. Metrics messages are different: they can be huge in number and very frequent.
This PR is under test in https://tracker.ceph.com/issues/65771.
* refs/pull/57081/head: mds: use regular dispatch for processing metrics
I'm not going to merge this just yet: I think we should also increase the priority of metrics messages so they get processed ahead of regular requests.
Does it matter if the processing is delayed a bit? Did you see huge lags in metrics updates in tests?
No. But I'm trying to maintain the current behavior of metrics messages being processed promptly by the MDS. In any case, it makes sense to me that these messages should not be queued behind potentially thousands of client requests.
There have been cases where the MDS does an undesirable failover because it misses heartbeat resets after a long recovery in up:replay. It was observed that the MDS was processing a flood of metrics messages from all reconnecting clients. This likely caused undesirable MetricAggregator::lock contention in the messenger threads while fast dispatching client metrics. Instead, use the normal dispatch where acquiring locks is okay to do.
See-also: linux.git/f7c2f4f6ce16fb58f7d024f3e1b40023c4b43ff9
Fixes: https://tracker.ceph.com/issues/65658
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Since these are no longer fast dispatched, we need to ensure they are processed in a timely fashion and ahead of any incoming requests.
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
jenkins test make check
jenkins test windows
This PR is under test in https://tracker.ceph.com/issues/65867.
You are just not allowed to block inside of an ms_fast_dispatch function. Like, at all. It's stealing one of the AsyncMessenger threads (default: 8?) to do the fast dispatch, and if you go to sleep holding it then messaging work can stop. So holding a mutex for the entire duration of the work is really, really bad, and the client reconnect storms we were getting straightforwardly translate into blocked messaging: far more than 8 clients can simultaneously send in lots of MClientMetrics messages, which have to be processed serially, and since they're holding messenger locks while being processed, they will starve any other attempts to use the messenger and back up other MDS worker threads when outgoing throttle limits are hit.
If you look at the OSD, I believe you'll find it grabs one mutex for the duration of an "enqueue" call. But everything else is lock-free.
Actually @vshankar, is there some way we could update the fast_dispatch documentation to be clearer about these requirements? I thought it was already pretty explicit, but it's clearly not good enough.
@batrick we have similar problems with Beacon https://tracker.ceph.com/issues/66014 so please look at that next. We'll need to backport both of these everywhere.
This PR is under test in https://tracker.ceph.com/issues/66086.
There have been cases where the MDS does an undesirable failover because it misses heartbeat resets after a long recovery in up:replay. It was observed that the MDS was processing a flood of metrics messages from all reconnecting clients. This likely caused undesirable MetricAggregator::lock contention in the messenger threads while fast dispatching client metrics.
Instead, use the normal dispatch where acquiring locks is okay to do.
See-also: linux.git/f7c2f4f6ce16fb58f7d024f3e1b40023c4b43ff9
Fixes: https://tracker.ceph.com/issues/65658