quincy: mds: use regular dispatch for processing metrics#57679

Closed
batrick wants to merge 2 commits into ceph:quincy from batrick:wip-66189-quincy

batrick (Member) commented May 23, 2024

backport tracker: https://tracker.ceph.com/issues/66189


backport of #57081
parent tracker: https://tracker.ceph.com/issues/65658

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

batrick added this to the quincy milestone May 23, 2024
batrick added the cephfs (Ceph File System) label May 23, 2024

lxbsz (Member) commented Aug 2, 2024

This PR is under test in https://tracker.ceph.com/issues/67315.

batrick (Member, Author) commented Dec 2, 2024

jenkins test make check

batrick (Member, Author) commented Dec 2, 2024

jenkins test api

ceph deleted a comment from github-actions bot Dec 30, 2024

batrick (Member, Author) commented Dec 30, 2024

jenkins test api

vshankar (Contributor) commented Jan 2, 2025

This PR is under test in https://tracker.ceph.com/issues/69400.

There have been cases where the MDS does an undesirable failover because it
misses heartbeat resets after a long recovery in up:replay.  It was observed
that the MDS was processing a flood of metrics messages from all reconnecting
clients. This likely caused undesirable MetricAggregator::lock contention in
the messenger threads while fast dispatching client metrics.

Instead, use the normal dispatch where acquiring locks is okay to do.

See-also: linux.git/f7c2f4f6ce16fb58f7d024f3e1b40023c4b43ff9
Fixes: https://tracker.ceph.com/issues/65658
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit ed1fe99)
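
The difference between the two dispatch paths can be illustrated with a toy model. The sketch below is not Ceph code: it is written in Python for brevity, and `Dispatcher`, `fast_dispatch`, `dispatch`, and `handle_metrics` are hypothetical names chosen for illustration. Only the `MetricAggregator` name and its lock come from the commit message. It assumes that fast dispatch runs the handler directly in the calling (messenger) thread, while regular dispatch hands the message to a separate dispatcher thread.

```python
import queue
import threading


class MetricAggregator:
    """Toy stand-in for the aggregator; all state is guarded by one lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.totals = {}

    def handle_metrics(self, client_id, value):
        # Every metrics message must take the lock to update shared state.
        with self.lock:
            self.totals[client_id] = self.totals.get(client_id, 0) + value


class Dispatcher:
    """Hypothetical model of the two delivery paths."""

    def __init__(self, aggregator):
        self.aggregator = aggregator
        self.pending = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def fast_dispatch(self, client_id, value):
        # Runs in the caller's thread: the caller blocks if the lock is contended.
        self.aggregator.handle_metrics(client_id, value)

    def dispatch(self, client_id, value):
        # Regular path: enqueue and return; the caller never touches the lock.
        self.pending.put((client_id, value))

    def _run(self):
        # Dispatcher thread: the only place where queued messages take the lock.
        while True:
            item = self.pending.get()
            try:
                if item is None:
                    return
                self.aggregator.handle_metrics(*item)
            finally:
                self.pending.task_done()

    def drain(self):
        # Wait until every queued message has been handled.
        self.pending.join()

    def stop(self):
        self.pending.put(None)
        self.worker.join()


if __name__ == "__main__":
    agg = MetricAggregator()
    d = Dispatcher(agg)
    d.dispatch("client.1", 3)
    d.drain()
    d.stop()
    print(agg.totals)
```

In this model a thread calling `dispatch` returns immediately even while the aggregator lock is held elsewhere; only the dispatcher thread waits for it. A thread calling `fast_dispatch` would stall for as long as the lock is contended.
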

Since these are no longer fast dispatched, we need to ensure they are processed
in a timely fashion and ahead of any incoming requests.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
(cherry picked from commit d56b502)
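
One way to picture "ahead of any incoming requests" is a dispatch queue ordered by priority. The following is a minimal Python sketch under that assumption; `PriorityDispatchQueue` and the two priority constants are hypothetical, and it does not claim to reflect how the cherry-picked commit implements the ordering.

```python
import heapq
import itertools

# Illustrative priority values (assumptions, not taken from the PR).
PRIO_DEFAULT = 100
PRIO_HIGH = 200


class PriorityDispatchQueue:
    """Higher-priority messages are dequeued first; FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, priority, msg):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), msg))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    q = PriorityDispatchQueue()
    q.enqueue(PRIO_DEFAULT, "client request A")
    q.enqueue(PRIO_HIGH, "metrics from client.1")
    while len(q):
        print(q.dequeue())
```

With metrics messages enqueued at the higher priority, they are handled before any client requests already waiting in the queue, while messages of equal priority keep their arrival order.
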

vshankar (Contributor)

Quincy is EOL.

vshankar closed this Feb 11, 2025
Labels

cephfs (Ceph File System), needs-qa

3 participants