reef: mds: nudge_log never nudges the log#67495

Merged
batrick merged 1 commit into ceph:reef from batrick:i75141
Feb 25, 2026

@batrick
Member

@batrick batrick commented Feb 24, 2026

The Locker uses has_any_waiter for a particular lock to decide whether to nudge the log. On the squid, tentacle, and main branches, the mask it passes is too large (all 64 bits set), so it wrongly returns true when only other locks have waiters. The resulting spurious wakeups of requests are undesirable but should not affect performance significantly.
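
A minimal sketch of the spurious match described above; it is a model under stated assumptions, not the Ceph code. It assumes each lock owns a fixed-width slice of a shared 64-bit wait mask and that a waiter matches whenever its key shares a bit with the queried mask. The names `kBitsPerLock`, `lock_mask`, `all_ones_mask`, and `any_overlap`, and the 8-bit slice width, are hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical layout: each lock owns kBitsPerLock bits of a shared 64-bit wait mask.
constexpr unsigned kBitsPerLock = 8;

// Mask covering only the slice that starts at `shift`.
constexpr uint64_t lock_mask(unsigned shift) {
  return ((uint64_t{1} << kBitsPerLock) - 1) << shift;
}

// All-ones mask shifted into place: it also covers every slice above `shift`.
constexpr uint64_t all_ones_mask(unsigned shift) {
  return std::numeric_limits<uint64_t>::max() << shift;
}

// True if any queued wait key shares a bit with `mask`.
inline bool any_overlap(const std::vector<uint64_t>& waiting, uint64_t mask) {
  for (uint64_t key : waiting)
    if (key & mask) return true;
  return false;
}
```

In this model, a single waiter queued for the lock at shift 16 overlaps `all_ones_mask(8)` but not `lock_mask(8)`, so the wider mask reports a waiter that belongs to a different lock.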

For reef and older releases, using std::numeric_limits<uint64_t>::max() in has_any_waiter() causes a bitwise overflow that sets the wait-queue search bound impossibly high, resulting in the method always incorrectly returning false. This results in nudge_log never nudging the log!
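
The failure mode above can be modelled with a small sketch. It is an illustration under assumptions, not the reef source: it assumes waiters sit in an ordered multimap keyed by wait mask, and that a lookup starts its search at the highest set bit of the queried mask. `WaitQueue`, `highest_bit`, and `is_waiting_for` are hypothetical names, and any shift used with them is arbitrary.

```cpp
#include <cstdint>
#include <limits>
#include <map>

// Hypothetical wait queue: wait-bit mask -> opaque waiter id.
using WaitQueue = std::multimap<uint64_t, int>;

// Reduce `mask` to its highest set bit by clearing the lowest set bit repeatedly.
inline uint64_t highest_bit(uint64_t mask) {
  while (mask & (mask - 1))
    mask &= mask - 1;
  return mask;
}

// Assumed lookup: start at the highest bit of `mask` and scan upward.
inline bool is_waiting_for(const WaitQueue& waiting, uint64_t mask) {
  const uint64_t min = highest_bit(mask);
  for (auto p = waiting.lower_bound(min); p != waiting.end(); ++p) {
    if (p->first & mask) return true;   // a queued key overlaps the mask
    if (p->first > mask) return false;  // past every key that could match
  }
  return false;
}
```

With a waiter queued under key `1 << 8`, querying with `std::numeric_limits<uint64_t>::max() << 8` makes the search start at bit 63, above every queued key, so this model reports no waiter. Querying with the single bit `1 << 8` finds it.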

Note: for reef the fix is different because of the interface refactor. For that reason, this fix is applied directly to reef.

Fixes: db5c9dc
Fixes: https://tracker.ceph.com/issues/75141


@batrick
Member Author

batrick commented Feb 24, 2026

jenkins test make check

@batrick
Member Author

batrick commented Feb 25, 2026

The following tests FAILED:
	 32 - run-rbd-unit-tests-61.sh (Timeout)
	 34 - run-rbd-unit-tests-127.sh (Failed)
	186 - unittest_bluefs (Subprocess killed)

@batrick
Member Author

batrick commented Feb 25, 2026

jenkins test make check

The Locker uses has_any_waiter for a particular lock to evaluate whether
to nudge the log. For the squid, tentacle, and main branches, this
larger bit mask (all 64 bits) will cause this to wrongly return true for
other locks which have waiters. The side-effect of waking requests
spuriously is undesirable but should not affect performance
significantly.

For reef and older releases, using std::numeric_limits<uint64_t>::max()
in has_any_waiter() causes a bitwise overflow that sets the wait-queue
search bound impossibly high, resulting in the method always incorrectly
returning false. This results in nudge_log never nudging the log!

Note: for reef the fix is different because of the interface refactor.
For that reason, this fix is applied directly to reef.

Fixes: db5c9dc
Fixes: https://tracker.ceph.com/issues/75141
Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>

@batrick
Member Author

batrick commented Feb 25, 2026

jenkins test make check

Member Author

@batrick batrick left a comment

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>

@batrick batrick merged commit ab47f43 into ceph:reef Feb 25, 2026
8 of 11 checks passed
@batrick batrick deleted the i75141 branch February 25, 2026 18:18
@markhpc
Member

markhpc commented Mar 2, 2026

This PR appears to be slightly different from the PR to main here: #67496

First, this is great work. Thank you Patrick for fixing this. I'm a little confused why this was merged though. Is this classified as a backport due to the interface change? If not, was it run through QA? Did it go through a review/approval process? If it is considered a backport, why was it merged before 67496 was tested/merged?

Labels: cephfs (Ceph File System), performance
