Skip to content

mds: scrub pins more inodes than the mds_cache_memory_limit#65858

Merged
vshankar merged 1 commit intoceph:mainfrom
sajibreadd:wip-71167-scrub-improvement
Mar 9, 2026
Merged

mds: scrub pins more inodes than the mds_cache_memory_limit#65858
vshankar merged 1 commit intoceph:mainfrom
sajibreadd:wip-71167-scrub-improvement

Conversation

@sajibreadd
Copy link
Member

For scrubbing dirfrag we are pushing children back into the scrub stack. Instead we can follow the same strategy for scrub directory and pushing children front of the scrub stack, and in kick_off_scrubs always start scrubbing from the front of the stack. It will prevent ScrubStack to pinning whole level of the file-system tree.

Fixes: https://tracker.ceph.com/issues/71167

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands

You must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.

@github-actions github-actions bot added the cephfs Ceph File System label Oct 9, 2025
@sajibreadd sajibreadd requested a review from vshankar October 9, 2025 12:15
@vshankar vshankar requested a review from a team October 9, 2025 12:17
@github-actions
Copy link

github-actions bot commented Oct 9, 2025

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@sajibreadd sajibreadd force-pushed the wip-71167-scrub-improvement branch from 4cd9baf to 8f018ce Compare October 10, 2025 07:07
@dparmar18
Copy link
Contributor

So this would be a LIFO now and the memory consumption should be proportional to the depth of the tree at every level instead of the total children at every level. So, the worst case for num. of inodes pinned should be (say if B is the avg. branching factor and D is the depth) ~B * D v/s ~B^D (current approach)?

@sajibreadd
Copy link
Member Author

So this would be a LIFO now and the memory consumption should be proportional to the depth of the tree at every level instead of the total children at every level. So, the worst case for num. of inodes pinned should be (say if B is the avg. branching factor and D is the depth) ~B * D v/s ~B^D (current approach)?

Right, but I think it can improved to just D, having (inode, iterator) as a pair in the scrub_stack but I have seen in the tracker developers are cautious about multi-mds scrub, although I don't understand why it would be problem(May be issue can be a dirfrag might need to be pinned until it's whole subtree get scrubbed).

But for now this minimal change can get rid of long lasting memory issue.

@vshankar
Copy link
Contributor

@mchangir for review.

@vshankar
Copy link
Contributor

@mchangir @kotreshhr - please plan to review this change.

@vshankar
Copy link
Contributor

vshankar commented Nov 5, 2025

@mchangir @kotreshhr - please plan to review this change.

Gentle nudge for review.

@sajibreadd sajibreadd requested a review from ifed01 December 9, 2025 07:00
@ifed01
Copy link
Contributor

ifed01 commented Dec 9, 2025

jenkins test make check

@vshankar
Copy link
Contributor

Right, but I think it can improved to just D, having (inode, iterator) as a pair in the scrub_stack but I have seen in the tracker developers are cautious about multi-mds scrub, although I don't understand why it would be problem(May be issue can be a dirfrag might need to be pinned until it's whole subtree get scrubbed).

But for now this minimal change can get rid of long lasting memory issue.

I agree and this is a good minimalistic approach which works around the issue 👍

@vshankar
Copy link
Contributor

Just FYI - testing was blocked for a while due to lab move. I will be revisiting pending PRs once everything is up working.

@vshankar
Copy link
Contributor

This PR is under test in https://tracker.ceph.com/issues/75073.

vshankar added a commit to vshankar/ceph that referenced this pull request Feb 25, 2026
* refs/pull/65858/head:

Reviewed-by: Venky Shankar <vshankar@redhat.com>
Reviewed-by: Igor Fedotov <ifedotov@suse.com>
@vshankar
Copy link
Contributor

vshankar commented Mar 3, 2026

jenkins retest this please

@sajibreadd sajibreadd force-pushed the wip-71167-scrub-improvement branch from 2aba0f8 to 24aca24 Compare March 3, 2026 09:08
@vshankar
Copy link
Contributor

vshankar commented Mar 3, 2026

Hi @sajibreadd - any changes in the latest push?

@sajibreadd
Copy link
Member Author

Hi @sajibreadd - any changes in the latest push?

No, just rebased it.

Copy link
Contributor

@vshankar vshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For scrubbing dirfrag we are pushing children back into the scrub stack. Instead we can follow the same
strategy for scrub directory and pushing children front of the scrub stack, and in kick_off_scrubs always
start scrubbing from the front of the stack. It will prevent ScrubStack to pinning whole level of the file-system
tree.

Fixes: https://tracker.ceph.com/issues/71167
Signed-off-by: Md Mahamudur Rahaman Sajib <mahamudur.sajib@croit.io>
@sajibreadd sajibreadd force-pushed the wip-71167-scrub-improvement branch from 24aca24 to 5815655 Compare March 3, 2026 11:47
@sajibreadd
Copy link
Member Author

Hi @sajibreadd - any changes in the latest push?

I am rebasing again, as previous rebase was done to my forked main. I just did rebase to upstream/main. Let's see now whether make check passes or not as my commit is very much unrelated to the test for which it is failing.

@sajibreadd
Copy link
Member Author

jenkins test make check

@vshankar vshankar merged commit 1800aea into ceph:main Mar 9, 2026
13 checks passed
@sajibreadd sajibreadd deleted the wip-71167-scrub-improvement branch March 16, 2026 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants