
cephfs: MDCache request cleanup #62630

Merged
vshankar merged 2 commits into ceph:main from theanalyst:batch-getattr-fixes on Nov 17, 2025

Conversation

@theanalyst (Member) commented Apr 2, 2025

When BatchGetAttr finds no new head, we end up passing a potentially null request object to the Finisher, which crashes later. Handle this case.
Fixes: https://tracker.ceph.com/issues/70769


In cases where there is a single element in a batch_op_map, new_batch_head
is a nullptr; when this is retried in the Finisher we hit one of the asserts when
dereferencing it.

Fixes: https://tracker.ceph.com/issues/70769
Signed-off-by: Abhishek Lekshmanan <abhishek.lekshmanan@cern.ch>
Ignore null requests

Signed-off-by: Abhishek Lekshmanan <abhishek.lekshmanan@cern.ch>
@github-actions github-actions bot added the cephfs Ceph File System label Apr 2, 2025
@theanalyst (Member, Author)

The main fix is the first commit; the second commit adds a couple of null checks, though those may end up masking actual issues, so I'm happy to drop it.

@vshankar (Contributor) commented Apr 7, 2025

@mchangir PTAL

@mchangir (Contributor) commented Apr 8, 2025

@theanalyst the crash is most probably this issue #62684

@theanalyst (Member, Author)

> @theanalyst the crash is most probably this issue #62684

@mchangir
That explains the locking issue; however, look at what currently happens when we find a new batch head:

    auto new_batch_head = it->second->find_new_head();
    if (!new_batch_head) {
      mdr->batch_op_map->erase(it);
    } 
    mds->finisher->queue(new C_MDS_RetryRequest(this, new_batch_head));
  }

If the new_batch_head is not found it defaults to null, and this is passed to the Finisher. We see this happen quite a lot if you launch requests with `find` etc. on a large tree, where a single inode might have only a single batched request; we then pass a nullptr to the Finisher, which crashes. So I believe passing a non-null new_batch_head is needed. What do you think?

@mchangir (Contributor) commented Apr 9, 2025

@theanalyst would you be able to put together a set of bash commands that mimic your workload? I'd like to run them and see the crash myself.

@dvanders (Contributor)

I've just seen this in the wild on a single active v18.2.7 fs. I don't have any more detail about the use-case that triggered it.

@vshankar vshankar self-assigned this Jul 8, 2025
@vshankar (Contributor)

@mchangir I believe @theanalyst shared the reproducer elsewhere (via mail?). Could you please update the tracker with those details? I plan to take a look then.

@sajibreadd (Member) commented Jul 29, 2025

From the code flow, as far as I can understand, when a session gets evicted it tries to clear its requests:

  while (!session->requests.empty()) {
    auto mdr = MDRequestRef(*session->requests.begin());
    mdcache->request_kill(mdr);
  }

This void MDCache::request_kill(MDRequestRef &mdr) function in turn does the request_cleanup.

While following the comment,

void MDCache::request_kill(MDRequestRef &mdr) {
 // rollback peer requests is tricky. just let the request proceed.

The MDS wants to let the request proceed, marking it as a killed request. But when it comes to a batch_head request, either it should be replaced completely in request_cleanup by a new batch head (one which is not killed) and allowed to proceed, or we should remove this assert, since we are allowing the killed batch head to proceed. So the comment // Should already be reset in request_cleanup(). isn't justified:

void Server::dispatch_client_request(MDRequestRef &mdr) {
 // we shouldn't be waiting on anyone.
 ceph_assert(!mdr->has_more() || mdr->more()->waiting_on_peer.empty());

 if (mdr->killed) {
   // Should already be reset in request_cleanup().
   ceph_assert(!mdr->is_batch_head());

@vshankar (Contributor)

@mchangir gentle nudge on this.

Comment on lines +10093 to +10094
} else {
mds->finisher->queue(new C_MDS_RetryRequest(this, new_batch_head));
Contributor

this looks sensible to me ... wrapping the re-queuing in an else block probably got missed earlier

@theanalyst (Member, Author)

For most of the crashes we see in the Finisher, this seems like the most logical place. I also tried adding print statements, and I can see that we do indeed queue up a null new_batch_head when the request gets killed and the inode doesn't have a new batch head. We finally crash in finish() when we try to access this nullptr.

@github-actions (bot)

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Oct 18, 2025
@vshankar vshankar removed the stale label Oct 18, 2025
@vshankar (Contributor)

(unmarking stale)

I haven't been following the discussion here closely. @mchangir - is this good to go? What about @sajibreadd's comment #62630 (comment)?

@vshankar (Contributor) commented Nov 2, 2025

This PR is under test in https://tracker.ceph.com/issues/73693.

@vshankar (Contributor) left a review comment.
@vshankar vshankar merged commit b5ebfaf into ceph:main Nov 17, 2025
12 of 14 checks passed