Skip to content

mds: prevent duplicate wrlock acquisition for a single request#61250

Merged
vshankar merged 2 commits intoceph:mainfrom
Nordix:fix-65607
Jan 27, 2025
Merged

mds: prevent duplicate wrlock acquisition for a single request#61250
vshankar merged 2 commits intoceph:mainfrom
Nordix:fix-65607

Conversation

@Sunnatillo
Copy link
Contributor

@Sunnatillo Sunnatillo commented Jan 7, 2025

Take over of abandoned PR #58474.

In case the current request has lock cache attached and then the
lock cache must have already acquired the wrlock of filelock. So
currently the path_traverse() will acquire the wrlock twice and
possibly caused deadlock by itself.

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

The lock cache is buggy and we need to disable it as a workaround.

Fixes: https://tracker.ceph.com/issues/65607
Signed-off-by: Xiubo Li <xiubli@redhat.com>
In case the current request has lock cache attached and then the
lock cache must have already acquired the wrlock of filelock. So
currently the path_traverse() will acquire the wrlock twice and
possibly caused deadlock by itself.

Fixes: https://tracker.ceph.com/issues/65607
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Sunnatillo <sunnat.samadov@est.tech>
@Sunnatillo
Copy link
Contributor Author

jenkins test make check

@Sunnatillo
Copy link
Contributor Author

jenkins test api

1 similar comment
@Sunnatillo
Copy link
Contributor Author

jenkins test api

@ceph-jenkins
Copy link
Collaborator

Can one of the admins verify this patch?

@vshankar vshankar self-assigned this Jan 13, 2025
@kashifest
Copy link

@vshankar we would really appreciate some attention on this issue, the previous PR was promised to be reviewed couple of months ago and now this new one is a week old and still no reviews 😞

@vshankar
Copy link
Contributor

@vshankar we would really appreciate some attention on this issue, the previous PR was promised to be reviewed couple of months ago and now this new one is a week old and still no reviews 😞

I should apologise first. Yes, this change had got delayed and put on the side-burner. CephFS folks have been a bit more occupied than usual from the last couple of months - PTOs, conferences, etc.. I did assign this to @batrick for review last week, however, @batrick had to go on leave. As you can see, I have this in my list and I assure you that it will get the required attention soon. Expect a response by tomorrow.

@kashifest
Copy link

@vshankar we would really appreciate some attention on this issue, the previous PR was promised to be reviewed couple of months ago and now this new one is a week old and still no reviews 😞

I should apologise first. Yes, this change had got delayed and put on the side-burner. CephFS folks have been a bit more occupied than usual from the last couple of months - PTOs, conferences, etc.. I did assign this to @batrick for review last week, however, @batrick had to go on leave. As you can see, I have this in my list and I assure you that it will get the required attention soon. Expect a response by tomorrow.

Thanks a lot and appreciate your efforts.

Copy link
Contributor

@vshankar vshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. However I would like the config to be exercised in our test (as a separate PR).

I'll create a tracker and link it in here. Will put this to test. Thx @Sunnatillo

@vshankar
Copy link
Contributor

This PR is under test in https://tracker.ceph.com/issues/69546.

@vshankar
Copy link
Contributor

This PR is under test in https://tracker.ceph.com/issues/69546.

I've checked the test run and I think this change is good to merge. I'll be doing the needful soon (attending conferences, thereby the delay).

Copy link
Contributor

@vshankar vshankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vshankar vshankar merged commit b6a0e49 into ceph:main Jan 27, 2025
@Sunnatillo Sunnatillo deleted the fix-65607 branch January 27, 2025 20:55
@Sunnatillo Sunnatillo restored the fix-65607 branch January 27, 2025 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cephfs Ceph File System common

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants