
client: fixed a bug that read operation hung #59027

Merged
vshankar merged 1 commit into ceph:main from hit1943:65971_fix
Oct 24, 2024


@hit1943 hit1943 commented Aug 5, 2024

The Fc cap ref count is non-zero (a previous readahead called get_cap_ref).
The MDS is revoking the Fc cap, but handle_cap_grant fails to call check_caps to update cap->issued and cap->implemented.
A new read operation then arrives; in Client::get_caps the reader wants the Fc cap, so it waits for a notification that the Fc revocation has finished.
When the readahead finishes, it calls put_cap_ref to release the Fc cap, which should notify the waiting reader after check_caps.

    Fixes: https://tracker.ceph.com/issues/65971
    Signed-off-by: Tod Chen <chentao.2022@bytedance.com>
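
The sequence described above can be illustrated with a toy, single-threaded model. This is only a sketch under stated assumptions: `ToyCap`, the `toy_*` functions, and the `run_check_caps` flag are hypothetical names invented for illustration. They do not reproduce the Ceph client code or the actual patch. The flag merely stands in for whether the step that completes the revocation and wakes waiters gets run.

```cpp
#include <cassert>

// Toy, single-threaded model of the hang described above. This is NOT Ceph
// code: every name here is hypothetical and only mirrors the description.
constexpr unsigned FC = 1u;  // stand-in bit for the Fc (file cache) capability

struct ToyCap {
  unsigned issued = FC;        // caps the MDS currently grants
  unsigned implemented = FC;   // caps the client still holds, including ones mid-revocation
  int fc_refs = 0;             // in-flight users of Fc (e.g. a readahead)
  bool reader_waiting = false; // true while a reader is blocked waiting for Fc
};

// Finish a pending revocation once nothing references the cap, then wake waiters.
void toy_check_caps(ToyCap& c) {
  unsigned revoking = c.implemented & ~c.issued;
  if ((revoking & FC) && c.fc_refs == 0) {
    c.implemented &= ~FC;      // revocation complete
    c.reader_waiting = false;  // notify the waiting reader
  }
}

// The MDS revokes Fc; the cap stays "implemented" while references remain.
void toy_handle_revoke(ToyCap& c) { c.issued &= ~FC; }

// A reader that wants Fc must wait while Fc is in the middle of a revocation.
void toy_reader_get_caps(ToyCap& c) {
  unsigned revoking = c.implemented & ~c.issued;
  if (revoking & FC) c.reader_waiting = true;
}

// Drop a reference; optionally follow up with the check that wakes waiters.
void toy_put_cap_ref(ToyCap& c, bool run_check_caps) {
  --c.fc_refs;
  if (run_check_caps) toy_check_caps(c);
}

// Replays the described sequence and reports whether the reader is still stuck.
bool reader_hangs(bool run_check_caps) {
  ToyCap c;
  c.fc_refs = 1;                       // readahead took an Fc reference
  toy_handle_revoke(c);                // MDS starts revoking Fc
  toy_reader_get_caps(c);              // a new read arrives and waits
  toy_put_cap_ref(c, run_check_caps);  // readahead finishes and drops its reference
  return c.reader_waiting;
}
```

In this model, `reader_hangs(false)` leaves the reader blocked because nothing ever completes the revocation, while `reader_hangs(true)` wakes it once the last reference is dropped.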



…n the Fc caps is wanted but revoked by the mds, and the Fc cap refs is no-zero

        Fixes: https://tracker.ceph.com/issues/65971
        Signed-off-by: Tod Chen <chentao.2022@bytedance.com>
@github-actions github-actions bot added the cephfs Ceph File System label Aug 5, 2024
@vshankar vshankar requested a review from a team August 20, 2024 15:32
@dparmar18 (Contributor) commented

This is an interesting patch, somehow I feel this might also solve https://tracker.ceph.com/issues/66581

@vshankar (Contributor) commented

> This is an interesting patch, somehow I feel this might also solve https://tracker.ceph.com/issues/66581

@dparmar18 Do you see the same issue in the client logs for the above tracker?

@vshankar (Contributor) left a comment

Change looks fine to me. @hit1943 - Is the client hang not reproducible with this fix? Just checking before including this in my test branch.

@hit1943 (Author) commented Sep 1, 2024

> Change looks fine to me. @hit1943 - Is the client hang not reproducible with this fix? Just checking before including this in my test branch.

Yeah, I have applied this change in my personal cluster environment and verified that it solves the problem.

@dparmar18 (Contributor) commented

jenkins test api

@vshankar (Contributor) commented

This PR is under test in https://tracker.ceph.com/issues/68092.

joscollin pushed a commit to joscollin/ceph that referenced this pull request Sep 23, 2024
* refs/pull/59027/head:
	        cephfs: Fixed a bug that read operation hung in Client::get_caps when the Fc caps is wanted but revoked by the mds, and the Fc cap refs is no-zero
@vshankar (Contributor) commented Oct 7, 2024

Testing update: Another PR in the test branch had a bug. Rerunning failed tests without that change.

@vshankar vshankar changed the title from "cephfs: Fixed a bug that read operation hung" to "client: Fixed a bug that read operation hung" Oct 7, 2024
@vshankar vshankar changed the title from "client: Fixed a bug that read operation hung" to "client: fixed a bug that read operation hung" Oct 7, 2024
@dparmar18 (Contributor) commented

jenkins test api

joscollin pushed a commit to joscollin/ceph that referenced this pull request Oct 11, 2024
* refs/pull/59027/head:
	        cephfs: Fixed a bug that read operation hung in Client::get_caps when the Fc caps is wanted but revoked by the mds, and the Fc cap refs is no-zero
@vshankar (Contributor) commented

This change is good to merge. I'll be doing the needful after preparing the run wiki. FYI, @hit1943

@vshankar (Contributor) left a comment

@vshankar (Contributor) commented

jenkins test api
