quincy: mds: do remove the cap when seqs equal or larger than last issue #58296

Merged: yuriw merged 1 commit into ceph:quincy from batrick:wip-66624-quincy on Oct 21, 2024
batrick commented Jun 26, 2024

backport tracker: https://tracker.ceph.com/issues/66624


backport of #56828
parent tracker: https://tracker.ceph.com/issues/64977

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

batrick added this to the quincy milestone Jun 26, 2024
batrick added the cephfs (Ceph File System) label Jun 26, 2024
lxbsz commented Aug 2, 2024

This PR is under test in https://tracker.ceph.com/issues/67315.

There is a race in the following case:

   MDS                            rw Client
- Issues the 'Asx' caps to
  the rw client.
                             - Adds the cap, then later removes
                               it by queuing it to the cap
                               release list. But cap->seq may
                               have been updated by previous
                               cap grant requests, and a cap
                               grant request does not increase
                               the 'last_issue' seq in the MDS.
- The ro client's lookup
  request arrives and the
  MDS sends an 'Ax' caps
  revoke request to the rw
  client, increasing the
  'seq'.
                             - The revoke request finds that
                               the cap doesn't exist, so it
                               immediately queues a new cap
                               release with the new 'seq' and
                               triggers a flush of the pending
                               cap releases to the MDS.
- Receives the cap release
  request, but 'seq' >
  the cap's 'last_issue',
  so the MDS skips
  removing the cap. Then
  _do_cap_release() issues
  the 'Ax' caps back to
  the rw client.

  The ro client's lookup
  request is then woken
  up, and it tries to
  revoke the 'Ax' caps
  from the rw client
  again.

This repeats indefinitely, so the MDS spins in an infinite loop.

Fixes: https://tracker.ceph.com/issues/64977
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 345978e)
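
The race above can be replayed with a small toy model. This is only a sketch under stated assumptions, not the MDS code: `Cap`, `replay`, and both predicates are hypothetical names, sequence numbers are plain integers (wraparound is ignored), and the pre-fix check is assumed to be an exact match, which is consistent with the description that a release with a larger 'seq' was skipped. The second predicate follows the rule named in the PR title.

```python
from dataclasses import dataclass


@dataclass
class Cap:
    """Hypothetical stand-in for the MDS-side record of one client cap."""
    seq: int = 1         # bumped on every grant or revoke message sent
    last_issue: int = 1  # per the description, not bumped by cap grants


def remove_if_equal(release_seq: int, last_issue: int) -> bool:
    # Assumed pre-fix behaviour: only an exact match removes the cap.
    return release_seq == last_issue


def remove_if_equal_or_newer(release_seq: int, last_issue: int) -> bool:
    # Rule from the PR title: an equal or larger seq removes the cap.
    return release_seq >= last_issue


def replay(should_remove, max_rounds: int = 5):
    """Return the round in which the cap is removed, or None if it never is."""
    cap = Cap()
    cap.seq += 1  # an earlier grant moved seq ahead of last_issue
    for round_no in range(1, max_rounds + 1):
        cap.seq += 1           # MDS revokes 'Ax' for the ro client's lookup
        release_seq = cap.seq  # rw client has no cap, releases with new seq
        if should_remove(release_seq, cap.last_issue):
            return round_no    # cap removed, the lookup can proceed
        cap.seq += 1           # cap kept: 'Ax' is issued back to the rw client
    return None                # still looping when the round budget ran out


print(replay(remove_if_equal))           # None: cap never removed
print(replay(remove_if_equal_or_newer))  # 1: removed on the first release
```

In this model the exact-match check can never succeed once 'seq' has moved past 'last_issue', because every retry only increases 'seq' further. Accepting any release whose 'seq' is equal to or larger than 'last_issue' ends the loop on the first release.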

yuriw left a comment

Reviewed-by: Venky Shankar <vshankar@redhat.com>

yuriw merged commit 183c13c into ceph:quincy Oct 21, 2024
yuriw commented Oct 21, 2024
