squid: mds: do remove the cap when seqs equal or larger than last issue #58294

Merged
lxbsz merged 1 commit into ceph:squid from batrick:wip-66623-squid on Jul 16, 2024

batrick commented Jun 26, 2024

backport tracker: https://tracker.ceph.com/issues/66623


backport of #56828
parent tracker: https://tracker.ceph.com/issues/64977

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

There is a race in case of:

   MDS                            rw Client
- Issue the 'Asx' caps to
  rw client
                             - Adds the cap, then removes it
                               later by queuing it to the cap
                               release list. But the cap->seq
                               may have been updated by previous
                               cap grant requests.
                               And the cap grant request won't
                               increase the 'last_issue' seq in
                               MDS.
- The ro client's lookup
  request arrives and the
  MDS sends an 'Ax' caps
  revoke request to the rw
  client, increasing
  the 'seq'.
                             - The revoke request finds that
                               the cap doesn't exist, so the
                               client queues a new cap release
                               immediately with the new 'seq',
                               then triggers a flush of the
                               pending cap releases to the MDS.
- Receives the cap release
  request, but because
  'seq' > the cap's
  'last_issue', the MDS
  skips removing the cap.
  _do_cap_release() then
  issues the 'Ax' caps
  back to the rw client.

  This wakes up the ro
  client's lookup request,
  which tries to revoke
  the 'Ax' caps from the
  rw client again.

This causes the MDS to loop forever between re-issuing and revoking the cap.

Fixes: https://tracker.ceph.com/issues/64977
Signed-off-by: Xiubo Li <xiubli@redhat.com>
(cherry picked from commit 345978e)
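
The race above can be replayed with a small toy model. This is only a sketch under stated assumptions, not the Ceph code: `Cap`, `should_remove_old`, `should_remove_new`, and `simulate` are hypothetical names, the pre-fix check is assumed to remove the cap only on an exact `seq` match (the description only states that a larger `seq` is skipped), re-issuing is assumed to set `last_issue` to the current `seq`, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class Cap:
    """Hypothetical stand-in for the MDS-side state of one client cap."""
    seq: int = 0         # bumped on every cap message the MDS sends
    last_issue: int = 0  # value of seq when caps were last issued
    issued: str = ""


def should_remove_old(release_seq: int, cap: Cap) -> bool:
    # Assumed pre-fix behaviour: only an exact match removes the cap,
    # so a release carrying seq > last_issue is skipped.
    return release_seq == cap.last_issue


def should_remove_new(release_seq: int, cap: Cap) -> bool:
    # Behaviour described by the PR title: remove the cap when the
    # release seq is equal to or larger than last_issue.
    return release_seq >= cap.last_issue


def simulate(should_remove, max_rounds: int = 5):
    """Return the revoke round in which the cap is removed, or None."""
    cap = Cap()
    # MDS issues 'Asx' to the rw client.
    cap.seq += 1
    cap.last_issue = cap.seq
    cap.issued = "Asx"
    for round_no in range(1, max_rounds + 1):
        # ro client's lookup: the MDS revokes 'Ax', which bumps seq
        # but leaves last_issue untouched.
        cap.seq += 1
        # The rw client has already dropped the cap, so it answers with
        # a cap release carrying the new seq.
        release_seq = cap.seq
        if should_remove(release_seq, cap):
            return round_no
        # Cap kept: the MDS re-issues 'Ax' (assumed to update last_issue)
        # and the lookup retries the revoke.
        cap.seq += 1
        cap.last_issue = cap.seq
        cap.issued = "Ax"
    return None


if __name__ == "__main__":
    print("old check:", simulate(should_remove_old))
    print("new check:", simulate(should_remove_new))
```

In this model the old check never removes the cap within the round limit, while the relaxed check removes it on the first release.
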
@batrick batrick added this to the squid milestone Jun 26, 2024
@batrick batrick added the cephfs Ceph File System label Jun 26, 2024
@joscollin joscollin added the wip-jcollin-testing-squid2 Assigned for review label Jul 1, 2024
@joscollin

This PR is under test in https://tracker.ceph.com/issues/66762.

lxbsz commented Jul 15, 2024

Checked all the failures; none of them are related. Please see the 2024-07-09 entry in https://tracker.ceph.com/projects/cephfs/wiki/Squid.

@lxbsz lxbsz merged commit 35c6770 into ceph:squid Jul 16, 2024
@joscollin joscollin removed the wip-jcollin-testing-squid2 Assigned for review label Jul 17, 2024