mds: fix client isn't responding to mclientcaps(revoke)#47752
|
@lxbsz -- this change seems to be causing MDS_CLIENT_LATE_RELEASE warnings to be generated. E.g.: https://pulpito.ceph.com/vshankar-2022-09-06_06:13:42-fs-wip-vshankar-testing-20220901-100101-testing-default-smithi/7013838/ |
Fixed it by calling We need to calculate the |
batrick
left a comment
A synthetic test case would be helpful.
I think we also need to have the MDS give better information at dout(0) in the debug log when this situation happens:
- dump the inode
- dump the cap list for the inode
- dump the session holding the cap
|
jenkins test make check |
|
jenkins test windows |
I'm not sure how hard this is to reproduce. I haven't managed to reproduce it locally yet.
Yeah, I will add this. |
|
jenkins test api |
Will add the debug logs later in confirm_receipt().
Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
When the MDS revokes caps from a client, the client may release only some of the cap references and still send a cap update request back to the MDS. confirm_receipt() clears the _revokes list anyway, but the cap is still kept in the revoking_caps list. Also add a debug log for the case where the revocation has not fully finished.
Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
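The partial-release situation described in this commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyCap`, `ToyRevoke`, the `CAP_*` bit values and `in_revoking_list` are hypothetical names invented for illustration, and the logic is a deliberately simplified stand-in, not the actual MDS `Capability` code.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical cap bits, for illustration only (not Ceph's real encoding).
constexpr unsigned CAP_S = 1, CAP_X = 2, CAP_R = 4, CAP_W = 8;

struct ToyRevoke {
  unsigned before;  // caps that were pending before the revoke was sent
  uint32_t seq;     // sequence number at the time of the revoke
};

// Simplified stand-in for an MDS-side capability record.
struct ToyCap {
  unsigned issued = 0;            // what the client may still hold
  unsigned pending = 0;           // what the MDS wants the client to hold
  uint32_t last_sent = 0;         // seq of the last message sent to the client
  std::deque<ToyRevoke> revokes;  // outstanding revocations
  bool in_revoking_list = false;  // models membership in a revoking_caps list

  // Grant or revoke: move the client towards holding exactly `caps`.
  void issue(unsigned caps) {
    if (issued & ~caps) {  // something is being taken away
      revokes.push_back({pending, last_sent});
      in_revoking_list = true;
    }
    pending = caps;
    issued |= caps;
    ++last_sent;
  }

  // The client acknowledges message `seq` and reports the caps it still holds.
  // Only the seq == last_sent case is modelled; stale acks are ignored here.
  void confirm_receipt(uint32_t seq, unsigned caps) {
    if (seq == last_sent) {
      revokes.clear();  // cleared even if the client kept some caps
      issued = caps;
      pending &= caps;  // never add bits on an ack
    }
    if (issued == pending)
      in_revoking_list = false;
  }

  bool revocation_finished() const { return issued == pending; }
};

// The client acks the revoke but keeps the write cap because it is still writing.
ToyCap partial_release_scenario() {
  const unsigned all = CAP_S | CAP_X | CAP_R | CAP_W;
  ToyCap cap;
  cap.issue(all);               // seq 1: grant everything
  cap.confirm_receipt(1, all);  // client confirms the grant
  cap.issue(0);                 // seq 2: revoke everything
  cap.confirm_receipt(2, CAP_W);  // client releases all but the write cap
  return cap;
}
```

In this model the revoke list ends up empty while the cap is still flagged as revoking, which mirrors the mismatch the commit message describes.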
In case:

           mds                             client
                                - Releases cap and put Inode
- Increase cap->seq and sends
  revokes req to the client
- Receives release req and      - Receives & drops the revoke req
  skip removing the cap and
  then eval the CInode and
  issue or revoke caps again.
                                - Receives & drops the caps update
                                  or revoke req
- Health warning for client
  isn't responding to
  mclientcaps(revoke)
Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
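The race in the commit message above can be replayed step by step in a small simulation. This is a minimal sketch, assuming a much simpler protocol than the real one: `ToyMds`, `ToyClient` and all of their members are hypothetical names, and the "stale seq means skip removing the cap" rule is a simplification of the behaviour the commit message describes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, heavily simplified MDS-side state for one client cap.
struct ToyMds {
  bool has_cap = true;
  uint32_t cap_seq = 1;
  bool waiting_for_revoke_ack = false;

  // Bump the seq and "send" a revoke; returns the seq carried by the message.
  uint32_t send_revoke() {
    waiting_for_revoke_ack = true;
    return ++cap_seq;
  }

  // A release carrying an old seq does not remove the cap.
  void handle_release(uint32_t seq) {
    if (seq != cap_seq)
      return;  // stale release: skip removing the cap
    has_cap = false;
    waiting_for_revoke_ack = false;
  }
};

// Hypothetical client-side state: nullopt once the cap has been dropped.
struct ToyClient {
  std::optional<uint32_t> cap_seq = 1;

  // Release the cap and put the inode; returns the seq sent in the release.
  uint32_t release() {
    uint32_t seq = *cap_seq;
    cap_seq.reset();
    return seq;
  }

  // Returns true if the revoke was acknowledged, false if it was dropped.
  bool handle_revoke(uint32_t seq) {
    if (!cap_seq)
      return false;  // no cap for this inode any more: drop the message
    cap_seq = seq;
    return true;
  }
};

// Replays the interleaving from the diagram; true means the MDS is left
// waiting for an acknowledgement that will never arrive.
bool race_leaves_mds_waiting() {
  ToyMds mds;
  ToyClient client;
  uint32_t release_seq = client.release();        // client: release cap
  uint32_t revoke_seq = mds.send_revoke();        // mds: bump seq, send revoke
  bool acked = client.handle_revoke(revoke_seq);  // client: drops the revoke
  mds.handle_release(release_seq);                // mds: stale seq, cap kept
  return mds.waiting_for_revoke_ack && !acked && mds.has_cap;
}
```

With this ordering the toy MDS never sees an acknowledgement, which corresponds to the "client isn't responding to mclientcaps(revoke)" health warning at the end of the diagram.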
When a client is writing to a file and approaching max_size, it tries to call check_caps() and flush the caps to the MDS. If the MDS is revoking the Fsxrw caps at that point, the client, which keeps writing and still holds Fw, may release only part of the caps and keep Fw.
Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>
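As a rough illustration of why only part of the caps gets released, the client-side decision can be written as a bitmask expression. This is a sketch only: the `F_*` bit values and `retained_after_revoke` are hypothetical and do not correspond to Ceph's real cap encoding or client functions.

```cpp
#include <cassert>

// Hypothetical file cap bits, for illustration only.
constexpr unsigned F_S = 1, F_X = 2, F_R = 4, F_W = 8;

// Caps the client keeps after a revoke: whatever the MDS still allows,
// plus anything that still has active references (for example an ongoing write).
constexpr unsigned retained_after_revoke(unsigned held, unsigned allowed,
                                         unsigned in_use) {
  return held & (allowed | in_use);
}
```

For example, with all four bits held, nothing allowed by the MDS and a write in progress, `retained_after_revoke(F_S | F_X | F_R | F_W, 0, F_W)` evaluates to `F_W`, so the revocation is only partially satisfied.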
|
jenkins test make check arm64 |
|
This needs a retest, since there are failing tests in the main branch, tracked by: https://tracker.ceph.com/issues/59534 |
|
This change causes an inconsistent cap_id in the client: src/client/Client.cc: 4073: FAILED ceph_assert(cap.cap_id == cap_id) I have appended a picture to the description of this question: Do you have any suggestions for solving this problem? Thanks! |
That's why the change was reverted: #51661 |
When the MDS revokes caps from a client, the client may release only
some of the cap references and still send a cap update request back
to the MDS. confirm_receipt() clears the _revokes list anyway, but
the cap is still kept in the revoking_caps list.
Fixes: https://tracker.ceph.com/issues/57244
Signed-off-by: Xiubo Li <xiubli@redhat.com>