client: issue a cap release immediately if no cap exists#51500
client: issue a cap release immediately if no cap exists#51500
Conversation
66b083f to
717d62b
Compare
vshankar
left a comment
There was a problem hiding this comment.
@lxbsz Could you explain how this change fixed the issue mentioned in https://tracker.ceph.com/issues/61148#note-7
The 9fbde6d will remove the This is okay for the And if a new client request comes to MDS and when encoding the reply message it will add a new This is why we can see the call traces and warning logs in dmesg. Since only the |
For the record - commit 9fbde6d removes the part where a cap release from the client removes the cap from the revocation list (but does not remove the cap) if the mds had sent a revoke. Without this the "client failed to respond to cap revoke" warning would show up since the mds is tracking the cap in revokes list but the client has already dropped it.
Got it. So with 9fbde6d the mds lost the cap metadata but the client keeps a note of it during open() when the open() call in the mds races with the delayed cap release (in the mds).
Looks fine so far.
So, we add the logic back where the mds would remove the cap from the revokes list and not remove the cap from the inode, but the client is not tracking the relevant caps. I need to closely check the code you added to send cap releases... |
This is what the I just add the revoke/grant cases. Please note only the |
| m->get_mseq(), | ||
| cap_epoch_barrier); | ||
| } else { | ||
| ldout(cct, 5) << __func__ << " don't have vino " << vino << ", dropping" << dendl; |
There was a problem hiding this comment.
The else case flushes the queued release caps. This behaviour is modified by this change - do you see any issues due to that?
There was a problem hiding this comment.
No, the else case will only print this debug log and the flush just out side the else case.
There was a problem hiding this comment.
Right, but the current code will always flush queued cap releases if vino isn't found.
There was a problem hiding this comment.
IMO that's should be okay and the flush cap release will be trigger per tick.
Why it flushed it here is because the MDS may waiting for the cap release here in IMPORT case. And only when the MDS is waiting for it will it make sense to trigger the flush immediately here, or the MDS may will be stuck for 1 seconds at most.
|
@lxbsz In the interest if time (to get this PR tested and merged), should we revert PR #47752 for the time being? EDIT: I already ack'd this and plan to put it to test, however, I might be on PTO for sometime. I'll try my best to get this tested asap, however, the main branch runs have the cap_id mismatch asserts. |
Sure. Later I will generate a new PR together with #51500. |
IMO, only revert 9fbde6d commit should be enough. |
True, however, I would prefer a PR revert (since it reverts the whole change) and then pick the other commits into this PR. WDYT? |
Sure, np. |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
Will add the debug logs later in confirm_receipt(). Fixes: https://tracker.ceph.com/issues/57244 Signed-off-by: Xiubo Li <xiubli@redhat.com>
When revoking caps from clients and if the clients could release some of the caps references and the clients still could send cap update request back to MDS, while the confirm_receipt() will clear the _revokes list anyway. But this cap will still be kept in revoking_caps list. At the same time add one debug log when revocation is not totally finished. Fixes: https://tracker.ceph.com/issues/57244 Signed-off-by: Xiubo Li <xiubli@redhat.com>
In case:
mds client
- Releases cap and put Inode
- Increase cap->seq and sends
revokes req to the client
- Receives release req and - Receives & drops the revoke req
skip removing the cap and
then eval the CInode and
issue or revoke caps again.
- Receives & drops the caps update
or revoke req
- Health warning for client
isn't responding to
mclientcaps(revoke)
All the IMPORT/REVOKE/GRANT cap ops will increase the session seq
in MDS side and then the client need to issue a cap release to
unblock MDS to remove the corresponding cap to unblock possible
waiters.
Fixes: https://tracker.ceph.com/issues/57244
Fixes: https://tracker.ceph.com/issues/61148
Signed-off-by: Xiubo Li <xiubli@redhat.com>
When writing to a file and the max_size is approaching the client will try to trigger to call check_caps() and flush the caps to MDS. But just in case the MDS is revoking Fsxrw caps, since the client keeps writing and holding the Fw caps it may only release part of the caps but the Fw. Fixes: https://tracker.ceph.com/issues/57244 Signed-off-by: Xiubo Li <xiubli@redhat.com>
|
Pushed the reverted commits back from #47752 to this new PR and removed the problematic one. |
|
Pkgs building now -- will update with the run link. |
|
Test run looks fine. I'll update the run wiki and merge this today. Nice work @lxbsz |
| session->enqueue_cap_release( | ||
| m->get_ino(), | ||
| m->get_cap_id(), | ||
| m->get_seq(), |
There was a problem hiding this comment.
This is problematic (change also in the kernel driver), causing:
https://tracker.ceph.com/issues/64977
It should be the last issue_seq but at this point the client does not have a valid issue_seq because the cap is already removed. Furthermore, the MDS does not share its view of the issue_seq associated with the grant/revoke; sadly only clients actually set the issue_seq in an MClientCaps message header.
For the CEPH_CAP_OP_IMPORT case, it was okay to use m->get_seq() because that's implicitly the issue_seq as well but this is not okay for the CEPH_CAP_OP_GRANT or CEPH_CAP_OP_REVOKE cases.
I'm confused by how the original issue occurs with the stated sequence in this PR's description:
https://tracker.ceph.com/issues/57244
specifically the second step on the MDS side "Receives release req and...". The MDS does not increase issue_seq when sending revokes. The client's release should be processed successfully. This is an expected sequence of events which was accounted for as long ago as b180b6c. I think there was something else going on here that should be investigated.
With that said, we're now obligated to support this new scenario where the client sends a cap release with issue_seq == (last revoke/grant) seq. This only happens if the client is basically saying: "I don't acknowledge that I have this cap; please release it.". The MDS should be able to handle this gracefully and that's basically done in #56828.
I'm in the middle of some cleanup relating to issue_seq but wanted to leave this note here for anyone looking at this in the future. Cleanup tracked in https://tracker.ceph.com/issues/66704.
There was a problem hiding this comment.
For the
CEPH_CAP_OP_IMPORTcase, it was okay to usem->get_seq()because that's implicitly theissue_seqas well but this is not okay for theCEPH_CAP_OP_GRANTorCEPH_CAP_OP_REVOKEcases.
Agreed. For the wider audience, the last_sent/last_issue/mseq has deep history to solve particularly this release/revoke race which unfortunately broke with this commit.
specifically the second step on the MDS side "Receives release req and...". The MDS does not increase
issue_seqwhen sending revokes. The client's release should be processed successfully. This is an expected sequence of events which was accounted for as long ago as b180b6c. I think there was something else going on here that should be investigated.
... and this wasn't caught with fs suite runs which makes me worried a lot. Yes, it's a concurrent release/revoke case, but really we aren't testing to the extent where such cases are covered (esp. since we have now seen a couple of reports for this bug, one without much scale involved).
There was a problem hiding this comment.
This is problematic (change also in the kernel driver), causing:
https://tracker.ceph.com/issues/64977
It should be the last
issue_seqbut at this point the client does not have a validissue_seqbecause the cap is already removed.
How could this happen is what I eagerly want to figure out, after that I think we can resolve all the confusion about the revoke stuck issues.
Furthermore, the MDS does not share its view of the
issue_seqassociated with the grant/revoke; sadly only clients actually set theissue_seqin anMClientCapsmessage header.
The issue_seq in client will be updated only together with the client request reply msgs.
For the
CEPH_CAP_OP_IMPORTcase, it was okay to usem->get_seq()because that's implicitly theissue_seqas well but this is not okay for theCEPH_CAP_OP_GRANTorCEPH_CAP_OP_REVOKEcases.I'm confused by how the original issue occurs with the stated sequence in this PR's description:
https://tracker.ceph.com/issues/57244
specifically the second step on the MDS side "Receives release req and...". The MDS does not increase
issue_seqwhen sending revokes. The client's release should be processed successfully. This is an expected sequence of events which was accounted for as long ago as b180b6c. I think there was something else going on here that should be investigated.With that said, we're now obligated to support this new scenario where the client sends a cap release with
issue_seq == (last revoke/grant) seq. This only happens if the client is basically saying: "I don't acknowledge that I have this cap; please release it.". The MDS should be able to handle this gracefully and that's basically done in #56828.I'm in the middle of some cleanup relating to
issue_seqbut wanted to leave this note here for anyone looking at this in the future. Cleanup tracked in https://tracker.ceph.com/issues/66704.
In case:
All the IMPORT/REVOKE/GRANT cap ops will increase the session seq
in MDS side and then the client need to issue a cap release to
unblock MDS to remove the corresponding cap to unblock possible
waiters.
Fixes: https://tracker.ceph.com/issues/57244
Fixes: https://tracker.ceph.com/issues/61148
Signed-off-by: Xiubo Li xiubli@redhat.com
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows