Conversation
leonid-s-usov
left a comment
There was a problem hiding this comment.
It looks like we have a working solution here :)
I'm concerned with the need to increase the gather context cardinality. Are we expecting more than 32 bits of individual sub-operations for the quiesce?
Another thing I couldn't spot is how new CInodes that get loaded under the quiescing realm will be immediately locked.
I would surely hope not. It was just defensive programming. |
The xlock on the filelock should prevent lookup/addition of new inodes (in fact it's missing for the root of the subvolume, I'll add that). |
We may go with this for the first iteration, but do we really want to prevent read-only access to new files while the quiesce is active? It's overly restrictive, isn't it? |
|
@leonid-s-usov updated with latest code. Still have some tests I'm working on. |
398a0b7 to
8dadedc
Compare
|
Added a commit for some initial developer docs on the protocol. |
anthonyeleven
left a comment
There was a problem hiding this comment.
Interesting stuff. I've left a number of nitpicky documentarian suggestions.
Thanks for the review, pushed changes. |
|
Developer doc here: https://ceph--54581.org.readthedocs.build/en/54581/dev/mds_internals/quiesce/ |
|
I cannot inline reply for some reason so leaving a standalone comment:
Locking the other inodes is also important as that will recall capabilities which do not hold the quiescelock. Exlusive locking the quiescelock only would simply prevent new requests from being processed. |
56e7db2 to
6138e27
Compare
This avoids waking up all waiters when only WAIT_XLOCK waiters should wake. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The previous scheme invalidated a lock cache and then immediately removed it from its Capability list. The lock cache would eventually be deleted but a new one could be constructed shortly after. The main reason for this is that simply invalidating the lock cache does not drive a state change in the local locks preventing new writers. This is mostly important for acquiring the quiescelock. This commit also corrects a bug where a MDLockCache would be created for a given opcode type (like create) when the capability does not have the issued cap (CEPH_CAP_DIR_CREATE). The bug would not cause any negative side-effects but would hold locks unnecessarily when only MDS ops (and not the client executing ops asynchronously) are acquiring the locks. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This is no longer necessary with the change to a LocalLock quiescelock. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This reverts commit 16e50ab. This flag is no longer necessary as the volumes plugin issues quiesce calls against the data (i.e. root) directory of the subvolume rather than the subvolume directory (with its associated .meta file). Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Instead of requiring the caller to put the arguments in a list, allow passing as regular arguments. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Otherwise it gets included in the *args list. This is necessary after commit `qa: simplify calls to (rank|mds)_(tell|asok)`. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
…round jobs Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
For storing misc. test artifacts. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
- Verify multirank quiesce incl. caps issued. - Unset splitauth experimental Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
To include changes relating to it now being a local lock that prevents mutable caps. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
|
@batrick Let me know if you want me to verify the fs suite run for this change (also, please link the latest one if that's the case). |
I'm planning to run it again with other fixes I have for |
|
https://tracker.ceph.com/projects/cephfs/wiki/Main#2024-03-20 So this has been quite an odyssey. I think this is ready for mergig. @vshankar please do the honors. Also these: |
|
retest this please |
|
jenkins test api |
|
jenkins test dashboard cephadm |
|
jenkins test make check arm64 |
|
🎉 |
See also: #54485
Currently TODO:
Newceph.quiesce.ignorevxattr for files (for use by volumes "meta" file)Code TODOsContribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windowsjenkins test rook e2e