squid: mgr/vol: show progress and stats for the subvolume snapshot clones#61415
Draft
rishabh-d-dave wants to merge 11 commits intoceph:squidfrom
Draft
squid: mgr/vol: show progress and stats for the subvolume snapshot clones#61415rishabh-d-dave wants to merge 11 commits intoceph:squidfrom
rishabh-d-dave wants to merge 11 commits intoceph:squidfrom
Conversation
14 tasks
Contributor
|
This PR is under test in https://tracker.ceph.com/issues/71055. |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
f25002e to
10c3196
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
List volumes returns a list of of 1-member dictionaries where each
member is "{'name': <volname>}". This format is useful printing output
of the command "ceph fs volume ls" but for internal use this format is
hinderance since before using it, this list needs to be resolved from
list of dictionaries to list of strings.
Return a list of strings and move the code for converting it to list of
dictionaries to method "list_fs_volumes()" (which is method run by "ceph
fs volume ls" command).
This triggers change in async_job.py where we need volnames. The code
here converts list of dicts to list of string. This needs to removed.
Second, don't create unnecessary temporary variable "fs_map" in
"list_volumes()" since it is not required. To get fs names we can
directly iterate.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 9f355b6)
Output of command "ceph fs clone status" will now show the progress made by that specific clone. The output will show cloning completed in terms of percentage and amount of bytes that have been cloned and number of files that have been cloned. Fixes: https://tracker.ceph.com/issues/61904 Signed-off-by: Rishabh Dave <ridave@redhat.com> Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit d7bc828) Conflicts: src/pybind/mgr/volumes/fs/volume.py - minor conflict: line where open_subvol_in_group is imported is slightly different in squid compared to main.
Print a progress bar for ongoing clone job in output of "ceph status". When multiple clones are ongoing, show 1 progress bar in output of "ceph status" shows average of progress made by each clone. When number of clone job is more than number of clone threads, print 2 progress bars in output of "ceph status" command; one for ongoing clone jobs and other for ongoing+pending clone jobs. Fixes: https://tracker.ceph.com/issues/61904 Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 65b789e)
1. Let caller check for multiple states. It might happen that clone
finishes while it is being cancelled, in such cases user might want
to check for both.
2. Add a helper method to check if clone is in pending state and add a
separate method to check if clone is in cancelled state.
Signed-off-by: Rishabh Dave <ridave@redhat.com>
(cherry picked from commit 10949bf)
Add a helper method that accepts command arguments (along with rest of paramters accepted by the method run_shell()) and return the stdout of the command. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 9f60848)
Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit db0e736)
TestVolumesHelper._do_subvolume_io() is a helper method that allows users to generate data for testing. mgr/vol code that reports progress made by clone jobs depends on the value set for xattr rbytes. It takes a bit of a time for rbytes to be set. And, therefore, all tests in TestCloneProgressReporter needs to wait for subvolume's rbytes xattr's value to be set to the actual amount of data present on the subvolume before proceeding to actually testing. So that this can be achieved make _do_subvolume_io() return size of the data it has generated. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 92aecab)
Clone progress is shown to user through "ceph fs clone status" output and through "ceph status" output. Test both these features. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit e0c85b8)
Update docs and add release notes about the progress report that is printed in output of "ceph fs clone status" command and progress bars that is/are printed in output of "ceph status" command. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 645cc6e)
Test name is test_subvolume_snapshot_info_if_clone_pending_for_no_group, located in class TestSubvolumeSnapshotClones in test_volumes.py 5 seconds can (sometimes) be insufficient as value of the config option "snapshot_clone_delay" in this. Increase it to avoid unnecessary race conditions which leads to irrelevant failures. Following is an example where 5 seconds was insufficient as waiting period since instead it took 8 seconds - 2024-07-28T18:16:10.088 DEBUG:teuthology.orchestra.run.smithi064:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph config set mgr mgr/volumes/snapshot_clone_no_wait False ... 2024-07-28T18:16:18.694 DEBUG:teuthology.orchestra.run.smithi064:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs subvolume snapshot info cephfs subvol79370 subvol_snap40980 This issue was seen during testing of PR to which this commit belongs. This commit has been separated from the commit that adds tests for clone progress reporting so that it's easy to document need for this code patch and also track it. This commit is not being moved to a different PR and been kept on the same PR since it can't be reproduced otherwise. This also ensures that commit is backported to older release along with code that caused this issue, causing no one to need to find this commit while backporting effort. Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit a6b95a5)
snapshot is retained despite of deletion (using --retain-snapshots option of "subvolume rm" command). Signed-off-by: Rishabh Dave <ridave@redhat.com> (cherry picked from commit 9e34499)
10c3196 to
252e8b9
Compare
Contributor
Author
|
jenkins test api |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
backport tracker: https://tracker.ceph.com/issues/69561, https://tracker.ceph.com/issues/69562
backport of #54620, #59712
parent tracker: https://tracker.ceph.com/issues/61904, https://tracker.ceph.com/issues/67988
this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh
updated using ceph-backport.sh version 16.0.0.6848