mgr/vol: handle case where clone index entry goes missing #58101
rishabh-d-dave merged 1 commit into ceph:main
@vshankar I've squashed the commits and made the change you had suggested. It's ready for review. PTAL.
@vshankar Just a reminder, please re-review.
https://tracker.ceph.com/projects/cephfs/wiki/Main#11-JUL-2024
Testing took some time: first there were issues with building the testing branch, then there were too many infra failures, then there were failures due to one of the PRs in the testing branch, and finally the rerun of the last run had new failures that needed to be checked.
There were no failures related to this PR, QA was successful.
* refs/pull/58101/head: mgr/vol: handle case where clone index entry goes missing Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Kotresh Hiremath Ravishankar <khiremat@redhat.com>
I am seeing make check failures that seem unrelated - https://jenkins.ceph.com/job/ceph-pull-requests/138712/consoleFull#-1640523647e840cee4-f4a4-4183-81dd-42855615f2c1. Copying them below - Rebasing the PR...
In `async_cloner.py`, the clone index entry is fetched to get the next clone job that needs to be executed. It might happen that the clone job was cancelled just when it was going to be picked for execution (IOW, when it was about to move from pending state to in-progress state). Currently, MGR hangs in such a case because the exception `ObjectNotFound` from the CephFS Python bindings is raised and left uncaught. To prevent this issue, catch the exception, log it and return None to tell `get_job()` of `async_job.py` to look for the next job in the queue.

Increase the scope of the try-except in method `get_oldest_clone_entry()` of `async_cloner.py` so that `cephfs.Error`, or any exception under it, thrown by `self.fs.lstat()` is not left uncaught. The FS object is also passed to the method `list_one_entry_at_a_time()`, so increasing the scope of the try-except is useful as it will not allow exceptions raised in other calls to CephFS Python binding methods to be left uncaught.

Fixes: https://tracker.ceph.com/issues/66560
Signed-off-by: Rishabh Dave <ridave@redhat.com>
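For readers who want to see the shape of the fix, here is a minimal sketch of the widened try-except described in the commit message. It is not the code from this PR: `Error` and `ObjectNotFound` are local stand-ins for the CephFS binding exceptions, `FakeFS` is a hypothetical in-memory handle, and the signature of `get_oldest_clone_entry()` is simplified for illustration.

```python
import logging

log = logging.getLogger(__name__)


class Error(Exception):
    """Stand-in for cephfs.Error (assumption: base of the binding errors)."""


class ObjectNotFound(Error):
    """Stand-in for the ObjectNotFound exception named in the commit message."""


class FakeFS:
    """Hypothetical FS handle: maps index entry name -> creation time."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def listdir(self):
        return list(self.entries)

    def lstat(self, name):
        # Mimic the race: the entry was listed, then removed by a cancel.
        if name not in self.entries:
            raise ObjectNotFound(name)
        return self.entries[name]


def get_oldest_clone_entry(fs, names):
    """Return the oldest entry name, or None if any FS call fails."""
    try:
        # Every call on `fs` sits inside the try block, so a late
        # ObjectNotFound (or any other Error subclass) cannot escape.
        stats = [(fs.lstat(name), name) for name in names]
        return min(stats)[1] if stats else None
    except Error as e:
        log.warning("clone index entry went missing: %r", e)
        return None
```

Returning None instead of propagating the exception is what lets the caller treat "entry vanished" the same way as "nothing to do right now".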
make check passed after rebasing the PR branch. But the API test keeps failing with a different error each time it is run.
CI passed, proceeding to merge. Leaving a link to the comment where the QA run was reported - #58101 (review)
mgr/vol: handle case where clone index entry goes missing Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Kotresh HR <khiremat@redhat.com>
In async_cloner.py, the clone index entry is fetched to get the next clone
job that needs to be executed. It might happen that the clone job was
cancelled just when it was going to be picked for execution (IOW, when
it was about to move from pending state to in-progress state).
Currently, in such a case MGR hangs because the exception ObjectNotFound
from the CephFS Python bindings is raised and left uncaught. To prevent
this issue, catch the exception, log it and return None to tell get_job()
of async_job.py to look for the next job in the queue.
Fixes: https://tracker.ceph.com/issues/66560
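The consumer side of that contract can be sketched as follows. This is only an illustration of "None means look for the next job", not the actual `get_job()` from async_job.py: the queue of volume names, the `fetch_next` callback and the return shape are all assumptions made for the example.

```python
from collections import deque


def get_job(queue, fetch_next):
    """Return (volume, job) for the first volume that yields a job, else None.

    `queue` is a deque of volume names and `fetch_next` is a hypothetical
    callback that returns a job, or None when the entry is missing.
    """
    for _ in range(len(queue)):
        volume = queue[0]
        queue.rotate(-1)  # round-robin: move this volume to the back
        job = fetch_next(volume)
        if job is None:
            # Entry vanished or nothing pending: try the next volume
            # instead of blocking on this one.
            continue
        return volume, job
    return None
```

With this shape, a cancelled clone costs one skipped iteration rather than a hung worker.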