Bug #66560
mgr/vol: get_next_job() from asyn_cloner failed because clone entry went missing
Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Backport: quincy,reef,squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): mgr/volumes
Labels (FS):
Pull request ID:
Tags (freeform): backport_processed
Merge Commit:
Fixed In: v19.3.0-3396-ge392142c65
Released In: v20.2.0~2506
Upkeep Timestamp: 2025-11-01T01:26:47+00:00
Description
This is tough to reproduce: the clone cancellation has to be timed to land just before get_next_job() in async_cloner.py runs.
Relevant entries captured from the MGR log (the left edge of each line is cut off in the capture):
8042] osd_op(unknown.0.0:1607 33.3e 33:7eaac5ee:::100000011a3.0000001c:head [write 0~4194304 in=4194304b] snapc 1=[] ondisk+write+kn __if_redirected+supports_pool_eio e622) 0x55aac7595400 con 0x55aab6b60d80
4-06-17T18:41:06.116+0530 7fcae7a006c0 1 -- 192.168.29.219:0/255696103 --> [v2:192.168.29.219:6826/2005942128,v1:192.168.29.219:6827/2 942128] -- osd_op(unknown.0.0:1608 33.8 33:11072d26:::100000011a3.0000001d:head [write 0~4194304 in=4194304b] snapc 1=[] ondisk+write+k n_if_redirected+supports_pool_eio e622) 0x55aac7555000 con 0x55aab88a1a80
4-06-17T18:41:06.121+0530 7fcb82a006c0 [volumes WARNING volumes.fs.async_job] traceback: Traceback (most recent call last):
ile "/home/rishabh/repos/ceph/mgr-vol-clone-stats/src/pybind/mgr/volumes/fs/async_job.py", line 52, in run
vol_job = self.async_job.get_job()
^^^^^^^^^^^^^^^^^^^^^^^^
ile "/home/rishabh/repos/ceph/mgr-vol-clone-stats/src/pybind/mgr/volumes/fs/async_job.py", line 195, in get_job
(ret, job) = self.get_next_job(volname, running_jobs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ile "/home/rishabh/repos/ceph/mgr-vol-clone-stats/src/pybind/mgr/volumes/fs/async_cloner.py", line 402, in get_next_job
return get_next_clone_entry(self.fs_client, self.vc.volspec, volname, running_jobs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ile "/home/rishabh/repos/ceph/mgr-vol-clone-stats/src/pybind/mgr/volumes/fs/async_cloner.py", line 35, in get_next_clone_entry
job = clone_index.get_oldest_clone_entry(running_jobs)
^^^^^^^^^^^^^
ile "/home/rishabh/repos/ceph/mgr-vol-clone-stats/src/pybind/mgr/volumes/fs/operations/clone_index.py", line 80, in get_oldest_clone_en
st = self.fs.lstat(dpath)
^^^^^^^
ile "cephfs.pyx", line 1997, in cephfs.LibCephFS.lstat
ile "cephfs.pyx", line 1978, in cephfs.LibCephFS.stat
hfs.ObjectNotFound: error in stat: /volumes/_index/clone/3efa5bf7-115c-4f27-8b83-8fecc16a690a: No such file or directory [Errno 2]
4-06-17T18:41:06.122+0530 7fcad9e006c0 1 192.168.29.219:0/3444413926 <== mds.0 v2:192.168.29.219:6834/1413894566 246 ==== client_re (???:127 = 0 (0) Success) ==== 882+0+0 (secure 0 0 0 0x55aaccb68a00 con 0x55aab70cb680
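The race in the traceback (an index entry is listed, then removed by a cancel before lstat() reaches it) can be illustrated with a small local-filesystem sketch. This is not the merged fix and does not use the cephfs bindings: `oldest_entry`, `racy_lstat` and `demo` are hypothetical names, plain files stand in for clone index entries, and `FileNotFoundError` stands in for `cephfs.ObjectNotFound`. It only shows one way a scan could tolerate an entry that vanishes mid-iteration.

```python
import os
import tempfile


def oldest_entry(index_dir, running_jobs=(), lstat=os.lstat):
    """Return the oldest entry name in index_dir that is not in running_jobs.

    Entries that disappear between listdir() and lstat() are skipped
    instead of aborting the whole scan. Returns None if nothing is left.
    """
    candidates = []
    for name in os.listdir(index_dir):
        if name in running_jobs:
            continue
        try:
            st = lstat(os.path.join(index_dir, name))
        except FileNotFoundError:
            # Removed after listing (for example, by a cancel); ignore it.
            continue
        candidates.append((st.st_ctime_ns, name))
    return min(candidates)[1] if candidates else None


def demo():
    """Simulate a cancel that removes entry 'a' right before it is stat'ed."""
    with tempfile.TemporaryDirectory() as index_dir:
        for name in ("a", "b"):
            open(os.path.join(index_dir, name), "w").close()

        def racy_lstat(path):
            # Hypothetical stand-in for the cancel winning the race.
            if os.path.basename(path) == "a" and os.path.exists(path):
                os.remove(path)
            return os.lstat(path)

        return oldest_entry(index_dir, lstat=racy_lstat)


if __name__ == "__main__":
    print(demo())  # prints "b": the vanished entry "a" is skipped
```

Without the `except FileNotFoundError` branch, the same scan would raise as soon as it reached the removed entry, which mirrors the failure shown in the log above.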
Updated by Rishabh Dave almost 2 years ago
- Status changed from New to Fix Under Review
Updated by Venky Shankar almost 2 years ago
@Rishabh Dave The log entries in the description aren't entirely clear, since they overlap. How is that possible?
Updated by Venky Shankar almost 2 years ago
- Category set to Correctness/Safety
- Target version set to v20.0.0
- Source set to Development
- Backport set to quincy,reef,squid
Updated by Rishabh Dave over 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Rishabh Dave over 1 year ago
- Copied to Backport #66927: reef: mgr/vol: get_next_job() from asyn_cloner failed because clone entry went missing added
Updated by Rishabh Dave over 1 year ago
- Copied to Backport #66928: squid: mgr/vol: get_next_job() from asyn_cloner failed because clone entry went missing added
Updated by Rishabh Dave over 1 year ago
- Copied to Backport #66929: quincy: mgr/vol: get_next_job() from asyn_cloner failed because clone entry went missing added
Updated by Rishabh Dave over 1 year ago
- Tags (freeform) set to backport_processed
Updated by Rishabh Dave over 1 year ago
- Status changed from Pending Backport to Resolved
All backports have been merged, marking this as resolved.
Updated by Upkeep Bot 9 months ago
- Merge Commit set to e392142c65c4186e4ef7365acf58001c512b769a
- Fixed In set to v19.3.0-3396-ge392142c65c
- Upkeep Timestamp set to 2025-06-26T20:05:21+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-3396-ge392142c65c to v19.3.0-3396-ge392142c65
- Upkeep Timestamp changed from 2025-06-26T20:05:21+00:00 to 2025-07-14T16:44:51+00:00
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~2506
- Upkeep Timestamp changed from 2025-07-14T16:44:51+00:00 to 2025-11-01T01:26:47+00:00