Bug #74984
openqa: Test failure: test_cephfs_mirror_incremental_sync/test_cephfs_mirror_sync_with_purged_snapshot (snapdiff bug)
0%
Description
With multi-threaded mirroring PR - https://github.com/ceph/ceph/pull/66572 the following test failed.
But I found the previous tracker for the same test failure - https://tracker.ceph.com/issues/68567
2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner:====================================================================== 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner:FAIL: test_cephfs_mirror_incremental_sync (tasks.cephfs.test_mirroring.TestMirroring.test_cephfs_mirror_incremental_sync) 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner:Test incremental snapshot synchronization (based on mtime differences). 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner:---------------------------------------------------------------------- 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner:Traceback (most recent call last): 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_ceph_ceph_ebc89147fc1a1731b84ef7e7003e19f704ba15c0/qa/tasks/cephfs/test_mirroring.py", line 1224, in test_cephfs_mirror_incremental_sync 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner: self.verify_snapshot(repo_path, 'snap_c') 2026-02-17T08:26:34.377 INFO:tasks.cephfs_test_runner: File "/home/teuthworker/src/github.com_ceph_ceph_ebc89147fc1a1731b84ef7e7003e19f704ba15c0/qa/tasks/cephfs/test_mirroring.py", line 355, in verify_snapshot 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner: self.assertTrue(source_res == dest_res) 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner:AssertionError: False is not true 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner: 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner:---------------------------------------------------------------------- 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner:Ran 10 tests in 1478.603s 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner: 2026-02-17T08:26:34.378 INFO:tasks.cephfs_test_runner:FAILED (failures=1)
Attached the teuthology log.
Files
Updated by Kotresh Hiremath Ravishankar about 1 month ago
- Related to Bug #68567: cephfs/qa : Test failure: test_cephfs_mirror_incremental_sync (tasks.cephfs.test_mirroring.TestMirroring) added
Updated by Kotresh Hiremath Ravishankar about 1 month ago
Analysis:
Let's find out what files have a checksum mismatch between snap_b and snap_c. Following is vimdiff output of the relevant checksums. The files prefixed with >>>>>>> are the files whose checksum is mismatched.
1. ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/HEAD
2. ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/refs/heads/giant
> (cd /home/ubuntu/cephtest/mnt.0 && exec find -L ceph_repo/ceph-qa-suite/.snap/snap_c -type f -exec md5sum '{}' +) | > (cd /home/ubuntu/cephtest/mnt.1 && exec find -L ceph_repo/ceph-qa-suite/.snap/snap_c -type f -exec md5sum '{}' +)
0325ee9c9e6f77d5225fe18047f164c5 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=lrc-k=4-m=2-l=3.yaml | 0325ee9c9e6f77d5225fe18047f164c5 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=lrc-k=4-m=2-l=3.yaml
1a5c8baa549bb0c3829b19679595bdec ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=jerasure-k=3-m=1.yaml | 1a5c8baa549bb0c3829b19679595bdec ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=jerasure-k=3-m=1.yaml
8c56465ee9a6befedf0331238c6e99b2 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=isa-k=2-m=1.yaml | 8c56465ee9a6befedf0331238c6e99b2 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=isa-k=2-m=1.yaml
dd04e4b851be6a74494d883501fbcd92 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=jerasure-k=2-m=1.yaml | dd04e4b851be6a74494d883501fbcd92 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-plugin=jerasure-k=2-m=1.yaml
63c228035e8eaadae6544472d4356d3e ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-default.yaml | 63c228035e8eaadae6544472d4356d3e ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-default.yaml
d9e6c95a9fba03c3d0483ff35f5781ca ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-parallel.yaml | d9e6c95a9fba03c3d0483ff35f5781ca ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-rados-parallel.yaml
+ +-- 32 lines: a9572c1de83c967b7568ffc2d30d9955 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-feature-plugins-v2.yaml-|+ +-- 32 lines: a9572c1de83c967b7568ffc2d30d9955 ceph_repo/ceph-qa-suite/.snap/snap_c/erasure-code/ec-feature-plugins-v2.yaml-
2c642152299a94e05ea26eae11993b13 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/pre-push.sample | 2c642152299a94e05ea26eae11993b13 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/pre-push.sample
054f9ffb8bfe04a599751cc757226dda ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/pre-applypatch.sample | 054f9ffb8bfe04a599751cc757226dda ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/pre-applypatch.sample
c7ab00c7784efeadad3ae9b228d4b4db ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/push-to-checkout.sample | c7ab00c7784efeadad3ae9b228d4b4db ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/push-to-checkout.sample
2b7ea5cee3c49ff53d41e00785eb974c ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/post-update.sample | 2b7ea5cee3c49ff53d41e00785eb974c ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/post-update.sample
ce562e08d8098926a3862fc6e7905199 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/applypatch-msg.sample | ce562e08d8098926a3862fc6e7905199 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/hooks/applypatch-msg.sample
e83c4980f9df005c2895481bcc21d7e9 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/FETCH_HEAD | e83c4980f9df005c2895481bcc21d7e9 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/FETCH_HEAD
>>>>>>>>> 3be557a48c3359d288a46cae7f4ad551 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/HEAD | f3024b5d1a846ae105d7d4cacfdae513 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/HEAD
>>>>>>>>> 3be557a48c3359d288a46cae7f4ad551 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/refs/heads/giant | f3024b5d1a846ae105d7d4cacfdae513 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/refs/heads/giant
bcab67411bcdcc060e91f92abac02e05 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/refs/remotes/origin/HEAD | bcab67411bcdcc060e91f92abac02e05 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/refs/remotes/origin/HEAD
af5764e80ecbb8ebd2c80ce52245a7cb ceph_repo/ceph-qa-suite/.snap/snap_c/.git/packed-refs | af5764e80ecbb8ebd2c80ce52245a7cb ceph_repo/ceph-qa-suite/.snap/snap_c/.git/packed-refs
8567afb7f379a049879428930232de4f ceph_repo/ceph-qa-suite/.snap/snap_c/.git/config | 8567afb7f379a049879428930232de4f ceph_repo/ceph-qa-suite/.snap/snap_c/.git/config
036208b4a1ab4a235d75c181e685e5a3 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/info/exclude | 036208b4a1ab4a235d75c181e685e5a3 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/info/exclude
12b0ca1b66811d4ba7b1995888364b89 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/refs/heads/giant | 12b0ca1b66811d4ba7b1995888364b89 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/refs/heads/giant
73a00957034783b7b5c8294c54cd3e12 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/refs/remotes/origin/HEAD | 73a00957034783b7b5c8294c54cd3e12 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/refs/remotes/origin/HEAD
+ +--1469 lines: 955820e484e24c8b38508e33f716cdf0 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/ORIG_HEAD-------------------------|+ +--1469 lines: 955820e484e24c8b38508e33f716cdf0 ceph_repo/ceph-qa-suite/.snap/snap_c/.git/ORIG_HEAD-------------------------
95bdf80fe0825fdf398bcf6213093fca ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/calamari/http_client.py | 95bdf80fe0825fdf398bcf6213093fca ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/calamari/http_client.py
92e0cf9059081b44542098467733c00a ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/blktrace.py | 92e0cf9059081b44542098467733c00a ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/blktrace.py
5510abee35b0218b0434cf6bbfc00ca3 ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/thrashosds.py | 5510abee35b0218b0434cf6bbfc00ca3 ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/thrashosds.py
90bb4a99dd25e96bba48535d418b8bb9 ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/radosgw_admin_rest.py | 90bb4a99dd25e96bba48535d418b8bb9 ceph_repo/ceph-qa-suite/.snap/snap_c/tasks/radosgw_admin_rest.py
3d6437a5afcd6b97cc47c6528512d9d6 ceph_repo/ceph-qa-suite/.snap/snap_c/.gitignore | 3d6437a5afcd6b97cc47c6528512d9d6 ceph_repo/ceph-qa-suite/.snap/snap_c/.gitignore
b4308a8dd3a52ad4bc2cf282239fc740 ceph_repo/ceph-qa-suite/.snap/snap_c/debug/mds_client.yaml | b4308a8dd3a52ad4bc2cf282239fc740 ceph_repo/ceph-qa-suite/.snap/snap_c/debug/mds_client.yaml
Let's see if the file ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/HEAD is queued for data sync by the crawler thread. The following confirms the file is not queued for the snapshot snap_c sync
grep "cephfs::mirror" ceph-client.mirror.22423.log | grep -E 'copy_to_remote|push_dataq_entry|wait_until_safe_to_snapshot' | grep -E '.git/logs/HEAD|wait_until_safe_to_snapshot' | less ------------------------------------ 2026-02-17T08:24:38.343+0000 7f5a660c7640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) push_dataq_entry: snapshot data replayer dataq pushed syncm=0x563dfdce3210 epath=./.git/logs/HEAD 2026-02-17T08:24:38.347+0000 7f5a648c4640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/logs/HEAD, num_blocks=1 2026-02-17T08:24:38.351+0000 7f5a648c4640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/logs/HEAD, block: [0~180] 2026-02-17T08:24:38.351+0000 7f5a648c4640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: truncating epath=./.git/logs/HEAD to 180 bytes 2026-02-17T08:24:38.351+0000 7f5a648c4640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/logs/HEAD error=0 synced 2026-02-17T08:24:44.327+0000 7f5a660c7640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) wait_until_safe_to_snapshot: Waiting for data sync to be done to take snapshot - dir_root=/ceph_repo/ceph-qa-suite current=snap_a,2 prev= syncm=0x563dfdce3210 2026-02-17T08:24:44.339+0000 7f5a660c7640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) wait_until_safe_to_snapshot: Woke up to take snapshot - dir_root=/ceph_repo/ceph-qa-suite current=snap_a,2 prev= syncm=0x563dfdce3210 2026-02-17T08:25:18.743+0000 7f5a660c7640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) push_dataq_entry: snapshot data replayer dataq pushed syncm=0x563dfe0cf690 epath=./.git/logs/HEAD 2026-02-17T08:25:18.743+0000 7f5a650c5640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/logs/HEAD, num_blocks=1 2026-02-17T08:25:18.743+0000 7f5a650c5640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/logs/HEAD, block: [0~335] 2026-02-17T08:25:18.747+0000 7f5a650c5640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: truncating epath=./.git/logs/HEAD to 335 bytes 2026-02-17T08:25:18.747+0000 7f5a650c5640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/logs/HEAD error=0 synced 2026-02-17T08:25:21.287+0000 7f5a660c7640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) wait_until_safe_to_snapshot: Waiting for data sync to be done to take snapshot - dir_root=/ceph_repo/ceph-qa-suite current=snap_b,3 prev= snap_a,2 syncm=0x563dfe0cf690 2026-02-17T08:25:22.531+0000 7f5a660c7640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) wait_until_safe_to_snapshot: Woke up to take snapshot - dir_root=/ceph_repo/ceph-qa-suite current=snap_b,3 prev= snap_a,2 syncm=0x563dfe0cf690 2026-02-17T08:25:53.882+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) wait_until_safe_to_snapshot: Waiting for data sync to be done to take snapshot - dir_root=/ceph_repo/ceph-qa-suite current=snap_c,4 prev= snap_b,3 syncm=0x563e12c24010 2026-02-17T08:25:53.882+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) wait_until_safe_to_snapshot: Woke up to take snapshot - dir_root=/ceph_repo/ceph-qa-suite current=snap_c,4 prev= snap_b,3 syncm=0x563e12c24010 --------------------------------------
Let's check whether the Snapdiff readdir really returned the file ceph_repo/ceph-qa-suite/.snap/snap_c/.git/logs/HEAD. From below logs, it confirms the snapdiff is not returned that file. It has only returned . and ..
------------- 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: entry=logs, snapid=4 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: epath=./.git/logs 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) init_directory: epath=./.git/logs 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) init_directory: open_snapdiff for dir_root=/ceph_repo/ceph-qa-suite, path=./.git/logs, prev=snap_b, current=snap_c 2026-02-17T08:25:51.306+0000 7f5a640c3640 5 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) should_sync_entry: entry=./.git/FETCH_HEAD, r=-2 2026-02-17T08:25:51.306+0000 7f5a640c3640 5 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) run_datasync: syncm=0x563e12c24010 entry=./.git/FETCH_HEAD data_sync=1 attr_sync=1 2026-02-17T08:25:51.306+0000 7f5a640c3640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) remote_file_op: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/FETCH_HEAD, need_data_sync=1, need_attr_sync=1 2026-02-17T08:25:51.306+0000 7f5a640c3640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_changed_blocks: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/FETCH_HEAD, sync_check=1 2026-02-17T08:25:51.306+0000 7f5a640c3640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_changed_blocks: epath=./.git/FETCH_HEAD 2026-02-17T08:25:51.306+0000 7f5a640c3640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) copy_to_remote: dir_root=/ceph_repo/ceph-qa-suite, epath=./.git/FETCH_HEAD, num_blocks=1 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: Added directory to stack =./.git/logs 2026-02-17T08:25:51.306+0000 7f5a668c8640 10 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) remote_mkdir: remote epath=./.git/logs 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) do_synchronize: r=0 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) do_synchronize: epath=./.git/logs 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: sync stack size=3 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: top of stack path=./.git/logs 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) next_entry: is snapdiff 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: entry=., snapid=3 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) next_entry: is snapdiff 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: entry=.., snapid=3 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) next_entry: is snapdiff 2026-02-17T08:25:51.306+0000 7f5a668c8640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: entry=refs, snapid=4 ------------
@Venky Shankar As I had mentioned over the chat, the test was failing on main without multithreading intermittently.
https://tracker.ceph.com/issues/68567 - #61068 (comment)
You did mention you fixed an issue with snapdiff while working on block diff, could you please check if this is the same or different issue?
Updated by Kotresh Hiremath Ravishankar about 1 month ago
Here is the analysis for second file `giant`. Between `snap_a` and `snap_b`, the snapdiff returned the file. But between `snap_b` and `snap_c` the file is not returned because of `mtime` not changes. The logs alone won't confirm it. Logs along with code reading confirms.
For snap_b sync, giant is successfully returned by snapdiff. MIRRO CLIENT LOGS below (SUCCESS CASE)
--------------------
T08:23:57.164054+0000 change_attr=1 caps=pAsLsXsFs parents=0x10000000223.3["heads" ref=2 ino=0x1000000022b.3 csg=0] 0x563e5264ca00) n = 1
2026-02-17T08:25:18.751+0000 7f5a660c7640 10 client.7438 send_request client_request(unknown.0:5483 readdir_snapdiff #0x10000000001//snap_a/.git/logs/refs/heads 2026-02-17T08:25:18.754966+0000 caller_uid=0, caller_gid=0{}) to mds.0
2026-02-17T08:25:18.751+0000 7f5a660c7640 1 -- 10.20.193.90:0/1293359666 --> [v2:10.20.193.90:6833/3272978257,v1:10.20.193.90:6834/3272978257] -- client_request(unknown.0:5483 readdir_snapdiff #0x10000000001//snap_a/.git/logs/refs/heads 2026-02-17T08:25:
18.754966+0000 caller_uid=0, caller_gid=0{}) -- 0x563e52612a80 con 0x563e55eacc00
2026-02-17T08:25:18.751+0000 7f5a660c7640 20 client.7438 awaiting reply|forward|kick on 0x7f5a660c2b90
2026-02-17T08:25:18.751+0000 7f5a738e2640 1 -- 10.20.193.90:0/1293359666 <== mds.0 v2:10.20.193.90:6833/3272978257 5493 ==== client_reply(???:5483 = 0 (0) Success safe) ==== 784+0+0 (crc 0 0 0) 0x563dfe066c00 con 0x563e55eacc00
2026-02-17T08:25:18.751+0000 7f5a738e2640 20 client.7438 handle_client_reply got a reply. Safe:1 tid 5483
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 insert_trace from 2026-02-17T08:25:18.754981+0000 mds.0 is_target=1 is_dentry=0
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 features 0xffffffffffffffff
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 hrm is_target=1 is_dentry=0
2026-02-17T08:25:18.751+0000 7f5a738e2640 12 client.7438 add_update_inode had 0x1000000022b.3(faked_ino=0 nref=9 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000 ctime=202
6-02-17T08:23:57.164054+0000 change_attr=1 caps=pAsLsXs parents=0x10000000223.3["heads" ref=1 ino=0x1000000022b.3 csg=0] 0x563e5264ca00) caps pAsLsXsFs
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 update_inode_file_time 0x1000000022b.3(faked_ino=0 nref=9 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000 ctime=2
026-02-17T08:23:57.164054+0000 change_attr=1 caps=pAsLsXs parents=0x10000000223.3["heads" ref=1 ino=0x1000000022b.3 csg=0] 0x563e5264ca00) - ctime 2026-02-17T08:23:57.164054+0000 mtime 2026-02-17T08:23:57.164054+0000
2026-02-17T08:25:18.751+0000 7f5a738e2640 20 client.7438 dir hash is 2
2026-02-17T08:25:18.751+0000 7f5a738e2640 12 client.7438 add_update_inode client inode change_attr: 1 , mds inodestat change_attr: 1
2026-02-17T08:25:18.751+0000 7f5a738e2640 15 open_dir 0x563dff1cde00 on 0x563e5264ca00
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 insert_readdir_results 1 readdir items, end=1, hash_order=1, readdir_start , last_hash 0, next_offset 2
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 _readdir_drop_dirp_buffer 0x563dfe1e0e00
2026-02-17T08:25:18.751+0000 7f5a738e2640 15 client.7438 0: 'giant'
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 update_inode_file_time 0x1000000022c.3(faked_ino=0 nref=1 ll_ref=0 cap_refs={} open={} mode=100664 size=0/0 nlink=1 btime=2026-02-17T08:23:57.164054+0000 mtime=0.000000 ctime=0.000000 change_attr=0
caps=- objectset[0x1000000022c ts 0/0 objects 0 dirty_or_tx 0] 0x563e5264d800) - ctime 2026-02-17T08:25:15.747320+0000 mtime 2026-02-17T08:25:15.747320+0000
2026-02-17T08:25:18.751+0000 7f5a738e2640 10 client.7438 size 0 -> 335
2026-02-17T08:25:18.751+0000 7f5a738e2640 12 client.7438 add_update_inode client inode change_attr: 0 , mds inodestat change_attr: 2
2026-02-17T08:25:18.751+0000 7f5a738e2640 12 client.7438 add_update_inode adding 0x1000000022c.3(faked_ino=0 nref=1 ll_ref=0 cap_refs={} open={} mode=100664 size=335/0 nlink=1 btime=2026-02-17T08:23:57.164054+0000 mtime=2026-02-17T08:25:15.747320+0000 cti
me=2026-02-17T08:25:15.747320+0000 change_attr=2 caps=- objectset[0x1000000022c ts 0/0 objects 0 dirty_or_tx 0] 0x563e5264d800) caps pLs
2026-02-17T08:25:18.751+0000 7f5a738e2640 15 client.7438 link dir 0x1000000022b.3(faked_ino=0 nref=10 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000 ctime=2026-02-17T08:
23:57.164054+0000 change_attr=1 caps=pAsLsXsFs parents=0x10000000223.3["heads" ref=2 ino=0x1000000022b.3 csg=0] 0x563e5264ca00) 'giant' to inode 0x1000000022c.3(faked_ino=0 nref=1 ll_ref=0 cap_refs={} open={} mode=100664 size=335/0 nlink=1 btime=2026-02-1
7T08:23:57.164054+0000 mtime=2026-02-17T08:25:15.747320+0000 ctime=2026-02-17T08:25:15.747320+0000 change_attr=2 caps=pLs objectset[0x1000000022c ts 0/0 objects 0 dirty_or_tx 0] 0x563e5264d800) dn 0x1000000022b.3["giant" ref=1 ino=nil csg=-1] (new dn)
2026-02-17T08:25:18.751+0000 7f5a738e2640 20 client.7438 put_inode on 0x1000000022c.3(faked_ino=0 nref=3 ll_ref=0 cap_refs={} open={} mode=100664 size=335/0 nlink=1 btime=2026-02-17T08:23:57.164054+0000 mtime=2026-02-17T08:25:15.747320+0000 ctime=2026-02-
17T08:25:15.747320+0000 change_attr=2 caps=pLs objectset[0x1000000022c ts 0/0 objects 0 dirty_or_tx 0] parents=0x1000000022b.3["giant" ref=1 ino=0x1000000022c.3 csg=-1] 0x563e5264d800) n = 1
2026-02-17T08:25:18.751+0000 7f5a738e2640 20 client.7438 link inode 0x563e5264d800 parents now 0x1000000022b.3["giant" ref=1 ino=0x1000000022c.3 csg=-1]
2026-02-17T08:25:18.751+0000 7f5a738e2640 15 client.7438 update_dentry_lease 0x1000000022b.3["giant" ref=1 ino=0x1000000022c.3 csg=-1] lease(mask 0 dur 0) from 2026-02-17T08:25:18.754981+0000
2026-02-17T08:25:18.751+0000 7f5a738e2640 15 client.7438 insert_readdir_results ffbbdabc0000002: 'giant' -> 0x1000000022c
2026-02-17T08:25:18.751+0000 7f5a738e2640 20 client.7438 handle_client_reply signalling caller 0x7f5a660c2b90
2026-02-17T08:25:18.751+0000 7f5a738e2640 20 client.7438 handle_client_reply awaiting kickback on tid 5483 0x7f5a738de840
2026-02-17T08:25:18.751+0000 7f5a660c7640 20 client.7438 sendrecv kickback on tid 5483 0x7f5a738de840
2026-02-17T08:25:18.751+0000 7f5a660c7640 20 client.7438 lat 2049-12-20T22:30:42.378612+0000
2026-02-17T08:25:18.751+0000 7f5a660c7640 10 client.7438 _readdir_get_frag 0x563dfe1e0e00 got frag * size 1
2026-02-17T08:25:18.751+0000 7f5a660c7640 10 client.7438 _readdir_r_cb frag * buffer size 1 offset 2
2026-02-17T08:25:18.751+0000 7f5a660c7640 10 client.7438 fill_statx on 0x1000000022c snap/dev3 mode 0100664 mtime 2026-02-17T08:25:15.747320+0000 ctime 2026-02-17T08:25:15.747320+0000 change_attr 2
2026-02-17T08:25:18.751+0000 7f5a660c7640 20 client.7438 _unwrap_name: (giant, ) on 0x1000000022b.2(faked_ino=0 nref=9 ll_ref=0 cap_refs={} open={1=1} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000
ctime=2026-02-17T08:23:57.164054+0000 change_attr=1 caps=pAsLsXsFs COMPLETE parents=0x10000000223.2["heads" ref=2 ino=0x1000000022b.2 csg=0] 0x563e3d102700)
2026-02-17T08:25:18.751+0000 7f5a660c7640 10 client.7438 fill_dirent 'giant' -> 0x1000000022c type 8 w/ next_off ffbbdabc0000003
2026-02-17T08:25:18.751+0000 7f5a660c7640 15 client.7438 _readdir_r_cb de giant off ffbbdabc0000002 snap 3 = 1
2026-02-17T08:25:18.751+0000 7f5a660c7640 15 client.7438 readdir_snapdiff 1 giant 3
2026-02-17T08:25:18.751+0000 7f5a660c7640 10 client.7438 path_walk: cur=0x10000000001.3(faked_ino=0 nref=219 ll_ref=6 cap_refs={} open={0=6} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:53.765968+0000 mtime=2026-02-17T08:24:03.332008+0000 ctime=2026
-02-17T08:25:15.770711+0000 change_attr=16 caps=pAsLsXsFs parents=0x10000000001.snapdir["snap_b" ref=3 ino=0x10000000001.3 csg=0] 0x563e523ef100) path=
2026-02-17T08:25:18.751+0000 7f5a660c7640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: entry=giant, snapid=3
2026-02-17T08:25:18.751+0000 7f5a660c7640 20 cephfs::mirror::PeerReplayer(85ef01e2-ddbc-48a0-b81f-73a105f4bf91) get_entry: epath=./.git/logs/refs/heads/giant
----------------------------
The corresponding mds logs for snap_b snapdiff for file giant. MDS RESPONSE (SUCCESS CASE)
------------------------
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server handle_client_readdir_snapdiff frag * offset '' offset_hash 0 flags 1
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server handle_client_readdir_snapdiff on [dir 0x1000000022b /ceph_repo/ceph-qa-suite/.git/logs/refs/heads/ [2,head] auth v=9 cv=0/0 state=1610874881|complete f(v0 m2026-02-17T08:23:57.164054+0000 1=1+0) n(v0 rc2026-02-17T08:25:15.747320+0000 b335 1=1+0) hs=1+0,ss=1+0
dirty=2 | child=1 dirty=1 authpin=0 0x55f599dac400]
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server handle_client_readdir_snapdiff snap 2 vs. snap 3
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.server build_snap_diff (giant,2)->[dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [2,2] auth (dversion lock) v=7 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 inodepin=1 dirty=1 0x55f59b50d180]
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.server build_snap_diff (giant,head)->[dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [3,head] auth (dversion lock) v=8 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 au
thpin=0 0x55f599b0f180]
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.server build_snap_diff new file giant 3/head
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server inc dn [dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [3,head] auth (dversion lock) v=8 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 0x55f599b0f180]
as giant hash 0xbbdabc 3
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.locker issue_client_lease no/null lease on [dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [3,head] auth (dversion lock) v=8 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 lock=0 inodepin=1 dirt
y=1 authpin=0 0x55f599b0f180]
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server inc inode [inode 0x1000000022c [3,head] /ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant auth v8 DIRTYPARENT s=335 n(v0 rc2026-02-17T08:25:15.747320+0000 b335 1=1+0) (iauth excl) (ifile excl) (ixattr excl) cr={7336=0-4194304@3} caps={7336=pAsxLsXsxFsxcrwb/p
AsxXsxFxwb@4},l=7336 | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 dirty=1 authpin=0 0x55f599e4a000] snap 3
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.cache.snaprealm(0x10000000001 seq 3 0x55f599c24000) get_snaps 2,3 (seq 3 cached_seq 3)
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.cache.snaprealm(0x10000000001 seq 3 0x55f599c24000) get_snap_info snaps 2,3
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.cache.snaprealm(0x1 seq 1 0x55f598c758c0) get_snaps (seq 1 cached_seq 1)
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.cache.snaprealm(0x1 seq 1 0x55f598c758c0) get_snap_info snaps
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.cache.ino(0x1000000022c) pfile 0 pauth 0 plink 0 pxattr 0 plocal 0 mtime 2026-02-17T08:25:15.747320+0000 ctime 2026-02-17T08:25:15.747320+0000 change_attr 2 valid=1
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.cache.ino(0x1000000022c) encode_inodestat caps pLs seq 0 mseq 0 xattrv 1
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.142 get_session have 0x55f599bb2800 client.7438 10.20.193.90:0/1293359666 state open
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server reply to client_request(client.7438:5483 readdir_snapdiff #0x10000000001//snap_a/.git/logs/refs/heads 2026-02-17T08:25:18.754966+0000 caller_uid=0, caller_gid=0{}) readdir num=1 bytes=405 start=1 end=1
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.bal hit_dir 2 pop is 1.01631, frag * size 1 [pop IRD:[C 3.41e-02] IWR:[C 2.16e+00] RDR:[C 1.02e+00] FET:[C 0.00e+00] STR:[C 0.00e+00] *LOAD:5.4]
2026-02-17T08:25:18.751+0000 7f8fe76eb640 20 mds.0.bal hit_dir 2 pop 10.1794 spread in [dir 0x1000000022b /ceph_repo/ceph-qa-suite/.git/logs/refs/heads/ [2,head] auth v=9 cv=0/0 state=1610874881|complete f(v0 m2026-02-17T08:23:57.164054+0000 1=1+0) n(v0 rc2026-02-17T08:25:15.747320+0000 b335 1=1+0) hs=1+0,ss=1+0 dirt
y=2 | child=1 dirty=1 authpin=0 0x55f599dac400]
2026-02-17T08:25:18.751+0000 7f8fe76eb640 7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.7438:5483 readdir_snapdiff #0x10000000001//snap_a/.git/logs/refs/heads 2026-02-17T08:25:18.754966+0000 caller_uid=0, caller_gid=0{})
2026-02-17T08:25:18.751+0000 7f8fe76eb640 10 mds.0.server apply_allocated_inos 0x0 / [] / 0x0
-----------------------------------------
Let's analyse the same for snap_c - giant file is not returned - MIRROR CLIENT LOGS (FAILURE CASE)
------------------------
2026-02-17T08:25:51.314+0000 7f5a668c8640 10 client.7438 send_request client_request(unknown.0:9584 readdir_snapdiff #0x10000000001//snap_b/.git/logs/refs/heads 2026-02-17T08:25:51.319212+0000 caller_uid=0, caller_gid=0{}) to mds.0
2026-02-17T08:25:51.314+0000 7f5a668c8640 1 -- 10.20.193.90:0/1293359666 --> [v2:10.20.193.90:6833/3272978257,v1:10.20.193.90:6834/3272978257] -- client_request(unknown.0:9584 readdir_snapdiff #0x10000000001//snap_b/.git/logs/refs/heads 2026-02-17T08:25:51.319212+0000 caller_uid=0, caller_gid=0{}) -- 0x563e52443880 con 0x563e55eacc00
2026-02-17T08:25:51.314+0000 7f5a668c8640 20 client.7438 awaiting reply|forward|kick on 0x7f5a668c3b90
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 put_inode on 0x10000000223.3(faked_ino=0 nref=14 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.110493+0000 mtime=2026-02-17T08:23:57.162395+0000 ctime=2026-02-17T08:23:57.162395+0000 change_attr=2 caps=pAsLsXsFs parents=0x10000000222.3["refs" ref=2 ino=0x10000000223.3 csg=0] 0x563e5264c300) n = 1
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 put_inode on 0x10000000223.3(faked_ino=0 nref=14 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.110493+0000 mtime=2026-02-17T08:23:57.162395+0000 ctime=2026-02-17T08:23:57.162395+0000 change_attr=2 caps=pAsLsXsFs parents=0x10000000222.3["refs" ref=2 ino=0x10000000223.3 csg=0] 0x563e5264c300) n = 1
2026-02-17T08:25:51.314+0000 7f5a738e2640 1 -- 10.20.193.90:0/1293359666 <== mds.0 v2:10.20.193.90:6833/3272978257 9597 ==== client_reply(???:9584 = 0 (0) Success safe) ==== 403+0+0 (crc 0 0 0) 0x563e3cef8200 con 0x563e55eacc00
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 handle_client_reply got a reply. Safe:1 tid 9584
2026-02-17T08:25:51.314+0000 7f5a738e2640 10 client.7438 insert_trace from 2026-02-17T08:25:51.319225+0000 mds.0 is_target=1 is_dentry=0
2026-02-17T08:25:51.314+0000 7f5a738e2640 10 client.7438 features 0xffffffffffffffff
2026-02-17T08:25:51.314+0000 7f5a738e2640 10 client.7438 hrm is_target=1 is_dentry=0
2026-02-17T08:25:51.314+0000 7f5a738e2640 12 client.7438 add_update_inode had 0x1000000022b.4(faked_ino=0 nref=9 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000 ctime=2026-02-17T08:23:57.164054+0000 change_attr=1 caps=pAsLsXs parents=0x10000000223.4["heads" ref=1 ino=0x1000000022b.4 csg=0] 0x563e12c40700) caps pAsLsXsFs
2026-02-17T08:25:51.314+0000 7f5a738e2640 10 client.7438 update_inode_file_time 0x1000000022b.4(faked_ino=0 nref=9 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000 ctime=2026-02-17T08:23:57.164054+0000 change_attr=1 caps=pAsLsXs parents=0x10000000223.4["heads" ref=1 ino=0x1000000022b.4 csg=0] 0x563e12c40700) - ctime 2026-02-17T08:23:57.164054+0000 mtime 2026-02-17T08:23:57.164054+0000
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 dir hash is 2
2026-02-17T08:25:51.314+0000 7f5a738e2640 12 client.7438 add_update_inode client inode change_attr: 1 , mds inodestat change_attr: 1
2026-02-17T08:25:51.314+0000 7f5a738e2640 15 open_dir 0x563dff0aa0c0 on 0x563e12c40700
2026-02-17T08:25:51.314+0000 7f5a738e2640 10 client.7438 insert_readdir_results 0 readdir items, end=1, hash_order=1, readdir_start , last_hash 0, next_offset 2
2026-02-17T08:25:51.314+0000 7f5a738e2640 10 client.7438 _readdir_drop_dirp_buffer 0x563e12c26700
2026-02-17T08:25:51.314+0000 7f5a738e2640 15 client.7438 close_dir dir 0x563dff0aa0c0 on 0x563e12c40700
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 put_inode on 0x1000000022b.4(faked_ino=0 nref=10 ll_ref=0 cap_refs={} open={} mode=40775 size=0/0 nlink=1 btime=2026-02-17T08:23:57.162395+0000 mtime=2026-02-17T08:23:57.164054+0000 ctime=2026-02-17T08:23:57.164054+0000 change_attr=1 caps=pAsLsXsFs parents=0x10000000223.4["heads" ref=1 ino=0x1000000022b.4 csg=0] 0x563e12c40700) n = 1
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 handle_client_reply signalling caller 0x7f5a668c3b90
2026-02-17T08:25:51.314+0000 7f5a738e2640 20 client.7438 handle_client_reply awaiting kickback on tid 9584 0x7f5a738de840
2026-02-17T08:25:51.314+0000 7f5a668c8640 20 client.7438 sendrecv kickback on tid 9584 0x7f5a738de840
2026-02-17T08:25:51.314+0000 7f5a668c8640 20 client.7438 lat 2049-12-20T22:30:42.378632+0000
2026-02-17T08:25:51.314+0000 7f5a668c8640 10 client.7438 _readdir_get_frag 0x563e12c26700 got frag * size 0
2026-02-17T08:25:51.314+0000 7f5a668c8640 10 client.7438 _readdir_r_cb frag * buffer size 0 offset 2
2026-02-17T08:25:51.314+0000 7f5a668c8640 15 client.7438 readdir_snapdiff 0 .. 3
2026-02-17T08:25:51.314+0000 7f5a668c8640 3 client.7438 closedir(0x563e12c26540) = 0
------------------
snap_c - giant file mds response - MDS RESPONSE (FAILURE CASE)
----------------
2026-02-17T08:25:51.314+0000 7f8fe76eb640 10 mds.0.server handle_client_readdir_snapdiff frag * offset '' offset_hash 0 flags 1
2026-02-17T08:25:51.314+0000 7f8fe76eb640 10 mds.0.server handle_client_readdir_snapdiff on [dir 0x1000000022b /ceph_repo/ceph-qa-suite/.git/logs/refs/heads/ [2,head] auth pv=15 v=13 cv=0/0 ap=1+1 state=1610874881|complete f(v0 m2026-02-17T08:23:57.164054
+0000 1=1+0)->f(v0 m2026-02-17T08:23:57.164054+0000 1=1+0) n(v0 rc2026-02-17T08:25:15.747320+0000 b335 1=1+0)->n(v0 rc2026-02-17T08:25:50.901948+0000 b485 1=1+0) hs=1+0,ss=2+0 dirty=3 | child=1 dirty=1 authpin=1 0x55f599dac400]
2026-02-17T08:25:51.314+0000 7f8fe76eb640 10 mds.0.server handle_client_readdir_snapdiff snap 3 vs. snap 4
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.server build_snap_diff (giant,2)->[dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [2,2] auth (dversion lock) v=7 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 inodepin=1 dirty=1 0x55f59b50d180]
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.server build_snap_diff not in range, skipping
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.server build_snap_diff (giant,3)->[dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [3,3] auth (dversion lock) v=11 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 inodepin=1 dirty=1 0x55f59a589180]
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.server build_snap_diff (giant,head)->[dentry #0x1/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant [4,head] auth (dversion lock) pv=14 v=12 ino=0x1000000022c remote_ino=0x0 referent_inode_ptr=(nil) referent_ino=0x0 state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 0x55f599b0f180]
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.142 get_session have 0x55f599bb2800 client.7438 10.20.193.90:0/1293359666 state open
2026-02-17T08:25:51.314+0000 7f8fe76eb640 10 mds.0.server reply to client_request(client.7438:9584 readdir_snapdiff #0x10000000001//snap_b/.git/logs/refs/heads 2026-02-17T08:25:51.319212+0000 caller_uid=0, caller_gid=0{}) readdir num=0 bytes=24 start=1 end=1
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.bal hit_dir 2 pop is 0.0431002, frag * size 1 [pop IRD:[C 9.85e-02] IWR:[C 1.93e+00] RDR:[C 4.31e-02] FET:[C 0.00e+00] STR:[C 0.00e+00] *LOAD:4.0]
2026-02-17T08:25:51.314+0000 7f8fe76eb640 20 mds.0.bal hit_dir 2 pop 0.474103 spread in [dir 0x1000000022b /ceph_repo/ceph-qa-suite/.git/logs/refs/heads/ [2,head] auth pv=15 v=13 cv=0/0 ap=1+1 state=1610874881|complete f(v0 m2026-02-17T08:23:57.164054+0000 1=1+0)->f(v0 m2026-02-17T08:23:57.164054+0000 1=1+0) n(v0 rc2026-02-17T08:25:15.747320+0000 b335 1=1+0)->n(v0 rc2026-02-17T08:25:50.901948+0000 b485 1=1+0) hs=1+0,ss=2+0 dirty=3 | child=1 dirty=1 authpin=1 0x55f599dac400]
2026-02-17T08:25:51.314+0000 7f8fe76eb640 7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.7438:9584 readdir_snapdiff #0x10000000001//snap_b/.git/logs/refs/heads 2026-02-17T08:25:51.319212+0000 caller_uid=0, caller_gid=0{})
2026-02-17T08:25:51.314+0000 7f8fe76eb640 10 mds.0.server apply_allocated_inos 0x0 / [] / 0x0
------------------------
Updated by Kotresh Hiremath Ravishankar 27 days ago
- Subject changed from qa: Test failure: test_cephfs_mirror_incremental_sync (tasks.cephfs.test_mirroring.TestMirroring) to qa: Test failure: test_cephfs_mirror_incremental_sync/test_cephfs_mirror_sync_with_purged_snapshot (tasks.cephfs.test_mirroring.TestMirroring)
Updated by Kinjal Dhaduk 26 days ago
I am able to reproduce this issue with ceph 9.0 GA build.
Setup:
-------
1. Created fresh primary_fs and secondary_fs filesystems
2. Created subvolume subvol_1 under subvolgroup_1 on primary_fs
3. Mounted primary_fs at /mnt/primary and secondary_fs at /mnt/secondary
4. Enabled CephFS mirroring with bootstrap token peering
5. Added /volumes/subvolgroup_1/subvol_1 for mirroring
6. Enabled mirror daemon debug logging (debug_cephfs_mirror 20)
Steps Performed:
1. Cloned ceph-qa-suite repo (branch giant) onto primary mount:
git clone --branch giant http://github.com/ceph/ceph-qa-suite /mnt/primary/volumes/subvolgroup_1/subvol_1/ceph_repo/ceph-qa-suite
2. Created snap_a — waited for full sync to complete. Checksums matched.
3. Ran git reset --hard HEAD~7 — created snap_b — waited for incremental sync. Checksums matched.
4. Ran git pull — created snap_c — waited for incremental sync.
Result
-------
Checksum mismatch on snap_c for the same two files reported in the original issue:
< 102d3866b54d386ef781863f9573be9e .snap/snap_c/ceph_repo/ceph-qa-suite/.git/logs/HEAD
b3b247fe5260fa8d7fb22585b5d30ac5 .snap/snap_c/ceph_repo/ceph-qa-suite/.git/logs/HEAD
< 102d3866b54d386ef781863f9573be9e .snap/snap_c/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant
b3b247fe5260fa8d7fb22585b5d30ac5 .snap/snap_c/ceph_repo/ceph-qa-suite/.git/logs/refs/heads/giant
Primary (correct — 3 reflog entries):
clone: from http://github.com/ceph/ceph-qa-suite
reset: moving to HEAD~7
pull: Fast-forward
Secondary (stale — missing pull entry):
clone: from http://github.com/ceph/ceph-qa-suite
reset: moving to HEAD~7
Updated by Kinjal Dhaduk 26 days ago
- File ceph-client.cephfs-mirror.ceph1-kdhaduk-gu03xh-node6.mwksix.log.zip ceph-client.cephfs-mirror.ceph1-kdhaduk-gu03xh-node6.mwksix.log.zip added
Please find attached mirror daemon logs.
Updated by Venky Shankar 26 days ago
The reason for the missing entries from snapdiff is due to the fact that the snapshot writeback is still in-progress for some of the inodes and until that finishes and the MDS does a copy-onm-write of the snap inode (omap entry), the HEAD inode for the directory entry is used by the MDS for mtime/{first, last} comparison. Since the mtime hasn't been updated yet (snapshot write back still in-progress), some directory entries get excluded from the snapdiff call.
Updated by Venky Shankar 19 days ago
- Category set to Correctness/Safety
- Status changed from New to Triaged
- Assignee set to Venky Shankar
- Priority changed from Normal to High
- Target version set to v21.0.0
- Source set to Development
- Backport set to tentacle,squid
Updated by Venky Shankar 19 days ago
- Subject changed from qa: Test failure: test_cephfs_mirror_incremental_sync/test_cephfs_mirror_sync_with_purged_snapshot (tasks.cephfs.test_mirroring.TestMirroring) to qa: Test failure: test_cephfs_mirror_incremental_sync/test_cephfs_mirror_sync_with_purged_snapshot (snapdiff bug)