reef: cephfs_mirror: use snapdiff api for incremental syncing #58985
|
jenkins test make check |
|
This needs a rebuild and a re-run, since the binaries built on shaman have been deleted by now. See https://tracker.ceph.com/issues/67493#note-11. |
|
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. |
|
@joscollin PTAL at the failure - https://pulpito.ceph.com/rishabh-2024-09-10_14:18:13-fs-wip-rishabh-testing-20240910.073325-reef-distro-default-smithi/7899232/ (which is reproducible) |
@vshankar My opinion is that we shouldn't block this PR because of this failure; it should be tracked separately [1]. |
|
force-pushed from e980457 to bba88a8
|
@vshankar Added more commits. Please re-run the QA. |
|
jenkins retest this please |
force-pushed from bba88a8 to f0d0b5b
|
Blocked due to: #54633 (review) |
|
Fix created: #61539. |
|
jenkins test docs |
|
jenkins test make check |
|
This PR is under test in https://tracker.ceph.com/issues/69881. |
|
Thanks for the reminder. |
Use snapdiff api to sync only the delta of files between two snapshots. Fixes: https://tracker.ceph.com/issues/61334 Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 96c351c)
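To illustrate what "sync only the delta of files between two snapshots" means, here is a minimal sketch under stated assumptions. It does not use the libcephfs snapdiff API or any code from this PR; `Snapshot`, `Delta`, and `snapshot_delta` are hypothetical names, and a snapshot is modelled simply as a map from relative path to a content version.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: a snapshot maps a relative path to a content version.
using Snapshot = std::map<std::string, std::uint64_t>;

// What an incremental sync would have to touch on the destination.
struct Delta {
  std::vector<std::string> to_copy;    // new or modified in the newer snapshot
  std::vector<std::string> to_delete;  // present only in the older snapshot
};

// Compare two snapshots and return only the entries that differ.
Delta snapshot_delta(const Snapshot& older, const Snapshot& newer) {
  Delta d;
  for (const auto& kv : newer) {
    auto it = older.find(kv.first);
    if (it == older.end() || it->second != kv.second) {
      d.to_copy.push_back(kv.first);  // added or changed since the older snapshot
    }
  }
  for (const auto& kv : older) {
    if (newer.find(kv.first) == newer.end()) {
      d.to_delete.push_back(kv.first);  // removed since the older snapshot
    }
  }
  return d;
}
```

In this toy model, unchanged entries never appear in the result, so the amount of work scales with the number of changes rather than with the size of the tree.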
Likely ceph#55619 missed that. Signed-off-by: Igor Fedotov <igor.fedotov@croit.io> (cherry picked from commit 6f1d21c)
ceph#55619 eliminates the need for a ceph_close() call after a successful ceph_fdopendir(), and makes ceph_closedir() close the file descriptor automatically. A hasty ceph_close() call makes the file descriptor available for a new allocation, which might conflict with the automatic fd close in the later ceph_closedir(). The full PeerReplayer::do_synchronize() has been reworked to close fds properly, depending on whether ceph_fdopendir() has already been applied to them. Additionally, for the sake of uniformity, this reworks the incremental do_synchronize() to do its final fd closings the same way as the full implementation. This also effectively reverts ceph@bd78bdc, as it looks like the wrong approach to fighting broken file descriptor references: there is not much sense in reopening the current snapshot's root folder for each new entry processed. Instead, this patch simply doesn't close it in the first place. Fixes: https://tracker.ceph.com/issues/68853 Signed-off-by: Igor Fedotov <igor.fedotov@croit.io> (cherry picked from commit 7a747bc)
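The descriptor-reuse hazard described in this commit message can be sketched with a toy model. This is an illustration only, not libcephfs code: `FdTable`, `DirHandle`, `toy_fdopendir`, and `toy_closedir` are hypothetical, and the "lowest free number" allocation policy is an assumption made to keep the reuse deterministic.

```cpp
#include <cassert>
#include <set>

// Toy descriptor table (hypothetical): open() hands out the lowest unused
// number, which is an assumption made for this illustration.
class FdTable {
 public:
  int open() {
    int fd = 3;
    while (in_use_.count(fd)) ++fd;
    in_use_.insert(fd);
    return fd;
  }
  // Returns false if fd was not open.
  bool close(int fd) { return in_use_.erase(fd) == 1; }
  bool is_open(int fd) const { return in_use_.count(fd) == 1; }

 private:
  std::set<int> in_use_;
};

// Toy directory handle that takes ownership of the descriptor it wraps.
struct DirHandle {
  int fd;
};

DirHandle toy_fdopendir(int fd) { return DirHandle{fd}; }

// Closing the directory also closes the descriptor number it owns.
bool toy_closedir(FdTable& t, const DirHandle& d) { return t.close(d.fd); }

// Returns whether an unrelated file, opened while the directory handle is
// still alive, is still open after the directory has been closed.
bool unrelated_fd_survives(bool hasty_close) {
  FdTable t;
  int dir_fd = t.open();
  DirHandle dir = toy_fdopendir(dir_fd);
  if (hasty_close) {
    t.close(dir_fd);  // frees the number while dir still refers to it
  }
  int other = t.open();  // reuses dir_fd if it was freed above
  toy_closedir(t, dir);  // closes whatever currently holds dir.fd
  return t.is_open(other);
}
```

In this model, `unrelated_fd_survives(true)` returns false because the directory close tears down the reused descriptor, while `unrelated_fd_survives(false)` returns true because the directory handle remains the sole owner of its descriptor.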
This avoids considering the latest changes from the source filesystem when mirroring already existing snapshots. Thus the destination filesystem and its snapshots would be created based only on the source snapshots, and the destination fs would be a replica of the last snapshot taken. Fixes: https://tracker.ceph.com/issues/68567 Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 23e4cd5)
Fixes: https://tracker.ceph.com/issues/68567 Signed-off-by: Jos Collin <jcollin@redhat.com> (cherry picked from commit 830626b)
Fixes: https://tracker.ceph.com/issues/69671 Signed-off-by: Jos Collin <jcollin@redhat.com> Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit d9ac431) Conflicts: src/tools/cephfs_mirror/PeerReplayer.h - Resolved cherry-pick conflicts due to 691ed01 not backported to reef
As a debug aid when the snapdiff request is erroring out. Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit ae9b035) Conflicts: src/mds/Server.cc - Resolved cherry-pick conflicts
Fixes: https://tracker.ceph.com/issues/59067 Signed-off-by: Leonid Usov <leonid.usov@ibm.com> (cherry picked from commit 749c770)
Fixes: http://tracker.ceph.com/issues/70287 Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit 33c6f23)
Signed-off-by: Venky Shankar <vshankar@redhat.com> (cherry picked from commit e16820f)
force-pushed from 132fdad to 20fdf94
|
@joscollin - as discussed, we need the additional commit from the main branch that was dropped in the latest update. And if there is a bug in that commit when backporting, then we should find that first. |
I can debug and fix it. But as I mentioned in the discussion, that fix would make reef newer than main in git. |
I don't understand how it will be newer than main. The subclassing change is a commit that needs backporting since the main branch tests have been running with that.
The subclassing commit(*) has the remote and snapdiff sync class mechanism. The commit is standalone and does not rely on blockdiff. (*): d9ac431#diff-902f4a8c220846756f7af155a115519a0382579cd04dcc7069121665c245edfe |
|
jenkins test make check |
|
@joscollin any update on this? |
@vshankar I think @ifed01 also faced the same issue and is working on a fix for it? I need to review this PR: #64995. |
No, it doesn't fix the issue found in this PR. |
Yeah, I was about to leave a comment since I reviewed PR #64995 and it does not look like the test failed due to that bug. |
I'm debugging this. |
@joscollin - any update on this? |
@vshankar |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
|
Is this not making the 18.2.8 release? Docs currently state that snapdiff is used in reef mirroring. |
Unfortunately, not. I see the docs are incorrect, possibly due to the fact that doc changes got backported first. |
|
backport tracker: https://tracker.ceph.com/issues/65222
backport tracker: https://tracker.ceph.com/issues/69244
backport tracker: https://tracker.ceph.com/issues/72612
Fixes: https://tracker.ceph.com/issues/71110
backport of #54633
parent tracker: https://tracker.ceph.com/issues/61334
this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh