qa: fix test_cephfs_mirror_incremental_sync failure #61068
    self.add_directory(self.primary_fs_name, self.primary_fs_id, f'/{repo_path}')

    # clear git logs before taking snapshot
    self.mount_a.run_shell(['rm', '-rf', f'{repo_path}/.git/logs'])
It's not 100% clear to me that git logs are causing the test failure. Could you explain how?
This happens mainly during the incremental sync. The md5sum differs consistently for the files .git/logs/HEAD and .git/logs/refs/heads/giant for the source and destination snapshots (thus the checksum differs), which means it changes during the incremental sync. I think that's because it's a repo and writes something to those files, as the path changes?
As our goal is to test the incremental sync, avoiding the logs (a few KB) altogether works very well.
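To pin down which files differ between the source and destination snapshots, a per-file comparison along these lines can help. This is a minimal sketch and not the code the test uses: the helper names are hypothetical, and pointing it at the two snapshot directories (for example a `.snap/<name>` path on each mount) is an assumption for illustration.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def tree_checksums(root):
    """Map each regular file's path (relative to root) to its MD5 digest."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are compared.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            sums[os.path.relpath(full, root)] = md5_of_file(full)
    return sums


def differing_files(src_root, dst_root):
    """Return sorted relative paths that differ or exist on one side only."""
    src = tree_checksums(src_root)
    dst = tree_checksums(dst_root)
    return sorted(p for p in src.keys() | dst.keys() if src.get(p) != dst.get(p))
```

Calling `differing_files(src_snapshot_dir, dst_snapshot_dir)` returns the list of relative paths to inspect, which narrows a whole-tree checksum mismatch down to individual files.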
As I've checked further, this issue doesn't exist in vstart. It happens in teuthology only.
> This happens mainly during the incremental sync. The md5sum differs consistently for the files .git/logs/HEAD and .git/logs/refs/heads/main for the source and destination snapshots (thus the checksum differs), which means it changes during the incremental sync. I think that's because it's a repo and writes something to those files, as the path changes?
But the checksums are for snapshots, which are immutable, so I don't understand how the checksum would change 🤔
> This happens mainly during the incremental sync. The md5sum differs consistently for the files .git/logs/HEAD and .git/logs/refs/heads/main for the source and destination snapshots (thus the checksum differs), which means it changes during the incremental sync. I think that's because it's a repo and writes something to those files, as the path changes?
>
> But the checksums are for snapshots, which are immutable, so I don't understand how the checksum would change 🤔
@vshankar
Look at the above instance, there's a difference in .git/refs/heads/giant too.
I think the solution would be to separate the working directory (.git repo) and the dir_root. So that the dir_root would contain the updated ceph-qa-suite without the .git dir, ready to sync.
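The separation proposed here could be sketched roughly as follows: keep the git checkout in one directory and export its contents, minus `.git`, into the directory that is snapshotted and mirrored. This is only an illustration under assumptions; `export_without_git` and its parameter names are hypothetical, and the QA test would run against the CephFS mount rather than local paths.

```python
import shutil


def export_without_git(work_dir, dir_root):
    """Copy work_dir into dir_root, skipping any entry named '.git'."""
    shutil.copytree(
        work_dir,
        dir_root,
        ignore=shutil.ignore_patterns('.git'),
        dirs_exist_ok=True,  # allow re-export into an existing dir_root (Python 3.8+)
    )
```

Note that a plain re-copy does not delete files that were removed from the working directory since the previous export, so a real test would need to handle deletions separately if they are meant to be part of the incremental sync.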
> This happens mainly during the incremental sync. The md5sum differs consistently for the files .git/logs/HEAD and .git/logs/refs/heads/main for the source and destination snapshots (thus the checksum differs), which means it changes during the incremental sync. I think that's because it's a repo and writes something to those files, as the path changes?
>
> But the checksums are for snapshots, which are immutable, so I don't understand how the checksum would change 🤔
@vshankar
It could be because of this issue #54633 (comment), but as I remember, test_cephfs_mirror_incremental_sync failed intermittently even before snapdiff was merged.
Anyway, let me know your decision on this. Can I drop this change?
Yeah, we need to fix it there rather than working around it.
> I think the solution would be to separate the working directory (.git repo) and the dir_root. So that the dir_root would contain the updated ceph-qa-suite without the .git dir, ready to sync.
@joscollin I really don't understand why we are tinkering around with .git stuff. The repo was cloned in cephfs and snapshots were taken. So, the snapshotted directory should be correctly replicated irrespective of what lives under that directory.
Clear the git logs before taking a new snapshot, as they are not good candidates to verify an incremental sync.

Fixes: https://tracker.ceph.com/issues/68567
Signed-off-by: Jos Collin <jcollin@redhat.com>

It's easier to save the HEAD and reset to it instead of using git pull.

Signed-off-by: Jos Collin <jcollin@redhat.com>
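The "save the HEAD and reset to it" idea from the second commit message might look roughly like this. It is a sketch under assumptions, not the code in this PR: `git`, `rewind_and_restore`, and the `between` callback (standing in for whatever happens while the repo is rewound, such as taking a snapshot) are hypothetical names. Only standard `git rev-parse` and `git reset --hard` invocations are used.

```python
import subprocess


def git(repo, *args):
    """Run a git command inside repo and return its stripped stdout."""
    result = subprocess.run(['git', '-C', repo, *args], check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def rewind_and_restore(repo, commits_back, between=lambda: None, run=git):
    """Rewind repo by commits_back commits, call between(), then restore.

    The original HEAD is saved first, so restoring it needs no network
    access (unlike git pull). Returns the saved commit id.
    """
    saved_head = run(repo, 'rev-parse', 'HEAD')
    run(repo, 'reset', '--hard', f'HEAD~{commits_back}')
    between()  # e.g. take a snapshot of the older tree
    run(repo, 'reset', '--hard', saved_head)
    return saved_head
```

The `run` parameter exists only so the command sequence can be exercised without a real repository; by default the commands go through `subprocess`.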
50586e0 to dd9c172
jenkins test make check arm64
Is cephfs-mirror incremental sync (using snapdiff) fixed? I still found some problems. I can share the reproducer.
The first snapshot will be synced successfully, as it does not use snapdiff. But for the second snapshot, I found this. Also, I feel we need to take a bit of care when incremental sync fails, because if it fails, I am not sure whether it mis-mirrors in the next phase or not.
@sajibreadd

Fixes: https://tracker.ceph.com/issues/68567
This PR fixes a long-standing QA failure by excluding the git logs from the snapshot sync, as they change unexpectedly.