filed: fix possible data-loss when excluding hardlinks#1506
Conversation
a875b8e to
4bde001
Compare
19d90b3 to
9e53193
Compare
b21e099 to
2d772c2
Compare
arogge
left a comment
There was a problem hiding this comment.
Looks good overall. I have some minor remarks on things you might want to improve though.
core/src/findlib/hardlink.h
Outdated
| // workaround for the win32 compat layer defining struct stat | ||
| // without dev_t/ino_t | ||
| using our_dev_t = decltype(static_cast<struct stat*>(nullptr)->st_dev); | ||
| using our_ino_t = decltype(static_cast<struct stat*>(nullptr)->st_ino); |
There was a problem hiding this comment.
If we don't have st_dev and st_ino, hardlink detection will fail - right?
So maybe we should #ifdef all that hardlink code so it doesn't get compiled on windows at all.
There was a problem hiding this comment.
ill have to check to make sure but i think windows defines st_dev/st_ino, but not the types.
Im not sure if hardlink detection on windows ever does do anything though. Good point.
There was a problem hiding this comment.
Just checked: hardlinks exist on windows and we do seem to support them.
core/src/lib/util.cc
Outdated
| size_t hash_combine(size_t seed, size_t hash) | ||
| { | ||
| size_t changed = hash + 0x9e3779b9 + (seed << 6) + (seed >> 2); | ||
| return seed ^ changed; | ||
| } |
There was a problem hiding this comment.
since we won't have inter-procedural optimization between the fd and libbareos due to the dynamic linking, it might make sense to make this an inlined function in the header for performance reasons.
There was a problem hiding this comment.
Good catch. I did not look into how this is all linked together.
e467134 to
3b4bb21
Compare
ef32d97 to
e903560
Compare
1d80189 to
d59e5c0
Compare
Currently the first hardlinked file encountered is used as the "main" hardlinked file; that means only that copy gets send to the sd. The problem arises if that file does not get send to the sd after all (maybe because it got excluded). Currently the sd would not get send any copy of the file contents in that case. We fix this by checking if a digest exists (the digest gets written after the file was send) and in case its missing, the current file will get get chosen as the real "main file". We do this by 1) setting ff_pkt->linked to the CurLink and 2) by overwriting the filename inside the CurLink. To enable the second part we needed to change how the CurLink allocates memory for the string. Now its just a pointer to some memory instead of memory directly after the curlink itself.
Since `ff_pkt` was partially overwritten by the files contained in the dir, we should not look at its fields. Instead we should only use our saved copy which contains our "linked" field, which we can then update with our file index.
Updating ff_pkt->linked is done by FindOneFile itself. There is no reason to update it twice!
94d1cdc to
1751da1
Compare
Thank you for contributing to the Bareos Project!
Currently bareos only tries to send file data of a set of hard linked files exactly once:
when it tries to send the first file it encounters from within that set. If that fails,
for example because the specific file was excluded, then bareos will continue on its way
backing up the other hard linked files without ever sending the data. In cases where it was
not an error that prevented the sending of the file data, bareos will not emit a warning
during the backup but will error when it tries to restore that backup (since it has no way
to restore the hard linked files!).
This PR fixes this in the following way: Bareos now recognizes the fact that no data was send
for a particular set of hard links if the node in the linkhash has FileIndex 0 and tries to send the
data in that case.
Please check
If you have any questions or problems, please give a comment in the PR.
Helpful documentation and best practices
Checklist for the reviewer of the PR (will be processed by the Bareos team)
Make sure you check/merge the PR using
devtools/pr-toolto have some simple automated checks run and a proper changelog record added.General
Check backport lineSource code quality
Tests