Skip to content

filed: fix possible data-loss when excluding hardlinks#1506

Merged
BareosBot merged 14 commits intobareos:masterfrom
sebsura:dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude
Sep 22, 2023
Merged

filed: fix possible data-loss when excluding hardlinks#1506
BareosBot merged 14 commits intobareos:masterfrom
sebsura:dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude

Conversation

@sebsura
Copy link
Contributor

@sebsura sebsura commented Jul 11, 2023

Thank you for contributing to the Bareos Project!

Currently bareos only tries to send file data of a set of hard linked files exactly once:
when it tries to send the first file it encounters from within that set. If that fails,
for example because the specific file was excluded, then bareos will continue on its way
backing up the other hard linked files without ever sending the data. In cases where it was
not an error that prevented the sending of the file data, bareos will not emit a warning
during the backup but will error when it tries to restore that backup (since it has no way
to restore the hard linked files!).

This PR fixes this in the following way: Bareos now recognizes the fact that no data was send
for a particular set of hard links if the node in the linkhash has FileIndex 0 and tries to send the
data in that case.

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

Make sure you check/merge the PR using devtools/pr-tool to have some simple automated checks run and a proper changelog record added.

General
  • Is the PR title usable as CHANGELOG entry?
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
    Check backport line
  • Required backport PRs have been created
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR
Tests
  • Decision taken that a test is required (if not, then remove this paragraph)
  • The choice of the type of test (unit test or systemtest) is reasonable
  • Testname matches exactly what is being tested
  • On a fail, output of the test leads quickly to the origin of the fault

@sebsura sebsura added the bugfix label Jul 11, 2023
@sebsura sebsura force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch from a875b8e to 4bde001 Compare July 12, 2023 07:28
@sebsura sebsura changed the title filed: fix not sending hardlinked files when first encountered hardlinked flie is exclude filed: fix not sending hardlinked files when first encountered hardlinked file is excluded Jul 13, 2023
@sebsura sebsura force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch from 19d90b3 to 9e53193 Compare July 13, 2023 08:44
@sebsura sebsura force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch 2 times, most recently from b21e099 to 2d772c2 Compare August 2, 2023 07:06
@bruno-at-bareos bruno-at-bareos requested a review from arogge August 3, 2023 10:03
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall. I have some minor remarks on things you might want to improve though.

Comment on lines +54 to +57
// workaround for the win32 compat layer defining struct stat
// without dev_t/ino_t
using our_dev_t = decltype(static_cast<struct stat*>(nullptr)->st_dev);
using our_ino_t = decltype(static_cast<struct stat*>(nullptr)->st_ino);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have st_dev and st_ino, hardlink detection will fail - right?
So maybe we should #ifdef all that hardlink code so it doesn't get compiled on windows at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ill have to check to make sure but i think windows defines st_dev/st_ino, but not the types.
Im not sure if hardlink detection on windows ever does do anything though. Good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked: hardlinks exist on windows and we do seem to support them.

@sebsura sebsura requested a review from arogge September 11, 2023 05:54
Comment on lines +1072 to +1076
size_t hash_combine(size_t seed, size_t hash)
{
size_t changed = hash + 0x9e3779b9 + (seed << 6) + (seed >> 2);
return seed ^ changed;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we won't have inter-procedural optimization between the fd and libbareos due to the dynamic linking, it might make sense to make this an inlined function in the header for performance reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I did not look into how this is all linked together.

@sebsura sebsura force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch from e467134 to 3b4bb21 Compare September 19, 2023 05:32
@sebsura sebsura requested a review from arogge September 19, 2023 05:32
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Well done!

@arogge arogge changed the title filed: fix not sending hardlinked files when first encountered hardlinked file is excluded filed: fix possible data-loss when excluding hardlinks Sep 19, 2023
@sebsura sebsura force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch from ef32d97 to e903560 Compare September 19, 2023 08:41
@sebsura sebsura requested a review from arogge September 19, 2023 11:38
@sebsura sebsura force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch from 1d80189 to d59e5c0 Compare September 22, 2023 06:06
Currently the first hardlinked file encountered is used as the "main"
hardlinked file; that means only that copy gets send to the sd.  The
problem arises if that file does not get send to the sd after
all (maybe because it got excluded).  Currently the sd would not get
send any copy of the file contents in that case.

We fix this by checking if a digest exists (the digest gets written
after the file was send) and in case its missing, the current file
will get get chosen as the real "main file".  We do this by
1) setting ff_pkt->linked to the CurLink and
2) by overwriting the filename inside the CurLink.
To enable the second part we needed to change how the CurLink
allocates memory for the string.  Now its just a pointer to some
memory instead of memory directly after the curlink itself.
Since `ff_pkt` was partially overwritten by the files contained in the
dir, we should not look at its fields.  Instead we should only use our
saved copy which contains our "linked" field, which we can then update
with our file index.
Updating ff_pkt->linked is done by FindOneFile itself.  There is no
reason to update it twice!
@BareosBot BareosBot force-pushed the dev/ssura/master/fix-not-sending-hardlinked-files-on-exclude branch from 94d1cdc to 1751da1 Compare September 22, 2023 06:38
@BareosBot BareosBot merged commit 2d909a0 into bareos:master Sep 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants