Skip to content

filed: skip stripped top level directories#1686

Merged
BareosBot merged 12 commits intobareos:masterfrom
SamuelBoerlin:skip-stripped-dirs
Feb 13, 2024
Merged

filed: skip stripped top level directories#1686
BareosBot merged 12 commits intobareos:masterfrom
SamuelBoerlin:skip-stripped-dirs

Conversation

@SamuelBoerlin
Copy link
Contributor

@SamuelBoerlin SamuelBoerlin commented Jan 26, 2024

Thank you for contributing to the Bareos Project!

Currently when you have a fileset with an include with e.g. File = /some/path and Strip Path = 2 the backup will always end up with an empty directory called /some/path in it.
This PR changes this behaviour to skip saving top level directories like in the example if they're stripped away exactly.
Like so the backup won't contain an unnecessary empty directory and the restored files match exactly with what was backed up.
Before this change a restore with where=/some/path would create an empty /some/path/some/path directory (which didn't exist before the restore) whereas the rest of the files are restored correctly at e.g. /some/path/somefile just like where they were before the restore.

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

Make sure you check/merge the PR using devtools/pr-tool to have some simple automated checks run and a proper changelog record added.

General
  • Is the PR title usable as CHANGELOG entry?
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
    Check backport line
    Required backport PRs have been created
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR
Tests
  • Decision taken that a test is required (if not, then remove this paragraph)
  • The choice of the type of test (unit test or systemtest) is reasonable
  • Testname matches exactly what is being tested
  • On a fail, output of the test leads quickly to the origin of the fault

@sebsura
Copy link
Contributor

sebsura commented Jan 30, 2024

I think the problem here is that the name of a directory is "too short". For your example the name of the directory is /some/path but the "linkname", which is the one we actually care about, is /some/path/. In fact stripping /some/path/ works correctly (if strippath <= 2)!

It seems like our strip function should take that fact into consideration. Could you verify whether this change fixes the issue for you ?

@@ -1789,9 +1790,12 @@ void StripPath(FindFilesPacket* ff_pkt)
    * is a different link string, attempt to strip the link. If it fails,
    * back them both back. Do not strip symlinks. I.e. if either stripping
    * fails don't strip anything. */
-  if (!do_strip(ff_pkt->StripPath, ff_pkt->fname)) {
-    UnstripPath(ff_pkt);
-    goto rtn;
+  if ((ff_pkt->type != FT_DIREND && ff_pkt->type != FT_REPARSE)
+      || ff_pkt->fname == ff_pkt->link) {
+    if (!do_strip(ff_pkt->StripPath, ff_pkt->fname)) {
+      UnstripPath(ff_pkt);
+      goto rtn;
+    }
   }

@SamuelBoerlin
Copy link
Contributor Author

I think the problem here is that the name of a directory is "too short". For your example the name of the directory is /some/path but the "linkname", which is the one we actually care about, is /some/path/. In fact stripping /some/path/ works correctly (if strippath <= 2)!

It seems like our strip function should take that fact into consideration. Could you verify whether this change fixes the issue for you ?

@@ -1789,9 +1790,12 @@ void StripPath(FindFilesPacket* ff_pkt)
    * is a different link string, attempt to strip the link. If it fails,
    * back them both back. Do not strip symlinks. I.e. if either stripping
    * fails don't strip anything. */
-  if (!do_strip(ff_pkt->StripPath, ff_pkt->fname)) {
-    UnstripPath(ff_pkt);
-    goto rtn;
+  if ((ff_pkt->type != FT_DIREND && ff_pkt->type != FT_REPARSE)
+      || ff_pkt->fname == ff_pkt->link) {
+    if (!do_strip(ff_pkt->StripPath, ff_pkt->fname)) {
+      UnstripPath(ff_pkt);
+      goto rtn;
+    }
   }

Hi, thanks. I'll give that a try.

Earlier I did try using /some/path/ in the fileset includes, which indeed kind of works.
However, when you then do restore jobid=1 and ls you then get the following strange result:

$ls

build/
weird-files/

Note the empty first line. This is probably because it stores an entry for / (i.e. what's left after stripping; this can be seen when doing list files jobid=1), which is otherwise usually never backed up as far as I know, right?
That's why I made the change of not sending the directory to the SD if it's exactly stripped away.

@sebsura
Copy link
Contributor

sebsura commented Jan 30, 2024

You are right. There is this entry. Thankfully the restore itself works correctly and the permissions/etc. of the "root" directory are also restored.

I'm not sure what the best approach is here but i think not backing up a directory that was explicitly included in the backup is not a good idea (normally being top level actually ignores some checks that would cause such a file not to be backed up for example).

@SamuelBoerlin
Copy link
Contributor Author

Ah I see - if that's the intended behaviour then that's fine! Makes things a lot simpler.
I'll update the PR with your suggested change.

@SamuelBoerlin
Copy link
Contributor Author

I've now undone my changes and applied your fix instead 👍

@SamuelBoerlin
Copy link
Contributor Author

Hm, looks like Verify jobs with level DiskToCatalog don't work together with Strip Path.
VerifyFile in filed/verify.cc doesn't strip the path before sending it to the director.
I'll also have a try at fixing that.

@SamuelBoerlin
Copy link
Contributor Author

I've now also added a fix and tests for the issue with the verify jobs.

Copy link
Contributor

@sebsura sebsura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work! There are some small issues but as a whole the changes are great.

@sebsura sebsura force-pushed the skip-stripped-dirs branch from ccb99d2 to ada50f7 Compare February 9, 2024 06:46
@sebsura
Copy link
Contributor

sebsura commented Feb 9, 2024

We had some CI changes, so i rebased it onto master so it could be build again.

Copy link
Contributor

@sebsura sebsura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work! The only thing left is to squash the fixup commits!

@SamuelBoerlin
Copy link
Contributor Author

Done. Thanks for your help!

@BareosBot BareosBot merged commit 198d77c into bareos:master Feb 13, 2024
@SamuelBoerlin
Copy link
Contributor Author

Should I create backports of this PR? Currently DiskToCatalog verify jobs don't work at all when StripPath is used.

@sebsura
Copy link
Contributor

sebsura commented Feb 19, 2024

Since both StripPath and Verify jobs are very rarely done, we decided on not backporting these changes.

@sebsura
Copy link
Contributor

sebsura commented Feb 19, 2024

@SamuelBoerlin If you create a backport to 23, we can merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants