
archiver: only store deviceID for hardlinks #4006

Merged: MichaelEischer merged 6 commits into restic:master from MichaelEischer:deviceID-only-for-hardlinks on Mar 28, 2024
Conversation

@MichaelEischer
Member

@MichaelEischer MichaelEischer commented Nov 4, 2022

What does this PR change? What problem does it solve?

The deviceID can change, e.g. when backing up from a filesystem snapshot. However, it is only used for hardlink detection. Thus, it is not necessary to store it for non-hardlinks. This should handle most use cases without adding much complexity.
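As a rough illustration of the idea (not the code from this PR), the following Go sketch keeps the device ID only for entries that can belong to a hardlink group. The `nodeMeta` type and the `deviceIDToStore` and `statMeta` helpers are hypothetical names invented for this example. Excluding directories is an assumption made here, because their link count is usually above 1 for reasons unrelated to hardlinks. The sketch relies on `syscall.Stat_t` and therefore only builds on Unix-like systems.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// nodeMeta is a hypothetical, stripped-down stand-in for per-file metadata.
type nodeMeta struct {
	Inode    uint64
	Links    uint64
	DeviceID uint64 // stays 0 unless the entry may be part of a hardlink group
}

// deviceIDToStore returns dev only for non-directories with more than one
// link; for everything else it returns 0, so a changed device ID (for example
// after mounting a new filesystem snapshot) does not alter the metadata.
func deviceIDToStore(isDir bool, links, dev uint64) uint64 {
	if !isDir && links > 1 {
		return dev
	}
	return 0
}

// statMeta reads inode, link count and device ID without following symlinks.
func statMeta(path string) (nodeMeta, error) {
	fi, err := os.Lstat(path)
	if err != nil {
		return nodeMeta{}, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return nodeMeta{}, fmt.Errorf("no stat_t available for %s", path)
	}
	links := uint64(st.Nlink)
	return nodeMeta{
		Inode:    uint64(st.Ino),
		Links:    links,
		DeviceID: deviceIDToStore(fi.IsDir(), links, uint64(st.Dev)),
	}, nil
}

func main() {
	for _, p := range os.Args[1:] {
		m, err := statMeta(p)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s: inode=%d links=%d deviceID=%d\n", p, m.Inode, m.Links, m.DeviceID)
	}
}
```

Hardlink detection still works with this scheme, since two paths are the same file only if both inode and device ID match, and both values remain available for every entry with more than one link.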

Was the change previously discussed in an issue or on the forum?

Solves part of #3041
Replaces #3599

Checklist

  • I have read the contribution guidelines.
  • I have enabled maintainer edits.
  • I have added tests for all code changes.
  • [ ] I have added documentation for relevant changes (in the manual).
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I have run gofmt on the code in all commits.
  • All commit messages are formatted in the same style as the other commits in the repo.
  • I'm done! This pull request is ready for review.

@greatroar
Contributor

This does make it impossible to determine where the mount points were from inspecting the snapshot...

@greatroar greatroar mentioned this pull request Nov 11, 2022
8 tasks
@MichaelEischer
Member Author

> This does make it impossible to determine where the mount points were from inspecting the snapshot...

Is that information that useful? I figure that without knowing what filesystem was used where, that information is not particularly useful.

I did some tests to see where hardlinks are used. So far I've seen them in /usr/share/{terminfo,zoneinfo}, in Docker/podman images, the steam runtime, flatpaks and a few other random places. For me these amount to about 3% of all files, so this PR is less effective than I'd hoped for.

@greatroar
Contributor

> Is that information that useful? I figure that without knowing what filesystem was used where, that information is not particularly useful.

No, probably not.

@JoeKun

JoeKun commented Dec 12, 2023

If I understand correctly, this small change could substantially improve the usability of restic for users who want to back up from a file-system snapshot, such as a ZFS snapshot.

This sounds like a very desirable improvement.

@MichaelEischer Could you please clarify what is missing before we could potentially see this change merged?

@MichaelEischer
Member Author

The problem with this change is that it will cause the next backup to create different tree blobs. This can double the size of the metadata stored for a backup.
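The effect can be illustrated with a small Go sketch. It assumes a heavily simplified model in which a tree is the JSON encoding of its nodes and the blob ID is the SHA-256 hash of that encoding. The `node` struct and the `treeID` helper are hypothetical and far smaller than the real tree format.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// node is a hypothetical, minimal stand-in for one tree entry.
type node struct {
	Name     string `json:"name"`
	Inode    uint64 `json:"inode"`
	DeviceID uint64 `json:"device_id,omitempty"` // dropped from the JSON when 0
}

// treeID returns the hex SHA-256 of the serialized nodes, standing in for a
// content-addressed tree blob ID.
func treeID(nodes []node) string {
	data, err := json.Marshal(nodes)
	if err != nil {
		panic(err)
	}
	return fmt.Sprintf("%x", sha256.Sum256(data))
}

func main() {
	// The same unchanged file, serialized with and without its device ID.
	before := []node{{Name: "a.txt", Inode: 12, DeviceID: 2049}}
	after := []node{{Name: "a.txt", Inode: 12}}

	fmt.Println("with device ID:   ", treeID(before))
	fmt.Println("without device ID:", treeID(after))
	fmt.Println("same blob:", treeID(before) == treeID(after))
}
```

Because the two serializations differ, the IDs differ as well, so a tree that previously stored device IDs cannot be deduplicated against its new form even when no file has changed.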

My plan is to add support for large directories in restic 0.18 (see https://forum.restic.net/t/roadmap-for-restic-0-17-to-0-19/7197) which will also require changes to the tree format. That provides a good opportunity to also include this PR.

@MichaelEischer MichaelEischer force-pushed the deviceID-only-for-hardlinks branch 2 times, most recently from 31c0eb0 to 9b9111d Compare March 9, 2024 16:55
@MichaelEischer
Member Author

Slight change of plans. Now that we have feature flags, we can immediately merge this PR, but hide it behind a feature flag for now (RESTIC_FEATURES=device-id-for-hardlinks). The deviceID handling will either become the default in repository version 3 (restic 0.18) or be completely replaced. Either way, the feature flag will be removed some time afterwards.

@MichaelEischer MichaelEischer force-pushed the deviceID-only-for-hardlinks branch from 9b9111d to 35e5c9d Compare March 28, 2024 17:44
The deviceID can change, e.g. when backing up from a filesystem snapshot.
It is only used for hardlink detection. Thus it is not necessary
to store it for everything else.
@MichaelEischer MichaelEischer force-pushed the deviceID-only-for-hardlinks branch 3 times, most recently from 0b7e154 to a4624b7 Compare March 28, 2024 18:31
@MichaelEischer MichaelEischer force-pushed the deviceID-only-for-hardlinks branch from a4624b7 to 21cf38f Compare March 28, 2024 18:32
Member Author

@MichaelEischer MichaelEischer left a comment

LGTM

@hettiger

Thank you @MichaelEischer

this also fixed backing up from APFS-snapshots for me.

@ilyagr
Contributor

ilyagr commented Jan 18, 2026

I also just wanted to say thank you, this helped me a lot on my Mac. ❤️

5 participants