restore: preserve hardlinks on restore #492

Merged
aawsome merged 1 commit into rustic-rs:main from sagemathinc:main
Mar 28, 2026
@williamstein
Contributor

This is the simplest (and probably least efficient) way to address #16.

This issue has been open since 2022, and it looks like a potential source of very confusing bugs if you back up this folder:

wstein@lite:/tmp$ mkdir link
wstein@lite:/tmp$ cd link/
wstein@lite:/tmp/link$ ls
wstein@lite:/tmp/link$ echo "foo" > A.txt
wstein@lite:/tmp/link$ ln A.txt B.txt
wstein@lite:/tmp/link$ echo "bar" >> B.txt
wstein@lite:/tmp/link$ more A.txt
foo
bar
wstein@lite:/tmp/link$ more B.txt
foo
bar

Then restore it, and it suddenly behaves very differently: A.txt and B.txt come back as independent files, so a write to one no longer shows up in the other.

I'm building software where rustic backup/restore has to actually work well -- it's not just for emergency recovery, but part of the normal lifecycle of user data. I ran some big tests of backup/restore, and only after fixing this hard link issue was the filesystem restored properly.

Fortunately, restore already records inode, device id, and link count metadata for files. But the restore path recreated every file as an independent plain file, so hardlinks were silently de-linked on restore.

This PR adds a post-restore hardlink pass keyed by the stored (device_id, inode) identity. After file contents and metadata are restored, sibling paths in each hardlink group are replaced with hardlinks to a canonical restored path.
Also add a test using the existing backup fixture that contains a hardlink pair.
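
A minimal sketch of what such a post-restore pass could look like, using only the standard library. This is not the code from this PR: `RestoredFile` and `relink_hardlinks` are hypothetical names, and the sketch assumes the restore step has already written every path and can hand over the stored `(device_id, inode)` identity and link count for each one.

```rust
use std::collections::hash_map::{Entry, HashMap};
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical record of one restored file: where it was written, plus the
/// identity and link count stored in the snapshot. Not a rustic_core type.
pub struct RestoredFile {
    pub path: PathBuf,
    pub device_id: u64,
    pub inode: u64,
    pub links: u64,
}

/// Replace every later path of a hardlink group with a hard link to the
/// first path seen for that group. Returns the number of links created.
pub fn relink_hardlinks(files: &[RestoredFile]) -> io::Result<usize> {
    // Maps the stored (device_id, inode) identity to its canonical path.
    let mut canonical: HashMap<(u64, u64), &Path> = HashMap::new();
    let mut created = 0;

    // Files with a link count of 1 cannot be part of a hardlink group.
    for file in files.iter().filter(|f| f.links > 1) {
        match canonical.entry((file.device_id, file.inode)) {
            Entry::Vacant(slot) => {
                // First member of the group: keep it as the canonical file.
                slot.insert(file.path.as_path());
            }
            Entry::Occupied(slot) => {
                // Later member: drop the independent copy, then link it.
                let target = *slot.get();
                fs::remove_file(&file.path)?;
                fs::hard_link(target, &file.path)?;
                created += 1;
            }
        }
    }
    Ok(created)
}
```

Note that the sketch removes the copy before linking, so a failure between the two calls would leave that path missing; linking to a temporary name and renaming it over the copy would avoid that window.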

Obviously, it could make good sense to close this if:

  • you don't like that it isn't the globally optimal way to solve this problem (it restores each linked file and then combines them, rather than restoring only one). It was just the minimal way to do this that preserves the links in the end.
  • you mind that I didn't worry at all about non-POSIX platforms.

If you close this, it would probably be good though to mention clearly on the main rustic README page that hardlinks silently break on backup --> restore, as it can lead to major bugs for users, and it's only something they might find via careful testing / debugging of errors popping up later (that was the case for me). "hard link" isn't mentioned anywhere on main rustic README or issue tracker (it's in this rustic_core issue).

For anybody who does need this, I put a release here for linux:

https://github.com/sagemathinc/rustic/releases

Thanks!

Restore already records inode, device id, and link count metadata for files,
but the restore path recreated every file as an independent plain file. That
meant hardlinked files were silently de-linked on restore.

Add a post-restore hardlink pass keyed by the stored `(device_id, inode)`
identity. After file contents and metadata are restored, sibling paths in each
hardlink group are replaced with hardlinks to a canonical restored path.

Also add a focused integration test using the existing backup fixture that
contains a hardlink pair. The test now asserts that restoring the snapshot
recreates a shared inode, not just matching file contents.
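
The kind of check such a test needs can be sketched as follows, assuming a Unix target where `std::os::unix::fs::MetadataExt` is available. `same_inode` is a hypothetical helper for illustration, not the assertion used in the PR's fixture test.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Returns true when both paths name the same file on the same device,
/// which is what distinguishes a hard link from a copy with equal contents.
pub fn same_inode(a: &Path, b: &Path) -> io::Result<bool> {
    // symlink_metadata inspects the paths themselves, without following symlinks.
    let meta_a = fs::symlink_metadata(a)?;
    let meta_b = fs::symlink_metadata(b)?;
    Ok(meta_a.dev() == meta_b.dev() && meta_a.ino() == meta_b.ino())
}
```

Comparing contents alone would also pass for two de-linked copies, which is why the device and inode numbers are compared instead.
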
@aawsome
Member

aawsome commented Mar 28, 2026

Hi @williamstein!

Thanks a lot for proposing this PR. IMO this is a good first step. I would add the following enhancements:

  • add a new option for hard-link restores (which could also allow choosing between different methods of identifying hard links)
  • instead of restoring every hardlinked file and then doing remove-and-hardlink in a post-process, it is better to create just one file and hardlink the other paths to it in the post-process.

However, we can just start with merging this PR and do the above in follow-up PRs.
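
The second suggestion above can be sketched as a planning step that decides, per file, whether to write contents or only create a link. This is an illustration under stated assumptions, not the follow-up implementation: `SnapshotFile`, `Action`, and `plan_restore` are hypothetical names, and the first path seen in each `(device_id, inode)` group is arbitrarily chosen as the one whose contents get written.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical per-file metadata taken from a snapshot.
pub struct SnapshotFile {
    pub path: PathBuf,
    pub device_id: u64,
    pub inode: u64,
    pub links: u64,
}

/// What the restore should do for one path.
#[derive(Debug, PartialEq)]
pub enum Action {
    /// Restore file contents and metadata as usual.
    WriteContents(PathBuf),
    /// Skip the contents and hard-link `link` to the already planned `target`.
    HardLink { target: PathBuf, link: PathBuf },
}

/// Plan a restore so that each hardlink group is written exactly once.
pub fn plan_restore(files: &[SnapshotFile]) -> Vec<Action> {
    let mut first_seen: HashMap<(u64, u64), &PathBuf> = HashMap::new();
    let mut plan = Vec::with_capacity(files.len());

    for file in files {
        if file.links > 1 {
            let key = (file.device_id, file.inode);
            if let Some(target) = first_seen.get(&key) {
                // A sibling was already planned: only a link is needed.
                plan.push(Action::HardLink {
                    target: target.to_path_buf(),
                    link: file.path.clone(),
                });
                continue;
            }
            first_seen.insert(key, &file.path);
        }
        plan.push(Action::WriteContents(file.path.clone()));
    }
    plan
}
```

The `HardLink` actions would have to run after the corresponding `WriteContents` action has completed, which is why they still fit a post-process step.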

Member

@aawsome aawsome left a comment

LGTM! Thanks a lot @williamstein

@aawsome aawsome added this pull request to the merge queue Mar 28, 2026
Merged via the queue into rustic-rs:main with commit a1bfbc8 Mar 28, 2026
22 checks passed
aawsome added a commit that referenced this pull request Apr 5, 2026
With this PR, hardlinks are not created and then removed, but only
linked.
Follow-up to #492
github-merge-queue bot pushed a commit that referenced this pull request Apr 5, 2026
## 🤖 New release

* `rustic_core`: 0.10.1 -> 0.11.0 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.11.0](rustic_core-v0.10.1...rustic_core-v0.11.0) - 2026-04-05

### Added

- Optimize hardlink creation in restore ([#495](#495))
- add exclude-if-xattr option ([#491](#491))

### Fixed

- make `ignore`'s `.git_exclude()` mirror `.git_ignore()` ([#494](#494))

### Other

- update dependencies ([#496](#496))
- preserve hardlinks on restore ([#492](#492))
- use general tree modifier in `repair snapshots` ([#463](#463))
- [**breaking**] Optimize file streaming ([#489](#489))
- [**breaking**] use Cow to avoid OsString allocations ([#487](#487))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: rustic-release-plz[bot] <182542030+rustic-release-plz[bot]@users.noreply.github.com>