
Add --device-map flag to allow remapping DeviceIDs #3599

Closed
intentionally-left-nil wants to merge 1 commit into restic:master from intentionally-left-nil:3041

Conversation

@intentionally-left-nil

What does this PR change? What problem does it solve?

When creating backups from ZFS or btrfs snapshots, each snapshot will have a different DeviceID. This causes restic to re-upload the directory structure for each backup, even if nothing has changed. To solve this, this PR adds the --device-map flag, which allows users to map different device IDs to the same logical device ID.

The idea is that users can figure out what device ID their snapshot currently points to (e.g. with stat -c '%d') and then re-map that to a virtual device ID that is consistent across backups.
For example, if there's a btrfs snapshot mounted at /home/.snapshots/123/snapshot/, then you could do something like this:

device_id=$(stat -c '%d' /home/.snapshots/123/snapshot)
proot -b /home/.snapshots/123/snapshot:/home
restic backup --device-map "$device_id:42" /home

Was the change previously discussed in an issue or on the forum?

There is a decent amount of discussion on the linked issue. The larger issue is that there are no guarantees anyway that the DeviceID is stable, or that it should be relied upon at all. One proposed idea was to implement an --ignore-deviceid flag, similar to the other ignore flags. However, given that the DeviceID is used to determine whether symlinks span different filesystems, this turned out not to be a trivial change to implement.

Closes #3041

Checklist

  • I have read the contribution guidelines.
  • I have enabled maintainer edits.
  • I have added tests for all code changes.
  • I have added documentation for relevant changes (in the manual).
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I have run gofmt on the code in all commits.
  • All commit messages are formatted in the same style as the other commits in the repo.
  • I'm done! This pull request is ready for review.

When backing up from a snapshot (ZFS or btrfs as an example), each
snapshot will be mounted by the filesystem as a different DeviceID

This causes restic to upload the directory structure for each new
snapshot, even though the structure is identical. This commit adds a new
flag to prevent this behavior and allow restic to re-use the previously
uploaded directory structure.

The concept is simple: when calling restic backup, pass --device-map
src:dest to make any node with a DeviceID of src behave as if it
were dest.

Luckily, nodeFromFileInfo is the only place the DeviceID is read in the
backup codepath, so it's straightforward to shim the DeviceID there.

Closes restic#3041
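For illustration, here is a minimal sketch of how the src:dest pairs from such a flag could be parsed and applied. The names (parseDeviceMap, remapDeviceID) are hypothetical, not restic's actual internals:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseDeviceMap parses repeated "src:dest" pairs, as they might be
// passed via a --device-map flag.
func parseDeviceMap(specs []string) (map[uint64]uint64, error) {
	m := make(map[uint64]uint64)
	for _, spec := range specs {
		parts := strings.Split(spec, ":")
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid device map %q, expected src:dest", spec)
		}
		src, err := strconv.ParseUint(parts[0], 10, 64)
		if err != nil {
			return nil, err
		}
		dest, err := strconv.ParseUint(parts[1], 10, 64)
		if err != nil {
			return nil, err
		}
		m[src] = dest
	}
	return m, nil
}

// remapDeviceID returns the mapped ID if a mapping exists,
// otherwise the original ID unchanged.
func remapDeviceID(m map[uint64]uint64, id uint64) uint64 {
	if mapped, ok := m[id]; ok {
		return mapped
	}
	return id
}

func main() {
	m, err := parseDeviceMap([]string{"64769:42"})
	if err != nil {
		panic(err)
	}
	fmt.Println(remapDeviceID(m, 64769)) // 42
	fmt.Println(remapDeviceID(m, 100))   // 100 (no mapping, passes through)
}
```

With a shim like this applied in the single place the DeviceID is read, every node on the snapshot's device would be recorded with the stable virtual ID.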
@MichaelEischer
Member

MichaelEischer commented Dec 27, 2021

Why not let restic automatically assign pseudo-device ids as suggested in #3041 (comment) ? As rawtaz and I have commented (see e.g. #3041 (comment) ) --device-map looks like it requires rather a lot of setup to work.

The PR currently does not address device ID collisions at all, which can be a problem once users start fiddling with the device ids.

@intentionally-left-nil
Author

intentionally-left-nil commented Dec 30, 2021

Hi @MichaelEischer, thanks for taking a look. I'd be happy to implement an automatic mapping algorithm, but I'm not sure how to go about it.

Consider a scenario with 4 files, spread across two devices:

/device1/a.txt
/device1/b.txt
/device2/c.txt
/device2/d.txt

and let's say that for the first backup, device1 has a DeviceID of 17, and device2 has a DeviceID of 18.
For the second run, let's say device1 has a DeviceID of 81 and device2 has a DeviceID of 82.

The question becomes: how do we map device1 and device2 to pseudo-IDs which are consistent across runs?
Let's say you do something like hash the path, or the inode. So we hash /device1/a.txt, and let's say that hashes to 3; we now have a pseudo-map that says "device 17 maps to pseudo 3". When we see /device1/b.txt, the mapping already exists, so we can use the existing value. c.txt is missing a mapping, so you would hash it and record something like "device 18 maps to pseudo 4".

But the problem with this approach is that it isn't stable across backups unless enumeration happens in the same order. For example, on the next backup, let's say that instead of backing up a.txt, the enumeration finds /device1/b.txt first (or maybe you deleted a.txt).

Now, the algorithm would see that there is no device mapping for ID 81, and so it would hash /device1/b.txt. But this is a different file/inode, so let's say that maps to pseudo-ID 11. And now we haven't solved anything, because the pseudo-IDs are not consistent across runs.

Some ideas I considered, but with obvious flaws:

  • Find some way to do enumeration in-order of devices (I don't see how you can enforce this with hard links)
  • Poke into the guts of the filesystem to get the mappings (e.g. look at /proc/mounts) and build a hash off the base mount path, rather than the file
  • For each file, try to get its parent (and keep getting its parent) until the DeviceID changes. Use that folder path/inode as the input for the hash algorithm. You'll still run into issues with hard links, though. For example, if you hard link /device1/folderA into /device2/folderA, then you would run into the same issue depending on whether the backup code sees /device1/a.txt or /device2/folderA/c.txt. The former would point to /device1 as the root device, and the latter would point to /device2/folderA as the root node.

The last approach seems most reasonable to me, but also fairly involved. I don't know enough about all the nuances of file systems (including network drives) to know what gotchas I'd run into trying to traverse up parent directories.
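The parent-traversal idea in the last bullet can be sketched roughly as follows. This is Linux-only and purely illustrative (deviceRoot and deviceID are hypothetical helpers, not restic code):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// deviceID returns the device number of the filesystem containing path.
func deviceID(path string) (uint64, error) {
	var st syscall.Stat_t
	if err := syscall.Stat(path, &st); err != nil {
		return 0, err
	}
	return uint64(st.Dev), nil
}

// deviceRoot walks up parent directories until the DeviceID changes,
// returning the highest ancestor that still lives on the same device
// (i.e. the mountpoint). Its path/inode could then feed a hash that is
// stable regardless of enumeration order.
func deviceRoot(path string) (string, error) {
	dev, err := deviceID(path)
	if err != nil {
		return "", err
	}
	for {
		parent := filepath.Dir(path)
		if parent == path { // reached "/"
			return path, nil
		}
		parentDev, err := deviceID(parent)
		if err != nil {
			return "", err
		}
		if parentDev != dev {
			return path, nil // parent lives on a different device
		}
		path = parent
	}
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		panic(err)
	}
	root, err := deviceRoot(cwd)
	if err != nil {
		panic(err)
	}
	fmt.Println(root) // the mountpoint containing the working directory
}
```

As the comment above notes, bind mounts and multiple paths to the same device mean the "root" found this way can still depend on which path is visited first.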

The benefit of the current approach is that it's a small, incremental change (code-wise), resulting in a small diff with consistent, repeatable behavior. The last approach would involve a much larger diff and would be more likely to have a tail of "it's not stable in this configuration" issues.

Happy to hack on this more once we have an idea of how the algorithm should work.

@intentionally-left-nil
Author

Thinking about this some more, it looks like findmnt gives me the set of device ID information needed to make a stable lookup. However, at that point I might as well just implement --use-fs-snapshot and be done with it. I'll think on this some more.

@rawtaz
Contributor

rawtaz commented Jan 2, 2022

Is this issue specific to Linux? I'm wondering how many systems have the issue but don't have findmnt.

@shamer

shamer commented Jan 4, 2022

I am using ZFS snapshots with restic and am affected by issue 3041. With the --device-map flag, my strategy would be to assign pseudo device IDs based on the mount point. These are very stable for my use case.
My experiments show that everything else is stable except the inode of the mountpoint directory.

#!/bin/bash
for m in $(findmnt -R -T . --json | jq --raw-output '.filesystems[0].children[].target') ; do
  SRCDEV=$(stat --format "%Ld" "$m")
  DEVHASH=$(echo "${m//$(pwd)/}" | b2sum --length=64 | cut -f1 -d" " | tr a-f A-F)
  PSEUDODEV=$(echo "obase=10; ibase=16; ${DEVHASH}" | bc)
  echo --device-map "${SRCDEV}:${PSEUDODEV}"
done

@MichaelEischer
Member

But the problem with this approach is that it isn't stable across backups unless enumeration happens in the same order. For example, on the next backup, let's say that instead of backing up a.txt, the enumeration finds /device1/b.txt first (or maybe you deleted a.txt).

The backup sorts filenames before traversing a folder. Thus the traversal order is stable.

For example, if you hard link /device1/folderA into /device2/folderA, then you would run into the same issue depending on whether the backup code sees /device1/a.txt or /device2/folderA/c.txt.

Hardlinks cannot point across devices. Or is there some subvolume related trickery that avoids that limitation?

The archiver component already traverses the filesystem starting from the root directory /. So we'd just have to call stat before entering a directory to detect when the mountpoint has changed.
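Given a stable (sorted) traversal order, automatic pseudo-device assignment could be as simple as a first-seen counter. A sketch with hypothetical names, not restic's implementation:

```go
package main

import "fmt"

// pseudoDevices assigns sequential pseudo-IDs to kernel device IDs in
// first-seen order. Because the backup sorts filenames before
// traversing a folder, devices are encountered in the same order every
// run, so the assignment is reproducible even when kernel IDs change.
type pseudoDevices struct {
	ids  map[uint64]uint64
	next uint64
}

func newPseudoDevices() *pseudoDevices {
	return &pseudoDevices{ids: make(map[uint64]uint64)}
}

// lookup returns the existing pseudo-ID for dev, or allocates the next one.
func (p *pseudoDevices) lookup(dev uint64) uint64 {
	if id, ok := p.ids[dev]; ok {
		return id
	}
	p.next++
	p.ids[dev] = p.next
	return p.next
}

func main() {
	// First run: the filesystems show up with kernel IDs 17 and 18.
	run1 := newPseudoDevices()
	fmt.Println(run1.lookup(17), run1.lookup(17), run1.lookup(18)) // 1 1 2

	// Second run: the same filesystems now have IDs 81 and 82, but the
	// sorted traversal visits them in the same order, so the pseudo-IDs
	// come out identical.
	run2 := newPseudoDevices()
	fmt.Println(run2.lookup(81), run2.lookup(81), run2.lookup(82)) // 1 1 2
}
```

This only holds as long as the set of mountpoints and the traversal order are stable; adding or removing a mount before an existing one would shift subsequent pseudo-IDs, which is the collision concern raised earlier in the thread.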

@shamer

shamer commented Jan 8, 2022

Hardlinks cannot point across devices. Or is there some subvolume related trickery that avoids that limitation?

Although hardlinks cannot be made across devices, bind mounts can be made at different points in the filesystem. The bind mounts all share the same device ID.

With the traversal order being stable, bind mounts don't seem like an issue though. The mount that is seen first would be the path used for the pseudo identifier for the device. If this path is hashed in some way to produce the identifier it wouldn't matter how many other devices were seen before this device.

@ArsenArsen

Hey, I've been successfully using this patch in unattended backups, though, on a filesystem without any bind mounts in it.
Rebasing the patch for new releases becomes increasingly hard, though, as the gap between the source tree and the patch widens.
May I ask for a review and merge, if possible?

Thanks!

@MichaelEischer
Member

As discussed in #3041, we want a solution that just works, without users having to manually map device IDs. That is, this PR is just a temporary workaround and is unlikely to get merged.

@intentionally-left-nil
Author

intentionally-left-nil commented Nov 28, 2023

As discussed in #3041, we want a solution that just works, without users having to manually map device IDs. That is, this PR is just a temporary workaround and is unlikely to get merged.

I'm going to close this PR out. My personal opinion is that in the search for a perfect solution, we've sacrificed a good-enough one. This issue has been open for 3 years now. It's frustrating that a low-risk, opt-in fix wasn't able to make it over the line (and a better one hasn't materialized either).

@MichaelEischer
Member

There's now #4006.



Development

Successfully merging this pull request may close these issues.

Restic 0.10.0 always reports all directories "changed", adds duplicate metadata, when run on ZFS snapshots
