Add options to fine tune VSS snapshots#3067
Add options to fine tune VSS snapshots#3067MichaelEischer merged 16 commits intorestic:masterfrom DRON-666:vss-options
Conversation
|
Rebased. Fixed some error checking/ignoring. |
|
@DRON-666 : Is it too late to ask you to consider one more option to add in this PR? Backstory: https://forum.restic.net/t/windows-shadow-copy-snapshot-vss-unexpected-provider-error/3674 What do you think? I've done a proof-of-concept hack/patch with a hard-coded GUID for the standard MS system VSS provider, and it solved my problem. I could try rolling my own PR (it would be my first; I have no real experience with Go), but it would seem more expedient to piggy-back on yours, if you're amenable to it. |
|
@jdiercks As I expected, the |
|
Tested in my sandbox environment. |
|
@jdiercks I'd suggest to use the name of the provider instead of its GUID. GUIDs are a somewhat hard to understand for end users. |
|
@fgma |
|
@fgma I have to agree with @DRON-666 on that one. GUID is more straightforward and reliable for this scenario. If a user is far enough along to be in a situation where explicitly specifying the provider even matters, then they can probably figure out they need to copy and paste a GUID from |
|
@DRON-666 that seems to be what VSSADMIN itself does. I would still leave the option in place to override if needed, but fallback would probably save a few people some confusion and grief. |
@jdiercks I want to check this with API monitor, but I need to find system with more than one provider. |
|
I'm currently using restic in a powershell script that creates a VSS snapshot in the beginning, then calls restic 3 times to backup the same data to 3 different repos (redundancy) and at the end deletes the snapshot. Is there any way this can be simplified with restics own VSS snapshots, but without creating 3 (different) snapshots on every restic call? |
@DRON-666 Looks like without vss.provider it's still using IID_NULL in all the VSS calls. |
|
@DRON-666 I think I'm fine with restic defaulting to There is a slight risk that changing restic's default behavior would break some other user's backup, if they have multiple providers and they were relying on the hierarchical behavior produced by using IID_NULL, but I suspect that's a corner case at worst, and as long as it's documented they can just start using the new option to get back the results they want. |
|
@bjoe2k4 can you share your powershell script? Curious to see how you implemented that. Could be useful to me in a different context. |
|
@DRON-666 ok thanks, that would have been too good to be true. |
|
@DRON-666 Wow, that's a lot more code! But definitely very flexible. Am I reading right that you decided to keep the original default behavior, with restic passing IID_NULL as provider if none specified via options? I'm not seeing anything that looks like it's doing a fallback to ms sys provider. |
|
@jdiercks somehow missed your question, sorry for that. |
|
@DRON-666 Thanks for clarifying. Good call. I've been running nightly backups with this PR for several weeks now, and it seems pretty solid. |
|
I think the VSS implementation in restic doesn't support setting the VSS backup type so far? Differentiation between "full" backups and "copy" backups can be important when for example backing up an Exchange Server or an MS SQL Server. Support for this would be really nice :) See: https://docs.microsoft.com/en-us/windows/win32/vss/vss-backup-state |
|
@kscheel VSS backup types irrelevant to
|
|
@DRON-666 oh sorry, I guess I misunderstood a few things, nevermind then 🙈 Thanks for the explanation! |
MichaelEischer
left a comment
There was a problem hiding this comment.
I'm sorry for not taking a look at this PR for several years 😞 ; it turned out to be easier to review than I've expected it to be. Thanks for keeping the PR up to date all this time!
Most comments are just minor nitpicks. The only more important one is that the option names should use kebab case, similar to other already existing options, that is the name should be vss.exclude-volumes instead of vss.excludevolumes and similar for the other option names.
Changing multiple "callAsyncFunctionAndWait" with fixed timeout to calculated timeout based on deadline.
Add options to exclude all mountpoints and arbitrary volumes from snapshotting.
We don't need `error` here: the only existing implementation of `ErrorHandler` always call `Backup.Error` and all implementations of `Backup.Error` always return nil.
MichaelEischer
left a comment
There was a problem hiding this comment.
LGTM. Thanks for addressing the feedback so quick!
What does this PR change? What problem does it solve?
Original VSS PR use some not-so-universal defaults.
First of all, current Restic architecture don't allow listing all mount points required for backup and we have to enumerate mount points for a given volume and create VSS snapshots for all of them. This can lead to problems if some of this unnecessary mount points links to locked, filled or otherwise unsupported volumes. @fgma has made every effort to avoid this problems, but it's not feasible in all cases. On the other hand, Restic don't follow mount points: if you backup
C:\, mount pointC:\mntwill be saved as link/c/mnt -> d:\. So snapshotting mount points may only make sense for paths likeC:\mnt\somedir.Extended option
-o vss.excludeallmountpointssolves this problem by enterely disabling mount points snapshotting.Another problem can occurred with backups from multiple volumes (e.g. I backup files with some extensions from all drives), but VSS required only on certain volumes. Why not take a snapshot of them all: because snapshotting of large volumes can take several minutes.
Extended option
-o vss.excludevolumesallows you to fine-tune VSS snapshots by excluding specific volumes. Mount points to this volumes are also excluded.Thus, volumes can be specified in three ways: as drive letter
D:\, as mount pointC:\mnt, or as volume GUID path\\?\Volume{b151601a-34a4-11e0-b644-806e6f6e6963}\. The trailing slash can always be omittedF:. Before comparison, all specified volumes are converted (with reporting in case of errors) to the form of a volume GUID path, so ifC:\mntlinks toD:\then-o vss.excludevolumes=d:and-o vss.excludevolumes=c:\mntwill give the same results.Also, default VSS provider may lead to some problems. With the option
-o vss.provider, vss provider can be specified in three ways: GUID, name andMSas alias for{b5946137-7b9f-4925-af80-51abd60b20d5}.And the last potential problem is timeout: in my case, it takes more than a minute to take the longest snapshot. It's too close to the default timeout of 120 seconds.
The extended option
-o vss.timeoutoption allows you to change this timeout. My implementation internally uses deadline concept (idiomatic for Go) instead of timeouts.Was the change discussed in an issue or in the forum before?
I mention some of this issues in the VSS Support PR
Also see timeout issue, provider issue #1 and provider issue #2.
Checklist
changelog/unreleased/that describes the changes for our users (template here)gofmton the code in all commits