add --stdin-from-command flag to backup command#4254
add --stdin-from-command flag to backup command#4254sebhoss wants to merge 5 commits intorestic:masterfrom
Conversation
This new flag is added to the backup subcommand in order to allow restic to control the execution of a command and determine whether to save a snapshot if the given command succeeds. Signed-off-by: Sebastian Hoß <seb@xn--ho-hia.de>
In order to run with --stdin-from-command we need to short-circuit some functions similar to how it is handled for the --stdin flag. The only difference here is that --stdin-from-command actually expects that len(args) should be greater 0 whereas --stdin does not expect any args at all. Signed-off-by: Sebastian Hoß <seb@xn--ho-hia.de>
It acts similar to --stdin but reads its data from the stdout of the given command instead of os.Stdin. Signed-off-by: Sebastian Hoß <seb@xn--ho-hia.de>
In order to determine whether to save a snapshot, we need to capture the exit code returned by a command. In order to provide a nice error message, we supply stderr as well. Signed-off-by: Sebastian Hoß <seb@xn--ho-hia.de>
Return with an error containing the stderr of the given command in case it fails. No new snapshot will be created and future prune operations on the repository will remove the unreferenced data. Signed-off-by: Sebastian Hoß <seb@xn--ho-hia.de>
|
LGTM! I think that this option should be strongly suggested in the documentation in place of the current "stdin procedure". Maybe we want to unset some sensible environment variables? (e.g., |
|
Thanks for the review! That's a good point about environment variables - I completely missed that requirement from the original ticket and thus the current code executes without any environment variables. However inheriting them and unsetting some sensible ones makes sense to me! |
|
AFAIK, A very rough way might be to remove all prefixed variables indicated in the documentation: diff --git a/cmd/restic/cmd_backup.go b/cmd/restic/cmd_backup.go
index ee355bb42..2068d5e6b 100644
--- a/cmd/restic/cmd_backup.go
+++ b/cmd/restic/cmd_backup.go
@@ -608,6 +608,20 @@ func runBackup(ctx context.Context, opts BackupOptions, gopts GlobalOptions, ter
if err != nil {
return err
}
+
+ for _, v := range os.Environ() {
+ if !strings.HasPrefix(v, "RESTIC_") &&
+ !strings.HasPrefix(v, "AWS_") &&
+ !strings.HasPrefix(v, "ST_") &&
+ !strings.HasPrefix(v, "OS_") &&
+ !strings.HasPrefix(v, "B2_") &&
+ !strings.HasPrefix(v, "AZURE_") &&
+ !strings.HasPrefix(v, "GOOGLE_") {
+
+ command.Env = append(command.Env, v)
+ }
+ }
+
if err := command.Start(); err != nil {
return err
}Note: I haven't tested this code. Other options include:
I'd argue that the the first option is too restrictive, and the second is too much maintenance burden (maintainers/developers need to remember to add new sensitive variables in the list). So, resetting variables by their prefix may be a tradeoff. What do you think? |
|
Ah nice, I guess I misread the docs - thanks again! Thinking about unsetting variables a bit more I'm concerned that it might be too magical for users since they cannot influence it directly. So maybe another option like Edit: Or maybe something like |
|
While I agree that it might seem magical, I think too many knobs are equally bad. I don't see any use case for $ restic backup \
--stdin-from-command \
--repo ./backup \
-- env RESTIC_REPOSITORY=$RESTIC_REPOSITORY pg_dump -h foobarSo, if a variable is removed, it can be re-added explicitly (it's safe - a pattern might match more/fewer variables tomorrow). This needs to be clearly documented, of course :-) |
|
Yeah I agree that too many knobs do not feel that good either. I think in the end I would probably just leave this as-is given that 1) borg seems to be fine with that 2) the current |
|
I am testing your branch for some real world backup in my personal/lab. For now, everything is working as expected :-) |
MichaelEischer
left a comment
There was a problem hiding this comment.
I'm in favor of adding a --stdin-from-command option (or maybe --content-from-command). Implementation-wise, it shouldn't be necessary to modify the archiver at all; the fs.FS interface should be generic enough to abstract away everything.
| } | ||
|
|
||
| if len(args) > 0 { | ||
| if len(args) > 0 && opts.StdinCommand == false { |
There was a problem hiding this comment.
please use && !opts.StdinCommand
| } | ||
|
|
||
| if opts.Command != nil { | ||
| errBytes, _ := io.ReadAll(opts.CommandStderr) |
There was a problem hiding this comment.
This will deadlock if a process writes too much data to stderr. Please handle stderr similar as in the rclone/sftp backend.
| Hostname: opts.Host, | ||
| ParentSnapshot: parentSnapshot, | ||
| Command: command, | ||
| CommandStderr: stderr, |
There was a problem hiding this comment.
There's no need to modify the archiver. Instead, the Close() error part of the ReadCloser of fs.Reader should wait for the command to exit and return an error if necessary.
| cmdErr := opts.Command.Wait() | ||
| if cmdErr != nil { | ||
| debug.Log("error while executing command: %v", cmdErr) | ||
| return nil, restic.ID{}, errors.New(string(errBytes)) |
There was a problem hiding this comment.
cmdErr should not be dropped. It should at least be part of the returned error message.
| filename := path.Join("/", opts.StdinFilename) | ||
| var closer io.ReadCloser | ||
| if opts.StdinCommand { | ||
| command = exec.CommandContext(ctx, args[0], args[1:]...) |
There was a problem hiding this comment.
Please move as much of the code that sets up the stdinCommand as possible out of runBackup. The latter function is already far too long.
| f.StringVar(&backupOptions.ExcludeLargerThan, "exclude-larger-than", "", "max `size` of the files to be backed up (allowed suffixes: k/K, m/M, g/G, t/T)") | ||
| f.BoolVar(&backupOptions.Stdin, "stdin", false, "read backup from stdin") | ||
| f.StringVar(&backupOptions.StdinFilename, "stdin-filename", "stdin", "`filename` to use when reading from stdin") | ||
| f.BoolVar(&backupOptions.StdinCommand, "stdin-from-command", false, "execute command and store its stdout") |
There was a problem hiding this comment.
I wonder whether --content-from-command (as in Borg) would be a better name or not. stdin-from-command somewhat contradicts its own description...
Ok thanks for the pointer - that was exactly what I was looking for! So the idea here would be to :
Is that correct? Should that |
Judging from https://github.com/restic/restic/pull/4254/files#diff-30d8672eafa9c0585766b6f437530377563cd1aac5606674c366f3ede4abea9eR618 an
Yes.
The |
|
@sebhoss Are you still interested in working on this PR? |
|
Yes & no - we don't have any immediate pressure at work to get this in anymore, so I put this on my ever growing todo list. I still think that this would be a nice addition but I'm not sure when I'll be able to finish this. I'm fine with closing this PR in case you are doing some summer cleanups or re-assigning this to someone else(?) |
|
I can take over this PR. I am using the code here on a daily basis at home for backups, and I will be happy to have it merged :-) However, don't I think I will be able to continue in this PR as the branch is from @sebhoss's repo, so maybe I can fork & create a new pull request with the same code, and then apply the suggestions from @MichaelEischer. What do you think? |
|
@Enrico204 Sounds good. Go ahead and create a new PR, I'll close the old PR afterwards. |
|
Superseded by #4410. |
The code has been refactored so that the archiver is back to the original code, and the stderr is handled using a go routine to avoid deadlock.
The code has been refactored so that the archiver is back to the original code, and the stderr is handled using a go routine to avoid deadlock.
The code has been refactored so that the archiver is back to the original code, and the stderr is handled using a go routine to avoid deadlock.
The code has been refactored so that the archiver is back to the original code, and the stderr is handled using a go routine to avoid deadlock.
The code has been refactored so that the archiver is back to the original code, and the stderr is handled using a go routine to avoid deadlock.
What does this PR change? What problem does it solve?
This fixes #4251 by providing the requested
--stdin-from-commandflag to therestic backupcommand. It skips snapshot creation in case the given commands fails as determined byexec.Cmd.Wait()and returns the stderr of the given command as an error message back to the user.I have not written any tests, nor documentation because I wanted to get some feedback about the path taken here wrt. upstream inclusion first. I am seeing various test failures in existing tests but these seem to be unrelated and might just occur because I'm building/testing the project within a container. Using these changes locally does look promising though:
Was the change previously discussed in an issue or on the forum?
Closes #4251
Checklist
changelog/unreleased/that describes the changes for our users (see template).gofmton the code in all commits.