Skip to content

restic check with snapshot filters (continuation of PR #5213)#5469

Merged
MichaelEischer merged 17 commits into
restic:masterfrom
wplapper:check_snapshots-v3
Nov 28, 2025
Merged

restic check with snapshot filters (continuation of PR #5213)#5469
MichaelEischer merged 17 commits into
restic:masterfrom
wplapper:check_snapshots-v3

Conversation

@wplapper

@wplapper wplapper commented Aug 7, 2025

Copy link
Copy Markdown
Contributor

This is the continuation of PR #5213. I decided to use a git provided e-mail address instead of my private e-mail address. All changes up-to 2025-03-07 have been incorporated. Details to follow here:

changelog/unreleased/issue-3326:
Introduction of snapshot based filter for command restic check

cmd/restic/cmd_check.go:
Integration of snapshot based filtering. Some messages had to be changed so that the text does not refer to 'all packfiles' but to 'selected packfiles'.

cmd/restic/cmd_check_integration_test.go:
A new test for filtering via snapshots has been added.

doc/077_troubleshooting.rst:
A short paragraph in the manual has been added about snapshot based filtering.

internal/checker/checker.go:
A new func CheckWithSnapshots() has been added which collects the needed packfiles via restic.FindUsedBlobs().

internal/checker/checker_test.go:
A new test TestCheckRepoSnapshot() has been added to check out the new functionality.

What does this PR change? What problem does it solve?

This PR addresses snapshot based filters in command restic check.

Was the change previously discussed in an issue or on the forum?

#3326

#3326

Checklist

  • I have added tests for all code changes.
  • I have added documentation for relevant changes (in the manual).
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I'm done! This pull request is ready for review.

@MichaelEischer MichaelEischer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes from #5532 require untangling how the pack file selection works here, see the comments for details. Structure(...) should also adhere to the snapshot selection. That method can also generate the pack file list, making CheckWithSnapshots unnecessary and also avoiding the massive performance penalty of iterating a second time over the snapshot trees.

Edit:
Just integrate the snapshot selection into checker.LoadSnapshots. Then checker.Snapshots naturally also adheres to the snapshot filter. The checker.Checker can the also wrap the ReadPacks method such that it replaces the pack list passed to the filter callback with the one only containing the packs relevant for the snapshots.

Comment thread cmd/restic/cmd_check.go Outdated
Comment thread cmd/restic/cmd_check.go Outdated
Comment thread internal/checker/checker.go
Comment thread internal/checker/checker.go Outdated
@wplapper

wplapper commented Oct 4, 2025

Copy link
Copy Markdown
Contributor Author

Hi MichaelEischer,

looking at the existing changes after #5532 and the proposed changed in #5550 I have decided to wait until #5550 has been commited. Then I will start from a restic 0.18.1 base and adopt my changes and your recommendations accordingly. I will place my PR into "draft" status.

@wplapper wplapper marked this pull request as draft October 4, 2025 11:49
@wplapper wplapper marked this pull request as ready for review October 14, 2025 15:09
@wplapper

Copy link
Copy Markdown
Contributor Author

I have no idea what this Windows test ``TestDirAttributeCombinationsOverwrite` does, I have seen it before, it seems to be transient, and I have no idea of how to fix it. I need some solid Windows advice. Thanks!

@wplapper

Copy link
Copy Markdown
Contributor Author

I have removed these failing windows tests from the this PR. I will have to wait anyway until this issue has been resolved.

@MichaelEischer MichaelEischer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes so far. This is definitely moving into the right direction, but needs a few further cleanups. The current ReadPacks extension is also far slower than it has to be (see comments).

Comment thread changelog/unreleased/issue-3326 Outdated
Comment thread internal/restorer/restorer_windows_test.go Outdated
Comment thread internal/checker/checker.go Outdated
Comment thread internal/checker/checker.go Outdated
Comment thread internal/checker/checker.go Outdated
Comment thread internal/checker/checker.go Outdated
Comment thread internal/checker/checker.go Outdated
Comment thread internal/checker/checker.go
Comment thread cmd/restic/cmd_check_integration_test.go Outdated
Comment thread doc/077_troubleshooting.rst Outdated

@MichaelEischer MichaelEischer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for keeping up with all the review comments! I have a few further comments, although most of them are nits this time.

Comment thread cmd/restic/cmd_check.go Outdated
return summary, ctx.Err()
}

opts.filteredStatus = chkr.FilterStatus()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move it below the if opts.CheckUnused { block. Then the call is directly next to the place where it's used.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the call to buildPacksFilter() directly before its use.

Comment thread cmd/restic/cmd_check.go Outdated
}

opts.filteredStatus = chkr.FilterStatus()
readDataFilter, err := buildPacksFilter(opts, printer)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filteredStatus should be passed as parameter to buildPacksFilter(...). It doesn't really belong into the opts struct.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Comment thread internal/checker/checker.go Outdated
c.snapshots, err = restic.MemorizeList(ctx, c.repo, restic.SnapshotFile)
c.snapshotFilter = snapshotFilter
c.args = args
c.trackUnused = true

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should only be set if the snapshotFilter is actually used. Aka. c.filterActive must also be set here. Or even better, get completely rid of filterActive and just use a helper function that determines whether there's an active filter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved setting of c.trackUnused into loadActiveTrees().

Comment thread internal/checker/checker.go Outdated
}

// convert used blobs into their encompassing packfiles
selectedPacks := restic.NewIDSet()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also only be computed within filteredPackfiles. The same applies for the variable definition of packsWithSize.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion encapsulates the code better :)

Comment thread internal/checker/checker.go Outdated
}

// FilterStatus returns the filtering status
func (c *Checker) FilterStatus() bool {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should move right below LoadSnapshots().

Comment thread internal/checker/checker.go Outdated
return
}
c.filterActive = true
c.trees = trees

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just return the trees instead of storing them in c.trees

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Comment thread internal/checker/checker.go Outdated
errs = []error{}

if len(c.args) > 0 || !c.snapshotFilter.Empty() {
err := (&c.snapshotFilter).FindAll(ctx, c.snapshots, c.repo, c.args, func(_ string, sn *data.Snapshot, er error) error {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
err := (&c.snapshotFilter).FindAll(ctx, c.snapshots, c.repo, c.args, func(_ string, sn *data.Snapshot, er error) error {
err := (&c.snapshotFilter).FindAll(ctx, c.snapshots, c.repo, c.args, func(_ string, sn *data.Snapshot, err error) error {

There's no problem with shadowing the outer err variable

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

return buf.String(), err
}

func TestCheckFullOutput(t *testing.T) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please merge the tree tests into a table test (see for example TestPrepareCheckCache; you bascially create a struct that contains the expected inputs and output and then use a loop to iterate over all of them). Almost all of their code is shared.

@wplapper wplapper Nov 21, 2025

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bult testCase table.

@MichaelEischer MichaelEischer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

I've added a few small cleanups, see the extra commits for details. In particular, only packsWithSize is built but not selectedPacks. As sets in go are under the hood also maps, we can just directly build the final map. FilterStatus() is now the authoritative source whether filtering happens and is less sensitive to whether it's called after certain other methods.

wplapper and others added 16 commits November 28, 2025 20:03
New implementation based on PR#5550 (refactor pack selection for read data):

cmd/restic/cmd_check.go: had to move the filter creation code buildPacksFilter()
to the point where the repo is open and the index has been loaded. Otherwise
the new function selectPacks() would not work.

Added the new function selectPack() which creates a packfile list based on current
snapshot filters. selectPack() creates a list of used snapshots, uses
data.FindUsedBlobs() to create a set of used blobs and uses this set to generate a map
of used packfiles.

In buildPacksFilter() code has been inserted for the print output to indicate
snapshot filter selection.

doc/077_troubleshooting.rst adds a paragraph about snapshot filtering.

changelog/unreleased/issue-3326 announces the new enhancement.
cmd/restic/cmd_check.go:
moved filter checking into internal/checker/checker.go
as a consequence, checker.LoadSnapshots() gained thwo paramenters,
which were just saved away. The actual processing of the snapshot filter got moved
into checker.Structure(), where context and repository are available.
function buildPacksFilter() gained the parameter 'chkr', so the selected packs
and the active flag can be retrieved from the checker instance.
buildPacksFilter() has been modified, so there is a choice between 'normal'
processing and snapshot filter processing. The printed messages were adopted
accordingly.

cmd/restic/cmd_check_integration_test.go: three new tests verifying the filtering.

internal/checker/checker.go: seee above

internal/checker/checker.go and internal/checker/testing.go had to be modified for
using the changed LoadSnapshots() interface.
linter caught me out with an incorrect spelling
add subtest in cmd_check_integration_test.go:
show that exaxctly one snapshot has been selected in Structure()
I have no idea whu Windows chokes in a different place from here. Timing?
cmd/restic/cmd_check.go:
moved a lot of code from the command to internal/checker/checker
inserted one message line "Snapshot filtering is active"

internal/checker/checker.go:
split the code from cmd/restic/cmd_check.go into parts for LoadSnapshots(),
Structure() and a new wrapper for ReadPacks()

cmd/restic/cmd_check_integration_test.go:
adopted the tests to the new environment
linter caught me out again
cmd/restic/cmd_check.go:
passing printer parameter to chkr.Structure()

cmd/restic/cmd_check_integration_test.go
internal/checker/checker_test.go
internal/checker/testing.go
adopting changes to chker.Structure()

internal/checker/checker.go:
now printing error message to printer.E
remove these pesky fialing windows tests for the time being
changelog/unreleased/issue-3326:
changed wording

doc/077_troubleshooting.rst:
clarified usage of `restic check` with SnapshotFilter

internal/restorer/restorer_windows_test.go:
reinstated original windows test

cmd/restic/cmd_check.go:
added word 'filtered ' for output display in the various messages

internal/checker/checker.go:
reworked checker logic according to recommendations, removed extra
parameter `printer`in `Structure()`.
added helper function `loadActiveTrees()`
modified `ReadPacks()` to pass filtered packfiles to `repository.ReadPacks()`.

cmd/restic/cmd_check_integration_test.go:
internal/checker/checker_test.go:
internal/checker/testing.go:
adjusted parameter lists for modified functions.
cmd/restic/cmd_check.go:
`buildPacksFilter()` received parameter `filteredStatus`, rearranged code

internal/checker/checker.go:
worked on `ReadPacks()`, rearranged code

cmd/restic/cmd_check_integration_test.go:
build a table of tests to run, together with the expeced results

internal/checker/checker_test.go:
internal/checker/testing.go:
adapt code changes in internal/checker/checker.go
@MichaelEischer

Copy link
Copy Markdown
Member

And a trivial rebase + fix to work with the changed datatype for c.blobRefs.M

@MichaelEischer MichaelEischer enabled auto-merge (squash) November 28, 2025 19:06
@MichaelEischer MichaelEischer merged commit ce57961 into restic:master Nov 28, 2025
11 checks passed
@wplapper wplapper deleted the check_snapshots-v3 branch November 29, 2025 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants