Skip to content

Support snapshot reset with symlinks#18273

Merged
anacrolix merged 10 commits into
mainfrom
anacrolix/reset-chaindata-symlinks
Jan 13, 2026
Merged

Support snapshot reset with symlinks#18273
anacrolix merged 10 commits into
mainfrom
anacrolix/reset-chaindata-symlinks

Conversation

@anacrolix

@anacrolix anacrolix commented Dec 12, 2025

Copy link
Copy Markdown
Contributor

Adds a ton of tests.

Fixes #18135, #18191, #16693.

Adds a --preverified flag to snapshots reset to choose embedded, remote or local file to use for reset. I used it for testing faster, but it is helpful.

I tried to implement some extra security to avoid recursively deleting if someone symlinked to the wrong place, but it got out of hand on complexity. I ended up discovering fs.FS, and os.Root are mostly useless for this. I settled on 2 possibilities: Checking symlinks to directories always end with the same name as the source, and having a sequence of os.Roots where deletion is allowed. The first was a bit too complex, and the second would require a bunch of extra documentation and user education which I don't think is worth it. Most good users will be setting correct permissions, and using hard/bind mounts to do things reliably.

TODO:

  • Take advantage of the tests to check snapshot filters work.
  • Double check the os.Root for CLI run
  • Probably move Reset into another package, I think it will get used in some form in stage sync at some point.
  • Clean up tests and commented out crap
  • Document the removal heuristic, and --preverified.

@anacrolix anacrolix force-pushed the anacrolix/reset-chaindata-symlinks branch 2 times, most recently from 69e96d1 to 638ba91 Compare December 17, 2025 12:32
@anacrolix anacrolix force-pushed the anacrolix/reset-chaindata-symlinks branch 4 times, most recently from b60355d to 782e7c5 Compare January 9, 2026 06:29
@anacrolix anacrolix marked this pull request as ready for review January 9, 2026 06:29
@AskAlexSharov

Copy link
Copy Markdown
Collaborator

@sudeepdino008 CR plz

* Move into db/datadir/reset
* Support symlinks and directory mounts
* Add tons of tests
* Improve logging
* Allow selecting preverified source
@anacrolix anacrolix force-pushed the anacrolix/reset-chaindata-symlinks branch from 782e7c5 to 1b7de17 Compare January 9, 2026 06:43
@anacrolix

Copy link
Copy Markdown
Contributor Author

@sudeepdino008 could you take a look? This should be ready

@anacrolix

Copy link
Copy Markdown
Contributor Author

Fixing lints :)

@anacrolix anacrolix closed this Jan 9, 2026
@anacrolix anacrolix reopened this Jan 9, 2026
@sudeepdino008

Copy link
Copy Markdown
Member

LGTM!

some nitpicks:

I tried it on a hard-symlinked clone of a datadir, and got this warning:

WARN[01-09|14:34:40.718] Error removing directory                 name=/Users/moskud/oss/datadir/ethmainnet_full4/snapshots/accessor err="removing file /Users/moskud/oss/datadir/ethmainnet_full4/snapshots/accessor: remove /Users/moskud/oss/datadir/ethmainnet_full4/snapshots/accessor: directory not empty"

maybe this shouldn't be a WARN? accessor has some files so can't be deleted.

on --dry-run, we get msg at end - "Reset complete. Start Erigon as usual, missing files will be downloaded."
Better to have something more dry run related.

@anacrolix

Copy link
Copy Markdown
Contributor Author

Great feedback, thanks. I'll include those suggestions

@anacrolix

Copy link
Copy Markdown
Contributor Author

@sudeepdino008 I incorporated your feedback.

@anacrolix anacrolix enabled auto-merge (squash) January 12, 2026 09:38
@anacrolix anacrolix merged commit 2bc289d into main Jan 13, 2026
19 of 20 checks passed
@anacrolix anacrolix deleted the anacrolix/reset-chaindata-symlinks branch January 13, 2026 00:38
AskAlexSharov pushed a commit that referenced this pull request Apr 30, 2026
- \`seg reset\` regressed in #18273: \`removeLocal\` from the CLI is
wired into \`reset.Reset.RemoveUnknown\` but not \`RemoveLocal\`, which
gates chaindata + Heimdall/PolygonBridge DB removal in
\`db/datadir/reset/reset.go\`. Result: \`seg reset\` no longer touches
chaindata at all.
- Library-level \`TestResetChaindata\` exists but sets \`RemoveLocal:
true\` directly, so the CLI plumbing was uncovered.

closes #20260
yperbasis pushed a commit to Sahil-4555/erigon that referenced this pull request Apr 30, 2026
- \`seg reset\` regressed in erigontech#18273: \`removeLocal\` from the CLI is
wired into \`reset.Reset.RemoveUnknown\` but not \`RemoveLocal\`, which
gates chaindata + Heimdall/PolygonBridge DB removal in
\`db/datadir/reset/reset.go\`. Result: \`seg reset\` no longer touches
chaindata at all.
- Library-level \`TestResetChaindata\` exists but sets \`RemoveLocal:
true\` directly, so the CLI plumbing was uncovered.

closes erigontech#20260
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request May 29, 2026
… dep (erigontech#21197)

Fixes erigontech#21154. Fixes erigontech#19732. Sub-task of erigontech#21047.

## Summary
- Drop the `github.com/erigontech/erigon-snapshot` Go-module import. The
embedded TOMLs it ships were loaded at startup, immediately overwritten
by a runtime fetch, and discarded — they have been unused on every
daemon path since erigontech#12415 made remote-fetch failure fatal.
- Drop the `--preverified=embedded` flag value (a dev convenience from
erigontech#18273); `remote` and `local` remain.
- Clean up the now-vestigial registry pieces: remove the unused
`preverifiedRegistry.Reset` method (dead since erigontech#19641 switched to
per-chain loading), promote the immutable supported-chain set to a
package-level `knownChains` var, and inline its only membership-check
consumer in `SetToml`.

Runtime fetch source
(`raw.githubusercontent.com/erigontech/erigon-snapshot` + R2 mirror) and
fail-fast behaviour are unchanged. Binary size: −2,973,120 bytes
uncompressed (−2.0%) / −1,015,102 bytes gzipped (−1.6%) on
`darwin/arm64`, measured by building before/after a stubbed-empty
`erigon-snapshot`.

## Test plan
- [x] `make lint && make erigon integration` clean
- [x] `go test ./db/snapcfg/... ./db/downloader/downloadercfg/...` pass
- [x] Manual: `--chain=hoodi` with both CDN hosts unreachable
(`HTTPS_PROXY` pointed at a dead local port) exits non-zero with the
same fail-fast `[CRIT] Snapshot hashes for supported networks was not
loaded …` startup trace
- [x] Manual: `--chain=hoodi` with normal network logs `Loading remote
snapshot hashes chain=hoodi`, no `Failed to load` warning, no `[CRIT]`,
and progresses into the downloader (segments begin downloading)
- [x] Manual: `--chain=hoodi` against a fresh datadir prepopulated with
a real `<datadir>/snapshots/preverified.toml` (fetched out-of-band)
starts cleanly with HTTPS_PROXY pointed at a dead local port — no
`Loading remote snapshot hashes` log line, no `[CRIT]`, downloader
brings up — confirming the local-file path bypasses remote fetch
- [x] Manual: `erigon snapshots reset --datadir=<dd>
--preverified=embedded --dry-run` exits 1 with `Error: invalid
preverified flag value "embedded"`; the `--help` output shows `(remote,
local)`; `--preverified=remote` and `--preverified=local` continue to
work
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

erigon snapshots reset fails on mounted chaindata directory and may break symlink setups

3 participants