db/snapcfg: drop the github.com/erigontech/erigon-snapshot Go-module dependency #21154

@wmitsuda

Parent: #21047

Goal

Stop embedding erigon-snapshot's per-chain preverified hash TOMLs into the erigon binary, and drop the github.com/erigontech/erigon-snapshot Go-module import. The HTTP source (raw.githubusercontent.com/erigontech/erigon-snapshot and the R2 mirror) stays as-is — only the build-time //go:embed coupling goes away.

Why this is safe: the embedded hashes are no longer used

The embedded TOMLs were once the source of preverified hashes, but two PRs demoted them. Today the data is loaded, immediately overwritten by a fetched copy, and discarded.

Era 1 — embedded TOMLs were the only source (Jul 2022 → Jul 2024)

Era 2 — embedded TOMLs as silent offline fallback (Jul 31 → Oct 23 2024)

  • 92fb732224 (PR #11370, "Evergreen") added LoadRemotePreverified(), which returns couldFetch bool. The caller in setupSnapCfg (a) skipped the remote fetch entirely if any .torrent files existed in the snapshots dir, and (b) ignored the return value, so any fetch failure silently fell back to the embedded values. Erigon still worked offline.
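The Era 2 behaviour described above can be sketched roughly as follows. This is an illustrative model only, not the actual setupSnapCfg code: hashStore, loadRemote, setupEra2, hasTorrentFiles and errOffline are hypothetical names introduced here, and the real function signatures may differ.

```go
package main

import (
	"errors"
	"os"
	"strings"
)

// errOffline is a hypothetical sentinel used to simulate a failed fetch.
var errOffline = errors.New("offline")

// hashStore stands in for the package-level hash data (assumption).
type hashStore struct{ hashes []byte }

// loadRemote overwrites the stored hashes only when the fetch succeeds and
// reports whether it could fetch, mirroring the couldFetch bool return.
func (s *hashStore) loadRemote(fetch func() ([]byte, error)) (couldFetch bool) {
	b, err := fetch()
	if err != nil {
		return false
	}
	s.hashes = b
	return true
}

// hasTorrentFiles reports whether dir contains at least one .torrent file.
// An unreadable or missing dir counts as "no torrent files".
func hasTorrentFiles(dir string) bool {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return false
	}
	for _, e := range entries {
		if !e.IsDir() && strings.HasSuffix(e.Name(), ".torrent") {
			return true
		}
	}
	return false
}

// setupEra2 models the Era 2 caller: (a) skip the remote fetch when torrent
// files already exist, (b) ignore the couldFetch result, so a failed fetch
// leaves the embedded hashes in place.
func setupEra2(s *hashStore, snapDir string, fetch func() ([]byte, error)) {
	if hasTorrentFiles(snapDir) {
		return
	}
	_ = s.loadRemote(fetch) // return value deliberately ignored
}
```

With a store initialised to the embedded bytes, a failing fetch leaves those bytes untouched, which is the silent offline fallback this era relied on.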

Era 3 — embedded TOMLs demoted to vestigial scaffolding (Oct 23 2024 → today). This era begins with the change that removed the fallback: a failed fetch is now fatal at startup, as the reproduction below shows.

Empirical confirmation

Reproduced on commit 428c559 with HTTPS_PROXY=http://127.0.0.1:1 ./build/bin/erigon --chain=hoodi --datadir=<ephemeral>. Erigon aborts in ~35 ms:

[INFO] Loading remote snapshot hashes           chain=hoodi
[WARN] Failed to load snapshot hashes from R2; falling back to GitHub chain=hoodi err="...proxyconnect tcp: dial tcp 127.0.0.1:1: connect: operation not permitted"
[CRIT] Snapshot hashes for supported networks was not loaded. Please check your network connection and/or GitHub status here https://www.githubstatus.com/ chain=hoodi err="..."
[EROR] Erigon startup                           err="load snapshot hashes: failed to fetch remote snapshot hashes for chain hoodi"

snapshothashes.Hoodi was populated with a valid embedded TOML at build time but was never consulted — the binary refuses to start.
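The control flow visible in the log above (try R2, fall back to GitHub, then abort) could be modelled along these lines. This is a sketch under assumptions: fetcher, loadRemoteHashes and errUnreachable are hypothetical names, and only the final error text is taken from the log. The point it illustrates is that no embedded value appears anywhere on the failure path.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable is a hypothetical error used to simulate a dead source.
var errUnreachable = errors.New("source unreachable")

// fetcher retrieves the preverified TOML bytes for a chain from one source.
type fetcher func(chain string) ([]byte, error)

// loadRemoteHashes tries each source in order (for example R2, then GitHub)
// and returns the first successful result. If every source fails it returns
// an error; it never falls back to embedded data.
func loadRemoteHashes(chain string, sources ...fetcher) ([]byte, error) {
	if len(sources) == 0 {
		return nil, fmt.Errorf("no snapshot hash sources configured for chain %s", chain)
	}
	var lastErr error
	for _, fetch := range sources {
		b, err := fetch(chain)
		if err == nil {
			return b, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("failed to fetch remote snapshot hashes for chain %s: %w", chain, lastErr)
}
```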

What the embedded data is actually used for, end-to-end

In db/snapcfg/util.go:

  • registry.raw (lines 62-73) — embedded bytes keyed by chain name. Only the map keys matter at runtime; they drive the (_, known bool) return of KnownCfg. []byte{} placeholders would behave identically.
  • snapshotHashPtrs (lines 144-151) — pointers to the package-level []byte vars in erigon-snapshot. LoadRemotePreverified does *ptr = hashes (lines 578-580) to overwrite them with the fetched bytes. The embedded values are the destination buffer, never read for their content.
  • LoadSnapshotsHashes (db/downloader/downloadercfg/downloadercfg.go) calls KnownCfg once (parses embedded), then either SetToml (local preverified.toml) or LoadRemotePreverified (network fetch). Both invalidate the cached embedded *Cfg before any downstream consumer reads it.
  • The only exception: case "embedded" in db/snapcfg/preverified.go:27-28, reachable only via --preverified=embedded.
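The "destination buffer" role of the embedded values can be shown with a minimal sketch. The variable contents and the overwrite helper are assumptions made for illustration; only the snapshotHashPtrs name and the *ptr = hashes assignment come from the description above.

```go
package main

// Stand-ins for the package-level []byte vars that erigon-snapshot exports.
// The initial contents are placeholders, not real TOML data.
var (
	Mainnet = []byte("embedded-mainnet")
	Hoodi   = []byte("embedded-hoodi")
)

// snapshotHashPtrs maps a chain name to the variable that holds its hashes.
var snapshotHashPtrs = map[string]*[]byte{
	"mainnet": &Mainnet,
	"hoodi":   &Hoodi,
}

// overwrite replaces each known chain's bytes with the fetched copy.
// The previous (embedded) content is never read, only replaced.
func overwrite(fetched map[string][]byte) {
	for chain, hashes := range fetched {
		if ptr, ok := snapshotHashPtrs[chain]; ok {
			*ptr = hashes
		}
	}
}
```

Because the old slice contents are never inspected, starting from empty slices would give the same result after the overwrite.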

Proposal

  1. Drop the github.com/erigontech/erigon-snapshot import from db/snapcfg/util.go.
  2. Replace registry.raw with a chain-name set (map[string]struct{} or similar) so KnownCfg(_, known) keeps working. Empty []byte per chain is a minimal-diff alternative.
  3. Keep the HTTP source unchanged. https://raw.githubusercontent.com/erigontech/erigon-snapshot/<branch>/<chain>.toml and the R2 mirror are independent of the Go-module dep.
  4. Decide the fate of --preverified=embedded (PR #18273, "Support snapshot reset with symlinks"). Two options:
    • Drop it — it's a dev convenience, not user-facing.
    • Repoint to per-build-baked TOMLs from the live CDN (release-time codegen). Worth tracking separately.
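Step 2 of the proposal might look something like the sketch below. The knownChains variable and isKnownChain helper are hypothetical, and the chain-name strings are assumed lowercase forms of the six referenced chains (only "hoodi" is confirmed by the reproduction above). The real KnownCfg returns a config alongside the bool, which this sketch leaves out.

```go
package main

// knownChains is a chain-name set replacing the embedded-bytes map.
// The keys are assumed names; only their presence matters.
var knownChains = map[string]struct{}{
	"mainnet":  {},
	"sepolia":  {},
	"gnosis":   {},
	"chiado":   {},
	"hoodi":    {},
	"bloatnet": {},
}

// isKnownChain reports whether a chain has preverified snapshot support,
// which is the only information the map keys provided at runtime.
func isKnownChain(name string) bool {
	_, ok := knownChains[name]
	return ok
}
```

A map[string]struct{} carries no per-chain payload, so nothing is left that would require the erigon-snapshot import.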

Impact

Measured by replacing github.com/erigontech/erigon-snapshot with a stub declaring empty []byte vars (Mainnet, Sepolia, Gnosis, Chiado, Hoodi, ArbSepolia, Bloatnet) via a local replace directive, then running make erigon before/after on commit 428c559, dep v1.3.1-0.20260402120223-7bb412bc89cd, darwin/arm64.

| | bytes | size |
|---|---|---|
| HEAD | 146,316,610 | 139.54 MiB |
| Stubbed | 143,343,490 | 136.70 MiB |
| Δ uncompressed | −2,973,120 | −2.84 MiB (−2.03%) |
| Δ gzip -9 | −1,015,102 | −0.97 MiB (−1.58%) |

The linker already dead-strips ArbSepolia (~1.4 MB) since snapcfg doesn't reference it; the saved bytes are the six referenced chains' embedded data.
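The uncompressed delta reported above can be re-derived from the two binary sizes. The helper names below are illustrative only; the byte counts are the HEAD and Stubbed figures from the measurement.

```go
package main

// Binary sizes in bytes, taken from the measurement above.
const (
	headBytes    int64 = 146_316_610
	stubbedBytes int64 = 143_343_490
)

// toMiB converts a byte count to mebibytes (1 MiB = 2^20 bytes).
func toMiB(b int64) float64 {
	return float64(b) / (1 << 20)
}

// pctChange returns the relative change from before to after, in percent.
func pctChange(before, after int64) float64 {
	return float64(after-before) / float64(before) * 100
}
```

Here stubbedBytes - headBytes is −2,973,120 bytes, toMiB of that difference is about −2.84, and pctChange(headBytes, stubbedBytes) is about −2.03. The gzip row cannot be checked this way because the compressed sizes themselves are not listed.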

Failure mode is unchanged — fail-fast from #12415 is preserved. Restoring an offline-bootstrap path is a separate decision and is not blocked by this issue.
