The erigon-snapshot module embeds pre-built snapshot hash TOML files into the binary at compile time (//go:embed mainnet.toml, etc.). These are loaded into package-level variables (snapshothashes.Mainnet, etc.) and used to initialize knownPreverified in db/snapcfg/util.go.
However, as of Erigon 3.3, these embedded hash values are never consumed by any code path. They are always overwritten before any code reads them, or the node exits before reaching code that would.
How it works today
During Ethereum.Init() (node/eth/backend.go:1302), LoadSnapshotsHashes() is called. It either:
- Loads from the local preverified.toml (exists after a previous completed sync) and overwrites knownPreverified via SetToml()
- Fetches from the R2 CDN, falling back to GitHub, and overwrites knownPreverified via LoadRemotePreverified() (which rebuilds the entire map at util.go:559)
- Fails fatally if both remote sources are down: the node logs a CRIT and exits with code 1
In all three cases, the embedded hash values are never used:
- Cases 1 & 2: overwritten before any downstream code (SyncSnapshots, MergeLimit, etc.) reads them
- Case 3: the node exits before reaching any code that reads hash items
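The three cases can be modeled with a minimal Go sketch. This is not the Erigon implementation: the preverified and source types, the loadSnapshotHashes function, and the file names are hypothetical stand-ins chosen for illustration. It only shows the control flow described above, where the embedded value is held in memory but no branch ever returns it.

```go
package main

import (
	"errors"
	"fmt"
)

// preverified maps a snapshot file name to its hash (simplified, hypothetical).
type preverified map[string]string

// source is a hypothetical stand-in for one place hashes can be loaded from
// (local preverified.toml, R2, GitHub).
type source struct {
	name string
	load func() (preverified, error)
}

// errUnreachable simulates a source that cannot be read.
var errUnreachable = errors.New("unreachable")

// loadSnapshotHashes tries each source in order. The first one that succeeds
// replaces the current value. If every source fails, an error is returned and
// the caller is expected to exit. The embedded value is never returned.
func loadSnapshotHashes(embedded preverified, sources []source) (preverified, error) {
	_ = embedded // held in memory, but no branch below reads it
	var errs []error
	for _, s := range sources {
		p, err := s.load()
		if err == nil {
			return p, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
	}
	return nil, errors.Join(errs...)
}

func main() {
	embedded := preverified{"headers-old-layout": "embedded-hash"}
	down := func() (preverified, error) { return nil, errUnreachable }
	remote := func() (preverified, error) {
		return preverified{"headers-new-layout": "remote-hash"}, nil
	}

	// Case 2: no local file, R2 down, GitHub reachable.
	got, err := loadSnapshotHashes(embedded, []source{{"local", down}, {"r2", down}, {"github", remote}})
	fmt.Println(got, err)

	// Case 3: everything down. The real node logs CRIT and exits at this point.
	_, err = loadSnapshotHashes(embedded, []source{{"local", down}, {"r2", down}, {"github", down}})
	fmt.Println("startup aborted:", err != nil)
}
```

In this model, deleting the embedded argument changes no observable behavior, which is the point the issue makes about the real code.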
The only call to KnownCfg() that happens before LoadSnapshotsHashes() is in SetUpBlockReader() (backend.go:1354), but it discards the *Cfg and only checks the bool return value to determine if the chain is known — it never looks at the actual hash items.
Verified behavior
Starting Erigon with both R2 and GitHub unreachable (simulated via HTTPS_PROXY=http://127.0.0.1:1) on a fresh datadir produces:
[INFO] Loading remote snapshot hashes
[WARN] Failed to load snapshot hashes from R2; falling back to GitHub err="..."
[CRIT] Snapshot hashes for supported networks was not loaded. Please check
your network connection and/or GitHub status here https://www.githubstatus.com/
chain=mainnet err="..."
[EROR] Erigon startup err="failed to fetch remote snapshot hashes for chain mainnet"
The node exits immediately. The embedded hashes sitting in memory are never consulted.
Why this is a problem
The code structure is misleading. Reading db/snapcfg/util.go, you see:
var Mainnet = fromEmbeddedToml(snapshothashes.Mainnet)
...and the knownPreverified map initialized from these values. This naturally suggests these serve as defaults or fallbacks. But they don't — they're unconditionally overwritten or the process dies.
Available choices
There are two possible directions: remove the embedded hashes entirely, or fix the code to actually use them as a fallback. We should remove them. Here's why the fallback approach doesn't work:
Embedded hashes go stale and become unusable. The embedded hashes reflect the snapshot file layout at the time the binary was built. Over time, snapshot files get merged — e.g., multiple 500K-block segments get merged into a single 1M-block segment. All live nodes perform these merges, and the old pre-merge files eventually disappear from the torrent network.
As the operator of the snapshot infrastructure, we keep old (pre-merge) files available on our webseeds for up to ~8 weeks. This means a fallback to embedded hashes could theoretically work for up to 8 weeks after a binary is released — but even in that window, it would bootstrap the node from a potentially outdated snapshot layout, which is a poor user experience.
After the 8-week window, the fallback would simply fail: the embedded hashes would reference files that no longer exist on any torrent peer or webseed, leaving the node unable to sync anyway.
Instead of investing in a fallback mechanism with a limited shelf life, the better path is to improve the availability of the hash-publishing infrastructure itself — for example, adding a second CDN besides Cloudflare R2 (and randomizing across them, similar to how embedded bootnodes work), or exploring a decentralized publishing mechanism.
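As a rough illustration of the "second CDN with randomization" idea, the sketch below shuffles a list of mirrors and returns the first successful response. Everything here is an assumption made for the example: the fetchFromAny function, the fetchFunc type, and the example.com-style URLs are hypothetical, and the fetcher is injected so the logic can be exercised without network access.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// fetchFunc is a hypothetical fetcher for a single mirror URL.
type fetchFunc func(url string) ([]byte, error)

var (
	errNoMirrors  = errors.New("no mirrors configured")
	errMirrorDown = errors.New("mirror down")
)

// fetchFromAny shuffles a copy of mirrors and returns the body and URL of the
// first mirror that responds. If all mirrors fail, the joined errors are returned.
func fetchFromAny(mirrors []string, fetch fetchFunc, shuffle func(n int, swap func(i, j int))) ([]byte, string, error) {
	if len(mirrors) == 0 {
		return nil, "", errNoMirrors
	}
	order := append([]string(nil), mirrors...) // copy so the caller's slice is untouched
	shuffle(len(order), func(i, j int) { order[i], order[j] = order[j], order[i] })

	var errs []error
	for _, u := range order {
		body, err := fetch(u)
		if err == nil {
			return body, u, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", u, err))
	}
	return nil, "", errors.Join(errs...)
}

func main() {
	// Hypothetical mirror URLs; the first one is simulated as unavailable.
	mirrors := []string{"https://cdn-a.example/mainnet.toml", "https://cdn-b.example/mainnet.toml"}
	fetch := func(url string) ([]byte, error) {
		if url == mirrors[0] {
			return nil, errMirrorDown
		}
		return []byte("hashes"), nil
	}
	body, from, err := fetchFromAny(mirrors, fetch, rand.Shuffle)
	fmt.Println(string(body), from, err)
}
```

Passing the shuffle function as a parameter keeps the selection order testable; production code would pass rand.Shuffle as shown in main.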
Proposal
Remove the embedded snapshot hash data. Specifically:
- Remove the //go:embed TOML hash files in the root erigon-snapshot package and the associated snapshothashes.Mainnet/Sepolia/... variables
- Remove the package-level Mainnet = fromEmbeddedToml(...) initializers in db/snapcfg/util.go
- Initialize knownPreverified as an empty map (or with chain names only for the KnownCfg() bool check), then populate it from LoadSnapshotsHashes()
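One possible shape for the "chain names only" variant is sketched below. It is a simplified model, not a patch against db/snapcfg/util.go: the registry type, newRegistry, knownCfg, and setLoaded are hypothetical names, and Cfg is reduced to a single map. The sketch shows that the bool check keeps working before any hashes are loaded, and that the items appear only after the load step.

```go
package main

import (
	"fmt"
	"sync"
)

// Cfg is a simplified stand-in for the snapshot config type.
type Cfg struct {
	Items map[string]string // file name -> hash
}

// registry is a hypothetical replacement for the package-level map.
type registry struct {
	mu    sync.RWMutex
	known map[string]*Cfg
}

// newRegistry registers chain names with empty configs, so no embedded
// hashes are needed at init time.
func newRegistry(chains ...string) *registry {
	r := &registry{known: make(map[string]*Cfg, len(chains))}
	for _, c := range chains {
		r.known[c] = &Cfg{Items: map[string]string{}}
	}
	return r
}

// knownCfg returns the config for a chain and whether the chain is known.
func (r *registry) knownCfg(chain string) (*Cfg, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	c, ok := r.known[chain]
	return c, ok
}

// setLoaded replaces a chain's config once hashes have been loaded
// from the local file or a remote source.
func (r *registry) setLoaded(chain string, items map[string]string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.known[chain] = &Cfg{Items: items}
}

func main() {
	reg := newRegistry("mainnet", "sepolia")

	// Early caller: only the bool matters, as in the block reader setup.
	_, known := reg.knownCfg("mainnet")
	fmt.Println("mainnet known:", known)

	// Later: hashes arrive from the load step (placeholder values here).
	reg.setLoaded("mainnet", map[string]string{"some-file": "some-hash"})
	cfg, _ := reg.knownCfg("mainnet")
	fmt.Println("items loaded:", len(cfg.Items))
}
```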
What should stay:
- The LoadSnapshots() / fetchSnapshotHashes() functions (remote fetching logic): these are actively used
- The webseed subpackage: its embedded data is actively consumed by KnownWebseeds
This would:
- Make the code honest about what it does — no fake fallback that doesn't work
- Reduce the frequency of erigon-snapshot dependency bumps (today every hash update requires a go.mod bump even though the embedded values are unused)
- Simplify db/snapcfg/util.go initialization