Decentralized snapshot distribution: POC — ENR + BitTorrent flow #19657

@mh0lt

Summary

Prove the physical distribution flow for decentralized snapshot info-hashes: advertise them in discv5 ENR metadata and transfer them over BitTorrent.

Background

Currently, snapshot info-hashes are distributed via a centralized path: the erigon-snapshot GitHub repo embeds hashes into the binary as preverified.toml, with a runtime fallback to R2/GitHub. This creates a single point of failure and a central trust dependency.

This POC adds a parallel P2P distribution path alongside the existing system:

  • Publish chain.toml (new file, parallel to preverified.toml) as a BitTorrent file
  • Advertise its info-hash via discv5 ENR entries
  • New nodes discover peers, read ENR entries, and download chain.toml via BitTorrent

Non-destructive rollout: The existing preverified.toml system remains completely untouched. chain.toml is a new parallel file. All existing download logic stays as-is. The new system runs alongside and can be removed without impact.

File Lifecycle

Files now have a 3-stage lifecycle:

  1. Locally generated — node creates .seg files, builds .torrent, publishes to BitTorrent
  2. In local chain.toml — node's manifest lists info-hashes as its known-good set
  3. Confirmed good — multiple peers agree on same info-hashes (what embedded preverified represents today)

A new chain.toml is produced each time the max step advances, driven by the max frozen transaction number (a monotonically increasing counter).

Design

ENR Entry

A custom ENR entry carries the frozen tx number (monotonic version) and the torrent info-hash:

type ChainToml struct {
    FrozenTx uint64   // max frozen transaction number
    InfoHash [20]byte // torrent info-hash of chain.toml
}
func (v ChainToml) ENRKey() string { return "chain-toml" }

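28 bytes of raw payload (8-byte FrozenTx plus 20-byte InfoHash). With RLP framing and the "chain-toml" key the entry takes at most 42 bytes, which fits comfortably within the 300-byte limit on the whole ENR record.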
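To make the size estimate concrete, here is a minimal sketch that hand-encodes the entry value as an RLP list (an integer followed by a 20-byte string), which is how a struct of this shape is normally serialized. The helper names `rlpUint64` and `encodeChainToml` are hypothetical and exist only for this illustration; the real implementation would presumably rely on the existing RLP package rather than manual encoding.

```go
package main

import "encoding/binary"

// ChainToml mirrors the struct proposed in this issue.
type ChainToml struct {
	FrozenTx uint64   // max frozen transaction number
	InfoHash [20]byte // torrent info-hash of chain.toml
}

// rlpUint64 encodes v as an RLP integer: zero is the empty string (0x80),
// values below 0x80 are a single byte, and larger values are big-endian
// bytes without leading zeros, prefixed by 0x80+length.
func rlpUint64(v uint64) []byte {
	if v == 0 {
		return []byte{0x80}
	}
	if v < 0x80 {
		return []byte{byte(v)}
	}
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], v)
	i := 0
	for buf[i] == 0 {
		i++
	}
	return append([]byte{0x80 + byte(8-i)}, buf[i:]...)
}

// encodeChainToml (hypothetical helper) encodes the entry value as an RLP
// list. The payload is at most 30 bytes, so the short list header
// (0xc0+length) always applies.
func encodeChainToml(v ChainToml) []byte {
	payload := rlpUint64(v.FrozenTx)
	payload = append(payload, 0x80+20) // string header for the 20-byte hash
	payload = append(payload, v.InfoHash[:]...)
	return append([]byte{0xc0 + byte(len(payload))}, payload...)
}
```

With the largest possible FrozenTx the encoded value is 31 bytes; the 10-byte key adds another 11 bytes once RLP-encoded.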

Deterministic Info-Hash

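chain.toml is generated deterministically from the node's current .torrent files (name → infohash map). Same snapshot set = same file content = same torrent info-hash, provided entries are written in a stable order (for example, sorted by name) and the torrent always uses the same file name and piece length.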
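A minimal sketch of deterministic generation, assuming the manifest is a flat list of `'name' = 'infohash'` lines (the exact chain.toml layout is not specified in this issue, so that format and the function name `renderChainToml` are assumptions). Go map iteration order is randomized, so sorting the names is what makes the output reproducible.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderChainToml (hypothetical helper) turns a name -> info-hash map into
// manifest bytes. Names are sorted so that two nodes holding the same
// snapshot set produce byte-identical output.
func renderChainToml(hashes map[string]string) []byte {
	names := make([]string, 0, len(hashes))
	for name := range hashes {
		names = append(names, name)
	}
	sort.Strings(names)

	var sb strings.Builder
	for _, name := range names {
		// Assumed line format: 'file name' = 'hex info-hash'
		fmt.Fprintf(&sb, "'%s' = '%s'\n", name, hashes[name])
	}
	return []byte(sb.String())
}
```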

Publish Flow

  1. Generate chain.toml from local .torrent files
  2. Create torrent of chain.toml (fixed piece length)
  3. Set ENR entry: {FrozenTx: <max_frozen_tx>, InfoHash: <chain.toml torrent hash>}
  4. Seed the torrent
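Step 2 is where the "fixed piece length" matters: a BitTorrent v1 info-hash is the SHA-1 of the bencoded info dictionary, which contains the file name, length, piece length and piece hashes. The sketch below computes it by hand for a single-file torrent using only the standard library. The function names are hypothetical, and a real torrent library may add extra fields to the info dictionary, which would change the hash; treat this as an illustration of why the inputs must be fixed, not as the planned implementation.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// bencodeInfo (hypothetical helper) builds the bencoded "info" dictionary of
// a single-file v1 torrent. Bencode requires dictionary keys in sorted order:
// "length", "name", "piece length", "pieces".
func bencodeInfo(name string, content []byte, pieceLen int) []byte {
	if pieceLen <= 0 {
		panic("pieceLen must be positive")
	}
	// "pieces" is the concatenation of the SHA-1 of every piece.
	var pieces []byte
	for off := 0; off < len(content); off += pieceLen {
		end := off + pieceLen
		if end > len(content) {
			end = len(content)
		}
		sum := sha1.Sum(content[off:end])
		pieces = append(pieces, sum[:]...)
	}

	var b bytes.Buffer
	fmt.Fprintf(&b, "d6:lengthi%de4:name%d:%s12:piece lengthi%de6:pieces%d:",
		len(content), len(name), name, pieceLen, len(pieces))
	b.Write(pieces)
	b.WriteByte('e')
	return b.Bytes()
}

// infoHash returns the v1 info-hash: SHA-1 over the bencoded info dictionary.
func infoHash(name string, content []byte, pieceLen int) [20]byte {
	return sha1.Sum(bencodeInfo(name, content, pieceLen))
}
```

Two nodes that pass the same name, content and piece length get the same 20-byte hash; changing any one of the three yields a different torrent.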

Discovery Flow

  1. Discover peers via discv5
  2. Read "chain-toml" ENR entry from peers
  3. Find peer(s) with highest FrozenTx
  4. Download chain.toml via BitTorrent using the info-hash
  5. For POC: chain.toml is informational (existing preverified.toml flow handles actual downloads)
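Steps 2 and 3 can be sketched as a pure selection function over the entries read from peers. The `peerEntry` type and `bestAdvertised` function are hypothetical names used for illustration; the actual discv5 iteration and ENR decoding are out of scope here.

```go
package main

// ChainToml mirrors the ENR entry proposed in this issue.
type ChainToml struct {
	FrozenTx uint64
	InfoHash [20]byte
}

// peerEntry (hypothetical) pairs a peer identifier with the entry it advertises.
type peerEntry struct {
	NodeID string
	Entry  ChainToml
}

// bestAdvertised returns the entry with the highest FrozenTx that is strictly
// newer than localFrozenTx, plus the IDs of all peers advertising exactly that
// entry. ok is false when no peer is ahead of the local node.
// If two peers advertise the same FrozenTx with different info-hashes, the
// first one seen wins; a real implementation would need a policy for that.
func bestAdvertised(peers []peerEntry, localFrozenTx uint64) (best ChainToml, seeders []string, ok bool) {
	for _, p := range peers {
		if p.Entry.FrozenTx <= localFrozenTx {
			continue
		}
		if !ok || p.Entry.FrozenTx > best.FrozenTx {
			best, ok = p.Entry, true
		}
	}
	if !ok {
		return ChainToml{}, nil, false
	}
	for _, p := range peers {
		if p.Entry == best {
			seeders = append(seeders, p.NodeID)
		}
	}
	return best, seeders, true
}
```

Returning the list of agreeing peers also gives a cheap signal for the "confirmed good" lifecycle stage, since it shows how many peers advertise the same info-hash.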

Update Propagation

When snapshots advance to a new max step:

  • New chain.toml generated (superset of previous)
  • New torrent → new info-hash
  • ENR updated with new FrozenTx + InfoHash
  • Peers discover updated ENR during normal discv5 refresh

Implementation Plan

Phase 1: ENR Entry Type

New file: p2p/enr/chain_toml.go + tests

Define ChainToml ENR entry with RLP encode/decode following existing patterns in p2p/enr/entries.go.

Phase 2: chain.toml Generation (Parallel to preverified.toml)

New file: db/downloader/chaintoml.go

No existing files modified. Core functions:

  • GenerateChainToml(snapDir) — scan .torrent files, produce TOML bytes
  • SaveChainToml(snapDir, tomlBytes) — atomic write
  • LoadChainToml(snapDir) — read local file
  • BuildChainTomlTorrent(snapDir, torrentFS) — build .torrent, return info-hash
  • PublishChainToml(snapDir, torrentFS, enrUpdater) — orchestrate generate → save → torrent → ENR
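The "atomic write" in SaveChainToml is commonly done by writing to a temporary file in the same directory and renaming it over the target. A minimal sketch under that assumption follows; the lowercase function names and the `chainTomlName` constant are illustrative, not the final API.

```go
package main

import (
	"os"
	"path/filepath"
)

// chainTomlName is the assumed on-disk name of the manifest.
const chainTomlName = "chain.toml"

// saveChainToml writes data to a temp file in snapDir, syncs it, and renames
// it over chain.toml, so readers see either the old or the new file but never
// a partially written one.
func saveChainToml(snapDir string, data []byte) error {
	tmp, err := os.CreateTemp(snapDir, chainTomlName+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Clean up the temp file on failure; after a successful rename the file
	// no longer exists under this name and the error is ignored.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, filepath.Join(snapDir, chainTomlName))
}

// loadChainToml reads the local manifest.
func loadChainToml(snapDir string) ([]byte, error) {
	return os.ReadFile(filepath.Join(snapDir, chainTomlName))
}
```

Creating the temp file in the same directory keeps the rename on one filesystem, which is what makes it atomic on POSIX systems.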

Phase 3: Torrent Creation + ENR Advertisement

Add enrUpdater callback to Downloader struct. Wire in node/eth/backend.go.

Trigger points (additive hooks after existing logic):

  • After SaveSnapshotHashes in stage_snapshots.go
  • After seeder.Seed() in MergeBlocks onMerge callback

Phase 4: P2P Discovery + Download

New file: db/downloader/p2p_chaintoml.go

  • DiscoverChainToml(discv5) — iterate peers, find highest FrozenTx
  • DownloadChainToml(torrentClient, infoHash) — download chain.toml via BitTorrent

Runs as a separate path from LoadSnapshotsHashes — no modification to existing startup.

Phase 5: Background Update Loop

Goroutine in Downloader, gated by feature flag. Every 5 minutes: discover peers, download newer chain.toml if available, validate append-only property, update local file + ENR.
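The append-only check could look roughly like the sketch below: every name in the local manifest must still be present in the downloaded one with an unchanged info-hash, while new names are allowed. `checkAppendOnly` is a hypothetical name, and both manifests are assumed to be already parsed into name → info-hash maps.

```go
package main

import "fmt"

// checkAppendOnly returns an error if remote drops or rewrites any entry
// present in local. Entries that exist only in remote are accepted.
func checkAppendOnly(local, remote map[string]string) error {
	for name, hash := range local {
		got, ok := remote[name]
		if !ok {
			return fmt.Errorf("remote chain.toml dropped %q", name)
		}
		if got != hash {
			return fmt.Errorf("remote chain.toml changed hash for %q: have %s, got %s", name, hash, got)
		}
	}
	return nil
}
```

A downloaded chain.toml that fails this check would be discarded and the local file and ENR left unchanged.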

Phase 6: Graceful Removal

Entire feature removable by: deleting chain.toml + its .torrent, restarting without feature flag, deleting new source files. No existing code affected.


Files

New files

File                              Purpose
p2p/enr/chain_toml.go             ChainToml ENR entry type + RLP
p2p/enr/chain_toml_test.go        ENR roundtrip tests
db/downloader/chaintoml.go        Generation, saving, torrent creation, ENR update, background loop
db/downloader/chaintoml_test.go   Unit tests
db/downloader/p2p_chaintoml.go    P2P discovery + download logic

Modified files (additive only)

File                                              Change
node/eth/backend.go                               Wire ENR updater callback
execution/stagedsync/stage_snapshots.go           Add PublishChainToml after SaveSnapshotHashes
db/snapshotsync/freezeblocks/block_snapshots.go   Add PublishChainToml in onMerge
db/downloader/downloader.go                       Add enrUpdater field + background loop

Untouched (explicitly)

  • db/datadir/dirs.go: PreverifiedFileName stays as "preverified.toml"
  • db/downloader/downloadercfg/downloadercfg.go: LoadSnapshotsHashes / SaveSnapshotHashes unchanged
  • db/snapcfg/util.go: Cfg.Local flag unchanged
  • db/snapshotsync/snapshotsync.go: SyncSnapshots Local check unchanged

Acceptance Criteria

  • Custom ENR entry type defined and registered with tests
  • Node generates chain.toml from local .torrent files
  • Node publishes {FrozenTx, InfoHash} in ENR on startup
  • Node creates and seeds torrent of chain.toml
  • Discovering node can read chain-toml ENR entries from peers
  • Discovering node can download chain.toml via BitTorrent
  • Background loop updates chain.toml when peers have newer version
  • ENR updates when snapshots advance to new max step
  • Feature gated behind flag — existing flow completely unaffected
  • Existing centralized flow continues to work unchanged
