node/eth, db/downloader: gate chain.toml v2 publish paths behind --snap.p2p-manifest #20615

Merged: yperbasis merged 2 commits into main from downloader/gate-chain-toml-v2-behind-p2p-manifest on Apr 17, 2026

mh0lt (Contributor) commented Apr 17, 2026

Summary

  • Fix fresh-mainnet-sync regression introduced with the chain.toml v2 merge: the publish/seed stack was running unconditionally and collided with the downloaded chain.toml.torrent during post-OtterSync AddTorrentsFromDisk, aborting the execution service with snapshot exists with a different name: "chain.toml".
  • Gate the three unconditional v2 call sites on the existing --snap.p2p-manifest flag (default false) so default syncs stay on the pre-v2 path while opt-in users get the full v2 flow.

Reproduction (before this change)

USE_STATE_CACHE=false ./build/bin/erigon \
  --datadir=<fresh> --chain=mainnet --prune.mode=minimal

Fails during stage [1/6 OtterSync] immediately after Downloader completed remaining snapshots:

[EROR] Could not start execution service
  err="[1/6 OtterSync] after snapshot download: adding torrents from disk:
       adding torrent for chain.toml.torrent:
       snapshot exists with a different name: \"chain.toml\""

Collision originates in getExistingSnapshotTorrent (db/downloader/downloader.go:1683) — the locally-seeded chain.toml torrent is already registered under that info-hash when the disk-scanned chain.toml.torrent tries to add.
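
The collision can be modelled with a small registry keyed by info-hash. This is a minimal sketch under stated assumptions, not Erigon's code: `InfoHash`, `registry`, and `add` are hypothetical names, and the real check in `getExistingSnapshotTorrent` may differ in detail. It only illustrates why a second name arriving for an already-registered info-hash is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// InfoHash stands in for a torrent info-hash (hypothetical type for this sketch).
type InfoHash string

// registry remembers the name each info-hash was first registered under.
type registry struct {
	byHash map[InfoHash]string
}

var errDifferentName = errors.New("snapshot exists with a different name")

func newRegistry() *registry {
	return &registry{byHash: make(map[InfoHash]string)}
}

// add registers name under hash. Re-adding the same pair is a no-op;
// a different name for a known hash is rejected with the existing name.
func (r *registry) add(hash InfoHash, name string) error {
	if existing, ok := r.byHash[hash]; ok {
		if existing != name {
			return fmt.Errorf("%w: %q", errDifferentName, existing)
		}
		return nil
	}
	r.byHash[hash] = name
	return nil
}

func main() {
	r := newRegistry()
	// The unconditional publish seeds the local entry first.
	_ = r.add("abc", "chain.toml")
	// The disk scan then arrives with the same info-hash under another name.
	err := r.add("abc", "chain.toml.torrent")
	fmt.Println(err) // snapshot exists with a different name: "chain.toml"
}
```

In this model the order of registration decides which side fails, which is consistent with the fix below: if the local publish never runs by default, the disk-scanned entry has nothing to collide with.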

Change

Three gated sites, all tied to the pre-existing --snap.p2p-manifest flag (usage: "Discover snapshot manifest (chain.toml) from P2P peers via ENR instead of using centralized preverified.toml" — default false):

  1. node/eth/backend.go:541 — wrap the whole ENR updater / SetNodeSourceFn / delayed PublishLocalChainToml / StartChainTomlDiscovery / StartTorrentPeerManager block in && backend.config.Snapshot.P2PManifest.
  2. node/eth/backend.go:1461 — gate the initial publish in initDownloader().
  3. db/downloader/downloader.go:810 — gate the post-download republish using d.manifestReady != nil, which is the internal signal that EnableP2PManifest() was invoked by the backend. This also covers the loop-internal publish in chainTomlDiscoveryLoop (only runs under P2P manifest mode).

Default --snap.p2p-manifest=false ⇒ zero v2 behaviour ⇒ collision avoided.
Opt-in --snap.p2p-manifest=true ⇒ full v2 publish + discover + peer-manager flow preserved.
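
The gating pattern described above can be sketched as follows. This is an illustrative model only: `snapshotConfig`, `downloader`, `start`, and `afterDownload` are hypothetical stand-ins, and treating `manifestReady` as a channel is an assumption (the PR only states that it is non-nil once `EnableP2PManifest()` has been called). The sketch shows the two kinds of gate: a direct check on the flag at the backend call sites, and a nil check on the internal signal inside the downloader.

```go
package main

import "fmt"

// snapshotConfig mirrors only the one field this sketch needs (hypothetical).
type snapshotConfig struct {
	P2PManifest bool // would be set from --snap.p2p-manifest; default false
}

// downloader is a stand-in that models only the readiness signal
// and a counter of how many times chain.toml was published.
type downloader struct {
	manifestReady chan struct{} // nil unless enableP2PManifest was called (assumed type)
	published     int
}

func (d *downloader) enableP2PManifest() { d.manifestReady = make(chan struct{}) }

func (d *downloader) publishLocalChainToml() { d.published++ }

// afterDownload models the post-download republish gate:
// publish only when P2P manifest mode was enabled.
func (d *downloader) afterDownload() {
	if d.manifestReady != nil {
		d.publishLocalChainToml()
	}
}

// start models the backend side: the v2 stack and the initial publish
// run only when the flag is set.
func start(cfg snapshotConfig) *downloader {
	d := &downloader{}
	if cfg.P2PManifest {
		d.enableP2PManifest()
		d.publishLocalChainToml() // initial publish
	}
	d.afterDownload()
	return d
}

func main() {
	fmt.Println(start(snapshotConfig{P2PManifest: false}).published) // 0
	fmt.Println(start(snapshotConfig{P2PManifest: true}).published)  // 2
}
```

Using the nil check inside the downloader means that package does not need to read the backend's configuration; it relies on whether the backend opted in earlier.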

Test plan

  • make lint — 0 issues
  • make erigon — builds clean
  • go test -short ./db/downloader/... — passes (chaintoml_test.go, chaintoml_v2_test.go call PublishChainToml directly, unaffected)
  • go vet ./node/eth/... ./db/downloader/... — clean
  • Manual: fresh --chain=mainnet sync with default flags completes OtterSync and enters Headers stage without the collision error (reviewer to confirm on their setup — original reproducer took ~13 min to hit the crash point)
  • Manual: fresh sync with --snap.p2p-manifest=true still performs chain.toml discovery / publishing as before (v2 behaviour preserved)

🤖 Generated with Claude Code

…ap.p2p-manifest

The chain.toml v2 publish/seed stack was running unconditionally. On a fresh
mainnet sync with the default --snap.p2p-manifest=false, initDownloader's
PublishLocalChainToml() seeded the locally-computed chain.toml torrent into
the anacrolix client. OtterSync then downloaded the authoritative
chain.toml.torrent from peers and AddTorrentsFromDisk collided with the
already-seeded entry: "snapshot exists with a different name:
\"chain.toml\"". The execution service never started.

Gate the three unconditional v2 call sites on the existing --snap.p2p-manifest
flag so default syncs keep pre-v2 behaviour:

- node/eth/backend.go: wrap the ENR updater / NodeSource / delayed publish /
  discovery / torrent-peer-manager setup block in the P2PManifest check.
- node/eth/backend.go: gate the initial publish in initDownloader().
- db/downloader/downloader.go: gate the post-download republish in
  DownloadSnapshots() using d.manifestReady (non-nil iff EnableP2PManifest
  was invoked by the backend). This also covers the loop-internal publish
  in chainTomlDiscoveryLoop, which only runs under P2P manifest mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mh0lt added a commit that referenced this pull request Apr 17, 2026
…ap.p2p-manifest

Fold in the fix from #20615 — the chain.toml v2 publish/seed stack runs
unconditionally on main and collides with the downloaded chain.toml.torrent
during post-OtterSync AddTorrentsFromDisk, aborting fresh mainnet sync with
"snapshot exists with a different name: chain.toml".

Gate the two call sites we still have after the downloader extraction:

1. node/eth/backend.go: wrap the ENR updater + NodeSource + delayed
   PublishLocalChainToml + StartChainTomlDiscovery + StartTorrentPeerManager
   block in --snap.p2p-manifest. Default syncs skip the whole v2 stack.

2. db/downloader/downloader.go:810: gate the post-download republish using
   d.manifestReady != nil (the internal signal that EnableP2PManifest was
   invoked by the backend). Covers the loop-internal publish in
   chainTomlDiscoveryLoop too.

The third call site from #20615 (initDownloader's initial publish at
backend.go:1461 on main) doesn't exist in our branch — initDownloader moved
to node/components/downloader/Provider.initDownloader as part of the
extraction, and that function doesn't call PublishLocalChainToml.

Default --snap.p2p-manifest=false ⇒ zero v2 behaviour ⇒ collision avoided.
Opt-in --snap.p2p-manifest=true ⇒ full v2 publish + discover + peer-manager
flow preserved.

Ref: #20615
mh0lt (Contributor, Author) commented Apr 17, 2026

Folded the two applicable call sites into #20471 as commit 85b0dce — backend.go outer gate + downloader.go manifestReady gate.

The third call site from this PR (the initDownloader initial publish at backend.go:1461 on main) doesn't exist on #20471's branch because initDownloader moved to node/components/downloader/Provider.initDownloader as part of the extraction there, and that function doesn't call PublishLocalChainToml. So #20471 gets the same default-off behaviour as this PR without needing that third gate.

Whichever of #20471 and this PR merges first, the other can drop the overlap on rebase.

Copilot AI left a comment


Pull request overview

This PR fixes a fresh mainnet sync regression caused by chain.toml v2 publishing/discovery running unconditionally, leading to a torrent-name collision during AddTorrentsFromDisk. It restores the pre-v2 default sync path by gating v2 chain.toml publishing/discovery/peer-management behind the existing opt-in flag --snap.p2p-manifest (default false).

Changes:

  • Gate the chain.toml v2 ENR updater + discovery + torrent peer-manager wiring in New() behind Snapshot.P2PManifest.
  • Gate the initial PublishLocalChainToml() call in initDownloader() behind Snapshot.P2PManifest.
  • Gate the post-download PublishLocalChainToml() in Downloader.DownloadSnapshots() behind the internal manifestReady signal (set only when P2P manifest mode is enabled).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| node/eth/backend.go | Gates chain.toml v2 publish/discovery startup and the initial publish to only run when --snap.p2p-manifest is enabled. |
| db/downloader/downloader.go | Prevents post-download chain.toml republish unless P2P manifest mode was enabled (via manifestReady). |


@yperbasis yperbasis added this pull request to the merge queue Apr 17, 2026
Merged via the queue into main with commit 5d6a5ac Apr 17, 2026
40 checks passed
@yperbasis yperbasis deleted the downloader/gate-chain-toml-v2-behind-p2p-manifest branch April 17, 2026 15:36
sudeepdino008 pushed a commit that referenced this pull request Apr 18, 2026
…ap.p2p-manifest (#20615)

## Summary

- Fix fresh-mainnet-sync regression introduced with the chain.toml v2
merge: the publish/seed stack was running unconditionally and collided
with the downloaded `chain.toml.torrent` during post-OtterSync
`AddTorrentsFromDisk`, aborting the execution service with `snapshot
exists with a different name: "chain.toml"`.
- Gate the three unconditional v2 call sites on the existing
`--snap.p2p-manifest` flag (default `false`) so default syncs stay on
the pre-v2 path while opt-in users get the full v2 flow.

## Reproduction (before this change)

```
USE_STATE_CACHE=false ./build/bin/erigon \
  --datadir=<fresh> --chain=mainnet --prune.mode=minimal
```

Fails during stage `[1/6 OtterSync]` immediately after `Downloader
completed remaining snapshots`:

```
[EROR] Could not start execution service
  err="[1/6 OtterSync] after snapshot download: adding torrents from disk:
       adding torrent for chain.toml.torrent:
       snapshot exists with a different name: \"chain.toml\""
```

Collision originates in `getExistingSnapshotTorrent`
([db/downloader/downloader.go:1683](https://github.com/erigontech/erigon/blob/main/db/downloader/downloader.go#L1683))
— the locally-seeded `chain.toml` torrent is already registered under
that info-hash when the disk-scanned `chain.toml.torrent` tries to add.

## Change

Three gated sites, all tied to the pre-existing `--snap.p2p-manifest`
flag (usage: *"Discover snapshot manifest (chain.toml) from P2P peers
via ENR instead of using centralized preverified.toml"* — default
`false`):

1.
[node/eth/backend.go:541](https://github.com/erigontech/erigon/blob/main/node/eth/backend.go#L541)
— wrap the whole ENR updater / `SetNodeSourceFn` / delayed
`PublishLocalChainToml` / `StartChainTomlDiscovery` /
`StartTorrentPeerManager` block in `&&
backend.config.Snapshot.P2PManifest`.
2.
[node/eth/backend.go:1461](https://github.com/erigontech/erigon/blob/main/node/eth/backend.go#L1461)
— gate the initial publish in `initDownloader()`.
3.
[db/downloader/downloader.go:810](https://github.com/erigontech/erigon/blob/main/db/downloader/downloader.go#L810)
— gate the post-download republish using `d.manifestReady != nil`, which
is the internal signal that `EnableP2PManifest()` was invoked by the
backend. This also covers the loop-internal publish in
`chainTomlDiscoveryLoop` (only runs under P2P manifest mode).

Default `--snap.p2p-manifest=false` ⇒ **zero v2 behaviour** ⇒ collision
avoided.
Opt-in `--snap.p2p-manifest=true` ⇒ full v2 publish + discover +
peer-manager flow preserved.

## Test plan

- [x] `make lint` — 0 issues
- [x] `make erigon` — builds clean
- [x] `go test -short ./db/downloader/...` — passes
(`chaintoml_test.go`, `chaintoml_v2_test.go` call `PublishChainToml`
directly, unaffected)
- [x] `go vet ./node/eth/... ./db/downloader/...` — clean
- [ ] Manual: fresh `--chain=mainnet` sync with default flags completes
OtterSync and enters Headers stage without the collision error (reviewer
to confirm on their setup — original reproducer took ~13 min to hit the
crash point)
- [ ] Manual: fresh sync with `--snap.p2p-manifest=true` still performs
chain.toml discovery / publishing as before (v2 behaviour preserved)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 participants