snapcfg: lazy-parse EmbeddedWebseeds, only parse the chain in use #19722

Merged
wmitsuda merged 4 commits into main from wmitsuda/do-not-read-all-chains-webseeds-toml on Mar 8, 2026
@wmitsuda wmitsuda commented Mar 7, 2026

Member

Follow-up to #19641 — step 2/N towards simplifying TOML reading at startup.

Summary

  • Lazy-parse webseed TOML: instead of parsing all 8 chains' webseed TOML at init time, store raw bytes in EmbeddedWebseedsRaw and parse on demand via GetEmbeddedWebseeds(chain) — only the chain actually in use gets parsed.
  • Remove no-op re-assignment: LoadRemotePreverified was redundantly re-building the same KnownWebseeds map; removed.
  • Inline webseedsParse: folded into its sole caller GetEmbeddedWebseeds.
  • Rename KnownWebseeds → EmbeddedWebseeds: clearer naming, with EmbeddedWebseedsRaw for the raw bytes map and GetEmbeddedWebseeds() for parsed access.
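
The lazy-parse idea in the summary above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: the lowercase names `embeddedWebseedsRaw`, `getEmbeddedWebseeds`, and `parseFlatTOML` are hypothetical stand-ins, the `[]string` return type is assumed, the raw bytes are inlined rather than coming from `//go:embed`, and a tiny `key = "value"` line parser replaces a real TOML decoder so the example stays self-contained.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"sort"
	"strings"
)

// embeddedWebseedsRaw is a hypothetical stand-in for EmbeddedWebseedsRaw:
// chain name -> unparsed TOML bytes. The URLs are placeholders.
var embeddedWebseedsRaw = map[string][]byte{
	"mainnet": []byte("provider = \"https://example.invalid/mainnet\"\n"),
	"chiado":  []byte("provider = \"https://example.invalid/chiado\"\n"),
}

// parseCalls counts parser invocations so the laziness is observable.
var parseCalls int

// parseFlatTOML handles only `key = "value"` lines. It is an
// illustration-only substitute for a real TOML decoder.
func parseFlatTOML(raw []byte) (map[string]string, error) {
	parseCalls++
	out := map[string]string{}
	sc := bufio.NewScanner(bytes.NewReader(raw))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, val, ok := strings.Cut(line, "=")
		if !ok {
			return nil, fmt.Errorf("malformed line: %q", line)
		}
		out[strings.TrimSpace(key)] = strings.Trim(strings.TrimSpace(val), `"`)
	}
	return out, sc.Err()
}

// getEmbeddedWebseeds parses the raw bytes of a single chain on demand.
// Returning nil for an unknown chain is an assumption of this sketch.
func getEmbeddedWebseeds(chain string) ([]string, error) {
	raw, ok := embeddedWebseedsRaw[chain]
	if !ok {
		return nil, nil
	}
	kv, err := parseFlatTOML(raw)
	if err != nil {
		return nil, fmt.Errorf("parse webseeds for %s: %w", chain, err)
	}
	urls := make([]string, 0, len(kv))
	for _, u := range kv {
		urls = append(urls, u)
	}
	sort.Strings(urls) // map iteration order is random; sort for stable output
	return urls, nil
}

func main() {
	urls, err := getEmbeddedWebseeds("chiado")
	if err != nil {
		panic(err)
	}
	fmt.Println(urls, "parser invocations:", parseCalls)
}
```

In this sketch nothing is parsed at package init; only the chain passed to `getEmbeddedWebseeds` is decoded. The result is not cached here, so repeated calls for the same chain parse again.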

TODO: cherry-pick to release/3.4 after merge. Done: #19727

wmitsuda added 4 commits March 7, 2026 16:16
…ified

The webseed.* vars (from erigon-snapshot/webseed) are //go:embed and
have never been mutated at runtime — LoadSnapshots() only updates
snapshothashes.* vars. Re-parsing the identical embedded bytes after
a remote load produced the same map as the package-level init.

Replace eagerly-parsed KnownWebseeds map with KnownWebseedsRaw (raw
TOML bytes) and GetKnownWebseeds(chain) that parses on demand. This
avoids parsing all 8 chains' webseed TOMLs at init when only 1 is used.
@wmitsuda wmitsuda requested a review from anacrolix March 7, 2026 21:14
@wmitsuda wmitsuda enabled auto-merge (squash) March 7, 2026 21:14
@wmitsuda wmitsuda merged commit a18eb9b into main Mar 8, 2026
28 of 29 checks passed
@wmitsuda wmitsuda deleted the wmitsuda/do-not-read-all-chains-webseeds-toml branch March 8, 2026 09:06
wmitsuda added a commit that referenced this pull request Mar 8, 2026
…9722)

wmitsuda added a commit that referenced this pull request Mar 9, 2026
…the chain in use (#19727)

Cherry-pick of #19722 (merged to main as a18eb9b) to `release/3.4`.

wmitsuda added a commit that referenced this pull request Mar 11, 2026
## Summary
Part 3 of optimizing remote preverified hash loading (after #19641,
#19722).

- `LoadPreverified` now takes a `chainName` parameter and calls
`LoadRemotePreverified` instead of the old bulk variant that fetched all
10 chains
- Refactored `webseeds.Verify` to load preverified per-chain inside the
iteration loop instead of bulk-loading all chains upfront
- Removed unused functions: old bulk `LoadRemotePreverified`,
`registry.All`, `registry.ResetRaw`, `GetAllCurrentPreverified`
- Renamed `LoadRemotePreverifiedForChain` → `LoadRemotePreverified`
since it's now the only variant
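
The per-chain loading described in the bullets above might look roughly like the sketch below. It is a hedged illustration, not the actual `webseeds.Verify` code: `loadRemotePreverified` here is a hypothetical stub that records which chains were requested instead of performing a remote fetch, and `verifyChains`, `fetchLog`, and the placeholder file name and hash are invented for the example.

```go
package main

import "fmt"

// fetchLog records which chains were requested, standing in for the log
// output used in the test plan to confirm that only one chain is fetched.
var fetchLog []string

// loadRemotePreverified is a hypothetical stub: it takes a chain name and
// returns placeholder preverified entries for that chain only.
func loadRemotePreverified(chainName string) (map[string]string, error) {
	fetchLog = append(fetchLog, chainName)
	return map[string]string{chainName + "-placeholder.seg": "placeholder-hash"}, nil
}

// verifyChains loads preverified data inside the iteration loop, one chain
// at a time, instead of bulk-loading every known chain upfront.
func verifyChains(chains []string) (int, error) {
	total := 0
	for _, chain := range chains {
		pre, err := loadRemotePreverified(chain)
		if err != nil {
			return total, fmt.Errorf("load preverified for %s: %w", chain, err)
		}
		total += len(pre) // a real verifier would check each entry here
	}
	return total, nil
}

func main() {
	n, err := verifyChains([]string{"chiado"})
	if err != nil {
		panic(err)
	}
	fmt.Println("entries:", n, "chains fetched:", fetchLog)
}
```

Because the load happens inside the loop, a run restricted to one chain touches only that chain's data.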

## Test plan
- [x] Built `erigon` and `downloader` binaries
- [x] Tested `erigon seg reset --dry-run` with mainnet and hoodi
ephemeral datadirs
- [x] Tested `downloader verify_webseeds --chain=chiado
--preverified=embedded` to completion
- [x] Verified only the requested chain is fetched (confirmed via log
output)

## Tasks
- [ ] Cherry-pick merge commit to `release/3.4`