clients/erigon: import all per-block files in a single process #1519

Merged
danceratopz merged 4 commits into ethereum:master from yperbasis:erigon-import-all-blocks-one-process on Jun 3, 2026

@yperbasis yperbasis commented May 29, 2026

Member

Problem

The erigon client restarts a full erigon process for every file in /blocks, whereas clients/go-ethereum/geth.sh imports them all in a single invocation:

# clients/erigon/erigon.sh (before)
for file in $(ls /blocks | sort -n); do
    $erigon $FLAGS import /blocks/$file
done

On block-heavy BlockchainTests (walletReorganizeOwners, ForkStressTest), the per-process startup overhead (~0.5s × hundreds of blocks) pushes total client startup past Hive's 180s container-startup timeout, producing intermittent "client did not start: timed out waiting for container startup" failures. For example, legacy-cancun walletReorganizeOwners_{Cancun,Istanbul,London,Paris} fail this way (example run, ~181s each), while the Shanghai/Berlin variants and ForkStressTest_* sit at 146–150s, right at the cliff.
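
The overhead argument above can be sketched as back-of-the-envelope arithmetic. This is only an illustration: the helper name estimate_overhead is hypothetical, the 500 ms per process reflects the ~0.5s figure quoted above, and the block count of 300 is an arbitrary stand-in for "hundreds of blocks", not a measured value.

```shell
#!/bin/sh
# Rough estimate of per-process startup overhead against Hive's 180s timeout.
# Arguments: number of block files, startup cost per process in milliseconds.
# Integer arithmetic only, so the overhead is truncated to whole seconds.
estimate_overhead() {
    files=$1
    per_process_ms=$2
    timeout_s=180
    overhead_s=$(( files * per_process_ms / 1000 ))
    echo "startup overhead: ${overhead_s}s, left for block execution: $(( timeout_s - overhead_s ))s"
}

# Illustrative input: 300 block files at ~0.5s of startup each.
estimate_overhead 300 500
```

With one process per block file, most of the timeout budget goes to startup before any block is executed; with a single process, the startup cost is paid once.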

Fix

Import all per-block files in one erigon process, matching geth.sh:

(cd /blocks && $erigon $FLAGS import $(ls | sort -n))

This collapses hundreds of process startups into one (~150s → ~10s for walletReorganizeOwners).
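
A minimal sketch of the difference between the two invocation styles, runnable without erigon. The function fake_import is a hypothetical stand-in for `$erigon $FLAGS import` that only reports its arguments, and the file names 1.rlp, 2.rlp and 10.rlp are made up for illustration. It also shows why the file list is piped through `sort -n` rather than relying on glob order.

```shell
#!/bin/sh
# Fix the locale so glob and sort ordering are predictable in this demo.
LC_ALL=C; export LC_ALL

# Hypothetical stand-in for `$erigon $FLAGS import`: it only reports what it received.
fake_import() {
    echo "process started with $# file(s): $*"
}

blocks=$(mktemp -d)
touch "$blocks/1.rlp" "$blocks/2.rlp" "$blocks/10.rlp"

# Before: one process per block file.
before=$(for file in $(ls "$blocks" | sort -n); do fake_import "$file"; done)

# After: a single process receives every block file, in numeric order.
after=$(cd "$blocks" && fake_import $(ls | sort -n))

# For comparison: a plain glob expands in collation order, so 10.rlp precedes 2.rlp.
globbed=$(cd "$blocks" && fake_import *)

printf '%s\n' "$before" "$after" "$globbed"
rm -rf "$blocks"
```

The unquoted `$(ls | sort -n)` relies on word splitting, so this pattern assumes block file names without whitespace.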

Dependency

Requires erigontech/erigon#21513 (merged). Older erigon import only processed the first file argument, so import $(ls | sort -n) would import only the first block. That fix had to merge and ship in the erigon image before this could land, which is why this PR was opened as a draft.

The erigon client restarts a full erigon process for every file in
/blocks, whereas clients/go-ethereum/geth.sh imports them all in one
invocation. On block-heavy BlockchainTests (walletReorganizeOwners,
ForkStressTest) the per-process startup overhead (~0.5s per block times
hundreds of blocks) pushes total startup past Hive's container-startup
timeout, causing intermittent "client did not start: timed out waiting
for container startup" failures (e.g. legacy-cancun
walletReorganizeOwners_{Cancun,Istanbul,London,Paris}).

Import all per-block files in one erigon process, matching geth. Requires
erigontech/erigon#21513 (older erigon import only processed the first
file argument).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request Jun 3, 2026
…#21513)

## Problem

The Hive `legacy-cancun` BlockchainTests suite intermittently fails 4
tests with `client did not start: timed out waiting for container
startup`:

- `walletReorganizeOwners_{Cancun,Istanbul,London,Paris}`

(e.g. [this
run](https://hive.ethpandaops.io/#/test/generic/1779833679-c7f17208e0b5dbf959b3448bbdaed60e)).
They took ~181s, just over Hive's 180s container-startup timeout. The
same test on other forks (Shanghai/Berlin) and `ForkStressTest_*` sit at
146–150s — right at the cliff.

## Root cause

The `import` command documents multi-file support:

```
USAGE: erigon import [command options] <filename> (<filename 2> ... <filename N>)
```

but `importChain` only processed `cliCtx.Args().First()`, silently
ignoring the rest. That forced Hive's erigon entrypoint into a
one-process-per-block-file loop. For `walletReorganizeOwners` (235 block
files) that is 235 full erigon startups — measured at ~0.5s/process ×
261 = 145s of pure startup, while actual block execution is ~40ms each.
go-ethereum has no such problem: its entrypoint imports every block in
one `geth import` invocation.

## Fix

Iterate every file argument in a single process, tolerating per-file
failures when several files are given (matching go-ethereum). This lets
the hive entrypoint import all blocks in one invocation, collapsing
hundreds of process startups into one (~150s → ~10s).

It also force-disables the embedded MCP server for the one-shot import
(`--mcp.disable`, alongside the existing NAT / downloader /
external-consensus disables) — a batch import has no use for it.
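
The iteration semantics described in this Fix section can be sketched in shell. This is not erigon's Go implementation: import_one is a hypothetical per-file importer that fails for any name containing "bad", and the exit status of 0 after a multi-file run with failures is an assumption modelled on the go-ethereum behaviour the text refers to.

```shell
#!/bin/sh
# Hypothetical per-file importer: fails for any file whose name contains "bad".
import_one() {
    case "$1" in
        *bad*) echo "import of $1 failed" >&2; return 1 ;;
    esac
    echo "imported $1"
}

# Sketch of the described semantics:
# - a single file surfaces its error to the caller;
# - with several files, a failure is logged and the remaining files are still attempted.
import_all() {
    if [ "$#" -eq 1 ]; then
        import_one "$1"
        return $?
    fi
    for f in "$@"; do
        import_one "$f" || echo "continuing after failure in $f" >&2
    done
    return 0  # assumption: a multi-file run does not fail as a whole
}

import_all a.rlp bad.rlp c.rlp
import_all bad.rlp || echo "single-file import failed with status $?"
```

The three cases exercised here mirror the unit tests listed under Testing: all-files iteration, per-file failure tolerance, and single-file error surfacing.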

## Companion change (required)

The hive entrypoint must pass all block files in one invocation:
ethereum/hive#1519. Order matters — the hive change depends on this one
(on an old erigon, `import /blocks/*` would import only the first
block).

## Testing

- New unit tests (`import_cmd_test.go`): all-files iteration, per-file
failure tolerance, single-file error surfacing.
- Real-binary check: `erigon import a.rlp b.rlp` now attempts **both**
files in one process (previously stopped after the first).
- `make lint` clean; `make erigon integration` builds.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Giulio Rebuffo <111551070+Giulio2002@users.noreply.github.com>
yperbasis and others added 2 commits June 3, 2026 09:35
Condense the comment to one general sentence (no per-test names) and remove the duplicate "Loading remaining individual blocks..." echo, matching geth.sh.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@yperbasis yperbasis marked this pull request as ready for review June 3, 2026 07:40
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@danceratopz danceratopz left a comment

Member

LGTM!

@danceratopz danceratopz merged commit a41c80a into ethereum:master Jun 3, 2026
6 checks passed

2 participants