
fix: PoS dev mode — enable Fusaka (Fulu CL + Osaka EL) block production #21646

Merged
yperbasis merged 3 commits into main from fix/pos-dev-mode-electra
Jun 10, 2026
@mh0lt mh0lt commented Jun 5, 2026

Contributor

Summary

--chain dev couldn't produce blocks past genesis. Two bugs, plus a builder cleanup found while addressing review:

  • Beacon API routes not registered — dev mode set BeaconAPIRouter.Active=true but never enabled endpoint groups (Beacon, Node, Validator, Config, Events). The embedded dev validator polls /eth/v1/beacon/genesis which returned 404, deadlocking block production.

  • CL/EL fork mismatch: Prague/Osaka not activated, system contracts missing — the CL ran Fulu (epoch 0) but the EL had PragueTime=nil and OsakaTime=nil. Pre-Prague EL headers carry no requestsHash, while the CL reconstructs the header with SHA256("") = e3b0c442… over the (empty) execution-requests list, so every produced block was rejected with "mismatching hash" (cl/cltypes/eth1_block.go). Activating Prague in turn requires the system-contract predeploys in the dev genesis alloc.

  • Dead requestsHash override in the builder removed: execution/builder/exec.go re-derived header.RequestsHash after AssembleBlock, but on a discarded header copy: the final block is rebuilt from current.Header in finishBlock, which merge.FinalizeAndAssemble already stamps with outRequests.Hash() (empty.RequestsHash for empty request sets). An earlier revision of this PR changed the dead write from common.Hash{} to empty.RequestsHash; it is now deleted outright, leaving FinalizeAndAssemble as the single source of truth. Verified dead by running the new regression test against the old zero-hash code (passes) and against zero-hash corruption injected into the live merge.go path (fails with "invalid block hash").

The dev genesis is now a faithful Cancun+Prague genesis and lives in the genesis layer:

  • All five system contracts are predeployed via allocs/dev.json (copied verbatim from allocs/hoodi.json): EIP-6110 deposit contract (with DepositContract set in the chain config — fixes the ParseDepositLogs WARN on every block), EIP-4788 beacon roots (otherwise a silent no-op with Cancun at genesis), EIP-2935 history storage, EIP-7002 withdrawal requests, EIP-7251 consolidation requests.
  • Fork times (Shanghai/Cancun/Prague/Osaka at 0), TTD=0 and the deposit contract address moved from SetEthConfig into DeveloperGenesisBlock(); the remaining Caplin/beacon-genesis setup is extracted into setDevnetEthConfig, leaving the SetEthConfig switch case at one line.
  • TestEngineApiBuiltBlockEmptyRequestsHash pins the requests hash of an empty-request Prague block end to end (FCU → getPayload → newPayload → canonical header via RPC).
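As a rough illustration of what "all five system contracts are predeployed" means in practice, the sketch below scans a genesis alloc for the four predeploys whose addresses are fixed by their EIPs. The alloc shape (address keys mapping to objects with a `code` field), the helper `missingPredeploys`, and the sample data are assumptions for illustration, not the actual layout of allocs/dev.json. The deposit contract is left out because its address is chain-specific.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// Predeploy addresses as specified in the respective EIPs.
var systemContracts = map[string]string{
	"EIP-4788 beacon roots":           "0x000F3df6D732807Ef1319fB7B8bB8522d0Beac02",
	"EIP-2935 history storage":        "0x0000F90827F1C53a10cb7A02335B175320002935",
	"EIP-7002 withdrawal requests":    "0x00000961Ef480Eb55e80D19ad83579A64c007002",
	"EIP-7251 consolidation requests": "0x0000BBdDc7CE488642fb579F8B00f3a590007251",
}

// allocAccount models only the field this check needs (assumed JSON shape).
type allocAccount struct {
	Code string `json:"code"`
}

// normalize lowercases an address and drops any 0x prefix so that keys
// compare equal regardless of checksum casing.
func normalize(addr string) string {
	return strings.TrimPrefix(strings.ToLower(addr), "0x")
}

// missingPredeploys returns the names of system contracts that have no
// code in the given alloc, sorted for stable output.
func missingPredeploys(allocJSON []byte) ([]string, error) {
	var alloc map[string]allocAccount
	if err := json.Unmarshal(allocJSON, &alloc); err != nil {
		return nil, err
	}
	have := make(map[string]bool, len(alloc))
	for addr, acct := range alloc {
		if acct.Code != "" && acct.Code != "0x" {
			have[normalize(addr)] = true
		}
	}
	var missing []string
	for name, addr := range systemContracts {
		if !have[normalize(addr)] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Hypothetical alloc containing only the EIP-7251 contract (code truncated).
	alloc := []byte(`{"0x0000BBdDc7CE488642fb579F8B00f3a590007251": {"code": "0x33", "balance": "0x0"}}`)
	missing, err := missingPredeploys(alloc)
	if err != nil {
		panic(err)
	}
	fmt.Println(missing)
}
```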

Dev mode now runs full Fusaka: Fulu on CL, Osaka on EL.

Note: the dev genesis hash changes (new predeploys in the alloc) — wipe existing --chain dev datadirs.

Test plan

  • erigon --chain dev --dev.slot-time 2 produces Fulu/Osaka blocks at 2s intervals
  • Transactions confirmed on-chain (ETH transfers, contract deployment)
  • No ParseDepositLogs WARNs; EIP-4788 ring buffer populated (eth_getStorageAt of slot timestamp % 8191 returns the block timestamp); block requestsHash is e3b0c442…
  • TestEngineApiBuiltBlockEmptyRequestsHash fails on injected live-path corruption, passes otherwise
  • CI passes
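The ring-buffer check in the test plan can be sketched as follows. Per EIP-4788, the contract stores the timestamp at slot `timestamp % 8191` and the matching beacon root 8191 slots later. The helper names, the request struct, and the sample timestamp are hypothetical; the code only builds the JSON-RPC body and does not contact a node.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	// HISTORY_BUFFER_LENGTH from EIP-4788.
	historyBufferLength = 8191
	// Beacon roots contract address from EIP-4788.
	beaconRootsAddress = "0x000F3df6D732807Ef1319fB7B8bB8522d0Beac02"
)

// ringBufferSlots returns the storage slot holding the timestamp and the
// slot holding the corresponding parent beacon block root.
func ringBufferSlots(timestamp uint64) (tsSlot, rootSlot uint64) {
	tsSlot = timestamp % historyBufferLength
	return tsSlot, tsSlot + historyBufferLength
}

// rpcRequest is a minimal JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string   `json:"jsonrpc"`
	ID      int      `json:"id"`
	Method  string   `json:"method"`
	Params  []string `json:"params"`
}

// storageAtRequest builds an eth_getStorageAt call for the given slot of
// the beacon roots contract at the latest block.
func storageAtRequest(slot uint64) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "eth_getStorageAt",
		Params:  []string{beaconRootsAddress, fmt.Sprintf("0x%x", slot), "latest"},
	})
}

func main() {
	blockTimestamp := uint64(1_700_000_000) // hypothetical block timestamp
	tsSlot, rootSlot := ringBufferSlots(blockTimestamp)
	body, err := storageAtRequest(tsSlot)
	if err != nil {
		panic(err)
	}
	fmt.Println(tsSlot, rootSlot)
	fmt.Println(string(body))
}
```

If the predeploy is present, the value returned for the timestamp slot should equal the block timestamp; against an empty account it reads as zero.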

@mh0lt mh0lt force-pushed the fix/pos-dev-mode-electra branch from b6a17b3 to 56143a5 June 5, 2026 19:13
Three issues prevented `--chain dev` from producing blocks past genesis:

1. Beacon API routes not registered: dev mode set
   BeaconAPIRouter.Active=true but never enabled the endpoint groups
   (Beacon, Node, Validator, Config, Events). The dev validator polls
   /eth/v1/beacon/genesis and /eth/v1/validator/duties/* which returned
   404, deadlocking block production.

2. Empty RequestsHash mismatch: the block builder used common.Hash{}
   (zero) when no EIP-7685 requests exist, but the CL computes
   SHA256("") = e3b0c442... over an empty list. Fix: use the existing
   empty.RequestsHash constant.

3. Prague/Osaka not activated + system contracts missing: the CL ran at
   Fulu (epoch 0) but the EL had PragueTime=nil and OsakaTime=nil.
   Enabling Prague requires the EIP-7002/7251/2935 system contracts in
   the dev genesis alloc (copied from Hoodi testnet).

Dev mode now runs the full Fusaka fork: Fulu on CL, Osaka on EL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mh0lt mh0lt force-pushed the fix/pos-dev-mode-electra branch from 56143a5 to abb171d June 5, 2026 19:23
@mh0lt mh0lt changed the title from "fix: PoS dev mode — enable Electra/Prague block production" to "fix: PoS dev mode — enable Fusaka (Fulu CL + Osaka EL) block production" Jun 5, 2026
Comment thread cmd/utils/flags.go Outdated
cfg.Genesis.Config.ShanghaiTime = &zero
cfg.Genesis.Config.CancunTime = &zero
- cfg.Genesis.Config.PragueTime = nil // Prague may need more config; leave disabled
+ cfg.Genesis.Config.PragueTime = &zero

Member


might be nice to extract all of this into a SetDevnetEthConfig function or something along those lines since this switch's devnet case has grown quite a lot

@yperbasis yperbasis added this to the 3.5.0 milestone Jun 8, 2026
@yperbasis yperbasis requested a review from Copilot June 8, 2026 07:44

Copilot AI left a comment

Contributor


Pull request overview

This PR fixes PoS --chain dev block production by aligning Erigon’s dev-mode EL/CL configuration with Fusaka-era requirements, so the embedded dev validator can progress past genesis and produced blocks are accepted by the CL.

Changes:

  • Fixes Prague requestsHash computation for empty EIP-7685 request sets by using the canonical empty hash.
  • Enables Prague + Osaka at genesis in dev mode and injects required Prague system-contract predeploys into the dev genesis alloc.
  • Enables Beacon API endpoint groups in dev mode so the embedded dev validator can successfully poll required routes (e.g., /eth/v1/beacon/genesis).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
execution/builder/exec.go Uses the canonical empty requests hash for Prague blocks to match CL expectations.
cmd/utils/flags.go Enables Prague/Osaka + Beacon API endpoint groups in dev mode; adds Prague system-contract predeploys to dev genesis alloc.


Comment thread cmd/utils/flags.go Outdated
Comment on lines +2077 to +2081
zero := uint64(0)
cfg.Genesis.Config.ShanghaiTime = &zero
cfg.Genesis.Config.CancunTime = &zero
- cfg.Genesis.Config.PragueTime = nil // Prague may need more config; leave disabled
+ cfg.Genesis.Config.PragueTime = &zero
cfg.Genesis.Config.OsakaTime = &zero
Comment thread cmd/utils/flags.go Outdated
Comment on lines +2097 to +2102
cfg.Genesis.Alloc[common.HexToAddress("0x0000BBdDc7CE488642fb579F8B00f3a590007251")] = types.GenesisAccount{
Balance: new(big.Int),
Nonce: 1,
Code: common.FromHex("0x3373fffffffffffffffffffffffffffffffffffffffe1460d35760115f54807fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff1461019a57600182026001905f5b5f82111560685781019083028483029004916001019190604d565b9093900492505050366060146088573661019a573461019a575f5260205ff35b341061019a57600154600101600155600354806004026004013381556001015f358155600101602035815560010160403590553360601b5f5260605f60143760745fa0600101600355005b6003546002548082038060021160e7575060025b5f5b8181146101295782810160040260040181607402815460601b815260140181600101548152602001816002015481526020019060030154905260010160e9565b910180921461013b5790600255610146565b90505f6002555f6003555b5f54807fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff141561017357505f5b6001546001828201116101885750505f61018e565b01600190035b5f555f6001556074025ff35b5f5ffd"),
Storage: map[common.Hash]common.Hash{common.Hash{}: common.HexToHash("0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff")},
}
yperbasis

This comment was marked as outdated.

@yperbasis yperbasis left a comment

Member


Major

  1. Dev genesis predeploy set is incomplete. The PR adds the predeploys whose absence hard-fails finalization (7002/7251/2935) but omits the two whose absence fails silently:

    • Deposit contract (EIP-6110): DepositContract stays 0x0, so merge.Finalize → misc.ParseDepositLogs logs a WARN on every block. Cosmetic, but noisy.
    • EIP-4788 beacon roots: not predeployed, so with Cancun at genesis the beacon-roots syscall hits an empty account and silently no-ops — EIP-4788 reads return zero.

    Both live in allocs/hoodi.json. Adding the 4788 predeploy + the deposit contract (and setting config.DepositContract) makes the dev genesis a faithful Cancun+Prague genesis and resolves both Copilot comments at once.

  2. Genesis/alloc content is in the wrong layer. The system-contract hex is hardcoded inline in the SetEthConfig switch (CLI flag → config translation), duplicating bytecode that already lives in allocs/hoodi.json. Better to push it into DeveloperGenesisBlock() / allocs/dev.json, or extract a SetDevnetEthConfig helper (as @taratorio noted) — keeps flags.go focused and avoids a silent-staleness copy.

Minor

  1. Root cause of the requestsHash bug is duplicated consensus logic. execution/builder/exec.go reimplements the requests-hash computation that merge.go Finalize already does correctly (outRequests.Hash()). The one-line fix is right, but the duplication is the underlying smell and can drift again on the next EIP-7685 change — worth a tracking issue to converge the two paths.

  2. No regression test. Per the repo's TDD guidance, the consensus-relevant exec.go fix warrants a focused test (the execmoduletester / engineapitester harnesses already exist) asserting the builder emits empty.RequestsHash for an empty-request Prague block — or a note in the PR on why TDD was skipped.

…hainspec

Addresses review feedback on #21646:

- Add the EIP-6110 deposit contract (and set DepositContract in the chain
  config) plus the EIP-4788 beacon-roots predeploy to the dev genesis,
  alongside the EIP-2935/7002/7251 contracts — all copied verbatim from
  allocs/hoodi.json. Fixes the ParseDepositLogs WARN on every block and
  the silently no-op 4788 syscall.
- Move all dev genesis content (fork times, TTD, deposit contract,
  system-contract allocs) into DeveloperGenesisBlock()/allocs/dev.json,
  and extract the remaining Caplin/beacon-genesis setup from the
  SetEthConfig switch into setDevnetEthConfig.
- Delete the builder's post-AssembleBlock RequestsHash override: it wrote
  to a discarded header copy. The final block is rebuilt from
  current.Header in finishBlock, which merge.FinalizeAndAssemble already
  stamps with outRequests.Hash() (empty.RequestsHash for empty sets).
  Verified dead by running the new regression test against the pre-fix
  zero-hash code (passes) and against zero-hash corruption injected into
  the live merge.go path (fails with 'invalid block hash').
- Add TestEngineApiBuiltBlockEmptyRequestsHash pinning the requests hash
  of an empty-request Prague block end to end.
@yperbasis yperbasis enabled auto-merge June 10, 2026 11:03
Dev mode now self-enables the Beacon API endpoint groups the embedded
dev validator needs.
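The failure mode behind this change can be reproduced in miniature. The sketch below uses a stand-in HTTP server (`newStubBeacon`, hypothetical, not Caplin's router) that registers /eth/v1/beacon/genesis only when the Beacon endpoint group is enabled, and a probe that reports the status code a polling validator would see.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// genesisRouteStatus issues a GET for /eth/v1/beacon/genesis against
// baseURL and returns the HTTP status code.
func genesisRouteStatus(baseURL string) (int, error) {
	resp, err := http.Get(baseURL + "/eth/v1/beacon/genesis")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// newStubBeacon starts a stand-in server. The genesis route is registered
// only when beaconGroup is true, mimicking a router that is active but has
// no endpoint groups enabled.
func newStubBeacon(beaconGroup bool) *httptest.Server {
	mux := http.NewServeMux()
	if beaconGroup {
		mux.HandleFunc("/eth/v1/beacon/genesis", func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Content-Type", "application/json")
			fmt.Fprint(w, `{"data":{}}`) // placeholder body, not a real genesis response
		})
	}
	return httptest.NewServer(mux)
}

func main() {
	for _, enabled := range []bool{false, true} {
		srv := newStubBeacon(enabled)
		code, err := genesisRouteStatus(srv.URL)
		srv.Close()
		fmt.Println("beacon group enabled:", enabled, "status:", code, "err:", err)
	}
}
```

With the group disabled the probe sees 404, which is the response the embedded dev validator kept receiving before the fix.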
@yperbasis yperbasis added this pull request to the merge queue Jun 10, 2026
Merged via the queue into main with commit d47ea5b Jun 10, 2026
91 checks passed
@yperbasis yperbasis deleted the fix/pos-dev-mode-electra branch June 10, 2026 12:45
pull Bot pushed a commit to Dustin4444/erigon that referenced this pull request Jun 10, 2026
…erigontech#21723)

## Problem

`kurtosis / assertoor_regular_serial_test` failed on erigontech#21646 ([job
link](https://github.com/erigontech/erigon/actions/runs/27271494252/job/80543276225)):
the runner lost Docker Hub connectivity for the entire job (even the
Docker login step timed out against `registry-1.docker.io`), and
`kurtosis engine start` exhausted all 3 retries failing to pull
`badouralix/curl-jq:latest` — the logs-aggregator healthcheck container.
A sibling job on a different runner passed at the same time, so this was
per-runner registry egress, i.e. exactly the class of flake the
cached-image setup exists to absorb.

## Root cause

The `docker-cl-*` cache covers the CL images plus kurtosis
engine/core/expander/vector/fluent-bit, but `kurtosis engine start`
launches **three more** helper containers, none of them cached:

- `badouralix/curl-jq:latest` — logs-aggregator healthcheck
(`logs_aggregator_functions/shared_helpers.go` in
kurtosis-tech/kurtosis)
- `traefik:2.10.6` — reverse proxy
(`reverse_proxy_functions/implementations/traefik/consts.go`)
- `alpine:3.17` — volume-init helper
(`engine_functions`/`logs_collector_functions`)

A passing run's log confirms these are the only three images pulled at
engine start (every cached image shows no pull line), so the
engine-start step — whose stated purpose is to keep registry blips out
of the test step — still had a hard Docker Hub dependency.

## Fix

Add the three images to the existing pull → `docker save` → cache →
`docker load` pipeline and cache keys in both
`test-kurtosis-assertoor.yml` and `test-kurtosis-gloas.yml`, following
the established vector/fluent-bit pattern. Kurtosis uses image download
mode "missing", so pre-loaded images are used without contacting the
registry (verified in the passing run: vector/fluent-bit are started
without pull lines). After this, `kurtosis engine start` is fully
cache-served.

## Rollout

- The cache key changes, so this PR's own kurtosis runs cold-pull once
and exercise the new pull/save path.
- On merge, the push touching these files triggers
`cache-warming-kurtosis-cl-images.yml` and
`cache-warming-kurtosis-gloas-images.yml` (paths filters) which re-warm
the main-scope cache under the new key. No manual action needed.

## Residual exposure (out of scope)

Enclave-time images with intentionally mutable tags
(`ethpandaops/ethereum-genesis-generator`, `rpc-snooper:latest`,
`spamoor:master`, suite-specific CL devnet tags, …) are still pulled
from Docker Hub during `kurtosis run`. Pinning/caching those would
change test semantics (they deliberately track moving tags), so they
stay as-is.

## Validation

- `actionlint`: no new findings (the two pre-existing SC2086 infos in
the apt-get line are unchanged)
- YAML parse clean; resolved cache key ≈ 209 chars (limit 512)
- `make lint`: 0 issues
yperbasis added a commit that referenced this pull request Jun 12, 2026
…roduction (#21728)

Cherry-pick of #21646 to release/3.5.

Co-authored-by: Mark Holt <135143369+mh0lt@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

4 participants