cl: replace Clique dev mode with embedded PoS consensus #20451

Merged
AskAlexSharov merged 17 commits into main from feat/caplin-dev-mode-pr on Apr 10, 2026

mh0lt commented Apr 9, 2026

Contributor

Closes #14753

Summary

The --chain=dev mode for Erigon now creates a dev beacon chain rather than
relying on Clique consensus. In dev mode, Erigon runs the same PoS process as
mainnet, including the same forking and block-production code.

A single command starts a fully operational PoS node:

./build/bin/erigon --chain=dev --beacon.api=beacon,validator,node,config

What's included

  • Programmatic beacon genesis (cl/clparams/devgenesis/): builds a valid
    Deneb genesis state with deterministic BLS validators, sync committees,
    participation lists, and execution payload header — no external tooling needed.

  • Embedded dev validator (cl/validator/devvalidator/): polls the Beacon API
    for proposer duties, fetches block templates, signs with proper BLS domain
    separation (RANDAO, block, attestation), and submits via the standard
    /eth/v2/beacon/blocks endpoint.

  • Same code paths as mainnet: blocks flow through Caplin's fork choice,
    NewPayload, ForkChoiceUpdated, and the staged sync pipeline. No mock
    consensus or special-cased execution.
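The "deterministic BLS validators" idea can be illustrated with a small sketch. The PR does not describe the actual derivation scheme, so the hashing below (SHA-256 over the seed plus a big-endian index) and the name `deriveIKM` are assumptions made for illustration only. The point is that the same seed and index always yield the same input key material, which a BLS key generator could then consume.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// deriveIKM returns 32 bytes of input key material for one validator.
// Hypothetical scheme, not taken from the PR: SHA-256(seed || uint64(index)).
func deriveIKM(seed string, index uint64) [32]byte {
	buf := make([]byte, 0, len(seed)+8)
	buf = append(buf, seed...)
	buf = binary.BigEndian.AppendUint64(buf, index)
	return sha256.Sum256(buf)
}

func main() {
	first := deriveIKM("devnet", 0)
	again := deriveIKM("devnet", 0)
	other := deriveIKM("devnet", 1)

	// Same seed and index reproduce the same material; a new index changes it.
	fmt.Println("reproducible:", first == again)
	fmt.Println("distinct per index:", first != other)
}
```

Because nothing here depends on randomness or wall-clock time, two nodes started with the same seed and validator count would agree on every key without exchanging files.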

New flags

| Flag | Default | Description |
| --- | --- | --- |
| --dev-validator-seed | devnet | BLS key derivation seed |
| --dev-validator-count | 64 | Number of genesis validators |
| --dev.slot-time | 6 | Seconds per slot (minimum: 2) |

--beacon.api=beacon,validator,node,config is required to enable the Beacon API
endpoints used by the embedded validator.

Removed

  • --mine and --dev.period are no longer used for --chain=dev
  • Clique genesis configuration removed from dev mode

Dependencies

Built on top of #20190 (caplin minimal preset support), which is now merged.

Test plan

  • Blocks produced every slot (tested at 6s and 2s slot times)
  • Epoch boundaries crossed cleanly (slots 1-16+, minimal preset 8 slots/epoch)
  • EL validates and commits each block (head advances)
  • No panics or errors over sustained runs
  • make lint clean
  • make erigon integration builds
  • CI passes
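The slot and epoch numbers in the test plan follow from simple integer arithmetic. The sketch below is not code from the PR; the helper names are hypothetical, and the only values taken from the description are the 8 slots per epoch of the minimal preset and the 6-second default slot time.

```go
package main

import "fmt"

// slotAt returns the slot that contains unix time now, or 0 before genesis.
func slotAt(genesisTime, now, secondsPerSlot uint64) uint64 {
	if secondsPerSlot == 0 || now < genesisTime {
		return 0
	}
	return (now - genesisTime) / secondsPerSlot
}

// epochOf returns the epoch that contains slot (slotsPerEpoch must be > 0).
func epochOf(slot, slotsPerEpoch uint64) uint64 {
	return slot / slotsPerEpoch
}

// isEpochStart reports whether slot is the first slot of its epoch.
func isEpochStart(slot, slotsPerEpoch uint64) bool {
	return slot%slotsPerEpoch == 0
}

func main() {
	const slotsPerEpoch = 8  // minimal preset, as stated in the test plan
	const secondsPerSlot = 6 // default --dev.slot-time

	fmt.Println("slot 12s after genesis:", slotAt(1000, 1012, secondsPerSlot))
	for _, slot := range []uint64{1, 7, 8, 16} {
		fmt.Printf("slot=%d epoch=%d epochStart=%v\n",
			slot, epochOf(slot, slotsPerEpoch), isEpochStart(slot, slotsPerEpoch))
	}
}
```

With 8 slots per epoch, a run that reaches slot 16 has crossed two epoch boundaries (slots 8 and 16), which is the range the test plan reports.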

mh0lt added 13 commits April 9, 2026 19:50
Dev validator service for single-node PoS dev mode:

- keys.go: ValidatorKey type, LoadKeys from seed, pubkey index map
- client.go: minimal Beacon API HTTP client (get/post/postSSZ)
- service.go: orchestrator with slot loop, duty polling, proposal
  and attestation stubs

The service uses standard Beacon API endpoints (same as Lighthouse) —
no internal shortcuts. Signing and block submission have TODOs for
proper domain separation and SSZ encoding.
- signing.go: signObject, signRandaoReveal, signBlock, signAttestation
  using fork.ComputeSigningRoot + fork.ComputeDomain (spec-compliant)
- service.go: fetch genesisValidatorsRoot on startup, use proper
  RANDAO signing, remove binary import
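For readers unfamiliar with domain separation, here is a standalone sketch of what the consensus spec's compute_domain and compute_signing_root calculate. It does not use Erigon's fork package; the function names are local to the example, and the SSZ hashing is written out by hand for the two small containers involved (ForkData and SigningData, each of which hashes to SHA-256 over two 32-byte chunks).

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Domain types as defined in the consensus spec.
var (
	domainBeaconProposer = [4]byte{0x00, 0x00, 0x00, 0x00}
	domainBeaconAttester = [4]byte{0x01, 0x00, 0x00, 0x00}
	domainRandao         = [4]byte{0x02, 0x00, 0x00, 0x00}
)

// computeDomain mirrors compute_domain: the 4-byte domain type followed by
// the first 28 bytes of hash_tree_root(ForkData).
func computeDomain(domainType, forkVersion [4]byte, genesisValidatorsRoot [32]byte) [32]byte {
	var chunks [64]byte
	copy(chunks[:4], forkVersion[:]) // Bytes4, zero-padded to a 32-byte chunk
	copy(chunks[32:], genesisValidatorsRoot[:])
	forkDataRoot := sha256.Sum256(chunks[:])

	var domain [32]byte
	copy(domain[:4], domainType[:])
	copy(domain[4:], forkDataRoot[:28])
	return domain
}

// computeSigningRoot mirrors compute_signing_root:
// hash_tree_root(SigningData{object_root, domain}).
func computeSigningRoot(objectRoot, domain [32]byte) [32]byte {
	var chunks [64]byte
	copy(chunks[:32], objectRoot[:])
	copy(chunks[32:], domain[:])
	return sha256.Sum256(chunks[:])
}

func main() {
	denebVersion := [4]byte{0x04, 0x00, 0x00, 0x00}
	var genesisValidatorsRoot, objectRoot [32]byte
	genesisValidatorsRoot[0], objectRoot[0] = 0xaa, 0xbb // placeholder values

	domains := []struct {
		name string
		typ  [4]byte
	}{
		{"proposer", domainBeaconProposer},
		{"attester", domainBeaconAttester},
		{"randao", domainRandao},
	}
	for _, d := range domains {
		domain := computeDomain(d.typ, denebVersion, genesisValidatorsRoot)
		root := computeSigningRoot(objectRoot, domain)
		fmt.Printf("%-8s domain=%x signingRoot=%x\n", d.name, domain[:8], root[:8])
	}
}
```

The same object root produces a different signing root under each domain, so a signature made for one purpose (for example a RANDAO reveal) cannot be replayed as another (for example a block proposal).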
Phase 1 completion:

Block proposal:
- Fetch unsigned block template from /eth/v3/validator/blocks/{slot}
- Parse as SignedBeaconBlock, sign with DOMAIN_BEACON_PROPOSER
- Submit via POST /eth/v2/beacon/blocks with version header

Attestation:
- Fetch attester duties from /eth/v1/validator/duties/attester/{epoch}
- Get attestation data per (slot, committee_index)
- Sign with DOMAIN_BEACON_ATTESTER
- Build aggregation bits from committee position
- Submit via POST /eth/v1/beacon/pool/attestations

Also: postJSON method on BeaconClient with Eth-Consensus-Version header.
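"Build aggregation bits from committee position" refers to an SSZ bitlist with one bit per committee member. A minimal sketch of that encoding follows; the function name is hypothetical and this is not the PR's code. SSZ bitlists store bits little-endian within each byte and append one extra delimiter bit at index `committeeSize` to mark the length.

```go
package main

import "fmt"

// aggregationBits returns the SSZ bitlist encoding for a committee of
// committeeSize members in which only the member at position has attested.
func aggregationBits(committeeSize, position int) ([]byte, error) {
	if committeeSize <= 0 || position < 0 || position >= committeeSize {
		return nil, fmt.Errorf("position %d out of range for committee of %d", position, committeeSize)
	}
	bits := make([]byte, committeeSize/8+1)
	bits[position/8] |= 1 << (position % 8)           // the attester's own bit
	bits[committeeSize/8] |= 1 << (committeeSize % 8) // length delimiter bit
	return bits, nil
}

func main() {
	for _, c := range []struct{ size, pos int }{{5, 0}, {8, 3}, {8, 8}} {
		bits, err := aggregationBits(c.size, c.pos)
		fmt.Printf("size=%d pos=%d bits=%08b err=%v\n", c.size, c.pos, bits, err)
	}
}
```

A committee of 8 needs two bytes because the delimiter bit falls at index 8, the first bit of the second byte.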
When --chain=dev is specified:
- EL genesis: PoS (TTD=0, no Clique), signer address pre-funded
- CL genesis: minimal preset, deterministic BLS validators from seed
- InternalCL=true: Caplin starts with custom config/genesis paths
- DevValidatorService starts automatically, connecting to Beacon API

New flags:
- --dev-validator-seed (default: "devnet")
- --dev-validator-count (default: 64)

Changes:
- cmd/utils/flags.go: PoS dev genesis, beacon state generation, temp
  file writing for config/genesis SSZ
- node/eth/backend.go: start DevValidatorService after Caplin
- cl/clparams/config.go: export ApplyMinimalPreset, add DevValidator
  fields to CaplinConfig, update IsDevnet()
… hash

- Register DevValidatorSeedFlag + DevValidatorCountFlag in default_flags
- Skip ChainSpecByName for dev/bor-devnet chains (not in registry)
- Compute EL genesis hash via GenesisToBlock for beacon Eth1Data
- Export ApplyMinimalPreset for external callers

Fork choice still fails: beacon genesis state needs proper
initialization to match Caplin's expected genesis block root.
…ping

- writeGenesisBeaconBlock: copy LatestExecutionPayloadHeader fields
  into the genesis block body's ExecutionPayload (BlockHash, StateRoot, etc.)
- NewForkChoiceStore: seed eth2Root→eth1Hash mapping for the anchor
  block from LatestExecutionPayloadHeader.BlockHash
- devgenesis: set LatestExecutionPayloadHeader with EL genesis hash

Fork choice still sees zero hash — the genesis state SSZ encoding
is incomplete (participation lists, slashings arrays not initialized).
Needs proper full-state initialization following the consensus spec's
initialize_beacon_state_from_eth1.
…ssing

Major progress — dev validator is running, connected, proposing:
- Set beacon state version to Deneb (all forks at epoch 0)
- Write fork epochs to config YAML so Caplin loads correct version
- Enable Beacon API automatically for dev mode
- Fix HTTP protocol (was "tcp", now "http")
- Load beacon config from custom config path (was nil)
- Add writeDevGenesisBeaconBlock to write genesis block before fork choice

Dev validator now detects proposer duties and requests block templates.
Remaining: EL "failed to produce execution payload" — Engine API block
building needs investigation (ForkChoiceUpdate with PayloadAttributes).
…k epochs

Major milestone — EL produces blocks for the dev validator:
- Set eth1DepositIndex = validatorCount (all deposits pre-processed)
- Initialize sync committees with validator pubkeys (Altair+)
- Set state version from config (was defaulting to Phase0)
- Add --dev.slot-time flag for configurable slot duration
- Set Shanghai/Cancun times to 0 on EL genesis config
- Add retry loop (30×200ms) in AssembleBlock for semaphore contention
- Skip slot 0 proposals (genesis)

Block produced log: "Block produced proposerIndex=36 slot=1 version=deneb"

Remaining: block signing crashes on nil ExtraData in the returned
template — needs nil-safe parsing of the block JSON response.
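The "retry loop (30×200ms)" pattern can be sketched generically. This is an illustration, not the AssembleBlock code: `errBusy` and `retryWhileBusy` are hypothetical names, and the assumption is that only a specific transient "busy" error should be retried while any other error is returned immediately.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errBusy stands in for a transient contention error (hypothetical).
var errBusy = errors.New("busy")

// retryWhileBusy calls fn up to attempts times, sleeping delay between tries.
// Only errors matching errBusy are retried; anything else returns at once.
func retryWhileBusy[T any](attempts int, delay time.Duration, fn func() (T, error)) (T, error) {
	var zero T
	if attempts <= 0 {
		return zero, errors.New("attempts must be positive")
	}
	var err error
	for i := 0; i < attempts; i++ {
		var v T
		if v, err = fn(); err == nil {
			return v, nil
		}
		if !errors.Is(err, errBusy) {
			return zero, err
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return zero, fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	payload, err := retryWhileBusy(30, 200*time.Millisecond, func() (string, error) {
		calls++
		if calls < 3 {
			return "", errBusy // simulate two busy responses before success
		}
		return "payload", nil
	})
	fmt.Println(payload, err, calls)
}
```

With 30 attempts and a 200 ms delay, the worst-case wait is a little under six seconds, which matches the default slot time mentioned in this PR.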
… wrapper

Blocks are now produced and submitted at every slot:
  [dev-validator] proposed block  slot=1 validator=36
  [dev-validator] proposed block  slot=2 validator=32
  [dev-validator] proposed block  slot=3 validator=0

Fixes:
- Initialize nil ExtraData/Transactions/Withdrawals on parsed block
- Wrap SignedBeaconBlock in DenebSignedBeaconBlock for POST submission
- Submit via JSON with Eth-Consensus-Version header

Remaining: storeBlockAndBlobs fails because fork choice finalized/safe
checkpoints reference zero hash. The eth2Root→eth1Hash seeding in
NewForkChoiceStore needs to propagate to the finalized checkpoint lookup.
…es blocks

Five fixes that together enable end-to-end block production in dev mode:

1. Fix DenebBeaconBlock JSON parsing: the /eth/v3/validator/blocks endpoint
   returns a DenebBeaconBlock wrapper ({"block": {...}, "kzg_proofs": [...]})
   but the dev validator was unmarshaling directly into BeaconBlock, losing
   all fields including slot.

2. Call OnTick before OnBlock in storeBlockAndBlobs: the beacon API handler
   runs before the ForkChoice stage calls OnTick, so fork choice time was
   stuck at genesis causing all blocks to be rejected as "too early".

3. Fix genesis body root mismatch: writeGenesisBeaconBlock was copying exec
   payload fields into the body, making its hash differ from devgenesis.go.
   Fork choice already has the eth2→eth1 mapping via anchorState seeding.

4. Initialize participation lists in dev genesis: without these, epoch
   processing panics on index-out-of-range at the first epoch boundary.

5. Map zero hash → EL genesis in fork choice: the finalized checkpoint
   root is zero at genesis, so GetEth1Hash needs this mapping.

Tested: 16+ blocks produced and validated across epoch boundaries with no
errors or panics.
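Fix 1 above is easy to reproduce with plain encoding/json. The sketch below uses trimmed stand-in types, not Caplin's cltypes, and assumes only the wrapper shape quoted in the commit message. Unmarshaling the wrapper directly into the inner block type succeeds without error but leaves every field empty, because none of the top-level keys match.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// beaconBlock is a trimmed stand-in for the real block type.
type beaconBlock struct {
	Slot          string `json:"slot"`
	ProposerIndex string `json:"proposer_index"`
}

// blockContents models the wrapper: {"block": {...}, "kzg_proofs": [...]}.
type blockContents struct {
	Block     *beaconBlock `json:"block"`
	KZGProofs []string     `json:"kzg_proofs"`
}

// parseTemplate decodes the wrapper and returns the inner block.
func parseTemplate(data []byte) (*beaconBlock, error) {
	var w blockContents
	if err := json.Unmarshal(data, &w); err != nil {
		return nil, err
	}
	if w.Block == nil {
		return nil, fmt.Errorf("response has no \"block\" field")
	}
	return w.Block, nil
}

func main() {
	resp := []byte(`{"block":{"slot":"1","proposer_index":"36"},"kzg_proofs":[]}`)

	// The original bug: no error, but the slot is silently lost.
	var direct beaconBlock
	err := json.Unmarshal(resp, &direct)
	fmt.Printf("direct:  slot=%q err=%v\n", direct.Slot, err)

	blk, err := parseTemplate(resp)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("wrapper: slot=%q proposer=%q\n", blk.Slot, blk.ProposerIndex)
}
```

Returning an error when the `block` key is absent turns the silent data loss into a visible failure.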
… time

The --chain=dev mode now uses an embedded PoS consensus layer (Caplin) with
deterministic BLS validators instead of Clique. Update all documentation to
reflect this:

- Rewrite docs/DEV_CHAIN.md with PoS dev mode flags, examples, and explanation
- Update CLAUDE.md, agents.md, ChangeLog.md to remove --mine references
- Update Docker Compose example in multiple-instances.md
- Update skill file example

Also enforce a 2-second minimum on --dev.slot-time to prevent degenerate
behavior with sub-second slots.
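A minimum like this can be enforced where the flag value is converted to a duration. The sketch below is an assumption about the shape of such a check, not the PR's code: the commit message does not say whether an invalid value is rejected or clamped, and this example rejects it. The helper name is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// minDevSlotSeconds is the 2-second minimum stated in the commit message.
const minDevSlotSeconds = 2

// devSlotDuration validates a --dev.slot-time value given in seconds.
// Rejecting (rather than clamping) is a choice made for this example.
func devSlotDuration(seconds uint64) (time.Duration, error) {
	if seconds < minDevSlotSeconds {
		return 0, fmt.Errorf("--dev.slot-time=%d is below the %d-second minimum", seconds, minDevSlotSeconds)
	}
	return time.Duration(seconds) * time.Second, nil
}

func main() {
	for _, s := range []uint64{1, 2, 6} {
		d, err := devSlotDuration(s)
		fmt.Printf("seconds=%d duration=%v err=%v\n", s, d, err)
	}
}
```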
The cherry-pick from the dev-mode branch dropped the BLS private_key.go
change. Re-add NewPrivateKeyFromIKM and fix gofmt formatting in files
that diverged during the cherry-pick.
domiwei and others added 4 commits April 10, 2026 06:26
Fix two issues in the dev mode validator:

1. maybeAttest used post() (which discards the response body) followed
   by get() on /eth/v1/validator/duties/attester/{epoch}. Since that
   endpoint is POST-only per the Beacon API spec, the GET always
   returned 405, making attestations silently fail every slot. Add a
   postAndDecode() client method that sends a POST and parses the
   response in one round-trip, and use it in maybeAttest.

2. The dev-beacon startup path in SetEthConfig silently discarded
   errors from os.MkdirAll, EncodeSSZ, and two os.WriteFile calls.
   A disk-full or permission error would cause Caplin to fail later
   with a misleading "could not read genesis state" message. Add
   explicit error checks with Fatalf for all four operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
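The first fix can be sketched with the standard library alone. This is not the PR's client.go: the signature of `postAndDecode` below, the response struct, and the mock server are assumptions for illustration. The mock mirrors the behaviour described above, answering POST with duties and anything else with 405.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// postAndDecode sends body as JSON and decodes the JSON response into out,
// all in a single round trip.
func postAndDecode(client *http.Client, url string, body, out any) error {
	payload, err := json.Marshal(body)
	if err != nil {
		return fmt.Errorf("marshal request: %w", err)
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("POST %s: unexpected status %s", url, resp.Status)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// dutiesResponse is a trimmed stand-in for the attester duties response.
type dutiesResponse struct {
	Data []struct {
		ValidatorIndex string `json:"validator_index"`
		Slot           string `json:"slot"`
	} `json:"data"`
}

// newMockDutiesServer returns a test server that accepts POST only.
func newMockDutiesServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		fmt.Fprint(w, `{"data":[{"validator_index":"3","slot":"9"}]}`)
	}))
}

func main() {
	srv := newMockDutiesServer()
	defer srv.Close()
	url := srv.URL + "/eth/v1/validator/duties/attester/1"

	var duties dutiesResponse
	err := postAndDecode(srv.Client(), url, []string{"3"}, &duties)
	fmt.Println("POST:", err, "duties:", len(duties.Data))

	resp, err := srv.Client().Get(url)
	if err == nil {
		defer resp.Body.Close()
		fmt.Println("GET status:", resp.StatusCode)
	}
}
```

Checking the status code before decoding is what keeps a 405 from being mistaken for an empty duty list.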
Both signing.go and devgenesis.go hardcoded GenesisForkVersion for all
domain computations. In dev mode where all forks activate at epoch 0,
the correct CurrentVersion is the latest active fork (e.g. Deneb's
0x04000000), not Phase0's GenesisForkVersion.

The two bugs cancelled each other out (both sides used the same wrong
value), but fixing either one without the other would break all block
proposals and attestations. Fix both together:

- signing.go: add forkVersionForEpoch() that uses the config's
  GetCurrentStateVersion + GetForkVersionByVersion to derive the
  correct fork version for any epoch.

- devgenesis.go: set Fork.CurrentVersion to the latest fork active at
  epoch 0, with PreviousVersion remaining as GenesisForkVersion per
  the spec's fork transition semantics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
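The selection rule behind forkVersionForEpoch can be shown with a toy schedule. The types below are hypothetical and do not use the config's GetCurrentStateVersion or GetForkVersionByVersion; the activation epochs in the staged schedule are illustrative numbers, not mainnet values. The rule is simply "the newest fork whose activation epoch has been reached".

```go
package main

import "fmt"

// fork is a hypothetical schedule entry for this example.
type fork struct {
	name    string
	epoch   uint64 // activation epoch
	version [4]byte
}

// forkVersionForEpoch returns the version of the newest fork active at epoch.
// schedule must be ordered oldest to newest; with no active fork it returns
// the zero version.
func forkVersionForEpoch(schedule []fork, epoch uint64) [4]byte {
	var v [4]byte
	for _, f := range schedule {
		if f.epoch <= epoch {
			v = f.version
		}
	}
	return v
}

func main() {
	// Dev mode: every fork activates at epoch 0, so the newest one wins.
	dev := []fork{
		{"phase0", 0, [4]byte{0x00}},
		{"altair", 0, [4]byte{0x01}},
		{"bellatrix", 0, [4]byte{0x02}},
		{"capella", 0, [4]byte{0x03}},
		{"deneb", 0, [4]byte{0x04}},
	}
	fmt.Printf("dev epoch 0: %x\n", forkVersionForEpoch(dev, 0))

	// Staged schedule with made-up activation epochs.
	staged := []fork{
		{"phase0", 0, [4]byte{0x00}},
		{"altair", 10, [4]byte{0x01}},
		{"deneb", 20, [4]byte{0x04}},
	}
	for _, e := range []uint64{0, 10, 19, 20} {
		fmt.Printf("staged epoch %d: %x\n", e, forkVersionForEpoch(staged, e))
	}
}
```

Signer and genesis builder must apply the same rule; the commit message above explains that both previously agreed only because they shared the same wrong constant.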
Add two test files covering the bugs fixed in previous commits:

client_test.go:
- TestPostAndDecode_AttesterDuties: mock POST-only endpoint, verify
  postAndDecode sends POST and parses the response data correctly.
- TestGet_AttesterDuties_Fails: confirm GET on a POST-only endpoint
  returns 405, documenting the original bug.

signing_test.go:
- TestForkVersionForEpoch_DevMode: all forks at epoch 0 must return
  the latest fork version, not GenesisForkVersion.
- TestForkVersionForEpoch_MainnetProgression: verify version advances
  through Phase0 → Altair → Deneb on mainnet config.
- TestSigningDomainMatchesGenesis: the domain computed by signObject
  at epoch 0 must equal the domain derived from BuildGenesisState's
  Fork.CurrentVersion — the core invariant that prevents silent
  signature mismatches in dev mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…or handling

Three improvements to the dev mode validator:

1. config.yaml and Go setup now enable Electra and Fulu at epoch 0,
   matching the EL's fork schedule. Previously only Altair through
   Deneb were listed, so Caplin would fall back to the minimal preset
   defaults (far-future epochs) for newer forks, causing EL/CL fork
   mismatch.

2. Replace all 7 fmt.Sscanf calls in service.go with
   strconv.ParseUint. Sscanf silently leaves variables at zero on
   parse failure, which could cause wrong genesis time, validator
   index, slot, or committee calculations with no error indication.

3. DeriveSignerKey now returns an error instead of silently discarding
   the crypto.ToECDSA result. Updated all callers (flags.go, test).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
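The difference described in item 2 is easy to demonstrate. The helper name `parseUintField` is hypothetical; the standard-library calls are real. When the error from fmt.Sscanf is ignored, a failed parse leaves the target at zero, and input with trailing garbage is accepted after the leading digits, whereas strconv.ParseUint rejects both.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseUintField parses a decimal string field and names it in any error.
func parseUintField(name, value string) (uint64, error) {
	n, err := strconv.ParseUint(value, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parse %s %q: %w", name, value, err)
	}
	return n, nil
}

func main() {
	// Ignoring the Sscanf error, as the old code effectively did.
	var slot uint64
	fmt.Sscanf("not-a-number", "%d", &slot)
	fmt.Println("Sscanf on bad input leaves slot at:", slot)

	var idx uint64
	fmt.Sscanf("12abc", "%d", &idx)
	fmt.Println("Sscanf on \"12abc\" yields:", idx)

	for _, in := range []string{"12", "12abc", "not-a-number"} {
		n, err := parseUintField("slot", in)
		fmt.Printf("ParseUint(%q) = %d, err = %v\n", in, n, err)
	}
}
```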
domiwei commented Apr 10, 2026

Member

I found a few things we could improve, plus some possible bugs, so I just went ahead and made the changes.

AskAlexSharov requested a review from Copilot April 10, 2026 07:48
AskAlexSharov added this pull request to the merge queue Apr 10, 2026

Copilot AI left a comment

Contributor

Pull request overview

Updates Erigon’s --chain=dev to run as a single-process PoS node (embedded Caplin CL + embedded dev validator) instead of relying on Clique, enabling dev mode to exercise the same fork-choice, payload building, and EL/CL integration paths as mainnet.

Changes:

  • Add programmatic dev beacon genesis generation (minimal preset, forks enabled from genesis) and embed it into --chain=dev startup.
  • Add an embedded dev validator client that proposes blocks and submits attestations via the Beacon API.
  • Wire new CLI flags + update forkchoice/startup glue and documentation for the new PoS dev workflow.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| node/eth/backend.go | Starts the embedded dev validator alongside embedded Caplin when configured. |
| node/cli/default_flags.go | Registers new dev-mode flags (validator seed/count, slot time). |
| cmd/utils/flags.go | Implements PoS dev-chain config: EL genesis tweaks, beacon config/state generation, and Caplin/dev-validator wiring. |
| cmd/caplin/caplin1/run.go | Seeds index DB with a genesis beacon block in dev mode before fork choice init. |
| cmd/caplin/caplin1/dev_genesis.go | Constructs and persists the dev genesis beacon block into the indices DB. |
| cl/clparams/config.go | Adds Caplin dev-validator fields and exports ApplyMinimalPreset. |
| cl/clparams/devgenesis/devgenesis.go | Builds deterministic beacon genesis state + derives deterministic validator/signer keys. |
| cl/clparams/devgenesis/devgenesis_test.go | Tests for determinism and basic validity of the generated dev genesis state/keys. |
| cl/validator/devvalidator/service.go | Embedded validator duty loop: proposer duties, block signing/submission, attestation submission. |
| cl/validator/devvalidator/client.go | Minimal Beacon API client helpers (GET/POST + wrapper decoding). |
| cl/validator/devvalidator/client_test.go | Regression tests for POST-only attester duties handling. |
| cl/validator/devvalidator/keys.go | Deterministic key loading and pubkey→key indexing. |
| cl/validator/devvalidator/signing.go | Fork-aware domain computation and signing helpers (RANDAO, block, attestation). |
| cl/validator/devvalidator/signing_test.go | Tests fork-version selection and domain consistency vs genesis fork version. |
| cl/utils/bls/private_key.go | Adds deterministic BLS key derivation helper from IKM. |
| cl/phase1/forkchoice/forkchoice.go | Seeds eth2Root→eth1Hash mapping for the anchor block at startup. |
| cl/phase1/execution_client/execution_client_direct.go | Retries AssembleBlock on transient busy errors in single-process mode. |
| cl/beacon/handler/block_production.go | Advances forkchoice time via OnTick prior to accepting locally-produced blocks. |
| docs/DEV_CHAIN.md | Rewrites dev-chain documentation for PoS dev mode and new flags/workflow. |
| docs/gitbook/src/fundamentals/multiple-instances.md | Updates --chain=dev example to include required Beacon API flags. |
| ChangeLog.md | Updates dev-mode messaging to reflect embedded PoS dev chain. |
| agents.md | Updates dev command example to PoS dev mode flags. |
| .claude/skills/erigon-ephemeral/SKILL.md | Updates dev example flags in skill documentation. |

Comment thread cmd/utils/flags.go
Comment on lines +2049 to +2055
// Compute the EL genesis block hash so the beacon state's Eth1Data
// matches the actual chain genesis.
elGenesisBlock, _, err := genesiswrite.GenesisToBlock(nil, cfg.Genesis, cfg.Dirs, logger)
if err != nil {
	Fatalf("Failed to compute dev EL genesis hash: %v", err)
}
elGenesisHash := elGenesisBlock.Hash()

Copilot AI Apr 10, 2026

genesiswrite.GenesisToBlock builds an in-memory DB and requires fully-initialized dirs (it panics if dirs.SnapDomain is empty). Here it’s only used to compute the EL genesis hash, so this is both heavyweight and fragile in contexts where cfg.Dirs may be partially populated (e.g. config-building tests). Consider computing the genesis block/hash via a no-state path (e.g. genesiswrite.GenesisWithoutStateToBlock / header construction) instead of GenesisToBlock.

Comment on lines +107 to +111
s.genesisTime = gt
root, err := hexutil.Decode(genesis.GenesisValidatorsRoot)
if err == nil && len(root) == 32 {
	copy(s.genesisValidatorsRoot[:], root)
}

Copilot AI Apr 10, 2026

If genesis_validators_root is missing/invalid, this code still logs "beacon node ready" and returns, leaving s.genesisValidatorsRoot as the zero hash. That will cause all subsequent signatures (blocks/attestations/RANDAO) to be computed with the wrong domain and get rejected. Treat a decode error or wrong length as not-ready and retry (similar to the genesis_time parse path).

Suggested change
- s.genesisTime = gt
- root, err := hexutil.Decode(genesis.GenesisValidatorsRoot)
- if err == nil && len(root) == 32 {
- 	copy(s.genesisValidatorsRoot[:], root)
- }
+ root, err := hexutil.Decode(genesis.GenesisValidatorsRoot)
+ if err != nil {
+ 	s.logger.Warn("[dev-validator] invalid genesis_validators_root", "value", genesis.GenesisValidatorsRoot, "err", err)
+ 	time.Sleep(time.Second)
+ 	continue
+ }
+ if len(root) != 32 {
+ 	s.logger.Warn("[dev-validator] invalid genesis_validators_root length", "value", genesis.GenesisValidatorsRoot, "len", len(root))
+ 	time.Sleep(time.Second)
+ 	continue
+ }
+ s.genesisTime = gt
+ copy(s.genesisValidatorsRoot[:], root)
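The validation in this suggestion can be isolated into a small helper. The sketch below is an illustration with a hypothetical name (`parseRoot`) and uses the standard library's encoding/hex as a stand-in for hexutil, so its handling of the `0x` prefix is an assumption of the example. The property that matters is that a bad or short value produces an error instead of a silently zeroed root.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// parseRoot decodes a hex string into a 32-byte root and rejects any
// value that does not decode to exactly 32 bytes.
func parseRoot(s string) ([32]byte, error) {
	var root [32]byte
	raw, err := hex.DecodeString(strings.TrimPrefix(s, "0x"))
	if err != nil {
		return root, fmt.Errorf("decode root %q: %w", s, err)
	}
	if len(raw) != len(root) {
		return root, fmt.Errorf("root %q has %d bytes, want %d", s, len(raw), len(root))
	}
	copy(root[:], raw)
	return root, nil
}

func main() {
	valid := "0x" + strings.Repeat("ab", 32)
	for _, in := range []string{valid, "0x1234", "0xzz", ""} {
		root, err := parseRoot(in)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: %x\n", root[:4])
	}
}
```

A caller in a readiness loop would treat the returned error as "not ready yet" and retry, which is the behaviour the review comment asks for.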

Comment on lines +322 to +331
// Submit the signed block. For Deneb+, wrap in DenebSignedBeaconBlock
// with empty blob sidecars.
versionStr := version.String()
var submitBody interface{} = block
if version >= clparams.DenebVersion {
	submitBody = &cltypes.DenebSignedBeaconBlock{
		SignedBlock: block,
		KZGProofs:   solid.NewStaticListSSZ[*cltypes.KZGProof](cltypes.MaxBlobsCommittmentsPerBlock*int(s.cfg.NumberOfColumns), cltypes.BYTES_KZG_PROOF),
		Blobs:       solid.NewStaticListSSZ[*cltypes.Blob](cltypes.MaxBlobsCommittmentsPerBlock, int(cltypes.BYTES_PER_BLOB)),
	}

Copilot AI Apr 10, 2026

For Deneb+, the block template response may include kzg_proofs and blobs, but the submission path always constructs a DenebSignedBeaconBlock with freshly-allocated (empty) KZGProofs/Blobs. If the template contains any blob commitments (e.g. user sends a blob tx), the submitted block will be incomplete/invalid. Populate these fields from the template response when present, or switch to SSZ submission of the exact object returned by the block-production endpoint.

Suggested change
- // Submit the signed block. For Deneb+, wrap in DenebSignedBeaconBlock
- // with empty blob sidecars.
- versionStr := version.String()
- var submitBody interface{} = block
- if version >= clparams.DenebVersion {
- 	submitBody = &cltypes.DenebSignedBeaconBlock{
- 		SignedBlock: block,
- 		KZGProofs:   solid.NewStaticListSSZ[*cltypes.KZGProof](cltypes.MaxBlobsCommittmentsPerBlock*int(s.cfg.NumberOfColumns), cltypes.BYTES_KZG_PROOF),
- 		Blobs:       solid.NewStaticListSSZ[*cltypes.Blob](cltypes.MaxBlobsCommittmentsPerBlock, int(cltypes.BYTES_PER_BLOB)),
- 	}
+ // Submit the signed block. For Deneb+, preserve any blob sidecars
+ // returned by the block-production endpoint instead of replacing them
+ // with empty lists.
+ versionStr := version.String()
+ var submitBody interface{} = block
+ if version >= clparams.DenebVersion {
+ 	denebBlock := &cltypes.DenebSignedBeaconBlock{}
+ 	rawBlock, err := json.Marshal(block)
+ 	if err != nil {
+ 		return fmt.Errorf("marshal deneb block template: %w", err)
+ 	}
+ 	if err := json.Unmarshal(rawBlock, denebBlock); err != nil {
+ 		return fmt.Errorf("unmarshal deneb block template: %w", err)
+ 	}
+ 	denebBlock.SignedBlock = block
+ 	if denebBlock.KZGProofs == nil {
+ 		denebBlock.KZGProofs = solid.NewStaticListSSZ[*cltypes.KZGProof](cltypes.MaxBlobsCommittmentsPerBlock*int(s.cfg.NumberOfColumns), cltypes.BYTES_KZG_PROOF)
+ 	}
+ 	if denebBlock.Blobs == nil {
+ 		denebBlock.Blobs = solid.NewStaticListSSZ[*cltypes.Blob](cltypes.MaxBlobsCommittmentsPerBlock, int(cltypes.BYTES_PER_BLOB))
+ 	}
+ 	submitBody = denebBlock

Comment thread cmd/utils/flags.go
Comment on lines +2026 to +2027
// Enable Beacon API for dev mode.
cfg.CaplinConfig.BeaconAPIRouter.Active = true

Copilot AI Apr 10, 2026

In dev mode this forces BeaconAPIRouter.Active = true even when the user did not pass --beacon.api=.... In that case the router will start but all endpoint toggles (Beacon, Validator, Node, Config, …) remain false (see UnwrapEndpointsList), so the embedded validator will likely poll /eth/v1/beacon/genesis forever. Consider either (a) failing fast in --chain=dev unless the required endpoints are enabled, or (b) auto-enabling the minimal required endpoint flags alongside Active=true.

Suggested change
- // Enable Beacon API for dev mode.
- cfg.CaplinConfig.BeaconAPIRouter.Active = true
+ // Enable Beacon API for dev mode and expose the minimal endpoint groups
+ // required by the embedded validator/CL components.
+ cfg.CaplinConfig.BeaconAPIRouter.Active = true
+ cfg.CaplinConfig.BeaconAPIRouter.Beacon = true
+ cfg.CaplinConfig.BeaconAPIRouter.Validator = true
+ cfg.CaplinConfig.BeaconAPIRouter.Node = true
+ cfg.CaplinConfig.BeaconAPIRouter.Config = true
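Option (a) in this comment, failing fast, could look roughly like the following. This is a sketch under assumptions: it does not call UnwrapEndpointsList, the helper name `missingGroups` is hypothetical, and the required group list is taken from the `--beacon.api=beacon,validator,node,config` value in the PR description.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredGroups lists the endpoint groups the PR description says the
// embedded validator needs.
var requiredGroups = []string{"beacon", "validator", "node", "config"}

// missingGroups returns the required groups absent from a comma-separated
// --beacon.api value, in the order of requiredGroups.
func missingGroups(flagValue string) []string {
	enabled := make(map[string]bool)
	for _, g := range strings.Split(flagValue, ",") {
		if g = strings.ToLower(strings.TrimSpace(g)); g != "" {
			enabled[g] = true
		}
	}
	var missing []string
	for _, g := range requiredGroups {
		if !enabled[g] {
			missing = append(missing, g)
		}
	}
	return missing
}

func main() {
	for _, v := range []string{"beacon,validator,node,config", "beacon, node", ""} {
		if m := missingGroups(v); len(m) > 0 {
			fmt.Printf("--beacon.api=%q: missing %s\n", v, strings.Join(m, ","))
			continue
		}
		fmt.Printf("--beacon.api=%q: ok\n", v)
	}
}
```

A startup check built on this would report the missing groups once, instead of letting the validator poll an endpoint that will never answer.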

Merged via the queue into main with commit 4d52bd5 on Apr 10, 2026
39 checks passed
AskAlexSharov deleted the feat/caplin-dev-mode-pr branch April 10, 2026 08:58

Development

Successfully merging this pull request may close these issues.

Switch --chain=dev to PoS

4 participants