[glamsterdam-devnet-5] no change sets for unwinding after initial sync causes node to get stuck #21650

@taratorio

Summary

On glamsterdam-devnet-5, the erigon EL on nimbus-erigon-1 got permanently stuck at block 6137 after a small (4-block) reorg orphaned its initial-sync tip. The canonical chain required unwinding to block 6133, but erigon had no unwind data below its batch-executed tip (minUnwindableBlock=6137), so it rejected every engine_forkchoiceUpdatedV4 with -38006 "Too deep reorg": 3,876 rejections over ~16h with no recovery. The CL (nimbus) kept following the head but reported is_optimistic=true the entire time, so the node's validators stopped attesting. The rest of the network was healthy (~90% participation, finalizing normally).

Environment

  • erigon: branch glamsterdam-devnet-5, commit 1ca634d4b094f6b3932ab27227a1fa34895753b1 (erigon/3.5.0/linux-amd64/go1.25.11)
  • CL: Nimbus/v26.5.0-05f88a
  • Network: glamsterdam-devnet-5 (12s slots, genesis 2026-06-04 13:00:00 UTC)
  • Node: nimbus-erigon-1

Timeline (UTC, 2026-06-05)

| Time | Event |
|---|---|
| Jun 4 13:00 | Devnet genesis |
| 08:13 | Node comes online ~19h after genesis; erigon starts OtterSync from scratch |
| 09:59:17 | Erigon shuts down gracefully (Exiting Engine...); external restart by deployment tooling, not a crash |
| 09:59:27 | After restart (head=320), nimbus FCU targets block 6137 0x137faa…; erigon backward-downloads and batch-executes 321→6137 (~20 blk/s) |
| 10:04:53 | head updated number=6137 … age=12m29s; the sync target was already a stale tip |
| 10:05:14 | Canonical branch arrives via NewPayloadTrigger (fork point: common ancestor 6133); first no unwindable block found from changesets |
| 10:29:40 | First engine_forkchoiceUpdatedV4 err="Too deep reorg"; repeats on every FCU (~10–20s) thereafter |
| Jun 6 02:10 | Still stuck at 6137; canonical head 10545; CL optimistic for ~16h |
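
As a sanity check on the figures in the timeline and summary, the durations can be re-derived from the timestamps. This is only a sketch: it assumes the 3,876 rejections were counted over the window from the first "Too deep reorg" (10:29:40) to the last observation (Jun 6 02:10), which the report does not state explicitly.

```python
from datetime import datetime, timedelta

# Timestamps copied from the timeline (UTC).
genesis = datetime(2026, 6, 4, 13, 0, 0)
node_online = datetime(2026, 6, 5, 8, 13, 0)
first_rejection = datetime(2026, 6, 5, 10, 29, 40)
last_observation = datetime(2026, 6, 6, 2, 10, 0)

REJECTED_FCUS = 3876  # count reported in the summary


def hours(delta: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


online_lag_h = hours(node_online - genesis)
stuck = last_observation - first_rejection
stuck_h = hours(stuck)
# Assumption: all rejections fall inside the `stuck` window.
avg_interval_s = stuck.total_seconds() / REJECTED_FCUS

print(f"node online {online_lag_h:.1f}h after genesis")
print(f"stuck for {stuck_h:.1f}h, one rejection every {avg_interval_s:.1f}s on average")
```

Under that assumption the average spacing between rejections falls inside the ~10–20s FCU cadence noted in the timeline.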

Key logs

Erigon (repeating on every FCU):

[WARN] no unwindable block found from changesets, falling back to latest with commitment block=6137 err=nil
[WARN] reorg target below minimum unwindable block unwindTarget=6133 minUnwindableBlock=6137
[WARN] [rpc] served method=engine_forkchoiceUpdatedV4 err="Too deep reorg"
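
For anyone tallying these warnings from a log file, the key=value fields can be pulled out with a small parser. This is a minimal sketch written against the three lines quoted above; the helper name `parse_fields` is made up for illustration and is not part of erigon.

```python
import re

LOG_LINE = (
    "[WARN] reorg target below minimum unwindable block "
    "unwindTarget=6133 minUnwindableBlock=6137"
)


def parse_fields(line: str) -> dict[str, str]:
    """Extract key=value pairs; values may be bare tokens or double-quoted."""
    pairs = re.findall(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}


fields = parse_fields(LOG_LINE)
# How many blocks below the unwind floor the requested target sits.
shortfall = int(fields["minUnwindableBlock"]) - int(fields["unwindTarget"])
print(fields, shortfall)
```

The shortfall computed here matches the 4-block reorg depth described in the summary.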

Nimbus side:

DBG Failed EL Request topics="elman" requestName=forkchoiceUpdated statusCode=0 err="{\"code\":-38006,\"message\":\"Too deep reorg\"}"
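
The err field in the Nimbus line is a JSON object that has been escaped and wrapped in quotes, so it needs two decoding passes. The following sketch assumes the log format is exactly as quoted above; `decode_el_error` is a hypothetical helper, not a Nimbus function.

```python
import json
import re

NIMBUS_LINE = r'DBG Failed EL Request topics="elman" requestName=forkchoiceUpdated statusCode=0 err="{\"code\":-38006,\"message\":\"Too deep reorg\"}"'


def decode_el_error(line: str) -> dict:
    """Return the JSON-RPC error object embedded in the err="..." field."""
    match = re.search(r'err=("(?:[^"\\]|\\.)*")', line)
    if match is None:
        raise ValueError("no err field in line")
    inner = json.loads(match.group(1))  # first pass: unescape the quoted value
    return json.loads(inner)            # second pass: parse the error object


err = decode_el_error(NIMBUS_LINE)
print(err["code"], err["message"])
```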

The engine block downloader did attempt the canonical branch once and gave up:

[INFO] [backward-block-downloader] starting forward downloading of blocks count=44 fromNum=6134 fromHash=0x89137b50… toNum=6177
[WARN] [EngineBlockDownloader] could not process backward download request hash=0x7bcc9321… trigger=NewPayloadTrigger chainTipNum=6178

Fork proof (cross-node RPC)

| Block | nimbus-erigon-1 (stuck) | teku-nethermind-1 (canonical) | Match |
|---|---|---|---|
| 6133 | 0x995491e501e77fadb4d4cdb79748c47d098e865bf8e90c3028bac9a921bbf4e8 | 0x995491e501e77fadb4d4cdb79748c47d098e865bf8e90c3028bac9a921bbf4e8 | ✅ common ancestor |
| 6137 | 0x137faa67d6e4a99dd5b9667a6cfc77bf24390f1b54b4be5d0126d84437804e6b | 0x754487678a35269b90b30da13d7bc7520598327961ee8ab7cce29f90ffe3865b | ❌ orphaned tip |
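
The comparison behind this table can be sketched as follows. The hashes are the ones reported above; in practice they would be fetched per height from each node (for example with eth_getBlockByNumber), but the sketch stays offline. `highest_common_block` is a hypothetical helper name. With only two sampled heights, the result is the highest sampled match, not by itself a proof of the exact fork point.

```python
from typing import Optional

# Block hashes copied from the table; only these two heights were reported.
STUCK = {
    6133: "0x995491e501e77fadb4d4cdb79748c47d098e865bf8e90c3028bac9a921bbf4e8",
    6137: "0x137faa67d6e4a99dd5b9667a6cfc77bf24390f1b54b4be5d0126d84437804e6b",
}
CANONICAL = {
    6133: "0x995491e501e77fadb4d4cdb79748c47d098e865bf8e90c3028bac9a921bbf4e8",
    6137: "0x754487678a35269b90b30da13d7bc7520598327961ee8ab7cce29f90ffe3865b",
}


def highest_common_block(a: dict[int, str], b: dict[int, str]) -> Optional[int]:
    """Return the highest height sampled on both nodes whose hashes agree."""
    for number in sorted(set(a) & set(b), reverse=True):
        if a[number].lower() == b[number].lower():
            return number
    return None


print(highest_common_block(STUCK, CANONICAL))
```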

The orphaned 6137's EL timestamp is 120s newer than canonical 6137 — a minority-fork branch with more empty slots, which the optimistic CL fed as the FCU target during initial sync.
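
The timestamp gap translates into slots as follows, assuming each EL block timestamp equals the start time of its slot (12s slots, per the Environment section).

```python
SLOT_SECONDS = 12            # from the Environment section
TIMESTAMP_GAP_SECONDS = 120  # orphaned 6137 minus canonical 6137

# Both branches have the same number of blocks above 6133, so the gap
# corresponds to additional slots without a block on the orphaned branch.
extra_slots, remainder = divmod(TIMESTAMP_GAP_SECONDS, SLOT_SECONDS)
print(extra_slots, remainder)
```

A zero remainder is what one would expect if both timestamps are slot-aligned.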

Expected behavior

When an FCU requires unwinding below minUnwindableBlock (state with no unwind history, e.g. right after initial-cycle batch execution), erigon should have a recovery path — e.g. backward-download the canonical branch from the common ancestor and re-execute — instead of permanently rejecting every FCU with -38006. As-is, any small reorg that orphans the initial-sync tip bricks the node until a manual datadir wipe + resync.
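
To make the requested behavior concrete, here is a toy decision function contrasting the current outcome with the proposed one. It is a sketch only and does not reflect erigon's actual code: every name in it (`ForkchoiceContext`, `recovery_base_available`, `decide`, `allow_recovery`) is hypothetical, and it assumes recovery needs some usable base state at or below the common ancestor.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    UNWIND = "unwind to target"
    REEXECUTE_FROM_ANCESTOR = "download canonical branch and re-execute"
    REJECT_TOO_DEEP = "reject with -38006 Too deep reorg"


@dataclass(frozen=True)
class ForkchoiceContext:
    unwind_target: int             # common ancestor block number
    min_unwindable_block: int      # lowest block with unwind data
    recovery_base_available: bool  # hypothetical: state usable as a re-execution base


def decide(ctx: ForkchoiceContext, allow_recovery: bool) -> Action:
    """Pick an action for an FCU that requires unwinding to ctx.unwind_target."""
    if ctx.unwind_target >= ctx.min_unwindable_block:
        return Action.UNWIND  # normal reorg: unwind data exists
    if allow_recovery and ctx.recovery_base_available:
        return Action.REEXECUTE_FROM_ANCESTOR  # proposed recovery path
    return Action.REJECT_TOO_DEEP  # behavior observed in this issue


incident = ForkchoiceContext(6133, 6137, recovery_base_available=True)
print(decide(incident, allow_recovery=False))
print(decide(incident, allow_recovery=True))
```

With `allow_recovery=False` the incident values reproduce the permanent rejection; with it enabled, the same FCU would instead lead to re-execution of the canonical branch.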

Notes

  • The trigger combination: external mid-sync restart + CL optimistically targeting a soon-to-be-orphaned tip + reorg landing inside the just-executed batch range.
  • Full debug report (raw Dora/ClickHouse/RPC evidence with re-derivation commands) available on request.

Labels

Glamsterdam (https://eips.ethereum.org/EIPS/eip-7773)