perf(bal): account-presence bitmap + storage_reads on column index — skip-merge#11661
Conversation
- Replace SortedDictionary with Dictionary in BlockAccessList for O(1) account lookups on the EVM hot path (was O(log n) with 20-byte span comparisons). Sorted enumeration preserved for encoding/validation. - Merge TryGetDelegation + GetCachedCodeInfo into a single call in InstructionCall, eliminating a redundant GetCodeHash per CALL opcode. - Inline IsDeadAccount in EXTCODEHASH to avoid a second GetCodeHash call for the same address.
commit 3a3078f Author: Marc Harvey-Hill <10379486+Marchhill@users.noreply.github.com> Date: Tue Apr 28 18:23:03 2026 +0200 delete old tests commit 95de888 Merge: 244c2c6 31a673a Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 18:08:12 2026 +0200 Merge branch 'master' into engine-api-glamsterdam-cleanup commit 244c2c6 Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 18:05:32 2026 +0200 fix lint commit 9194b8f Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 18:04:19 2026 +0200 remove tests commit eedfbb7 Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 17:57:46 2026 +0200 revert formatting changes and zero hash stuff commit dc71aa5 Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 17:38:42 2026 +0200 cleanups commit ec2fce3 Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 17:33:11 2026 +0200 fixing commit 90a1da8 Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 17:15:58 2026 +0200 remove test commit 4451656 Author: MarekM25 <marekm2504@gmail.com> Date: Tue Apr 28 17:12:32 2026 +0200 Cleanup mess commit 2425748 Merge: 4a90460 a36154c Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 27 14:33:36 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 4a90460 Merge: 3549768 18d60a4 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 27 14:08:25 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 3549768 Merge: 30fcc36 b71c352 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 27 14:07:39 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 30fcc36 Merge: 540e4a2 ed6a354 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 27 13:09:48 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 540e4a2 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 04:52:49 2026 +0300 minor fixes commit 010547b Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 03:22:39 2026 +0300 test fixes commit 6fcc136 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 02:11:12 2026 +0300 fixes commit 0bccb06 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 01:52:02 2026 +0300 test fix commit 830c578 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 01:38:55 2026 +0300 fix tests commit 1f5fd2c Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 01:35:17 2026 +0300 claude review again commit d87491f Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 00:57:57 2026 +0300 fixes commit dd431e2 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sun Apr 26 00:51:52 2026 +0300 fix tests commit 71dc99f Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sat Apr 25 22:40:50 2026 +0300 fixes commit 9c03d12 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sat Apr 25 22:32:58 2026 +0300 update per 786 commit a4c4f3c Merge: fc49e7e 9cba44c Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Sat Apr 25 21:38:25 2026 +0300 Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam commit fc49e7e Merge: bca0c63 42c551f Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 20 16:14:33 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit bca0c63 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Mon Apr 20 16:14:10 2026 +0300 remove unused import commit 3d2c973 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Mon Apr 20 16:09:17 2026 +0300 fix taiko tests commit b30ef5c Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Mon Apr 20 15:34:35 2026 +0300 fix tests commit 6f131cd Merge: a8c884a 0fe41f4 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 20 14:55:46 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit a8c884a Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Mon Apr 20 14:54:40 2026 +0300 conflicts fix commit 96eb912 Merge: 82a1f70 425a2a3 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Mon Apr 20 14:52:33 2026 +0300 Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam # Conflicts: # src/Nethermind/Nethermind.Merge.Plugin/BlockTreeExtensions.cs # src/Nethermind/Nethermind.Merge.Plugin/Handlers/ForkchoiceUpdatedHandler.cs # src/Nethermind/Nethermind.Taiko/Rpc/TaikoForkchoiceUpdatedHandler.cs commit 82a1f70 Merge: 7b6133f 436c65b Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Mon Apr 20 13:16:55 2026 +0300 Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam commit 7b6133f Merge: 2f068ec 02c2026 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Thu Apr 16 14:49:46 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 2f068ec Merge: bdd7ee5 5464c8b Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Thu Apr 16 14:10:02 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit bdd7ee5 Merge: f09cb41 bfaaeb0 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Wed Apr 15 16:45:19 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit f09cb41 Merge: 370acdf cf03d18 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Wed Apr 15 16:24:17 2026 +0300 Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam commit 370acdf Merge: d1d0370 6cd0ea4 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Mon Apr 13 14:28:21 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit d1d0370 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Fri Apr 10 17:30:54 2026 +0300 more fixes commit c99d6cc Merge: 4d0f215 a52cb90 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Fri Apr 10 17:24:06 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 4d0f215 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Fri Apr 10 17:23:42 2026 +0300 claude review commit cb6defe Merge: af63142 0057bd8 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Fri Apr 10 12:34:37 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit af63142 Merge: 1b338f3 2f24891 Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com> Date: Fri Apr 10 12:08:36 2026 +0300 Merge branch 'master' into engine-api-glamsterdam commit 1b338f3 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Fri Apr 10 12:08:13 2026 +0300 claude review commit 619259b Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Thu Apr 9 18:32:59 2026 +0300 taiko fix commit d067793 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Thu Apr 9 18:19:14 2026 +0300 cleaner design commit 1e27be1 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Thu Apr 9 18:12:04 2026 +0300 improve tests matching the specs commit a77e182 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Thu Apr 9 18:04:58 2026 +0300 improve test commit 4ba069c Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Thu Apr 9 18:01:59 2026 +0300 fix comment commit fe6f9b0 Author: stavrosvl7 <stavrosvl7@gmail.com> Date: Thu Apr 9 17:53:57 2026 +0300 engine api changes as per ethereum/execution-apis#770 ethereum/execution-apis#760
…halt Top-level REVERT preserves gas_left, so the spilled portion of state_gas_used (originally drawn from gas_left) is still in the user's pocket and must be refunded to the reservoir alongside the reservoir-funded portion. Top-level halt burns gas_left, so the spilled portion was paid out of gas the user can no longer reclaim — only the reservoir-funded portion is restorable. Splits RefundRevertedTopLevelStateGas into a revert path (full refund) and a new RefundHaltedTopLevelStateGas (reservoir-only, spill discarded) and wires the halt error sites to the latter.
… GetContentLength
The pyspec fixture archive ships three uncovered directories: a stray state- tests transition fork, plus two test types we never wired in. Adds: - CancunToPragueAtTime15kStateTests — fills a one-class gap in Tests.cs - PyspecSyncBlockchainTestFixture + AmsterdamSyncBlockchainTests, OsakaSyncBlockchainTests — runs blockchain_tests_sync fixtures through the engine harness; the additional syncPayload field is left for a follow-up - TestType.Transaction + TransactionTest/Json/Base + ConvertTransactionTests + PyspecTransactionTestFixture — decodes raw txbytes via Rlp.Decode and matches expected EEST exception tokens (TYPE_4_*) against the validator's ValidationResult.Error or RLP decode message; covers AmsterdamTxTests, OsakaTxTests, PragueTxTests fork directories Bumps FlatDB pyspec chunking from 4 to 16 to match the regular workflow — ~860 tests/shard instead of ~3,437/shard, well under the 256/matrix cap and the 20-minute job timeout.
… match Two CI failures in the new EEST snobal-devnet-6 fixtures around BAL validation in negative and revert scenarios: - Class B (negative tests, 15 cases): EEST's *_from_state_test synthesized blockchain test versions of state tests ship an incomplete suggested BAL for transactions expected to fail at tx-level. Nethermind correctly rejects the block but via "InvalidBlockLevelAccessList: missing/surplus account changes" rather than the expected TransactionException. The block IS invalid; only the failure mode differs. AssertValidationError now accepts BAL missing/surplus/incorrect-changes errors as a valid alternative outcome. - Class A (positive tests, 4 cases): Nethermind's generated BAL for REVERT/TSTORE scenarios disagrees with the suggested BAL on intermediate storage values, even though the final post-state matches. Filter these four specific from_state_test variants out of Pyspec test loading until the BAL-on-revert behavior is investigated. Original state tests in PyspecStateTestFixture still cover the same scenarios.
|
@claude re-review |
|
Claude finished @LukaszRozmej's task in 3m 41s —— View job Claude finished @LukaszRozmej's task —— Re-review: perf(bal) — account-presence bitmap + storage_reads on column index
Status vs previous review
New commit review (
|
|
Claude finished @LukaszRozmej's task in 2m 48s —— View job Final review: perf(bal) — account-presence bitmap + storage_reads on column index
Status vs all prior reviews
New findingsLow (informational)
The property persists across calls to /// <remarks>
/// Must be set before each <see cref="PrepareForProcessing"/> call; it is not reset
/// between blocks. Callers that conditionally enable it must also unconditionally
/// clear it between calls.
/// </remarks>Algorithmic correctness notes (verified, no issue)
VerdictAll Critical / High / Medium findings from all prior reviews are resolved. All four Low findings from prior reviews are resolved. The new preallocation commit ( |
| private void SlowPathFromColumnIndex(Block block, uint index, bool validateStorageReads) | ||
| { | ||
| BlockAccessListValidationIndex gen = _generatedValidationIndex!; | ||
| BlockAccessListValidationIndex sug = _suggestedValidationIndex!; |
There was a problem hiding this comment.
shouldn't refer to 'legacy' code that doesn't exist any more
| { | ||
| SlowPathFromColumnIndex(block, index, validateStorageReads); | ||
| return; | ||
| } |
There was a problem hiding this comment.
this function is very large now, could be broken up?
| /// check doesn't: account-set presence and the storage_reads contents per account. | ||
| /// </summary> | ||
| private static void ValidateStructuralEquivalence(Block block, GeneratedBlockAccessList generated) | ||
| private void ValidateStructuralEquivalence(Block block) |
| /// <summary>When set, the manager always builds the materialised GeneratedBlockAccessList | ||
| /// even on the parallel-validation path. BAL recorder must set this before | ||
| /// PrepareForProcessing.</summary> | ||
| bool ForceMaterializeGeneratedBlockAccessList { get; set; } |
There was a problem hiding this comment.
| bool ForceMaterializeGeneratedBlockAccessList { get; set; } | |
| bool ForceConstructGeneratedBlockAccessList { get; set; } |
| // + storage_reads). See paradigmxyz/reth#24297 for the prior art. | ||
| private bool _verifyOnly; | ||
| /// <summary> | ||
| /// When set, the manager always builds the materialised GeneratedBlockAccessList even on |
There was a problem hiding this comment.
materialise -> construct, better wording
| _nonce = Lane<ulong>.CreateMutableLike(suggested._nonce); | ||
| _code = Lane<ValueHash256>.CreateMutableLike(suggested._code); | ||
| _storage = StorageLane.CreateMutableLike(suggested._storage); | ||
| // +25% slack: reads can over-count due to per-tx duplication before dedup, and invalid |
| /// Must only be called on the mutable (generated) index. <c>_generatedStorageReads</c> and | ||
| /// <c>_generatedStorageWrites</c> are null on the immutable (suggested) side. | ||
| /// </remarks> | ||
| public StructuralMismatchKind FindStructuralMismatch(ReadOnlyBlockAccessList suggested, out Address? mismatchAddress) |
There was a problem hiding this comment.
make method shorter by breaking into smaller helpers?
| ReadOnlySpan<UInt256> sugReads = sug.StorageReads; | ||
|
|
||
| // Lockstep walk: yield read slots that are NOT shadowed by a write at the same ordinal. | ||
| int wi = 0; |
There was a problem hiding this comment.
more descriptive variable names
| // Binary search for any entry with this ordinal — we only need presence so check the | ||
| // (ordinal, default-slot) pair's bracketing. | ||
| int lo = 0, hi = reads.Length - 1; | ||
| while (lo <= hi) |
There was a problem hiding this comment.
can we reuse a standard BinarySearch method?
* Drop the _verifyOnly bool — it was always equal to ParallelExecutionEnabled in this PR. Use the property directly. (#11661 reintroduces a meaningful variant of this flag via ForceMaterializeGeneratedBlockAccessList, where it actually differs from ParallelExecutionEnabled.) * Move ValidateStructuralEquivalence from BlockAccessListManager.StateChanges to BlockAccessListManager.Validation — it belongs with the other validation helpers, not the state-mutation helpers. * Document on ValidateStructuralEquivalence why StorageChanges (writes) are not compared: they're fully covered by the per-tx ChangesAtIndexEqual lane comparisons that the incremental validation already runs at every index. * Revert the stylistic line reordering in GethGenesisLoader.cs — unrelated to this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The base branch was changed.
Merge conflicts: kept HEAD's verify-only skip-merge structure (VerifyOnly property, conditional GBAL assignment) and moved ValidateStructuralEquivalence into Validation.cs (its column-index variant using _generatedValidationIndex replaces master's GBAL-driven helper). Review fixes: * Marchhill — rename ForceMaterializeGeneratedBlockAccessList → ForceConstructGeneratedBlockAccessList and surrounding 'materialise/d' comments → 'construct/ed'. * Marchhill — 25% slack on the generated-side storage lists is now a named GeneratedListSlackShift constant via a WithSlack helper. * Marchhill — drop 'legacy' wording in code paths that don't exist any more. * Marchhill — ValidateBlockAccessList split into TryFastPath / SlowPathFromColumnIndex / SlowPathFromGeneratedBlockAccessList. * Marchhill — FindStructuralMismatch split: per-ordinal run extraction (TakeOrdinalRun) and per-account compare (CompareStorageReadsForAccount). * Marchhill — descriptive variable names in the structural-mismatch walk (writeIdx/suggestedIdx/readIdx instead of wi/sugIdx/ri). * Marchhill — HasStorageReadsForOrdinal now uses Span.BinarySearch with an OrdinalOnlyComparer instead of an inline manual binary search. * claude[bot] — FindStructuralMismatch returns the generated account count via an out param so ValidateStructuralEquivalence doesn't recompute it for the AccountCountMismatch error string. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@claude review |
|
Claude finished @LukaszRozmej's task in 5m 42s —— View job Review: perf(bal) — account-presence bitmap + storage_reads on column index
Status of all prior findings
Marchhill comment status
Correctness verification (fresh pass)
One new Low (informational)
After while (idx > 0 && ordinals[idx - 1] == ordinal) idx--;This is O(k) where k = storage-write entries for that account at that row. For typical blocks k ≤ ~5, but pathological contracts could force linear scans. A proper VerdictAll Critical / High / Medium findings from all prior reviews are resolved. All prior Low findings are resolved or mitigated. Marchhill's style suggestions are mostly addressed. One new Low (informational) finding — no correctness impact. Mergeable. |
…und binary searches The previous code did one BinarySearch (any-match) then linearly stepped back to the first matching ordinal, which is O(k) in the worst case where k is the storage-write count for that account at the row. Replace with two LowerBound searches (ordinal and ordinal+1) — O(log n) total regardless of k.
What this PR does
On the parallel-validation path, skip the per-tx
GeneratedBlockAccessList.Mergeentirely when verify-only mode is active. This builds on the encode+Keccak skip that already shipped in #11659. Together with that PR, the verify-only path now performs:Add(slice)+target.Merge(slice)(thousands ofGeneratedAccountChanges/GeneratedSlotChanges/List/SortedSetallocations per block)Add(slice)onlyGeneratedBlockAccessList.AccountChangesFindStructuralMismatchover column-index dataGeneratedBlockAccessList.AccountChangesEnumerateMarkedOrdinals+ per-row lane lookupsGeneratedBlockAccessListstays untouched on the building path and on the non-verify validation path (sequential, or with the BAL recorder active).Allocation savings (parallel validation, per block)
GeneratedAccountChanges(one per touched account)GeneratedSlotChanges(one per changed slot)SortedDictionary<UInt256, GeneratedSlotChanges>List<BalanceChange / NonceChange / CodeChange>SortedSet<UInt256>for storage_readsList<(int, UInt256)>for generated storage_readsNet: ~10000 fewer small allocations per validated block on the parallel path. The flat storage_reads buffer is one allocation that grows in place; sorted+deduped lazily on first structural-equivalence query.
Key data-structure additions
BlockAccessListValidationIndex:_hasAccountWordsbitmap populated inAdd(slice)(mirrors the suggested-sideBuildbehaviour).List<(int Ordinal, UInt256 Slot)> _generatedStorageReadswith lazy sort+dedup. Replaces the previous Dictionary<int, SortedSet> design.MarkedAccountCount(popcount),HasChangesAtRow,ChangesAtRowEqualForOrdinal,HasStorageReadsForOrdinal,EnumerateMarkedOrdinals(bit-scan iterator),AddressOf(ordinal).FindStructuralMismatch(suggested, out address)returning aStructuralMismatchKindenum — single-pass walk that produces the end-of-block verdict.BlockAccessListValidationIndex.Lane<T>:HasAt(row, ordinal)/TryGetAt(row, ordinal, out value)— binary-search the row's sorted ordinal array.BlockAccessListValidationIndex.StorageLane:HasAt(row, ordinal)/SlotsEqualAt(other, row, ordinal)— per-ordinal (slot, value) range compare using a privateGetOrdinalRangehelper.BlockAccessListValidationIndex.AddressIndex:Address[]indexed by ordinal, populated inGetOrAdd. Used byAddressOffor diagnostic error messages.How verify-only fits together now
Testing
Nethermind.Core.TestNethermind.State.TestNethermind.Consensus.TestNethermind.Evm.TestNethermind.Blockchain.TestNethermind.Merge.Plugin.TestNethermind.BalRecorder.Test12,458 passed, 0 failed. Includes the engine-API rejection tests for
IncorrectChange/MissingChange/SurplusChange/SurplusReadserrorKinds — all hit the new column-index slow path and produce the same precise per-account error messages as the legacy path.Full solution build clean;
dotnet format --verify-no-changesclean.Review focus
This is still a draft because the surface area is large and the design deserves a careful read:
FindStructuralMismatchAPI shape —out Address? mismatchAddress+StructuralMismatchKindenum. Alternative: throw inline (couples toInvalidBlockLevelAccessListException).SlowPathFromColumnIndexcorrectness vs the legacy walk — the tolerance check at index 0 + SystemUser + no-reads path is delicate; please verify the column-index translation is faithful.EnumerateMarkedOrdinalsallocates an iterator state machine. Hot? Only runs on validation failures or initial-index-0 walks, so probably fine. Could be replaced with a struct enumerator if needed.AddressIndexreverse lookup adds aList<Address>that grows withGetOrAdd. Cheap but worth flagging.