Skip to content

perf(bal): replace SortedDictionary/SortedSet aggregates with Dictionary/HashSet#11672

Merged
LukaszRozmej merged 245 commits into
masterfrom
perf/bal-dict-aggregator
May 21, 2026
Merged

perf(bal): replace SortedDictionary/SortedSet aggregates with Dictionary/HashSet#11672
LukaszRozmej merged 245 commits into
masterfrom
perf/bal-dict-aggregator

Conversation

@LukaszRozmej

Copy link
Copy Markdown
Member

Summary

  • GeneratedBlockAccessList / GeneratedAccountChanges aggregates were SortedDictionary<Address, …>, SortedDictionary<UInt256, …>, and SortedSet<UInt256> purely to satisfy the EIP-7928 sorted-output requirement at encode time, at the cost of O(log n) on every per-tx Merge across three collections.
  • Switched the aggregates to plain Dictionary / HashSet so per-tx merges (sequential execution, BAL recorder, block building) hit O(1) lookups/inserts, and added GetSortedAccountChanges / GetSortedStorageChanges / GetSortedStorageReads accessors that materialise sorted snapshots once at the encoder boundary.
  • SlotChangesAtIndexEqual no longer requires both sides to iterate sorted; it walks the generated side (unsorted) and binary-searches the suggested side's ReadOnlySlotChanges[] (sorted-by-key, decoder-validated). Symmetry is enforced by counting matches on both sides.
  • Wire BAL bytes (and the BAL hash) are unchanged — encoders consume the sorted snapshots, so legacy compatibility holds. Internal iteration (validation, item count, HasAccount) doesn't care about order.

Test plan

  • dotnet build src/Nethermind/Nethermind.slnx -c release
  • dotnet test --project src/Nethermind/Nethermind.Core.Test/Nethermind.Core.Test.csproj -c release
  • dotnet test --project src/Nethermind/Nethermind.Consensus.Test/Nethermind.Consensus.Test.csproj -c release
  • dotnet test --project src/Nethermind/Nethermind.Merge.Plugin.Test/Nethermind.Merge.Plugin.Test.csproj -c release
  • dotnet test --project src/Nethermind/Nethermind.BalRecorder.Test/Nethermind.BalRecorder.Test.csproj -c release
  • dotnet format whitespace src/Nethermind/ --folder --verify-no-changes

Generated with Claude Code

flcl42 and others added 30 commits April 30, 2026 13:49
)

* fix(bal): align EIP-7928 BlockAccessIndex with uint32 spec, add validation

Widens BlockAccessIndex from uint16 to uint32 per EIP-7928 (commit 645099785a)
and geth bal-devnet-4. Hardens the BAL decoder so truncated/malformed wire
bytes produce a clean RlpException at engine_newPayloadV5 instead of crashing.
Adds the missing validation rules geth bal-devnet-4 enforces: empty
storage_changes per slot is rejected, and BlockAccessIndex is bounded by
[0, txCount + 1].

Decoder primitives:
- New Rlp.ValueDecoderContext.DecodeUInt() / RlpStream.Encode(uint) /
  Rlp.LengthOf(uint) helpers.
- Rlp.Decode<T> and Rlp.DecodeArrayPool<T> wrap IndexOutOfRangeException /
  ArgumentOutOfRangeException as RlpException so any truncated-RLP primitive
  read surfaces consistently.
- BlockAccessListDecoder rejects an empty AccountChanges entry (0xc0 inside
  the outer list) and SlotChangesDecoder rejects an empty StorageChange list
  for a slot — geth bal-devnet-4 "empty storage writes" parity.

Type widening:
- IIndexedChange.Index, BalanceChange/NonceChange/CodeChange/StorageChange.Index,
  BlockAccessList.Index and BlockAccessIndex on the Change journal record all
  become uint. SortedList<int, T> keys flip to SortedList<uint, T>; ushort
  query parameters on AccountChanges become uint.

Prestate sentinel preservation:
- Replaces the legacy -1 int sentinel with Eip7928Constants.PrestateIndex
  (uint.MaxValue). Adds PrestateAwareIndexComparer that orders the sentinel
  before all real indices, restoring the iteration semantics that
  AccountChanges.GetBalance/GetNonce/GetCode/AccountExists and
  ApplyStateChanges' [^1].Index check rely on.
- Decoder-built SortedLists also use the prestate-aware comparer so that
  LoadPreStateToSuggestedBlockAccessList grafting prestate onto the suggested
  BAL after decode keeps it sorted first.
- Loop predicates that compare change.Key directly against blockAccessIndex
  explicitly skip PrestateIndex so the raw uint comparison doesn't trigger
  on the huge sentinel value.

Validation:
- BlockValidator gains ValidateBlockLevelAccessListIndexBounds enforcing
  index <= txCount + 1 (mirrors geth's index < txCount + 2 check) with a new
  BlockLevelAccessListIndexOutOfRange error message.

Tests:
- New PrestateAwareIndexComparerTests, AccountChangesPrestateTests covering
  the comparer and prestate-fallback iteration semantics.
- BlockAccessListDecoderTests adds: empty-bytes / truncated-outer-list /
  inner-empty-list throw RlpException; empty storage_changes per slot throw;
  decoded SlotChanges accepts a later prestate graft as first; BalanceChange
  round-trips for indices 0x10_0000 and uint.MaxValue.
- BlockValidatorTests adds tx-index bound cases (0, 1 valid; 2,
  uint.MaxValue rejected for a 0-tx block).
- ExecutionPayloadV4Tests covers the engine-API decoding-error path for
  malformed BAL bytes.

* style: drop unused usings flagged by IDE0005

CI surfaced four IDE0005 warnings (treated as errors):
- BlockAccessListManager: drop `using Nethermind.Crypto;` —
  Keccak.OfAnEmptySequenceRlp comes from Nethermind.Core.Crypto, already imported.
- PrestateAwareIndexComparerTests / AccountChangesPrestateTests: drop
  `using Nethermind.Core;` — Eip7928Constants resolves via the test's parent
  namespace Nethermind.Core.Test.BlockAccessLists.
- Eip8037Tests: drop `using System;` — no System.* type referenced directly.

* fix(bal): record system pre/post-block SSTOREs as reads per EIP-7928 v5.7.0

EIP-7928 v5.7.0 specifies that SSTOREs performed during system pre-block calls
(EIP-2935 BlockHashHistory, EIP-4788 BeaconRootContract) and post-block calls
(EIP-7002 withdrawal requests, EIP-7251 consolidation requests) are recorded
in the BAL as storage reads on the touched slot — not as storage changes with
post-values. Same-value writes are skipped entirely. Without this, nethermind
generated a BAL whose Keccak256 differs from what eels-built fixtures expect,
so the BlockAccessListHash check fails for every Amsterdam block that touches
a system contract slot (most pyspec tests).

IWorldState gains an opt-in scope:

  IDisposable? BeginSystemPreBlockScope()

TracedAccessWorldState implements it via an int depth counter. While depth > 0,
Set(storageCell, value) reclassifies the recording: AddStorageRead when the
slot value would change, no-op when unchanged. The state mutation still
applies via the inner world state.

BlockAccessListManager wraps the three system contract entry points with the
scope: StoreBeaconRoot (EIP-4788 system tx), ApplyBlockhashStateChanges
(EIP-2935 fast-path Set), and ProcessExecutionRequests (EIP-7002 / EIP-7251
post-block system txs).

Parallel-mode state application:

In parallel processing, non-system slots wrap stateProvider with
BlockAccessListBasedWorldState whose Set is a no-op — actual state mutation
relies on ApplyStateChanges replaying the suggested BAL's storage_changes.
With the spec-correct BAL containing reads instead of changes for the system
slots, ApplyStateChanges has nothing to replay for them, so the canonical
state would diverge.

TxProcessorWithWorldState gains an `isSystemSlot` parameter. The
ParallelTxProcessorWithWorldStateManager passes `isSystemSlot: i == 0 || i ==
_len - 1` (pre-execution and post-execution slots). For those slots, the
TracedAccessWorldState wraps stateProvider directly, so system pre/post-block
SSTOREs mutate the canonical state regardless of BAL recording. Tx slots
(1..n) keep the BAL-backed wrapping unchanged.

Sequential pyspec tests (which is what the Ethereum.Blockchain.Pyspec.Test
suite runs) are unaffected by the parallel slot change but benefit from the
BAL recording fix; the BlockAccessListHash check now passes for blocks that
previously failed solely on system pre-block storage encoding.

Note: a residual InvalidStateRoot mismatch remains on a subset of Amsterdam
pyspec tests (~70/360 ecrecover_weird_v + a similar slice of stMemoryStress).
These were previously masked by the BAL hash error firing first. The
remaining state divergence appears unrelated to BAL recording and is left
for follow-up.

* test(jsonrpc): update Eth_get_block_access_list_by_* fixtures for EIP-7928 v5.7.0

The Eth_get_block_access_list_by_hash and _by_number tests had hardcoded the
pre-v5.7.0 shape, recording the EIP-2935 BlockHashHistory system pre-block
SSTORE as a storageChanges entry with the post-value. v5.7.0 records system
pre-block SSTOREs as storageReads (slot key only).

Updated both expected JSON strings to match the new spec-compliant output.

* Revert "fix(bal): record system pre/post-block SSTOREs as reads per EIP-7928 v5.7.0"

This reverts commit 364f294.

* Revert "test(jsonrpc): update Eth_get_block_access_list_by_* fixtures for EIP-7928 v5.7.0"

This reverts commit 937ca5d.

* review: address PR #11362 feedback

- Rlp.DecodeArrayPool<T>: dispose partially-allocated ArrayPoolList<T>
  before wrapping IndexOutOfRangeException/ArgumentOutOfRangeException
  as RlpException so the rented buffer is returned to the pool.
- Rlp.ValueDecoderContext.DecodeUInt(): collapse case 0 to a single
  `return RlpHelpers.ThrowNonCanonicalInteger(Position)` (DoesNotReturn,
  uint) to match DecodeInt and drop the dead `return default`.
- BlockAccessListManager.GetPostExecution(): use uint.MaxValue literal
  to match the uint? balIndex parameter.
- AccountChanges.SlotChangesAtIndex: build the returned SortedList with
  PrestateAwareIndexComparer.Instance so a later prestate graft sorts
  first, matching the rest of the codebase.
- PrestateAwareIndexComparer xmldoc: clarify that decoded BALs also use
  this comparer (so later prestate grafting preserves order).

* test(pyspec): skip EELS bal@v5.7.0 ported_static fixtures with legacy state-test conversion bug

EELS's `from_state_test` conversion for ported_static tests omits the
EIP-2935 / EIP-4788 system pre-block SSTORE entries from the suggested
BAL, while a real client (and Nethermind) executes them — so every such
fixture's BlockAccessListHash diverges from what we compute. 91 such
tests were the entire residual pyspec failure set on the bal-devnet-4a
branch.

Detect the conversion via the legacy difficulty value 0x20000 baked
into the post-merge mixHash field — real prevRandao would never be
exactly 0x...020000 — and Assert.Ignore those tests.

Track upstream EELS fix; remove the guard once bal@>v5.7.0 ships with
the system pre-block SSTOREs included in the BAL.

* fix(pyspec test): drop ?-annotation in non-nullable file

CS8632: PyspecTestFixture.cs is not under `#nullable enable`, so the
`string?` introduced in 6fb6527 broke the build of every pyspec
job. `string` works the same here — the value already gets a null
check on the next line.

* test(pyspec): also skip blockchain_test_engine_from_state_test variant

The first guard only walked test.Blocks, which is null for engine
fixtures. Engine fixtures keep the same legacy 0x...020000 sentinel,
just on the JSON `prevRandao` field of the engine_newPayload params.
Walk EngineNewPayloads too.

* Update tests

* fix(eip-8037): pin cost_per_state_byte at static 1174 for bal-devnet-4

bal-devnet-4 keeps cost_per_state_byte static at 1174 (carried over from
bal-devnet-3), removing the per-block-gas-limit scaling formula that an
earlier draft of EIP-8037 specified. snøbal-devnet-4 fixtures encode the
same gas usage (63574) at 1M / 5M / 10M / 30M block gas limits, confirming
the value is now invariant across blockGasLimit.

Reduce CalculateCostPerStateByte to a direct return of CostPerStateByte
and drop the dynamic-quantization helpers and BitOperations dependency.
The blockGasLimit parameter is retained on call sites in case a future
devnet revisits per-block scaling. Update the EIP-8037 unit test to pin
the static behaviour rather than asserting quantized values.
- Replace SortedDictionary with Dictionary in BlockAccessList for O(1)
  account lookups on the EVM hot path (was O(log n) with 20-byte span
  comparisons). Sorted enumeration preserved for encoding/validation.
- Merge TryGetDelegation + GetCachedCodeInfo into a single call in
  InstructionCall, eliminating a redundant GetCodeHash per CALL opcode.
- Inline IsDeadAccount in EXTCODEHASH to avoid a second GetCodeHash
  call for the same address.
commit 3a3078f
Author: Marc Harvey-Hill <10379486+Marchhill@users.noreply.github.com>
Date:   Tue Apr 28 18:23:03 2026 +0200

    delete old tests

commit 95de888
Merge: 244c2c6 31a673a
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 18:08:12 2026 +0200

    Merge branch 'master' into engine-api-glamsterdam-cleanup

commit 244c2c6
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 18:05:32 2026 +0200

    fix lint

commit 9194b8f
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 18:04:19 2026 +0200

    remove tests

commit eedfbb7
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 17:57:46 2026 +0200

    revert formatting changes and zero hash stuff

commit dc71aa5
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 17:38:42 2026 +0200

    cleanups

commit ec2fce3
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 17:33:11 2026 +0200

    fixing

commit 90a1da8
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 17:15:58 2026 +0200

    remove test

commit 4451656
Author: MarekM25 <marekm2504@gmail.com>
Date:   Tue Apr 28 17:12:32 2026 +0200

    Cleanup mess

commit 2425748
Merge: 4a90460 a36154c
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 27 14:33:36 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 4a90460
Merge: 3549768 18d60a4
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 27 14:08:25 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 3549768
Merge: 30fcc36 b71c352
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 27 14:07:39 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 30fcc36
Merge: 540e4a2 ed6a354
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 27 13:09:48 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 540e4a2
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 04:52:49 2026 +0300

    minor fixes

commit 010547b
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 03:22:39 2026 +0300

    test fixes

commit 6fcc136
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 02:11:12 2026 +0300

    fixes

commit 0bccb06
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 01:52:02 2026 +0300

    test fix

commit 830c578
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 01:38:55 2026 +0300

    fix tests

commit 1f5fd2c
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 01:35:17 2026 +0300

    claude review again

commit d87491f
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 00:57:57 2026 +0300

    fixes

commit dd431e2
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sun Apr 26 00:51:52 2026 +0300

    fix tests

commit 71dc99f
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sat Apr 25 22:40:50 2026 +0300

    fixes

commit 9c03d12
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sat Apr 25 22:32:58 2026 +0300

    update per 786

commit a4c4f3c
Merge: fc49e7e 9cba44c
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Sat Apr 25 21:38:25 2026 +0300

    Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam

commit fc49e7e
Merge: bca0c63 42c551f
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 20 16:14:33 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit bca0c63
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Mon Apr 20 16:14:10 2026 +0300

    remove unused import

commit 3d2c973
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Mon Apr 20 16:09:17 2026 +0300

    fix taiko tests

commit b30ef5c
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Mon Apr 20 15:34:35 2026 +0300

    fix tests

commit 6f131cd
Merge: a8c884a 0fe41f4
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 20 14:55:46 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit a8c884a
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Mon Apr 20 14:54:40 2026 +0300

    conflicts fix

commit 96eb912
Merge: 82a1f70 425a2a3
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Mon Apr 20 14:52:33 2026 +0300

    Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam

    # Conflicts:
    #	src/Nethermind/Nethermind.Merge.Plugin/BlockTreeExtensions.cs
    #	src/Nethermind/Nethermind.Merge.Plugin/Handlers/ForkchoiceUpdatedHandler.cs
    #	src/Nethermind/Nethermind.Taiko/Rpc/TaikoForkchoiceUpdatedHandler.cs

commit 82a1f70
Merge: 7b6133f 436c65b
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Mon Apr 20 13:16:55 2026 +0300

    Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam

commit 7b6133f
Merge: 2f068ec 02c2026
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Thu Apr 16 14:49:46 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 2f068ec
Merge: bdd7ee5 5464c8b
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Thu Apr 16 14:10:02 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit bdd7ee5
Merge: f09cb41 bfaaeb0
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Wed Apr 15 16:45:19 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit f09cb41
Merge: 370acdf cf03d18
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Wed Apr 15 16:24:17 2026 +0300

    Merge remote-tracking branch 'origin/master' into engine-api-glamsterdam

commit 370acdf
Merge: d1d0370 6cd0ea4
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Mon Apr 13 14:28:21 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit d1d0370
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Fri Apr 10 17:30:54 2026 +0300

    more fixes

commit c99d6cc
Merge: 4d0f215 a52cb90
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Fri Apr 10 17:24:06 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 4d0f215
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Fri Apr 10 17:23:42 2026 +0300

    claude review

commit cb6defe
Merge: af63142 0057bd8
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Fri Apr 10 12:34:37 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit af63142
Merge: 1b338f3 2f24891
Author: Stavros Vlachakis <89769224+svlachakis@users.noreply.github.com>
Date:   Fri Apr 10 12:08:36 2026 +0300

    Merge branch 'master' into engine-api-glamsterdam

commit 1b338f3
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Fri Apr 10 12:08:13 2026 +0300

    claude review

commit 619259b
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Thu Apr 9 18:32:59 2026 +0300

    taiko fix

commit d067793
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Thu Apr 9 18:19:14 2026 +0300

    cleaner design

commit 1e27be1
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Thu Apr 9 18:12:04 2026 +0300

    improve tests matching the specs

commit a77e182
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Thu Apr 9 18:04:58 2026 +0300

    improve test

commit 4ba069c
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Thu Apr 9 18:01:59 2026 +0300

    fix comment

commit fe6f9b0
Author: stavrosvl7 <stavrosvl7@gmail.com>
Date:   Thu Apr 9 17:53:57 2026 +0300

    engine api changes as per ethereum/execution-apis#770
                              ethereum/execution-apis#760
…halt

Top-level REVERT preserves gas_left, so the spilled portion of state_gas_used
(originally drawn from gas_left) is still in the user's pocket and must be
refunded to the reservoir alongside the reservoir-funded portion. Top-level
halt burns gas_left, so the spilled portion was paid out of gas the user can
no longer reclaim — only the reservoir-funded portion is restorable.

Splits RefundRevertedTopLevelStateGas into a revert path (full refund) and a
new RefundHaltedTopLevelStateGas (reservoir-only, spill discarded) and wires
the halt error sites to the latter.
The pyspec fixture archive ships three uncovered directories: a stray state-
tests transition fork, plus two test types we never wired in. Adds:

- CancunToPragueAtTime15kStateTests — fills a one-class gap in Tests.cs
- PyspecSyncBlockchainTestFixture + AmsterdamSyncBlockchainTests,
  OsakaSyncBlockchainTests — runs blockchain_tests_sync fixtures through the
  engine harness; the additional syncPayload field is left for a follow-up
- TestType.Transaction + TransactionTest/Json/Base + ConvertTransactionTests
  + PyspecTransactionTestFixture — decodes raw txbytes via Rlp.Decode and
  matches expected EEST exception tokens (TYPE_4_*) against the validator's
  ValidationResult.Error or RLP decode message; covers AmsterdamTxTests,
  OsakaTxTests, PragueTxTests fork directories

Bumps FlatDB pyspec chunking from 4 to 16 to match the regular workflow —
~860 tests/shard instead of ~3,437/shard, well under the 256/matrix cap and
the 20-minute job timeout.
Marchhill and others added 4 commits May 18, 2026 14:39
… with master

Three behavior-preserving cleanups to shrink the diff against master:

1. GetLength overloads moved before Decode (master's order).
2. Helper locals renamed back to master's contentLength / accountLength /
   accountChangesDecoder from the earlier int len / int total / decoder
   re-namings.
3. Static EncodeToBytes drops the try/finally — master doesn't wrap it
   either (instance Encode keeps try/finally as master does, so the
   internal inconsistency now matches master exactly).

Diff vs master: 183 → 149 lines (~18% reduction). 51 BAL Core tests
pass; no behavioural change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…es with master

Three behavior-preserving reductions to BlockAccessListBasedWorldState's
diff vs master, keeping (a) the ref-readonly UInt256/ValueHash256 interface
returns and (b) the TracedAccessWorldState SLOAD allocation savings:

(2) Restore _codeChangesByHash + BuildCodeChangesByHash + TryGetDeclaredCode
    + functional GetCode(in ValueHash256). GetCode-by-hash is no longer a
    null stub.

(3) Re-merge Get/GetOriginal to master's twin-body shape using
    _readScratch/_originalScratch EvmWord fields. Master's pattern is
    already zero-allocation here (MemoryMarshal.CreateReadOnlySpan +
    WithoutLeadingZeros, no .ToArray()), so dropping the unified
    GetAtCurrentIndex helper loses no perf — the byte[32] reuse only
    saved an allocation in TracedAccessWorldState.GetInternal, which is
    untouched.

(4) Re-collapse ResolveAccountChanges + GetParentReader callers back to
    ResolveContext tuple. Master's pattern.

ReadOnlySlotChanges: the TryGetLastBefore(uint, Span<byte>, out span)
buffer overload's only consumer was BlockAccessListBasedWorldState.Get/
GetOriginal; reverted to master's TryGetLastBefore(uint, out StorageChange)
signature. The Get(uint, buffer) wrapper goes away too.

Kept on the branch:
  * ref readonly UInt256 GetBalance / ref readonly ValueHash256 GetCodeHash
    on IWorldState (with _scratchBalance/_scratchCodeHash fields).
  * _scratchStorage byte[32] in TracedAccessWorldState.GetInternal.
  * ReadOnlyBlockAccessList / ReadOnlyAccountChanges / ReadOnlySlotChanges
    type substitutions.

Diff vs master in BlockAccessListBasedWorldState: 237 -> 66 lines (~72%
reduction). All 18 BlockAccessListBasedWorldStateTests pass, 51 BAL
Core.Test pass, 66/66 Bpo2ToAmsterdam pyspec tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
M1: ReadOnlySlotChanges.Equals used Changes.SequenceEqual(...) with
    using System.Linq in scope. Since Changes is StorageChange[], that
    bound to Enumerable.SequenceEqual<T> (allocates an iterator) rather
    than MemoryExtensions.SequenceEqual<T> on ReadOnlySpan<T>. Same
    pattern that was fixed in ReadOnlyAccountChanges.Equals (M1 round 1);
    missed it on the slot-changes file. Force the span overload via
    (ReadOnlySpan<StorageChange>)Changes and drop the now-unused
    System.Linq import.

L1: Drop the "// Otherwise, push the account's code hash." noise
    comment from InstructionExtCodeHash — it restates the next line.
    Came back when EXTCODEHASH was reverted to master's body; cleaner
    to remove on both sides per coding-style.md ("Comments that merely
    restate the code are noise — don't add them, and remove them when
    you encounter them"). The comment above it stays — that one earns
    its place via the EIP-1052 spec reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ary/HashSet

GeneratedBlockAccessList and GeneratedAccountChanges keep per-account and per-slot
collections sorted only because the RLP encoder needs sorted output (EIP-7928).
Building the sort incrementally costs O(log n) on every per-tx Merge across three
collections; switching to Dictionary/HashSet defers ordering to a one-shot sort at
the encoder boundary.

Adds GetSortedAccountChanges/GetSortedStorageChanges/GetSortedStorageReads for the
encoder, switches the BlockAccessListDecoder/AccountChangesDecoder encode paths to
them, and refactors SlotChangesAtIndexEqual from a lockstep merge over two sorted
streams to a generated-side walk + binary search into the sorted suggested side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented May 19, 2026

Copy link
Copy Markdown
Contributor

Claude finished @LukaszRozmej's task in 4m 46s —— View job


Review

  • Read changed files and relevant rules
  • Analyze correctness of SlotChangesAtIndexEqual rewrite
  • Verify sort/encode boundary correctness
  • Check performance claims and patterns
  • Review decoder changes
  • Post findings

Summary

The optimization is correct and well-designed. Moving the sort from per-tx merge time to once-at-encode-boundary is a clean improvement; the EIP-7928 wire format is preserved. Two low-severity findings below (one pre-existing), no blockers.


Finding 1 — Low: SlotChangesAtIndexEqual second loop misses an early-exit

The second loop always walks all otherSlots, even after othMatches exceeds genMatches, at which point the result is definitively false. Since genMatches ≤ othMatches is an invariant (each gen match maps to a distinct other match via exact-key binary search), the loop can bail out the moment the count would exceed genMatches.

GeneratedAccountChanges.cs:137–143

int othMatches = 0;
for (int j = 0; j < otherSlots.Length; j++)
{
    if (TryGetSlotChangeAtIndex(otherSlots[j], index, out _)) othMatches++;
}
return genMatches == othMatches;

Suggested fix:

int othMatches = 0;
for (int j = 0; j < otherSlots.Length; j++)
{
    if (TryGetSlotChangeAtIndex(otherSlots[j], index, out _) && ++othMatches > genMatches)
        return false;
}
return genMatches == othMatches; // othMatches <= genMatches at this point; true iff equal

Finding 2 — Low (pre-existing): EncodeToBytes leaks rented array on exception

Both EncodeToBytes overloads in BlockAccessListDecoder lack try/finally around ArrayPool.Return. The Encode(RlpStream, …) overloads already do this correctly. Example:

// EncodeToBytes(GeneratedBlockAccessList) — no try/finally
AccountChangesDecoder.EncodingLengths[] accountLengths = ArrayPool<...>.Shared.Rent(sortedAccounts.Length);
PrepareGeneratedLengths(...);
RlpStream stream = new(Rlp.LengthOfSequence(contentLength));  // OOM here leaks the array
EncodeGeneratedPrepared(stream, ...);
byte[] result = stream.Data.ToArray();
ArrayPool<...>.Shared.Return(accountLengths);  // not reached on exception

This existed before this PR, so it's not a regression introduced here. Still worth fixing (wrapping with try/finally) in a follow-up.


Correctness verdict

SlotChangesAtIndexEqual algorithm is sound. The proof:

  • For each gen slot G_k with a change at index, the binary search finds the unique other slot with the same key (or returns false). Keys are unique on both sides (Dictionary on gen; decoder-validated strict monotone order on other), so the gen→other mapping is injective.
  • genMatches ≤ othMatches always holds as a consequence.
  • genMatches == othMatches holds iff every other slot with a change at index was matched — i.e., the slot sets are equal.

Sort/encode boundary is correct. PrepareEncodingLengths uses the unsorted Dictionary/HashSet to compute total byte counts (order-independent: Σ len(item_i) is commutative). EncodePrepared sorts and encodes in EIP-7928 order. The pre-computed content lengths are valid because the RLP size of each item is independent of its position in the sequence.

Wire format unchanged. Encoders receive sorted arrays via GetSortedAccountChanges() / GetSortedStorageChanges() / GetSortedStorageReads(), so the output bytes are identical to the old path.


Comment thread src/Nethermind/Nethermind.Core/BlockAccessLists/GeneratedAccountChanges.cs Outdated
Base automatically changed from bal-devnet-7-benchmarks to master May 19, 2026 12:06
…gator

# Conflicts:
#	src/Nethermind/Nethermind.Consensus.Test/BlockAccessListValidationIndexTests.cs
#	src/Nethermind/Nethermind.Consensus/Processing/BlockAccessListManager.StateChanges.cs
#	src/Nethermind/Nethermind.Consensus/Processing/BlockAccessListManager.TxProcessorPool.cs
#	src/Nethermind/Nethermind.Consensus/Processing/BlockAccessListManager.Validation.cs
#	src/Nethermind/Nethermind.Consensus/Processing/BlockAccessListManager.cs
#	src/Nethermind/Nethermind.Consensus/Processing/BlockAccessListValidationIndex.cs
#	src/Nethermind/Nethermind.Core.Test/BlockAccessLists/BlockAccessListItemCountTests.cs
#	src/Nethermind/Nethermind.Core.Test/BlockAccessLists/BlockAccessListJournalTests.cs
#	src/Nethermind/Nethermind.Core.Test/BlockAccessLists/ReadOnlyAccountChangesLookupTests.cs
#	src/Nethermind/Nethermind.Core/BlockAccessLists/BlockAccessListAtIndex.cs
#	src/Nethermind/Nethermind.Core/BlockAccessLists/GeneratedAccountChanges.cs
#	src/Nethermind/Nethermind.Core/BlockAccessLists/GeneratedBlockAccessList.cs
#	src/Nethermind/Nethermind.Core/BlockAccessLists/ReadOnlyAccountChanges.cs
#	src/Nethermind/Nethermind.Core/BlockAccessLists/ReadOnlyBlockAccessList.cs
#	src/Nethermind/Nethermind.Core/BlockAccessLists/StorageChangesByIndexConverter.cs
#	src/Nethermind/Nethermind.Evm.Test/Eip7928Tests.cs
#	src/Nethermind/Nethermind.Evm/TransactionProcessing/TransactionProcessor.cs
#	src/Nethermind/Nethermind.Serialization.Rlp/Eip7928/AccountChangesDecoder.cs
#	src/Nethermind/Nethermind.Serialization.Rlp/Eip7928/BlockAccessListDecoder.cs
#	src/Nethermind/Nethermind.Specs/ChainSpecStyle/GethGenesisLoader.cs
#	src/Nethermind/Nethermind.State.Test/BlockAccessListBasedWorldStateTests.cs
#	src/Nethermind/Nethermind.State.Test/TracedAccessWorldStateTests.cs
#	src/Nethermind/Nethermind.State/TracedAccessWorldState.cs
LukaszRozmej and others added 2 commits May 20, 2026 15:40
…per-tx side

- GetSortedStorage{Changes,Reads}/GetSortedAccountChanges return ArrayPoolListRef<T>
  so encoder/ToString rent from the pool instead of allocating fresh arrays each call
- Storage-reads sort takes GenericComparer.GetOptimized<UInt256>() to keep the
  AOT/ZK_EVM fallback that the SortedSet construction previously provided
- SlotChangesAtIndexEqual reverts to O(n+m) linear merge over a single pooled
  snapshot of the generated side (binary-search variant was O(n log m + m) plus
  a second scan to compare cardinalities)
- AccountChangesAtIndex (per-tx side) switches SortedDictionary/SortedSet to
  Dictionary/HashSet for the same O(1) win on the merge hot path
- BlockAccessListDecoder.EncodeToBytes wraps in try/finally so the rented
  EncodingLengths buffer is returned to the pool even if Prepare throws
- ToString drops the unnecessary (object[]) cast (string.Join<T> picks the
  IEnumerable<T> overload over params object?[])
- Trim verbose XML doc and inline comments per the review

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mark

- GeneratedSlotChanges/GeneratedAccountChanges implement IComparable<T> so
  sort sites use GenericComparer.GetOptimized<T>() (project pattern) instead
  of static Comparison<T> lambdas.
- ToString avoids ToArray() + string.Join allocation by walking the sorted
  span into the StringBuilder directly; constrained ToString skips boxing
  for value-type elements (UInt256).
- Add BlockAccessListBenchmarks under Nethermind.Benchmark covering per-tx
  Merge, EncodeToBytes, and end-to-end build+merge+encode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@LukaszRozmej

Copy link
Copy Markdown
Member Author

Benchmark: master vs PR

BlockAccessListBenchmarks (added in this PR, under Nethermind.Benchmark/Core/). InProcessNoEmit toolchain, BDN ShortRun, AMD Ryzen 7 5800X, .NET 10.0.5.

Scenarios: synthesise per-tx slices touching 2 EOAs (balance/nonce) + 1 contract with SlotsPerTx storage changes + 1 read, merge into a fresh GeneratedBlockAccessList, then BlockAccessListDecoder.EncodeToBytes.

TxCount × SlotsPerTx Method master PR Δ
50 × 4 BuildAndMerge 0.186 ms 0.195 ms +5% (noise)
50 × 4 EncodeToBytes 0.057 ms 0.046 ms −19%
50 × 4 BuildMergeAndEncode 0.291 ms 0.256 ms −12%
50 × 16 BuildAndMerge 0.500 ms 0.303 ms −40%
50 × 16 EncodeToBytes 0.186 ms 0.162 ms −13%
50 × 16 BuildMergeAndEncode 0.683 ms 0.488 ms −29%
200 × 4 BuildAndMerge 0.881 ms 0.702 ms −20%
200 × 4 EncodeToBytes 0.259 ms 0.210 ms −19%
200 × 4 BuildMergeAndEncode 1.198 ms 1.037 ms −13%
200 × 16 BuildAndMerge 1.904 ms 1.443 ms −24%
200 × 16 EncodeToBytes 0.960 ms 0.779 ms −19%
200 × 16 BuildMergeAndEncode 3.017 ms 2.195 ms −27%

End-to-end 12–29% faster across configs, growing with TxCount and SlotsPerTx. The per-tx merge alone is up to 40% faster on the heavier mix. EncodeToBytes is also faster — the one-shot sort at encode is cheaper than the running cost of a SortedDictionary. Allocations on par. The small 50×4 BuildAndMerge regression is within the ShortRun variance.

@LukaszRozmej LukaszRozmej marked this pull request as ready for review May 20, 2026 16:28
@LukaszRozmej LukaszRozmej requested a review from Marchhill as a code owner May 20, 2026 16:28
@LukaszRozmej LukaszRozmej requested a review from benaadams May 20, 2026 16:29
LukaszRozmej and others added 2 commits May 20, 2026 18:29
Restore the per-key optimisation lost when SortedDictionary/SortedSet became
Dictionary/HashSet — GenericEqualityComparer.GetOptimized<T>() is the
equivalent for hashing collections.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@benaadams benaadams left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird though I already removed the SortedSets etc

Comment thread src/Nethermind/Nethermind.Core/BlockAccessLists/GeneratedAccountChanges.cs Outdated
Comment thread src/Nethermind/Nethermind.Serialization.Rlp/Eip7928/BlockAccessListDecoder.cs Outdated
Address PR review:
- Convert inner while-walks in SlotChangesAtIndexEqual to for loops.
- Replace ArrayPool<EncodingLengths> rentals with ArrayPoolListRef so
  the encode paths consistently use the pooled-list helper; helpers
  now take Span<EncodingLengths>.
@LukaszRozmej LukaszRozmej merged commit 789cb76 into master May 21, 2026
542 checks passed
@LukaszRozmej LukaszRozmej deleted the perf/bal-dict-aggregator branch May 21, 2026 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants