Skip to content

Fix bal-devnet-2 system address filter and AccessedAddresses propagation#19628

Merged
mh0lt merged 6 commits into
mainfrom
fix/bal-system-address-filter
Mar 5, 2026
Merged

Fix bal-devnet-2 system address filter and AccessedAddresses propagation#19628
mh0lt merged 6 commits into
mainfrom
fix/bal-system-address-filter

Conversation

@mh0lt

@mh0lt mh0lt commented Mar 4, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes multiple issues with Block Access List (EIP-7928) computation that caused BAL hash mismatches on bal-devnet-2 when running with parallel execution (12 workers). These fixes were validated on the live devnet with 200+ consecutive blocks fork-validated without errors.

These represent errors encountered on the live devnet rather than in hive tests which don't include the same issues in combination.

Changes

1. Fix AccessedAddresses() dropping revertable flag (intra_block_state.go)

AccessedAddresses() was returning nil for every accessOptions entry instead of the actual options containing the revertable flag. This made it impossible for downstream BAL code to distinguish gas-calculation reads from real state access.

Before: out[addr] = nil
After: out[addr] = opts

2. System address filter using non-revertable access flag (versionedio.go)

Per EIP-7928, SYSTEM_ADDRESS (0xff...fe) "MUST NOT be included unless it experiences state access itself." The system address is touched during every block's system call (EIP-4788 beacon root) as msg.sender, creating incidental gas-calc reads.

Added a nonRevertableUserAccess field to accountState that tracks whether any user transaction (txIndex >= 0) performed a non-revertable access (e.g. evm.Call target, BALANCE opcode, SELFDESTRUCT beneficiary). The system address is now only included in the BAL when it has actual state changes OR when a user tx explicitly accessed it.

3. Mark state-reading opcodes as non-revertable (instructions.go)

Per EIP-7928, BALANCE, EXTCODESIZE, EXTCODECOPY, and EXTCODEHASH are real state access operations. Added MarkAddressAccess(addr, false) (non-revertable) to these opcodes so they correctly trigger system address inclusion when a user tx queries it.

Previously, EXTCODEHASH used MarkAddressAccess(address, true) (revertable) and the others had no explicit access marking at all — they relied only on the implicit revertable marking from versionRead.

4. VersionMap HasBAL flag for read validation (versionmap.go)

When a BAL is present, significant writes for BalancePath, NoncePath, CodePath and StoragePath are pre-populated in the VersionMap before execution. Added a HasBAL field with two validation refinements:

Path-specific bypass (MVReadResultDone): The HasBAL validation bypass now only applies to BAL-pre-populated paths (BalancePath, NoncePath, CodePath, StoragePath). AddressPath and other paths that are NOT pre-populated by the BAL always trigger invalidation, since a new VersionMap entry for those paths indicates a real state change from a concurrent worker (e.g. account creation).

AddressPath cross-check (MVReadResultNone): When HasBAL is true and an AddressPath read returned None (account not in DB), cross-check against BalancePath which IS pre-populated by the BAL. If BalancePath has an entry at a prior txIndex, the account was created by a prior tx and the nil AddressPath read is stale — triggering re-execution. This fixes a class of BAL mismatches where a tx speculatively reads an account as non-existent, a prior tx creates the account, but the stale read goes undetected because AddressPath writes are not flushed to the VersionMap when HasBAL is true.

5. Skip BAL validation when stored body is missing (bal_create.go)

Blocks downloaded via the backward block downloader (p2p sync) do not carry BAL bodies — only blocks received via engine API have them stored. When the stored BAL is missing, HasBAL=false on the VersionMap, so the computed BAL may be inaccurate. Skip the computed-vs-header hash comparison in this case. Added debug dump of stored vs computed BAL on mismatch for diagnostics.

This limitation goes away once eth/71 delivers BAL bodies via p2p sync.

6. Read BAL data with fresh tx (exec3.go)

Changed BAL data reading from tx.Apply() to te.cfg.db.View() so the execution stage can see BAL data committed by InsertBlocks after the execution transaction was opened.

7. Replace isChainTip with !isApplyingBlocks (exec3.go)

Replaced the isChainTip variable (derived from maxBlockNum == startBlockNum) with the existing isApplyingBlocks flag (derived from sync mode). These are semantically equivalent but isApplyingBlocks is clearer and avoids having two variables for the same condition.

8. Skip test_bal_invalid_* spec tests (block_test.go)

Added a skip for test_bal_invalid_* execution spec tests. These tests require BAL structural validation (rejecting blocks with intentionally invalid BALs) which is not yet implemented. Note: these tests also fail on main (see CI run #22680631291) — they were added in #19595 without corresponding validation implementation. This skip is not a regression introduced by this PR.

Testing

  • bal-devnet-2: 200+ consecutive blocks fork-validated with 12 parallel workers, zero BAL hash mismatches
  • Unit tests: TestExecutionSpecBlockchainDevnet — all pass (128s)
  • Lint: make lint — clean

@mh0lt mh0lt requested a review from yperbasis as a code owner March 4, 2026 18:52
@mh0lt mh0lt requested review from shohamc1 and taratorio March 4, 2026 18:52
mh0lt and others added 2 commits March 4, 2026 19:32
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The HasBAL validation bypass in validateRead was too aggressive — it
suppressed invalidation for ALL paths when source != MapRead. But
AddressPath is NOT pre-populated by the BAL (WriteChanges only populates
BalancePath, NoncePath, CodePath, StoragePath). This caused stale
AddressPath reads to go undetected: a tx reading an account as
non-existent would never be re-executed even after a prior tx created
that account, leading to BAL hash mismatches.

Two fixes:
1. Restrict the HasBAL bypass to only BAL-pre-populated paths
   (BalancePath, NoncePath, CodePath, StoragePath). AddressPath and
   other paths always trigger invalidation.
2. In the MVReadResultNone case, when HasBAL is true and an AddressPath
   read returned None (account not in DB), cross-check against
   BalancePath. If BalancePath has an entry at a prior txIndex (from BAL
   pre-population), the account was created by a prior tx and the nil
   read is stale — invalidate.

Also skip test_bal_invalid_* tests which require BAL structural
validation not yet implemented (these also fail on main).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@yperbasis yperbasis added the Glamsterdam https://eips.ethereum.org/EIPS/eip-7773 label Mar 5, 2026
@yperbasis yperbasis requested a review from Copilot March 5, 2026 08:34

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes several issues affecting EIP-7928 Block Access List (BAL) computation and validation, particularly under parallel execution (multi-worker) where BAL hash mismatches were observed on bal-devnet-2.

Changes:

  • Fix propagation of per-address access options (revertable vs non-revertable) and mark additional opcode-driven reads as non-revertable BAL accesses.
  • Refine system address (0xff…fe) inclusion rules using non-revertable user access tracking to avoid incidental gas-read inclusion.
  • Improve parallel read validation with a HasBAL flag in VersionMap, plus several staged-sync adjustments around BAL retrieval/validation and execution-mode flags.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
execution/vm/instructions.go Marks BALANCE/EXTCODE* opcode address reads as non-revertable access for BAL purposes.
execution/tests/block_test.go Skips test_bal_invalid_* fixtures pending BAL structural validation implementation.
execution/state/versionmap.go Adds HasBAL and refines read validation rules when BAL pre-population is present.
execution/state/versionedio.go Tracks non-revertable user access to support correct system-address filtering in BAL.
execution/state/intra_block_state.go Fixes AccessedAddresses() to return actual access options instead of nil.
execution/stagedsync/exec3.go Reads stored BAL bytes using a fresh read-only tx; simplifies “chain tip” condition naming/usage.
execution/stagedsync/bal_create.go Skips computed-vs-header BAL hash validation when stored BAL body is missing; adds mismatch dumps.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +352 to +356
isBALPrePopulatedPath := path == BalancePath || path == NoncePath ||
path == CodePath || path == StoragePath
if !vm.HasBAL || !isBALPrePopulatedPath {
valid = VersionInvalid
}

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new HasBAL-dependent validation paths change when reads are invalidated under parallel execution. There are existing unit tests for VersionMap, but none appear to cover the HasBAL bypass and the AddressPath/BalancePath cross-check behavior. Adding targeted tests for these cases would help prevent regressions in conflict detection and BAL hash determinism.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by: 6f13cd8

Added 4 tests for HasBAL validation bypass:

HasBAL_BypassForPrePopulatedPaths — StorageRead finding MVReadResultDone on Balance/Nonce/Code/Storage paths stays valid
HasBAL_NoBypassForAddressPath — AddressPath is NOT bypassed (new entry = real state change)
NoHasBAL_InvalidatesAllPaths — baseline: without HasBAL, all paths invalidate
HasBAL_AddressPathCrossCheckWithBalancePath — AddressPath read invalidated when BAL has a BalancePath entry from a prior tx

Comment on lines +1094 to +1096
if isUserTx && opts != nil && !opts.revertable {
account.nonRevertableUserAccess = true
}

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds behavior that affects whether the system address is included in the computed BAL (based on opts.revertable and txIndex). There’s existing BAL ordering coverage in bal_create_test.go, but no test that asserts the system-address filter behavior or the propagation of accessOptions through AccessedAddresses(). Consider adding a unit test that builds a VersionedIO with both revertable and non-revertable accesses to params.SystemAddress and verifies the resulting BAL includes/excludes it appropriately.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by: 6f13cd8

Added 5 tests for AsBlockAccessList system-address filtering:

SystemAddressExcludedWithoutChanges — excluded when only revertable accesses, no state changes
SystemAddressIncludedWithNonRevertableAccess — included when a user tx has non-revertable access (e.g. BALANCE opcode)
SystemAddressIncludedWithStateChanges — included when it has actual balance/nonce/code/storage changes
SystemAddressRevertableFromSystemCallOnly — non-revertable from system call (txIndex < 0) does NOT trigger inclusion
NonRevertableOverridesRevertable — a later non-revertable user access overrides earlier revertable ones

Comment thread execution/stagedsync/bal_create.go Outdated
Comment on lines +157 to +167
// Only validate computed BAL against header when we have the stored BAL
// body. Without it, HasBAL=false on the VersionMap, so the computed BAL
// may be inaccurate due to VersionMap mutations during execution.
// This limitation goes away once eth/71 delivers BAL bodies via p2p sync.
if dbBALBytes != nil && headerBALHash != bal.Hash() {
balDir := filepath.Join(dataDir, "bal")
os.MkdirAll(balDir, 0755) //nolint:errcheck
os.WriteFile(filepath.Join(balDir, fmt.Sprintf("computed_bal_%d.txt", blockNum)), []byte(bal.DebugString()), 0644) //nolint:errcheck
dbBAL2, _ := types.DecodeBlockAccessListBytes(dbBALBytes)
if dbBAL2 != nil {
os.WriteFile(filepath.Join(balDir, fmt.Sprintf("stored_bal_%d.txt", blockNum)), []byte(dbBAL2.DebugString()), 0644) //nolint:errcheck

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProcessBAL now skips validating the computed BAL hash against the header when dbBALBytes == nil. That means blocks without a locally stored BAL body (e.g. p2p/backward downloader) won’t have their EIP-7928 header hash verified at all, which weakens block validity checking. Consider still comparing bal.Hash() to headerBALHash (or gating the skip behind an explicit non-consensus/debug flag) so invalid BAL hashes can’t be accepted silently.

Suggested change
// Only validate computed BAL against header when we have the stored BAL
// body. Without it, HasBAL=false on the VersionMap, so the computed BAL
// may be inaccurate due to VersionMap mutations during execution.
// This limitation goes away once eth/71 delivers BAL bodies via p2p sync.
if dbBALBytes != nil && headerBALHash != bal.Hash() {
balDir := filepath.Join(dataDir, "bal")
os.MkdirAll(balDir, 0755) //nolint:errcheck
os.WriteFile(filepath.Join(balDir, fmt.Sprintf("computed_bal_%d.txt", blockNum)), []byte(bal.DebugString()), 0644) //nolint:errcheck
dbBAL2, _ := types.DecodeBlockAccessListBytes(dbBALBytes)
if dbBAL2 != nil {
os.WriteFile(filepath.Join(balDir, fmt.Sprintf("stored_bal_%d.txt", blockNum)), []byte(dbBAL2.DebugString()), 0644) //nolint:errcheck
// Validate computed BAL against the header hash. When we have the stored BAL
// body, also dump both versions for debugging if there is a mismatch.
// Note: Without a stored BAL, HasBAL=false on the VersionMap, so the computed
// BAL may be inaccurate due to VersionMap mutations during execution, but we
// still enforce the header hash to avoid silently accepting invalid blocks.
if headerBALHash != bal.Hash() {
if dbBALBytes != nil {
balDir := filepath.Join(dataDir, "bal")
os.MkdirAll(balDir, 0755) //nolint:errcheck
os.WriteFile(filepath.Join(balDir, fmt.Sprintf("computed_bal_%d.txt", blockNum)), []byte(bal.DebugString()), 0644) //nolint:errcheck
dbBAL2, _ := types.DecodeBlockAccessListBytes(dbBALBytes)
if dbBAL2 != nil {
os.WriteFile(filepath.Join(balDir, fmt.Sprintf("stored_bal_%d.txt", blockNum)), []byte(dbBAL2.DebugString()), 0644) //nolint:errcheck
}

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to push this fix on a subsequent PR as its a small change needs retesting

Comment thread execution/stagedsync/bal_create.go Outdated
Comment on lines 206 to 211
// Do it only for chain-tip blocks!
doms.EnableWarmupCache(isChainTip)
//log.Debug("Warmup Cache", "enabled", isChainTip)
doms.EnableWarmupCache(!isApplyingBlocks)
postValidator := newBlockPostExecutionValidator()
doms.SetDeferCommitmentUpdates(false)
if isChainTip {
if !isApplyingBlocks {
postValidator = newParallelBlockPostExecutionValidator()

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "Do it only for chain-tip blocks!", but the new condition is !isApplyingBlocks (sync mode) rather than the prior maxBlockNum == startBlockNum check. If the intent is truly “chain tip / single-block execution”, this change can alter behavior when applying blocks but the batch happens to contain a single block. Consider either updating the comment to match the new semantics, or restoring a condition that actually reflects “chain tip” (e.g. using CurrentSyncCycle.IsInitialCycle or the previous equality check).

Copilot uses AI. Check for mistakes.
mh0lt and others added 3 commits March 5, 2026 11:06
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… paths

Add tests addressing PR review comments:

- versionedio_test.go: 5 tests for AsBlockAccessList system-address filtering,
  covering revertable vs non-revertable accesses, system call vs user tx
  distinction, state changes override, and access flag propagation.

- versionmap_test.go: 4 tests for HasBAL-dependent validation in validateRead,
  covering BAL bypass for pre-populated paths (Balance/Nonce/Code/Storage),
  no bypass for AddressPath, baseline invalidation without HasBAL, and
  AddressPath cross-check with BalancePath for account creation detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mh0lt mh0lt enabled auto-merge (squash) March 5, 2026 11:44
@mh0lt mh0lt merged commit 76fed7d into main Mar 5, 2026
24 checks passed
@mh0lt mh0lt deleted the fix/bal-system-address-filter branch March 5, 2026 12:18
mh0lt added a commit that referenced this pull request Mar 5, 2026
## Summary

Follow-up to #19628. Two fixes for deterministic BAL computation without
a stored BAL body (p2p blocks before eth/71):

- **Remove `vm.HasBAL` guard from BalancePath cross-check**
(`versionmap.go`): The cross-check was only running when a stored BAL
body was available for pre-population. Without it, stale "account
doesn't exist" reads passed validation when a concurrent worker had
already created the account via a BalancePath flush. This caused phantom
accounts in the computed BAL, producing hash mismatches.

- **Always validate computed BAL against header** (`bal_create.go`):
Remove the `dbBALBytes != nil` guard so the computed BAL hash is
validated even for p2p blocks. With the cross-check fix above, parallel
execution is deterministic regardless of whether BAL pre-population was
used.

## Root Cause

When TX N creates an account and TX M (M > N) reads the account as
non-existent, the stale read must be invalidated. The BalancePath
cross-check catches this: if BalancePath has an entry at a lower txIndex
(from BAL pre-population **or worker flush**), the AddressPath read is
stale.

The bug was that the cross-check was gated behind `vm.HasBAL`, so it
only worked with BAL pre-population — not with worker flushes (the
`HasBAL=false` case).

## Test plan

- [x] New unit test:
`TestValidateRead_NoHasBAL_AddressPathCrossCheckWithBalancePath`
- [x] All existing `TestValidateRead_*` tests pass
- [x] `make lint` clean
- [x] Verified on bal-devnet-2: 40K+ p2p blocks with zero BAL mismatches
(previously failed at block 8997)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Mark Holt <erigon@dev-bm-e3-ethmainnet-n4.erigon.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
mh0lt added a commit that referenced this pull request Mar 6, 2026
…ion (#19628)

Fixes multiple issues with Block Access List (EIP-7928) computation that
caused BAL hash mismatches on bal-devnet-2 when running with parallel
execution (12 workers). These fixes were validated on the live devnet
with 200+ consecutive blocks fork-validated without errors.

These represent errors encountered on the live devnet rather than in
hive tests which don't include the same issues in combination.

(`intra_block_state.go`)

`AccessedAddresses()` was returning `nil` for every `accessOptions`
entry instead of the actual options containing the `revertable` flag.
This made it impossible for downstream BAL code to distinguish
gas-calculation reads from real state access.

**Before:** `out[addr] = nil`
**After:** `out[addr] = opts`

(`versionedio.go`)

Per EIP-7928, `SYSTEM_ADDRESS` (0xff...fe) "MUST NOT be included unless
it experiences state access itself." The system address is touched
during every block's system call (EIP-4788 beacon root) as `msg.sender`,
creating incidental gas-calc reads.

Added a `nonRevertableUserAccess` field to `accountState` that tracks
whether any user transaction (txIndex >= 0) performed a non-revertable
access (e.g. `evm.Call` target, `BALANCE` opcode, `SELFDESTRUCT`
beneficiary). The system address is now only included in the BAL when it
has actual state changes OR when a user tx explicitly accessed it.

Per EIP-7928, `BALANCE`, `EXTCODESIZE`, `EXTCODECOPY`, and `EXTCODEHASH`
are real state access operations. Added `MarkAddressAccess(addr, false)`
(non-revertable) to these opcodes so they correctly trigger system
address inclusion when a user tx queries it.

Previously, `EXTCODEHASH` used `MarkAddressAccess(address, true)`
(revertable) and the others had no explicit access marking at all — they
relied only on the implicit revertable marking from `versionRead`.

When a BAL is present, significant writes for BalancePath, NoncePath,
CodePath and StoragePath are pre-populated in the VersionMap before
execution. Added a `HasBAL` field with two validation refinements:

**Path-specific bypass (MVReadResultDone):** The HasBAL validation
bypass now only applies to BAL-pre-populated paths (BalancePath,
NoncePath, CodePath, StoragePath). AddressPath and other paths that are
NOT pre-populated by the BAL always trigger invalidation, since a new
VersionMap entry for those paths indicates a real state change from a
concurrent worker (e.g. account creation).

**AddressPath cross-check (MVReadResultNone):** When HasBAL is true and
an AddressPath read returned None (account not in DB), cross-check
against BalancePath which IS pre-populated by the BAL. If BalancePath
has an entry at a prior txIndex, the account was created by a prior tx
and the nil AddressPath read is stale — triggering re-execution. This
fixes a class of BAL mismatches where a tx speculatively reads an
account as non-existent, a prior tx creates the account, but the stale
read goes undetected because AddressPath writes are not flushed to the
VersionMap when HasBAL is true.

Blocks downloaded via the backward block downloader (p2p sync) do not
carry BAL bodies — only blocks received via engine API have them stored.
When the stored BAL is missing, `HasBAL=false` on the VersionMap, so the
computed BAL may be inaccurate. Skip the computed-vs-header hash
comparison in this case. Added debug dump of stored vs computed BAL on
mismatch for diagnostics.

This limitation goes away once eth/71 delivers BAL bodies via p2p sync.

Changed BAL data reading from `tx.Apply()` to `te.cfg.db.View()` so the
execution stage can see BAL data committed by `InsertBlocks` after the
execution transaction was opened.

Replaced the `isChainTip` variable (derived from `maxBlockNum ==
startBlockNum`) with the existing `isApplyingBlocks` flag (derived from
sync mode). These are semantically equivalent but `isApplyingBlocks` is
clearer and avoids having two variables for the same condition.

Added a skip for `test_bal_invalid_*` execution spec tests. These tests
require BAL structural validation (rejecting blocks with intentionally
invalid BALs) which is not yet implemented. **Note:** these tests also
fail on `main` (see CI run
[#22680631291](https://github.com/erigontech/erigon/actions/runs/22680631291))
— they were added in #19595 without corresponding validation
implementation. This skip is not a regression introduced by this PR.

- **bal-devnet-2**: 200+ consecutive blocks fork-validated with 12
parallel workers, zero BAL hash mismatches
- **Unit tests**: `TestExecutionSpecBlockchainDevnet` — all pass (128s)
- **Lint**: `make lint` — clean

---------

Co-authored-by: Mark Holt <erigon@dev-bm-e3-ethmainnet-n4.erigon.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
mh0lt added a commit that referenced this pull request Mar 6, 2026
## Summary

Follow-up to #19628. Two fixes for deterministic BAL computation without
a stored BAL body (p2p blocks before eth/71):

- **Remove `vm.HasBAL` guard from BalancePath cross-check**
(`versionmap.go`): The cross-check was only running when a stored BAL
body was available for pre-population. Without it, stale "account
doesn't exist" reads passed validation when a concurrent worker had
already created the account via a BalancePath flush. This caused phantom
accounts in the computed BAL, producing hash mismatches.

- **Always validate computed BAL against header** (`bal_create.go`):
Remove the `dbBALBytes != nil` guard so the computed BAL hash is
validated even for p2p blocks. With the cross-check fix above, parallel
execution is deterministic regardless of whether BAL pre-population was
used.

## Root Cause

When TX N creates an account and TX M (M > N) reads the account as
non-existent, the stale read must be invalidated. The BalancePath
cross-check catches this: if BalancePath has an entry at a lower txIndex
(from BAL pre-population **or worker flush**), the AddressPath read is
stale.

The bug was that the cross-check was gated behind `vm.HasBAL`, so it
only worked with BAL pre-population — not with worker flushes (the
`HasBAL=false` case).

## Test plan

- [x] New unit test:
`TestValidateRead_NoHasBAL_AddressPathCrossCheckWithBalancePath`
- [x] All existing `TestValidateRead_*` tests pass
- [x] `make lint` clean
- [x] Verified on bal-devnet-2: 40K+ p2p blocks with zero BAL mismatches
(previously failed at block 8997)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Mark Holt <erigon@dev-bm-e3-ethmainnet-n4.erigon.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request May 20, 2026
… in precompile benchmarks (erigontech#21240)

# Point-evaluation precompile: BAL parallel-exec optimisations

## Performance

Same-sender Amsterdam BAL parallel `newPayload` validate, EEST 150 M
point_evaluation workload (linear projection from a 9 × 5 M synthesised
in-process run):

```
+----------------+----------+---------+
| variant        | before   | after   |
+----------------+----------+---------+
| 150 M cachable | ~1940 ms | ~305 ms |
| 150 M uncach.  |  ~305 ms | ~300 ms |
+----------------+----------+---------+
```

- 6.4× speedup on the same-sender (cachable) path.
- Multi-sender unaffected; the bug only fired under same-sender BAL.
- `invalid=0` retries on the user-tx block (was `invalid=8`, ~47 % retry
rate).

## Issue & root cause

`validateRead` in `execution/state/versionmap.go` invalidated every
same-sender tx under BAL pre-pop, forcing serial re-execution and
collapsing parallel throughput.

The `AddressPath` / `MVReadResultNone` arm used `BalancePath` as a proxy
for "a prior tx created this account":

```go
if valid == VersionValid {
    balRR := vm.Read(addr, BalancePath, ..., txIndex)
    if balRR.Status() == MVReadResultDone {
        valid = VersionInvalid
    }
}
```

`BalancePath` is written by **three** events, only one of which is a
creation:

1. `CreateAccount` (the real bug PR erigontech#19628 was trying to catch).
2. `UpdateAccountData` on a pre-existing account — fires for every
gas-paying same-sender tx.
3. BAL pre-population for every BAL-listed balance change.

Under same-sender BAL workloads (the EEST point_evaluation cachable
benchmark), the BAL pre-populates `BalancePath` at `TxIdx = 0..N-1` for
the shared sender. Every tx after tx 0 then false-invalidated its
`AddressPath` storage-fallback read, producing the retry storm.

## Fix

Replace the `BalancePath` cross-check with an `IncarnationPath`
cross-check. `IncarnationPath` is the **specific** signal for
account-existence change: written only by `CreateAccount` and
`SelfDestruct`, never by `UpdateAccountData`, never by `WriteChanges`
(BAL pre-pop). The existing `versionedRead` storage-zero-out path
already uses `IncarnationPath` this way for `StoragePath`.

```go
if valid == VersionValid {
    incRR := vm.Read(addr, IncarnationPath, ..., txIndex)
    if incRR.Status() == MVReadResultDone {
        valid = VersionInvalid
    }
}
```

This also subsumes the secondary "precompile-touch records phantom
AddressPath read" retry-storm path: precompiles are never
`CreateAccount`'d or `SelfDestruct`'d, so a `TouchAccount` on
`0x01`-`0x0a` no longer trips invalidation under the new cross-check. No
EVM-level change needed.

## Correctness

- **PR erigontech#19628 still covered.** A genuine `CreateAccount` writes
`IncarnationPath`, so the stale-AddressPath-after-account-creation case
is still caught. New unit test
`TestValidateRead_PriorAccountCreation_DetectedViaIncarnationPath`
asserts this (failed without the cross-check, passes with the
`IncarnationPath` form).
- **SelfDestruct path unchanged.** `SelfDestructPath` cross-check
remains in place at the start of the arm.
- **No-BAL same-sender conflicts still caught.** When same-sender txs
run without BAL, conflict detection happens through the
`MVReadResultDone` arm's value-tiebreaker, which is untouched by this
change. New unit test `TestNoBAL_SameSenderTxs_DetectsConflicts`
exercises the 9-tx-same-sender flow and asserts 8 invalidations.
- **No false positives under BAL.** `WriteChanges` (BAL pre-pop) doesn't
touch `IncarnationPath`, so for pure balance/nonce/code/storage updates
the cross-check stays silent. New unit test
`TestBALPrePop_SameSenderTxs_NoConflicts` confirms zero conflicts on the
synthesised 9-tx BAL workload.

## Tests added

`execution/state/versionmap_test.go`:

- `TestBALPrePop_SameSenderTxs_NoConflicts` — same-sender BAL
retry-storm reproducer; green after the fix.
- `TestValidateRead_PriorAccountCreation_DetectedViaIncarnationPath` —
PR erigontech#19628 regression coverage; red without the cross-check, green with
the `IncarnationPath` version.
- `TestNoBAL_SameSenderTxs_DetectsConflicts` — confirms the no-BAL
conflict-detection path still fires.

`execution/engineapi/amsterdam_bal_point_eval_bench_test.go` (new file,
gated by `EXEC3_PARALLEL=true`) — in-process Amsterdam BAL benchmark
that produced the performance numbers above. Mirrors the EEST
point_evaluation workload (9 txs × ~5 M gas calling KZG-loop contract)
end-to-end via `MockCl.BuildNewPayload` + `InsertNewPayload`.
yperbasis added a commit that referenced this pull request May 20, 2026
Notable adaptation: the merge brought in PR #21240 ("optimise parallel
exec with BALs for same-sender conflicts in precompile benchmarks"),
which replaced the BalancePath cross-check in validateRead's
AddressPath / MVReadResultNone arm with an IncarnationPath cross-check
to avoid a same-sender BAL retry storm.

moksha has already removed IncarnationPath as part of the incarnation
cleanup (#12440). The substitute on this branch is CreateContractPath
— written only by the CREATE / CREATE2 branch of IBS.CreateAccount,
never by UpdateAccountData and never by BAL pre-population, so it
preserves PR #21240's performance fix:

- BAL same-sender retry storm: still gone (CreateContractPath isn't
  pre-populated by BAL, isn't emitted by routine balance/nonce
  updates).
- PR #19628's regression (stale AddressPath storage-fallback after
  prior-tx contract creation) is still caught.

Gap vs upstream: EOA-creation-via-CALL+value (IBS.CreateAccount with
contractCreation=false) emits BalancePath only, no CreateContractPath
on moksha — so this cross-check doesn't fire for that scenario.
Downstream value-tiebreaker validation on BalancePath/NoncePath
catches real EOA-balance staleness, and the practical impact is
limited (EOAs have no storage, so a stale AddressPath nil read of an
EOA can't produce divergent slot reads).

Renamed TestValidateRead_PriorAccountCreation_DetectedViaIncarnationPath
→ TestValidateRead_PriorContractCreation_DetectedViaCreateContractPath
to reflect the moksha signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Glamsterdam https://eips.ethereum.org/EIPS/eip-7773

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants