feat(tests): EIP-8037 additional state gas test coverage #2639

Closed
spencer-tb wants to merge 0 commits into ethereum:eips/amsterdam/eip-8037 from spencer-tb:eips/amsterdam/eip-8037

Conversation

@spencer-tb

@spencer-tb spencer-tb commented Apr 9, 2026

Contributor

🗒️ Description

Additional EIP-8037 test coverage for state gas edge cases found during devnet testing, spec reviews, and Maria's state gas accounting review.

Reservoir and top-level halt (bal-devnet-3 Besu bug)

Block 2D gas accounting

  • test_failed_create_tx_state_gas_dominates: CREATE tx with tight gas where state gas dominates. Catches 1D accounting bugs. Parametrized: init_code (revert, halt).
  • test_failed_create_opcode_state_gas_dominates: CREATE/CREATE2 opcode failures with 2D header verification. Parametrized: init_code (revert, halt, oog) x create_opcode (CREATE, CREATE2). cc @chfast
  • test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining: Tx with gas_limit >> TX_MAX_GAS_LIMIT accepted when capped regular gas fits. Parametrized: tx2_gas_limit_equals_block_gas_limit (true, false). cc @jwasinger
  • test_tx_rejected_when_regular_gas_exceeds_block_limit: Tx rejected when cumulative regular gas exceeds block limit. cc @kclowes, original test PR: feat(tests): 8037 - Tx rejected when regular gas exceeds block limit #2654
  • test_access_list_gas_is_regular_not_state: Access list gas classified as regular intrinsic gas with header_verify. Catches clients that incorrectly put access list gas into the state budget. Parametrized: num_access_list_entries (1, 10) x slots_per_entry (0, 3). cc @MariusVanDerWijden

Initcode size validation ordering

  • test_oversized_initcode_tx_no_state_gas: CREATE tx with oversized initcode rejected before state gas charged. Parametrized: initcode_size_delta (0, +1). cc @chfast
  • test_oversized_initcode_opcode_no_state_gas: CREATE/CREATE2 opcode with oversized initcode. Parametrized: initcode_size_delta (0, +1) x create_opcode (CREATE, CREATE2). cc @chfast

Nested CREATE state gas persistence (bal-devnet-3 geth bug)

  • test_inner_create_state_gas_persists_on_failure: Inner CREATE/CREATE2 fails but GAS_NEW_ACCOUNT persists (charged to parent frame before child starts). Geth was incorrectly restoring this gas to the parent's reservoir on nested CREATE failures (block 1031, diff = 131,488 = GAS_NEW_ACCOUNT). Parametrized: inner_op (CREATE, CREATE2) x outer_outcome (succeeds, reverts, halts) x num_inner_ops (1, 3). cc @MariusVanDerWijden
  • test_inner_create_succeeds_code_deposit_state_gas: Inner CREATE succeeds with code deposit, outer succeeds/reverts/halts. Both account-creation and code deposit gas counted. Parametrized: create_opcode (CREATE, CREATE2) x outer_outcome (succeeds, reverts, halts).

CREATE silent failure state gas (Maria point 3)

  • test_create_nonce_overflow_state_gas_consumed: CREATE at max nonce, GAS_NEW_ACCOUNT consumed despite failure, reservoir preserved.
  • test_create_stack_depth_state_gas_consumed: CREATE at depth 1024, GAS_NEW_ACCOUNT consumed despite failure, reservoir preserved.
  • test_create2_collision_state_gas_block_accounting: CREATE2 collision with header_verify confirming 2 x GAS_NEW_ACCOUNT in block state gas.

SELFDESTRUCT and OOG edge cases (Maria points 4, 5)

  • test_selfdestruct_in_create_tx_initcode: SELFDESTRUCT to new beneficiary in same-creation context (EIP-6780). Both outer and beneficiary account gas charged.
  • test_create_selfdestruct_same_tx_no_state_gas_refund: CREATE+SELFDESTRUCT same TX, net-zero state but GAS_NEW_ACCOUNT not refunded. Consistent with EIP-3529. Parametrized: create_opcode (CREATE, CREATE2).
  • test_call_value_to_selfdestructed_same_tx_account: CALL with value to same-TX self-destructed account does NOT charge GAS_NEW_ACCOUNT (account still alive during execution). Parametrized: create_opcode (CREATE, CREATE2).
  • test_create_oog_during_state_gas_charge: CREATE OOGs during state gas charge itself. No state gas consumed, parent recovers reservoir.
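
For readers new to the 2D accounting these tests target, here is a minimal sketch of what "state gas dominates" means for the header. It assumes the header reports the larger of the block's regular and state gas totals. The names `TxGas` and `header_gas_used` are hypothetical and do not come from the test framework.

```python
from dataclasses import dataclass


@dataclass
class TxGas:
    """Per-transaction gas split into the two dimensions (illustrative)."""
    regular: int
    state: int


def header_gas_used(txs):
    """Assumed 2D rule: the header carries the larger of the two block totals."""
    block_regular = sum(tx.regular for tx in txs)
    block_state = sum(tx.state for tx in txs)
    return max(block_regular, block_state)


# A plain transfer touches only the regular dimension.
regular_only = header_gas_used([TxGas(regular=21_000, state=0)])

# A failed CREATE with a tight regular budget: the state dimension dominates.
state_dominates = header_gas_used([TxGas(regular=60_000, state=131_488)])
```

A client that sums both dimensions, or reports only the regular one, would produce a different header value in the second case, which is the kind of discrepancy the `header_verify` tests above are meant to expose.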

🔗 Related Issues or PRs

✅ Checklist

  • All: Ran fast static checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    just static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start with type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

@spencer-tb spencer-tb added the C-feat (Category: an improvement or new feature) and A-tests (Area: Consensus tests) labels Apr 9, 2026
@marioevz marioevz force-pushed the eips/amsterdam/eip-8037 branch from b9f0afa to 23a638d Compare April 9, 2026 18:18
@spencer-tb spencer-tb force-pushed the eips/amsterdam/eip-8037 branch 5 times, most recently from 20c5082 to 05a0a5e Compare April 10, 2026 12:08
@spencer-tb spencer-tb changed the title from "feat(tests): add nested CREATE state gas and access list gas tests" to "feat(tests): EIP-8037 additional state gas test coverage" Apr 10, 2026
@codecov

codecov Bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (eips/amsterdam/eip-8037@9755dba). Learn more about missing BASE report.

Additional details and impacted files
@@                    Coverage Diff                     @@
##             eips/amsterdam/eip-8037    #2639   +/-   ##
==========================================================
  Coverage                           ?   88.17%           
==========================================================
  Files                              ?      524           
  Lines                              ?    31088           
  Branches                           ?     3036           
==========================================================
  Hits                               ?    27412           
  Misses                             ?     3161           
  Partials                           ?      515           
Flag Coverage Δ
unittests 88.17% <ø> (?)

@chfast

This comment was marked as resolved.

@marioevz marioevz force-pushed the eips/amsterdam/eip-8037 branch from 629b34b to 9755dba Compare April 10, 2026 22:19
@spencer-tb spencer-tb force-pushed the eips/amsterdam/eip-8037 branch 5 times, most recently from 85b2e3a to e5c2d87 Compare April 12, 2026 11:58
@spencer-tb spencer-tb marked this pull request as draft April 12, 2026 13:00
@spencer-tb spencer-tb requested review from kclowes and marioevz April 12, 2026 13:08
@chfast

chfast commented Apr 13, 2026

Member

Missing EIP-8037 test coverage — mutation testing results

While implementing EIP-8037 in evmone, I ran mutation testing against the bal@v5.6.1 EIP-8037 test suite (118 blockchain + 92 state tests). I systematically reverted each individual gas change and checked if any test failed.

5 mutations were NOT detected (all 210 tests still pass with the bug):

1. CALL account creation cost not removed for Amsterdam

Keeping the pre-Amsterdam ACCOUNT_CREATION_COST = 25000 regular gas for CALL-to-new-account passes all tests. No test exercises a CALL that transfers value to a non-existent account at Amsterdam to verify that the 25000 regular gas was replaced by 112×CPSB state gas.

2. SELFDESTRUCT account creation cost not removed for Amsterdam

Same as above: keeping 25000 regular gas for SELFDESTRUCT-to-new-beneficiary is undetected. No test verifies the cost change for SELFDESTRUCT creating a new account.
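
As a worked check of the numbers behind mutations 1 and 2, the sketch below assumes that the 131,488 figure quoted for `GAS_NEW_ACCOUNT` in the PR description is the same quantity written here as 112×CPSB. The constant and function names are illustrative, not the spec's.

```python
# Figures quoted in this thread; the variable names are illustrative.
GAS_NEW_ACCOUNT = 131_488              # state gas diff reported in the PR description
ACCOUNT_STATE_BYTES = 112              # multiplier in "112×CPSB"
LEGACY_ACCOUNT_CREATION_COST = 25_000  # pre-Amsterdam regular gas

# Implied cost per state byte, assuming GAS_NEW_ACCOUNT == 112 * CPSB.
cost_per_state_byte, remainder = divmod(GAS_NEW_ACCOUNT, ACCOUNT_STATE_BYTES)


def new_account_charge(amsterdam: bool) -> tuple[int, int]:
    """Return the (regular, state) gas charged for the account-creation part only."""
    if amsterdam:
        return 0, ACCOUNT_STATE_BYTES * cost_per_state_byte
    return LEGACY_ACCOUNT_CREATION_COST, 0
```

Under these assumptions the division is exact and gives a CPSB of 1174. The undetected mutation corresponds to using the `amsterdam=False` branch after the fork, so a test needs a gas budget that distinguishes a 25,000 regular charge from a zero regular charge.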

3. SSTORE charge order (regular-before-state vs state-before-regular)

Per geth/Nethermind/revm, regular gas must be charged before state gas to prevent state gas spill from inflating state_gas_used on regular OOG. Swapping the order (charging state gas first, then regular) passes all tests. This matters for edge cases where regular gas is just barely sufficient — the spill inflates state_gas_used before the regular OOG is detected.
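
The ordering issue can be illustrated with a toy frame model. This is a sketch under stated assumptions, not client code: state gas is drawn from the reservoir first and spills into regular gas, and the class, method names, and costs are all hypothetical.

```python
from dataclasses import dataclass


class OutOfGas(Exception):
    pass


@dataclass
class Frame:
    gas_left: int            # regular gas
    state_gas_left: int = 0  # reservoir
    state_gas_used: int = 0

    def charge_regular(self, amount):
        if self.gas_left < amount:
            raise OutOfGas
        self.gas_left -= amount

    def charge_state(self, amount):
        # Draw from the reservoir first; any shortfall spills into regular gas.
        from_reservoir = min(self.state_gas_left, amount)
        spill = amount - from_reservoir
        if self.gas_left < spill:
            raise OutOfGas
        self.state_gas_left -= from_reservoir
        self.gas_left -= spill
        self.state_gas_used += amount


def sstore(frame, regular_cost, state_cost, regular_first):
    """Apply both charges in the chosen order; return False on OOG."""
    steps = [(frame.charge_regular, regular_cost), (frame.charge_state, state_cost)]
    if not regular_first:
        steps.reverse()
    try:
        for charge, amount in steps:
            charge(amount)
    except OutOfGas:
        return False
    return True


# Empty reservoir, enough gas for the state spill but not for the regular cost.
correct = Frame(gas_left=80)
mutated = Frame(gas_left=80)
ok_correct = sstore(correct, regular_cost=100, state_cost=50, regular_first=True)
ok_mutated = sstore(mutated, regular_cost=100, state_cost=50, regular_first=False)
```

Both orders run out of gas, but only the state-first order leaves `state_gas_used` at 50 when the failure is detected. A test therefore has to observe the state dimension after a regular OOG, not just the success or failure of the SSTORE.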

4. Child failure state gas return destination (reservoir vs gas_left)

On child CALL/CREATE failure (revert/OOG), the child's state_gas_used should return to the parent's state gas reservoir (per geth's RefundGas). Returning it to gas_left (regular gas) instead passes all tests. This affects the 2D block formula dimensions — state gas returned to regular inflates the regular dimension and deflates state.
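
A small sketch of why this mutation is hard to catch: the total gas returned to the parent is identical either way, and only the split between dimensions changes. `ParentFrame` and `return_child_state_gas` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class ParentFrame:
    gas_left: int        # regular gas
    state_gas_left: int  # reservoir


def return_child_state_gas(parent, child_state_gas_used, to_reservoir=True):
    """Hand a failed child's state gas back to the parent (illustrative)."""
    if to_reservoir:
        parent.state_gas_left += child_state_gas_used  # behavior described above
    else:
        parent.gas_left += child_state_gas_used        # the undetected mutation
    return parent


correct = return_child_state_gas(ParentFrame(1_000, 0), 300, to_reservoir=True)
mutated = return_child_state_gas(ParentFrame(1_000, 0), 300, to_reservoir=False)
```

Here both parents end with 1,300 gas in total, so an assertion on total gas cannot distinguish them. A test has to make a later operation depend on the reservoir specifically, or check the per-dimension block values.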

5. Collision block gas formula

On CREATE collision, no EVM execution happens. The burned gas should not be categorized as regular in the block formula (geth: txRegular = intrinsic.Regular + 0). Disabling the collision detection (treating burned gas as regular) passes all EIP-8037 tests. The collision test lives in paris/eip7610 and ported_static, not in the EIP-8037 suite.


These gaps were found via evmone's Amsterdam implementation. Tests for mutations 1-2 would be straightforward (CALL/SELFDESTRUCT to new account, verify block gas_used). Tests for 3-5 require specific gas boundary conditions.

@kclowes

kclowes commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

@spencer-tb #2646 addresses point 5 from Maria's doc too. Not sure if you want to pull those commits/that PR in here or keep them separate. I think they test a unique scenario from what you have here.

@spencer-tb spencer-tb force-pushed the eips/amsterdam/eip-8037 branch 2 times, most recently from 66f4f7a to e014423 Compare April 14, 2026 13:26
@spencer-tb spencer-tb closed this Apr 14, 2026
@spencer-tb spencer-tb force-pushed the eips/amsterdam/eip-8037 branch from e014423 to 9755dba Compare April 14, 2026 19:45
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Adds two tests to `test_state_gas_reservoir.py` that catch clients
misclassifying EIP-2930 access list gas as state gas under the
EIP-8037 two-dimensional accounting:

  test_access_list_gas_is_regular_not_state
    Parametrized over access list size (1, 10 entries) and
    storage keys per entry (0, 3). The tx has no state ops so
    block_state_gas_used must be zero and header.gas_used equals
    the regular intrinsic total.

  test_access_list_warm_savings_stay_regular
    Pre-warms a storage slot via the access list; the contract
    then runs a warm SLOAD + warm nonzero-to-nonzero SSTORE.
    All costs stay on the regular dimension; a client crediting
    the cold-to-warm delta to state gas would shift the header.

Ported from the closed PR ethereum#2639.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Ported from closed PR ethereum#2639. Adds four tests that exercise
scenarios the existing 2D accounting tests don't cover:

  test_tx_rejected_when_regular_gas_exceeds_block_limit_small
    Complements `test_block_regular_gas_limit`. Uses a tight block
    gas_limit (2 * intrinsic) and a rejected tx sized just one gas
    above the remaining regular budget. Catches clients that only
    handle the TX_MAX-sized case.

  test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining
    Parametrized over `tx_gas_limit_equals_block_limit` and
    `tx_gas_limit_just_above_remaining`. A preceding STOP tx
    consumes regular gas, then a second tx has
    `gas_limit >> TX_MAX_GAS_LIMIT`. The pre-execution check must
    use `min(TX_MAX_GAS_LIMIT, tx.gas - intrinsic.state)` not the
    raw `tx.gas_limit`; clients that subtract the full gas_limit
    reject this valid block.

  test_receipt_cumulative_differs_from_header_gas_used
    Explicit assertion that 2D `header.gas_used` can diverge
    from 1D receipt `cumulative_gas_used` when state dominates.
    Catches clients that mix up the two values.

  test_failed_create_tx_state_gas_dominates (parametrized
    `revert`, `halt`)
    Creation tx (to=None) with tight gas where initcode fails.
    Intrinsic state gas (GAS_NEW_ACCOUNT) is preserved across
    the top-level failure refund; tight regular budget keeps
    block_regular below `create_state_gas` so the state
    dimension dominates the header. Complements PR ethereum#2689's
    `test_creation_tx_failure_preserves_intrinsic_state_gas`
    by covering the REVERT path and the tight-gas scenario.
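
The pre-execution check described for `test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining` can be sketched as follows. This is an illustration only: the function names and the 60,000,000 block gas limit are hypothetical, and `TX_MAX_GAS_LIMIT` is assumed to be the EIP-7825 cap of 2**24.

```python
TX_MAX_GAS_LIMIT = 2**24  # assumed EIP-7825 cap (16,777,216)


def regular_gas_reserved(tx_gas_limit, intrinsic_state_gas):
    """Regular gas the tx can claim from the block, per the commit message above."""
    return min(TX_MAX_GAS_LIMIT, tx_gas_limit - intrinsic_state_gas)


def fits_correct(tx_gas_limit, intrinsic_state_gas, block_gas_limit, block_regular_used):
    remaining = block_gas_limit - block_regular_used
    return regular_gas_reserved(tx_gas_limit, intrinsic_state_gas) <= remaining


def fits_buggy(tx_gas_limit, block_gas_limit, block_regular_used):
    """The faulty variant: compares the raw gas_limit against the remaining budget."""
    return tx_gas_limit <= block_gas_limit - block_regular_used


# Hypothetical block: a STOP tx used 21,000 regular gas, then a second tx
# arrives with gas_limit equal to the block gas limit.
BLOCK_GAS_LIMIT = 60_000_000
accepted = fits_correct(BLOCK_GAS_LIMIT, 0, BLOCK_GAS_LIMIT, 21_000)
rejected_by_bug = not fits_buggy(BLOCK_GAS_LIMIT, BLOCK_GAS_LIMIT, 21_000)
```

With these numbers the capped check accepts the second transaction while the raw comparison rejects it, which matches the failure mode the test is described as catching.
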
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Ported from closed PR ethereum#2639. Covers the PR ethereum#2608 ordering
requirement: the MAX_INITCODE_SIZE check must run before the
account-creation state gas charge.

  test_oversized_initcode_tx_no_state_gas (parametrized
    `at_max`, `over_max`)
    Creation tx whose initcode is exactly MAX_INITCODE_SIZE
    (accepted) or one byte over (rejected with
    INITCODE_SIZE_EXCEEDED). If state gas were charged before
    the size check, block_state_gas_used would include a
    spurious GAS_NEW_ACCOUNT for an account never created.

  test_oversized_initcode_opcode_no_state_gas (parametrized
    `at_max`, `over_max`, `CREATE`, `CREATE2`)
    Factory calls CREATE/CREATE2 opcode with initcode exceeding
    the limit; the opcode returns 0 with no state gas consumed.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Ports three tests from the closed PR ethereum#2639 that cover reservoir
behavior paths not exercised by the merged ethereum#2689/ethereum#2704/ethereum#2707
tests.

  test_top_level_halt_preserves_restored_reservoir (parametrized
    reservoir_delta in {-1, 0, 1} x child_termination in {revert,
    halt})
    Regression test for the bal-devnet-3 Besu bug (ethereum#2644). Child
    runs an SSTORE then fails, restoring state gas to the parent.
    Parent then INVALIDs, triggering the top-level failure
    refund. Expected `header.gas_used =
    gas_limit_cap + min(reservoir_delta, 0)` so the reservoir
    (including any spill-restore) is preserved across the halt.

  test_callcode_value_no_new_account_state_gas
    CALLCODE transfers value to the caller, not to the target,
    so no new-account state gas is ever charged regardless of
    whether the target exists. The reservoir stays intact for a
    subsequent SSTORE.

  test_create_oog_during_state_gas_charge
    Parent CALLs an inner with only 20k gas forwarded. The
    inner's CREATE charges GAS_NEW_ACCOUNT which exceeds the
    forwarded budget, OOGing before any state gas lands. Per
    PR ethereum#2704 the refund restores the parent's reservoir and the
    parent's subsequent SSTORE succeeds from it.
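
The expected header value for `test_top_level_halt_preserves_restored_reservoir` can be tabulated directly from the formula in the commit message. The `gas_limit_cap` value used below is a placeholder for illustration, not a figure taken from the test.

```python
def expected_header_gas_used(gas_limit_cap, reservoir_delta):
    """header.gas_used = gas_limit_cap + min(reservoir_delta, 0), as stated above."""
    return gas_limit_cap + min(reservoir_delta, 0)


GAS_LIMIT_CAP = 16_777_216  # placeholder value for the sketch

# One entry per parametrized reservoir_delta.
expected = {delta: expected_header_gas_used(GAS_LIMIT_CAP, delta) for delta in (-1, 0, 1)}
```

Only the negative delta lowers the header value; a delta of 0 or +1 both yield exactly `gas_limit_cap`, so the three parameters bracket the boundary from both sides.
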
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
Ports the remaining two tests from the
`feat/eip-8037-additional-tests` / `feat/eip-8037-tests-devnet3`
branches that were not yet covered.

  test_nested_create_fail_parent_revert_state_gas
    Two-layer refund composition: caller CALLs factory, factory
    does CREATE with failing initcode, factory then REVERTs or
    STOPs. Parametrized over `child_failure` (revert, halt) x
    `parent_reverts` x `create_opcode`. Verifies the nonce
    side effect of factory's CREATE is rolled back when the
    parent reverts, and preserved (nonce=2) when it STOPs.
    Complements PR ethereum#2704's single-layer refund tests by
    exercising the caller→factory→inner chain through
    `incorporate_child_on_error` at both depths.

  test_create_stack_depth_state_gas_consumed
    Deep-recursion robustness check. The contract CALLs itself
    until gas exhaustion (EIP-150 63/64 rule limits effective
    depth well below STACK_DEPTH_LIMIT at the current
    `gas_limit_cap`; reaching depth 1024 is physically
    infeasible since the cumulative survival factor is
    `(63/64)**1024 ≈ 1e-7`). As recursion unwinds, frames run
    an SSTORE; the outermost frame's SSTORE must succeed,
    proving the reservoir threads through nested CALLs intact.
    Docstring notes that despite the name (retained for
    continuity with closed PR ethereum#2639), this exercises CALL's
    silent-failure branch rather than `generic_create`'s
    depth-1024 branch (which is unreachable at current gas
    params — effectively dead code in the spec).
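
The survival factor quoted for `test_create_stack_depth_state_gas_consumed` is easy to reproduce. The sketch below ignores per-call overhead and only applies the EIP-150 63/64 forwarding rule; the starting gas value is a hypothetical example.

```python
def survival_factor(depth):
    """Fraction of the original gas still forwardable after `depth` nested calls,
    ignoring every per-call cost other than the 63/64 retention rule."""
    return (63 / 64) ** depth


def gas_at_depth(start_gas, depth):
    """Upper bound on the gas available to the frame at the given depth."""
    return start_gas * survival_factor(depth)


factor_1024 = survival_factor(1024)          # on the order of 1e-7
gas_1024 = gas_at_depth(16_777_216, 1024)    # hypothetical 2**24 starting gas
```

Even with a 2**24 starting budget and zero per-call overhead, fewer than two units of gas would reach depth 1024, which is consistent with the commit message's point that the depth-1024 branch cannot be reached at current gas parameters.
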
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 19, 2026
The three regression-fix tests in commit 4828ae6 used hardcoded
empirical `block_regular` dicts (per CREATE/CREATE2 x
self/external variant) to discriminate a spurious
`GAS_NEW_ACCOUNT` charge on the CALL. The dicts are brittle to any
regular-gas constant change and the spurious-charge discriminator
is redundant: PR ethereum#2707's own tests (`test_create_selfdestruct_*`)
already exercise the refund path.

Drop `header_verify` from:
  test_call_value_to_self_destructed_header_gas_used
  test_call_value_to_self_destructed_burns_value
  test_call_zero_value_to_self_destructed_same_tx_account

The tests still verify runtime behavior: NONEXISTENT created
address and orchestrator balance burned to zero.

Also adds a cross-over test for the ethereum#2704 + ethereum#2689 refund
composition that PR ethereum#2704 does not exercise directly:

  test_inner_create_fail_refunds_in_creation_tx (parametrized
    `outer_outcome` in {succeeds, reverts}, `num_inner_ops` in
    {1, 3}, `create_opcode` in {CREATE, CREATE2})
    Creation tx with `num_inner_ops` inner CREATE/CREATE2 calls
    whose initcode REVERTs. Each inner CREATE's GAS_NEW_ACCOUNT
    is refunded by PR ethereum#2704. Outer then succeeds or reverts.
    block_state == outer intrinsic in both cases; a client that
    regressed to pre-ethereum#2704 "gas persists" behavior would inflate
    it by `num_inner_ops * GAS_NEW_ACCOUNT`. Rewrites the
    inverted-premise test from the closed PR ethereum#2639.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 20, 2026
Four new tests derived from evmone's mutation testing report on the
closed PR ethereum#2639 (GitHub comment):

* `test_call_new_account_no_regular_account_creation_cost` —
  catches restoring the pre-Amsterdam 25,000 regular
  `ACCOUNT_CREATION_COST` for CALL-to-new-account. Tight-gas
  discriminator: 20,000 slack (< 25,000) so the mutation OOGs the
  caller and the target is never funded.
* `test_selfdestruct_new_beneficiary_no_regular_account_creation_cost`
  — same 25,000 regular charge removal, SELFDESTRUCT variant.
  Mutation OOGs the SELFDESTRUCT and the beneficiary stays at 0
  balance.
* `test_child_failure_refunds_state_gas_to_reservoir_not_gas_left`
  — catches routing `incorporate_child_on_error`'s state-gas
  restoration to `gas_left` instead of `state_gas_left`.
  Grandchild forwarded a stipend sized for one cold SSTORE's
  regular cost only — with the mutation the grandchild's
  reservoir is zero and the SSTORE's state-gas spill OOGs.
* `test_create_collision_burned_gas_counted_in_block_regular` —
  catches removing `regular_gas_used += create_message_gas` on the
  CREATE/CREATE2 collision branch. `block_state_gas` is zero for
  this tx so `header.gas_used == block_regular`; dropping the
  burned-gas accounting reduces the header by `create_message_gas`.

Each mutation was temporarily injected into the spec to confirm
the test fails under the bug and passes without it.

Mutation 3 (SSTORE regular-before-state ordering) is already
covered by `test_sstore_oog_reservoir_inflation_detection` in
`test_state_gas_ordering.py`.
spencer-tb added a commit to spencer-tb/execution-specs that referenced this pull request Apr 20, 2026
Five mutation-testing regression tests surfacing gaps not caught
by the existing EIP-8037 suite. Each was validated by temporarily
injecting its target mutation into the spec and confirming the
test fails under the bug.

Four from evmone's mutation-testing report on the closed PR ethereum#2639
(GitHub review comment):

* `test_call_new_account_no_regular_account_creation_cost` —
  catches restoring the pre-Amsterdam 25,000 regular
  `ACCOUNT_CREATION_COST` for CALL-to-new-account.
* `test_selfdestruct_new_beneficiary_no_regular_account_creation_cost`
  — same 25,000 regular charge removal, SELFDESTRUCT variant.
* `test_child_failure_refunds_state_gas_to_reservoir_not_gas_left`
  — catches routing `incorporate_child_on_error`'s state-gas
  restoration to `gas_left` instead of `state_gas_left`.
* `test_create_collision_burned_gas_counted_in_block_regular` —
  catches removing `regular_gas_used += create_message_gas` on
  the CREATE/CREATE2 collision branch.

One from local mutation testing against the spec:

* `test_create_selfdestruct_code_deposit_refund_header_check` —
  catches dropping the `len(code) * cost_per_state_byte` portion
  of the same-tx CREATE+SELFDESTRUCT refund. The existing
  `test_create_selfdestruct_refunds_code_deposit_state_gas` only
  asserts the created account is gone; it never verifies the
  refund reduced `block_state_gas_used`.

evmone's mutation 3 (SSTORE regular-before-state ordering) is
already covered by `test_sstore_oog_reservoir_inflation_detection`
in `test_state_gas_ordering.py`.