feat(tests): EIP-8037 additional state gas test coverage #2639
spencer-tb wants to merge 0 commit into
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## eips/amsterdam/eip-8037 #2639 +/- ##
==========================================================
Coverage ? 88.17%
==========================================================
Files ? 524
Lines ? 31088
Branches ? 3036
==========================================================
Hits ? 27412
Misses ? 3161
Partials ? 515
Missing EIP-8037 test coverage: mutation testing results
While implementing EIP-8037 in evmone, I ran mutation testing against the
5 mutations were NOT detected (all 210 tests still pass with the bug):
1. CALL account creation cost not removed for Amsterdam
Keeping the pre-Amsterdam
2. SELFDESTRUCT account creation cost not removed for Amsterdam
Same as above: keeping
3. SSTORE charge order (regular-before-state vs state-before-regular)
Per geth/Nethermind/revm, regular gas must be charged before state gas to prevent state gas spill from inflating
4. Child failure state gas return destination (reservoir vs gas_left)
On child CALL/CREATE failure (revert/OOG), the child's
5. Collision block gas formula
On CREATE collision, no EVM execution happens. The burned gas should not be categorized as regular in the block formula (geth:
These gaps were found via evmone's Amsterdam implementation. Tests for mutations 1-2 would be straightforward (CALL/SELFDESTRUCT to new account, verify block
@spencer-tb #2646 addresses point 5 from Maria's doc too. Not sure if you want to pull those commits/that PR in here or keep them separate. I think they test a unique scenario from what you have here.
Adds two tests to `test_state_gas_reservoir.py` that catch clients
misclassifying EIP-2930 access list gas as state gas under the
EIP-8037 two-dimensional accounting:
test_access_list_gas_is_regular_not_state
Parametrized over access list size (1, 10 entries) and
storage keys per entry (0, 3). The tx has no state ops so
block_state_gas_used must be zero and header.gas_used equals
the regular intrinsic total.
test_access_list_warm_savings_stay_regular
Pre-warms a storage slot via the access list; the contract
then runs a warm SLOAD + warm nonzero-to-nonzero SSTORE.
All costs stay on the regular dimension; a client crediting
the cold-to-warm delta to state gas would shift the header.
Ported from the closed PR ethereum#2639.
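A minimal sketch of why misclassifying access list gas shifts the header. It uses the EIP-2930 access list costs and the 21,000 base transaction cost, ignores calldata, and assumes the header reports the larger of the two dimensions; the function names are hypothetical and not taken from the spec.

```python
TX_BASE_COST = 21_000                 # base intrinsic cost of a transaction
ACCESS_LIST_ADDRESS_COST = 2_400      # EIP-2930, per access list entry
ACCESS_LIST_STORAGE_KEY_COST = 1_900  # EIP-2930, per storage key


def access_list_gas(entries: int, keys_per_entry: int) -> int:
    """Intrinsic gas contributed by the access list alone."""
    per_entry = ACCESS_LIST_ADDRESS_COST + keys_per_entry * ACCESS_LIST_STORAGE_KEY_COST
    return entries * per_entry


def header_gas_used(block_regular: int, block_state: int) -> int:
    """Assumption: the 2D header value is the dominating dimension."""
    return max(block_regular, block_state)


def correct_header(entries: int, keys_per_entry: int) -> int:
    """Access list gas stays regular; the tx performs no state operations."""
    regular = TX_BASE_COST + access_list_gas(entries, keys_per_entry)
    return header_gas_used(regular, 0)


def misclassified_header(entries: int, keys_per_entry: int) -> int:
    """Hypothetical buggy client that books access list gas as state gas."""
    return header_gas_used(TX_BASE_COST, access_list_gas(entries, keys_per_entry))
```

Under these assumptions the two functions disagree for every parametrized size, which is what lets a header check discriminate the bug.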
Ported from closed PR ethereum#2639. Adds four tests that exercise scenarios the existing 2D accounting tests don't cover:
test_tx_rejected_when_regular_gas_exceeds_block_limit_small
Complements `test_block_regular_gas_limit`. Uses a tight block gas_limit (2 * intrinsic) and a rejected tx sized just one gas above the remaining regular budget. Catches clients that only handle the TX_MAX-sized case.
test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining
Parametrized over `tx_gas_limit_equals_block_limit` and `tx_gas_limit_just_above_remaining`. A preceding STOP tx consumes regular gas, then a second tx has `gas_limit >> TX_MAX_GAS_LIMIT`. The pre-execution check must use `min(TX_MAX_GAS_LIMIT, tx.gas - intrinsic.state)`, not the raw `tx.gas_limit`; clients that subtract the full gas_limit reject this valid block.
test_receipt_cumulative_differs_from_header_gas_used
Explicit assertion that 2D `header.gas_used` can diverge from 1D receipt `cumulative_gas_used` when state dominates. Catches clients that mix up the two values.
test_failed_create_tx_state_gas_dominates (parametrized `revert`, `halt`)
Creation tx (to=None) with tight gas where initcode fails. Intrinsic state gas (GAS_NEW_ACCOUNT) is preserved across the top-level failure refund; tight regular budget keeps block_regular below `create_state_gas` so the state dimension dominates the header. Complements PR ethereum#2689's `test_creation_tx_failure_preserves_intrinsic_state_gas` by covering the REVERT path and the tight-gas scenario.
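The pre-execution check described above can be sketched as follows. This is an illustration only: the `TX_MAX_GAS_LIMIT` value is assumed to be the EIP-7825 cap, and the function names and the naive variant are hypothetical.

```python
TX_MAX_GAS_LIMIT = 16_777_216  # assumption: EIP-7825 per-transaction cap


def regular_gas_reserved(tx_gas_limit: int, intrinsic_state_gas: int) -> int:
    """Regular gas a tx can claim: min(TX_MAX_GAS_LIMIT, tx.gas - intrinsic.state)."""
    return min(TX_MAX_GAS_LIMIT, tx_gas_limit - intrinsic_state_gas)


def tx_fits(block_gas_limit: int, block_regular_used: int,
            tx_gas_limit: int, intrinsic_state_gas: int) -> bool:
    """Capped check: compare the reserved regular gas with the remaining budget."""
    remaining = block_gas_limit - block_regular_used
    return regular_gas_reserved(tx_gas_limit, intrinsic_state_gas) <= remaining


def tx_fits_naive(block_gas_limit: int, block_regular_used: int,
                  tx_gas_limit: int) -> bool:
    """Hypothetical buggy check that subtracts the raw tx.gas_limit."""
    return tx_gas_limit <= block_gas_limit - block_regular_used
```

With a 60,000,000 block limit, 21,000 regular gas already used by the STOP tx, and a second tx whose gas limit equals the block limit, the capped check accepts the tx while the naive one rejects it. The tight case (block limit of 2 * 21,000 and a tx one gas above the remaining budget) is rejected by both.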
Ported from closed PR ethereum#2639. Covers the PR ethereum#2608 ordering requirement: the MAX_INITCODE_SIZE check must run before the account-creation state gas charge.
test_oversized_initcode_tx_no_state_gas (parametrized `at_max`, `over_max`)
Creation tx whose initcode is exactly MAX_INITCODE_SIZE (accepted) or one byte over (rejected with INITCODE_SIZE_EXCEEDED). If state gas were charged before the size check, block_state_gas_used would include a spurious GAS_NEW_ACCOUNT for an account never created.
test_oversized_initcode_opcode_no_state_gas (parametrized `at_max`, `over_max`, `CREATE`, `CREATE2`)
Factory calls CREATE/CREATE2 opcode with initcode exceeding the limit; the opcode returns 0 with no state gas consumed.
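The ordering requirement can be illustrated with a small sketch. The `Block` class, the exception, and the function name are hypothetical; `MAX_INITCODE_SIZE` is assumed to be the EIP-3860 value, and `GAS_NEW_ACCOUNT` uses the 131,488 figure quoted in the PR description.

```python
from dataclasses import dataclass

MAX_INITCODE_SIZE = 49_152  # assumption: EIP-3860 value
GAS_NEW_ACCOUNT = 131_488   # value quoted in the PR description


class InitcodeSizeExceeded(Exception):
    """Raised when a creation tx carries oversized initcode."""


@dataclass
class Block:
    state_gas_used: int = 0


def charge_creation_tx(block: Block, initcode_len: int) -> None:
    """Size check first, state gas charge second."""
    if initcode_len > MAX_INITCODE_SIZE:
        # Rejected before any state gas is booked for the new account.
        raise InitcodeSizeExceeded(initcode_len)
    block.state_gas_used += GAS_NEW_ACCOUNT
```

Swapping the two statements would leave `state_gas_used` at `GAS_NEW_ACCOUNT` after a rejected tx, which is the spurious charge the tests look for.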
Ports three tests from the closed PR ethereum#2639 that cover reservoir behavior paths not exercised by the merged ethereum#2689/ethereum#2704/ethereum#2707 tests.
test_top_level_halt_preserves_restored_reservoir (parametrized reservoir_delta in {-1, 0, 1} x child_termination in {revert, halt})
Regression test for the bal-devnet-3 Besu bug (ethereum#2644). Child runs an SSTORE then fails, restoring state gas to the parent. Parent then INVALIDs, triggering the top-level failure refund. Expected `header.gas_used = gas_limit_cap + min(reservoir_delta, 0)` so the reservoir (including any spill-restore) is preserved across the halt.
test_callcode_value_no_new_account_state_gas
CALLCODE transfers value to the caller, not to the target, so no new-account state gas is ever charged regardless of whether the target exists. The reservoir stays intact for a subsequent SSTORE.
test_create_oog_during_state_gas_charge
Parent CALLs an inner with only 20k gas forwarded. The inner's CREATE charges GAS_NEW_ACCOUNT which exceeds the forwarded budget, OOGing before any state gas lands. Per PR ethereum#2704 the refund restores the parent's reservoir and the parent's subsequent SSTORE succeeds from it.
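As a worked example, the expected header value for `test_top_level_halt_preserves_restored_reservoir` can be evaluated for each parametrized `reservoir_delta`. The `gas_limit_cap` value below is a placeholder chosen for illustration, not a value taken from the test.

```python
def expected_header_gas_used(gas_limit_cap: int, reservoir_delta: int) -> int:
    """header.gas_used = gas_limit_cap + min(reservoir_delta, 0)."""
    return gas_limit_cap + min(reservoir_delta, 0)


GAS_LIMIT_CAP = 16_777_216  # hypothetical cap used only for this example

# One expected header value per parametrized reservoir_delta.
expected = {
    delta: expected_header_gas_used(GAS_LIMIT_CAP, delta) for delta in (-1, 0, 1)
}
```

Only the negative delta lowers the expected value; a zero or positive delta leaves it at the cap.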
Ports the remaining two tests from the
`feat/eip-8037-additional-tests` / `feat/eip-8037-tests-devnet3`
branches that were not yet covered.
test_nested_create_fail_parent_revert_state_gas
Two-layer refund composition: caller CALLs factory, factory
does CREATE with failing initcode, factory then REVERTs or
STOPs. Parametrized over `child_failure` (revert, halt) x
`parent_reverts` x `create_opcode`. Verifies the nonce
side effect of factory's CREATE is rolled back when the
parent reverts, and preserved (nonce=2) when it STOPs.
Complements PR ethereum#2704's single-layer refund tests by
exercising the caller→factory→inner chain through
`incorporate_child_on_error` at both depths.
test_create_stack_depth_state_gas_consumed
Deep-recursion robustness check. The contract CALLs itself
until gas exhaustion (EIP-150 63/64 rule limits effective
depth well below STACK_DEPTH_LIMIT at the current
`gas_limit_cap`; reaching depth 1024 is physically
infeasible since the cumulative survival factor is
`(63/64)**1024 ≈ 1e-7`). As recursion unwinds, frames run
an SSTORE; the outermost frame's SSTORE must succeed,
proving the reservoir threads through nested CALLs intact.
Docstring notes that despite the name (retained for
continuity with closed PR ethereum#2639), this exercises CALL's
silent-failure branch rather than `generic_create`'s
depth-1024 branch (which is unreachable at current gas
params — effectively dead code in the spec).
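The infeasibility argument can be checked numerically. This sketch applies the idealized 63/64 factor per call level and ignores per-call overhead (which only makes deep recursion harder); the starting gas is a hypothetical value.

```python
RETAINED_FRACTION = 63 / 64  # EIP-150: at most 63/64 of remaining gas is forwarded


def survival_factor(depth: int) -> float:
    """Fraction of the initial gas still available after `depth` nested calls."""
    return RETAINED_FRACTION ** depth


def gas_at_depth(initial_gas: int, depth: int) -> float:
    """Idealized gas left at a given depth, ignoring all call overhead."""
    return initial_gas * survival_factor(depth)


factor_1024 = survival_factor(1024)
# Hypothetical starting gas of 16,777,216 for illustration.
remaining_1024 = gas_at_depth(16_777_216, 1024)
```

Under these assumptions `factor_1024` is on the order of `1e-7`, so fewer than two gas units of the hypothetical starting budget would remain at depth 1024.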
The three regression-fix tests in commit 4828ae6 used hardcoded empirical `block_regular` dicts (per CREATE/CREATE2 x self/external variant) to discriminate a spurious `GAS_NEW_ACCOUNT` charge on the CALL. The dicts are brittle to any regular-gas constant change and the spurious-charge discriminator is redundant: PR ethereum#2707's own tests (`test_create_selfdestruct_*`) already exercise the refund path.
Drop `header_verify` from:
test_call_value_to_self_destructed_header_gas_used
test_call_value_to_self_destructed_burns_value
test_call_zero_value_to_self_destructed_same_tx_account
The tests still verify runtime behavior: NONEXISTENT created address and orchestrator balance burned to zero.
Also adds a cross-over test for the ethereum#2704 + ethereum#2689 refund composition that PR ethereum#2704 does not exercise directly:
test_inner_create_fail_refunds_in_creation_tx (parametrized `outer_outcome` in {succeeds, reverts}, `num_inner_ops` in {1, 3}, `create_opcode` in {CREATE, CREATE2})
Creation tx with `num_inner_ops` inner CREATE/CREATE2 calls whose initcode REVERTs. Each inner CREATE's GAS_NEW_ACCOUNT is refunded by PR ethereum#2704. Outer then succeeds or reverts. block_state == outer intrinsic in both cases; a client that regressed to pre-ethereum#2704 "gas persists" behavior would inflate it by `num_inner_ops * GAS_NEW_ACCOUNT`.
Rewrites the inverted-premise test from the closed PR ethereum#2639.
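The expected `block_state` for `test_inner_create_fail_refunds_in_creation_tx` can be sketched as simple arithmetic. The function name and the `inner_refunded` flag are hypothetical; the outer intrinsic state gas is taken to be `GAS_NEW_ACCOUNT`, as stated for creation transactions elsewhere in this PR.

```python
GAS_NEW_ACCOUNT = 131_488  # value quoted in the PR description


def block_state_gas(num_inner_ops: int, inner_refunded: bool) -> int:
    """State gas booked for a creation tx with failing inner CREATEs."""
    outer_intrinsic = GAS_NEW_ACCOUNT           # creation tx (to=None)
    charged = num_inner_ops * GAS_NEW_ACCOUNT   # charged when each inner CREATE starts
    refunded = charged if inner_refunded else 0  # refund on inner failure
    return outer_intrinsic + charged - refunded
```

With the refund in place the result equals the outer intrinsic for any `num_inner_ops`; without it the result grows by `num_inner_ops * GAS_NEW_ACCOUNT`, which is the inflation the test detects.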
Four new tests derived from evmone's mutation testing report on the closed PR ethereum#2639 (GitHub comment):
* `test_call_new_account_no_regular_account_creation_cost`: catches restoring the pre-Amsterdam 25,000 regular `ACCOUNT_CREATION_COST` for CALL-to-new-account. Tight-gas discriminator: 20,000 slack (< 25,000) so the mutation OOGs the caller and the target is never funded.
* `test_selfdestruct_new_beneficiary_no_regular_account_creation_cost`: same 25,000 regular charge removal, SELFDESTRUCT variant. Mutation OOGs the SELFDESTRUCT and the beneficiary stays at 0 balance.
* `test_child_failure_refunds_state_gas_to_reservoir_not_gas_left`: catches routing `incorporate_child_on_error`'s state-gas restoration to `gas_left` instead of `state_gas_left`. Grandchild forwarded a stipend sized for one cold SSTORE's regular cost only; with the mutation the grandchild's reservoir is zero and the SSTORE's state-gas spill OOGs.
* `test_create_collision_burned_gas_counted_in_block_regular`: catches removing `regular_gas_used += create_message_gas` on the CREATE/CREATE2 collision branch. `block_state_gas` is zero for this tx so `header.gas_used == block_regular`; dropping the burned-gas accounting reduces the header by `create_message_gas`.
Each mutation was temporarily injected into the spec to confirm the test fails under the bug and passes without it. Mutation 3 (SSTORE regular-before-state ordering) is already covered by `test_sstore_oog_reservoir_inflation_detection` in `test_state_gas_ordering.py`.
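The tight-gas discriminator used by the first two tests can be sketched as follows. The baseline cost is a placeholder, and the function is a hypothetical model of the check, not code from the tests.

```python
ACCOUNT_CREATION_COST = 25_000  # pre-Amsterdam regular charge restored by the mutation
SLACK = 20_000                  # headroom given to the caller, deliberately < 25,000


def call_succeeds(gas_available: int, baseline_cost: int, mutated: bool) -> bool:
    """True if the caller can pay for the operation under the given rules."""
    extra = ACCOUNT_CREATION_COST if mutated else 0
    return gas_available >= baseline_cost + extra


BASELINE_COST = 9_000  # hypothetical regular cost of the CALL without the mutation
GAS_AVAILABLE = BASELINE_COST + SLACK
```

Because the slack is smaller than the restored charge, the unmutated path succeeds and the mutated path runs out of gas, regardless of the baseline chosen.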
Five mutation-testing regression tests surfacing gaps not caught by the existing EIP-8037 suite. Each was validated by temporarily injecting its target mutation into the spec and confirming the test fails under the bug.
Four from evmone's mutation-testing report on the closed PR ethereum#2639 (GitHub review comment):
* `test_call_new_account_no_regular_account_creation_cost`: catches restoring the pre-Amsterdam 25,000 regular `ACCOUNT_CREATION_COST` for CALL-to-new-account.
* `test_selfdestruct_new_beneficiary_no_regular_account_creation_cost`: same 25,000 regular charge removal, SELFDESTRUCT variant.
* `test_child_failure_refunds_state_gas_to_reservoir_not_gas_left`: catches routing `incorporate_child_on_error`'s state-gas restoration to `gas_left` instead of `state_gas_left`.
* `test_create_collision_burned_gas_counted_in_block_regular`: catches removing `regular_gas_used += create_message_gas` on the CREATE/CREATE2 collision branch.
One from local mutation testing against the spec:
* `test_create_selfdestruct_code_deposit_refund_header_check`: catches dropping the `len(code) * cost_per_state_byte` portion of the same-tx CREATE+SELFDESTRUCT refund. The existing `test_create_selfdestruct_refunds_code_deposit_state_gas` only asserts the created account is gone; it never verifies the refund reduced `block_state_gas_used`.
evmone's mutation 3 (SSTORE regular-before-state ordering) is already covered by `test_sstore_oog_reservoir_inflation_detection` in `test_state_gas_ordering.py`.
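A rough sketch of why the code-deposit mutation is only visible in `block_state_gas_used`. It assumes the same-tx CREATE+SELFDESTRUCT refund consists of `GAS_NEW_ACCOUNT` plus the code-deposit portion; the `COST_PER_STATE_BYTE` value and the function name are placeholders for illustration.

```python
GAS_NEW_ACCOUNT = 131_488    # value quoted in the PR description
COST_PER_STATE_BYTE = 1_174  # hypothetical value for illustration


def state_gas_after_refund(code_len: int, refund_code_deposit: bool) -> int:
    """Net state gas left for a contract created and destroyed in one tx."""
    deposit = code_len * COST_PER_STATE_BYTE
    charged = GAS_NEW_ACCOUNT + deposit
    # Assumption: the refund covers the account charge plus the code deposit.
    refund = GAS_NEW_ACCOUNT + (deposit if refund_code_deposit else 0)
    return charged - refund
```

In both variants the created account no longer exists after the transaction, so an existence check cannot tell them apart; only the remaining state gas differs, by `code_len * COST_PER_STATE_BYTE`.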
🗒️ Description
Additional EIP-8037 test coverage for state gas edge cases found during devnet testing, spec reviews, and Maria's state gas accounting review.
Reservoir and top-level halt (bal-devnet-3 Besu bug)
* `test_top_level_halt_preserves_restored_reservoir`: Reservoir survives top-level exceptional halt after child spill-restore. When a child frame spills state gas and the top-level frame OOGs, the reservoir must be refunded, not consumed. Besu was consuming all state gas in this case (fixed in f4c8b36). Parametrized: `child_termination` (revert, halt) x `reservoir_delta` (-1, 0, +1). cc @daniellehrner @jochem-brouwer, original test PR: Test reservoir refunded at top level when parent halts after child sp… #2644
Block 2D gas accounting
* `test_failed_create_tx_state_gas_dominates`: CREATE tx with tight gas where state gas dominates. Catches 1D accounting bugs. Parametrized: `init_code` (revert, halt).
* `test_failed_create_opcode_state_gas_dominates`: CREATE/CREATE2 opcode failures with 2D header verification. Parametrized: `init_code` (revert, halt, oog) x `create_opcode` (CREATE, CREATE2). cc @chfast
* `test_block_2d_gas_tx_gas_limit_exceeds_regular_remaining`: Tx with `gas_limit >> TX_MAX_GAS_LIMIT` accepted when capped regular gas fits. Parametrized: `tx2_gas_limit_equals_block_gas_limit` (true, false). cc @jwasinger
* `test_tx_rejected_when_regular_gas_exceeds_block_limit`: Tx rejected when cumulative regular gas exceeds block limit. cc @kclowes, original test PR: feat(tests): 8037 - Tx rejected when regular gas exceeds block limit #2654
* `test_access_list_gas_is_regular_not_state`: Access list gas classified as regular intrinsic gas with `header_verify`. Catches clients that incorrectly put access list gas into the state budget. Parametrized: `num_access_list_entries` (1, 10) x `slots_per_entry` (0, 3). cc @MariusVanDerWijden
Initcode size validation ordering
* `test_oversized_initcode_tx_no_state_gas`: CREATE tx with oversized initcode rejected before state gas charged. Parametrized: `initcode_size_delta` (0, +1). cc @chfast
* `test_oversized_initcode_opcode_no_state_gas`: CREATE/CREATE2 opcode with oversized initcode. Parametrized: `initcode_size_delta` (0, +1) x `create_opcode` (CREATE, CREATE2). cc @chfast
Nested CREATE state gas persistence (bal-devnet-3 geth bug)
* `test_inner_create_state_gas_persists_on_failure`: Inner CREATE/CREATE2 fails but GAS_NEW_ACCOUNT persists (charged to parent frame before child starts). Geth was incorrectly restoring this gas to the parent's reservoir on nested CREATE failures (block 1031, diff = 131,488 = GAS_NEW_ACCOUNT). Parametrized: `inner_op` (CREATE, CREATE2) x `outer_outcome` (succeeds, reverts, halts) x `num_inner_ops` (1, 3). cc @MariusVanDerWijden
* `test_inner_create_succeeds_code_deposit_state_gas`: Inner CREATE succeeds with code deposit, outer succeeds/reverts/halts. Both account-creation and code deposit gas counted. Parametrized: `create_opcode` (CREATE, CREATE2) x `outer_outcome` (succeeds, reverts, halts).
CREATE silent failure state gas (Maria point 3)
* `test_create_nonce_overflow_state_gas_consumed`: CREATE at max nonce, GAS_NEW_ACCOUNT consumed despite failure, reservoir preserved.
* `test_create_stack_depth_state_gas_consumed`: CREATE at depth 1024, GAS_NEW_ACCOUNT consumed despite failure, reservoir preserved.
* `test_create2_collision_state_gas_block_accounting`: CREATE2 collision with `header_verify` confirming 2 x GAS_NEW_ACCOUNT in block state gas.
SELFDESTRUCT and OOG edge cases (Maria points 4, 5)
* `test_selfdestruct_in_create_tx_initcode`: SELFDESTRUCT to new beneficiary in same-creation context (EIP-6780). Both outer and beneficiary account gas charged.
* `test_create_selfdestruct_same_tx_no_state_gas_refund`: CREATE+SELFDESTRUCT same TX, net-zero state but GAS_NEW_ACCOUNT not refunded. Consistent with EIP-3529. Parametrized: `create_opcode` (CREATE, CREATE2).
* `test_call_value_to_selfdestructed_same_tx_account`: CALL with value to same-TX self-destructed account does NOT charge GAS_NEW_ACCOUNT (account still alive during execution). Parametrized: `create_opcode` (CREATE, CREATE2).
* `test_create_oog_during_state_gas_charge`: CREATE OOGs during state gas charge itself. No state gas consumed, parent recovers reservoir.
🔗 Related Issues or PRs
* `min(TX_MAX_GAS_LIMIT, tx.gas_limit)` tx inclusion rule
* `execution_state_gas_used`
✅ Checklist
* `just static`
* `type(scope):`
* `mkdocs serve` locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.