Skip to content

fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests#2532

Merged
felix314159 merged 6 commits into
ethereum:eips/amsterdam/eip-8037from
felix314159:eips/amsterdam/eip-8037
Mar 20, 2026
Merged

fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests#2532
felix314159 merged 6 commits into
ethereum:eips/amsterdam/eip-8037from
felix314159:eips/amsterdam/eip-8037

Conversation

@felix314159

Copy link
Copy Markdown
Contributor

jangko from nimbus notified us that in our bal 5.5.1 fixtures there is a block hash mismatch between blockchain_test and blockchain_test_engine for test_withdrawal_requests.

Cause of block_hash mismatches

Since pytest runs the blocks fixture separately for each fixture format (once for blockchain_test, once for blockchain_test_engine), and the WithdrawalRequestContract objects are shared across those runs (they're pytest parameter objects), the gas limit was accumulated twice — adding 121404 each time. This made the transaction RLP differ between formats, which changed the transactions_trie, which changed the block_hash.

Why only contract tests were affected: Only WithdrawalRequestContract had this gas adjustment. EOA-based tests (WithdrawalRequestTransaction) didn't get the adjustment, so their hashes were consistent.

Fix

Save the original gas limit before the first adjustment and always compute from that base value.

How to reproduce

Before applying this PR's fix run:
uv run fill --clean -k "test_withdrawal_requests" tests/prague/eip7002_el_triggerable_withdrawals/test_withdrawal_requests.py --fork=Amsterdam -s

and then verify issue via:

import json

bt = json.load(
    open(
        "fixtures/blockchain_tests/for_amsterdam/prague/eip7002_el_triggerable_withdrawals/withdrawal_requests/withdrawal_requests.json"
    )
)

be = json.load(
    open(
        "fixtures/blockchain_tests_engine/for_amsterdam/prague/eip7002_el_triggerable_withdrawals/withdrawal_requests/withdrawal_requests.json"
    )
)


def n(k):
    return k.replace("blockchain_test_engine", "X").replace(
        "blockchain_test", "X"
    )


bm = {n(k): k for k in bt}
em = {n(k): k for k in be}

d = [
    (
        nk,
        bt[bm[nk]]["blocks"][-1]["blockHeader"]["hash"],
        be[em[nk]]["lastblockhash"],
    )
    for nk in sorted(bm.keys() & em.keys())
    if bt[bm[nk]]["blocks"][-1]["blockHeader"]["hash"]
    != be[em[nk]]["lastblockhash"]
]

print(
    f"{len(d)}/{len(bm.keys() & em.keys())} tests have mismatched block hashes"
)

for nk, bh, eh in d:
    print(f"  {nk.split('X-')[-1][:70]}")

then apply the fix from this PR and re-run the script above. You will see that pre-fix shows 14/26 tests have mismatched block hashes, after-fix 0/26 mismatches.

Cute Animal Picture

Cute Animal Picture

@felix314159

Copy link
Copy Markdown
Contributor Author

We should also add a script similar to above to CI so that this can not again happen undetected

@codecov

codecov Bot commented Mar 19, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (eips/amsterdam/eip-8037@ae172c3). Learn more about missing BASE report.

Additional details and impacted files
@@                    Coverage Diff                     @@
##             eips/amsterdam/eip-8037    #2532   +/-   ##
==========================================================
  Coverage                           ?   83.63%           
==========================================================
  Files                              ?      642           
  Lines                              ?    39703           
  Branches                           ?     4057           
==========================================================
  Hits                               ?    33207           
  Misses                             ?     5797           
  Partials                           ?      699           
Flag Coverage Δ
unittests 83.63% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@fselmo

fselmo commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

Hey @felix314159, thanks for sniffing this out... this is a really important finding. I agree with you, we need to have some guardrails here so these things don't happen again.

I think we can simplify the approach a bit but I'm happy to discuss it. Instead of stashing the base gas limit to make the mutation idempotent, we could pass fork into transactions() and compute the state gas adjustment there as a pure function. That way the shared pytest.param object(s) are never mutated at all... so the problem can't really happen regardless of how many format runs share the same object.

I pushed two commits here with this approach that I'm happy to drop after discussion if this isn't agreed on. The second commit snapshots repr (since copy / pickle isn't possible on certain objects) of params before a test and compares it after to issue a warning for tests that currently mutate params in place. It flags some tests but these currently don't accumulate (+=), they mutate via assignment (self.var = val) so in this sense they are OK. But imo, we should create an issue to fix any tests that trigger this warning to return a new value rather than mutate and then we can flip this warning to a raise or something like this.

What do you think?

Again, happy to drop these and get your idempotent fix in but I think this might help steer us to the right framework changes.

cc @danceratopz

@fselmo

fselmo commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

But imo, we should create an issue to fix any tests that trigger this warning to return a new value rather than mutate and then we can flip this warning to a raise or something like this.

I asked Claude to summarize the goal for the follow-up issue:

Context

PR #2532 fixed a bug where WithdrawalRequestContract.tx_gas_limit was double-accumulated across fixture 
format runs because pytest parameter objects are shared by reference. The fix moved the gas computation 
into transactions() as a pure function. However, the underlying anti-pattern — fixtures mutating shared 
pytest.param objects — remains in all three EL request test modules via update_pre().

A new repr-based snapshot hook in filler.py (pytest_runtest_setup/pytest_runtest_teardown) now warns 
when parametrize values are mutated during a test run. Currently, every EL request test triggers this 
warning due to update_pre() setting sender_account, contract_address, and entry_address on shared 
helper instances.

These mutations happen to be idempotent today (assignments, not accumulations), so they don't cause 
incorrect fixtures. But the pattern is fragile — any future += or conditional mutation would silently produce 
divergent block hashes across blockchain_test vs blockchain_engine_test, exactly like the bug #2532 fixed.

- Goal

  - Refactor the EL request test helpers so that update_pre() returns values rather than mutating self, and 
    downstream methods (transactions(), valid_requests()) accept those values as parameters.
  - Once all warnings are eliminated, flip the filler hook from warning to hard failure.

…ss in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.
@fselmo fselmo force-pushed the eips/amsterdam/eip-8037 branch from b38095f to 96d941c Compare March 19, 2026 21:20
@felix314159

Copy link
Copy Markdown
Contributor Author

thanks for your commits, making it fork-aware and using a pure function is cleaner.

however, i now worry about what happens if no fork is passed, because that still seems to be optional: e.g if we look at this line we can see WithdrawalRequestInteractionBase.transactions() being called without a fork explicitly passed to it (still optional today..). so it seems that correctness now depends on callers remembering to pass fork..

i fully support us removing mutations for el helpers in update_pre() and just have it return a value. also +1 on the start with warnings and then pivot to hard failure.

however even with the proposed warning/failure hook, i still would like to also have an explicit hash comparison script (compares certain hashes in blockchain_test vs blockchain_test_engine fixtures) to run. it should be simple to write and it would detect even cases where no fork is passed, that seems like a footgun that now exists (separate problem). the proposed warning hook wouldn't catch a missed fork because nothing mutates.

@felix314159 felix314159 merged commit e4ec313 into ethereum:eips/amsterdam/eip-8037 Mar 20, 2026
17 checks passed
@fselmo

fselmo commented Mar 20, 2026

Copy link
Copy Markdown
Contributor

Ah bummer I meant to make one last comment that we should push this to forks/amsterdam :). Looks like it was sitting as "pending" on a new review or something. Anyhow, I think we should get this in the framework not the EIP branch. But I guess it will get there eventually and the only issue was within EIP-8037... so it's ok I think 👍🏼

spencer-tb pushed a commit to spencer-tb/execution-specs that referenced this pull request Apr 1, 2026
… format runs in withdrawal request contract tests (ethereum#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
marioevz pushed a commit that referenced this pull request Apr 9, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
marioevz pushed a commit that referenced this pull request Apr 10, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
felix314159 added a commit to felix314159/execution-specs that referenced this pull request Apr 14, 2026
… format runs in withdrawal request contract tests (ethereum#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
felix314159 added a commit that referenced this pull request Apr 16, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
spencer-tb pushed a commit that referenced this pull request Apr 17, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
marioevz pushed a commit that referenced this pull request Apr 20, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
spencer-tb pushed a commit that referenced this pull request Apr 20, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
spencer-tb pushed a commit to spencer-tb/execution-specs that referenced this pull request Apr 21, 2026
… format runs in withdrawal request contract tests (ethereum#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
spencer-tb pushed a commit to spencer-tb/execution-specs that referenced this pull request Apr 21, 2026
… format runs in withdrawal request contract tests (ethereum#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
spencer-tb pushed a commit that referenced this pull request Apr 21, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
fselmo added a commit that referenced this pull request May 5, 2026
… format runs in withdrawal request contract tests (#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
spencer-tb pushed a commit to spencer-tb/execution-specs that referenced this pull request May 22, 2026
… format runs in withdrawal request contract tests (ethereum#2532)

* fix(tests): prevent tx_gas_limit double-accumulation across fixture format runs in withdrawal request contract tests

* fix: mypy

* fix: ruff

* fix: ruff

* chore(tests): refactor fix to fork-aware transactions to prevent mutation

* chore(test): add a warning to all tests that could mutate vars; address in later PR

Add a lightweight repr-based snapshot hook to the filler plugin that warns whenever
any ``pytest.param`` value is mutated during a test run.

A subsequent PR could address this by returning values instead of mutating, then
flipping the hook to a hard failure.

---------

Co-authored-by: fselmo <fselmo2@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants