fix: warmup transactions bypass SystemTransactionProcessor due to Warmup flag #10628

Merged
smartprogrammer93 merged 5 commits into master from fix/warmup-system-tx-routing
Feb 25, 2026

Conversation

@kamilchodola

Contributor

Changes

Fixes a performance regression introduced by #10267 where the pre-warmer executes transactions through the full heavy Execute() path instead of the lightweight SystemTransactionProcessor path.

Root Cause

PR #10267 added a Warmup = 8 flag to ExecutionOptions and changed the warmup call from:

opts = SkipValidation           // value 4

to:

opts = Warmup | SkipValidation  // value 12

The routing logic in ExecuteCore used an equality check:

if (tx.IsSystem() || opts == ExecutionOptions.SkipValidation)  // 12 != 4 → false

This broke the routing: warmup transactions no longer matched and fell through to the full Execute() path, which performs:

  • BuyGas — reads/writes sender balance for gas reservation
  • IncrementNonce — increments sender nonce in state
  • PayFees — distributes gas fees to beneficiary
  • PayRefund — refunds unused gas to sender
  • PayValue — transfers transaction value

The SystemTransactionProcessor overrides all of these as no-ops, making it significantly cheaper for warmup/system transactions.
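
The override mechanism described above can be sketched with a toy model. This is only an illustration under stated assumptions: `Account`, `ToyProcessor`, and `ToySystemProcessor` are hypothetical stand-ins, not the real Nethermind types, and only two of the five steps are modeled.

```csharp
using System;

// The system-style processor leaves the sender untouched.
var sender = new Account { Balance = 100, Nonce = 0 };
new ToySystemProcessor().Execute(sender, gasCost: 21);
Console.WriteLine($"{sender.Balance} {sender.Nonce}"); // 100 0

// The full processor reserves gas and bumps the nonce.
new ToyProcessor().Execute(sender, gasCost: 21);
Console.WriteLine($"{sender.Balance} {sender.Nonce}"); // 79 1

// Hypothetical account state; the real state model is far richer.
public sealed class Account
{
    public long Balance;
    public long Nonce;
}

public class ToyProcessor
{
    // Shared skeleton: accounting steps wrap the (omitted) EVM call.
    public void Execute(Account sender, long gasCost)
    {
        BuyGas(sender, gasCost);
        IncrementNonce(sender);
        // ... EVM execution would run here in both variants ...
    }

    protected virtual void BuyGas(Account sender, long gasCost) => sender.Balance -= gasCost;
    protected virtual void IncrementNonce(Account sender) => sender.Nonce++;
}

public sealed class ToySystemProcessor : ToyProcessor
{
    // Same skeleton, but the accounting steps are overridden as no-ops.
    protected override void BuyGas(Account sender, long gasCost) { }
    protected override void IncrementNonce(Account sender) { }
}
```

Because the skeleton is shared, choosing the processor is the only decision the caller makes, which is why a broken routing check silently sends warmup work down the expensive path.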

The same pattern existed in SystemTransactionProcessor.Execute where opts != ExecutionOptions.SkipValidation was used to decide whether to add the OriginalValidate flag. With the Warmup bit set, this comparison also broke, causing incorrect flag propagation.

Fix

Mask out the Warmup flag before comparing ExecutionOptions:

TransactionProcessor.cs (ExecuteCore):

// Before (broken):
if (tx.IsSystem() || opts == ExecutionOptions.SkipValidation)
// After (fixed):
if (tx.IsSystem() || (opts & ~ExecutionOptions.Warmup) == ExecutionOptions.SkipValidation)

SystemTransactionProcessor.cs (Execute):

// Before (broken):
return base.Execute(tx, tracer, (opts != ExecutionOptions.SkipValidation && ...)
// After (fixed):
ExecutionOptions coreOpts = opts & ~ExecutionOptions.Warmup;
return base.Execute(tx, tracer, (coreOpts != ExecutionOptions.SkipValidation && ...)
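
A minimal, self-contained sketch of the routing predicate before and after the change. The `Opts` enum is a stand-in that models only the two values quoted above (`SkipValidation = 4`, `Warmup = 8`); the `tx.IsSystem()` term is left out, and the `Routing` helper names are hypothetical.

```csharp
using System;

// Before: exact equality. Warmup | SkipValidation (12) does not equal SkipValidation (4).
Console.WriteLine(Routing.RoutesToSystemBefore(Opts.Warmup | Opts.SkipValidation)); // False
// After: the Warmup bit is cleared first, so 12 & ~8 == 4.
Console.WriteLine(Routing.RoutesToSystemAfter(Opts.Warmup | Opts.SkipValidation));  // True
// Plain SkipValidation keeps routing the same way as before.
Console.WriteLine(Routing.RoutesToSystemAfter(Opts.SkipValidation));                // True

// Stand-in for ExecutionOptions, reduced to the values relevant here.
[Flags]
public enum Opts
{
    None = 0,
    SkipValidation = 4,
    Warmup = 8,
}

public static class Routing
{
    // Original check: breaks as soon as any extra bit is set.
    public static bool RoutesToSystemBefore(Opts opts) => opts == Opts.SkipValidation;

    // Fixed check: ignore the Warmup bit, then compare exactly.
    public static bool RoutesToSystemAfter(Opts opts) =>
        (opts & ~Opts.Warmup) == Opts.SkipValidation;
}
```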

Impact

During block processing, the pre-warmer runs transactions ahead of the main thread to warm state caches. Without this fix, every warmup transaction performs unnecessary state mutations (gas accounting, nonce increment, fee distribution) that:

  1. Waste CPU on operations that will be discarded
  2. Create unnecessary state writes in the warmer thread
  3. Defeat the purpose of using SystemTransactionProcessor for lightweight execution

On benchmarks running 10k+ sequential blocks, this compounds into a measurable performance regression.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Testing

Added 3 regression tests to TransactionProcessorTests:

  • Warmup_does_not_update_SpentGas — verifies warmup doesn't write to tx.SpentGas
  • Warmup_does_not_modify_sender_nonce — verifies warmup doesn't increment sender nonce
  • Warmup_does_not_deduct_sender_balance — verifies warmup doesn't deduct gas/value from sender

All 6 test runs pass (3 tests × 2 EIP-155 fixture variants).

…mup flag

PR #10267 added a `Warmup = 8` flag to ExecutionOptions, changing warmup
opts from `SkipValidation` (4) to `Warmup | SkipValidation` (12). This
broke the equality check `opts == ExecutionOptions.SkipValidation` in
ExecuteCore, which gates routing to the lightweight
SystemTransactionProcessor. As a result, warmup transactions fell through
to the full Execute() path — performing BuyGas, IncrementNonce, PayFees,
PayRefund and other state-mutating operations unnecessarily.

The same pattern existed in SystemTransactionProcessor.Execute where
opts equality checks determined whether to add OriginalValidate flag.

Fix: mask out the Warmup flag before comparing ExecutionOptions in both
ExecuteCore and SystemTransactionProcessor.Execute.
@github-actions

Contributor

EXPB Benchmark Comparison

Scenario: nethermind-halfpath-superblocks-fix-warmup-system-tx-routing-delay0s

| Metric | PR | Master (cached) | Delta PR vs Master |
|---|---|---|---|
| AVG (ms) | 1525.728900 | 1512.445500 | +0.88% |
| MEDIAN (ms) | 1162.120000 | 1158.805000 | +0.29% |
| P90 (ms) | 2600.34 | 2693.48 | -3.46% |
| P95 (ms) | 2944.91 | 2914.44 | +1.05% |
| P99 (ms) | 4302.78 | 4190.66 | +2.68% |
| MIN (ms) | 766.13 | 756.08 | +1.33% |
| MAX (ms) | 5443.59 | 4208.34 | +29.35% |

@kamilchodola kamilchodola marked this pull request as ready for review February 24, 2026 19:03
kamilchodola added a commit that referenced this pull request Feb 24, 2026
…mup flag

Cherry-pick of fix/warmup-system-tx-routing (#10628) adapted for perf/flat.
Mask out the Warmup flag before comparing ExecutionOptions in ExecuteCore
(both OnFlag and OffFlag branches) and SystemTransactionProcessor.Execute.

@LukaszRozmej LukaszRozmej left a comment

Member

Shouldn't we check ExecutionOptions.SkipValidation as a flag? (Can use HasFlag or & -whatever)

  {
      if (Logger.IsTrace) Logger.Trace($"Executing tx {tx.Hash}");
-     if (tx.IsSystem() || opts == ExecutionOptions.SkipValidation)
+     if (tx.IsSystem() || (opts & ~ExecutionOptions.Warmup) == ExecutionOptions.SkipValidation)

Member

Suggested change
- if (tx.IsSystem() || (opts & ~ExecutionOptions.Warmup) == ExecutionOptions.SkipValidation)
+ if (tx.IsSystem() || (opts & ExecutionOptions.SkipValidation) == ExecutionOptions.SkipValidation)

Contributor

according to claude:

CommitAndRestore = Commit | Restore | SkipValidation = 7. With HasFlag(SkipValidation): (7 & 4) == 4 → TRUE → routes CallAndRestore to SystemTransactionProcessor.

which is not a desired outcome. same with

SkipValidationAndCommit = 5. (5 & 4) == 4 → TRUE → routes Trace() to SystemTransactionProcessor. Also a behavioral change.
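
The difference between the two candidate predicates can be checked with a small sketch. The `Opts` enum is a stand-in; `Commit = 1` and `Restore = 2` are assumptions inferred from the composite values 5 and 7 quoted above, and the `Predicates` helper is hypothetical.

```csharp
using System;

// Compare both predicates across the option values discussed in this thread.
foreach (Opts o in new[]
{
    Opts.SkipValidation,
    Opts.SkipValidationAndCommit,
    Opts.CommitAndRestore,
    Opts.Warmup | Opts.SkipValidation,
})
{
    Console.WriteLine($"{(int)o}: masked={Predicates.Masked(o)}, flagTest={Predicates.FlagTest(o)}");
}

[Flags]
public enum Opts
{
    None = 0,
    Commit = 1,           // assumed value
    Restore = 2,          // assumed value
    SkipValidation = 4,
    Warmup = 8,
    SkipValidationAndCommit = Commit | SkipValidation,   // 5
    CommitAndRestore = Commit | Restore | SkipValidation // 7
}

public static class Predicates
{
    // Merged approach: drop the Warmup bit, then require exactly SkipValidation.
    public static bool Masked(Opts o) => (o & ~Opts.Warmup) == Opts.SkipValidation;

    // Suggested alternative: true whenever the SkipValidation bit is set.
    public static bool FlagTest(Opts o) => (o & Opts.SkipValidation) == Opts.SkipValidation;
}
```

Under these assumptions the two predicates agree for 4 and 12 but differ for 5 and 7, where only the flag test is true. That is the behavioral change the comment above warns about.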

Comment thread src/Nethermind/Nethermind.Evm/TransactionProcessing/SystemTransactionProcessor.cs Outdated
smartprogrammer93 and others added 2 commits February 25, 2026 02:36
…p in base Execute

Instead of routing warmup through SystemTransactionProcessor (which wraps
the spec in SystemTransactionReleaseSpec with 119 virtual dispatch properties),
handle warmup directly in the base Execute method by checking the Warmup flag
and skipping state-mutating operations. This avoids the ReleaseSpecDecorator
allocation overhead while achieving the same semantic: warmup executes the EVM
for cache warming without modifying gas, nonce, or balance state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use HasFlag-style check instead of exact equality for OriginalValidate
guard, making it resilient to future ExecutionOptions flag combinations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@smartprogrammer93

Contributor

Benchmarks Against master

| Method | Master | Fix | Change |
|---|---|---|---|
| Warmup_SimpleTx | 3,239 ns | 1,009 ns | -69% (faster) |
| Warmup_AccessListTx | 3,435 ns | 1,154 ns | -66% (faster) |
| Warmup_Eip1559Tx | 3,128 ns | 1,043 ns | -67% (faster) |
| Warmup_ContractCall | 3,265 ns | 994 ns | -70% (faster) |
| CallAndRestore_SimpleTx | 3,186 ns | 3,000 ns | -6% (noise) |
| CallAndRestore_ContractCall | 3,390 ns | 3,166 ns | -7% (noise) |

Allocations

| Method | Master | Fix | Change |
|---|---|---|---|
| Warmup_SimpleTx | 4.95 KB | 1.73 KB | -65% |
| Warmup_AccessListTx | 5.04 KB | 1.81 KB | -64% |
| Warmup_Eip1559Tx | 4.95 KB | 1.74 KB | -65% |
| Warmup_ContractCall | 4.59 KB | 1.48 KB | -68% |

Detailed Fix Stats

| Method | Mean | Error | StdDev | Min | Max | Median | P90 | P95 |
|---|---|---|---|---|---|---|---|---|
| Warmup_SimpleTx | 1,009 ns | 66.6 ns | 99.7 ns | 892 ns | 1,204 ns | 965 ns | 1,152 ns | 1,185 ns |
| Warmup_AccessListTx | 1,154 ns | 31.1 ns | 44.6 ns | 1,070 ns | 1,251 ns | 1,158 ns | 1,208 ns | 1,226 ns |
| Warmup_Eip1559Tx | 1,043 ns | 90.6 ns | 127.0 ns | 925 ns | 1,296 ns | 972 ns | 1,241 ns | 1,256 ns |
| Warmup_ContractCall | 994 ns | 73.3 ns | 105.2 ns | 884 ns | 1,231 ns | 989 ns | 1,132 ns | 1,196 ns |

@smartprogrammer93

Contributor

BlockProcessingBenchmark Comparison: master vs fix/warmup-system-tx-routing


Master (baseline — no warmup optimization)

| Method | Mean | StdDev | Median | P90 | P95 | Allocated |
|---|---|---|---|---|---|---|
| EmptyBlock | 187.3 us | 22.91 us | 182.0 us | 216.0 us | 223.3 us | 13.47 KB |
| SingleTransfer | 305.2 us | 31.94 us | 309.5 us | 347.2 us | 353.8 us | 25.02 KB |
| Transfers_50 | 1,528.1 us | 109.35 us | 1,524.1 us | 1,653.6 us | 1,703.1 us | 2603.68 KB |
| Transfers_200 | 3,881.6 us | 472.07 us | 3,638.4 us | 4,661.9 us | 4,758.4 us | 3005.4 KB |
| Eip1559_200 | 3,418.1 us | 504.81 us | 3,495.2 us | 4,355.0 us | 4,454.8 us | 2995.27 KB |
| AccessList_50 | 874.6 us | 60.21 us | 899.4 us | 943.1 us | 951.3 us | 1651.36 KB |
| ContractDeploy_10 | 512.2 us | 42.26 us | 499.7 us | 575.9 us | 603.0 us | 1549.18 KB |
| ContractCall_200 | 1,542.7 us | 86.28 us | 1,533.5 us | 1,645.8 us | 1,678.0 us | 2563.79 KB |
| MixedBlock | 976.9 us | 49.93 us | 971.6 us | 1,057.6 us | 1,060.9 us | 2921.45 KB |

Warmup PR (fix/warmup-system-tx-routing)

| Method | Mean | StdDev | Median | P90 | P95 | Allocated |
|---|---|---|---|---|---|---|
| EmptyBlock | 191.8 us | 32.56 us | 180.6 us | 225.2 us | 248.0 us | 13.94 KB |
| SingleTransfer | 270.6 us | 23.92 us | 267.8 us | 304.9 us | 312.1 us | 26.2 KB |
| Transfers_50 | 1,156.6 us | 243.89 us | 1,345.9 us | 1,402.6 us | 1,437.8 us | 2610.78 KB |
| Transfers_200 | 2,771.5 us | 203.71 us | 2,699.4 us | 3,012.1 us | 3,035.6 us | 2927.77 KB |
| Eip1559_200 | 2,585.7 us | 190.18 us | 2,571.5 us | 2,806.8 us | 2,813.8 us | 2927.09 KB |
| AccessList_50 | 694.9 us | 100.70 us | 708.3 us | 811.0 us | 815.1 us | 2114.2 KB |
| ContractDeploy_10 | 393.3 us | 75.19 us | 392.2 us | 470.3 us | 502.0 us | 1058.13 KB |
| ContractCall_200 | 922.5 us | 42.51 us | 907.9 us | 981.5 us | 1,000.6 us | 1773.63 KB |
| MixedBlock | 641.6 us | 36.79 us | 639.5 us | 683.2 us | 700.5 us | 2234.95 KB |

Delta (Warmup PR vs master)

| Method | Master Mean | Warmup Mean | Delta | Change | Alloc Delta |
|---|---|---|---|---|---|
| EmptyBlock | 187.3 us | 191.8 us | +4.5 us | +2.4% | +0.47 KB |
| SingleTransfer | 305.2 us | 270.6 us | -34.6 us | -11.3% | +1.18 KB |
| Transfers_50 | 1,528.1 us | 1,156.6 us | -371.5 us | -24.3% | +7.1 KB |
| Transfers_200 | 3,881.6 us | 2,771.5 us | -1,110.1 us | -28.6% | -77.6 KB |
| Eip1559_200 | 3,418.1 us | 2,585.7 us | -832.4 us | -24.4% | -68.2 KB |
| AccessList_50 | 874.6 us | 694.9 us | -179.7 us | -20.5% | +462.8 KB |
| ContractDeploy_10 | 512.2 us | 393.3 us | -118.9 us | -23.2% | -491.1 KB |
| ContractCall_200 | 1,542.7 us | 922.5 us | -620.2 us | -40.2% | -790.2 KB |
| MixedBlock | 976.9 us | 641.6 us | -335.3 us | -34.3% | -686.5 KB |

Key Takeaways

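  • EmptyBlock / SingleTransfer: fewer than 3 txs, so pre-warming does not trigger. EmptyBlock (+2.4%) is within noise; the SingleTransfer delta (-11.3%, -34.6 us) is close to one standard deviation (31.94 us on master).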
  • Blocks with 3+ transactions show 20–40% improvement in mean processing time on the warmup PR.
    • The fix skips BuyGas, IncrementNonce, PayFees, PayValue, and PayRefund during the pre-warmer’s Warmup() calls, making the concurrent warmup pass substantially cheaper.
  • ContractCall_200 shows the largest improvement (-40.2%).
    • Contract calls execute the EVM during warmup, so skipping gas/nonce/fee accounting has a larger relative impact.
  • Allocations drop in five of the nine scenarios (up to -790.2 KB for ContractCall_200), because the warmup path no longer runs the full gas accounting and fee payment code paths. AccessList_50 is the main exception at +462.8 KB.

@kamilchodola

Contributor Author

Redoing tests. It seems the new commits introduced a regression, which is weird.

@kamilchodola

Contributor Author
[image] It seems the latest commit (last 3 rows) brings performance back to the master level.

@kamilchodola

Contributor Author

Do not merge. New findings have come up and I am working on a proper adjustment.

…actionProcessor

Reverts d9ca918 which moved warmup handling into the base Execute
method with scattered HasFlag(Warmup) checks. BlockProcessingBenchmark
showed 30-50% regressions on transfer-heavy scenarios (Transfers_200
+42%, Eip1559_200 +49%) because the checks polluted the hot path and
the prewarmer took the heavier base Execute path instead of the lighter
SystemTransactionProcessor overrides.

SystemTransactionProcessor already handles all state-mutating skips
(BuyGas, IncrementNonce, PayFees, PayValue, PayRefund) via sealed
virtual overrides, keeping the base Execute path clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@smartprogrammer93 smartprogrammer93 merged commit f067c88 into master Feb 25, 2026
111 checks passed
@smartprogrammer93 smartprogrammer93 deleted the fix/warmup-system-tx-routing branch February 25, 2026 10:43
kamilchodola added a commit that referenced this pull request Feb 25, 2026
…mup flag (#10628)

* fix: warmup transactions bypass SystemTransactionProcessor due to Warmup flag

PR #10267 added a `Warmup = 8` flag to ExecutionOptions, changing warmup
opts from `SkipValidation` (4) to `Warmup | SkipValidation` (12). This
broke the equality check `opts == ExecutionOptions.SkipValidation` in
ExecuteCore, which gates routing to the lightweight
SystemTransactionProcessor. As a result, warmup transactions fell through
to the full Execute() path — performing BuyGas, IncrementNonce, PayFees,
PayRefund and other state-mutating operations unnecessarily.

The same pattern existed in SystemTransactionProcessor.Execute where
opts equality checks determined whether to add OriginalValidate flag.

Fix: mask out the Warmup flag before comparing ExecutionOptions in both
ExecuteCore and SystemTransactionProcessor.Execute.

* perf: skip BuyGas/IncrementNonce/PayFees/PayValue/PayRefund for warmup in base Execute

Instead of routing warmup through SystemTransactionProcessor (which wraps
the spec in SystemTransactionReleaseSpec with 119 virtual dispatch properties),
handle warmup directly in the base Execute method by checking the Warmup flag
and skipping state-mutating operations. This avoids the ReleaseSpecDecorator
allocation overhead while achieving the same semantic: warmup executes the EVM
for cache warming without modifying gas, nonce, or balance state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use bit-flag check for SkipValidation in SystemTransactionProcessor

Use HasFlag-style check instead of exact equality for OriginalValidate
guard, making it resilient to future ExecutionOptions flag combinations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* revert: remove warmup checks from base Execute, route via SystemTransactionProcessor

Reverts d9ca918 which moved warmup handling into the base Execute
method with scattered HasFlag(Warmup) checks. BlockProcessingBenchmark
showed 30-50% regressions on transfer-heavy scenarios (Transfers_200
+42%, Eip1559_200 +49%) because the checks polluted the hot path and
the prewarmer took the heavier base Execute path instead of the lighter
SystemTransactionProcessor overrides.

SystemTransactionProcessor already handles all state-mutating skips
(BuyGas, IncrementNonce, PayFees, PayValue, PayRefund) via sealed
virtual overrides, keeping the base Execute path clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: smartprogrammer93 <smartprogrammer@windowslive.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Ahmad Bitar <33181301+smartprogrammer93@users.noreply.github.com>
@kamilchodola

Contributor Author

@claude This PR brings a significant performance improvement, but it also increased the "Get" RocksDb method OwnTime per dotTrace by 25k ms, which makes it a less than optimal solution. Any ideas how to make it even better?

@claude

claude Bot commented Apr 21, 2026

Contributor

Claude finished @kamilchodola's task in 5m 24s


Analysis: Why RocksDB "Get" OwnTime Increased — and How to Reduce It Further

  • Read the changed files and understand the warmup mechanism
  • Investigate RocksDB "Get" increase root cause
  • Suggest optimizations

Root Cause of the Extra RocksDB Reads

The increase in RocksDB "Get" OwnTime is expected and is actually a sign the fix is working correctly. Here's why:

Before the fix (broken path through full Execute()):

The BuyGas and IncrementNonce calls were writing the sender's account to the WorldState dirty cache (balance subtracted, nonce incremented). More critically, for EIP-1559 transactions with non-zero MaxFeePerGas, ShouldValidateGas() returns true even with SkipValidation set — meaning the balance check in BuyGas could fail and abort the transaction before EVM ran. Those aborted transactions never touched the storage-heavy EVM path, so they produced fewer RocksDB reads.

After the fix (correct path through SystemTransactionProcessor):

BuyGas is a no-op → no early abort → all transactions now run EVM fully → more contract storage slots get read → more RocksDB misses in the warmup thread. These misses populate PreBlockCaches, which is the entire point of warming. The main processing thread then hits the cache instead of RocksDB, delivering the 20–40% speedup.

The 25k ms extra cost in warmup-thread RocksDB reads pays for useful work that is expected to save more than that on the main thread. The EXPB numbers are consistent with this: AVG, MEDIAN, and P95 are within about 1% of master and P90 is 3.46% lower despite the extra warmup-side reads, though MAX is 29.35% higher.


Ideas for Further Reduction

1. Batch-load statically-known addresses before EVM execution (highest impact)

In WarmupSingleTransaction(), the sender is already checked via AccountExists. But the recipient and block beneficiary (coinbase) are not pre-loaded before EVM starts. Every time EVM hits a cold account, it goes: dirty-cache miss → PreBlockCaches miss → RocksDB read. Converting those to a single batch load before EVM avoids sequential round-trips:

// In WarmupSingleTransaction, before the Warmup() call:
if (tx.To is not null) worldState.WarmUp(tx.To); // recipient (null for contract creation)
worldState.WarmUp(header.Beneficiary);           // coinbase is accessed by PayFees on main thread

The coinbase in particular is accessed by every transaction on the main thread (in PayFees), but SystemTransactionProcessor.PayFees is a no-op so it's never warmed from the transaction warmup path. AddressWarmer may handle this, but worth verifying.

2. Use RocksDB MultiGet for access lists

worldState.WarmUp(tx.AccessList) (line 275 in BlockCachePreWarmer.cs) currently reads addresses/slots sequentially. For blocks with large EIP-2930 access lists or EIP-4337 user ops, this is N individual RocksDB Get calls. Routing these through a MultiGet would batch the I/O into a single call, reducing syscall overhead significantly on high-tx-count blocks.

3. Thundering herd on SeqlockCache for popular contracts

When multiple warmup threads process transactions that all call the same popular contract (e.g., USDC, a DEX), they can all get a cache miss on the same storage slots simultaneously and all go to RocksDB for the same key. The SeqlockCache is optimistic (no locking on read), so this race isn't prevented.

A simple "in-flight" set (or first-writer-wins pattern) on PreBlockCaches would let the first thread load the slot while others skip the duplicate DB read.
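
One common way to get first-writer-wins behavior is a `ConcurrentDictionary` of `Lazy<T>` values. The sketch below is generic and makes no claim about how `SeqlockCache` or `PreBlockCaches` are built; `DedupingCache` and the loader delegate are hypothetical names, with the delegate standing in for a RocksDB read.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

int dbReads = 0;
var cache = new DedupingCache<string, int>(key =>
{
    Interlocked.Increment(ref dbReads); // stands in for one RocksDB Get
    Thread.Sleep(10);                   // simulate I/O latency
    return key.Length;
});

// Eight concurrent callers request the same slot; the loader runs only once.
Parallel.For(0, 8, _ => { cache.Get("usdc:slot0"); });
Console.WriteLine(dbReads); // 1

public sealed class DedupingCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, Lazy<TValue>> _entries = new();
    private readonly Func<TKey, TValue> _load;

    public DedupingCache(Func<TKey, TValue> load) => _load = load;

    // GetOrAdd may construct several Lazy wrappers under contention, but only one
    // is stored, and ExecutionAndPublication runs its loader at most once.
    public TValue Get(TKey key) =>
        _entries.GetOrAdd(
            key,
            k => new Lazy<TValue>(() => _load(k), LazyThreadSafetyMode.ExecutionAndPublication)).Value;
}
```

The trade-off is that late arrivals block until the first load completes instead of skipping the read entirely, and every entry carries a `Lazy` allocation, so this would need measuring before it goes anywhere near a hot path.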

4. Skip EVM for pure ETH transfers

For transactions where tx.To has no code (!WorldState.HasCode(tx.To) && !tx.IsContractCreation), EVM execution only touches sender and recipient accounts — both of which are already known before EVM runs. The warmup could skip the full Warmup() call for these and just do two targeted WarmUp(address) calls, saving EVM setup overhead and the associated reads.

5. Speculative slot pre-loading from recent block history

For high-frequency contracts (DEXs, stablecoins), the storage slots accessed are largely stable across blocks. A sliding-window bloom filter tracking which (contract, slot) pairs were accessed in the last N blocks could be used to pre-load them speculatively before transaction warmup starts — turning cold misses into pre-warmed hits without needing EVM execution to discover them.


Summary

The 25k ms RocksDB increase is the correct trade-off: the warmup is doing more useful work than before. The highest-leverage quick wins are #1 (batch pre-load recipient + coinbase before EVM) and #2 (MultiGet for access lists), both of which are small, focused changes to WarmupSingleTransaction() and WorldState.WarmUp(AccessList) respectively.
