consensus: standardize slow block JSON output for cross-client metrics #11400
Implements execution metrics following the cross-client specification: https://github.com/ethereum/execution-specs/blob/main/docs/execution-metrics-spec.md
- Add slow block JSON logging (threshold: 1000ms)
- Include timing, throughput, and EVM operation counts
- Leverage existing SLOAD/SSTORE/CALL metrics infrastructure
- Output format matches cross-client standardization spec
Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.
This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ
The standardized metrics enabled data-driven analysis like the EIP-7907 research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850
JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, code, code_bytes
- state_writes: accounts, storage_slots
- cache: account/storage/code hits, misses, hit_rate
- evm: sload, sstore, calls, creates
Also adds ThreadLocal accessors to Db.Metrics for per-block cache statistics tracking without global synchronization overhead.
Convert timing calculations from long to double to preserve sub-millisecond precision in slow block JSON output. Changes: - ProcessingStats: timing variables changed to double with float division - Math.Round(..., 3) for consistent 3 decimal place output - ZeroContentionCounter: add Increment(long) overload for type safety
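The precision change above can be illustrated with a minimal sketch. It is written in Python rather than the PR's C#, and the function names are hypothetical; it only contrasts integer division (which drops the sub-millisecond part) with floating-point division rounded to 3 decimal places.

```python
def micros_to_ms_truncated(micros: int) -> int:
    """Integer division: everything below 1 ms is lost."""
    return micros // 1000


def micros_to_ms(micros: int) -> float:
    """Floating-point division, rounded to 3 decimal places for stable output."""
    return round(micros / 1000, 3)
```

With this sketch a 999 microsecond phase is reported as 0.999 ms instead of 0.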
Add tracking for EIP-7702 delegation set/cleared operations as part of the cross-client execution metrics standardization effort. New metrics in Nethermind.Evm.Metrics: - Eip7702DelegationsSet: Number of EIP-7702 delegations set - Eip7702DelegationsCleared: Number of EIP-7702 delegations cleared Both metrics include thread-local variants for use in slow block logging. The slow block JSON output now includes: - state_writes.eip7702_delegations_set - state_writes.eip7702_delegations_cleared These fields will be 0 for pre-Pectra blocks (per spec).
ContractsAnalysed tracks jump destination analysis, not cache misses. CodeReads is incremented when code is loaded from DB (cache miss).
Instead of duplicating total_ms, calculate: execution_ms = total_ms - state_read_ms - state_hash_ms - commit_ms Falls back to total_ms if timing metrics not fully captured.
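A minimal sketch of the derivation described in this commit, in Python rather than the PR's C#. The function name is hypothetical, and representing "not fully captured" as `None` is an assumption made for illustration only.

```python
from typing import Optional


def derive_execution_ms(
    total_ms: float,
    state_read_ms: Optional[float],
    state_hash_ms: Optional[float],
    commit_ms: Optional[float],
) -> float:
    """execution_ms = total_ms - state_read_ms - state_hash_ms - commit_ms.

    Falls back to total_ms when any component was not captured
    (modelled here as None, which is an assumption of this sketch).
    """
    parts = (state_read_ms, state_hash_ms, commit_ms)
    if any(p is None for p in parts):
        return total_ms
    return total_ms - sum(parts)
```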
- Add IncrementCodeWrites/IncrementCodeBytesWritten in InsertCode - Add IncrementAccountWrites in SetState - Add IncrementStorageWrites in Set
Move IncrementAccountReads inside GetState to only count trie/DB reads, matching storage read semantics. Previously counted all reads including in-memory cache hits.
…metrics Pipe timing data to EvmMetrics for state_read_ms, state_hash_ms, and commit_ms: - WorldStateMetricsDecorator: pipe existing timing to EvmMetrics for RecalculateStateRoot, Commit (both overloads), and CommitTree - StateProvider.GetState: add timing around DB reads (not cache hits) - PersistentStorageProvider.LoadFromTree: add timing for storage reads Uses Stopwatch.GetTimestamp()/GetElapsedTime() with TimeSpan.Ticks (100ns precision) for minimal overhead (~10-20ns per measurement).
…tandardization Add new metrics for tracking account and storage deletions during EVM execution: - AccountDeleted: incremented when SetState is called with null account - StorageDeleted: incremented when SSTORE sets a slot to zero - Both metrics flow through ProcessingStats to slow block JSON output
Wire up existing but unused EIP-7702 delegation metric increment methods: - IncrementEip7702DelegationsSet() when setting delegation (non-zero address) - IncrementEip7702DelegationsCleared() when clearing delegation (zero address)
Add comprehensive integration tests for all cross-client execution metrics: - EIP-7702 delegation tests (set, clear, multiple) - Account metrics tests (reads, writes, deleted) - Storage metrics tests (reads, writes, deleted) - Code metrics tests (loaded, updated)
Allow customization of the slow block logging threshold via constructor parameter. Default remains 1000ms for backward compatibility. - Add slowBlockThresholdMs parameter to ProcessingStats constructor - Use >= comparison so threshold=0 logs all blocks (useful for testing)
Add InternalsVisibleTo attributes to allow test projects to access ThreadLocal metric values for verification in slow block tests.
Add comprehensive unit tests for slow block JSON output: - Verify all 31 spec-required fields are present - Test configurable threshold (0 = log all blocks) - Validate field names match snake_case spec format - Test timing, state read/write, and cache metrics
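The snake_case check mentioned above could look roughly like the following on the consumer side. This is a Python sketch, not the PR's C# test code; the helper name and the regular expression are assumptions about what "snake_case" means here (lowercase letters and digits separated by single underscores).

```python
import re

# Assumed definition of snake_case for this sketch.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def non_snake_case_keys(node, path=""):
    """Return dotted paths of all object keys that are not snake_case."""
    bad = []
    if isinstance(node, dict):
        for key, value in node.items():
            full = f"{path}.{key}" if path else key
            if not SNAKE_CASE.match(key):
                bad.append(full)
            bad.extend(non_snake_case_keys(value, full))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            bad.extend(non_snake_case_keys(item, f"{path}[{i}]"))
    return bad
```

Running it over a parsed slow-block entry returns an empty list when every key follows the convention.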
Dead code — the MetricsSnapshot and MetricsDelta classes were never referenced anywhere in the codebase.
…cumulator Replace 16 separate ZeroContentionCounter instances (each with its own ThreadLocal<BoxedLong>) with a single ThreadLocal<ExecutionMetricsAccumulator>. This addresses Ben's review feedback about the number of additional thread locals on hot execution paths. The consolidated accumulator reduces ThreadLocal overhead by requiring only one TLS lookup per increment instead of one per counter. Prometheus exposure is preserved via SumExecutionMetric() which iterates the shared ThreadLocal's Values collection. Call sites now use direct field increments (e.g. ThreadExecutionMetrics.AccountReads++) instead of method calls.
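The consolidation idea can be sketched in Python as follows. This is not the Nethermind implementation: the class and field names are hypothetical, and Python's `threading.local` only stands in for a .NET `ThreadLocal<T>`. The point is one thread-local lookup per increment and aggregation over all per-thread accumulators at read time.

```python
import threading


class ExecutionCounters:
    """All per-thread counters live in one object (hypothetical fields)."""
    __slots__ = ("account_reads", "storage_reads")

    def __init__(self):
        self.account_reads = 0
        self.storage_reads = 0


class PerThreadAccumulator:
    def __init__(self):
        self._local = threading.local()
        self._all = []              # every thread's counters, for aggregation
        self._lock = threading.Lock()

    def current(self) -> ExecutionCounters:
        """Single thread-local lookup; registers the thread on first use."""
        counters = getattr(self._local, "counters", None)
        if counters is None:
            counters = ExecutionCounters()
            self._local.counters = counters
            with self._lock:
                self._all.append(counters)
        return counters

    def total(self, field: str) -> int:
        """Sum one field across all threads (the read side, off the hot path)."""
        with self._lock:
            return sum(getattr(c, field) for c in self._all)
```

A call site then does `acc.current().account_reads += 1`, touching only its own thread's object.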
…simplify test spec - Replace anonymous object in LogSlowBlock with named record types for structured JSON serialization (SlowBlockLog, SlowBlockInfo, etc.) - Simplify spec retrieval in MetricsIntegrationTests to use Prague.Instance directly instead of resolving via MainnetSpecProvider
…flicts with master - Replace ExecutionMetricsAccumulator with ZeroContentionCounter pattern - Remove Stopwatch.GetTimestamp() from per-read hot paths (StateProvider, PersistentStorageProvider) - Use Utf8JsonWriter for zero-allocation slow block JSON serialization - Collapse BlockData Start/Current pairs into pre-computed deltas - Adapt to master's refactored CodeInfoRepository, WrappedWorldState, IProcessingStats - Condense tests from ~1600 to ~530 lines
Address PR review feedback from flcl42.
…block log Add per-phase timing (blooms_ms, receipts_root_ms, storage_merkle_ms, state_root_ms, evm_ms) and extended EVM/block fields (opcodes, self_destructs, empty_calls, contracts_analyzed, cached_contracts_used, blob_count, gas_limit) to the slow block JSON output. execution_ms retains its original definition for backwards compatibility. All new timing is captured via the same ZeroContentionCounter pattern used by existing metrics — ~25μs overhead per block.
Wire slow block threshold to IBlocksConfig so it can be set via config file or CLI. Default: 1000ms (log blocks slower than 1s). Set to 0 to log all blocks, -1 to disable slow block logging entirely. When disabled (-1), CaptureStartStats and delta computation are skipped on the block-processing thread — zero overhead from slow block analysis.
Slow block logging is opt-in. Set Blocks.SlowBlockThresholdMs=1000 to enable logging blocks slower than 1s.
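The threshold semantics described in the last two commits (-1 disables, 0 logs every block, a positive value logs blocks at or above that duration) can be summarised in a short Python sketch. The function name is hypothetical, and comparing in microseconds is a choice made for this sketch.

```python
def should_log_slow_block(threshold_ms: int, processing_us: int) -> bool:
    """Decide whether a block is reported as slow.

    threshold_ms < 0  -> logging disabled
    threshold_ms == 0 -> every block is logged
    otherwise         -> logged when processing time >= threshold
    """
    if threshold_ms < 0:
        return False
    return processing_us >= threshold_ms * 1000
```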
New config flag Blocks.SlowBlockPerTxThresholdMs (default -1 = disabled).
When enabled alongside SlowBlockThresholdMs, each transaction exceeding
the per-tx threshold is included individually in the slow block JSON:
"transactions": [
{ "index": 42, "hash": "0x...", "gas_used": 8200000,
"execution_ms": 12.3, "type": "Legacy", "to": "0x..." }
]
Uses a thread-static PerTxTimingCollector with one Stopwatch.GetTimestamp
per tx (~20ns). When disabled, zero overhead — the collector is not even
checked in the tx loop.
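A consumer-side sketch of the per-transaction filter, in Python. The function name is hypothetical, and the use of `>=` is an assumption chosen so that a threshold of 0 includes every transaction; returning `None` models the array being omitted when the flag is disabled.

```python
from typing import Optional


def slow_transactions(txs: list, per_tx_threshold_ms: float) -> Optional[list]:
    """Return the entries for the "transactions" array, or None when disabled."""
    if per_tx_threshold_ms < 0:
        return None  # disabled: the array is not emitted at all
    return [tx for tx in txs if tx["execution_ms"] >= per_tx_threshold_ms]
```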
Claude finished @Marchhill's task in 3m 35s
PR Review: consensus: standardize slow block JSON output for cross-client metrics
Tasks:
Summary
All five blockers from LukaszRozmej's 2026-05-15 review have been addressed in the latest commit (
LukaszRozmej Blockers: All Resolved
Also fixed from Round 5 (Low):
All Prior Findings — Confirmed Fixed
Outstanding Low Items (acknowledged, non-blocking)
Positive Notes
All structural quality from prior reviews remains fully intact:
Pull request overview
This PR introduces standardized “slow block” JSON logging (intended for cross-client execution metrics) by adding new execution counters (state/code reads+writes, EIP-7702 delegation set/clear) and block-phase timing, plus configuration flags to enable slow-block and per-transaction reporting.
Changes:
- Add new EVM/state execution counters and block-phase timing sinks used by slow-block JSON logging.
- Add slow-block JSON emission in ProcessingStats with optional per-transaction timing capture.
- Add new Blocks config flags and integration tests validating metrics and JSON schema.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/Nethermind/Nethermind.State/StateProvider.cs | Adds account/code read/write metric increments during state access and code insertion. |
| src/Nethermind/Nethermind.State/PersistentStorageProvider.cs | Adds storage read/write metric increments via storage provider hooks. |
| src/Nethermind/Nethermind.State/PartialStorageProviderBase.cs | Makes Set virtual to allow metric-instrumented overrides. |
| src/Nethermind/Nethermind.Evm/Metrics.cs | Adds new execution counters + compile-time flag gating. |
| src/Nethermind/Nethermind.Evm/Instructions/EvmInstructions.Storage.cs | Adds storage-deletion metric increments on SSTORE-to-zero. |
| src/Nethermind/Nethermind.Evm/CodeInfoRepository.cs | Adds code read/byte counters and EIP-7702 delegation set/clear counters. |
| src/Nethermind/Nethermind.Evm.Test/MetricsIntegrationTests.cs | New tests asserting new EVM counters are incremented during tx processing. |
| src/Nethermind/Nethermind.Db/Metrics.cs | Exposes main-thread-only counters for block-level delta computation. |
| src/Nethermind/Nethermind.Core/Metric/MetricsTimer.cs | Adds reusable timer helper for tick-based metric sinks. |
| src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs | Adds slow-block JSON logging, config-driven thresholds, and per-tx timing snapshotting. |
| src/Nethermind/Nethermind.Consensus/Processing/PerTxTimingCollector.cs | Adds process-wide collector for per-transaction execution timing. |
| src/Nethermind/Nethermind.Consensus/Processing/IBlockProcessor.cs | Extends executor interface with optional per-tx timing hooks (default no-ops). |
| src/Nethermind/Nethermind.Consensus/Processing/BlockProcessor.ParallelBlockValidationTransactionsExecutor.cs | Wires per-tx timers into parallel tx execution path. |
| src/Nethermind/Nethermind.Consensus/Processing/BlockProcessor.cs | Wraps block phases with MetricsTimer sinks for timing breakdown counters. |
| src/Nethermind/Nethermind.Consensus/Processing/BlockProcessor.BlockValidationTransactionsExecutor.cs | Wires per-tx timers into sequential tx execution path. |
| src/Nethermind/Nethermind.Consensus.Test/Processing/SlowBlockLogEntry.cs | Adds typed model for deserializing slow-block JSON in tests. |
| src/Nethermind/Nethermind.Consensus.Test/Processing/SlowBlockIntegrationTests.cs | Adds integration tests validating metrics flow into slow-block JSON. |
| src/Nethermind/Nethermind.Consensus.Test/Processing/ProcessingStatsTests.cs | Adds schema/threshold/timing consistency tests for slow-block JSON. |
| src/Nethermind/Nethermind.Config/IBlocksConfig.cs | Adds SlowBlockThresholdMs and SlowBlockPerTxThresholdMs config items. |
| src/Nethermind/Nethermind.Config/BlocksConfig.cs | Implements new config properties with default -1 (disabled). |
Comments suppressed due to low confidence (6)
src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs:351
- Slow-block threshold comparison truncates microseconds to whole milliseconds (processingMicros / 1000). This can miss blocks that are just over the configured threshold due to integer floor. Compare in microseconds (e.g. processingMicros >= thresholdMs * 1000) or use a double ms value consistently.
// Log slow blocks in JSON format for cross-client performance analysis
// Only log when slow block threshold is enabled (>= 0)
if (_slowBlockThresholdMs >= 0)
{
long processingMs = data.ProcessingMicroseconds / 1000;
if (processingMs >= _slowBlockThresholdMs)
{
LogSlowBlock(block, data, mgasPerSec);
}
src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs:176
- PerTxTimingCollector is a static process-wide switch, but CaptureStartStats only ever enables it (SetEnabled(true)) and never disables it when SlowBlockPerTxThresholdMs < 0. If per-tx logging is enabled once (or another ProcessingStats instance enables it), later instances/blocks with per-tx disabled can still record timings and LogSlowBlock will treat a -1 threshold as 'log all' (negative ticks threshold). SetEnabled should be called with the current config value (true/false) each block, or make the collector instance-scoped.
// Enable per-tx timing on the current block-processing thread.
// Must be set here (not in Start()) because the async processing loop
// can resume on a different ThreadPool thread after each await.
if (_slowBlockPerTxThresholdMs >= 0)
{
PerTxTimingCollector.SetEnabled(true);
}
src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs:699
- The slow-block JSON 'evm' object omits cached-contract/cache-usage info even though the PR description schema includes it (e.g. cached_contracts_used) and ProcessingStats already tracks CurrentCachedContractsUsed/StartCachedContractsUsed. If cross-client standardization expects this field, add it to the JSON and the test schema/model accordingly.
writer.WriteStartObject("evm");
writer.WriteNumber("opcodes", data.CurrentOpCodes - data.StartOpCodes);
writer.WriteNumber("sload", data.CurrentSLoadOps - data.StartSLoadOps);
writer.WriteNumber("sstore", data.CurrentSStoreOps - data.StartSStoreOps);
writer.WriteNumber("calls", data.CurrentCallOps - data.StartCallOps);
writer.WriteNumber("empty_calls", data.CurrentEmptyCalls - data.StartEmptyCalls);
writer.WriteNumber("creates", data.CurrentCreatesOps - data.StartCreateOps);
writer.WriteNumber("self_destructs", data.CurrentSelfDestructOps - data.StartSelfDestructOps);
writer.WriteNumber("contracts_analyzed", data.CurrentContractsAnalyzed - data.StartContractsAnalyzed);
writer.WriteEndObject();
src/Nethermind/Nethermind.Consensus/Processing/ProcessingStats.cs:739
- LogSlowBlock swallows exceptions and logs only ex.Message at debug level, losing the stack trace and exception details. Log the exception object (or include stack trace) to make failures diagnosable, especially since slow-block logging runs on the ThreadPool and issues may be intermittent.
catch (Exception ex)
{
if (_logger.IsDebug) _logger.Debug($"Error logging slow block: {ex.Message}");
}
src/Nethermind/Nethermind.State/StateProvider.cs:733
- AccountReads is only incremented on cache misses (when the account isn't in _blockChanges). On cache hits you increment StateTreeCacheHits but not AccountReads, so slow-block JSON 'state_reads.accounts' ends up representing only misses and will duplicate cache.account.misses rather than total logical reads. If the intent is total reads (as implied by having separate cache hit/miss counts), increment AccountReads on both hit and miss paths (or rename the JSON field to clarify it's DB/state-trie reads only).
ref ChangeTrace accountChanges = ref CollectionsMarshal.GetValueRefOrAddDefault(_blockChanges, addressAsKey, out bool exists);
if (!exists)
{
Metrics.IncrementStateTreeReads();
EvmMetrics.IncrementAccountReads();
Account? account = _tree.Get(address);
accountChanges = new(account, account);
}
else
{
Metrics.IncrementStateTreeCacheHits();
}
src/Nethermind/Nethermind.Evm/Metrics.cs:171
- These new execution metrics use shared Interlocked increments. Under ParallelExecution, worker threads inherit IsBlockProcessingThread=true (see ProcessingStats remarks), so multiple workers will contend on the same '_main*' counters. If the goal is minimal overhead for cross-client metrics, consider per-thread counters (e.g., ThreadLocal/ThreadStatic/striped counters) and aggregating at snapshot time to reduce contention.
[CounterMetric]
[Description("Number of account reads during execution.")]
public static long AccountReads => _mainAccountReads + _otherAccountReads;
private static long _mainAccountReads;
private static long _otherAccountReads;
// Exposed for ProcessingStats so block-level deltas exclude background prewarmer activity.
internal static long MainThreadAccountReads => _mainAccountReads;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal static void IncrementAccountReads()
{
if (!ExecutionMetricsFlag.IsActive) return;
Interlocked.Increment(ref IsBlockProcessingThread ? ref _mainAccountReads : ref _otherAccountReads);
}
[CounterMetric]
[Description("Number of storage slot reads during execution.")]
public static long StorageReads => _mainStorageReads + _otherStorageReads;
private static long _mainStorageReads;
private static long _otherStorageReads;
internal static long MainThreadStorageReads => _mainStorageReads;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal static void IncrementStorageReads()
{
if (!ExecutionMetricsFlag.IsActive) return;
Interlocked.Increment(ref IsBlockProcessingThread ? ref _mainStorageReads : ref _otherStorageReads);
}
Review: focused on correctness and performance
Verdict
Feature is well-designed and test coverage is excellent. However, the PR description's claim that "there is zero overhead on block processing when disabled" is not quite accurate. The new EVM-side metric counters fire unconditionally in default builds (gated only by compile-time
Performance concerns
1. Hot-path atomics added regardless of
- Document ExecutionMetricsFlag's compile-time gate and unconditional default-build cost
- Cache BlobCount on BlockData; reuse in LogSlowBlock instead of re-summing transactions
- Reset all Delta* fields in BlockDataPolicy.Return
- Emit parallel_execution flag in the slow-block JSON (wired via BlocksConfig)
- Debug-log when executionMs/evmMs clamp negative (surfaces inconsistent sub-phase deltas)
- Document that the timing increments take TimeSpan ticks, not Stopwatch ticks
- Document PerTxTimingCollector / _enableTxTimingMetrics reliance on the parallel join barrier
- Drop redundant = false initializer and double blank line in BlockValidationTransactionsExecutor
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude review
Claude finished @Marchhill's task in 5m 50s
PR Review: consensus: standardize slow block JSON output for cross-client metrics
Tasks:
Summary
This is round 7. LukaszRozmej posted a review at 2026-05-20 14:15 UTC with 5 "Suggested fixes before merge." Marchhill committed two fixes today (56776b6 at 11:46 and ebd66dd at 16:30, the latter post-review) that address all 5 blockers. No new Critical or High issues found.
LukaszRozmej's Blockers (from 2026-05-20): All Resolved
Additional items addressed this round
All Prior Findings — Confirmed Fixed
Remaining Low Items (non-blocking, acknowledged)
Positive Notes (all still valid)
…eptions
- Replace SpinWait.SpinUntil polling in ProcessingStatsTests / SlowBlockIntegrationTests with a Monitor.Wait/PulseAll-based WaitableTestLogger; the test thread unblocks the instant the ThreadPool worker emits the slow-block JSON, no fixed sleeps under load.
- LogSlowBlock catch now logs at Error with the full Exception (stack trace included) instead of dropping it via ex.Message at Debug, matching the existing ExecuteFromThreadPool pattern. Block processing is still protected from log failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
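The wait/notify pattern behind a waitable test logger can be sketched in Python with `threading.Condition`, which plays the role of Monitor.Wait/PulseAll. This is an illustration only: the class and method names are hypothetical and do not mirror the actual WaitableTestLogger API.

```python
import threading


class WaitableLogger:
    """Test logger that lets a test block until a matching message arrives."""

    def __init__(self):
        self._cond = threading.Condition()
        self._messages = []

    def log(self, message: str) -> None:
        # Called from the worker thread; wakes every waiter immediately.
        with self._cond:
            self._messages.append(message)
            self._cond.notify_all()

    def wait_for(self, predicate, timeout: float = 5.0):
        """Return the first matching message, or None if the timeout expires."""
        def find():
            return next((m for m in self._messages if predicate(m)), None)

        with self._cond:
            self._cond.wait_for(lambda: find() is not None, timeout)
            return find()
```

The test thread resumes as soon as the message is logged, so no fixed sleep interval is needed.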
@claude re-review
Claude finished @benaadams's task in 6m 3s
PR Re-Review (Round 8): consensus: standardize slow block JSON output for cross-client metrics
Tasks:
Summary
All High/Medium findings from Rounds 1–7 remain confirmed fixed. The latest commit (
New Findings from
| Finding | Round | Status |
|---|---|---|
| Storage cache hit rate bug | 1 | ✅ Fixed |
| Per-tx timing all-zeros in parallel mode | 1 | ✅ Fixed |
| StorageWrites counted TSTORE | 1 | ✅ Fixed |
| Interface pollution (default no-op methods) | 1 | ✅ Fixed |
| State I/O prewarmer contamination | 3 | ✅ Fixed |
| cache.code.hits numerator/denominator asymmetry | 4 | ✅ Fixed |
| Account/storage trie cache prewarmer contamination | 4 | ✅ Fixed |
| PerTxTimingCollector._ticksPerTx plain heap array | 4 | ✅ Fixed |
| cached_contracts_used ≡ cache.code.hits | 5 | ✅ Fixed |
| SlowBlockLogEntry in wrong test assembly | 5 | ✅ Fixed |
| ProcessingStats ctor default discrepancy | 5 | ✅ Fixed |
| Parallel caveat not documented | 5–6 | ✅ Fixed |
| All five LukaszRozmej blockers | 5–7 | ✅ Fixed |
| Unconditional IsActive => true dead toggle | 6–7 | ✅ Fixed (#if NETHERMIND_NO_EXECUTION_METRICS) |
| Delta* fields not reset in BlockDataPolicy.Return | 7 | ✅ Fixed (all 18 fields reset) |
| parallel_execution not in JSON | 7 | ✅ Fixed |
| Negative-time fallback silently lies | 7 | ✅ Fixed (now debug-logs clamp) |
| PR description "zero overhead" claim | 7 | ✅ Fixed (ExecutionMetricsFlag docs now explicit) |
| Duplicate AccountReads/StorageReads Interlocked overhead | implicit | ✅ Removed in a03e6d1b |
Acknowledged Low (non-blocking, unchanged)
- PerTxTimingCollector statics not volatile: class <remarks> explicitly documents the join-barrier dependency on ParallelUnbalancedWork.For; safe on x86/.NET.
- Missing parallel path test: no coverage of ProcessTransactionsParallel for per-tx timing; single-block assumption is undocumented under actual concurrent workers.
- Test fixture duplication: MetricsIntegrationTests/SlowBlockIntegrationTests share ~30 lines of setup boilerplate; AGENTS.md recommends factoring this out.
Positive Notes
Everything praised in prior rounds still holds. The a03e6d1b commit is a net improvement: removing three hot-path Interlocked.Increment calls per state access (two for account reads, one for storage reads) reduces per-block overhead with no loss of accuracy, since the DB-level Db.Metrics trie counters already track this information. WaitableTestLogger with Monitor.Wait/PulseAll (replacing the old Thread.Sleep(50) polling) is a clean deterministic test fix.
@claude re-review
Claude finished @benaadams's task in 6m 9s
PR Re-Review (Round 9): consensus: standardize slow block JSON output for cross-client metrics
Tasks:
Summary
Three new commits since Round 8. All prior High/Medium findings remain fixed. The new commits are improvements; no new High or Medium issues found. This PR is ready to merge.
New Commits: Analysis
| Finding | Status |
|---|---|
| PR description example stale after a03e6d1b | ✅ Non-issue: PR description already shows state_reads.accounts: 322 = cache.account.misses: 322; numbers are self-consistent |
| Asymmetric state_reads.* semantics (accounts/code = misses, storage = hits+misses) | ✅ Documented in "State read field definitions" table; intentional by design |
| MetricsIntegrationTests uses combined counter | ✅ Fixed in 2c66f538 |
All Prior Findings — All Confirmed Fixed
| Finding | Round | Status |
|---|---|---|
| Storage cache hit rate bug | 1 | ✅ |
| Per-tx timing all-zeros in parallel mode | 1 | ✅ |
| StorageWrites counted TSTORE | 1 | ✅ |
| Interface pollution (default no-op methods) | 1 | ✅ |
| State I/O prewarmer contamination | 3 | ✅ |
| cache.code.hits numerator/denominator asymmetry | 4 | ✅ |
| Account/storage trie cache prewarmer contamination | 4 | ✅ |
| PerTxTimingCollector._ticksPerTx plain heap array | 4 | ✅ |
| cached_contracts_used ≡ cache.code.hits (removed) | 5 | ✅ (restored with doc in 9) |
| SlowBlockLogEntry in wrong test assembly | 5 | ✅ |
| ProcessingStats ctor default discrepancy | 5 | ✅ |
| Parallel caveat not documented | 5–6 | ✅ |
| All five LukaszRozmej Round 7 blockers | 7 | ✅ |
| ExecutionMetricsFlag.IsActive dead toggle | 7 | ✅ |
| Delta* fields not reset in BlockDataPolicy.Return | 7 | ✅ |
| Negative-time clamp silently lies | 7 | ✅ |
| Prewarmer contamination at source | 8→9 | ✅ Fixed in 68d6f79e |
| MetricsIntegrationTests combined counter | 8 | ✅ Fixed in 2c66f538 |
Remaining Low Items (non-blocking, acknowledged)
- evm.cached_contracts_used example inconsistency: the JSON example in the PR body shows cached_contracts_used: 97 and cache.code.hits: 180, but at runtime they are always equal (same MainThreadCodeDbCache delta). The "EVM field definitions" table documents the equivalence; consider updating the example to show equal values.
- PerTxTimingCollector statics not volatile: documented in class <remarks> as relying on the ParallelUnbalancedWork.For join-barrier; safe on x86/.NET.
- Missing parallel path test: no test covers ProcessTransactionsParallel for per-tx timing under actual concurrent workers.
- Test fixture duplication: MetricsIntegrationTests/SlowBlockIntegrationTests share ~30 lines of setup boilerplate.
Positive Notes
All structural quality from prior reviews still holds, and the [ThreadStatic] change in 68d6f79e makes the whole metrics attribution model materially cleaner: the MainThread* accessors now truly reflect main-thread-only work at the source, not just at the query boundary.
- Factor the duplicated SpecProvider / WorldState / VM / TxProcessor / Ecdsa setup out of MetricsIntegrationTests + SlowBlockIntegrationTests into an IDisposable EvmTestHarness in Nethermind.Evm.Test/Helpers/. Add Nethermind.Evm.Test as a ProjectReference of Nethermind.Consensus.Test so the harness is visible to both fixtures.
- Add PerTxTimingCollectorTests covering ProcessTransactionsParallel's contract on the static collector: 512 concurrent ParallelUnbalancedWork.For workers writing unique values via Record, plus zero-tx prepare, disabled-collector returns null, ownership-transfer cleanliness across blocks, and out-of-range Record being silently dropped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Updates #10288
Summary
Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.
This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz.
Motivation
Standardized execution metrics are critical for:
Real-world example: The EIP-7907 analysis used execution metrics to measure code read latency, per-call overhead scaling, and block execution breakdown. Without standardized metrics across clients, such analysis cannot be validated cross-client.
References
Configuration
Two new flags in Blocks config section:
| Flag | Default | Description |
|---|---|---|
| Blocks.SlowBlockThresholdMs | -1 (disabled) | Set to 1000 to log blocks slower than 1s. Set to 0 to log all blocks. |
| Blocks.SlowBlockPerTxThresholdMs | -1 (disabled) | Set to 0 to log all transactions. |
Both flags default to -1 (disabled). When slow-block logging is disabled, the block-level delta capture, per-transaction timing, and JSON serialization paths are skipped.
The exported execution counters added by this PR are controlled by the compile-time NETHERMIND_NO_EXECUTION_METRICS symbol, not by the runtime slow-block flags. Default builds keep those counters enabled for Prometheus; benchmark builds can define NETHERMIND_NO_EXECUTION_METRICS to let the JIT fold the gated counter/timer paths away.
Example usage
JSON Output Format
{
  "level": "warn",
  "msg": "Slow block",
  "parallel_execution": false,
  "block": { "number": 22075123, "hash": "0xabc...", "gas_used": 29850000, "gas_limit": 30000000, "tx_count": 142, "blob_count": 6 },
  "timing": { "execution_ms": 52.1, "evm_ms": 45.2, "blooms_ms": 1.3, "receipts_root_ms": 3.8, "commit_ms": 2.1, "storage_merkle_ms": 18.4, "state_root_ms": 5.2, "state_hash_ms": 23.6, "total_ms": 82.0 },
  "throughput": { "mgas_per_sec": 364.1 },
  "state_reads": { "accounts": 322, "storage_slots": 3201, "code": 35, "code_bytes": 482310 },
  "state_writes": { "accounts": 312, "accounts_deleted": 0, "storage_slots": 1805, "storage_slots_deleted": 12, "code": 3, "code_bytes": 8420, "eip7702_delegations_set": 0, "eip7702_delegations_cleared": 0 },
  "cache": {
    "account": { "hits": 1520, "misses": 322, "hit_rate": 82.52 },
    "storage": { "hits": 2800, "misses": 401, "hit_rate": 87.48 },
    "code": { "hits": 180, "misses": 35, "hit_rate": 83.72 }
  },
  "evm": { "opcodes": 1542000, "sload": 3201, "sstore": 1805, "calls": 2412, "empty_calls": 48, "creates": 3, "self_destructs": 0, "contracts_analyzed": 118, "cached_contracts_used": 97 },
  "transactions": [
    { "index": 42, "hash": "0xdef...", "gas_used": 8200000, "execution_ms": 12.3, "type": "Legacy", "to": "0x7a25..." },
    { "index": 107, "hash": "0x123...", "gas_used": 5100000, "execution_ms": 8.7, "type": "EIP1559", "to": "0xdead..." }
  ]
}
State read field definitions
The state_reads fields are derived from existing cache counters where possible so the hot path does not record the same event twice:
| Field | Definition |
|---|---|
| state_reads.accounts | cache.account.misses. |
| state_reads.storage_slots | cache.storage.hits + cache.storage.misses. |
| state_reads.code | cache.code.misses. |
| state_reads.code_bytes | |
EVM field definitions
| Field | Definition |
|---|---|
| evm.cached_contracts_used | cache.code.hits; retained because it is part of the original cross-client metrics spec. |
Timing field definitions
| Field | Definition |
|---|---|
| total_ms | |
| execution_ms | total - state_hash - commit (backwards-compatible) |
| evm_ms | total - state_hash - commit - blooms - receipts_root (pure EVM) |
| blooms_ms | |
| receipts_root_ms | |
| commit_ms | |
| storage_merkle_ms | Commit(commitRoots: true) |
| state_root_ms | |
| state_hash_ms | storage_merkle + state_root (total merkleization) |
The transactions array is only present when SlowBlockPerTxThresholdMs >= 0.
Implementation
- Db.Metrics trie cache/read counters.
- ExecutionMetricsFlag.
- Stopwatch.GetTimestamp() only at coarse block-phase boundaries and for optional per-transaction timing.
- Utf8JsonWriter + ArrayBufferWriter<byte> on the ThreadPool, not the block-processing thread.
- ArrayPoolList<long> storage and one Stopwatch.GetTimestamp() per transaction when enabled.
- UnsafeQueueUserWorkItem.
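The derived-field relationships stated in the definitions above lend themselves to a consumer-side consistency check. The following Python sketch is not part of the PR: the function names and the tolerance are hypothetical, and treating hit_rate as a percentage rounded to 2 decimals (with 0.0 for an empty cache) is an assumption read off the example output.

```python
def hit_rate(hits: int, misses: int) -> float:
    """Hit rate as a percentage rounded to 2 decimals (assumed convention)."""
    total = hits + misses
    return round(100.0 * hits / total, 2) if total else 0.0


def check_slow_block(entry: dict, tolerance_ms: float = 0.01) -> list:
    """Return a list of violated relationships for one slow-block entry."""
    problems = []
    reads, cache, timing = entry["state_reads"], entry["cache"], entry["timing"]

    if reads["accounts"] != cache["account"]["misses"]:
        problems.append("state_reads.accounts != cache.account.misses")

    storage = cache["storage"]
    if reads["storage_slots"] != storage["hits"] + storage["misses"]:
        problems.append("state_reads.storage_slots != storage hits + misses")

    if reads["code"] != cache["code"]["misses"]:
        problems.append("state_reads.code != cache.code.misses")

    # state_hash_ms is defined as storage_merkle + state_root.
    merkle = timing["storage_merkle_ms"] + timing["state_root_ms"]
    if abs(timing["state_hash_ms"] - merkle) > tolerance_ms:
        problems.append("state_hash_ms != storage_merkle_ms + state_root_ms")

    return problems
```

An empty list means the entry is internally consistent with respect to these four relationships; the subtraction-based timing fields are left out because the PR notes they may be clamped.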