Skip to content

db/state/kvmetrics: process-level channel-fed KV-read metrics collector#21663

Merged
mh0lt merged 5 commits into
mh/perf-statecache-lru-prfrom
mh/metrics-collector-pr
Jun 10, 2026
Merged

db/state/kvmetrics: process-level channel-fed KV-read metrics collector#21663
mh0lt merged 5 commits into
mh/perf-statecache-lru-prfrom
mh/metrics-collector-pr

Conversation

@mh0lt

@mh0lt mh0lt commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Stacked on #21386 (the state-cache PR); this is the metrics-only change.

What

Replaces the per-task lock-merge of KV-read metrics with a process-level, channel-fed collector owned by the Aggregator, so every read path contributes (not just block execution) and the metrics carry a source label.

Why

changeset.DomainMetrics was execution-bound: only exec workers and the commitment fold (during exec) produced metrics, folding into a SharedDomains-scoped aggregate under a per-task lock. RPC/engine AsGetter reads collected nothing, so "KV read IO" really meant "IO during block execution." This makes IO observable process-wide, lock-free on the aggregate, and broken out by subsystem.

Design

  • New leaf package db/state/kvmetrics — relocates DomainIOMetrics/DomainMetrics + ctx helpers out of db/state/changeset (they never belonged there) and adds the Source enum, the Collector, and a shared LogMetrics(level, source, detail) formatter. Imports only kv + stdlib + the metrics façade → no import cycle.
  • Collector owned by the Aggregator (process lifetime): Start on a.wg, Stop+drain in Close. A single goroutine folds {source, metrics} samples into map[Source]*DomainMetrics with no lock/atomics on the aggregate, and self-publishes source-labelled Prometheus gauges (kv_read_count / kv_read_duration_ns, labels {source,domain,op}) on a ticker — additive to the existing mxExec* gauges. Reached from SharedDomains via the duck-typed MetricsCollectorProvider on *AggregatorRoTx (same pattern as BranchCacheProvider).
  • Never block, never drop. Exec workers retain an accumulator and hand it off with a non-blocking TrySend; on a full buffer they keep adding and retry next task, with one blocking flush at worker exit. Boundary producers (commitment fold, warmup teardown, RPC request close) use the blocking Send (off the hot path, lossless).
  • Sources: exec, commitment, warmup, and RPC (eth_simulation's SimulateV1, via a request-scoped accumulator flushed at Close). Engine block execution is already covered as exec (it runs through the exec workers).
  • The per-batch log line (sd.metrics) is kept unchanged via LogMergeMetrics.

Deliberately scoped out (follow-ups)

  • memBatch put-path (CachePut*) stays on the existing shared aggregate: those counters are load-bearing for SizeEstimate's flush accounting, so moving them belongs in a separate change.
  • SourceEngine and per-read-getter paths (exec_module CacheView, vm/runtime) build a getter per read and need a view-level accumulator to meter — left for a follow-up.

Verification

  • make erigon / go build ./... (the import-cycle gate), make lint 0 issues.
  • go test -race ./db/state/kvmetrics — incl. a test proving TrySend never blocks on a full buffer, concurrent Send+Snapshot+Close-drain, and correct grouped folding.
  • hive ethereum/engine cancun with KV_READ_METRICS=true at the CI-pinned hive ref = 226/0.

🤖 Generated with Claude Code

mh0lt and others added 4 commits June 7, 2026 10:13
Move DomainIOMetrics, DomainMetrics, the Update*/Merge/Reset methods, and the
context helpers out of db/state/changeset into a new leaf package
db/state/kvmetrics. The metrics types never belonged in the changeset package
(they describe KV-read IO, not changesets) and the move is a prerequisite for
the process-level collector: kvmetrics imports only kv + stdlib + the metrics
façade, so it can be imported from every layer that does state reads without an
import cycle.

Pure relocation — no behavior change. Also factors the per-batch log formatting
(formerly SharedDomains.LogMetrics/DomainLogMetrics) into a reusable
kvmetrics.FormatSummary/FormatPerDomain/LogMetrics that takes a log level,
source, and detail indicator, ready for the collector to drive.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a Collector, owned by the Aggregator (process lifetime), that aggregates
KV-read metrics across every read path. Producers fill their own *DomainMetrics
lock-free and hand ownership over a buffered channel tagged with a Source
(exec/commitment/warmup/rpc/engine); a single collector goroutine folds them
into map[Source]*DomainMetrics with no lock or atomics on the aggregate. The
goroutine also self-publishes source-labelled Prometheus gauges
(kv_read_count / kv_read_duration_ns, labels {source,domain,op}) on a ticker —
process-level and independent of whether a block is executing.

Wiring:
- Aggregator owns the Collector: Start in newAggregator (on a.wg), Stop+drain
  in Close before wg.Wait. Exposed to SharedDomains via the duck-typed
  kvmetrics.MetricsCollectorProvider on *AggregatorRoTx (same pattern as
  BranchCacheProvider), so the leaf kvmetrics package stays cycle-free.
- SharedDomains.MergeMetrics(source, wm) now hands a finished worker's metrics
  to BOTH sinks: the per-batch sd.metrics (under one lock, for the existing log
  line) and the collector (lock-free, for Prometheus). Ownership of wm transfers
  to the collector, so producers allocate a fresh instance afterwards.
- Producers tagged: exec workers (SourceExec, per task), the commitment fold
  (SourceCommitment), trie warmup and concurrent mount (SourceWarmup).
- AsGetterCollected(tx, source) gives concurrent short-lived callers (RPC,
  engine) a per-getter instance + flush closure; gated on KVReadLevelledMetrics.

The new gauges are additive — the existing exec-scoped mxExec* gauges and the
per-batch log line are unchanged. The memBatch put-path (CachePut*) is left on
the existing shared aggregate deliberately: those counters are load-bearing for
SizeEstimate's flush accounting, so moving them belongs in a separate change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a request-scoped read accumulator to SharedDomains
(StartRequestMetrics(source) / flush at Close) so a single-goroutine RPC handler
that reads through the plain AsGetter is metered without a shared accumulator or
a per-getter flush. getLatestMetered folds nil-metrics reads into it; this
short-circuits for exec workers (which pass their own per-worker instance), so
there is no cross-goroutine access to the request accumulator.

Wire eth_simulation's SimulateV1 to tag its reads as SourceRPC. Engine block
execution is already metered as SourceExec (it runs through the exec workers);
SourceEngine and the per-read-getter paths (exec_module CacheView, vm/runtime,
which build a getter per read) are left for a follow-up that needs a view-level
accumulator.

Also make Collector.Snapshot drain the buffered samples first so a snapshot
reflects everything sent so far, and add a -race collector test covering
concurrent Send + Snapshot + Close-drain.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…roducers

The exec hot path must never block on metrics and must never lose counts. A
buffered channel alone satisfies neither at the boundary: a full buffer blocks a
plain send, and a non-blocking send drops. Resolve both with retain-on-full.

- Collector.TrySend is non-blocking and returns whether the sample was queued.
  Exec workers keep a retained accumulator (collectorAcc): each task folds its
  reads in and TrySends; on a full buffer the send is skipped and the worker
  keeps adding to the same accumulator, retrying next task. A single blocking
  flush at worker exit drains the remainder (off the hot path, lossless). So
  execution never waits on metrics and no count is dropped.
- Collector.Send is the blocking variant for low-frequency boundary producers
  (commitment fold, warmup teardown, an RPC request closing) where a rare brief
  wait is fine and losing the sample is not.
- SharedDomains.LogMergeMetrics folds a task into the per-batch sd.metrics log
  aggregate only; the exec path uses it each task and feeds the collector
  separately via the retained accumulator, so the two sinks (which reset on
  different conditions) never double-count.
- Dropped the drop counter and its gauge — nothing is dropped now.

Validated: build, make lint (0 issues), go test -race ./db/state/kvmetrics
(incl. a test proving TrySend never blocks on a full buffer), and hive cancun
with KV_READ_METRICS=true at the pinned ref = 226/0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ripples the statecache-lru main merge up. Resolved 4 conflicts by combining
the metrics-collector additions with the rippled cache/commitment changes:

- domain_shared.go: keep the kvmetrics MetricsCollectorProvider lookup +
  statecache-lru's new TrieConfig commitment-context ctor (cfg, branchCache).
- aggregator.go: keep the kvmetrics collector init + statecache-lru's
  oldestVisible; the MeteredGetLatestWithTxN/getLatestWithTxN methods now take
  *kvmetrics.DomainMetrics (the metrics relocation changeset -> kvmetrics).
- commitment_context.go: Metrics() returns *kvmetrics.DomainMetrics.
- eth_simulation.go: keep StartRequestMetrics(SourceRPC); the defer toggle is
  now sharedDomains.SetDeferCommitmentUpdates(false) (renamed in the refactor).

Validated: go build ./...; commitment + kvmetrics + execmodule reorg tests;
make lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mh0lt mh0lt merged commit 864e593 into mh/perf-statecache-lru-pr Jun 10, 2026
@mh0lt mh0lt deleted the mh/metrics-collector-pr branch June 10, 2026 14:25
mh0lt added a commit that referenced this pull request Jun 11, 2026
…ters

The sstoreInsert/Update/Delete/Noop and hasStorageMiss package-global atomics
were incremented on the commitment/exec hot paths but their getters
(SstoreClassificationCounts, HasStorageMissCount) have no consumer anywhere in
the stack (#21380, #21386, the perfviz view) — write-only prototype perf-debug
scaffolding. The canonical metrics framework (kvmetrics, #21663) is in main and
covers KV reads, not these. Remove the counters, their Record* funcs/getters,
and the call sites; this also drops execution/state's only import of
execution/commitment.

If SSTORE classification is wanted in production later, express it via the
kvmetrics collector rather than bespoke globals.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant