feat(metrics): implement new metric groups for consumer and smart-router #2238
Review Summary by Qodo
Implement metric groups for consumer and smart-router with incident tracking and histogram-based latency
Walkthroughs
Description
• Implemented comprehensive metric groups for consumer and smart-router with organized semantic categories: cross-validation, request, latency, cache, and incident metrics
• Replaced individual gauge metrics with histogram-based latency measurements (HistogramVec) for end-to-end and provider latency tracking
• Simplified cross-validation metrics from 8 labels to 3 labels with separate provider agreement/disagreement counters
• Added incident tracking methods: RecordIncidentRetry, RecordIncidentConsistency, RecordIncidentHedgeResult for better error recovery visibility
• Extended RelayMetrics struct with request classification fields (IsWrite, IsArchive, IsDebugTrace, IsBatch, ProviderAddress, HedgeCount)
• Added helper functions for request classification: IsArchiveRequest, IsDebugOrTraceRequest, IsBatchRequest
• Removed deprecated metrics and methods: LatencyTracker, error recovery metrics, processing latency gauges, and ticker metric setter interfaces
• Updated method signatures across metrics managers and relay state machines for consistency
• Added 8 new comprehensive test files covering request groups, incident metrics (retry, consistency, hedge), cross-validation, cache, latency, and error metrics
• Removed obsolete error recovery metric tests and simplified test assertions
• Defined shared LatencyBuckets configuration for histogram-based latency metrics
Diagram
flowchart LR
A["Old Metrics<br/>Individual Gauges<br/>LatencyTracker"] -->|"Refactor"| B["New Metric Groups<br/>Cross-validation<br/>Request<br/>Latency<br/>Cache<br/>Incident"]
C["RelayMetrics<br/>Basic Fields"] -->|"Extend"| D["RelayMetrics<br/>+ Request Classification<br/>+ Provider Address<br/>+ Hedge Count"]
E["Error Recovery<br/>Metrics"] -->|"Replace"| F["Incident Tracking<br/>Retry/Consistency<br/>Hedge Results"]
G["Ticker Metric<br/>Setter Interface"] -->|"Remove"| H["Direct Analytics<br/>HedgeCount"]
B -->|"Test Coverage"| I["8 New Test Files<br/>Request/Incident/Cache<br/>Cross-validation/Latency"]
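The request classification described in the walkthrough can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this PR: only the `RelayMetrics` field names and the three helper names come from the summary above. The field types, helper signatures, the `archiveDepth` threshold, the `writeMethods` set, and the `Classify` function are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// RelayMetrics carries the classification fields named in the PR summary.
// Field types are assumed; only the names appear in the PR.
type RelayMetrics struct {
	IsWrite         bool
	IsArchive       bool
	IsDebugTrace    bool
	IsBatch         bool
	ProviderAddress string
	HedgeCount      int
}

// archiveDepth is a hypothetical cutoff: blocks older than this many
// blocks behind the latest one are treated as archive data.
const archiveDepth int64 = 128

// writeMethods is a hypothetical set of state-changing methods.
var writeMethods = map[string]bool{
	"eth_sendRawTransaction": true,
	"eth_sendTransaction":    true,
}

// IsArchiveRequest reports whether the requested block is older than
// archiveDepth blocks. Assumed signature and rule.
func IsArchiveRequest(requestedBlock, latestBlock int64) bool {
	return requestedBlock >= 0 && latestBlock-requestedBlock > archiveDepth
}

// IsDebugOrTraceRequest reports whether the method is in the debug_ or
// trace_ namespace. Assumed signature and rule.
func IsDebugOrTraceRequest(method string) bool {
	return strings.HasPrefix(method, "debug_") || strings.HasPrefix(method, "trace_")
}

// IsBatchRequest reports whether a raw JSON-RPC body is a JSON array,
// which is how JSON-RPC 2.0 encodes batches. Assumed signature.
func IsBatchRequest(body []byte) bool {
	return bytes.HasPrefix(bytes.TrimSpace(body), []byte("["))
}

// Classify is a hypothetical helper that fills the classification fields
// once per relay so every metric group can reuse the same labels.
func Classify(method string, body []byte, requestedBlock, latestBlock int64) RelayMetrics {
	return RelayMetrics{
		IsWrite:      writeMethods[method],
		IsArchive:    IsArchiveRequest(requestedBlock, latestBlock),
		IsDebugTrace: IsDebugOrTraceRequest(method),
		IsBatch:      IsBatchRequest(body),
	}
}

func main() {
	m := Classify("trace_block", []byte(`{"jsonrpc":"2.0"}`), 10, 5000)
	fmt.Printf("%+v\n", m)
}
```

Computing the classification once and passing it along in `RelayMetrics` keeps the label values consistent across the request, latency, and incident groups, which is presumably why the PR attaches these fields to the struct rather than recomputing them per metric.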
File Changes
1. protocol/metrics/consumer_metrics_manager.go
Code Review by Qodo
1. Chainlib tests missing underscore
User description
Description
Closes: #XXXX
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
`!` in the type prefix if API or client breaking change
`main` branch
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
Generated description
Below is a concise technical summary of the changes proposed in this PR:
Rearchitect `ConsumerMetricsManager` and `SmartRouterMetricsManager` to register structured groups of counters/histograms for request, cache, incident, and latency dimensions so every relay can report spec/api/method/provider labels with shared bucket sets. Enable the RPC consumer and smart-router flows to populate the new `RelayMetrics` classifications, cache stats, retries/hedges, and cross-validation metadata so headers and logs expose richer telemetry for downstream monitoring.