
feat(metrics): implement new metric groups for consumer and smart-router #2238

Merged
nimrod-teich merged 1 commit into main from pr/metrics-groups on Mar 19, 2026

Conversation

NadavLevi commented Mar 17, 2026 (Contributor)

User description

Description

Closes: #XXXX


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • read the contribution guide
  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the main branch
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Generated description

Below is a concise technical summary of the changes proposed in this PR:
Rearchitect ConsumerMetricsManager and SmartRouterMetricsManager to register structured groups of counters/histograms for request, cache, incident, and latency dimensions so every relay can report spec/api/method/provider labels with shared bucket sets. Enable the RPC consumer and smart-router flows to populate the new RelayMetrics classifications, cache stats, retries/hedges, and cross-validation metadata so headers and logs expose richer telemetry for downstream monitoring.

Topic: Structured metrics
Details: Describe request, cache, incident, and latency metrics through grouped counters, histograms, and helper buckets so the consumer/smart-router can expose per-spec/api/method/provider telemetry consistently.
Modified files (14)
  • protocol/metrics/analytics.go
  • protocol/metrics/buckets.go
  • protocol/metrics/cache_metrics_test.go
  • protocol/metrics/consumer_metrics_manager.go
  • protocol/metrics/consumer_metrics_manager_inf.go
  • protocol/metrics/cross_validation_metrics_test.go
  • protocol/metrics/incident_consistency_metrics_test.go
  • protocol/metrics/incident_error_metrics_test.go
  • protocol/metrics/incident_hedge_metrics_test.go
  • protocol/metrics/incident_retry_metrics_test.go
  • protocol/metrics/latency_metrics_test.go
  • protocol/metrics/request_group_metrics_test.go
  • protocol/metrics/rpcconsumer_logs.go
  • protocol/metrics/smartrouter_metrics_manager.go
Latest Contributors (2)

| User | Commit | Date |
|---|---|---|
| nimrod.teich@gmail.com | chore-remove-badge-ser... | January 19, 2026 |
| oren-lava | feat-Add-Prometheus-me... | July 23, 2024 |
Topic: Other
Details: Other files
Modified files (1)
  • protocol/metrics/consumer_metrics_manager_test.go
Latest Contributors (2)

| User | Commit | Date |
|---|---|---|
| NadavLevi | feat-smart-router-impl... | March 11, 2026 |
| shleikes | File-name-changes-and-... | October 27, 2024 |
Topic: RPC telemetry
Details: Expand RPC consumer and smart-router flows (and supporting relay/core helpers) to feed the new metrics, report cache/provider totals, emit cross-validation/status headers, and track incident retries, consistency, and hedges for both direct and smart-router relays.
Modified files (21)
  • protocol/chainlib/grpc.go
  • protocol/chainlib/jsonRPC.go
  • protocol/chainlib/rest.go
  • protocol/chainlib/tendermintRPC.go
  • protocol/internal/chainqueries/chain_message_queries.go
  • protocol/internal/chainqueries/chain_message_queries_test.go
  • protocol/lavasession/consumer_session_manager.go
  • protocol/relaycore/analytics_propagation_test.go
  • protocol/relaycore/interfaces.go
  • protocol/relaycore/relay_processor.go
  • protocol/relaycore/relay_processor_test.go
  • protocol/relaycore/testing_helpers_test.go
  • protocol/relaycoretest/testing_helpers.go
  • protocol/rpcconsumer/consumer_relay_state_machine.go
  • protocol/rpcconsumer/consumer_relay_state_machine_test.go
  • protocol/rpcconsumer/rpcconsumer_server.go
  • protocol/rpcconsumer/rpcconsumer_server_test.go
  • protocol/rpcsmartrouter/rpcsmartrouter_server.go
  • protocol/rpcsmartrouter/rpcsmartrouter_server_test.go
  • protocol/rpcsmartrouter/smartrouter_relay_state_machine.go
  • protocol/rpcsmartrouter/smartrouter_relay_state_machine_test.go
Latest Contributors (2)

| User | Commit | Date |
|---|---|---|
| nimrod.teich@gmail.com | fix-single-element-JSO... | March 19, 2026 |
| NadavLevi | feat-smart-router-impl... | March 11, 2026 |

@qodo-code-review


Review Summary by Qodo

Implement metric groups for consumer and smart-router with incident tracking and histogram-based latency

✨ Enhancement 🧪 Tests


Walkthroughs

Description
• Implemented comprehensive metric groups for consumer and smart-router with organized semantic
  categories: cross-validation, request, latency, cache, and incident metrics
• Replaced individual gauge metrics with histogram-based latency measurements (HistogramVec) for
  end-to-end and provider latency tracking
• Simplified cross-validation metrics from 8 labels to 3 labels with separate provider
  agreement/disagreement counters
• Added incident tracking methods: RecordIncidentRetry, RecordIncidentConsistency,
  RecordIncidentHedgeResult for better error recovery visibility
• Extended RelayMetrics struct with request classification fields (IsWrite, IsArchive,
  IsDebugTrace, IsBatch, ProviderAddress, HedgeCount)
• Added helper functions for request classification: IsArchiveRequest, IsDebugOrTraceRequest,
  IsBatchRequest
• Removed deprecated metrics and methods: LatencyTracker, error recovery metrics, processing
  latency gauges, and ticker metric setter interfaces
• Updated method signatures across metrics managers and relay state machines for consistency
• Added 8 new comprehensive test files covering request groups, incident metrics (retry,
  consistency, hedge), cross-validation, cache, latency, and error metrics
• Removed obsolete error recovery metric tests and simplified test assertions
• Defined shared LatencyBuckets configuration for histogram-based latency metrics
Diagram
flowchart LR
  A["Old Metrics<br/>Individual Gauges<br/>LatencyTracker"] -->|"Refactor"| B["New Metric Groups<br/>Cross-validation<br/>Request<br/>Latency<br/>Cache<br/>Incident"]
  C["RelayMetrics<br/>Basic Fields"] -->|"Extend"| D["RelayMetrics<br/>+ Request Classification<br/>+ Provider Address<br/>+ Hedge Count"]
  E["Error Recovery<br/>Metrics"] -->|"Replace"| F["Incident Tracking<br/>Retry/Consistency<br/>Hedge Results"]
  G["Ticker Metric<br/>Setter Interface"] -->|"Remove"| H["Direct Analytics<br/>HedgeCount"]
  B -->|"Test Coverage"| I["8 New Test Files<br/>Request/Incident/Cache<br/>Cross-validation/Latency"]


File Changes

1. protocol/metrics/consumer_metrics_manager.go ✨ Enhancement +408/-422

Refactor metrics into organized groups with histograms

• Removed LatencyTracker struct and related latency tracking logic; replaced with histogram-based
 metrics
• Reorganized metrics into semantic groups: cross-validation, request, latency, cache, and incident
 metrics
• Replaced individual gauge metrics with HistogramVec for latency measurements (end-to-end and
 provider)
• Simplified cross-validation metric from 8 labels to 3 labels, with separate provider
 agreement/disagreement counters
• Removed deprecated metrics: totalRelaysRequestedMetric, totalErroredMetric, latencyMetric,
 endToEndLatencyMetric, and processing latency gauges
• Updated method signatures: SetCrossValidationMetric, SetRelayNodeErrorMetric,
 RecordEndToEndLatency, RecordProviderLatency with new parameters
• Added new incident tracking methods: RecordIncidentRetry, RecordIncidentConsistency,
 RecordIncidentHedgeResult
• Added cache metrics recording via RecordCacheResult method

protocol/metrics/consumer_metrics_manager.go


2. protocol/metrics/smartrouter_metrics_manager.go ✨ Enhancement +391/-96

Implement metric groups for smart router with incident tracking

• Moved LatencyBuckets constant to shared location (referenced as LatencyBuckets instead of
 defaultLatencyBuckets)
• Reorganized metrics into semantic groups: cross-validation, incident, request, and cache metrics
• Replaced routerCrossValidation single metric with 5 separate metrics for better cardinality
 control
• Removed routerTotalRelaysServiced and routerTotalErrored metrics; request tracking now via
 routerRequestsTotal and routerRequestsFailed
• Added incident group metrics: node errors, protocol errors, retries, consistency, and hedge
 tracking
• Added request group metrics with labels for read/write/batch/archive/debug-trace classification
• Added cache metrics group with hit/miss/latency tracking
• Updated SetProviderSelected signature to include apiInterface parameter
• Implemented StartSelectionStatsUpdater to periodically update endpoint selection scores from
 optimizer
• Updated RecordDirectRelayEnd to accept RelayMetrics parameter for request classification
• Added optimizerQoSClient field to manager for QoS score updates

protocol/metrics/smartrouter_metrics_manager.go


3. protocol/rpcconsumer/rpcconsumer_server.go ✨ Enhancement +76/-64

Update consumer relay metrics recording with new incident groups

• Removed rpcConsumerLogs parameter from NewRelayStateMachine calls (simplified state machine
 initialization)
• Added request classification fields to analytics: IsWrite, IsArchive, IsDebugTrace,
 IsBatch, ProviderAddress
• Added cache latency measurement and recording via RecordCacheResult with hit/miss tracking
• Replaced SetRelaySentToProviderMetric call with provider latency recording via
 RecordProviderLatency
• Updated SetProtocolError call to include apiInterface and method parameters
• Simplified cross-validation metric recording: replaced 8-parameter call with 5-parameter call
 using success boolean and provider lists
• Added incident metric recording for retries, consistency enforcement, and hedge operations
• Removed processing latency metric calls (AddMetricForProcessingLatencyBeforeProvider)
• Updated appendHeadersToRelayResult signature to accept analytics parameter for incident
 tracking

protocol/rpcconsumer/rpcconsumer_server.go


4. protocol/rpcsmartrouter/rpcsmartrouter_server.go ✨ Enhancement +57/-31

Update smart router relay metrics with incident tracking

• Removed rpcSmartRouterLogs parameter from NewSmartRouterRelayStateMachine calls
• Added request classification to analytics: IsWrite, IsArchive, IsDebugTrace, IsBatch
• Updated RecordDirectRelayEnd call to pass analytics parameter for request classification
• Added cache latency measurement and RecordCacheResult calls for cache hit/miss tracking
• Simplified cross-validation metric recording with new signature (5 parameters instead of 8)
• Added incident metric recording for retries, consistency, and hedge operations
• Removed processing latency metric call (AddMetricForProcessingLatencyBeforeProvider)
• Updated appendHeadersToRelayResult signature to accept analytics parameter

protocol/rpcsmartrouter/rpcsmartrouter_server.go


5. protocol/rpcsmartrouter/smartrouter_relay_state_machine.go Refactor +3/-9

Remove ticker metric setter interface from state machine

• Removed tickerMetricSetterInf interface and tickerMetricSetter field from state machine
• Removed tickerMetricSetter parameter from NewSmartRouterRelayStateMachine function
• Replaced SetRelaySentByNewBatchTickerMetric call with direct HedgeCount increment on
 analytics object
• Simplified batch ticker metric tracking by moving responsibility to caller

protocol/rpcsmartrouter/smartrouter_relay_state_machine.go


6. protocol/relaycore/relay_processor_test.go 🧪 Tests +3/-406

Remove obsolete error recovery metric tests

• Removed 403 lines of test code for error recovery metrics
 (TestNodeErrorsRecoveryMetricWithCrossValidation,
 TestProtocolErrorsRecoveryMetricWithCrossValidation, etc.)
• Simplified MockMetricsTracker struct by removing fields for tracking node and protocol error
 recovery calls
• Updated SetRelayNodeErrorMetric mock method signature to include method parameter
• Removed helper methods GetNodeErrorRecoveryCalls and GetProtocolErrorRecoveryCalls from mock

protocol/relaycore/relay_processor_test.go


7. protocol/rpcsmartrouter/rpcsmartrouter_server_test.go 🧪 Tests +18/-18

Update appendHeadersToRelayResult calls with nil parameter

• Added nil parameter to all appendHeadersToRelayResult method calls (12 occurrences)
• Updated method signature to accept an additional parameter at the end
• No changes to test logic or assertions

protocol/rpcsmartrouter/rpcsmartrouter_server_test.go


8. protocol/rpcconsumer/rpcconsumer_server_test.go 🧪 Tests +18/-18

Update appendHeadersToRelayResult calls with nil parameter

• Added nil parameter to all appendHeadersToRelayResult method calls (12 occurrences)
• Updated method signature to accept an additional parameter at the end
• No changes to test logic or assertions

protocol/rpcconsumer/rpcconsumer_server_test.go


9. protocol/metrics/request_group_metrics_test.go 🧪 Tests +252/-0

Add comprehensive request group metrics tests

• New test file with 252 lines covering request-group metric counters
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager using shared test cases
• Validates counters for total, success, failed, read, write, archive, debug, and batch request
 types
• Tests partition invariants and nil-manager safety

protocol/metrics/request_group_metrics_test.go
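
The partition invariants these tests check can be illustrated with a minimal sketch. It is not the PR's implementation: `requestGroup` is a hypothetical stand-in for the Prometheus counters, the `Success` field is assumed, and treating every non-write request as a read is an assumption made only so that read + write equals the total.

```go
package main

import "fmt"

// RelayMetrics holds only the classification fields named in this PR.
// Success is an assumed field; the real struct carries more data.
type RelayMetrics struct {
	Success      bool
	IsWrite      bool
	IsArchive    bool
	IsDebugTrace bool
	IsBatch      bool
}

// requestGroup is a hypothetical stand-in for the request-group counters.
type requestGroup struct {
	total, success, failed     int
	read, write                int
	archive, debugTrace, batch int
}

// record updates every counter that applies to one finished relay.
// Calling it on a nil group or with nil metrics is a no-op.
func (g *requestGroup) record(m *RelayMetrics) {
	if g == nil || m == nil {
		return
	}
	g.total++
	if m.Success {
		g.success++
	} else {
		g.failed++
	}
	// Assumption: any request that is not a write counts as a read.
	if m.IsWrite {
		g.write++
	} else {
		g.read++
	}
	if m.IsArchive {
		g.archive++
	}
	if m.IsDebugTrace {
		g.debugTrace++
	}
	if m.IsBatch {
		g.batch++
	}
}

func main() {
	g := &requestGroup{}
	g.record(&RelayMetrics{Success: true, IsWrite: true})
	g.record(&RelayMetrics{Success: false, IsArchive: true})
	g.record(&RelayMetrics{Success: true, IsBatch: true})

	// Both partitions should add up to the total.
	fmt.Println(g.total == g.success+g.failed, g.total == g.read+g.write)
}
```

Archive, debug/trace, and batch are independent flags here, so they overlap with the read/write split rather than partitioning it.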


10. protocol/rpcsmartrouter/smartrouter_relay_state_machine_test.go 🧪 Tests +10/-11

Remove metrics parameter from state machine constructor calls

• Removed relaycoretest.RelayProcessorMetrics parameter from 11 NewSmartRouterRelayStateMachine
 calls
• Updated constructor signature to no longer require metrics parameter
• No changes to test logic or assertions

protocol/rpcsmartrouter/smartrouter_relay_state_machine_test.go


11. protocol/rpcconsumer/consumer_relay_state_machine_test.go 🧪 Tests +9/-10

Remove metrics parameter from state machine constructor calls

• Removed relaycoretest.RelayProcessorMetrics parameter from 11 NewRelayStateMachine calls
• Updated constructor signature to no longer require metrics parameter
• No changes to test logic or assertions

protocol/rpcconsumer/consumer_relay_state_machine_test.go


12. protocol/metrics/incident_hedge_metrics_test.go 🧪 Tests +168/-0

Add incident hedge metrics test coverage

• New test file with 168 lines covering incident hedge metrics
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager
• Validates RecordIncidentHedgeResult method with success/failure tracking and attempts histogram
• Tests partition invariants (total == success + failed) and nil-manager safety

protocol/metrics/incident_hedge_metrics_test.go


13. protocol/metrics/incident_retry_metrics_test.go 🧪 Tests +169/-0

Add incident retry metrics test coverage

• New test file with 169 lines covering incident retry metrics
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager
• Validates RecordIncidentRetry method with success/failure tracking and attempts histogram
• Tests partition invariants (total == success + failed) and nil-manager safety

protocol/metrics/incident_retry_metrics_test.go
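
As a rough illustration of what these tests assert, the sketch below models a retry recorder with plain integers instead of Prometheus collectors. The parameter order follows the `RecordIncidentRetry(chainId, apiInterface, apiName, totalRetries, success)` call quoted in the review further down; the `retryMetrics`, `retryLabels`, and `retryCounts` types are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// retryLabels is a hypothetical label set for one counter series.
type retryLabels struct{ chainID, apiInterface, method string }

// retryCounts stands in for the total/success/failed counters and the
// attempts histogram (only its sum is kept here).
type retryCounts struct{ total, success, failed, attemptsSum int }

// retryMetrics is a hypothetical, simplified metrics manager.
type retryMetrics struct {
	mu       sync.Mutex
	byLabels map[retryLabels]*retryCounts
}

// RecordIncidentRetry records one relay that needed retries.
// A nil receiver is a no-op, so callers do not need their own nil checks.
func (m *retryMetrics) RecordIncidentRetry(chainID, apiInterface, method string, attempts int, success bool) {
	if m == nil {
		return
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.byLabels == nil {
		m.byLabels = make(map[retryLabels]*retryCounts)
	}
	key := retryLabels{chainID, apiInterface, method}
	c, ok := m.byLabels[key]
	if !ok {
		c = &retryCounts{}
		m.byLabels[key] = c
	}
	c.total++
	c.attemptsSum += attempts
	if success {
		c.success++
	} else {
		c.failed++
	}
}

func main() {
	m := &retryMetrics{}
	m.RecordIncidentRetry("ETH1", "jsonrpc", "eth_call", 2, true)
	m.RecordIncidentRetry("ETH1", "jsonrpc", "eth_call", 3, false)

	c := m.byLabels[retryLabels{"ETH1", "jsonrpc", "eth_call"}]
	fmt.Println(c.total == c.success+c.failed, c.attemptsSum)

	var none *retryMetrics
	none.RecordIncidentRetry("ETH1", "jsonrpc", "eth_call", 1, true) // safe no-op
}
```

The mutex is there because the call site quoted in the review invokes the recorder from a goroutine; real Prometheus collectors are already safe for concurrent use.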


14. protocol/metrics/cross_validation_metrics_test.go 🧪 Tests +178/-0

Add cross-validation metrics test coverage

• New test file with 178 lines covering cross-validation metrics
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager
• Validates SetCrossValidationMetric with agreeing/disagreeing provider tracking
• Tests partition invariants (success + failed == total) and nil-manager safety

protocol/metrics/cross_validation_metrics_test.go


15. protocol/metrics/cache_metrics_test.go 🧪 Tests +150/-0

Add cache metrics test coverage

• New test file with 150 lines covering cache metrics
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager
• Validates RecordCacheResult method for cache hits/misses with latency tracking
• Tests partition invariants and nil-manager safety

protocol/metrics/cache_metrics_test.go
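
The hit/miss and latency tracking described above might look roughly like the following. This is only a sketch: the real `RecordCacheResult` signature is not shown in this PR page, so the `(hit bool, latency time.Duration)` parameters, the `cacheMetrics` type, and the `lookup` helper are all assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// cacheMetrics is a hypothetical stand-in for the cache metric group.
type cacheMetrics struct {
	hits, misses int
	latencies    []time.Duration // stand-in for a latency histogram
}

// RecordCacheResult counts one cache lookup and keeps its latency.
// A nil receiver is a no-op.
func (c *cacheMetrics) RecordCacheResult(hit bool, latency time.Duration) {
	if c == nil {
		return
	}
	if hit {
		c.hits++
	} else {
		c.misses++
	}
	c.latencies = append(c.latencies, latency)
}

// lookup times a cache read and reports the outcome to the metrics.
func lookup(cache map[string]string, key string, m *cacheMetrics) (string, bool) {
	start := time.Now()
	value, hit := cache[key]
	m.RecordCacheResult(hit, time.Since(start))
	return value, hit
}

func main() {
	cache := map[string]string{"eth_blockNumber": "0x10"}
	m := &cacheMetrics{}

	lookup(cache, "eth_blockNumber", m) // hit
	lookup(cache, "eth_chainId", m)     // miss

	fmt.Println(m.hits, m.misses, len(m.latencies))
}
```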


16. protocol/metrics/rpcconsumer_logs.go ✨ Enhancement +26/-41

Refactor metrics interface with new incident tracking methods

• Removed calls to SetRelaySentToProviderMetric and SetRequestPerProvider from
 SetRelaySentToProviderMetric method
• Updated SetRelayNodeErrorMetric signature to accept `chainId, apiInterface, providerAddress,
 method` parameters
• Removed SetNodeErrorRecoveredSuccessfullyMetric and
 SetProtocolErrorRecoveredSuccessfullyMetric methods
• Updated SetCrossValidationMetric signature to use success boolean and provider lists instead
 of status string and participant counts
• Updated SetProtocolError signature to include apiInterface and method parameters
• Added new methods: RecordIncidentRetry, RecordIncidentConsistency,
 RecordIncidentHedgeResult, RecordEndToEndLatency, RecordCacheResult, RecordProviderLatency
• Removed old latency methods: AddMetricForProcessingLatencyBeforeProvider,
 AddMetricForProcessingLatencyAfterProvider, SetEndToEndLatency

protocol/metrics/rpcconsumer_logs.go


17. protocol/metrics/incident_error_metrics_test.go 🧪 Tests +140/-0

Add incident error metrics test coverage

• New test file with 140 lines covering incident error metrics
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager
• Validates SetRelayNodeErrorMetric and SetProtocolError methods with provider and method labels
• Tests accumulation across calls and nil-manager safety

protocol/metrics/incident_error_metrics_test.go


18. protocol/metrics/consumer_metrics_manager_inf.go ✨ Enhancement +26/-32

Update metrics interface with new method signatures

• Updated NoOpConsumerMetrics implementations to match new interface signatures
• Removed old methods: SetRelaySentToProviderMetric, SetRelaySentByNewBatchTickerMetric,
 SetRequestPerProvider, SetRelayProcessingLatencyBeforeProvider,
 SetRelayProcessingLatencyAfterProvider, SetEndToEndLatency,
 SetNodeErrorRecoveredSuccessfullyMetric, SetProtocolErrorRecoveredSuccessfullyMetric
• Added new methods: RecordEndToEndLatency, RecordProviderLatency, RecordCacheResult,
 RecordIncidentRetry, RecordIncidentConsistency, RecordIncidentHedgeResult
• Updated method signatures for SetRelayNodeErrorMetric, SetProtocolError,
 SetCrossValidationMetric, and SetProviderSelected
• Updated ConsumerMetricsManagerInf interface documentation with new method groupings

protocol/metrics/consumer_metrics_manager_inf.go


19. protocol/metrics/analytics.go ✨ Enhancement +19/-14

Extend RelayMetrics with request classification fields

• Added new fields to RelayMetrics struct: ProviderAddress, IsWrite, IsArchive,
 IsDebugTrace, IsBatch, HedgeCount
• Removed MeasureAfterProviderProcessingTime field
• Reorganized struct fields with comments explaining request classification and incident tracking

protocol/metrics/analytics.go


20. protocol/metrics/incident_consistency_metrics_test.go 🧪 Tests +126/-0

Add comprehensive tests for incident consistency metrics

• New test file with 126 lines of test coverage for incident consistency metrics
• Tests for both ConsumerMetricsManager and SmartRouterMetricsManager implementations
• Validates success/failure tracking, total metric calculations, and nil-safety
• Helper functions to instantiate test managers with Prometheus counter vectors

protocol/metrics/incident_consistency_metrics_test.go


21. protocol/metrics/latency_metrics_test.go 🧪 Tests +98/-0

Add latency metrics tests for consumer and smart-router

• New test file with 98 lines covering end-to-end and provider latency metrics
• Tests RecordEndToEndLatency and RecordProviderLatency methods for ConsumerMetricsManager
• Validates histogram observations, sample counts, and sample sums using Prometheus registry
• Includes nil-safety tests for both consumer and smart-router managers

protocol/metrics/latency_metrics_test.go


22. protocol/rpcconsumer/consumer_relay_state_machine.go Refactor +3/-9

Refactor ticker metrics to use analytics hedge count

• Removed tickerMetricSetterInf interface definition and related field from
 ConsumerRelayStateMachine
• Removed tickerMetricSetter parameter from NewRelayStateMachine constructor
• Replaced ticker metric setter call with direct analytics.HedgeCount++ increment
• Simplified metric recording by removing goroutine-based async metric setter invocation

protocol/rpcconsumer/consumer_relay_state_machine.go


23. protocol/chainlib/chain_message_queries_test.go 🧪 Tests +84/-0

Add tests for chain message query helper functions

• New test file with 84 lines covering chain message query helper functions
• Tests IsArchiveRequest, IsDebugOrTraceRequest, and IsBatchRequest functions
• Uses mocks to validate extension detection, addon classification, and batch detection
• Covers edge cases including nil extensions, multiple extensions, and various addon types

protocol/chainlib/chain_message_queries_test.go


24. protocol/relaycore/relay_processor.go Refactor +1/-16

Simplify relay processor metrics and update signatures

• Removed retry metrics recording logic for node and protocol errors in HasRequiredNodeResults
• Updated SetRelayNodeErrorMetric call signature to include method parameter and reorder
 arguments
• Removed unused strconv import
• Simplified error handling by removing conditional metric recording for stateless selection

protocol/relaycore/relay_processor.go


25. protocol/lavasession/consumer_session_manager.go ✨ Enhancement +2/-2

Add API interface parameter to provider selection metrics

• Updated two SetProviderSelected method calls to include csm.rpcEndpoint.ApiInterface parameter
• Ensures consistent metric recording with API interface information for provider selection
• Changes applied in both single-provider and multi-provider selection paths

protocol/lavasession/consumer_session_manager.go


26. protocol/relaycoretest/testing_helpers.go Refactor +2/-8

Update relay processor metrics mock interface signatures

• Updated SetRelayNodeErrorMetric mock signature to accept `chainId, apiInterface,
 providerAddress, method` parameters
• Removed SetNodeErrorRecoveredSuccessfullyMetric and
 SetProtocolErrorRecoveredSuccessfullyMetric mock methods
• Replaced SetRelaySentByNewBatchTickerMetric with RecordHedgeRelaySent method accepting
 method parameter

protocol/relaycoretest/testing_helpers.go


27. protocol/chainlib/chain_message_queries.go ✨ Enhancement +23/-0

Add chain message query helper functions

• Added three new helper functions: IsArchiveRequest, IsDebugOrTraceRequest, and IsBatchRequest
• IsArchiveRequest checks for archive extension in chain message extensions
• IsDebugOrTraceRequest validates if request belongs to debug or trace addon
• IsBatchRequest checks if message is a batch request
• Added import for extensionslib package

protocol/chainlib/chain_message_queries.go
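
To make the three helpers concrete, here is a self-contained sketch that follows the behavior exercised by the tests quoted in the review below. The `ChainMessage` interface is a simplified, hypothetical stand-in: `AddOn()` flattens the `GetApiCollection().CollectionData.AddOn` access seen in the tests, `IsBatch()` is an assumed accessor, and `"archive"` is an assumed value for `extensionslib.ArchiveExtension`.

```go
package main

import "fmt"

// archiveExtension is an assumed value for extensionslib.ArchiveExtension.
const archiveExtension = "archive"

// Extension mirrors only the Name field used by the helper.
type Extension struct{ Name string }

// ChainMessage is a hypothetical, reduced view of the real interface.
type ChainMessage interface {
	GetExtensions() []*Extension
	AddOn() string
	IsBatch() bool
}

// IsArchiveRequest reports whether any extension on the message is the archive extension.
func IsArchiveRequest(msg ChainMessage) bool {
	for _, ext := range msg.GetExtensions() {
		if ext != nil && ext.Name == archiveExtension {
			return true
		}
	}
	return false
}

// IsDebugOrTraceRequest reports whether the message belongs to the debug or trace addon.
// The match is exact, so "debugx" does not qualify.
func IsDebugOrTraceRequest(msg ChainMessage) bool {
	addon := msg.AddOn()
	return addon == "debug" || addon == "trace"
}

// IsBatchRequest reports whether the message is a batch request.
func IsBatchRequest(msg ChainMessage) bool {
	return msg.IsBatch()
}

// fakeMessage is a test double for the sketch.
type fakeMessage struct {
	extensions []*Extension
	addon      string
	batch      bool
}

func (f fakeMessage) GetExtensions() []*Extension { return f.extensions }
func (f fakeMessage) AddOn() string               { return f.addon }
func (f fakeMessage) IsBatch() bool               { return f.batch }

func main() {
	msg := fakeMessage{
		extensions: []*Extension{{Name: "other"}, {Name: archiveExtension}},
		addon:      "trace",
	}
	fmt.Println(IsArchiveRequest(msg), IsDebugOrTraceRequest(msg), IsBatchRequest(msg))
}
```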


28. protocol/relaycore/testing_helpers_test.go Refactor +2/-8

Update relay processor test mock interface signatures

• Updated SetRelayNodeErrorMetric mock signature to include method parameter
• Removed SetNodeErrorRecoveredSuccessfullyMetric and
 SetProtocolErrorRecoveredSuccessfullyMetric mock methods
• Replaced SetRelaySentByNewBatchTickerMetric with RecordHedgeRelaySent method

protocol/relaycore/testing_helpers_test.go


29. protocol/relaycore/analytics_propagation_test.go 🧪 Tests +2/-6

Simplify analytics propagation test assertions

• Simplified test assertions by removing checks for MeasureAfterProviderProcessingTime field
• Updated test comments to be more concise and descriptive
• Maintained core validation of ProcessingTimestamp before and after relay

protocol/relaycore/analytics_propagation_test.go


30. protocol/relaycore/interfaces.go Refactor +1/-3

Refactor metrics interface signatures and remove retry metrics

• Updated MetricsInterface by changing SetRelayNodeErrorMetric signature to accept `chainId,
 apiInterface, providerAddress, method` parameters
• Removed SetNodeErrorRecoveredSuccessfullyMetric and
 SetProtocolErrorRecoveredSuccessfullyMetric method definitions
• Simplified interface to focus on relay node error metrics only

protocol/relaycore/interfaces.go


31. protocol/chainlib/jsonRPC.go Refactor +1/-3

Remove post-response latency metric recording

• Removed call to apil.logger.AddMetricForProcessingLatencyAfterProvider in response handler
• Removed call to apil.logger.SetEndToEndLatency after sending response
• Simplified post-response processing by removing metric recording logic

protocol/chainlib/jsonRPC.go


32. protocol/metrics/buckets.go ✨ Enhancement +7/-0

Define shared latency histogram buckets configuration

• New file defining LatencyBuckets variable with 14 histogram bucket thresholds in milliseconds
• Buckets range from 1ms to 30000ms to capture latency distribution across cache hits, in-process
 operations, and network delays
• Shared bucket configuration for all latency metrics in the metrics package

protocol/metrics/buckets.go
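
A shared bucket list of this shape feeds a cumulative histogram, which the sketch below imitates without the Prometheus client. The 14 threshold values are illustrative guesses that only match the stated count and the 1ms to 30000ms range; they are not the values in `buckets.go`, and the `histogram` type is hypothetical.

```go
package main

import "fmt"

// latencyBuckets holds illustrative upper bounds in milliseconds.
// Assumption: the real LatencyBuckets values may differ.
var latencyBuckets = []float64{1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 20000, 30000}

// histogram is a minimal cumulative histogram in the Prometheus style:
// counts[i] is the number of observations <= bounds[i].
type histogram struct {
	bounds []float64
	counts []uint64
	count  uint64 // all observations, including those above the last bound
	sum    float64
}

func newHistogram(bounds []float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]uint64, len(bounds))}
}

// Observe adds one sample to every bucket whose upper bound covers it.
func (h *histogram) Observe(ms float64) {
	h.count++
	h.sum += ms
	for i, upper := range h.bounds {
		if ms <= upper {
			h.counts[i]++
		}
	}
}

func main() {
	h := newHistogram(latencyBuckets)
	h.Observe(3)     // e.g. a cache hit
	h.Observe(120)   // e.g. a provider round trip
	h.Observe(45000) // above the last bound, counted only in the total

	for i, upper := range h.bounds {
		fmt.Printf("le=%g count=%d\n", upper, h.counts[i])
	}
	fmt.Printf("count=%d sum=%g\n", h.count, h.sum)
}
```

Sharing one bucket list across end-to-end, provider, and cache latency keeps the series directly comparable, at the cost of some unused buckets on the fastest paths.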


33. protocol/chainlib/grpc.go Additional files +0/-2

...

protocol/chainlib/grpc.go


34. protocol/chainlib/rest.go Additional files +0/-4

...

protocol/chainlib/rest.go


35. protocol/chainlib/tendermintRPC.go Additional files +0/-4

...

protocol/chainlib/tendermintRPC.go


36. protocol/metrics/consumer_metrics_manager_test.go Additional files +0/-179

...

protocol/metrics/consumer_metrics_manager_test.go



@qodo-code-review

qodo-code-review Bot commented Mar 17, 2026


Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (1) 📎 Requirement gaps (0)



Action required

1. Chainlib tests missing underscore 📘 Rule violation ⛯ Reliability
Description
The newly added chainlib tests use names like TestIsArchiveRequest/TestIsBatchRequest that do
not follow the required TestComponent_Scenario naming convention. This violates the repository
test naming standard.
Code

protocol/chainlib/chain_message_queries_test.go[R12-70]

+func TestIsArchiveRequest(t *testing.T) {
+	t.Run("no extensions returns false", func(t *testing.T) {
+		ctrl := gomock.NewController(t)
+		msg := NewMockChainMessage(ctrl)
+		msg.EXPECT().GetExtensions().Return(nil)
+		require.False(t, IsArchiveRequest(msg))
+	})
+
+	t.Run("archive extension returns true", func(t *testing.T) {
+		ctrl := gomock.NewController(t)
+		msg := NewMockChainMessage(ctrl)
+		msg.EXPECT().GetExtensions().Return([]*spectypes.Extension{{Name: extensionslib.ArchiveExtension}})
+		require.True(t, IsArchiveRequest(msg))
+	})
+
+	t.Run("unrelated extension returns false", func(t *testing.T) {
+		ctrl := gomock.NewController(t)
+		msg := NewMockChainMessage(ctrl)
+		msg.EXPECT().GetExtensions().Return([]*spectypes.Extension{{Name: "other"}})
+		require.False(t, IsArchiveRequest(msg))
+	})
+
+	t.Run("archive among multiple extensions returns true", func(t *testing.T) {
+		ctrl := gomock.NewController(t)
+		msg := NewMockChainMessage(ctrl)
+		msg.EXPECT().GetExtensions().Return([]*spectypes.Extension{
+			{Name: "other"},
+			{Name: extensionslib.ArchiveExtension},
+		})
+		require.True(t, IsArchiveRequest(msg))
+	})
+}
+
+func TestIsDebugOrTraceRequest(t *testing.T) {
+	cases := []struct {
+		addon    string
+		expected bool
+	}{
+		{"debug", true},
+		{"trace", true},
+		{"", false},
+		{"eth", false},
+		{"debugx", false},
+	}
+
+	for _, tc := range cases {
+		tc := tc
+		t.Run("addon="+tc.addon, func(t *testing.T) {
+			ctrl := gomock.NewController(t)
+			msg := NewMockChainMessage(ctrl)
+			msg.EXPECT().GetApiCollection().Return(&spectypes.ApiCollection{
+				CollectionData: spectypes.CollectionData{AddOn: tc.addon},
+			})
+			require.Equal(t, tc.expected, IsDebugOrTraceRequest(msg))
+		})
+	}
+}
+
+func TestIsBatchRequest(t *testing.T) {
Evidence
PR Compliance ID 6 requires TestComponent_Scenario naming. The added tests TestIsArchiveRequest,
TestIsDebugOrTraceRequest, and TestIsBatchRequest contain no underscore-delimited scenario
portion.

AGENTS.md
protocol/chainlib/chain_message_queries_test.go[12-12]
protocol/chainlib/chain_message_queries_test.go[45-45]
protocol/chainlib/chain_message_queries_test.go[70-70]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Newly added tests do not follow the required `TestComponent_Scenario` naming convention (missing underscore scenario).

## Issue Context
Compliance requires consistent test naming across the repo for clarity and discoverability.

## Fix Focus Areas
- protocol/chainlib/chain_message_queries_test.go[12-70]



2. Incidents always marked success 🐞 Bug ✧ Quality
Description
RPCConsumerServer.appendHeadersToRelayResult passes relayResult != nil as the success flag for
incident metrics, but RelayProcessor can return a non-nil RelayResult while still returning an
error. This misclassifies failed relays as successful for retry/consistency/hedge incident metrics.
Code

protocol/rpcconsumer/rpcconsumer_server.go[R1949-1953]

+			// Record retry incident metrics
+			if rpccs.listenEndpoint != nil && rpccs.rpcConsumerLogs != nil {
+				chainId := rpccs.listenEndpoint.ChainID
+				apiInterface := rpccs.listenEndpoint.ApiInterface
+				go rpccs.rpcConsumerLogs.RecordIncidentRetry(chainId, apiInterface, apiName, totalRetries, relayResult != nil)
Evidence
appendHeadersToRelayResult uses relayResult != nil as the incident success indicator; however,
buildFailureResult returns a non-nil RelayResult along with an error, so relayResult != nil is not
equivalent to success.

protocol/rpcconsumer/rpcconsumer_server.go[1949-2016]
protocol/relaycore/relay_processor.go[785-810]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`appendHeadersToRelayResult` records incident metrics using `relayResult != nil` as the success flag. `RelayProcessor.buildFailureResult` returns a non-nil placeholder `RelayResult` together with an error, so failures are incorrectly recorded as successes.

### Issue Context
`SendParsedRelay` calls `appendHeadersToRelayResult` immediately after `relayProcessor.ProcessingResult()`, and only afterwards checks `err`. Therefore the success/failure information is available as `err == nil` at the call site.

### Fix Focus Areas
- protocol/rpcconsumer/rpcconsumer_server.go[476-499]
- protocol/rpcconsumer/rpcconsumer_server.go[1949-2016]
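
The misclassification is easy to reproduce in isolation. In the sketch below, `processingResult` is a hypothetical stand-in for `relayProcessor.ProcessingResult()` that, as the review describes, returns a non-nil placeholder result together with an error on failure; `incidentSuccess` shows the suggested `err == nil` check. Neither function is taken from the repository.

```go
package main

import (
	"errors"
	"fmt"
)

// RelayResult is a reduced, hypothetical version of the real result type.
type RelayResult struct{ StatusCode int }

// processingResult imitates the behavior described in the review:
// a failed relay still yields a non-nil placeholder result plus an error.
func processingResult(fail bool) (*RelayResult, error) {
	if fail {
		return &RelayResult{StatusCode: 500}, errors.New("relay failed")
	}
	return &RelayResult{StatusCode: 200}, nil
}

// incidentSuccess derives the success flag from the error, not from the result pointer.
func incidentSuccess(err error) bool {
	return err == nil
}

func main() {
	result, err := processingResult(true)

	// The flag used before the fix: true even though the relay failed.
	fmt.Println("relayResult != nil:", result != nil)

	// The flag suggested by the review: false for the failed relay.
	fmt.Println("err == nil:", incidentSuccess(err))
}
```

Passing the computed flag into `appendHeadersToRelayResult` from the call site would be one way to apply this, since the error is already available there.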




@codecov

codecov Bot commented Mar 17, 2026


Codecov Report

❌ Patch coverage is 25.25773% with 580 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| protocol/metrics/consumer_metrics_manager.go | 21.89% | 238 Missing and 1 partial ⚠️ |
| protocol/metrics/smartrouter_metrics_manager.go | 23.98% | 223 Missing and 2 partials ⚠️ |
| protocol/rpcsmartrouter/rpcsmartrouter_server.go | 35.21% | 39 Missing and 7 partials ⚠️ |
| protocol/rpcconsumer/rpcconsumer_server.go | 35.71% | 31 Missing and 5 partials ⚠️ |
| protocol/metrics/rpcconsumer_logs.go | 0.00% | 17 Missing ⚠️ |
| protocol/metrics/consumer_metrics_manager_inf.go | 0.00% | 11 Missing ⚠️ |
| ...otocol/rpcconsumer/consumer_relay_state_machine.go | 0.00% | 1 Missing and 1 partial ⚠️ |
| .../rpcsmartrouter/smartrouter_relay_state_machine.go | 0.00% | 1 Missing and 1 partial ⚠️ |
| protocol/lavasession/consumer_session_manager.go | 50.00% | 1 Missing ⚠️ |
| protocol/relaycoretest/testing_helpers.go | 50.00% | 1 Missing ⚠️ |

| Flag | Coverage Δ |
|---|---|
| consensus | 8.71% <ø> (ø) |
| protocol | 33.23% <25.25%> (-0.42%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| protocol/chainlib/grpc.go | 45.94% <ø> (+0.27%) ⬆️ |
| protocol/chainlib/jsonRPC.go | 44.84% <ø> (+0.19%) ⬆️ |
| protocol/chainlib/rest.go | 42.99% <ø> (+0.55%) ⬆️ |
| protocol/chainlib/tendermintRPC.go | 40.58% <ø> (+0.31%) ⬆️ |
| ...col/internal/chainqueries/chain_message_queries.go | 100.00% <100.00%> (ø) |
| protocol/metrics/analytics.go | 0.00% <ø> (ø) |
| protocol/relaycore/interfaces.go | 0.00% <ø> (ø) |
| protocol/relaycore/relay_processor.go | 58.99% <100.00%> (-0.62%) ⬇️ |
| protocol/lavasession/consumer_session_manager.go | 60.84% <50.00%> (-0.19%) ⬇️ |
| protocol/relaycoretest/testing_helpers.go | 98.91% <50.00%> (-1.09%) ⬇️ |
... and 8 more

... and 4 files with indirect coverage changes


Comment thread protocol/chainlib/chain_message_queries_test.go Outdated
Comment thread protocol/rpcconsumer/rpcconsumer_server.go Outdated
NadavLevi force-pushed the pr/metrics-groups branch 2 times, most recently from 586bccb to 52913ae, on March 17, 2026 12:05
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
Comment thread protocol/rpcsmartrouter/rpcsmartrouter_server.go
Comment thread protocol/chainlib/chain_message_queries.go Outdated
Comment thread protocol/metrics/consumer_metrics_manager.go
Comment thread protocol/metrics/consumer_metrics_manager.go
@github-actions

github-actions Bot commented Mar 17, 2026


Test Results

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
7 files   ±0   0 ❌ ±0 

Results for commit 1ce3aee. ± Comparison against base commit 9d5d2e2.

♻️ This comment has been updated with latest results.

Comment thread protocol/metrics/consumer_metrics_manager.go
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
Comment thread protocol/metrics/smartrouter_metrics_manager.go
Comment thread protocol/metrics/consumer_metrics_manager.go
NadavLevi requested a review from avitenzer on March 18, 2026 15:44
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
avitenzer previously approved these changes on Mar 19, 2026
NadavLevi force-pushed the pr/metrics-groups branch 2 times, most recently from 45a85d4 to 1ac60e0, on March 19, 2026 09:45
Comment thread protocol/metrics/consumer_metrics_manager.go
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
Comment thread protocol/rpcsmartrouter/rpcsmartrouter_server.go
Comment thread protocol/rpcconsumer/rpcconsumer_server.go
nimrod-teich merged commit 50e9a40 into main on Mar 19, 2026
32 checks passed
nimrod-teich deleted the pr/metrics-groups branch on March 19, 2026 13:59