feat: add duration and traceId to degraded events#8455
Conversation
e3eb1a1 to
c4cf4d5
Compare
|
@metamaskbot publish-preview |
|
Preview builds have been published. Learn how to use preview builds in other projects. Expand for full list of packages and versions. |
b9ab587 to
e8262db
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 4fd9aedc6833f2458d4ee8749632ebaa137599a8. Configure here.
|
@metamaskbot publish-preview |
1 similar comment
|
@metamaskbot publish-preview |
|
Preview builds have been published. Learn how to use preview builds in other projects. Expand for full list of packages and versions. |
e61cfa5 to
10f285d
Compare
When a request succeeds but takes longer than degradedThreshold,
the onDegraded event now emits { duration } instead of void, so
higher layers can include the actual request duration in their
event payloads.
…stable onDegraded type
…ugh RpcServiceChain
|
@metamaskbot publish-preview |
| const commonFields = { | ||
| endpointUrl: this.endpointUrl.toString(), | ||
| rpcMethodName: this.#currentRpcMethodName, | ||
| traceId: this.#currentTraceId, | ||
| }; | ||
| if (hasProperty(data, 'duration') && typeof data.duration === 'number') { | ||
| listener({ | ||
| endpointUrl: this.endpointUrl.toString(), | ||
| rpcMethodName: this.#currentRpcMethodName, | ||
| ...commonFields, | ||
| duration: data.duration, | ||
| }); | ||
| } else { | ||
| listener({ | ||
| ...data, | ||
| endpointUrl: this.endpointUrl.toString(), | ||
| rpcMethodName: this.#currentRpcMethodName, | ||
| ...commonFields, | ||
| duration: undefined, | ||
| }); | ||
| } |
There was a problem hiding this comment.
There still seem to be some type errors here. I think this can just be:
| const commonFields = { | |
| endpointUrl: this.endpointUrl.toString(), | |
| rpcMethodName: this.#currentRpcMethodName, | |
| traceId: this.#currentTraceId, | |
| }; | |
| if (hasProperty(data, 'duration') && typeof data.duration === 'number') { | |
| listener({ | |
| endpointUrl: this.endpointUrl.toString(), | |
| rpcMethodName: this.#currentRpcMethodName, | |
| ...commonFields, | |
| duration: data.duration, | |
| }); | |
| } else { | |
| listener({ | |
| ...data, | |
| endpointUrl: this.endpointUrl.toString(), | |
| rpcMethodName: this.#currentRpcMethodName, | |
| ...commonFields, | |
| duration: undefined, | |
| }); | |
| } | |
| listener({ | |
| ...data, | |
| endpointUrl: this.endpointUrl.toString(), | |
| rpcMethodName: this.#currentRpcMethodName, | |
| traceId: this.#currentTraceId, | |
| }); |
|
|
||
| ### Changed | ||
|
|
||
| - **BREAKING:** The `RpcServiceRequestable` type's `onDegraded` listener now receives `duration?: number` and `traceId?: string` in its data parameter ([#8455](https://github.com/MetaMask/core/pull/8455)) |
There was a problem hiding this comment.
I've been meaning to get rid of this type as we don't really need it anymore. If we change RpcService not to implement this interface anymore then we don't need to change this type at all, and we can avoid the breaking change here. May be worth doing that before merging this.
There was a problem hiding this comment.
I've removed the changes from RpcServiceRequestable as well as the changelog here: 0e695d0
| payload: [ | ||
| { | ||
| chainId: Hex; | ||
| duration?: number; |
There was a problem hiding this comment.
I have to check to make sure that this also isn't a breaking change. I thought at first it wasn't, but then given that event payloads show up as arguments to callbacks, I am having second doubts.
There was a problem hiding this comment.
Traditionally we have not treated additional payload properties as breaking, but I think we should, so that's what I've done here.

Explanation
RPC endpoint degraded events (
NetworkController:rpcEndpointDegradedandNetworkController:rpcEndpointChainDegraded) now include two new optional properties in their payloads:duration(number | undefined): The policy execution time in milliseconds when the request succeeded but was slow (i.e., exceeded thedegradedThreshold). This isundefinedwhen the degraded event was caused by retries being exhausted.traceId(string | undefined): The value of thex-trace-idresponse header from the last request attempt. This enables correlating degraded events with backend traces for debugging RPC health issues. It isundefinedwhen no response was received or the header was not present.How it works
The
durationvalue originates from Cockatiel'sretryPolicy.onSuccesscallback increateServicePolicy. Previously, it was only used for the threshold comparison — now it is also emitted as part of the degraded event data.The
traceIdis captured inRpcServicefromresponse.headers.get('x-trace-id')in the samefinallyblock that tracksrpcMethodName, ensuring it reflects the last completed request attempt (handling concurrent request race conditions correctly).Both values are threaded through the event chain:
createServicePolicy→RpcService→RpcServiceChain→createNetworkClient→ messenger events.Breaking change
The
RpcServiceRequestabletype'sonDegradedlistener signature now includesduration?: numberandtraceId?: stringin its data parameter. Implementors of this interface will need to accept the new fields in theironDegradedcallback signature.References
Checklist
Note
Medium Risk
This is a breaking, cross-package event/type change that affects all
onDegraded/messenger subscribers and could cause downstream TypeScript or runtime assumptions to fail if payload handling isn’t updated. Logic changes are small but touch retry/degraded signaling and request metadata capture.Overview
Degraded-event payloads now include more diagnostics.
createServicePolicy’sonDegradedevent no longer emitsvoidfor slow successes; it emits{ duration: number }, and tests were updated accordingly.Network degraded messenger events are extended.
NetworkController:rpcEndpointDegradedandNetworkController:rpcEndpointChainDegradedpayloads add optionalduration(slow success) andtraceId(last responseX-Trace-Idheader), with plumbing added fromRpcServicethroughRpcServiceChain/createNetworkClientand expanded test coverage (including avoiding staletraceIdwhen a retry throws before a response).Reviewed by Cursor Bugbot for commit 9867d45. Bugbot is set up for automated code reviews on this repo. Configure here.