A51 update: add ServerMetricRecorder API and QPS field#345
A51 update: add ServerMetricRecorder API and QPS field#345ejona86 merged 8 commits intogrpc:masterfrom
Conversation
|
Updated after today's sync:
|
A51-custom-backend-metrics.md
Outdated
| function recordUtilizationMetric(String name, double value); | ||
|
|
||
| // Records an opaque named metric measurement for the call. | ||
| function recordNamedMetricsMetric(String name, double value); |
There was a problem hiding this comment.
The named_metrics and eps changes do not belong in this PR. We'll make those changes in a separate PR.
There was a problem hiding this comment.
Removed. I think I accidentally rebased on top of the follow-up change (eps/named metrics).
| // Records number of queries per second to the server in the range [0, infy). | ||
| // Values outside of the valid range are rejected. |
There was a problem hiding this comment.
Looking at the WIP grpc java implementation, they are ignored, not rejected. Should we clarify here if by "rejected" we mean "ignored" - because rejected may imply the method could throw.
On the other hand, I'm a bit afraid if we just ignore the value. I agree that throwing is a bad idea, but just swallowing an error leads to reporting incorrect data. I don't think we can judge if skewing a particular metric is critical to a customer.
One example that come to mind: a customer is storing qps internally in an int, the service gets a huge number of requests, and the int gets overflowed. In this case, they'll call setQps() with a negative value, when in fact there was a huge spike. If another service goes down because of a qps spike, but our service doesn't report the qps increase, an engineer responding to the incident won't be able to identify the source of the request spike correctly.
I wonder if it'd make sense require logging a warning when an incorrect value is presented to the MetricRecorder/CallMetricRecorder. However, depending on how often the data is reported, it might lead to the logs getting flooded. Obviously, we can restrict how often to produce these warnings, but this may be going to far out-of-scope.
There was a problem hiding this comment.
This section is for C++ and it's from comments in code so I don't think "rejected" may confuse. Java snippet does not have this word.
Regarding the range validation, I believe the caller can validate or log in such cases. Note that recorder APIs already ignore with no feedback today (before this change) and this PR just clarifies that. Today the value will be rejected during proto's range validation and/or when consumed by the client where neither is explicit.
No description provided.