Stats broken out by response code and class lead to duplicated metrics in tagged stats formats #2141

@JonathanO

Description:
Stats such as upstream_rq that are broken out by both response code (e.g. 404) and response code class (e.g. 4xx) lead to duplicated data in tagged metrics formats.
For example:

cluster.FOO.upstream_rq_403: 2
cluster.FOO.upstream_rq_404: 1726
cluster.FOO.upstream_rq_4xx: 1728

This currently produces the following in Prometheus format:

cluster_upstream_rq{cluster_name="FOO",envoy_response_code="403"} 2
cluster_upstream_rq{cluster_name="FOO",envoy_response_code="404"} 1726
cluster_upstream_rq{cluster_name="FOO",envoy_response_code_class="4xx"} 1728

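As a result, aggregates over the cluster_upstream_rq metric are meaningless, because every request is counted once under its response code and again under its class (e.g. sum(cluster_upstream_rq) = 3456 rather than 1728).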
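
To make the double counting concrete, here is a minimal sketch that sums the three samples from the Prometheus output above. The tuple layout and the `total` helper are hypothetical and exist only for illustration; they are not part of Envoy or Prometheus.

```python
# Samples copied from the Prometheus output above, as (labels, value) pairs.
samples = [
    ({"cluster_name": "FOO", "envoy_response_code": "403"}, 2),
    ({"cluster_name": "FOO", "envoy_response_code": "404"}, 1726),
    ({"cluster_name": "FOO", "envoy_response_code_class": "4xx"}, 1728),
]


def total(samples, required_label=None):
    """Sum sample values, optionally keeping only samples that carry a label."""
    return sum(
        value
        for labels, value in samples
        if required_label is None or required_label in labels
    )


# Roughly what sum(cluster_upstream_rq) aggregates over: all three series.
naive_total = total(samples)

# Restricting to series with a per-code label counts each request once.
per_code_total = total(samples, "envoy_response_code")

print(naive_total, per_code_total)
```

The naive total is exactly twice the per-code total, since the 4xx series repeats the 403 and 404 counts.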

Instead, the output should be metrics that each carry both a response_code and a response_code_class label:

cluster_upstream_rq{cluster_name="FOO",envoy_response_code="403",envoy_response_code_class="4xx"} 2
cluster_upstream_rq{cluster_name="FOO",envoy_response_code="404",envoy_response_code_class="4xx"} 1726

Or, alternatively, the response_code_class labels could be dropped entirely since they can be calculated by Prometheus from the response_code labels if needed.
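
As a rough illustration of why the class label is redundant, the sketch below derives the class from the response code and rebuilds the per-class counts from the per-code counts in the example. The function name `response_code_class` and the dictionary layout are assumptions made for this sketch, not Envoy or Prometheus APIs.

```python
from collections import defaultdict


def response_code_class(code: str) -> str:
    """Map a three-digit HTTP response code such as "404" to its class, "4xx"."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a three-digit response code: {code!r}")
    return f"{code[0]}xx"


# Per-code counts taken from the example in this issue.
per_code = {"403": 2, "404": 1726}

# Rebuild the per-class counts purely from the per-code counts.
per_class = defaultdict(int)
for code, count in per_code.items():
    per_class[response_code_class(code)] += count

print(dict(per_class))
```

The derived 4xx count matches the upstream_rq_4xx value reported above, so nothing is lost by exporting only the per-code series.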
