add per-cluster histograms with remaining timeout budget #6122

@danielhochman

Since timeouts are dynamic (e.g. controllable via header values), it is difficult to observe how close requests are to hitting the timeout in production.

A histogram of remaining timeout budget would allow the user to identify latencies that are trending towards the timeout cliff as well as identify poorly tuned timeouts.

A percentage histogram would likely be the most useful stat, e.g. showing that 95% of upstream requests complete within 80% of their timeout budget.
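To make the idea concrete, here is a minimal sketch of how such a stat could be computed. It is not an existing Envoy implementation: the bucket bounds and the names `budget_used_percent` and `BudgetHistogram` are hypothetical. It records the percentage of the timeout budget each request consumed, so requests with different timeouts land on the same scale; the remaining budget is simply 100 minus that value.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds, in percent of timeout budget consumed.
BUCKETS = [10, 25, 50, 75, 80, 90, 95, 100]


def budget_used_percent(elapsed_ms, timeout_ms):
    """Percent of the per-request timeout consumed, capped at 100."""
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    return min(100.0, 100.0 * elapsed_ms / timeout_ms)


class BudgetHistogram:
    """Per-cluster histogram of timeout budget consumed (illustrative only)."""

    def __init__(self, buckets=BUCKETS):
        self.buckets = list(buckets)
        self.counts = [0] * len(self.buckets)
        self.total = 0

    def record(self, elapsed_ms, timeout_ms):
        # The timeout is passed per request because it can differ per request.
        pct = budget_used_percent(elapsed_ms, timeout_ms)
        # Index of the first bucket whose upper bound is >= pct.
        idx = bisect_left(self.buckets, pct)
        self.counts[idx] += 1
        self.total += 1

    def percentile_upper_bound(self, p):
        """Smallest bucket bound covering at least p percent of samples."""
        if self.total == 0:
            return None
        needed = self.total * p / 100.0
        seen = 0
        for bound, count in zip(self.buckets, self.counts):
            seen += count
            if seen >= needed:
                return bound
        return self.buckets[-1]
```

With a histogram like this, a p95 bound that creeps from 50 toward 100 over time would indicate latencies trending towards the timeout, while a p95 that stays in the lowest buckets would suggest the timeout is set far higher than needed.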

Also see #6121: add computed-timeout header.

Labels: enhancement, help wanted