Closed
Title: Server stat gauges missing
Description:
We just started receiving alerts on our dev clusters claiming Envoy is down. As of 1f9e2fd (and probably earlier), the `envoy.server.live` metric is not being published.
Repro steps:
Start the server.
curl localhost:9901/stats/prometheus | grep live
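The same check can be scripted, for example in an alerting smoke test. The following is a minimal sketch, not part of the original report. It assumes the admin listener is on `localhost:9901` as in the repro, and that the Prometheus endpoint exposes `envoy.server.live` under the name `envoy_server_live` (dots replaced by underscores, `envoy_` prefix). The helper name `find_metric` is hypothetical.

```python
import re
import urllib.request
from typing import List


def find_metric(prometheus_text: str, metric: str) -> List[str]:
    """Return the sample lines for `metric` in Prometheus text output.

    Comment lines (# TYPE / # HELP) are skipped, and the name must be
    followed by a label set or whitespace, so a longer metric name that
    merely starts with `metric` is not counted.
    """
    pattern = re.compile(r"^" + re.escape(metric) + r"(\{[^}]*\})?\s")
    return [
        line
        for line in prometheus_text.splitlines()
        if not line.startswith("#") and pattern.match(line)
    ]


if __name__ == "__main__":
    # Assumed admin address, taken from the repro step above.
    url = "http://localhost:9901/stats/prometheus"
    with urllib.request.urlopen(url) as resp:
        body = resp.read().decode("utf-8")
    # Assumed Prometheus name for envoy.server.live.
    samples = find_metric(body, "envoy_server_live")
    print("\n".join(samples) if samples else "envoy_server_live is missing")
```

On an affected build the script reports the metric as missing, which matches the empty `grep` result.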
From slack:
Nicholas [5:14 PM]
we just started receiving some alerts on our dev clusters claiming envoy is down, it looks like as of 1f9e2fd690e3747017f9e7aa8e6368592f5c71e7 (and probably earlier) the `envoy.server.live` metric is not being published
was this an intended deprecation or a bug? (edited)
stephan [6:33 PM]
looks like a bug
stephan [6:51 PM]
67d1eb4474461f41ee4cf388a98043b84f671b1f added “bool indicators” but doesn’t emit that from the prometheus stats end point
Nicholas [6:53 PM]
Hmm I tried checking the regular stats endpoint and it seems to also be absent, I looked through the code and it isn’t obvious that it has changed with recent commits. I’ll take another look tonight unless someone else finds it first
stephan [6:57 PM]
it’s probably missing from everything.
stephan [7:04 PM]
Definitely need to pass `server_.stats().boolIndicators()` to PrometheusStatsFormatter::statsAsPrometheus in admin.cc (and then emit them). The plain stats output needs the same in `AdminImpl::handlerStats`. Seems missing from `UdpStatsdSink` and `MetricsServiceSink` in extensions/stats_sink. (I think the Hystrix one only emits specific stats and doesn’t need to be fixed).
@fredlas :point_up: I think that boolIndicators() change dropped all the converted gauges from stats output.