Prom stats perf improvements#24998
Conversation
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
Signed-off-by: rulex123 <erica.manno@gmail.com>
|
@ggreenway can you PTAL? |
ggreenway
left a comment
There was a problem hiding this comment.
This is looking really good. Thanks for making the test changes more minimal; that made reviewing much easier.
/wait
source/server/admin/stats_render.cc
Outdated
| const std::string& prefixed_tag_extracted_name) { | ||
| auto parent_histogram = dynamic_cast<Stats::ParentHistogram*>(metric.get()); | ||
| if (parent_histogram != nullptr) { | ||
| // TODO(rulex123): support for the 3 histogram "bucket modes" offered in other |
There was a problem hiding this comment.
I don't think this should be a TODO. There is only one valid way to represent histograms in the prometheus exposition format, and this is it. I think you can just delete the TODO comment.
There was a problem hiding this comment.
This makes sense.
FYI in #26472 I have added a new bucket mode tentatively called 'detailed' which will have the raw bucket data. That may also be useful for Prom as you can see a lot more interesting data with this IMO, but it might complicate aggregation in Prom across multiple Envoys.
Don't worry Greg -- that PR is still a ways away from being atomized into probably around 4 for review, but I think the first PR will be the addition of a new bucket-mode to show the original buckets computed by circlhist.
Nevertheless I agree removing this TODO is fine for now.
Signed-off-by: rulex123 <erica.manno@gmail.com>
|
/retest |
|
Retrying Azure Pipelines: |
Commit Message: Reproduces a scenario where it's difficult to use a streaming optimization for Prometheus stats based on scopes without introducing a bug. Context: Issue that /stats/prometheus endpoint takes too much much memory: Prometheus stats handler used too much memory. #16139 Solution for non-Prometheus endpoints to use less memory for /stats and run faster: admin: Streaming /stats implementation #19693 which I believe is working well. This solution mapped to Prometheus: Prom stats perf improvements #24998 A case where this solution has a duplicate# TYPE line due to two stats with the same tag-extracted-stat name from two different scopes: stats: prometheus scrape failed #27173 Note that the existing unit tests did not exercise that scenario so this PR adds a testcase. Signed-off-by: Joshua Marantz <jmarantz@google.com>
(i.e. doesn't cater for stats with identical tag extracted name but from different scopes). Signed-off-by: rulex123 <erica.manno@gmail.com>
Add support for streaming/chunking of prometheus stats - basically adapts the algorithm introduced in envoyproxy#19693 for prometheus stats (while keeping common parts of the algorithm in a shared class to avoid code duplication). With this patch we should be able to improve the issue described here envoyproxy#16139 Signed-off-by: rulex123 <erica.manno@gmail.com> Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
Commit Message: Reverts envoyproxy#24998 See envoyproxy#27173 Additional Description: Risk Level: n/a Testing: n/a Docs Changes: n/a Release Notes: n/a Platform Specific Features: n/a Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
…xy#27239) Commit Message: Reproduces a scenario where it's difficult to use a streaming optimization for Prometheus stats based on scopes without introducing a bug. Context: Issue that /stats/prometheus endpoint takes too much much memory: Prometheus stats handler used too much memory. envoyproxy#16139 Solution for non-Prometheus endpoints to use less memory for /stats and run faster: admin: Streaming /stats implementation envoyproxy#19693 which I believe is working well. This solution mapped to Prometheus: Prom stats perf improvements envoyproxy#24998 A case where this solution has a duplicate# TYPE line due to two stats with the same tag-extracted-stat name from two different scopes: stats: prometheus scrape failed envoyproxy#27173 Note that the existing unit tests did not exercise that scenario so this PR adds a testcase. Signed-off-by: Joshua Marantz <jmarantz@google.com> Signed-off-by: Ryan Eskin <ryan.eskin89@protonmail.com>
Commit Message: patch to add support for streaming/chunking of prometheus stats - basically adapts the algorithm introduced in #19693 for prometheus stats (while keeping common parts of the algorithm in a shared class to avoid code duplication).
With this patch we should be able to improve the issue described here #16139
Risk Level: only underlying implementation changes, but otherwise no change of behaviour/features. So low risk I would say.
Testing: current tests should remain unchanged (though they might be moved around due to refactoring of code); new tests are also added.
Docs Changes: n/a
Release Notes: n/a
[Optional Runtime guard:]
[Optional Fixes #Issue]: #16139
Draft PR (already went through some review) -> #24716
Benchmark data from running
stats_handler_speed_test.ccon a dev box:BEFORE
AFTER