[BUGFIX] Scraping: Naive fixes and optimzations for CreatedTimestamp function#14965
[BUGFIX] Scraping: Naive fixes and optimzations for CreatedTimestamp function#14965bwplotka merged 15 commits intoprometheus:mainfrom
CreatedTimestamp function#14965Conversation
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
CreatedTimestamp functionCreatedTimestamp function
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
CreatedTimestamp functionCreatedTimestamp function
Signed-off-by: Manik Rana <manikrana54@gmail.com>
|
this PR depends on #14933 |
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
…prometheus into ct-om-optimization
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: bwplotka <bwplotka@gmail.com>
|
Replicated the benchmark run with the new proposed benchmark #15083 Command I used: I did that for:
~50% improvement for all metrics, so it's a great iteration, I think we could start with that, will do readability review later on. If we could improve and NOT change tests too much (except deep copy testing), it would be amazing. |
|
We still allocate a lot Capture pprof profiles: mem and cpu You can see that even with optimizations CT parsing takes 70% of CPU for parsing: .. mostly due to use of This is similar for mem cc @bboreham let me comment in the code for some suggestions |
| return nil | ||
| } | ||
|
|
||
| var peekedLset labels.Labels |
There was a problem hiding this comment.
easy optimization - move that to start of this function or even parser, then here do peekedLset =peekedLset[:0].
This allows to reuse internal array we use for label bytes (hopefully) (:
Signed-off-by: Manik Rana <manikrana54@gmail.com>
| func findBaseMetricName(name string) string { | ||
| suffixes := []string{"_created", "_count", "_sum", "_bucket", "_total", "_gcount", "_gsum", "_info"} | ||
| for _, suffix := range suffixes { | ||
| if strings.HasSuffix(name, suffix) { | ||
| return strings.TrimSuffix(name, suffix) | ||
| } | ||
| } | ||
| return name | ||
| } |
There was a problem hiding this comment.
Not sure if we can do that without checking the appropriate types.
E.g. _total is only a suffix for counters, but it is still possible to have other metrics with this suffix and it will be part of the metric name. As far as I understood there is nothing in OM that prevents that from happening 🤔
There was a problem hiding this comment.
Yes, that's true
This function is used in one place atm (beginning of created timestamp function). otherwise we check with hashsets like before.
There's a weird case where if we compare a metric line that had no extra labels with another of created line, it would generate the same hash (because we don't include __name__ when comparing hashsets)
There was a problem hiding this comment.
I feel there is a way to remove this logic (the whole function).
We use currName to figure family name essentially, but isn't this parsed somewhere from metadata? 🤔
There was a problem hiding this comment.
I'd like to find a way too to get rid of this function.
But even if we find the currName won't it have suffixes attached to it?
There was a problem hiding this comment.
Most of the time yes, but unfortunately all metadata fields are optional in OM, including type. But bartek has a good point, we don't need to do any of this if metadata was set, and just call this function if metadata isn't present.
This brings another problem though, if we call this function it means we don't know the type and therefore we don't know which suffixes we were supposed to remove 😬
There was a problem hiding this comment.
This brings another problem though, if we call this function it means we don't know the type and therefore we don't know which suffixes we were supposed to remove 😬
we already check for that using typeRequiresCT
There was a problem hiding this comment.
but it is still possible to have other metrics with this suffix and it will be part of the metric name.
but might still leave room for this
There was a problem hiding this comment.
We will never be perfect, but it would be nice to have exact test case for where we "know" we will be wrong. Ideally "wrong" means failure, not corrupted CT (:
There was a problem hiding this comment.
okay to better illustrate why we have this function I've updated tests
if you replace line 280 with
if currName == p.visitedName && currFamilyLsetHash == p.ctHashSet && p.visitedName != "" && p.ctHashSet > 0 && p.ct > 0 {and debug it with TestOpenMetricsParse
and focus on currFamilyLsetHash vs p.ctHashSet you'll notice we'll have the same hash in this case and it would incorrectly early return CT
Signed-off-by: Manik Rana <manikrana54@gmail.com>
bwplotka
left a comment
There was a problem hiding this comment.
Ok, we synced and let's merge this first. It's not ideal as I have a feeling it introduces some regressions (around the things we discussed with thing_total gauges after thing counter). It's a small thing though AND we did optimize things 2x (still lot's to go). With benchmark and testing refactors (#15095), might be better to merge and iterate
CreatedTimestamp functionCreatedTimestamp function
|
Small nit: "feat" in the headline denotes a feature, something a user might read and think "that's new!". |
|
This looks like it should be backported to release-2.55 branch. |
…rometheus#14965) * enhance: wip ct parse optimizations Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further work on optimization Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further improvements and remove unused code Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve optimizations and fix some CT parse errors Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: check for LsetHash along with name Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: cleanup and documentation Signed-off-by: Manik Rana <manikrana54@gmail.com> * enhance: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: rename to resetCTParseValues Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: post-merge fixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: add all possible reserved suffixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * test: separate CT values for each metric Signed-off-by: Manik Rana <manikrana54@gmail.com> --------- Signed-off-by: Manik Rana <manikrana54@gmail.com> Signed-off-by: Manik Rana <Manikrana54@gmail.com>
…rometheus#14965) * enhance: wip ct parse optimizations Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further work on optimization Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further improvements and remove unused code Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve optimizations and fix some CT parse errors Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: check for LsetHash along with name Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: cleanup and documentation Signed-off-by: Manik Rana <manikrana54@gmail.com> * enhance: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: rename to resetCTParseValues Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: post-merge fixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: add all possible reserved suffixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * test: separate CT values for each metric Signed-off-by: Manik Rana <manikrana54@gmail.com> --------- Signed-off-by: Manik Rana <manikrana54@gmail.com> Signed-off-by: Manik Rana <Manikrana54@gmail.com>
…rometheus#14965) * enhance: wip ct parse optimizations Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further work on optimization Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further improvements and remove unused code Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve optimizations and fix some CT parse errors Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: check for LsetHash along with name Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: cleanup and documentation Signed-off-by: Manik Rana <manikrana54@gmail.com> * enhance: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: rename to resetCTParseValues Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: post-merge fixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: add all possible reserved suffixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * test: separate CT values for each metric Signed-off-by: Manik Rana <manikrana54@gmail.com> --------- Signed-off-by: Manik Rana <manikrana54@gmail.com> Signed-off-by: Manik Rana <Manikrana54@gmail.com>


addresses #14808 and #14823
ran this command:
Benchstat result of running this test on main (old.txt) and this PR (new.txt):
Note: results are updated with each commit