[BUGFIX] Scraping: Naive fixes and optimzations for `CreatedTimestamp` function by Maniktherana · Pull Request #14965 · prometheus/prometheus

Maniktherana · 2024-09-23T13:15:52Z

addresses #14808 and #14823
ran this command:

go test -benchmem -run=^$ -bench ^BenchmarkParse$ -count=50 github.com/prometheus/prometheus/model/textparse

Benchstat result of running this test on main (old.txt) and this PR (new.txt):

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/textparse
cpu: Apple M1
                                                    │   old.txt   │           new.txt           │
                                                    │   sec/op    │   sec/op     vs base        │
Parse/openmetrics-skip-ct/parse-ct/omtestdata.txt-8   350.7n ± 1%   106.3n ± 2%  -69.70% (n=50)

                                                    │   old.txt    │                new.txt                │
                                                    │     B/s      │     B/s       vs base                 │
Parse/openmetrics-skip-ct/parse-ct/omtestdata.txt-8   27.20Mi ± 2%   89.76Mi ± 3%  +230.03% (p=0.000 n=50)

                                                    │   old.txt   │          new.txt           │
                                                    │    B/op     │    B/op     vs base        │
Parse/openmetrics-skip-ct/parse-ct/omtestdata.txt-8   668.00 ± 0%   60.00 ± 0%  -91.02% (n=50)

                                                    │  old.txt   │          new.txt           │
                                                    │ allocs/op  │ allocs/op   vs base        │
Parse/openmetrics-skip-ct/parse-ct/omtestdata.txt-8   2.000 ± 0%   1.000 ± 0%  -50.00% (n=50)

Note: results are updated with each commit

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Maniktherana · 2024-09-25T14:14:19Z

this PR depends on #14933

Signed-off-by: Manik Rana <manikrana54@gmail.com>

…prometheus into ct-om-optimization

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Signed-off-by: Manik Rana <Manikrana54@gmail.com>

Signed-off-by: Manik Rana <manikrana54@gmail.com>

model/textparse/openmetricsparse_test.go

Signed-off-by: bwplotka <bwplotka@gmail.com>

bwplotka · 2024-10-03T10:07:26Z

Replicated the benchmark run with the new proposed benchmark #15083

Command I used:

 export bench=testv2 && go test ./model/textparse/... \
                 -run '^$' -bench '^BenchmarkParse/data=(promtestdata.txt|omtestdata.txt)/parser=(omtext|promtext)' \
                 -benchtime 2s -count 6 -cpu 2 -benchmem -timeout 999m \
         | tee ${bench}.txt

I did that for:

testv1: state from main (commit).
testv1-noct: state from main with CreatedTimestamp commented out.
testv2: state of om text parser from this PR (commit)

benchstat -row=".name /data /parser" -filter /parser:omtext testv1.txt testv1-noct.txt testv2.txt 
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/textparse
cpu: Apple M1 Pro
                              │  testv1.txt  │          testv1-noct.txt           │             testv2.txt              │
                              │    sec/op    │   sec/op     vs base               │    sec/op     vs base               │
Parse omtestdata.txt omtext     144.11µ ± 2%   20.41µ ± 1%  -85.83% (p=0.002 n=6)    41.28µ ± 3%  -71.36% (p=0.002 n=6)
Parse promtestdata.txt omtext   5137.7µ ± 1%   152.3µ ± 1%  -97.04% (p=0.002 n=6)   3639.6µ ± 4%  -29.16% (p=0.002 n=6)
geomean                          860.5µ        55.75µ       -93.52%                  387.6µ       -54.96%

                              │  testv1.txt  │             testv1-noct.txt             │              testv2.txt               │
                              │     B/s      │      B/s        vs base                 │      B/s       vs base                │
Parse omtestdata.txt omtext     29.63Mi ± 2%    209.15Mi ± 1%   +605.97% (p=0.002 n=6)   103.45Mi ± 3%  +249.17% (p=0.002 n=6)
Parse promtestdata.txt omtext   6.194Mi ± 1%   208.964Mi ± 1%  +3273.60% (p=0.002 n=6)    8.740Mi ± 3%   +41.11% (p=0.002 n=6)
geomean                         13.55Mi          209.1Mi       +1443.26%                  30.07Mi       +121.97%

                              │   testv1.txt   │           testv1-noct.txt           │              testv2.txt              │
                              │      B/op      │     B/op      vs base               │     B/op       vs base               │
Parse omtestdata.txt omtext      274.31Ki ± 0%   14.59Ki ± 0%  -94.68% (p=0.002 n=6)    27.08Ki ± 0%  -90.13% (p=0.002 n=6)
Parse promtestdata.txt omtext   17634.6Ki ± 0%   113.0Ki ± 0%  -99.36% (p=0.002 n=6)   2158.5Ki ± 0%  -87.76% (p=0.002 n=6)
geomean                           2.148Mi        40.61Ki       -98.15%                  241.8Ki       -89.01%

                              │  testv1.txt  │          testv1-noct.txt           │             testv2.txt              │
                              │  allocs/op   │  allocs/op   vs base               │  allocs/op    vs base               │
Parse omtestdata.txt omtext      1278.0 ± 0%    216.0 ± 0%  -83.10% (p=0.002 n=6)     446.0 ± 0%  -65.10% (p=0.002 n=6)
Parse promtestdata.txt omtext   32.577k ± 0%   1.649k ± 0%  -94.94% (p=0.002 n=6)   30.826k ± 0%   -5.37% (p=0.002 n=6)
geomean                          6.452k         596.8       -90.75%                  3.708k       -42.53%

~50% improvement for all metrics, so it's a great iteration, I think we could start with that, will do readability review later on. If we could improve and NOT change tests too much (except deep copy testing), it would be amazing.

bwplotka · 2024-10-03T10:16:55Z

We still allocate a lot 113.0Ki ± 0% (no-ct) vs 2158.5Ki ± 0%

Capture pprof profiles: mem and cpu

You can see that even with optimizations CT parsing takes 70% of CPU for parsing:

.. mostly due to use of Metric (odd, considering we use Metric in benchmark for each series already - can we cache this?) and Next.

This is similar for mem

cc @bboreham

let me comment in the code for some suggestions

bwplotka · 2024-10-03T10:19:48Z

model/textparse/openmetricsparse.go

 			return nil
 		}

 		var peekedLset labels.Labels


easy optimization - move that to start of this function or even parser, then here do peekedLset =peekedLset[:0].

This allows to reuse internal array we use for label bytes (hopefully) (:

Signed-off-by: Manik Rana <manikrana54@gmail.com>

ArthurSens · 2024-10-03T12:37:07Z

model/textparse/openmetricsparse.go

+func findBaseMetricName(name string) string {
+	suffixes := []string{"_created", "_count", "_sum", "_bucket", "_total", "_gcount", "_gsum", "_info"}
+	for _, suffix := range suffixes {
+		if strings.HasSuffix(name, suffix) {
+			return strings.TrimSuffix(name, suffix)
+		}
+	}
+	return name
+}


Not sure if we can do that without checking the appropriate types.

E.g. _total is only a suffix for counters, but it is still possible to have other metrics with this suffix and it will be part of the metric name. As far as I understood there is nothing in OM that prevents that from happening 🤔

Yes, that's true
This function is used in one place atm (beginning of created timestamp function). otherwise we check with hashsets like before.

There's a weird case where if we compare a metric line that had no extra labels with another of created line, it would generate the same hash (because we don't include __name__ when comparing hashsets)

I feel there is a way to remove this logic (the whole function).

We use currName to figure family name essentially, but isn't this parsed somewhere from metadata? 🤔

I'd like to find a way too to get rid of this function.

But even if we find the currName won't it have suffixes attached to it?

Most of the time yes, but unfortunately all metadata fields are optional in OM, including type. But bartek has a good point, we don't need to do any of this if metadata was set, and just call this function if metadata isn't present.

This brings another problem though, if we call this function it means we don't know the type and therefore we don't know which suffixes we were supposed to remove 😬

This brings another problem though, if we call this function it means we don't know the type and therefore we don't know which suffixes we were supposed to remove 😬

we already check for that using typeRequiresCT

but it is still possible to have other metrics with this suffix and it will be part of the metric name.

but might still leave room for this

We will never be perfect, but it would be nice to have exact test case for where we "know" we will be wrong. Ideally "wrong" means failure, not corrupted CT (:

okay to better illustrate why we have this function I've updated tests
if you replace line 280 with

if currName == p.visitedName && currFamilyLsetHash == p.ctHashSet && p.visitedName != "" && p.ctHashSet > 0 && p.ct > 0 {

and debug it with TestOpenMetricsParse

and focus on currFamilyLsetHash vs p.ctHashSet you'll notice we'll have the same hash in this case and it would incorrectly early return CT

model/textparse/openmetricsparse_test.go

Signed-off-by: Manik Rana <manikrana54@gmail.com>

bwplotka

Ok, we synced and let's merge this first. It's not ideal as I have a feeling it introduces some regressions (around the things we discussed with thing_total gauges after thing counter). It's a small thing though AND we did optimize things 2x (still lot's to go). With benchmark and testing refactors (#15095), might be better to merge and iterate

bboreham · 2024-10-08T15:57:54Z

Small nit: "feat" in the headline denotes a feature, something a user might read and think "that's new!".
(and we have historically written "[FEATURE]" for those things).
Fixing a bug is [BUGFIX] and optimisations are [PERF].

bboreham · 2024-10-08T17:15:19Z

This looks like it should be backported to release-2.55 branch.

…rometheus#14965) * enhance: wip ct parse optimizations Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further work on optimization Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: further improvements and remove unused code Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve optimizations and fix some CT parse errors Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: check for LsetHash along with name Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: cleanup and documentation Signed-off-by: Manik Rana <manikrana54@gmail.com> * enhance: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * feat: improve comments and add cleaner functions Signed-off-by: Manik Rana <manikrana54@gmail.com> * chore: rename to resetCTParseValues Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: post-merge fixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * fix: add all possible reserved suffixes Signed-off-by: Manik Rana <manikrana54@gmail.com> * test: separate CT values for each metric Signed-off-by: Manik Rana <manikrana54@gmail.com> --------- Signed-off-by: Manik Rana <manikrana54@gmail.com> Signed-off-by: Manik Rana <Manikrana54@gmail.com>

…rometheus#14965)

Maniktherana added 3 commits September 23, 2024 17:53

enhance: wip ct parse optimizations

34dcf04

Signed-off-by: Manik Rana <manikrana54@gmail.com>

feat: further work on optimization

3c66759

Signed-off-by: Manik Rana <manikrana54@gmail.com>

feat: further improvements and remove unused code

ae8cbd9

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Maniktherana changed the title ~~feat: naive optimzations for CreatedTimestamp function~~ feat: (WIP) naive optimzations for CreatedTimestamp function Sep 23, 2024

Maniktherana added 2 commits September 23, 2024 21:27

feat: improve optimizations and fix some CT parse errors

91029cd

Signed-off-by: Manik Rana <manikrana54@gmail.com>

fix: check for LsetHash along with name

06dc527

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Maniktherana changed the title ~~feat: (WIP) naive optimzations for CreatedTimestamp function~~ feat: naive fixes and optimzations for CreatedTimestamp function Sep 24, 2024

Tusenka approved these changes Sep 24, 2024

View reviewed changes

chore: cleanup and documentation

dcf5b76

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Maniktherana and others added 7 commits September 25, 2024 23:47

enhance: improve comments and add cleaner functions

21427b7

Signed-off-by: Manik Rana <manikrana54@gmail.com>

feat: improve comments and add cleaner functions

ae62895

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Merge branch 'ct-om-optimization' of https://github.com/Maniktherana/…

dccb725

…prometheus into ct-om-optimization

chore: rename to resetCTParseValues

78f8415

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Merge branch 'main' into ct-om-optimization

0b0d44c

Signed-off-by: Manik Rana <Manikrana54@gmail.com>

Merge branch 'main' into ct-om-optimization

1669c29

Signed-off-by: Manik Rana <Manikrana54@gmail.com>

fix: post-merge fixes

32d218a

Signed-off-by: Manik Rana <manikrana54@gmail.com>

bwplotka reviewed Oct 3, 2024

View reviewed changes

model/textparse/openmetricsparse_test.go Show resolved Hide resolved

bwplotka mentioned this pull request Oct 3, 2024

textparse: Refactored benchmark #15083

Merged

bwplotka added a commit that referenced this pull request Oct 3, 2024

State from #14965

66a6323

Signed-off-by: bwplotka <bwplotka@gmail.com>

bwplotka reviewed Oct 3, 2024

View reviewed changes

fix: add all possible reserved suffixes

4a40f2e

Signed-off-by: Manik Rana <manikrana54@gmail.com>

Maniktherana marked this pull request as ready for review October 3, 2024 10:39

ArthurSens reviewed Oct 3, 2024

View reviewed changes

test: separate CT values for each metric

97521bf

Signed-off-by: Manik Rana <manikrana54@gmail.com>

bwplotka mentioned this pull request Oct 4, 2024

textparse: Refactored main testing utils for reusability; fixed proto Units. #15095

Merged

bwplotka approved these changes Oct 4, 2024

View reviewed changes

bwplotka merged commit 47aeca9 into prometheus:main Oct 4, 2024

Maniktherana mentioned this pull request Oct 4, 2024

[PERF] textparse: further optimzations for OM CreatedTimestamps #15097

Merged

bboreham changed the title ~~feat: naive fixes and optimzations for CreatedTimestamp function~~ [BUGFIX] Scraping: Naive fixes and optimzations for CreatedTimestamp function Oct 8, 2024

beorn7 mentioned this pull request Oct 8, 2024

Prepare 3.0.0-beta.1 #14976

Merged

Maniktherana mentioned this pull request Oct 12, 2024

Talk proposal: Breaking and optimizing the Prometheus parser pydelhi/talks#300

Closed

squat mentioned this pull request Nov 8, 2024

perf: Backport Created Timestamp Parsing Optimizations #15364

Closed

julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024

feat: naive fixes and optimzations for CreatedTimestamp function (p…

4bf8620

…rometheus#14965)

Conversation

Maniktherana commented Sep 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Maniktherana commented Sep 25, 2024

Uh oh!

Uh oh!

bwplotka commented Oct 3, 2024

Uh oh!

bwplotka commented Oct 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bwplotka Oct 3, 2024

Choose a reason for hiding this comment

Uh oh!

ArthurSens Oct 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Maniktherana Oct 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bwplotka Oct 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Maniktherana Oct 3, 2024

Choose a reason for hiding this comment

Uh oh!

ArthurSens Oct 3, 2024

Choose a reason for hiding this comment

Uh oh!

Maniktherana Oct 3, 2024

Choose a reason for hiding this comment

Uh oh!

Maniktherana Oct 3, 2024

Choose a reason for hiding this comment

Uh oh!

bwplotka Oct 4, 2024

Choose a reason for hiding this comment

Uh oh!

Maniktherana Oct 4, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

bwplotka left a comment

Choose a reason for hiding this comment

Uh oh!

bboreham commented Oct 8, 2024

Uh oh!

bboreham commented Oct 8, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Maniktherana commented Sep 23, 2024 •

edited

Loading

bwplotka commented Oct 3, 2024 •

edited

Loading

ArthurSens Oct 3, 2024 •

edited

Loading

Maniktherana Oct 3, 2024 •

edited

Loading

bwplotka Oct 3, 2024 •

edited

Loading

Maniktherana Oct 4, 2024 •

edited

Loading