[TEST] Scraping: Add microbenchmarks for OM CT parsing#14933

Merged
bwplotka merged 9 commits into prometheus:main from Maniktherana:benchmark-ct-om
Oct 2, 2024

Conversation


@Maniktherana Maniktherana commented Sep 18, 2024

Partially addresses #14823.

Adds the OpenMetrics parser with CT parsing enabled to BenchmarkParse.

cc: @bwplotka @ArthurSens

Signed-off-by: Manik Rana <manikrana54@gmail.com>
@Maniktherana Maniktherana changed the title from "Microbenchmark OM CT parsing" to "feat: add microbenchmarks for OM CT parsing" Sep 18, 2024
Signed-off-by: Manik Rana <manikrana54@gmail.com>
@Maniktherana Maniktherana marked this pull request as ready for review September 20, 2024 06:26
Maniktherana and others added 3 commits September 20, 2024 19:18
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
@Maniktherana
Contributor Author

Still unsure about the approach here.
I believe it makes sense to keep the benchmark for the CreatedTimestamp() function separate from BenchmarkParse, since the former only benchmarks a specific portion of the parser.

But I can still add the OM parser with skipCT enabled in the BenchmarkParse function.

@ArthurSens
Member

ArthurSens commented Sep 25, 2024

I've spent some time looking at the benchmarks we have today, and it seems like we benchmark both parsers with the Prometheus text exposition format. See how the test data doesn't have _created lines, doesn't have # UNIT lines, etc. It's fine to benchmark both the Prometheus and OpenMetrics parsers, since the Prometheus text format is a subset of OpenMetrics.

Fundamentally what we want to benchmark with this PR are things that differentiate OM from Prometheus format, so I think it makes sense to write a separate Benchmark function.

What do you think of renaming BenchmarkParse to BenchmarkPrometheusTxtParse and then writing a BenchmarkOpenMetricsTxtParse? The new benchmark will not use the current test data, and the old benchmarks will not use the new test data; this way we avoid a lot of the conditionals you had to add.

@Maniktherana
Contributor Author

What do you think of renaming BenchmarkParse to BenchmarkPrometheusTxtParse and then writing a BenchmarkOpenMetricsTxtParse? The new benchmark will not use the current test data, and the old benchmarks will not use the new test data; this way we avoid a lot of the conditionals you had to add.

Sounds good to me

Will BenchmarkPrometheusTxtParse have both OM parsers (CT enabled and disabled), or is that just for BenchmarkOpenMetricsTxtParse?

@ArthurSens
Member

Sounds good to me

Will BenchmarkPrometheusTxtParse have both OM parsers (CT enabled and disabled), or is that just for BenchmarkOpenMetricsTxtParse?

I would keep only the default behavior (not skipping CT) for PrometheusTxtParse.

Signed-off-by: Manik Rana <manikrana54@gmail.com>
@Maniktherana
Contributor Author

Sounds good to me
Will BenchmarkPrometheusTxtParse have both OM parsers (CT enabled and disabled), or is that just for BenchmarkOpenMetricsTxtParse?

I would keep only the default behavior (not skipping CT) for PrometheusTxtParse.

Pushed the changes.

Signed-off-by: Manik Rana <manikrana54@gmail.com>
ArthurSens
ArthurSens previously approved these changes Sep 26, 2024
Member

@ArthurSens ArthurSens left a comment


Small nit, but LGTM!

@bwplotka could you take a look as well?

Signed-off-by: Manik Rana <manikrana54@gmail.com>
Member

@bwplotka bwplotka left a comment


Thanks, nice!

I wonder if it would make sense to actually clean up BenchmarkParse. Perhaps put it in a separate benchmark_test.go file, and have three clear cases that always call CreatedTimestamp (with https://www.bwplotka.dev/2024/go-microbenchmarks-benchstat/ case naming):

for _, in := range []string{"omtestdata.txt", "promtestdata.txt", "promtestdata.nometa.txt"} {
	b.Run(fmt.Sprintf("input=%v", in), func(b *testing.B) { ... })
}

Then choose the parser based on the input format.

Then you can have cases like case=Series, case=CreatedTimestamp, and case=Series+Metric based on the methods you are calling.

Does it make sense to have a benchmark for both skip = true and false? Is it really that different between the two? I think the main problem is the CreatedTimestamp implementation right now, no matter whether we skip or not, and I would capture the benchmark for the default path (skip = true). We can do an ad hoc benchmark run with skip = false for testing, but perhaps there's no need to capture this?

@Maniktherana
Contributor Author

Does it make sense to have a benchmark for both skip = true and false? Is it really that different between the two? I think the main problem is the CreatedTimestamp implementation right now, no matter whether we skip or not, and I would capture the benchmark for the default path (skip = true). We can do an ad hoc benchmark run with skip = false for testing, but perhaps there's no need to capture this?

Yes, this makes sense. I'll make some changes.

Although I still feel we can keep a benchmarkParse in omparse and promparse based on which parser we use (but that's just me and I could be wrong).

@bwplotka
Member

bwplotka commented Oct 1, 2024

Although I still feel we can keep a benchmarkParse in omparse and promparse based on which parser we use (but that's just me and I could be wrong).

Is this duplication of benchmark code (that ages quickly) worth it? Why?

bwplotka
bwplotka previously approved these changes Oct 1, 2024
Member

@bwplotka bwplotka left a comment


Actually, let's do it like that; just name the omtext benchmark a bit more explicitly (: We can improve this one later.

Thanks! LGTM

"openmetrics": func(b []byte, st *labels.SymbolTable) Parser {
return NewOpenMetricsParser(b, st)
},
"openmetrics-skip-ct": func(b []byte, st *labels.SymbolTable) Parser {
Member

@bwplotka bwplotka Oct 1, 2024


Perhaps it would make sense here to always use openmetrics (no separate case like that) but have a b.Run(fmt.Sprintf("skip-ct=%v", skipCT), ...) case, WDYT?

Contributor Author


Yes, I agree with that bit; there's virtually no difference here between the two parsers.

Member


Happy to do in a separate PR.

Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
"openmetrics": func(b []byte, st *labels.SymbolTable) Parser {
return NewOpenMetricsParser(b, st)
},
"openmetrics-skip-ct": func(b []byte, st *labels.SymbolTable) Parser {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do in separare PR

@bwplotka bwplotka merged commit 98cd80b into prometheus:main Oct 2, 2024
@Maniktherana Maniktherana deleted the benchmark-ct-om branch October 2, 2024 11:17
@bboreham bboreham changed the title from "feat: add microbenchmarks for OM CT parsing" to "[TEST] Scraping: Add microbenchmarks for OM CT parsing" Oct 8, 2024
julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024
