Scraping: optimise sync of targets #12048

Merged
bboreham merged 6 commits into prometheus:main from bboreham:faster-targets on Mar 9, 2023

bboreham commented Mar 2, 2023

Scraping targets are synced by creating the full set, then adding/removing any which have changed.
This PR speeds up the process of creating the full set.

I added a benchmark for TargetsFromGroup; it uses configuration from a typical Kubernetes SD.
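
A benchmark of this shape can be sketched with Go's `testing` package. This is only an illustration under stated assumptions, not the benchmark added in this PR: `buildTargets` is a hypothetical stand-in for TargetsFromGroup, and the sizes simply mirror the 1, 10 and 100 target cases reported in the results.

```go
package main

import (
	"fmt"
	"testing"
)

// buildTargets is a hypothetical stand-in for the function under test.
// It builds one small label map per target, as a scrape sync might.
func buildTargets(n int) []map[string]string {
	targets := make([]map[string]string, 0, n)
	for i := 0; i < n; i++ {
		targets = append(targets, map[string]string{
			"__address__": fmt.Sprintf("10.0.0.%d:9090", i),
			"job":         "example",
		})
	}
	return targets
}

// benchTargets returns a benchmark body for a fixed number of targets.
func benchTargets(n int) func(b *testing.B) {
	return func(b *testing.B) {
		b.ReportAllocs() // report B/op and allocs/op alongside sec/op
		for i := 0; i < b.N; i++ {
			if got := buildTargets(n); len(got) != n {
				b.Fatalf("got %d targets, want %d", len(got), n)
			}
		}
	}
}

func main() {
	// testing.Benchmark runs a benchmark body outside of "go test".
	for _, n := range []int{1, 10, 100} {
		res := testing.Benchmark(benchTargets(n))
		fmt.Printf("%d_targets\t%s\t%s\n", n, res.String(), res.MemString())
	}
}
```

In a real `_test.go` file the same body would normally be registered with `b.Run` inside a `Benchmark...` function, and runs from before and after a change compared with a tool such as benchstat.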

The crux of the change is to do relabeling inside labels.Builder instead of converting to labels.Labels and back again for every rule. The change is broken into several commits for easier review.
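
To illustrate why staying inside a builder helps, here is a minimal sketch using toy types. `Labels`, `Builder` and `Rule` below are simplified assumptions for illustration, not the real Prometheus types; the point is only that the per-rule variant materialises a sorted label set after every rule, while the builder variant does so once.

```go
package main

import (
	"fmt"
	"sort"
)

// Label and Labels are simplified stand-ins for a sorted label set.
type Label struct{ Name, Value string }
type Labels []Label

// Builder is a toy mutable label set; it is NOT the real labels.Builder.
type Builder struct{ m map[string]string }

// NewBuilder copies base into a mutable map.
func NewBuilder(base Labels) *Builder {
	b := &Builder{m: make(map[string]string, len(base))}
	for _, l := range base {
		b.m[l.Name] = l.Value
	}
	return b
}

// Get returns the value for name, or "" if the label is absent.
func (b *Builder) Get(name string) string { return b.m[name] }

// Set stores a value; an empty value removes the label.
func (b *Builder) Set(name, value string) {
	if value == "" {
		delete(b.m, name)
		return
	}
	b.m[name] = value
}

// Labels materialises a sorted copy. This is the comparatively costly step.
func (b *Builder) Labels() Labels {
	out := make(Labels, 0, len(b.m))
	for n, v := range b.m {
		out = append(out, Label{n, v})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// Rule is a hypothetical relabel step that edits the builder in place.
type Rule func(b *Builder)

// applyPerRule converts to Labels and back to a Builder for every rule.
// It also returns how many conversions to Labels were performed.
func applyPerRule(in Labels, rules []Rule) (Labels, int) {
	conversions := 0
	for _, r := range rules {
		b := NewBuilder(in)
		r(b)
		in = b.Labels()
		conversions++
	}
	return in, conversions
}

// applyInBuilder keeps one Builder for all rules and converts once at the end.
func applyInBuilder(in Labels, rules []Rule) (Labels, int) {
	b := NewBuilder(in)
	for _, r := range rules {
		r(b)
	}
	return b.Labels(), 1
}

// exampleRules returns three illustrative rules: copy, set, delete.
func exampleRules() []Rule {
	return []Rule{
		func(b *Builder) { b.Set("instance", b.Get("__address__")) },
		func(b *Builder) { b.Set("job", "kubernetes-pods") },
		func(b *Builder) { b.Set("__address__", "") },
	}
}

func main() {
	in := Labels{{"__address__", "10.0.0.1:9090"}}
	perRule, n1 := applyPerRule(in, exampleRules())
	inBuilder, n2 := applyInBuilder(in, exampleRules())
	fmt.Println(perRule, "conversions:", n1)
	fmt.Println(inBuilder, "conversions:", n2)
}
```

Both functions produce the same label set; the difference is the number of intermediate sorted copies that are built and then discarded.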

This is a breaking change to scrape.PopulateLabels(), but relabel.Process is left as-is, with a new relabel.ProcessBuilder option.
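
One way to keep an existing function unchanged while offering a builder-based variant is to make the old function a thin wrapper. The sketch below shows that pattern with toy types; `builder`, `rule`, `process` and `processBuilder` are hypothetical names chosen for illustration and do not reproduce the real relabel package signatures.

```go
package main

import "fmt"

// builder is a toy mutable label set, standing in for a real label builder.
type builder map[string]string

// rule is a hypothetical, much-simplified relabel rule.
type rule struct {
	set, value string // label to set and the value to give it
	require    string // if non-empty, drop the target when this label is absent
}

// processBuilder applies rules to b in place and reports whether to keep the target.
func processBuilder(b builder, rules ...rule) (keep bool) {
	for _, r := range rules {
		if r.require != "" && b[r.require] == "" {
			return false
		}
		if r.set != "" {
			b[r.set] = r.value
		}
	}
	return true
}

// process keeps the older value-in, value-out shape by wrapping processBuilder.
// The input map is copied first, so existing callers see no mutation.
func process(in map[string]string, rules ...rule) (map[string]string, bool) {
	b := make(builder, len(in))
	for k, v := range in {
		b[k] = v
	}
	if !processBuilder(b, rules...) {
		return nil, false
	}
	return b, true
}

func main() {
	rules := []rule{{require: "__address__"}, {set: "job", value: "example"}}

	// Existing callers: pass a value, get a new value back.
	in := map[string]string{"__address__": "10.0.0.1:9090"}
	out, keep := process(in, rules...)
	fmt.Println(out, keep, len(in))

	// New callers: supply their own builder and avoid the copy.
	b := builder{"__address__": "10.0.0.1:9090"}
	fmt.Println(processBuilder(b, rules...), b)
}
```

Callers that already hold a builder use the in-place variant directly; everyone else keeps calling the wrapper and is unaffected.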

Benchmarks (left column is before this change, right column is after), first with the slice version of labels.Labels:

                               │    sec/op    │    sec/op     vs base               │
TargetsFromGroup/1_targets-4     51.35µ ± 14%   34.24µ ± 48%  -33.31% (p=0.032 n=5)
TargetsFromGroup/10_targets-4    515.8µ ± 37%   318.0µ ± 14%  -38.35% (p=0.008 n=5)
TargetsFromGroup/100_targets-4   5.890m ± 44%   3.204m ± 11%  -45.61% (p=0.008 n=5)

                               │     B/op      │     B/op      vs base               │
TargetsFromGroup/1_targets-4     10.599Ki ± 1%   3.734Ki ± 0%  -64.77% (p=0.008 n=5)
TargetsFromGroup/10_targets-4    105.94Ki ± 0%   37.60Ki ± 1%  -64.50% (p=0.008 n=5)
TargetsFromGroup/100_targets-4   1056.1Ki ± 0%   374.5Ki ± 0%  -64.54% (p=0.008 n=5)

                               │  allocs/op   │  allocs/op   vs base               │
TargetsFromGroup/1_targets-4       65.00 ± 0%    43.00 ± 0%  -33.85% (p=0.008 n=5)
TargetsFromGroup/10_targets-4      641.0 ± 0%    430.0 ± 0%  -32.92% (p=0.008 n=5)
TargetsFromGroup/100_targets-4    6.419k ± 0%   4.310k ± 0%  -32.86% (p=0.008 n=5)

and then with -tags stringlabels:

goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/scrape
cpu: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
                               │    sec/op    │    sec/op     vs base               │
TargetsFromGroup/1_targets-4     62.81µ ± 17%   32.47µ ± 14%  -48.31% (p=0.008 n=5)
TargetsFromGroup/10_targets-4    560.6µ ± 29%   333.0µ ± 11%  -40.59% (p=0.008 n=5)
TargetsFromGroup/100_targets-4   5.696m ± 22%   3.234m ±  7%  -43.22% (p=0.008 n=5)

                               │     B/op      │     B/op      vs base               │
TargetsFromGroup/1_targets-4     19.099Ki ± 0%   4.365Ki ± 1%  -77.14% (p=0.008 n=5)
TargetsFromGroup/10_targets-4    191.46Ki ± 0%   43.55Ki ± 1%  -77.26% (p=0.008 n=5)
TargetsFromGroup/100_targets-4   1907.5Ki ± 0%   435.2Ki ± 1%  -77.18% (p=0.008 n=5)

                               │  allocs/op  │  allocs/op   vs base               │
TargetsFromGroup/1_targets-4      61.00 ± 0%    43.00 ± 0%  -29.51% (p=0.008 n=5)
TargetsFromGroup/10_targets-4     602.0 ± 0%    430.0 ± 0%  -28.57% (p=0.008 n=5)
TargetsFromGroup/100_targets-4   6.025k ± 0%   4.309k ± 0%  -28.48% (p=0.008 n=5)

`loadConfiguration` is made more general.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

bboreham commented Mar 7, 2023

Rebased on main and added another optimisation, which is a breaking change to the interface of TargetsFromGroup.
The benchmarks above have been updated.

bboreham added 5 commits March 7, 2023 17:20
This lets relabelling work on a `Builder` rather than converting to and
from `Labels` on every rule.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Save work converting between Builder and Labels.

Also expose ProcessBuilder, so callers can supply a Builder.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Save work converting to and fro.

Uses the recently-added relabel.ProcessBuilder variant.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Save work converting to `Labels` then to `Builder`.
`PopulateLabels()` now takes a `Builder` as input.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

Common service discovery mechanisms such as Kubernetes can generate a
lot of target groups, so this function was allocating a lot of memory
which then immediately became garbage. Re-using the structures across
an entire Sync saves effort.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
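
The reuse described in the last commit message can be sketched as follows. This is a simplified illustration with hypothetical names (`group`, `targetsFromGroup`, `syncGroups`), not the actual scrape package code: the caller owns a result slice and a scratch map, and passes them into every per-group call so they are allocated once per sync rather than once per group.

```go
package main

import "fmt"

// group is a hypothetical target group: one job name plus target addresses.
type group struct {
	job   string
	addrs []string
}

// targetsFromGroup fills the caller-supplied slice and scratch map
// instead of allocating fresh ones on every call.
func targetsFromGroup(g group, targets []string, scratch map[string]string) []string {
	targets = targets[:0] // keep the capacity, drop the old contents
	for _, addr := range g.addrs {
		for k := range scratch { // reset the scratch map for this target
			delete(scratch, k)
		}
		scratch["job"] = g.job
		scratch["instance"] = addr
		targets = append(targets, scratch["job"]+"/"+scratch["instance"])
	}
	return targets
}

// syncGroups processes every group, reusing the same buffers throughout.
func syncGroups(groups []group) []string {
	var (
		all     []string
		targets []string                  // reused across groups
		scratch = make(map[string]string) // reused across targets
	)
	for _, g := range groups {
		targets = targetsFromGroup(g, targets, scratch)
		all = append(all, targets...) // copy out before the buffer is reused
	}
	return all
}

func main() {
	groups := []group{
		{job: "a", addrs: []string{"h1", "h2"}},
		{job: "b", addrs: []string{"h3"}},
	}
	fmt.Println(syncGroups(groups))
}
```

The trade-off is a slightly wider function signature in exchange for far fewer short-lived allocations when there are many groups.
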
@roidelapluie

Thank you! I like how you used a for loop with Get in resolveConflictingExposedLabels.

bboreham merged commit b96b89e into prometheus:main on Mar 9, 2023
bboreham deleted the faster-targets branch on March 9, 2023 11:10

bboreham commented Mar 9, 2023

> Thank you! I like how you used a for loop with Get in resolveConflictingExposedLabels.

That's in #12084
