OTLP: label caching for OTLP-to-Prometheus conversion to reduce allocations and improve latency by aknuds1 · Pull Request #17860 · prometheus/prometheus

aknuds1 · 2026-01-14T12:39:52Z

Which issue(s) does the PR fix:

Does this PR introduce a user-facing change?

[PERF] otlptranslator: Add label caching for OTLP-to-Prometheus conversion to reduce allocations and improve latency

Summary

This PR includes the following changes:

otlptranslator: Add label caching for OTLP-to-Prometheus conversion
- Per-request label sanitization cache to avoid repeated string allocations
- Resource-level label caching (job, instance, promoted resource attributes, external labels)
- Scope-level label caching (otel_scope_name, otel_scope_version, etc.)
- LabelNamer instance caching across datapoints
- Add benchmarks demonstrating the perf improvements
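
As a rough illustration of the first bullet, a per-request sanitization cache can be sketched as a map from raw attribute name to sanitized label name. This is a minimal sketch and not the code from this PR: `labelCache`, `newLabelCache`, `get`, and `sanitizeLabelName` are hypothetical names, and the sanitization rule shown (replace anything outside `[a-zA-Z0-9_]` with `_`) is a simplified assumption rather than the translator's real normalization.

```go
package main

import "strings"

// sanitizeLabelName is a simplified stand-in for label-name normalization:
// every character outside [a-zA-Z0-9_] is replaced with '_'.
func sanitizeLabelName(name string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			return r
		default:
			return '_'
		}
	}, name)
}

// labelCache (hypothetical) memoizes sanitized names for one request.
type labelCache struct {
	sanitized map[string]string
	misses    int // how many times sanitization actually ran
}

func newLabelCache() *labelCache {
	return &labelCache{sanitized: make(map[string]string)}
}

// get returns the sanitized form of name, computing it at most once.
func (c *labelCache) get(name string) string {
	if s, ok := c.sanitized[name]; ok {
		return s
	}
	c.misses++
	s := sanitizeLabelName(name)
	c.sanitized[name] = s
	return s
}
```

With a cache like this, a label name that recurs on every datapoint is sanitized once per request, and later lookups cost a map read instead of a new string allocation.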

Benchmark Results

OTLP-to-Prometheus label caching benchmarks (Apple M4 Pro):

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/storage/remote/otlptranslator/prometheusremotewrite
cpu: Apple M4 Pro
                                                                                             │   main.txt   │         optimizations.txt          │
                                                                                             │    sec/op    │   sec/op     vs base               │
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=1/metrics=10-14       19.07µ ± 2%   13.27µ ± 4%  -30.44% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=1/metrics=100-14      177.4µ ± 3%   108.1µ ± 8%  -39.06% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=10/metrics=10-14      180.0µ ± 4%   106.2µ ± 3%  -41.00% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=10/metrics=100-14     2.063m ± 5%   1.174m ± 2%  -43.06% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=1/metrics=10-14     122.66µ ± 6%   89.18µ ± 2%  -27.30% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=1/metrics=100-14    1040.5µ ± 2%   713.8µ ± 2%  -31.40% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=10/metrics=10-14    1054.3µ ± 2%   713.8µ ± 2%  -32.30% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=10/metrics=100-14   10.833m ± 1%   7.112m ± 3%  -34.35% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=5/datapoints=100-14                   83.63µ ± 3%   62.62µ ± 3%  -25.12% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=5/datapoints=1000-14                  939.4µ ± 3%   696.7µ ± 2%  -25.84% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=50/datapoints=100-14                  304.5µ ± 3%   215.2µ ± 3%  -29.33% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=50/datapoints=1000-14                 3.163m ± 3%   2.200m ± 2%  -30.44% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=0/metrics=10-14                             11.446µ ± 3%   9.322µ ± 4%  -18.56% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=0/metrics=100-14                            103.86µ ± 4%   72.75µ ± 4%  -29.96% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=10/metrics=10-14                             24.59µ ± 2%   15.53µ ± 1%  -36.86% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=10/metrics=100-14                            234.5µ ± 4%   117.6µ ± 4%  -49.85% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=1/metrics=10-14                            32.36µ ± 3%   21.82µ ± 4%  -32.57% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=1/metrics=100-14                           299.2µ ± 3%   173.5µ ± 3%  -42.02% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=10/metrics=10-14                           314.9µ ± 1%   191.5µ ± 1%  -39.20% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=10/metrics=100-14                          3.217m ± 2%   1.881m ± 3%  -41.51% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=50/metrics=10-14                           1.692m ± 4%   1.035m ± 4%  -38.81% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=50/metrics=100-14                         15.552m ± 3%   8.938m ± 2%  -42.53% (p=0.002 n=6)
geomean                                                                                         365.7µ        237.7µ       -35.02%

                                                                                             │   main.txt    │          optimizations.txt           │
                                                                                             │     B/op      │     B/op       vs base               │
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=1/metrics=10-14       13.57Ki ± 0%    16.01Ki ± 0%  +17.93% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=1/metrics=100-14     111.40Ki ± 0%    88.43Ki ± 0%  -20.62% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=10/metrics=10-14     111.40Ki ± 0%    88.43Ki ± 0%  -20.62% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=10/metrics=100-14    1239.5Ki ± 0%    961.9Ki ± 0%  -22.40% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=1/metrics=10-14      45.45Ki ± 0%    41.98Ki ± 0%   -7.64% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=1/metrics=100-14     327.6Ki ± 0%    235.3Ki ± 0%  -28.17% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=10/metrics=10-14     327.7Ki ± 0%    235.3Ki ± 0%  -28.17% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=10/metrics=100-14    3.222Mi ± 0%    2.264Mi ± 0%  -29.73% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=5/datapoints=100-14                   69.16Ki ± 0%    66.35Ki ± 0%   -4.06% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=5/datapoints=1000-14                  844.2Ki ± 0%    771.1Ki ± 0%   -8.66% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=50/datapoints=100-14                  171.1Ki ± 0%    145.1Ki ± 0%  -15.20% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=50/datapoints=1000-14                 1.830Mi ± 0%    1.530Mi ± 0%  -16.39% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=0/metrics=10-14                              9.759Ki ± 0%   13.383Ki ± 0%  +37.14% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=0/metrics=100-14                             79.43Ki ± 0%    70.34Ki ± 0%  -11.45% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=10/metrics=10-14                             18.42Ki ± 0%    18.16Ki ± 0%   -1.40% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=10/metrics=100-14                            159.9Ki ± 0%    104.6Ki ± 0%  -34.55% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=1/metrics=10-14                            20.53Ki ± 0%    21.67Ki ± 0%   +5.58% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=1/metrics=100-14                           161.3Ki ± 0%    123.6Ki ± 0%  -23.35% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=10/metrics=10-14                           168.7Ki ± 0%    133.0Ki ± 0%  -21.16% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=10/metrics=100-14                          1.685Mi ± 0%    1.271Mi ± 0%  -24.57% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=50/metrics=10-14                           906.2Ki ± 0%    703.2Ki ± 0%  -22.40% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=50/metrics=100-14                          8.063Mi ± 0%    5.948Mi ± 0%  -26.23% (p=0.002 n=6)
geomean                                                                                         208.4Ki         176.5Ki       -15.30%

                                                                                             │   main.txt   │         optimizations.txt          │
                                                                                             │  allocs/op   │  allocs/op   vs base               │
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=1/metrics=10-14       239.00 ± 0%    91.00 ± 0%  -61.92% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=1/metrics=100-14      2135.0 ± 0%    547.0 ± 0%  -74.38% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=10/metrics=10-14      2135.0 ± 0%    547.0 ± 0%  -74.38% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=5/scopes=10/metrics=100-14    21.051k ± 0%   5.058k ± 0%  -75.97% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=1/metrics=10-14       739.0 ± 0%    141.0 ± 0%  -80.92% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=1/metrics=100-14     6685.0 ± 0%    597.0 ± 0%  -91.07% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=10/metrics=10-14     6685.0 ± 0%    597.0 ± 0%  -91.07% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleDatapointsPerResource/res_attrs=50/scopes=10/metrics=100-14   66.105k ± 0%   5.108k ± 0%  -92.27% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=5/datapoints=100-14                   1023.0 ± 0%    533.0 ± 0%  -47.90% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=5/datapoints=1000-14                 10.034k ± 0%   5.044k ± 0%  -49.73% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=50/datapoints=100-14                  2527.0 ± 0%    552.0 ± 0%  -78.16% (p=0.002 n=6)
FromMetrics_LabelCaching_RepeatedLabelNames/unique_labels=50/datapoints=1000-14                25.038k ± 0%   5.063k ± 0%  -79.78% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=0/metrics=10-14                              158.00 ± 0%    87.00 ± 0%  -44.94% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=0/metrics=100-14                             1334.0 ± 0%    543.0 ± 0%  -59.30% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=10/metrics=10-14                              359.0 ± 0%    109.0 ± 0%  -69.64% (p=0.002 n=6)
FromMetrics_LabelCaching_ScopeMetadata/scope_attrs=10/metrics=100-14                            3335.0 ± 0%    565.0 ± 0%  -83.06% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=1/metrics=10-14                             346.0 ± 0%    103.0 ± 0%  -70.23% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=1/metrics=100-14                           3142.0 ± 0%    559.0 ± 0%  -82.21% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=10/metrics=10-14                           3280.0 ± 0%    607.0 ± 0%  -81.49% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=10/metrics=100-14                         31.196k ± 0%   5.118k ± 0%  -83.59% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=50/metrics=10-14                          16.293k ± 0%   2.817k ± 0%  -82.71% (p=0.002 n=6)
FromMetrics_LabelCaching_MultipleResources/resources=50/metrics=100-14                         155.83k ± 0%   25.35k ± 0%  -83.73% (p=0.002 n=6)
geomean                                                                                         3.648k         810.6       -77.78%

Summary: 35% faster, 15% less memory, 78% fewer allocations (geomean).

aknuds1 · 2026-01-14T16:23:08Z

Putting this in draft while I compare existing benchmarks.

ArthurSens

Those are some great improvements, nice job! I just have a few comments.

storage/remote/otlptranslator/prometheusremotewrite/metrics_to_prw.go

Add per-request caching to reduce redundant computation and allocations during OTLP metric conversion:

1. Per-request label sanitization cache: Cache sanitized label names within a request to avoid repeated string allocations for commonly repeated labels like __name__, job, instance.
2. Resource-level label caching: Precompute and cache job, instance, promoted resource attributes, and external labels once per ResourceMetrics boundary instead of for each datapoint.
3. Scope-level label caching: Precompute and cache scope metadata labels (otel_scope_name, otel_scope_version, etc.) once per ScopeMetrics boundary.
4. LabelNamer instance caching: Reuse the LabelNamer struct across datapoints within the same resource context.

These optimizations significantly reduce allocations and improve latency for OTLP ingestion workloads with many datapoints per resource/scope.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Add regression tests to ensure:

1. Scope labels are not added to target_info when PromoteScopeMetadata is enabled. The fix ensures scope labels are only merged when both the cache exists AND promoteScope is true.
2. Promoted resource attributes are not added to target_info. Added getPromotedAttributeNames() method to get the list of promoted attributes, which are then added to ignoreAttrs in addResourceTargetInfo() to prevent them from appearing in target_info.

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Remove resource and scope parameters from createAttributes since these are now cached via setResourceContext and setScopeContext. Update all call sites and tests to properly initialize context before calling internal add* functions. Also fix target_info to not include scope labels by temporarily clearing c.scopeLabels during target_info generation, since target_info is a resource-level metric. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
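
The context-based flow described in this commit message can be sketched roughly as follows. This is an illustrative reduction, not the PR's implementation: the method names `setResourceContext`, `setScopeContext`, and `createAttributes` and the field name `scopeLabels` come from the commit messages, but their signatures, the `label` type, and the `resourceBuilds` counter are assumptions made for the example.

```go
package main

// label is a hypothetical name/value pair used only in this sketch.
type label struct{ name, value string }

// converter holds labels computed once per resource and once per scope.
type converter struct {
	resourceLabels []label
	scopeLabels    []label
	resourceBuilds int // illustration only: counts resource label builds
}

// setResourceContext is called once per ResourceMetrics entry.
func (c *converter) setResourceContext(job, instance string) {
	c.resourceBuilds++
	c.resourceLabels = []label{{"job", job}, {"instance", instance}}
}

// setScopeContext is called once per ScopeMetrics entry.
func (c *converter) setScopeContext(scopeName, scopeVersion string) {
	c.scopeLabels = []label{
		{"otel_scope_name", scopeName},
		{"otel_scope_version", scopeVersion},
	}
}

// createAttributes runs per datapoint and only merges cached labels with
// the datapoint's own attributes; it takes no resource or scope arguments.
func (c *converter) createAttributes(metricName string, datapoint []label) []label {
	out := make([]label, 0, 1+len(c.resourceLabels)+len(c.scopeLabels)+len(datapoint))
	out = append(out, label{"__name__", metricName})
	out = append(out, c.resourceLabels...)
	out = append(out, c.scopeLabels...)
	out = append(out, datapoint...)
	return out
}
```

In this shape the per-datapoint path does a single slice allocation, while the work of deriving job, instance, and scope labels happens once at each resource or scope boundary.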

The sanitizedLabels cache stores label name sanitization results which depend only on the label name and settings. Since settings are constant within a FromMetrics call, clearing the cache at each resource boundary is unnecessary and reduces caching effectiveness. Label names like __name__, job, instance will now remain cached across all resources in a request instead of being re-sanitized for each one. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

…arget_info Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

…ed methods

Remove the redundant clearResourceContext() call at the end of the ResourceMetrics loop since setResourceContext() unconditionally overwrites the cached labels at the start of each iteration. Also remove two methods that became unused after the earlier change to include promoted attributes in target_info:
- addPromotedAttributes
- getPromotedAttributeNames

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Use labels.Builder instead of ScratchBuilder for building promoted resource attribute labels. Builder.Get()/Set() properly handles duplicate label names that can arise when different OTLP attribute names sanitize to the same Prometheus label name (e.g., "foo.bar" and "foo_bar" both become "foo_bar"). Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
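
The name collision mentioned here can be reproduced with a small stand-alone sketch. It does not use the real `labels.Builder` or `ScratchBuilder` types: `appendAll` and `setAll` are hypothetical helpers that only mimic append-style and set-style building, `sanitize` is a simplified assumption (dots become underscores), and the choice that the later value wins in `setAll` is an assumption of this example, not a statement about how the PR resolves conflicts.

```go
package main

import (
	"sort"
	"strings"
)

// label is a hypothetical name/value pair used only in this sketch.
type label struct{ name, value string }

// sanitize is a simplified stand-in: dots become underscores.
func sanitize(name string) string { return strings.ReplaceAll(name, ".", "_") }

// appendAll mimics append-only building: names that collide after
// sanitization end up as duplicate labels.
func appendAll(attrs []label) []label {
	out := make([]label, 0, len(attrs))
	for _, a := range attrs {
		out = append(out, label{sanitize(a.name), a.value})
	}
	return out
}

// setAll mimics set semantics: each sanitized name appears once.
// In this sketch a later value overwrites an earlier one.
func setAll(attrs []label) []label {
	byName := make(map[string]string, len(attrs))
	for _, a := range attrs {
		byName[sanitize(a.name)] = a.value
	}
	out := make([]label, 0, len(byName))
	for n, v := range byName {
		out = append(out, label{n, v})
	}
	// Sort by name so the result is deterministic.
	sort.Slice(out, func(i, j int) bool { return out[i].name < out[j].name })
	return out
}
```

Given the attributes "foo.bar" and "foo_bar", `appendAll` yields two labels named "foo_bar", which is not a valid label set, while `setAll` yields one.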

Use labels.Builder with Set() instead of ScratchBuilder.Add() for scope attributes caching. This handles duplicate label names that can arise when different OTLP attribute names sanitize to the same Prometheus label name. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Since the caching refactor now stores resource and scope labels in the converter's cached state, these parameters are no longer needed in the datapoint-level functions. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

ArthurSens

I realized that building the labelBuilder in NewPrometheusConverter is a bit harder than I initially expected, since settings aren't passed as an argument there.

LGTM

jesusvazquez

LGTM, nice work Arve

storage/remote/otlptranslator/prometheusremotewrite/helper.go

Add a nil check for resourceLabels at the start of createAttributes to return a clear error instead of panicking if the caller forgets to call setResourceContext first. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
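
A guard of this kind might look like the sketch below. Only the names `createAttributes`, `setResourceContext`, and `resourceLabels` come from the commit message; the types, signatures, and error text are assumptions for illustration, and the real field may well be a pointer to a struct rather than a slice.

```go
package main

import "errors"

// label is a hypothetical name/value pair used only in this sketch.
type label struct{ name, value string }

// errNoResourceContext is a hypothetical sentinel error for this sketch.
var errNoResourceContext = errors.New("createAttributes: resource context not set")

type converter struct {
	resourceLabels []label // nil until setResourceContext has been called
}

// setResourceContext always stores a non-nil slice, so a nil value
// reliably means "never initialized".
func (c *converter) setResourceContext(job string) {
	c.resourceLabels = []label{{"job", job}}
}

// createAttributes fails fast with a descriptive error instead of
// misbehaving later when the caller skipped setResourceContext.
func (c *converter) createAttributes(datapoint []label) ([]label, error) {
	if c.resourceLabels == nil {
		return nil, errNoResourceContext
	}
	out := make([]label, 0, len(c.resourceLabels)+len(datapoint))
	out = append(out, c.resourceLabels...)
	out = append(out, datapoint...)
	return out, nil
}
```

Returning an error keeps a caller mistake in a test or internal call site visible and attributable, rather than surfacing as a nil dereference deeper in the conversion.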

krajorama

LGTM, nice job

storage/remote/otlptranslator/prometheusremotewrite/helper.go

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

aknuds1 requested review from ArthurSens and jesusvazquez as code owners January 14, 2026 12:39

aknuds1 force-pushed the arve/optimizations branch from ebd647c to c868373 Compare January 14, 2026 12:42

aknuds1 added kind/optimization area/opentelemetry labels Jan 14, 2026

aknuds1 changed the title ~~Various optimizations and improvements~~ OTLP: label caching for OTLP-to-Prometheus conversion to reduce allocations and improve latency Jan 14, 2026

aknuds1 requested a review from krajorama January 14, 2026 14:11

aknuds1 force-pushed the arve/optimizations branch from a8dc0d8 to b4f73e5 Compare January 14, 2026 14:25

aknuds1 marked this pull request as draft January 14, 2026 16:22

aknuds1 marked this pull request as ready for review January 14, 2026 17:33

ArthurSens reviewed Jan 14, 2026

aknuds1 requested a review from ArthurSens January 15, 2026 10:09

aknuds1 added 10 commits January 15, 2026 11:26

Clear caches in case of error

4d3b15a

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Clean up

494ba8a

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

otlptranslator: ensure promoted resource attributes are included in t…

fcce07c

…arget_info Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

aknuds1 force-pushed the arve/optimizations branch from 6ba9dcd to 18a502e Compare January 15, 2026 10:27

otlptranslator: remove unused resource and scope parameters

75638c3

Since the caching refactor now stores resource and scope labels in the converter's cached state, these parameters are no longer needed in the datapoint-level functions. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

aknuds1 force-pushed the arve/optimizations branch from d2a79f0 to 75638c3 Compare January 15, 2026 10:32

ArthurSens approved these changes Jan 15, 2026

jesusvazquez approved these changes Jan 15, 2026

storage/remote/otlptranslator/prometheusremotewrite/helper.go

otlptranslator: validate resource context in createAttributes

3a4c703

Add a nil check for resourceLabels at the start of createAttributes to return a clear error instead of panicking if the caller forgets to call setResourceContext first. Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

krajorama approved these changes Jan 16, 2026

storage/remote/otlptranslator/prometheusremotewrite/helper.go

aknuds1 and others added 2 commits January 16, 2026 10:56

Update storage/remote/otlptranslator/prometheusremotewrite/helper.go

25bf343

Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com> Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

Fix import

bda40aa

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>

aknuds1 enabled auto-merge (squash) January 16, 2026 10:09

aknuds1 merged commit 4afa76d into prometheus:main Jan 16, 2026
32 checks passed

aknuds1 merged 14 commits into prometheus:main from aknuds1:arve/optimizations