@bwplotka bwplotka commented Oct 7, 2025

This change introduces the draft OM2 format with complex types (more or less as proposed in prometheus/docs#2679).

This is mostly for demo purposes, to explore the benefits (and downsides) of the complex type format, but it could be merged as an initial step towards an OM2 implementation (it is not used by Prometheus scrape yet).

Related to our PromCon talk with @krajorama

Example OM2 text:

# HELP golang_manual_histogram_seconds This is a histogram with manually selected parameters
# TYPE golang_manual_histogram_seconds histogram
golang_manual_histogram_seconds{address="0.0.0.0",generation="20",port="5001"} {count:1,sum:10.0,bucket:[0.005:0,0.01:0,0.025:0,0.05:0,0.1:0,0.25:0,0.5:0,1.0:0,2.5:0,5.0:0,10.0:1,+Inf:1]}
golang_manual_histogram_seconds{address="0.0.0.0",generation="20",port="5002"} {count:1,sum:10.1,bucket:[0.005:0,0.01:0,0.025:0,0.05:0,0.1:0,0.25:0,0.5:0,1.0:0,2.5:1,5.0:1,10.0:1,+Inf:1]}
golang_manual_histogram_seconds{address="0.0.0.0",generation="20",port="5003"} {count:6,sum:20.04,bucket:[0.005:0,0.01:0,0.025:0,0.05:0,0.1:0,0.25:0,0.5:1,1.0:2,2.5:3,5.0:4,10.0:5,+Inf:6]}
# EOF
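
To make the shape of the complex value concrete, here is a minimal Go sketch that parses a value like the ones in the example above into count, sum, and cumulative buckets. It is not the parser added in this PR: `parseComplexHistogram`, `histogram`, and `bucket` are hypothetical names, and the sketch assumes exactly the field order and syntax shown in the example (`count`, `sum`, then a non-empty `bucket:[...]` list with integer counts).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// bucket is one cumulative classic bucket: upper bound and count.
type bucket struct {
	UpperBound float64
	Count      uint64
}

// histogram holds the fields that appear in the example value.
type histogram struct {
	Count   uint64
	Sum     float64
	Buckets []bucket
}

// parseComplexHistogram parses a value such as
// {count:1,sum:10.0,bucket:[0.5:0,+Inf:1]}.
// Hypothetical helper for illustration only; it assumes the scalar
// fields come first and the bucket list comes last and is non-empty.
func parseComplexHistogram(s string) (histogram, error) {
	var h histogram
	if !strings.HasPrefix(s, "{") || !strings.HasSuffix(s, "}") {
		return h, fmt.Errorf("expected {...}, got %q", s)
	}
	body := s[1 : len(s)-1]

	// Separate the scalar fields from the bucket list.
	scalars, list, found := strings.Cut(body, ",bucket:[")
	if !found || !strings.HasSuffix(list, "]") {
		return h, fmt.Errorf("missing bucket list in %q", s)
	}
	list = strings.TrimSuffix(list, "]")

	for _, field := range strings.Split(scalars, ",") {
		key, val, ok := strings.Cut(field, ":")
		if !ok {
			return h, fmt.Errorf("malformed field %q", field)
		}
		var err error
		switch key {
		case "count":
			h.Count, err = strconv.ParseUint(val, 10, 64)
		case "sum":
			h.Sum, err = strconv.ParseFloat(val, 64)
		default:
			err = fmt.Errorf("unknown field %q", key)
		}
		if err != nil {
			return h, err
		}
	}

	for _, pair := range strings.Split(list, ",") {
		le, cnt, ok := strings.Cut(pair, ":")
		if !ok {
			return h, fmt.Errorf("malformed bucket %q", pair)
		}
		// strconv.ParseFloat accepts "+Inf" for the last bucket.
		ub, err := strconv.ParseFloat(le, 64)
		if err != nil {
			return h, err
		}
		c, err := strconv.ParseUint(cnt, 10, 64)
		if err != nil {
			return h, err
		}
		h.Buckets = append(h.Buckets, bucket{UpperBound: ub, Count: c})
	}
	return h, nil
}

func main() {
	v := "{count:6,sum:20.04,bucket:[0.005:0,0.01:0,0.025:0,0.05:0,0.1:0,0.25:0,0.5:1,1.0:2,2.5:3,5.0:4,10.0:5,+Inf:6]}"
	h, err := parseComplexHistogram(v)
	if err != nil {
		panic(err)
	}
	fmt.Printf("count=%d sum=%g buckets=%d\n", h.Count, h.Sum, len(h.Buckets))
}
```

The whole histogram arrives as one value on one line, so a parser along these lines does not need to collect and correlate separate `_bucket`, `_count`, and `_sum` series.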

With ST/CT:

# HELP go_build_info Build information about the main Go module.
# TYPE go_build_info gauge
go_build_info{checksum="",path="",version=""} 1.0
# HELP promhttp_metric_handler_errors Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors counter
promhttp_metric_handler_errors_total{cause="encoding"} 0.0 st@1.726839813016397e+09
promhttp_metric_handler_errors_total{cause="gathering"} 0.0 st@1.726839813016395e+09
# HELP rpc_requests Total number of RPC requests received.
# TYPE rpc_requests counter
rpc_requests_total{service="exponential"} 22.0 st@1.726839813016893e+09
rpc_requests_total{service="normal"} 15.0 st@1.726839813016717e+09
rpc_requests_total{service="uniform"} 11.0 st@1.7268398130168471e+09
# EOF
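
The inline `st@` token in the example above can be split off a sample line with very little code. The Go sketch below is an illustration only, not the PR's parser: `splitStartTimestamp` is a hypothetical helper, and it assumes the start timestamp is the last token on the line and is expressed in Unix seconds (as the values in the example suggest). It does not handle a label value that itself contains ` st@` on a line without a start timestamp.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// splitStartTimestamp separates a trailing " st@<seconds>" token from a
// sample line. It returns the line without the token, the start timestamp
// in Unix seconds, and ok=false when no token is present.
// Hypothetical helper; the real OM2 grammar may differ.
func splitStartTimestamp(line string) (rest string, st float64, ok bool, err error) {
	const marker = " st@"
	line = strings.TrimSpace(line)
	idx := strings.LastIndex(line, marker)
	if idx < 0 {
		return line, 0, false, nil
	}
	st, err = strconv.ParseFloat(line[idx+len(marker):], 64)
	if err != nil {
		return line, 0, false, fmt.Errorf("invalid start timestamp in %q: %w", line, err)
	}
	return line[:idx], st, true, nil
}

func main() {
	line := `rpc_requests_total{service="normal"} 15.0 st@1.726839813016717e+09`
	rest, st, ok, err := splitStartTimestamp(line)
	if err != nil {
		panic(err)
	}
	if ok {
		// Convert seconds to a time.Time at microsecond resolution.
		t := time.UnixMicro(int64(st * 1e6)).UTC()
		fmt.Println(rest)
		fmt.Println("start time:", t.Format(time.RFC3339Nano))
	}
}
```

Because the start timestamp travels on the same line as the sample, no separate `_created` series has to be found and matched to it.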

Initial observations:

  • Complex type parsing is ~9x faster (-89% sec/op) and allocates ~6x less memory (-84% B/op, with ~12x fewer allocations), without further micro-optimizations
  • CT/ST parsing is ~1.7x faster (-42% sec/op) and allocates ~19% less memory
  • A histogram with 12 buckets is ~6x smaller in text bytes
  • No more reliability issues from magic suffix parsing
  • Fewer or the same number of lines of code for parsing? (400 fewer lines currently, assuming we add the NHCB parser code, but I haven't implemented summaries and CT yet 🙃)
  • Buckets look a bit dense for humans to read. A space might help, as @krajorama suggested. Should we try it?

Benchmarks

NHCB

This benchmark shows the difference between parsing histograms in OM 1.x and in OM 2.0 complex types when Prometheus only stores histograms as NHCB. It assumes Prometheus intends to store summaries and histograms as NS and NHCB (and NH) going forward (for best-case efficiency).

benchstat -col /parser out.txt
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/textparse
cpu: Apple M1 Pro
                              │ omtext_with_nhcb │         om2text_with_nhcb          │
                              │      sec/op      │   sec/op     vs base               │
ParseOpenMetricsNHCB_OM1vs2-2      52.704µ ± 12%   5.726µ ± 3%  -89.14% (p=0.002 n=6)

                              │ omtext_with_nhcb │          om2text_with_nhcb           │
                              │       B/s        │      B/s       vs base               │
ParseOpenMetricsNHCB_OM1vs2-2      75.87Mi ± 11%   118.77Mi ± 3%  +56.53% (p=0.002 n=6)

                              │ omtext_with_nhcb │          om2text_with_nhcb          │
                              │       B/op       │     B/op      vs base               │
ParseOpenMetricsNHCB_OM1vs2-2      20.018Ki ± 0%   3.219Ki ± 0%  -83.92% (p=0.002 n=6)

                              │ omtext_with_nhcb │         om2text_with_nhcb         │
                              │    allocs/op     │ allocs/op   vs base               │
ParseOpenMetricsNHCB_OM1vs2-2        353.00 ± 0%   28.00 ± 0%  -92.07% (p=0.002 n=6)

NOTE: The OM1 file with 1 histogram is 4.1KB; the OM2 complex type file with the same data is 713B.

CT/ST

benchstat -col /parser out.txt                     
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/textparse
cpu: Apple M1 Pro
                            │   omtext    │              om2text               │
                            │   sec/op    │   sec/op     vs base               │
ParseOpenMetricsST_OM1vs2-2   4.637µ ± 1%   2.678µ ± 9%  -42.25% (p=0.002 n=6)

                            │    omtext    │               om2text               │
                            │     B/s      │     B/s       vs base               │
ParseOpenMetricsST_OM1vs2-2   204.6Mi ± 1%   275.0Mi ± 8%  +34.36% (p=0.002 n=6)

                            │    omtext    │               om2text               │
                            │     B/op     │     B/op      vs base               │
ParseOpenMetricsST_OM1vs2-2   2.414Ki ± 0%   1.955Ki ± 0%  -19.01% (p=0.002 n=6)

                            │   omtext   │              om2text              │
                            │ allocs/op  │ allocs/op   vs base               │
ParseOpenMetricsST_OM1vs2-2   40.00 ± 0%   21.00 ± 0%  -47.50% (p=0.002 n=6)

Does this PR introduce a user-facing change?

NONE

This change is for demo purposes, exploring the benefits (and downsides)
for the complex type format for OM2 captured in
prometheus/docs#2679.

This assumes Prometheus stores NS and NHCB (and NH) going forward (for
best case efficiency), but is expected to work for classic mode too with
little overhead (benchmarks will tell us).

Part of the PromCon talk we do with @krajorama

Signed-off-by: bwplotka <bwplotka@gmail.com>
@github-actions github-actions bot added the stale label Dec 16, 2025