model/labels: improve performance of regex matchers like .*-.*-.* #17707

Merged
bboreham merged 4 commits into prometheus:main from charleskorn:charleskorn/optimise-chained-regexp-contains
Jan 8, 2026

@charleskorn commented Dec 17, 2025

Which issue(s) does the PR fix:

#14173 introduced an optimisation to better handle regex patterns like .*-.*-.*. It identifies strings the pattern cannot possibly match (because they do not contain all of the literal values) and returns false from MatchString early.

However, if the string does contain all of the literal values, the Go regex engine is then used to confirm that the string matches the pattern. This is unnecessary when the pattern starts and ends with .* and everything in between is either a literal or .*: if the string contains all of the literals in order, it matches the pattern, so invoking Go's regex engine to confirm this is redundant and quite slow.

So this PR introduces a fast path for patterns like this.
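
As a rough illustration of the idea (not the code from this PR), the in-order check can be sketched with the standard library alone. The function name `containsInOrder` is hypothetical, and the sketch assumes the pattern is anchored and that `.*` may match any character, including newlines.

```go
package main

import "strings"

// containsInOrder reports whether s contains every literal in literals,
// in the given order and without overlap. For a pattern of the form
// .*lit1.*lit2.* this is assumed to be equivalent to a full match,
// because each .* can absorb whatever lies between the literals.
// The name and signature are illustrative, not taken from Prometheus.
func containsInOrder(s string, literals []string) bool {
	offset := 0
	for _, lit := range literals {
		idx := strings.Index(s[offset:], lit)
		if idx < 0 {
			// A literal is missing after the previous one: no match.
			return false
		}
		// Resume the search just past the end of this literal.
		offset += idx + len(lit)
	}
	return true
}
```

For the pattern `.*-.*-.*` the literals are `"-"` and `"-"`, so `containsInOrder("a-b-c", []string{"-", "-"})` is true while `containsInOrder("a-b", []string{"-", "-"})` is false. Taking the leftmost occurrence of each literal is sufficient, since it leaves the most room for the literals that follow.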

I think we could apply something similar to patterns that don't start or end with .*, but this would be a slightly more complex change, and I wanted to keep this PR as small as possible while I get my head around FastRegexMatcher.

The existing FastRegexMatcher benchmark didn't cover the case where a pattern like .*-.*-.* matches the string, so I added a specific benchmark for this case:

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
cpu: Apple M1 Pro
                                        │   main.txt   │               pr.txt               │
                                        │    sec/op    │   sec/op     vs base               │
FastRegexMatcher_ConcatenatedPattern-10   3327.5n ± 1%   126.3n ± 2%  -96.20% (p=0.002 n=6)

                                        │  main.txt  │            pr.txt             │
                                        │    B/op    │    B/op     vs base           │
FastRegexMatcher_ConcatenatedPattern-10   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

                                        │  main.txt  │            pr.txt             │
                                        │ allocs/op  │ allocs/op   vs base           │
FastRegexMatcher_ConcatenatedPattern-10   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal
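
The benchmark source is not shown in this thread. A minimal sketch of the kind of comparison it measures, using only the standard library instead of FastRegexMatcher, might look like the following. The input string, the anchored `(?s:...)` form of the pattern, and all function and variable names are assumptions made for illustration.

```go
package main

import (
	"regexp"
	"strings"
	"testing"
)

// anchoredPattern is an anchored, dot-matches-newline form of .*-.*-.*
// (assumed here; not copied from the PR).
var anchoredPattern = regexp.MustCompile(`^(?s:.*-.*-.*)$`)

// sampleInput is a hypothetical string that matches the pattern.
var sampleInput = strings.Repeat("a", 50) + "-" +
	strings.Repeat("b", 50) + "-" + strings.Repeat("c", 50)

// scanInOrder reports whether s contains each literal in order.
func scanInOrder(s string, literals []string) bool {
	offset := 0
	for _, lit := range literals {
		idx := strings.Index(s[offset:], lit)
		if idx < 0 {
			return false
		}
		offset += idx + len(lit)
	}
	return true
}

// BenchmarkRegexEngine measures a full match with Go's regex engine.
func BenchmarkRegexEngine(b *testing.B) {
	for i := 0; i < b.N; i++ {
		if !anchoredPattern.MatchString(sampleInput) {
			b.Fatal("expected a match")
		}
	}
}

// BenchmarkInOrderScan measures the literal-only scan on the same input.
func BenchmarkInOrderScan(b *testing.B) {
	literals := []string{"-", "-"}
	for i := 0; i < b.N; i++ {
		if !scanInOrder(sampleInput, literals) {
			b.Fatal("expected a match")
		}
	}
}
```

Saved in a `_test.go` file, this can be run with `go test -bench=.`, and two runs can be compared with `benchstat` to produce tables in the format shown above.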

Does this PR introduce a user-facing change?

[PERF] Improve performance of regex matchers like `.*-.*-.*`

Signed-off-by: Charles Korn <charles.korn@grafana.com>
@charleskorn force-pushed the charleskorn/optimise-chained-regexp-contains branch from 3ad2a3a to cb5120e on December 17, 2025 04:23
@krajorama requested a review from bboreham on December 17, 2025 13:32
@bboreham left a comment

Broadly fine; one question about the structure.

Also, could you post your benchmark results for the cases added to BenchmarkFastRegexMatcher?

Signed-off-by: Charles Korn <charles.korn@grafana.com>
@charleskorn force-pushed the charleskorn/optimise-chained-regexp-contains branch from a4dc256 to db9cc23 on January 8, 2026 01:40
@charleskorn commented

> Also could you post your benchmark results for the cases added to BenchmarkFastRegexMatcher?

That benchmark doesn't show much impact, because most of the random test strings do not match the patterns and so do not benefit from this optimisation:

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
cpu: Apple M1 Pro
                                                        │   before.txt   │              after.txt              │
                                                        │     sec/op     │    sec/op     vs base               │
FastRegexMatcher/.*-.*-.*-.*-.*-10                         192.8n ±   0%   192.2n ±  0%   -0.31% (p=0.041 n=6)
FastRegexMatcher/.+-.*-.*-.*-.+-10                         192.1n ±   0%   192.2n ±  0%        ~ (p=0.372 n=6)
FastRegexMatcher/-.*-.*-.*-.*-10                           101.4n ±   1%   101.2n ±  0%        ~ (p=0.054 n=6)
FastRegexMatcher/.*-.*-.*-.*--10                           119.0n ±   0%   119.3n ±  0%   +0.21% (p=0.024 n=6)

I added the BenchmarkFastRegexMatcher_ConcatenatedPattern benchmark for this reason (see PR description for results).

@bboreham left a comment

Thanks, that seems better to me.

@bboreham merged commit a919e6d into prometheus:main on Jan 8, 2026
28 checks passed
56quarters added a commit to 56quarters/prometheus that referenced this pull request Jan 9, 2026
This change fixes an issue introduced in prometheus#17707. When a regex
with a wildcard, literal, and final wildcard surrounded by a
capture group was parsed, the capture group was not removed
first, preventing `optimizeConcatRegex` from running.

Found via fuzz testing.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
bboreham pushed a commit that referenced this pull request Jan 12, 2026
This change fixes an issue introduced in #17707. When a regex
with a wildcard, literal, and final wildcard surrounded by a
capture group was parsed, the capture group was not removed
first, preventing `optimizeConcatRegex` from running.

Found via fuzz testing.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
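
To illustrate why a surrounding capture group matters, here is a small sketch using the standard `regexp/syntax` package. It is not the Prometheus implementation: the internals of `optimizeConcatRegex` are not shown in this thread, and the helpers `stripCaptures`, `isAnyStar` and `isWildcardLiteralConcat` are hypothetical names. The point is only that `(.*-.*)` parses to a capture node wrapping the concatenation, so a check that inspects the top-level node must unwrap the capture first.

```go
package main

import "regexp/syntax"

// stripCaptures unwraps capture groups that surround the whole expression.
func stripCaptures(re *syntax.Regexp) *syntax.Regexp {
	for re.Op == syntax.OpCapture {
		re = re.Sub[0]
	}
	return re
}

// isAnyStar reports whether re is .* with dot matching any character.
func isAnyStar(re *syntax.Regexp) bool {
	return re.Op == syntax.OpStar && len(re.Sub) == 1 &&
		re.Sub[0].Op == syntax.OpAnyChar
}

// isWildcardLiteralConcat reports whether pattern, once outer capture
// groups are removed, is a concatenation that starts and ends with .*
// and contains only case-sensitive literals and .* in between.
func isWildcardLiteralConcat(pattern string) (bool, error) {
	// DotNL makes . match newlines too (assumed to mirror the matcher).
	re, err := syntax.Parse(pattern, syntax.Perl|syntax.DotNL)
	if err != nil {
		return false, err
	}
	re = stripCaptures(re)
	if re.Op != syntax.OpConcat || len(re.Sub) < 2 {
		return false, nil
	}
	if !isAnyStar(re.Sub[0]) || !isAnyStar(re.Sub[len(re.Sub)-1]) {
		return false, nil
	}
	for _, sub := range re.Sub {
		isLiteral := sub.Op == syntax.OpLiteral && sub.Flags&syntax.FoldCase == 0
		if !isAnyStar(sub) && !isLiteral {
			return false, nil
		}
	}
	return true, nil
}
```

Without the `stripCaptures` call, the check for `(.*-.*)` would see `syntax.OpCapture` at the top level and reject a pattern that is otherwise eligible.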
charleskorn added a commit to grafana/mimir-prometheus that referenced this pull request Jan 15, 2026
Signed-off-by: Charles Korn <charles.korn@grafana.com>