PromQL: Add `fill*()` binop modifiers to provide default values for missing series by juliusv · Pull Request #17644 · prometheus/prometheus

juliusv · 2025-12-03T18:04:17Z

This starts implementing the proposal for default value binop modifiers throughout the stack, from the PromQL parser+engine all the way to the UI and PromLens visualizer.

So you can now do something like:

# Fill in missing series from either side with a default value of 0.
metric1 + fill(0) metric2

Or:

# Fill in missing custom rate thresholds with a default value of 23.
some_rates > fill_right(23) some_rate_thresholds
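
To make the matching behavior in the two examples above concrete, here is a minimal Go sketch of one-to-one matching with optional defaults for each side. It is not the engine code from this PR: the `Vector` type, the `addWithFill` function, and the string label signatures are hypothetical simplifications, and treating `fill_left()` as the mirror image of `fill_right()` is an assumption based on the naming.

```go
package main

import "fmt"

// Vector maps a series' matching-label signature (for example `{job="a"}`)
// to its sample value at a single evaluation step. Hypothetical stand-in
// for the engine's real vector type.
type Vector map[string]float64

// addWithFill mimics `lhs + fill...(x) rhs` for one-to-one matching.
// fillLeft / fillRight are nil when no default was requested for that side.
func addWithFill(lhs, rhs Vector, fillLeft, fillRight *float64) Vector {
	out := Vector{}
	for sig, l := range lhs {
		if r, ok := rhs[sig]; ok {
			out[sig] = l + r // Both sides present: regular binop.
		} else if fillRight != nil {
			out[sig] = l + *fillRight // RHS series missing: use the default.
		}
	}
	for sig, r := range rhs {
		if _, ok := lhs[sig]; !ok && fillLeft != nil {
			out[sig] = *fillLeft + r // LHS series missing: use the default.
		}
	}
	return out
}

func main() {
	zero := 0.0
	lhs := Vector{`{job="a"}`: 1, `{job="b"}`: 2}
	rhs := Vector{`{job="b"}`: 10, `{job="c"}`: 20}

	fmt.Println(addWithFill(lhs, rhs, nil, nil))     // plain `+`: only job="b" matches
	fmt.Println(addWithFill(lhs, rhs, &zero, &zero)) // `+ fill(0)`: all three series
	fmt.Println(addWithFill(lhs, rhs, nil, &zero))   // `+ fill_right(0)`: job="a" and job="b"
}
```

The sketch ignores `on()` / `ignoring()` and many-to-one matching, which the real implementation has to handle as well.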

A few special considerations to note:

- Other binop modifiers like on(), group_left(), etc. are reserved words that are NOT allowed as metric names. I did not want to break existing users, so I could not adopt the same behavior for the new fill modifiers. For example, a user might have an expression like foo + fill, where both are metric names, and fill still needs to be parseable as a metric name there. Thus I had to extend the lexer to only emit fill modifier tokens if there is a parenthesis following them. Maybe there's a better way of doing this directly in the grammar, but I think the way yacc works makes that hard? Input / experimentation by others welcome.
- I am not implementing the parameterless variants for now, so you always have to include the parentheses with a value inside.
- The current binop fill implementation in the engine introduces a closure function in a place that might be performance-sensitive. It was the easiest way to implement things for now, but we need to benchmark this properly and determine whether this is OK or whether the binop evaluation should be restructured so that it is centered around the match label hash bucket (similar to how Thanos does it), with both sides (LHS + RHS) being filled in for each bucket before performing the final binop operation (or filling in of missing series).
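
The lexer rule from the first note (only treat `fill`, `fill_left`, and `fill_right` as modifier keywords when a parenthesis follows) can be sketched roughly as below. This is a hypothetical standalone helper, not the code in promql/parser/lex.go. Skipping whitespace before the parenthesis is an assumption, and the sketch ignores the lexer state that tells the real lexer it is positioned after a binary operator.

```go
package main

import (
	"fmt"
	"strings"
)

// isFillModifier reports whether the identifier ident, which ends at byte
// offset end in input, should be lexed as a fill modifier keyword rather
// than as a metric name. Hypothetical helper for illustration only.
func isFillModifier(input string, end int, ident string) bool {
	switch ident {
	case "fill", "fill_left", "fill_right":
	default:
		return false
	}
	// Look ahead past whitespace and require an opening parenthesis.
	rest := strings.TrimLeft(input[end:], " \t\r\n")
	return strings.HasPrefix(rest, "(")
}

func main() {
	a := "foo + fill"
	fmt.Println(isFillModifier(a, len(a), "fill")) // false: `fill` is a metric name here

	b := "foo + fill(0) bar"
	end := strings.Index(b, "fill") + len("fill")
	fmt.Println(isFillModifier(b, end, "fill")) // true: followed by "("
}
```

With a lookahead like this, `foo + fill` keeps parsing as two metric names, while `foo + fill(0) bar` yields a modifier token.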

Other than those notes, progress is roughly:

Here's a screenshot of how it looks in the PromLens tree view and the matching visualization in the "Explain" tab:

Fixes #13625

Does this PR introduce a user-facing change?

[FEATURE] PromQL: Add `fill()` / `fill_left()` / `fill_right()` binop modifiers for specifying default values for missing series


juliusv · 2025-12-11T11:51:05Z

I added tests and documentation for the modifiers now and removed the [WIP] from the PR description, so I think it's ready for broader review.

For the closure that I introduced in the binop computation, it seems fine, or at least it doesn't seem to change the BenchmarkJoinQuery benchmark at all in terms of allocs or speed.

Before:

BenchmarkJoinQuery/expr=rpc_request_success_total_+_rpc_request_error_total/steps=10000-20         	       1	2612443093 ns/op	1580342576 B/op	  577780 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_+_ON_(job,_instance)_GROUP_LEFT_rpc_request_error_total/steps=10000-20         	       1	3083784580 ns/op	1977043056 B/op	20567736 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_AND_rpc_request_error_total{instance=~"0.*"}/steps=10000-20                    	       2	 521685498 ns/op	39702924 B/op	  306865 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_OR_rpc_request_error_total{instance=~"0.*"}/steps=10000-20                     	       1	1077586308 ns/op	518800192 B/op	  311608 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_UNLESS_rpc_request_error_total{instance=~"0.*"}/steps=10000-20                 	       2	 986880906 ns/op	39771716 B/op	  306864 allocs/op

After:

BenchmarkJoinQuery/expr=rpc_request_success_total_+_rpc_request_error_total/steps=10000-20         	       1	2619260793 ns/op	1580348944 B/op	  577797 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_+_ON_(job,_instance)_GROUP_LEFT_rpc_request_error_total/steps=10000-20         	       1	3126463664 ns/op	1977042880 B/op	20567734 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_AND_rpc_request_error_total{instance=~"0.*"}/steps=10000-20                    	       2	 509301824 ns/op	39703108 B/op	  306866 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_OR_rpc_request_error_total{instance=~"0.*"}/steps=10000-20                     	       1	1054013104 ns/op	518800112 B/op	  311607 allocs/op
BenchmarkJoinQuery/expr=rpc_request_success_total_UNLESS_rpc_request_error_total{instance=~"0.*"}/steps=10000-20                 	       2	 971900134 ns/op	39771844 B/op	  306863 allocs/op

A remaining issue is the interplay of delayed metric name removal (--enable-feature=promql-delayed-name-removal) and fill modifiers that @vpranckaitis pointed out: If a series exists at some time steps and not others, it will exist with its metric name (metric_a{case="2"}) when it is present and without the metric name ({case="2"}) when it has to be filled in. This is no problem at each single resolution step (as only one variant of the series will be present at any given time, either the real one or the filled-in one), but apparently the output vectors from multiple resolution steps are gathered into a single result matrix, and with delayed metric name removal, the metric name is removed only after that matrix has been gathered, causing collisions at that point in the process, when metric_a{case="2"} gets reduced to {case="2"}, which already exists in the matrix. Any advice on how to correctly address this is welcome.

juliusv · 2025-12-11T12:01:56Z

CC @jcreixell as the author of the delayed metric name removal feature, would you be able to advise on the following?

> A remaining issue is the interplay of delayed metric name removal (--enable-feature=promql-delayed-name-removal) and fill modifiers that @vpranckaitis pointed out: If a series exists at some time steps and not others, it will exist with its metric name (metric_a{case="2"}) when it is present and without the metric name ({case="2"}) when it has to be filled in. This is no problem at each single resolution step (as only one variant of the series will be present at any given time, either the real one or the filled-in one), but apparently the output vectors from multiple resolution steps are gathered into a single result matrix, and with delayed metric name removal, the metric name is removed only after that matrix has been gathered, causing collisions at that point in the process, when metric_a{case="2"} gets reduced to {case="2"}, which already exists in the matrix. Any advice on how to correctly address this is welcome.

There's never really a collision at each individual time step - only when merging different time steps together.
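
As a rough illustration of the sequence described above, the following Go sketch gathers two per-step results into one matrix and only then drops the metric name. The `Sample` and `Series` types and both helper functions are hypothetical simplifications, not the engine's real data structures.

```go
package main

import "fmt"

// Sample and Series are simplified, hypothetical stand-ins for engine types.
type Sample struct {
	T int64 // step timestamp in seconds
	V float64
}

type Series struct {
	Name   string // metric name, empty once dropped
	Labels string // remaining labels rendered as a string, e.g. `{case="2"}`
	Points []Sample
}

// dropNames simulates delayed name removal: the metric name is cleared on
// every series only after all steps have been gathered into the matrix.
func dropNames(m []Series) []Series {
	out := make([]Series, len(m))
	for i, s := range m {
		s.Name = "" // s is a copy, so the input matrix is left untouched.
		out[i] = s
	}
	return out
}

// hasDuplicateLabelsets reports whether two series share name and labels.
func hasDuplicateLabelsets(m []Series) bool {
	seen := map[string]struct{}{}
	for _, s := range m {
		key := s.Name + s.Labels
		if _, ok := seen[key]; ok {
			return true
		}
		seen[key] = struct{}{}
	}
	return false
}

func main() {
	matrix := []Series{
		// Real series, present at step 0 only.
		{Name: "metric_a", Labels: `{case="2"}`, Points: []Sample{{T: 0, V: 1}}},
		// Filled-in series, present at step 600 only.
		{Name: "", Labels: `{case="2"}`, Points: []Sample{{T: 600, V: 0}}},
	}
	fmt.Println(hasDuplicateLabelsets(matrix))            // false: names still differ
	fmt.Println(hasDuplicateLabelsets(dropNames(matrix))) // true: both are now {case="2"}
}
```

The two series never overlap in time, yet the duplicate check fires once the names are gone, which matches the behavior described in the comment.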

bboreham · 2025-12-11T20:01:05Z

I believe Jorge is on paternity leave for a few months, so may not reply.

roidelapluie · 2025-12-12T09:14:43Z

@juliusv do you have a unit test highlighting the issue so we can better help?

juliusv · 2025-12-12T11:22:20Z

@roidelapluie Yes, I've noticed now that the problem is actually more general and does not require the new fill modifiers to be provoked. The problem is in general that we:

1. First gather all resolution steps over time in a big matrix,
2. ...only then apply modifications to that matrix (removing the metric name) that could result in some series names colliding across steps,

...and only then detect that as a collision in ContainsSameLabelset:

prometheus/promql/value.go

Lines 321 to 341 in 583bc01

    
    // ContainsSameLabelset checks if a matrix has samples with the same labelset.
    // Such a behavior is semantically undefined.
    // https://github.com/prometheus/prometheus/issues/4562
    func (m Matrix) ContainsSameLabelset() bool {
    	switch len(m) {
    	case 0, 1:
    		return false
    	case 2:
    		return m[0].Metric.Hash() == m[1].Metric.Hash()
    	default:
    		l := make(map[uint64]struct{}, len(m))
    		for _, ss := range m {
    			hash := ss.Metric.Hash()
    			if _, ok := l[hash]; ok {
    				return true
    			}
    			l[hash] = struct{}{}
    		}
    		return false
    	}
    }

So you can also provoke it like this:

load 10m
    metric_a   1  _
    metric_b   3  4

eval range from 0 to 20m step 10m -metric_a or -metric_b
    {} -1  -4

Or if you want to provoke it with the new fill modifiers, you could do:

load 10m
    metric_a   1  _
    metric_b   _  4

eval range from 0 to 20m step 10m metric_a <= bool fill(0) metric_b
    {} 0  1

There are also more ways to provoke this, like:

load 10m
    metric_a   1  _
    metric_b   _  4

eval range from 0 to 20m step 10m -metric_a or on(__name__) -metric_b
    {} -1  -4

All these queries run fine without delayed metric name removal, but cause collision errors when delayed metric name removal is enabled. I mentioned the same example queries in #15855 already as well. So I think this is generally an issue that can and should be fixed separately from the new fill modifiers. I guess you'd either have to:

- In the big final matrix, merge together series with the same label set that don't have overlapping sample values at the same timestamp.
- Or do any transformations that could cause collisions before gathering everything into one big final matrix.
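
The first option (merging series with identical label sets as long as they never have samples at the same timestamp) could look roughly like the Go sketch below. It is an illustration under simplified assumptions, not the fix from any branch or PR mentioned in this thread: `Sample`, `Series`, and `mergeByLabelset` are hypothetical names, and label sets are plain strings.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Sample and Series are simplified, hypothetical stand-ins for engine types.
type Sample struct {
	T int64
	V float64
}

type Series struct {
	Labels string
	Points []Sample
}

// mergeByLabelset combines series that share a label set. It returns an
// error only if two of them carry a sample at the same timestamp, which
// would be a genuine collision.
func mergeByLabelset(m []Series) ([]Series, error) {
	index := map[string]int{} // label set -> position in out
	var out []Series
	for _, s := range m {
		i, ok := index[s.Labels]
		if !ok {
			index[s.Labels] = len(out)
			// Copy the points so the input slices are not modified.
			out = append(out, Series{Labels: s.Labels, Points: append([]Sample(nil), s.Points...)})
			continue
		}
		out[i].Points = append(out[i].Points, s.Points...)
	}
	for i := range out {
		pts := out[i].Points
		sort.Slice(pts, func(a, b int) bool { return pts[a].T < pts[b].T })
		for j := 1; j < len(pts); j++ {
			if pts[j].T == pts[j-1].T {
				return nil, errors.New("duplicate sample at same timestamp for " + out[i].Labels)
			}
		}
	}
	return out, nil
}

func main() {
	merged, err := mergeByLabelset([]Series{
		{Labels: `{case="2"}`, Points: []Sample{{T: 0, V: 1}}},
		{Labels: `{case="2"}`, Points: []Sample{{T: 600, V: 0}}},
	})
	fmt.Println(len(merged), err) // 1 <nil>
}
```

Sorting after concatenation keeps the sketch short; a real implementation would more likely do a linear merge of already-sorted point slices.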


juliusv · 2025-12-12T12:29:49Z

Btw. I built a tentative fix for merging colliding series after delayed metric name removal in this branch: https://github.com/prometheus/prometheus/commits/delayed-name-removal-series-merge/ - maybe I can still simplify it though, so not opening a PR yet. In general, I'm wondering whether we should keep delayed metric name removal at all. It just makes everything more confusing and less clean IMO. But that's again not really related to the fill modifiers :)

roidelapluie · 2025-12-12T13:53:51Z

WDYT of #17678 ?

jcreixell · 2025-12-13T11:26:48Z

Hi, as @bboreham mentioned, I am on parental leave until February and nowhere near a laptop to look into this in depth right now. I wanted to keep @beorn7 in the loop as he is really the brain behind this feature (I mostly followed his advice while implementing it).

In terms of whether to keep the feature or not, I can see that this has come up a few times, and I think it's a philosophical question of whether every processing step should be self-contained and independent (which helps tooling like PromLens, and might be more elegant and intuitive, particularly for devs) or whether we should defer some processing to the last step for efficiency reasons and to resolve some otherwise surprising (to the less experienced user) query errors due to implementation details of the query engine.

I don't have a definitive answer and don't know much about the specifics of how multiple steps are merged together during query processing (intuitively, it sounds like having duplicated labels for a single step at the end of processing should indeed cause an error, but I need to look into this in more detail to fully understand it).

I just wanted to say that I am not married to this feature and I fully trust the team to make the right call on this, so please don't be blocked by me and do not hesitate to remove it if it doesn't make sense or creates too many problems.

And thank you @juliusv and @roidelapluie for looking into this and proposing solutions 🙏

juliusv · 2025-12-15T10:53:19Z

> Hi, as @bboreham mentioned, I am on parental leave until February and nowhere near a laptop to look into this in depth right now. I wanted to keep @beorn7 in the loop as he is really the brain behind this feature (I mostly followed his advice while implementing it).

Thanks for chiming in, and I hope you are enjoying your parental leave :)

> I just wanted to say that I am not married to this feature and I fully trust the team to make the right call on this, so please don't be blocked by me and do not hesitate to remove it if it doesn't make sense or creates too many problems.

Good to know! And yep, my opinion is also not super strong on this, but I do feel it has some complexity and understandability drawbacks at least that people should consider when deciding to keep it or not. But it's probably better discussed on the issues related to delayed metric name removal in due time.


linasm · 2026-01-15T07:19:36Z

I did a basic search and did not find anything, though perhaps I have missed something. My question is:
was there a discussion on whether this should be rolled out as an experimental feature first, or is it going straight to GA?

juliusv · 2026-01-15T07:57:24Z

@linasm I wanted to bring up that question today as well. While I am tempted to just merge it without a flag immediately, I think putting it behind a feature flag might actually be the safer and more responsible way to go. So far it feels like only a small handful of people have taken a look at the details, and it's probably best to not set things 100% in stone yet. You probably would agree?

linasm · 2026-01-15T08:45:06Z

> @linasm I wanted to bring up that question today as well. While I am tempted to just merge it without a flag immediately, I think putting it behind a feature flag might actually be the safer and more responsible way to go. So far it feels like only a small handful of people have taken a look at the details, and it's probably best to not set things 100% in stone yet. You probably would agree?

In general, it is better to be safe than sorry, but I don't have a strong opinion on this one. More interested in what the maintainers would say.


juliusv · 2026-01-15T10:20:05Z

Alright, I added a new commit that puts the new modifiers behind a new promql-binop-fill-modifiers feature flag.

bwplotka · 2026-01-15T11:09:19Z

nit: Why not reuse the experimental-promql-functions flag for this? It's a function in some way (:

https://prometheus.io/docs/prometheus/latest/feature_flags/#experimental-promql-functions

bwplotka · 2026-01-15T11:15:35Z

Chatted on Slack: it is a modifier, so another flag is fine. I wanted to try to reduce the feature flag surface, but reusing the existing flag would be a bit of an abuse.


bwplotka

Looks good to me.

The only nit is around a test case for metric names like `fill + fill` that hit the special case we added.


juliusv · 2026-01-19T19:05:48Z

Will merge for now to get it in before more merge conflicts, since it's already approved and I only made the minimal requested changes. Happy to make follow-up changes of course, if desired.

juliusv requested review from Nexucis and roidelapluie as code owners December 3, 2025 18:04

juliusv force-pushed the binop-fill-modifier branch 3 times, most recently from 4560728 to 2831d40 Compare December 3, 2025 18:22

juliusv mentioned this pull request Dec 3, 2025

Proposal: PromQL arithmetic with default value, or outer join #13625

Closed

juliusv force-pushed the binop-fill-modifier branch from 2831d40 to a1b8c38 Compare December 3, 2025 18:27

vpranckaitis reviewed Dec 5, 2025

juliusv mentioned this pull request Dec 9, 2025

promql: Make promql-delayed-name-removal the default #15855

Open

juliusv force-pushed the binop-fill-modifier branch from a1b8c38 to f03a45a Compare December 10, 2025 18:25

juliusv changed the title ~~[WIP] PromQL: Add fill*() binop modifiers to provide default values for missing series~~ PromQL: Add fill*() binop modifiers to provide default values for missing series Dec 11, 2025

juliusv added a commit that referenced this pull request Dec 12, 2025

PromQL: Fix erroneous series collision with delayed metric name removal

c092603

See #17644 (comment) Signed-off-by: Julius Volz <julius.volz@gmail.com>

juliusv added a commit that referenced this pull request Dec 12, 2025

PromQL: Fix erroneous series collision with delayed metric name removal

858934c

See #17644 (comment) Signed-off-by: Julius Volz <julius.volz@gmail.com>

juliusv added a commit that referenced this pull request Dec 12, 2025

PromQL: Fix erroneous series collision with delayed metric name removal

3bf4b17

See #17644 (comment) Signed-off-by: Julius Volz <julius.volz@gmail.com>

juliusv added 5 commits January 15, 2026 07:56

PromQL: Add fill*() binop modifiers to provide default values for m…

af3277f

…issing series Signed-off-by: Julius Volz <julius.volz@gmail.com>

Add fill modifier PromQL tests

57dd1f1

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Add PromLens binop matching explain view tests

ce26370

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Document new fill binop modifiers

4c97952

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Add new fill modifiers to features test data

d6aa6a3

Signed-off-by: Julius Volz <julius.volz@gmail.com>

juliusv force-pushed the binop-fill-modifier branch from f0d2ad4 to d6aa6a3 Compare January 15, 2026 06:59

Put binop fill modifiers behind a feature flag

d3b6e61

Signed-off-by: Julius Volz <julius.volz@gmail.com>

bwplotka reviewed Jan 16, 2026

bwplotka approved these changes Jan 16, 2026

juliusv added 2 commits January 16, 2026 20:11

Add a few fill modifier tests with keyword-like metric names

05440ff

Signed-off-by: Julius Volz <julius.volz@gmail.com>

Fix a missing space/newline in the binop docs

90dbdcd

Signed-off-by: Julius Volz <julius.volz@gmail.com>

juliusv merged commit 1d3d98e into main Jan 19, 2026
50 checks passed

juliusv deleted the binop-fill-modifier branch January 19, 2026 19:05

linasm mentioned this pull request Jan 28, 2026

perf(promql): update matchedSigs to a flat map reduce memory by 97% for GROUP_LEFT #17732

Closed

heliapb mentioned this pull request Feb 27, 2026

feat: add support to FillValues perses/promql-builder#36

Merged

7 participants