Add warnings (and annotations) to PromQL query results#12152
Add warnings (and annotations) to PromQL query results#12152beorn7 merged 35 commits intoprometheus:mainfrom
Conversation
beorn7
left a comment
There was a problem hiding this comment.
Thanks for doing this. As you can see from the comments, I'm not sure yet how we should tackle this problem. Comments and thoughts from @codesome, @roidelapluie, or any other interested maintainer are highly welcome.
As a minor point: Apparently, the linter has given up on enforcing a doc comment for every exported type. However, for many of the new exported types in this PR, a doc comment would be really helpful. Could you go through them and add a doc comment wherever it appears even remotely plausible to do so?
|
So, in summary, I think we have to crack to nuts here:
To reduce the problem surface, I think we should focus on (2) for now and only start thinking about (1) once the plumbing is in place. WRT (2): To give some preference among the many thoughts I have expressed in the comments, my gut feeling is currently to…
In that way, we won't need any structures (like |
b2f1e80 to
a57c3d1
Compare
36bdd6f to
a5078ca
Compare
beorn7
left a comment
There was a problem hiding this comment.
My apologies again for taking so long. This is a big deal, both in reward and effort, and smaller (also important) things kept preempting this bulky thing.
Thank you very much for addressing a lot of the detailed comments (and postponing the "range too short" issue). Before diving into a detailed code level review, let's find a way forward with the fundamental structure of the annotations. See my comments here. I'll also try to write down a proposal in the next hour or so. Maybe we should jump on a call to discuss our options in detail.
|
Here some rough ideas how to approach the problem. Happy to discuss here or in a call to refine.
To expand on the last point: Here is a slight variation of what you had already done: var (
PromQLInfo = errors.New("PromQL info")
PromQLWarning = errors.New("PromQL warning")
BadBucketLabelWarning = fmt.Errorf("%w: no bucket label or malformed label value", PromQLWarning)
PossibleNonCounterInfo = fmt.Errorf("%w: metric might not be a counter (name does not end in _total/_sum/_count)", PromQLInfo)
)
func NewBadBucketLabelWarning(metricName, label string) error {
return fmt.Errorf("%w: %s %s", BadBucketLabelWarning, metricName, label)
}
func NewPossibleNonCounterInfo(metricName string) error {
return fmt.Errorf("%w: %s", PossibleNonCounterInfo, metricName)
}For this example, I have only implemented two annotations here: Now a new version of the API could return the info annotations in a separate section by iterating through all annotations and testing them with for _, annotation := range annotations {
if errors.Is(annotation, annotations.PromQLInfo) {
// Add to info section.
} else {
// Add to warning section as before.
}
} |
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
2c029a6 to
221e064
Compare
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
221e064 to
da9633c
Compare
beorn7
left a comment
There was a problem hiding this comment.
Thank you very much. This is a huge accomplishment.
I'll merge this now, but there is one more thing to do:
Rule evaluations have ignored warnings so far, which was probably fine as they only came from remote read. Now that warnings AKA annotations are much more useful for recording rules, we need to log and count the annotations (and maybe even display them in the rules page of the web UI). The starting point is probably the EngineQueryFunc, where we need to extract the annotations from the res and return it as an additional return argument.
|
Thank you for your thorough reviews and all your suggestions! |
Related to PR #12152 Signed-off-by: Julius Volz <julius.volz@gmail.com>
) Related to PR #12152 Signed-off-by: Julius Volz <julius.volz@gmail.com>
| return enh.Out, annos.Add(annotations.NewMixedFloatsHistogramsWarning(metricName, args[0].PositionRange())) | ||
| } | ||
|
|
||
| if isCounter && !strings.HasSuffix(metricName, "_total") && !strings.HasSuffix(metricName, "_sum") && !strings.HasSuffix(metricName, "_count") { |
There was a problem hiding this comment.
I think you should add metricName=="" as a case that is not warned.
This would happen for instance if the input to rate comes from a calculation, and not from a series.
There was a problem hiding this comment.
Yeah... but is there a valid scenario where that can happen? Rates should always be done on raw counters coming from the storage before any other operation is done on them... to pass in a derived expression into rate() you would have to do a subquery to turn it into a range vector, and I'm not sure there's ever a case where it makes sense to do that for counters before rate-ing them?
…metheus#12982) Related to PR prometheus#12152 Signed-off-by: Julius Volz <julius.volz@gmail.com>
…metheus#12982) Related to PR prometheus#12152 Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: Levi Harrison <git@leviharrison.dev>
…metheus#12982) Related to PR prometheus#12152 Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: Levi Harrison <git@leviharrison.dev>
…metheus#12982) Related to PR prometheus#12152 Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: Levi Harrison <git@leviharrison.dev>
…metheus#12982) Related to PR prometheus#12152 Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: Levi Harrison <git@leviharrison.dev>
…theus#12152)" This reverts commit 69edd87.
…metheus#12982) Related to PR prometheus#12152 Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: Levi Harrison <git@leviharrison.dev>
Addresses some of #10839 and #11216 as a hackathon project.
Most of the code changes are just changing from
storage.Warningstonotes.Warningsto move it to a more neutral location as the new warnings we are adding are not related to storage.Annotations are created and handled separately from warnings here, but they are merged together with the warnings in the PromQL engine because the HTTP API and the UI don't support it yet and we want to avoid making external changes for now. By creating annotations separately though, it would make it easier in future when we want to display them differently.
Due to the way warnings are created, we end up with many duplicate warnings, or perhaps we find the warning is not needed later on, e.g. for the
RangeTooSmallWarning, we may add this warning at each section of the range, or we may only add it at the beginning or end but still return a non-empty result for the query. For now we do the deduplication and the removal of the unnecessary warning at the final response writing step in the API.New warnings (
util/notes/warnings.go):Possible future improvements:
notes.Warnings([]error) in the PromQL engine to structs that contain both warnings and annotations (the newnotes.Notes) to delay the merging of warnings and annotations (which would be helpful if we eventually separate them in the final output), and to have it as an independent struct to make any future changes easier (not done yet as it is not strictly necessary at this stage and I'm not sure if we want to do this)