Define sum of empty exponential histograms as null instead of 0.0#138349
Merged
JonasKunz merged 7 commits intoelastic:mainfrom Nov 21, 2025
Merged
Define sum of empty exponential histograms as null instead of 0.0#138349JonasKunz merged 7 commits intoelastic:mainfrom
JonasKunz merged 7 commits intoelastic:mainfrom
Conversation
Collaborator
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
JonasKunz
commented
Nov 20, 2025
| required_capability: enrich_load | ||
| required_capability: fix_replace_missing_field_with_null_duplicate_name_id_in_layout | ||
| required_capability: dense_vector_agg_metric_double_if_fns | ||
| required_capability: exponential_histogram |
Contributor
Author
There was a problem hiding this comment.
This is not needed anymore, because in EsqlSpecTestCase we already ensure that we only load the histogram-CSV data into an index if the capability is present.
Same for lookup-join.csv-spec
not-napoleon
added a commit
that referenced
this pull request
Nov 24, 2025
Part of #137988 Implement #138349 for the t-digest field type. This changes the behavior of the sum sub-field of t-digest when the digest is empty. Prior to this PR we treated it as 0, and this changes it to be null. This avoids a division by zero error in ESQL when trying to calculate the average of an empty histogram. We are adopting the same behavior for the exponential histogram field (see PR linked above), and this is important to keep the semantics of the two fields as close as possible.
afoucret
pushed a commit
to afoucret/elasticsearch
that referenced
this pull request
Nov 26, 2025
Part of elastic#137988 Implement elastic#138349 for the t-digest field type. This changes the behavior of the sum sub-field of t-digest when the digest is empty. Prior to this PR we treated it as 0, and this changes it to be null. This avoids a division by zero error in ESQL when trying to calculate the average of an empty histogram. We are adopting the same behavior for the exponential histogram field (see PR linked above), and this is important to keep the semantics of the two fields as close as possible.
ncordon
pushed a commit
to ncordon/elasticsearch
that referenced
this pull request
Nov 26, 2025
ncordon
pushed a commit
to ncordon/elasticsearch
that referenced
this pull request
Nov 26, 2025
Part of elastic#137988 Implement elastic#138349 for the t-digest field type. This changes the behavior of the sum sub-field of t-digest when the digest is empty. Prior to this PR we treated it as 0, and this changes it to be null. This avoids a division by zero error in ESQL when trying to calculate the average of an empty histogram. We are adopting the same behavior for the exponential histogram field (see PR linked above), and this is important to keep the semantics of the two fields as close as possible.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Followup for #138177.
Prior to this PR, we defined the
SUMof an empty histogram to be0.0in ES|QL.As a result, this rightfully would yield a division-by-zero warning when trying to compute the average:
SUM/COUNT = 0.0/0.0.Empty histograms are an edge case: however, they will be more common when we support cumulative histograms.
Then, the "delta" of an unchanged counter histogram will be an empty histogram.
In this PR, we change the definition of
SUMto benullfor empty histograms so that the average computation isnull/0.0instead, which correctly returnsnullwithout warnings.To stay efficient, we also apply this change to the doc-values when we store exponential histograms. This prevents us from having to do any conversions on the
sumblock when loading, we can just load it as-is as the invariants between doc values and the block match.Because this changes the invariants of the block, it is inherently backwards-incompatible. This is not a problem because the field is hidden behind a feature flag, but we still run bwc tests with feature flags enabled.
For this reason I replaced the ES|QL capabilities with a single, new one, which essentially disables BWC tests for this feature up until now.