ESQL: SUM long overflow fix and TransportVersionAware interface#141272
ESQL: SUM long overflow fix and TransportVersionAware interface#141272ncordon merged 56 commits intoelastic:mainfrom
Conversation
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.4.csv
|
Hi @ivancea, I've created a changelog YAML for you. |
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.4.csv
There was a problem hiding this comment.
Added tests for a combination of:
- FROM and ROW
- Grouping and not grouping
- SUM and AVG (Surrogates to SUM)
| }; | ||
| case SUM -> switch (dataType) { | ||
| case LONGS -> new SumLongAggregatorFunctionSupplier(); | ||
| case LONGS -> new SumLongAggregatorFunctionSupplier(-1, -2, ""); |
There was a problem hiding this comment.
FYI because agg-generated code is in the compute project, it can't receive Source (esql/core) like evaluators do. So we pass the line, column and text directly.
This was made in the warnExceptions PR
alex-spies
left a comment
There was a problem hiding this comment.
This is currently blocked on the outcome of #138805.
That said, I like this approach. Looks nice and simple, and everything is clear after the substitutions step in the optimizer.
| readWindow(in), | ||
| readSummationMode(in) | ||
| readSummationMode(in), | ||
| in.getTransportVersion().supports(ESQL_SUM_LONG_OVERFLOW_FIX) ? in.readBoolean() : true |
| * Returns the expression to be replaced by or {@code null} if this cannot | ||
| * be replaced. | ||
| */ | ||
| Expression forTransportVersion(TransportVersion minTransportVersion); |
There was a problem hiding this comment.
I think we need some more explanation here?
I think there are 3 cases.
- Replaced successfully
- Doesn't need replace (returned same as before)
- Cannot be replaced (but maybe it should and if unsuccessful it should throw about unsupported feature)?
It seems right now we don't have the error code/throwing in the API design, but we might need it. We should think about this when designing the new interface and provide comments here about the expected behavior in that case. I think we can just throw in the forTransportVersion implementation, maybe just improve the comment
| new ReplaceAliasingEvalWithProject(), | ||
| new SkipQueryOnEmptyMappings(), | ||
| new SubstituteSurrogateExpressions(), | ||
| new SubstituteTransportVersionAwareExpressions(), |
There was a problem hiding this comment.
This is brittle. If there is a rule after this one that introduces sum and it will create a non-bwc sum and fail on old nodes (where before it would succeed unless there is an overflow). I reviewed the rules after, they seem OK right now (but @alex-spies you might know more details). But rules change from time to time.
Maybe it is safer to start with the old compatible sum version, and use this rule to make it new version of the sum?
There was a problem hiding this comment.
Add to toString to see it in the plan outputs?
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.4.csv
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 72 out of 78 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
...VersionAwareGoldenTests/testSumGetsReplacedWithSafeLong/local_physical_optimization.expected
Outdated
Show resolved
Hide resolved
...t/java/org/elasticsearch/xpack/esql/expression/function/aggregate/SumSerializationTests.java
Outdated
Show resolved
Hide resolved
...t/java/org/elasticsearch/xpack/esql/expression/function/aggregate/SumSerializationTests.java
Outdated
Show resolved
Hide resolved
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.4.csv
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 73 out of 79 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
.../test/java/org/elasticsearch/xpack/esql/expression/function/AbstractAggregationTestCase.java
Show resolved
Hide resolved
.../test/java/org/elasticsearch/xpack/esql/expression/function/AbstractAggregationTestCase.java
Outdated
Show resolved
Hide resolved
…end on the transport version
# Conflicts: # server/src/main/resources/transport/upper_bounds/9.4.csv
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (8)
📒 Files selected for processing (104)
📝 WalkthroughWalkthroughThis pull request addresses long overflow handling in ESQL SUM aggregation (issue ✨ Finishing Touches🧪 Generate unit tests (beta)
|
resolves elastic#145381 I think this is a sequencing conflict with elastic#141272, which applied a similar change to many other golden tests.
Fixes #110437
Recovers an old PR now that we have
minTransportVersionavailable: #116170SUM(long)throws on an overflow, leading to a 500 error and fully breaking the query.This PR introduces a
warnExceptionsin the evaluator (Introduced here and never used yet: #111829) to instead show a warning and return a null for that group/aggregation.Because this is an aggregation intermediate state change, this isn't BWC. So a new
TransportVersionAwareinterface was introduced to let the function modify its behavior based on theminTransportVersionacross all the query clusters (This is the new bit that we didn't have in the original PR).