ESQL: Prevent overflow on SUM aggregation#111639
ESQL: Prevent overflow on SUM aggregation#111639ivancea wants to merge 30 commits intoelastic:mainfrom
Conversation
|
Hi @ivancea, I've created a changelog YAML for you. |
|
Hi @ivancea, I've updated the changelog YAML for you. |
| } | ||
| } | ||
| } | ||
| return result; |
There was a problem hiding this comment.
I think this is a copy of some other code. Could we make a static reference instead here? At least for some of it. It's so weird.
| protected final BigArrays bigArrays; | ||
|
|
||
| private BitArray seen; | ||
| private BitArray failed; |
There was a problem hiding this comment.
I think it's worth a comment here about when these are initialized. And I think it's also worth a comment about why you didn't use an enum or an array of two-bit values. I think it's correct to use to BitArrays here because frequently you won't need either. And when you need seen you often won't need failed. That's why I think you did it this way. And, also, it's probably easier to do it this way....
There was a problem hiding this comment.
Yeah, we could do it with an enum-like array. We would lose tho the "false by default" feature of those bitarrays that avoid growing them for falses, and it would take more memory I expect.
Tbh, I didn't consider it, this looked quite minimal.
Going to comment both of them. If you think using other kind of container would be worth it, just tell me!
| * idempotent and fast if already tracking so it's safe to, say, call it once | ||
| * for every block of values that arrives containing {@code null}. | ||
| */ | ||
| final void enableGroupIdTracking(SeenGroupIds seenGroupIds) { |
There was a problem hiding this comment.
Should this extend from AbstractArrayState? Are these the same plus the extra failure methods?
| addedWarnings++; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
I think it's be fine to just move warnings here.
There was a problem hiding this comment.
Well, not in the aggregation package, but into compute.
There was a problem hiding this comment.
The existing Warnings depends on Source too, which is also from esql-core. Here this receives the required Source parts (line and col numbers + text). But I don't think it's worth changing everything to do it that way (?)
I'll check it. I could try something like keeping a "Warnings" in esql-core with just that conversion. Looks weird tho (Not that this doesn't look weird btw)
| $endif$ | ||
| $if(boolean||double||float)$ | ||
| import org.elasticsearch.compute.data.IntVector; | ||
| $endif$ |
There was a problem hiding this comment.
Again, I'm sorry for this silly dance we have to do for imports. I just couldn't figure out a way to turn off the import checkstyle rule for just this file.
There was a problem hiding this comment.
This is a copypaste+failed things, I didn't touch anything here :hehe:
I'll diff them in case it's worth like, adding an extra parameter to generate both (fallible and non fallible) from the same template. But I'll try to not overcomplicate it, I don't want 2 levels of templating here 😆
Add the `warnExceptions` capability to aggregators. Same as for evaluators. This requires: - Having a "Warnings-like" object in the compute module. I created a new class for that, temporarily - Added new "FaillibleState" objects, that hold a "failed" boolean/bitset to mark failed groups - Catching exceptions in all combine methods receiving a single group (Not catching in evaluateFinal/Intermediate, nor in combines that receive the full state) - Shortcircuiting operations if the group is in "failed" state - Having a third intermediate state: `values, seen, failed` Extracted from #111639. Check it to see a use case
# Conflicts: # x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/AbstractFallibleArrayState.java
...pute/src/test/java/org/elasticsearch/compute/aggregation/SumLongAggregatorFunctionTests.java
Show resolved
Hide resolved
# Conflicts: # x-pack/plugin/esql/compute/src/main/generated/org/elasticsearch/compute/aggregation/SumLongGroupingAggregatorFunction.java
Add the `warnExceptions` capability to aggregators. Same as for evaluators. This requires: - Having a "Warnings-like" object in the compute module. I created a new class for that, temporarily - Added new "FaillibleState" objects, that hold a "failed" boolean/bitset to mark failed groups - Catching exceptions in all combine methods receiving a single group (Not catching in evaluateFinal/Intermediate, nor in combines that receive the full state) - Shortcircuiting operations if the group is in "failed" state - Having a third intermediate state: `values, seen, failed` Extracted from elastic#111639. Check it to see a use case
Add the `warnExceptions` capability to aggregators. Same as for evaluators. This requires: - Having a "Warnings-like" object in the compute module. I created a new class for that, temporarily - Added new "FaillibleState" objects, that hold a "failed" boolean/bitset to mark failed groups - Catching exceptions in all combine methods receiving a single group (Not catching in evaluateFinal/Intermediate, nor in combines that receive the full state) - Shortcircuiting operations if the group is in "failed" state - Having a third intermediate state: `values, seen, failed` Extracted from elastic#111639. Check it to see a use case
# Conflicts: # x-pack/plugin/esql/compute/src/main/generated/org/elasticsearch/compute/aggregation/SumLongAggregatorFunction.java # x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/data/BlockSerializationTests.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/expression/function/TestCaseSupplier.java
# Conflicts: # x-pack/plugin/esql/compute/src/main/generated/org/elasticsearch/compute/aggregation/SumLongAggregatorFunction.java
Fixes #110443
This PR does some things:
Add theThis part was extracted to ESQL: Added exception catching to aggregators #111829warnExceptionscapability to aggregators. Same as for evaluators.warnExceptionsforSumLongAggregatorSumTests, and adapt the testsTo be done:
DO NOT MERGE
We can't merge this before the retrocompatibility with older versions is implemented, as we're now adding a new block to the intermediate state.