Improve AccumulatorCompiler generated code#16722
Conversation
3cc35e3 to
2363119
Compare
461d1e7 to
ab7677d
Compare
ab7677d to
70a2420
Compare
|
@jamesgpearce James, these are very nice optimizations, but I can't really read the code generation code. The benchmark results visualization is very cool. I'm curious how did you create it? The changes looks good, but I wonder whether there is sufficient testing. Have you done any additional testing beyond unit tests? If so, would you share what these are? |
You can tell JMH to output results as JSON format, and those files can be parsed and displayed using jmh.morethan.io either by uploading the local file or by adding a
I added a test so that any distinct aggregations tests using |
|
Thank you, James. We are consistently finding that deep changes in the engine like these are not getting sufficiently covered by unit tests as bugs come out when we run the verifier during release process or after deployment in production settings. Any chance you could run this change on some of the production workload you have access to? This would give us a bit more confidence that this change won't break things. That said, I feel the quality of this PR is high therefore I'm going to accept. I'll wait for you to decide whether you can do additional testing before merging. |
c9a01d1 to
b1e8429
Compare
Improves the AccumulatorCompilerGeneratedCode by - Using an int for the maskChannel field as well as each inputChannel field required instead of retaining a boxed Optional or List of Integer - Filtering RLE mask blocks of true into nulls before entering the input loop to bypass unnecessary mask checks - Detecting RLE mask blocks of false and bypassing the addInput for loop if all positions are eliminated
b1e8429 to
12e4ce7
Compare
|
This has been running in our test environment for a while now with no edge cases discovered and has also been merged into Trino at this point without incident reported there. Since I see other ongoing work in the area of aggregations and codegen, I'm going to merge once the rebased changes are green since it seems like letting this PR sit open longer is more likely to let incompatible changes creep in. |
Improves the AccumulatorCompilerGeneratedCode by
intfield formaskChannelandinputChannelfields, instead of retaining a boxedOptionalorListof IntegerBenchmark results (note, this does not test the RLE mask handling scenarios):