-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-12751: [C++] Implement minimum/maximum kernels #10390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cpp/src/arrow/compute/api_scalar.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to name this something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is worth giving this a more general name, because there are other potential uses. Maybe ElementWiseAggregateOptions? (since this is aggregation, just across instead of down)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If using "Aggregate" is too confusing, maybe "Combine" or "Merge" would work too. e.g. ElementWiseCombineOptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we have ScalarAggregateOptions I think ElementWiseAggregateOptions should be ok. I've updated the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is rather a lot of code being generated; it might be nice if I could figure out the right template setup to consolidate the temporal types with their respective equivalent integral type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's GeneratePhysicalInteger, though that doesn't include floating point. Perhaps you could add GeneratePhysicalNumeric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this. Note that if we want to have CommonTimestamp() I don't think this'll fly since we'd need to scale the timestamp values to the common unit before comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, CommonTimestamp indicates that an implicit cast is necessary (which in the case of timestamps includes the scaling). It's not the responsibility of the kernel to execute that implicit cast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Ok, my mistake. I'll add some test cases for that.
Binary types aren't implemented here, though (and would need some work/restructuring to handle).
86b55b8 to
7bb91e0
Compare
|
(N.B. I still need to fix CommonTimestamp, got caught up in fixing the other feedback) |
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what the point is of a variadic function when we don't have e.g. variadic addition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use arr->null_count() == 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will allocate a null bitmap even if all input arrays have 0 nulls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of if (first), you could move the bitmap allocation below and use if (!output->buffers[0]).
cpp/src/arrow/ipc/json_simple.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we ensure it's equal to 1?
docs/source/cpp/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add a column for the options class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this done by default in the test setup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh. I don't think generating string representations for each and every test is a good idea.
My understanding is that we are aiming for Arrow's compute API to reach parity with the collections of built-in functions available in popular SQL engines. These SQL engines include several variadic functions that combine/merge/aggregate values row-wise. Two such SQL functions are There are two ways we could implement this functionality without implementing variadic functions:
If one of those alternative approaches is superior, then perhaps we do not need a variadic function. But my current understanding is that both of these alternative approaches are inferior for reasons of usability and efficiency respectively. |
|
I see. Well, if efficiency is really important and it's common to have more than 2 inputs, then why not (though the implementation in this PR is probably far from optimal). |
docs/source/cpp/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just came to my mind that it's gonna be very confusing to have "minimum" and "maximum" functions in addition to "min_max". Perhaps we can be less ambiguous, e.g. "scalar_min", "scalar_max" (or "horizontal_min" or "row_min"...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I'll rename the compute function. Perhaps elementwise_min in accordance with the options struct (though that is a bit wordy)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's wordy indeed. Or at least element_wise_min.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about naming them least and greatest like they do in SQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is that less ambiguous?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just changed it to element_wise_min/max.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that using least and greatest would not fully resolve the ambiguity (except to users familiar with SQL) but the use of these synonyms would at least signal that these are something different from min and max. I think element_wise_min and element_wise_max are fine and do a better job of disambiguating as long as we're OK with their length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of trying to recycle buffers, we could change the kernel's flag and allow the executor to preallocate the output data buffer. Then we populate this with the anti extreme (for maximum/uint64_t we zero the buffer). Then we wouldn't need to use a bitmap to express "no values folded yet"; get_extreme({anti_extreme, v...}) == get_extreme({v...}). We'd also be able to compute the null bitmap separately in either case: !skip_nulls AND the bitmaps, skip_nulls OR the bitmaps.
In the case where scalar_count != 0, we can ExecScalar and use the result instead of an anti_extreme
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works quite well except in the case that we have a singular NaN array argument (element_wise_min([NaN])) - in which case we want to emit NaN not the antiextreme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, maybe let's just have the 'antiextreme' of a float be NaN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The antiextremes of float are Inf, -Inf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inf would trip up if we had something like element_wise_min([NaN, 0], [NaN, 1]) in which case we'd want to emit NaN and not Inf as the first element. Using Inf as the antiextreme means we'd emit Inf since the kernel takes any non-NaN value over any NaN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I was expecting fmin(NaN, x) = NaN but in fact it's fmin(NaN, x) = x. Nevermind
50e86be to
33813ac
Compare
|
@lidavidm could you briefly describe what happens here if the arrays passed to these kernels have different but comparable data types such as one array of doubles and one array of integers? |
For numeric/temporal types, all arrays will be cast to a common compatible type, much the same as with arithmetic kernels (e.g. for doubles + integers, all values will be cast to doubles). |
bkietz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
This is a bit messy, but implements a variadic scalar maximum/minimum kernel.