
Conversation

@lidavidm (Member):

This is a bit messy, but implements a variadic scalar maximum/minimum kernel.

Member Author:

Do we want to name this something else?

Member:

I think it is worth giving this a more general name, because there are other potential uses. Maybe ElementWiseAggregateOptions? (since this is aggregation, just across instead of down)

Member:

If using "Aggregate" is too confusing, maybe "Combine" or "Merge" would work too, e.g. ElementWiseCombineOptions.

Member Author:

Since we have ScalarAggregateOptions I think ElementWiseAggregateOptions should be ok. I've updated the PR.
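For reference, a rough sketch of what the renamed options class might look like, assuming a single skip_nulls field analogous to ScalarAggregateOptions (this is an illustration, not necessarily the PR's exact code):

```cpp
// Hypothetical sketch; in the real library this would derive from
// arrow::compute::FunctionOptions. If skip_nulls is true, null inputs are
// ignored when folding values across the arguments; if false, any null makes
// the corresponding output slot null.
struct ElementWiseAggregateOptions {
  explicit ElementWiseAggregateOptions(bool skip_nulls = true)
      : skip_nulls(skip_nulls) {}
  bool skip_nulls;
};
```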

Member Author:

This is rather a lot of code being generated; it might be nice if I could figure out the right template setup to consolidate the temporal types with their respective equivalent integral type.

Member:

There's GeneratePhysicalInteger, though that doesn't include floating point. Perhaps you could add GeneratePhysicalNumeric?
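For illustration, a simplified standalone sketch of that dispatch pattern, with a plain enum and function pointers standing in for Arrow's codegen helpers (not the actual GeneratePhysicalNumeric signature):

```cpp
#include <cstdint>
#include <stdexcept>

enum class TypeId { Int32, Date32, Int64, Timestamp, Duration, Float, Double };

// One kernel instantiation per *physical* C type.
template <typename CType>
void MinMaxExec() { /* element-wise min/max over CType values */ }

using KernelExec = void (*)();

// Temporal types reuse the kernel generated for their storage integer, so the
// template is instantiated once per physical representation rather than once
// per logical type.
KernelExec DispatchPhysicalNumeric(TypeId id) {
  switch (id) {
    case TypeId::Int32:
    case TypeId::Date32:
      return MinMaxExec<int32_t>;
    case TypeId::Int64:
    case TypeId::Timestamp:
    case TypeId::Duration:
      return MinMaxExec<int64_t>;
    case TypeId::Float:
      return MinMaxExec<float>;
    case TypeId::Double:
      return MinMaxExec<double>;
    default:
      throw std::runtime_error("no kernel for this type");
  }
}
```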

Member Author:

I added this. Note that if we want to have CommonTimestamp(), I don't think this'll fly, since we'd need to scale the timestamp values to the common unit before comparison.

Member:

No, CommonTimestamp indicates that an implicit cast is necessary (which in the case of timestamps includes the scaling). It's not the responsibility of the kernel to execute that implicit cast.
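A hedged usage sketch of what that implies (the function name here is the one discussed later in this thread; the final registered name may differ): the dispatcher promotes both timestamps to the finer common unit before the kernel runs, so the kernel only ever compares values in a single unit.

```cpp
#include <memory>
#include <arrow/api.h>
#include <arrow/compute/api.h>

arrow::Result<arrow::Datum> TimestampExample() {
  using arrow::TimeUnit;
  // 2 s vs. 1500 ms: after the implicit cast to milliseconds, the minimum is
  // 1500 ms; without the cast the kernel would wrongly compare 2 with 1500.
  auto a = std::make_shared<arrow::TimestampScalar>(
      2, arrow::timestamp(TimeUnit::SECOND));
  auto b = std::make_shared<arrow::TimestampScalar>(
      1500, arrow::timestamp(TimeUnit::MILLI));
  return arrow::compute::CallFunction("element_wise_min",
                                      {arrow::Datum(a), arrow::Datum(b)});
}
```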

Member Author:

Ah! Ok, my mistake. I'll add some test cases for that.

Binary types aren't implemented here, though (and would need some work/restructuring to handle).

@lidavidm force-pushed the arrow-12751 branch 2 times, most recently from 86b55b8 to 7bb91e0 (May 25, 2021 15:58)
@lidavidm (Member Author):

(N.B. I still need to fix CommonTimestamp, got caught up in fixing the other feedback)

@pitrou (Member) left a comment:

Can you explain what the point is of a variadic function when we don't have e.g. variadic addition?

Member:

Use arr->null_count() == 0?

Member:

This will allocate a null bitmap even if all input arrays have 0 nulls.

Member:

Instead of if (first), you could move the bitmap allocation below and use if (!output->buffers[0]).

Member:

Shouldn't we ensure it's equal to 1?

Member:

Should add a column for the options class.

Member:

Why isn't this done by default in the test setup?

Member:

Uh. I don't think generating string representations for each and every test is a good idea.

@ianmcook (Member) commented May 26, 2021:

> Can you explain what the point is of a variadic function when we don't have e.g. variadic addition?

My understanding is that we are aiming for Arrow's compute API to reach parity with the collections of built-in functions available in popular SQL engines. These SQL engines include several variadic functions that combine/merge/aggregate values row-wise. Two such SQL functions are greatest() and least(). The two kernels implemented in this PR are equivalent to those two SQL functions. Another such SQL function is concat() and its variant concat_ws(), which we intend to implement in ARROW-12709.

There are two ways we could implement this functionality without implementing variadic functions:

  1. Chain multiple calls to a binary function (as is required for addition of more than two values, as you mention)
  2. Combine the arrays row-wise into a ListArray (ARROW-12739) then operate on the ListArray with a unary function

If one of those alternative approaches is superior, then perhaps we do not need a variadic function. But my current understanding is that both of these alternative approaches are inferior for reasons of usability and efficiency respectively.
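To make the comparison concrete, a hedged sketch of approach 1 versus a single variadic call (function name as used later in this PR; assume three equal-length numeric arrays):

```cpp
#include <arrow/compute/api.h>
#include <arrow/result.h>

using arrow::Datum;
using arrow::compute::CallFunction;

// SQL: SELECT greatest(a, b, c) ...
// Variadic: one pass over the three inputs.
arrow::Result<Datum> Greatest(const Datum& a, const Datum& b, const Datum& c) {
  return CallFunction("element_wise_max", {a, b, c});
}

// Approach 1 above: chaining a binary call. Each step materializes an
// intermediate array, so k inputs of length n cost roughly (k - 1) extra
// allocations and scans.
arrow::Result<Datum> GreatestChained(const Datum& a, const Datum& b,
                                     const Datum& c) {
  ARROW_ASSIGN_OR_RAISE(Datum ab, CallFunction("element_wise_max", {a, b}));
  return CallFunction("element_wise_max", {ab, c});
}
```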

@pitrou (Member) commented May 26, 2021:

I see. Well, if efficiency is really important and it's common to have more than 2 inputs, then why not (though the implementation in this PR is probably far from optimal).

Member:

It just came to my mind that it's gonna be very confusing to have "minimum" and "maximum" functions in addition to "min_max". Perhaps we can be less ambiguous, e.g. "scalar_min", "scalar_max" (or "horizontal_min" or "row_min"...).

Member Author:

That's a good point. I'll rename the compute function. Perhaps elementwise_min in accordance with the options struct (though that is a bit wordy)?

Member:

That's wordy indeed. Or at least element_wise_min.

Member:

How about naming them least and greatest like they do in SQL?

Member:

How is that less ambiguous?

Member Author:

I just changed it to element_wise_min/max.

Member:

I agree that using least and greatest would not fully resolve the ambiguity (except to users familiar with SQL) but the use of these synonyms would at least signal that these are something different from min and max. I think element_wise_min and element_wise_max are fine and do a better job of disambiguating as long as we're OK with their length.

Member:

Instead of trying to recycle buffers, we could change the kernel's flag and allow the executor to preallocate the output data buffer. Then we populate it with the anti-extreme (for maximum over uint64_t, we zero the buffer). Then we wouldn't need a bitmap to express "no values folded yet", since get_extreme({anti_extreme, v...}) == get_extreme({v...}). We'd also be able to compute the null bitmap separately in either case: for !skip_nulls, AND the input bitmaps; for skip_nulls, OR them.

In the case where scalar_count != 0, we can ExecScalar and use the result instead of an anti_extreme.
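A simplified sketch of that idea, with plain vectors standing in for Arrow buffers and bitmaps (not the kernel's actual code):

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

struct Column {
  std::vector<uint64_t> values;
  std::vector<bool> valid;
};

// Element-wise max across several equal-length columns. The output starts at
// the anti-extreme (0 for max over uint64_t), so folding a null slot is a
// no-op, and validity is combined separately according to skip_nulls.
Column ElementWiseMax(const std::vector<Column>& inputs, bool skip_nulls) {
  const size_t n = inputs.front().values.size();
  Column out;
  out.values.assign(n, std::numeric_limits<uint64_t>::min());
  // skip_nulls: a slot is valid if *any* input is valid (OR the bitmaps);
  // otherwise it is valid only if *all* inputs are valid (AND the bitmaps).
  out.valid.assign(n, skip_nulls ? false : true);
  for (const Column& in : inputs) {
    for (size_t i = 0; i < n; ++i) {
      if (in.valid[i]) out.values[i] = std::max(out.values[i], in.values[i]);
      out.valid[i] = skip_nulls ? (out.valid[i] || in.valid[i])
                                : (out.valid[i] && in.valid[i]);
    }
  }
  return out;
}
```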

Member Author:

This works quite well except when we have a single NaN array argument (element_wise_min([NaN])), in which case we want to emit NaN, not the antiextreme.

Member Author:

Ah, maybe let's just have the 'antiextreme' of a float be NaN.

Member:

The antiextremes of float are Inf and -Inf.

Member Author:

Inf would trip us up if we had something like element_wise_min([NaN, 0], [NaN, 1]), where we'd want to emit NaN and not Inf as the first element. Using Inf as the antiextreme means we'd emit Inf, since the kernel takes any non-NaN value over any NaN.

Member:

Huh, I was expecting fmin(NaN, x) = NaN, but in fact it's fmin(NaN, x) = x. Never mind.
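For the record, a quick standalone check of that fmin/fmax behavior:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

int main() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  // fmin/fmax treat NaN as "missing": if exactly one argument is NaN,
  // the other argument is returned.
  assert(std::fmin(nan, 1.0) == 1.0);
  assert(std::fmax(1.0, nan) == 1.0);
  // Only when both arguments are NaN is NaN returned.
  assert(std::isnan(std::fmin(nan, nan)));
  return 0;
}
```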

@lidavidm force-pushed the arrow-12751 branch 2 times, most recently from 50e86be to 33813ac (June 2, 2021 16:26)
@ianmcook (Member) commented Jun 3, 2021:

@lidavidm could you briefly describe what happens here if the arrays passed to these kernels have different but comparable data types such as one array of doubles and one array of integers?

@lidavidm (Member Author) commented Jun 3, 2021:

> @lidavidm could you briefly describe what happens here if the arrays passed to these kernels have different but comparable data types such as one array of doubles and one array of integers?

For numeric/temporal types, all arrays will be cast to a common compatible type, much the same as with arithmetic kernels (e.g. for doubles + integers, all values will be cast to doubles).
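A hedged usage sketch of the mixed-type case (function name as used in this PR at the time; the cast is performed by the dispatch machinery, not the kernel itself):

```cpp
#include <memory>
#include <arrow/api.h>
#include <arrow/compute/api.h>

// Given an int32 array and a float64 array, both arguments are implicitly
// cast to float64 before the kernel executes, so the result is float64.
arrow::Result<arrow::Datum> MixedTypes(
    const std::shared_ptr<arrow::Array>& ints,
    const std::shared_ptr<arrow::Array>& doubles) {
  return arrow::compute::CallFunction(
      "element_wise_min", {arrow::Datum(ints), arrow::Datum(doubles)});
}
```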

@bkietz (Member) left a comment:

Thanks for doing this!
