Skip to content

Conversation

@lidavidm
Copy link
Member

This wraps the tdigest aggregation to make it easier for bindings to use. This PR is on top of ARROW-14050 (#11199).

@github-actions
Copy link

@ianmcook
Copy link
Member

@lidavidm When I test the hash aggregate variation of this this from the R bindings on arrays of integer data that have integral medians within groups, I'm seeing errors like this:

Error: Invalid: Referenced field x was double but should have been int32

Do these kernels always return float64 output or are there some cases where integer input will give integer output?

@lidavidm
Copy link
Member Author

@ianmcook hmm, no, it should always give float64 regardless of the input type - I'll take a look tomorrow (if you push it somewhere I can dig into it too)

@ianmcook
Copy link
Member

@ianmcook hmm, no, it should always give float64 regardless of the input type - I'll take a look tomorrow (if you push it somewhere I can dig into it too)

I built Arrow C++ from this PR and I'm using that to test the changes to the R bindings in #11018. The test that fails with this error is here: https://github.com/apache/arrow/pull/11018/files#diff-a18898eb61bfc3d3dda292cf594b26d5f08a8cd7b455b72ed9c58f6e9c1311edR234-R249

@lidavidm
Copy link
Member Author

Thanks, that should be fixed now - I had declared the wrong output type, leading to expressions getting bound with the wrong types. The tests pass locally for me now.

@ianmcook
Copy link
Member

Thanks, that should be fixed now - I had declared the wrong output type, leading to expressions getting bound with the wrong types. The tests pass locally for me now.

Yep that fixes it, thanks!

@ianmcook ianmcook requested a review from pitrou September 22, 2021 15:14
@ianmcook ianmcook requested a review from pitrou September 23, 2021 18:38
@ianmcook
Copy link
Member

Is this ready to merge? The CI failure appears to be unrelated.

@lidavidm
Copy link
Member Author

I think we should be good here? We can tackle anything in a followup if needed.

@lidavidm lidavidm closed this in 78ac132 Sep 24, 2021
@lidavidm lidavidm deleted the arrow-14052 branch September 24, 2021 13:02
ViniciusSouzaRoque pushed a commit to s1mbi0se/arrow that referenced this pull request Oct 20, 2021
This wraps the tdigest aggregation to make it easier for bindings to use. This PR is on top of ARROW-14050 (apache#11199).

Closes apache#11204 from lidavidm/arrow-14052

Authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants