@johanpel
Contributor

@johanpel johanpel commented Dec 7, 2021

No description provided.

github-actions bot commented Dec 7, 2021

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

Member

@lidavidm lidavidm left a comment

Thanks!

Can we also add this function to https://github.com/apache/arrow/blob/master/docs/source/cpp/compute.rst ? (It should be in the same table as 'ascii_reverse' and 'utf8_reverse'.) We should document that this one does not make any assumptions about the encoding.
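To illustrate what "does not make any assumptions about the encoding" means, here is a minimal plain-Python sketch. It is not the Arrow kernel from this PR; `reverse_bytes` is a hypothetical helper that models a binary column as a list of `bytes` values with `None` standing in for nulls.

```python
from typing import List, Optional


def reverse_bytes(values: List[Optional[bytes]]) -> List[Optional[bytes]]:
    """Reverse each element byte by byte; None (null) entries pass through.

    Hypothetical illustration only: the bytes are never decoded, so any
    byte sequence is acceptable input.
    """
    return [None if v is None else v[::-1] for v in values]


# The first value is not valid UTF-8, which is fine for a byte-wise reversal.
column = [b"\xff\x00\x01", None, b"", b"abc"]
reversed_column = reverse_bytes(column)
```

Because nothing is interpreted, applying the helper twice returns the original column.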

@lidavidm
Member

lidavidm commented Dec 7, 2021

Oh, and it needs to be listed in https://github.com/apache/arrow/blob/master/docs/source/python/api/compute.rst as well (again alongside its ascii/utf8 counterparts).

@johanpel
Contributor Author

johanpel commented Dec 8, 2021

Thanks for the comments. I added the docs. I also added a test because the function can be applied to String/LargeString as well. I'm not sure whether we want to prevent applying this function to String/LargeString; use cases would be quite exotic.

If we'd like to prevent it I could introduce BinaryTypes() with {binary(), large_binary()} in arrow/type.h and use that to add the kernel to the registry only for these types, but I'm not sure if this would have further implications for the rest of the codebase.

@lidavidm
Member

lidavidm commented Dec 8, 2021

Oh, whoops. No, good point: we should not support string types here; we should register the kernel only for Binary and LargeBinary. For String/LargeString the kernel would need to validate that its output is still valid UTF-8, which would be redundant with utf8_reverse.
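The validation concern can be seen in a small plain-Python sketch (again not Arrow code; `is_valid_utf8` is a hypothetical helper). Reversing the bytes of a valid UTF-8 string can split a multi-byte character, whereas a codepoint-aware reversal, which is what a UTF-8 reverse function is expected to do, keeps the result valid.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "héllo"
encoded = text.encode("utf-8")  # 'é' is encoded as two bytes: 0xC3 0xA9

# Byte-wise reversal puts the continuation byte 0xA9 before its lead byte 0xC3.
byte_reversed = encoded[::-1]

# Codepoint-wise reversal keeps each character's bytes together.
codepoint_reversed = text[::-1].encode("utf-8")

print(is_valid_utf8(byte_reversed))       # False
print(is_valid_utf8(codepoint_reversed))  # True
```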

Member

@pitrou pitrou left a comment

Thank you, just two nits.

Member

@lidavidm lidavidm left a comment

Thanks for the update!

@lidavidm lidavidm closed this in b1f009c Dec 10, 2021
@ursabot

ursabot commented Dec 10, 2021

Benchmark runs are scheduled for baseline = 00d5077 and contender = b1f009c. b1f009c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.09% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

4 participants