pitrou (Member) commented Jun 9, 2021

Factor out type-agnostic string operations (such as finding a split pattern) into separate classes, so that they are not compiled several times over when the typed kernel execution classes are generated. This also makes the code slightly easier to understand and maintain (IMHO) by reducing the use of subclassing.
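
To illustrate the idea, here is a minimal sketch and not the code from this PR. `SplitPatternFinder` and `SplitExec` are hypothetical names, and the offsets-plus-data layout is a simplified stand-in for a string array. The pattern search lives in a plain class that is compiled once, and only a thin wrapper is templated on the offset type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical type-agnostic helper. It is not a template, so the compiler
// emits a single copy no matter how many typed kernels use it.
class SplitPatternFinder {
 public:
  explicit SplitPatternFinder(std::string pattern) : pattern_(std::move(pattern)) {
    assert(!pattern_.empty());  // an empty pattern would never advance
  }

  // Split `input` on every occurrence of the pattern.
  std::vector<std::string_view> Split(std::string_view input) const {
    std::vector<std::string_view> parts;
    std::size_t start = 0;
    while (true) {
      const std::size_t pos = input.find(std::string_view(pattern_), start);
      if (pos == std::string_view::npos) {
        parts.push_back(input.substr(start));
        return parts;
      }
      parts.push_back(input.substr(start, pos - start));
      start = pos + pattern_.size();
    }
  }

 private:
  std::string pattern_;
};

// Hypothetical typed wrapper. Only this thin layer is instantiated once per
// offset type (e.g. int32_t and int64_t); the search logic is shared.
template <typename OffsetType>
struct SplitExec {
  // For each value delimited by consecutive offsets, count the split parts.
  static std::vector<OffsetType> CountParts(const std::vector<OffsetType>& offsets,
                                            std::string_view data,
                                            const SplitPatternFinder& finder) {
    std::vector<OffsetType> counts;
    for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
      const auto begin = static_cast<std::size_t>(offsets[i]);
      const auto length = static_cast<std::size_t>(offsets[i + 1] - offsets[i]);
      counts.push_back(
          static_cast<OffsetType>(finder.Split(data.substr(begin, length)).size()));
    }
    return counts;
  }
};
```

With this shape, instantiating `SplitExec<int32_t>` and `SplitExec<int64_t>` duplicates only the small loop, not the search itself.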

Also fix a bug where some kernels would error out on invalid UTF-8 data even when that data is only the masked payload of a null value.
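
The null-handling fix can be pictured with the following sketch. It is an assumption-laden illustration, not the actual kernel code: `LooksLikeUtf8` is a simplified structural check (it does not reject overlong encodings or surrogates), and `Utf8LengthSkippingNulls` is a hypothetical kernel that uses a `std::vector<bool>` in place of a validity bitmap. The point is only that bytes behind a null slot are never inspected.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Simplified structural UTF-8 check: verifies lead/continuation byte patterns
// only. A real validator would also reject overlong forms and surrogates.
bool LooksLikeUtf8(std::string_view s) {
  std::size_t i = 0;
  while (i < s.size()) {
    const auto b = static_cast<unsigned char>(s[i]);
    std::size_t extra = 0;
    if (b < 0x80) {
      extra = 0;
    } else if ((b & 0xE0) == 0xC0) {
      extra = 1;
    } else if ((b & 0xF0) == 0xE0) {
      extra = 2;
    } else if ((b & 0xF8) == 0xF0) {
      extra = 3;
    } else {
      return false;  // stray continuation byte or invalid lead byte
    }
    if (i + extra >= s.size()) return false;  // truncated sequence
    for (std::size_t k = 1; k <= extra; ++k) {
      if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return false;
    }
    i += extra + 1;
  }
  return true;
}

// Hypothetical kernel: writes the number of codepoints of each valid value.
// Null slots get 0 and their payload is skipped without being validated.
// Returns false only if a non-null value holds invalid UTF-8.
bool Utf8LengthSkippingNulls(const std::vector<std::string>& values,
                             const std::vector<bool>& is_valid,
                             std::vector<int64_t>* out) {
  out->assign(values.size(), 0);
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (!is_valid[i]) continue;  // masked payload: do not look at the bytes
    if (!LooksLikeUtf8(values[i])) return false;
    int64_t codepoints = 0;
    for (const char c : values[i]) {
      // Every byte that is not a continuation byte starts a codepoint.
      if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) ++codepoints;
    }
    (*out)[i] = codepoints;
  }
  return true;
}
```

In this sketch, garbage bytes stored under a null entry do not cause a failure, while the same bytes in a non-null entry still do.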

pitrou requested a review from lidavidm on June 9, 2021, 17:24
pitrou (Member, Author) commented Jun 9, 2021

This reduces the code size of compute/kernels/scalar_string.cc.o by about 5% (in release mode). Not a terrific improvement, but a worthwhile cleanup IMHO.

lidavidm (Member) left a comment

LGTM.

pitrou closed this in 9839eb4 on Jun 9, 2021
pitrou deleted the ARROW-12951-string-transform-refactor-v2 branch on September 22, 2021, 16:04