-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-13853: [R] String to_title, to_lower, to_upper kernels #11232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-13853: [R] String to_title, to_lower, to_upper kernels #11232
Conversation
180d12f to
f6c29d0
Compare
|
@github-actions crossbow submit test-r-linux-as-cran |
|
Revision: f6c29d0379efb31fe4e41c959356bc484c4ca43c Submitted crossbow builds: ursacomputing/crossbow @ actions-859
|
jonkeane
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks good — a few comments and running our as-cran crossbow build to confirm that this doesn't error on that.
a53004b to
42f6860
Compare
jonkeane
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is up with the tests, I've added a suggestion that should get around that. If it persists or you want me to dig into this deeper let me know and I'll pull the branch locally and figure out what's going wrong here.
|
Thanks all for the great reviews! |
|
Interesting...the failed tests are because # R stringr
> str_to_title("1Foo1") # "1foo1"
# Python
>>> "1Foo1".title() # "1Foo1"I conclude that this also applies for "is_title()" function because it ignores beginning integers...well... What to do in such a case? cc @jonkeane @nealrichardson |
|
Hmm, practically speaking I suspect this won't matter to all that many people. We have a few options:
I suspect that 2 is probably the easiest right now. |
|
And if you're looking for other ways this could be even more hairy: https://stat.ethz.ch/R-manual/R-devel/library/tools/html/toTitleCase.html
|
|
Yeah, when I was implementing the For this PR, the R binding calls the C++ kernels directly, so it will produce different results for extreme cases when compared to stringr. If users find issues or request more complex functionality, then this can be revisited in a future PR. |
nealrichardson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI failure is unrelated (failure to install an optional dependency)
This PR adds locale validation to string functions matching stringr: str_to_title, str_to_lower, and str_to_upper.
In Arrow C++, these kernels do not support a locale argument so only the default R locale ("en") is currently supported.
Closes apache#11232 from edponce/ARROW-13853-String-title-case-kernel
Authored-by: Eduardo Ponce <edponce00@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
This PR adds locale validation to string functions matching stringr: str_to_title, str_to_lower, and str_to_upper.
In Arrow C++, these kernels do not support a locale argument so only the default R locale ("en") is currently supported.