-
Notifications
You must be signed in to change notification settings - Fork 1.1k
perf: Use Cow in get_format_string in FFI_ArrowSchema #6853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| fn get_format_string(dtype: &DataType) -> Result<String, ArrowError> { | ||
| fn get_format_string(dtype: &DataType) -> Result<Cow<'static, str>, ArrowError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we don't need it to be owned. Maybe just &str?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying we can use &str return instead of Cow? some of the data types use format! to create new strings so I don't think that can work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I didn't see there are Cow::Owned.
|
It fails msrv check. Otherwise the performance looks good. |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andygrove and @viirya -- this looks good to me as well (modulo the test failures)
arrow-schema/src/ffi.rs
Outdated
| DataType::Struct(_) => Ok("+s".to_string()), | ||
| DataType::Map(_, _) => Ok("+m".to_string()), | ||
| DataType::RunEndEncoded(_, _) => Ok("+r".to_string()), | ||
| DataType::Null => Ok(Cow::Borrowed("n")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This syntax is totally fine (and very explicit)
I think you can express the same thing using into which is less verbose:
For example
| DataType::Null => Ok(Cow::Borrowed("n")), | |
| DataType::Null => Ok("n".into()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is much nicer. Updated.
| } | ||
|
|
||
| fn get_format_string(dtype: &DataType) -> Result<String, ArrowError> { | ||
| fn get_format_string(dtype: &DataType) -> Result<Cow<'static, str>, ArrowError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strictly speaking this is an API change, but since the next scheduled release is a major one (allows API changes) we can merge it in to main without worry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this function is exposed as part of the public API for this crate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is only used internally.
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andygrove and @viirya
* add cast_decimal bench * format * save * revert * criterion disable default features * address feedback
Which issue does this PR close?
N/A
Rationale for this change
We're using a lot of FFI in Comet and are looking for any performance wins we can get.
What changes are included in this PR?
Use
Cowinget_format_stringto reduce some string allocations.Are there any user-facing changes?
No