-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Improve ergonomics of Scalar #4704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like an improvement to me. I sill think ScalarValue::new_int etc would be valuable
| /// See [`Datum`] for more information | ||
| pub struct Scalar<'a>(&'a dyn Array); | ||
| #[derive(Debug, Copy, Clone)] | ||
| pub struct Scalar<T: Array>(T); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what a user would be looking for is Scalar::new_primitive() and ScalarValue::new_string(). Without thos signatures function, I predict pretty much anyone using Scalars` are going to make some version of the function from #4701
fn make_primitive_scalar<T: num::ToPrimitive + std::fmt::Debug>(
d: &DataType,
scalar: T,
) -> Result<ArrayRef, ArrowError> {
match d {
DataType::Int8 => {
let right = try_to_type!(scalar, to_i8)?;
Ok(Arc::new(PrimitiveArray::<Int8Type>::from(vec![right])))
}
DataType::Int16 => {
let right = try_to_type!(scalar, to_i16)?;
Ok(Arc::new(PrimitiveArray::<Int16Type>::from(vec![right])))
}
DataType::Int32 => {
let right = try_to_type!(scalar, to_i32)?;
Ok(Arc::new(PrimitiveArray::<Int32Type>::from(vec![right])))
}
DataType::Int64 => {
let right = try_to_type!(scalar, to_i64)?;
Ok(Arc::new(PrimitiveArray::<Int64Type>::from(vec![right])))
}
DataType::UInt8 => {
let right = try_to_type!(scalar, to_u8)?;
Ok(Arc::new(PrimitiveArray::<UInt8Type>::from(vec![right])))
}
...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make some version of the function from
I don't follow this, why would they need to be creating a scalar from a mix of a concrete rust type and an arrow DataType? That function is a temporary compatibility shim to map the old scalar comparison kernels onto the new scalar APIs, and will be deleted in the near future.
The major reason I'm pushing back on this is it opens up a can of worms about how do you specify which particular ArrowPrimitiveType, or OffsetSize, you want, or any metadata like timezones, etc... By keeping things array based we avoid all that nonsense
| } | ||
|
|
||
| /// Create a new [`Scalar`] from `v` | ||
| pub fn new_scalar(value: impl AsRef<T::Native>) -> Scalar<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about BooleanArray and DictonaryArray and ListArray?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add BooleanArray, not sure what the signature for nested types should be, and they aren't needed by any kernels anyway so seems unnecessary
Which issue does this PR close?
Closes #.
Rationale for this change
Originally I wanted to avoid Scalar permitting ownership to discourage constructions like
Vec<Scalar>. Unfortunately this had the side-effect of makingScalarquite cumbersome to use. This PR instead changes Scalar to be generic overT: Array, this allows constructions likeIt also allows adding constructor methods to
PrimitiveArrayandByteArray, that improve the ergonomics of constructing such scalars.What changes are included in this PR?
Are there any user-facing changes?
Yes, as
&dyn Array: ?Arraythis is a breaking change.