-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Perform type coercion for corr aggregate function #15776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perform type coercion for corr aggregate function #15776
Conversation
1933efe to
d8a7641
Compare
|
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. |
|
PR is active. |
| }; | ||
|
|
||
| let physical_args = | ||
| // Create base physical expressions (without coercion) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think typically coercsion is applied ealier in the planning proesses, not after physiscal expressions have been crated
Did you consider applying the coercsion along with the other coercion ?
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR @kumarlokesh 🙏
| let array_x = downcast_array::<Float64Array>(array_x); | ||
| let array_y = &cast(&values[1], &DataType::Float64)?; | ||
| let array_y = downcast_array::<Float64Array>(array_y); | ||
| let array_x = downcast_array::<Float64Array>(&values[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like this code only handles Float64 but the signature of the function reports that it accepts any numeric type:
impl Correlation {
/// Create a new COVAR_POP aggregate function
pub fn new() -> Self {
Self {
signature: Signature::uniform(2, NUMERICS.to_vec(), Volatility::Immutable),
}
}
}I wonder if you just changed the signature to say the function needs Float64 argument types, woudl that be enough?
DataFusion already has a bunch of coercion rules, see https://docs.rs/datafusion/latest/datafusion/logical_expr/type_coercion/index.html for example
|
I am sorry for the delayed review |
|
Marking as draft as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is ready for another look |
e8a9364 to
14e40a6
Compare
14e40a6 to
bcb761a
Compare
bcb761a to
bacff91
Compare
bacff91 to
1e4f2e9
Compare
|
@alamb thanks for the review! |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @kumarlokesh -- this looks good to me
Thanks for sticking with it
Maybe you can review the content of datafusion/sqllogictest/test_files/corr_type_coercion.slt one more time to make sure it is adding new coverage but otherwise it is ready to merge in my mind
| Self { | ||
| signature: Signature::uniform(2, NUMERICS.to_vec(), Volatility::Immutable), | ||
| signature: Signature::exact( | ||
| vec![DataType::Float64, DataType::Float64], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| group1 1 1 | ||
| group2 2 1 | ||
|
|
||
| # Verify the benchmark query's physical plan to confirm type coercion is happening |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't really see any verification in this plan -- I would expect to see casting happening in the datasource of something
| @@ -0,0 +1,248 @@ | |||
| # Licensed to the Apache Software Foundation (ASF) under one | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how much coverage this file provides compared to what was added in aggregate.slt
However, I think it is ok to keep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb Test file datafusion/sqllogictest/test_files/corr_type_coercion.slt doesn't seem to add enough new coverage than what's already covered in aggregate.slt.
Removed redundant test file.
|
Thanks again @kumarlokesh |
Which issue does this PR close?
corraggregate function #13721.Rationale for this change
What changes are included in this PR?
CorrelationGroupsAccumulator.update_batchmethodAre these changes tested?
Are there any user-facing changes?