-
Notifications
You must be signed in to change notification settings - Fork 1.9k
chore: remove panics in datafusion-common::scalar by making more operations return Result
#7901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: remove panics in datafusion-common::scalar by making more operations return Result
#7901
Conversation
|
The CI appears to be failing. Marking as draft until they are passing. If there are specific questions about this PR, please let us know. |
|
Thank you for the work @junjunjd |
863ed9a to
4b61807
Compare
|
@alamb CI is fixed. This MR is ready for final review. |
Weijun-H
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @junjunjd 👍
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epic work @junjunjd thanks
I'm thinking if we can get rid of .expect ? Or at least provide more details in expect message?
I removed the |
4b61807 to
60d1543
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get rid of .expect in tests 🤔?
|
@Weijun-H @comphead using unwrap and expect in tests is actually the preferred practice, see https://github.com/influxdata/influxdb/blob/main/docs/style_guide.md#dont-return-result-from-test-functions. It makes the test failure easier to parse for a human and the test framework will already provide all the necessary context on failure. |
Weijun-H
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @junjunjd
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR @junjunjd. We very much appreciate the effort -- I am sorry that the ticket description may have been misleading
Buried in #3313, it says
The goal is not to remove all panics but review and make sure we are using them appropriately. Bonus points for adding documentation for invariants.
Can you explain why you removed the panics that you did? I think most of them are "unreachable" so forcing client code to check for errors that will never happen makes it harder to work with (and is why this PR adds around 500 new lines of code)
|
|
||
| let lt_res = | ||
| arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap(); | ||
| arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This panic is reachable, for example if arr1 and arr2 have different data type, arrow::compute::kernels::cmp::lt will panic.
I think it makes sense to return None here instead of panicking and exiting since user just performs a partial order comparison.
This does not require any code change in client side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The potential downside of returning None rather than panic'ing is that it may mask a real bug and make it harder to track down -- comparing scalars of different types likely means they should have been coerced before
| ), | ||
| }, | ||
| ScalarValue::Fixedsizelist(..) => { | ||
| unimplemented!("FixedSizeList is not supported yet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb What would be the preferred way to handle unimplemented errors in datafusion? There are many places where a NotImplemented error is returned instead of using unimplemented! and panicking. IMO returning an error makes more sense as user can choose to ignore unimplemented errors instead of panicking and exiting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree returning NotYetImplemented is a better choice
|
@alamb Thanks for the review. The majority of the added lines should be caused by line wrap reformat in tests. I have added comments to the panics I removed in scalar.rs. To summarize, these panics can be categorized into five types in general:
|
Result
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thank you to @junjunjd for investing so much time, not just in the code but also evaluating the implications of the changes.
In general I think there are tradeoffs between panic and returning Errs -- specifically:
- Panic's are not as user friendly, but they stop computation immediately when something "unexpected" happens and thus are often easier to debug and locate the problem
- Errs are more user friendly, and can return messages that may help users workaround/fix whatever is wrong.
I realize there is a judgement call required to decide if something is "expected" or not and how much information users can get from error messages vs panics and weighing off a better user experience vs code that is more efficient to debug
I also realize the existing DataFusion codebase is not consistent in its handling of panics and errors.
On the balance I think this PR is an improvement to the code, and therefore I think it could be merged. Users of the affected APIs can simply unwrap the Result and get the same panic behavior as before.
I think it would be ok not to merge it too if other reviewers feel strongly in the other direction.
|
|
||
| let lt_res = | ||
| arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap(); | ||
| arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The potential downside of returning None rather than panic'ing is that it may mask a real bug and make it harder to track down -- comparing scalars of different types likely means they should have been coerced before
| ), | ||
| }, | ||
| ScalarValue::Fixedsizelist(..) => { | ||
| unimplemented!("FixedSizeList is not supported yet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree returning NotYetImplemented is a better choice
|
@junjunjd can you please merge and resolve the conflicts in this PR? Then we can merge it in |
|
Marking as draft as it needs conflicts resolved prior to being mergable. |
60d1543 to
691ebf4
Compare
691ebf4 to
2026514
Compare
|
@alamb Thank you for the review! I rebased the MR. It is ready for final review/merge. |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @junjunjd
This reverts commit e642cc2.
Which issue does this PR close?
It removes the majority of panics in datafusion-common::scalar #3313.
Rationale for this change
Important move towards closing #3313
Closes #3313
What changes are included in this PR?
Replace most of the panics in datafusion-common::scalar by
internal_err,not_impl_error otherDataFusionErrorvariantsAre these changes tested?
Yes
Are there any user-facing changes?
No