-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Allow dropping qualified columns #19549
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
now we are changing/improving the behavior of
|
renato2099
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my only request/comment would be to find the place where we can update the documentation for the drop_column change, thanks!
| .map(|name| { | ||
| schema | ||
| .qualified_field_from_column(&Column::from_name(*name)) | ||
| .map_err(|_| plan_datafusion_err!("Column '{}' not found", name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better than silently ignoring the error 👍
So the
Also just clarification that |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me
| pub fn drop_columns(self, columns: &[&str]) -> Result<DataFrame> { | ||
| pub fn drop_columns<T>(self, columns: &[T]) -> Result<DataFrame> | ||
| where | ||
| T: Into<Column> + Clone, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering why the Clone bound is needed. In case anyone else is curious -- it turns out it needed because what is passed in is &[T] (vs something like [T])
error[E0277]: the trait bound `datafusion_common::Column: std::convert::From<&T>` is not satisfied
--> datafusion/core/src/dataframe/mod.rs:455:42
|
455 | let column: Column = col.into();
| ^^^^ the trait `std::convert::From<&T>` is not implemented for `datafusion_common::Column`
|
= note: required for `&T` to implement `std::convert::Into<datafusion_common::Column>`
help: consider extending the `where` clause, but there might be an alternative better way to express this requirement
|
450 | T: Into<Column>, datafusion_common::Column: std::convert::From<&T>
| +++++++++++++++++++++++++++++++++++++++++++++++++
| ctx.register_batch("t", batch)?; | ||
|
|
||
| let df = ctx.table("t").await?.drop_columns(&["f.c1"])?; | ||
| let df = ctx.table("t").await?.drop_columns(&["\"f.c1\""])?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this is technically a breaking API change, I think it is reasonable to treat it as a bug fix
|
Thank you @ntjohnson1 and @renato2099 |
This reverts commit 52d153e.
…/drop_ambiguous_columns
Which issue does this PR close?
Rationale for this change
Explanation in the issue. Motivation coming more concretely from datafusion-python apache/datafusion-python#1305 (comment)
What changes are included in this PR?
df.col(<name>)to match the expr version but then we probably do repeated checks for the underlying table name unless there is some caching somewhere. Maybe that performance impact isn't significant.Are these changes tested?
Yes some additional tests are provided.
Are there any user-facing changes?
I had to update the
drop_columns(&[])test since the type can no longer be inferred. I'm not sure if that is representative of any actual use cases though since I expect the more common is a vector that might be empty in which case the type would be specified.It now requires specifying columns with dots in them similar to other places
"\"f.col1\""to disambiguate from"f.col1".