-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-11330: [Rust][DataFusion] add ExpressionVisitor to encode expression walking #9278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
||
| /// Recursively walk an expression tree, collecting the unique set of column names | ||
| /// referenced in the expression | ||
| pub fn expr_to_column_names(expr: &Expr, accum: &mut HashSet<String>) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty good example of the kind of repetition that can be removed using this visitor pattern.
Note that I still left all expr types enumerated so that anyone who adds a new Expr type need to update this code, and (hopefully) think if they need to add special handling for that new expr types
|
I took a quick look and this looks good to me. I will try and review in more detail tonight. |
jorgecarleitao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is beautiful. Thank you, @alamb
|
I just wanted to add that there is no need to wait for me to review before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This should make it a bit less verbose / and error prone to write things like optimization rules.
I think we may just want to try to use it, if we find another pattern / approach works better we can adapt / improve it.
Performance-wise, it doesn't compile to a handwritten for loop of course (or even plain recursion), but for most foreseeable use cases, except some extreme examples, I think that should be fine for now.
|
Thanks @jorgecarleitao and @Dandandan . I am personally quite excited at using this same idea for expression rewriting. |
Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
…ssion walking ## Problem: * There are several places in the DataFusion codebase where a walk of an Expression tree is needed * The logic of how to walk the tree is replicated * Adding new expression types often require many mechanically different but semantically the same changes in many places where no special treatment of such types is needed This PR introduces a `ExpressionVisitor` trait and the `Expr::accept` function to consolidate this walking of the expression tree. It does not intend to change any functionality. If folks like this pattern, I have ideas for a similar type of trait `ExpressionRewriter` which can be used to rewrite expressions (much like `clone_with_replacement`) as a subsquent PR. I think this was mentioned by @Dandandan in the [Rust roadmap](https://docs.google.com/document/d/1qspsOM_dknOxJKdGvKbC1aoVoO0M3i6x1CIo58mmN2Y/edit#heading=h.kstb571j5g5j) cc @jorgecarleitao @Dandandan and @andygrove Closes #9278 from alamb/alamb/expression_visitor Authored-by: Andrew Lamb <andrew@nerdnetworks.org> Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
Problem:
This PR introduces a
ExpressionVisitortrait and theExpr::acceptfunction to consolidate this walking of the expression tree. It does not intend to change any functionality.If folks like this pattern, I have ideas for a similar type of trait
ExpressionRewriterwhich can be used to rewrite expressions (much likeclone_with_replacement) as a subsquent PR. I think this was mentioned by @Dandandan in the Rust roadmapcc @jorgecarleitao @Dandandan and @andygrove