Conversation
There was a problem hiding this comment.
The crux: having the join config's parts actually in the node info, rather than the join config as a whole.
This makes it so that the fields the Join relies on are actually picked up when we run expressions - and therefore also when we run references.
There was a problem hiding this comment.
Getting rid of the List<Expression> conditions is actually necessary; in a first attempt, I had them part of Join's NodeInfo, but that made tests fail:
- If expressions like
left == rightare part of the node info forJoin(which they should), they are also subject to optimizations. - If
leftis a reference attribute for a literalnull, its data type isNULL. If we applyFoldNull, thusleft == rightwill be optimized tonull.
Our optimizations treat our expressions as things that are to be evaluated, so replacing them by something that evaluates to the same is seen as correct. That is absolutely not okay if the expressions are meant to only symbolize a correspondence between attributes on the left and right hand sides.
There was a problem hiding this comment.
In this case, I think having unit tests for logical plan methods is warranted; making sure that the references are computed correctly is important.
x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/plan/logical/JoinTests.java
Outdated
Show resolved
Hide resolved
86d58d0 to
44196fe
Compare
Randomize; test Join.expressions() as well.
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @alex-spies, I've created a changelog YAML for you. |
| out.writeNamedWriteableCollection(matchFields); | ||
| out.writeCollection(conditions, (o, v) -> v.writeTo(o)); | ||
| out.writeNamedWriteableCollection(leftFields); | ||
| out.writeNamedWriteableCollection(rightFields); |
There was a problem hiding this comment.
This is fine because you can't use this outside of a snapshot build and we don't offer wire compatibility between snapshot builds.
|
Thanks for the review @nik9000 ! |
Join.references()return all the references to attributes in children used for matching the left and right hand sides.leftAttribute == rightAttributeexpressions to express how to match with a table inLOOKUP; these expressions get optimized away sometimes. This requires a breaking change toJoin's serialization.It is important that a call to
Join.references()(resp.Join.expressions()) returns all the references (expressions) that theJoinrelies on; esp. the references may need to be rewritten in optimizations.Ex.: If we wish to push down an
Evalpast aJoin, we may need to adjust attribute names in case of name collisions:In optimization rules, we use machinery like
QueryPlan.transformExpressionsOnly, like here. This relies on a correct implementation of theinfo()method - which also implies correctness ofreferences()andexpressions().