Implicit key type conversion for join using step in pipeline#19885
Implicit key type conversion for join using step in pipeline#19885
Conversation
35ac215 to
8d7a135
Compare
8d7a135 to
4ec9636
Compare
src/Interpreters/join_common.cpp
Outdated
| { | ||
| assert(left_keys.size() == right_keys.size()); | ||
|
|
||
| /// only JOIN USING supported |
There was a problem hiding this comment.
The fact that only USING is supported is clear enough from the code, the comment should probably tell "why".
There was a problem hiding this comment.
I've added comment to TreeRewriter, where we start to deal with join column conversion and to InterpreterSelectQuery where step added to query plan.
https://github.com/ClickHouse/ClickHouse/pull/19885/files#diff-03665017349505c5529e5d0eeeeae80081e064267d6035b94cd428baf11856ceR417
https://github.com/ClickHouse/ClickHouse/pull/19885/files#diff-5452971ae18972aa669265008a29b3732a99d9415f5b3af10d573c0807667a22R978
src/Interpreters/TableJoin.cpp
Outdated
| if (leftBecomeNullable(col.type)) | ||
| col.type = makeNullable(col.type); | ||
|
|
||
| std::unordered_map<String, DataTypePtr> type_map; |
There was a problem hiding this comment.
So the type information in the input block here is "wrong" (before conversion), and we have to overwrite it w/ the converted values? But why? Can the input block be made to already have the proper types?
There was a problem hiding this comment.
I guess, when actions for converting are in a separate step, this code won't be needed ...
There was a problem hiding this comment.
I moved common type calculation into TreeRewriter and now TableJoin contains information about converted types on earlier stage, so it is possible to change types in addJoinedColumn and have correct types in columns_added_by_join
akuzm
left a comment
There was a problem hiding this comment.
It's nice that this patch is much smaller than the previous version.
I think it would be helpful to add a high-level comment somewhere, that describes what we do -- e.g. we can convert USING columns to their common supertype by inserting additional casts into ExpressionTransform before join. This only works for USING because etc etc...
4b68c68 to
a6ea5ff
Compare
|
I've found bugs with aggregation due to fact that in analyzeAggregation corrected types are not calculated yet. Common type calculated moved to earlier stage to It fixes bug and also I suppose makes code a bit clearer |
| else | ||
| { | ||
| auto new_dag = ActionsDAG::merge( | ||
| std::move(*joined_block_actions->getActionsDAG().clone()), |
There was a problem hiding this comment.
Should we split joined_block_actions into separate variables not to use merge here? Now it consist from createJoinedBlockActions and additional converting step. @KochetovNicolai
There was a problem hiding this comment.
As I understand, this actions are applied to right part of join. And we don't use separate step for this actions for now. So, I think it is ok to merge.
Split before_join and converting_join_columns dags Add more detailed comments for column type conversion for join using
0aa2434 to
1a26310
Compare
|
I've added implementation of type conversion for |
KochetovNicolai
left a comment
There was a problem hiding this comment.
I did not read tests, but changes in code LGTM.
|
Integration test failure doesn't look relevant. Should we merge? |
I found one issue, query doesn't work if syntax Likely after some optimization some required columns are dropped. I'm trying to figure it out. |
Looks kinda bad, probably better to investigate. Do I remember correctly that you keep the name of the column, but replace it with the casted one? So it should have been named just |
…bleJoin" This reverts commit e6406c3.
Fix error with wrong right column type in aggregate with nulls Add logging inferred type for join keys Add test for join_use_nulls
|
Internal documentation ticket DOCSUP-9868 |
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Supports implicit key type conversion for JOIN. Closes #18567.
Ref PR #18672