Planner: support Join#5320
Conversation
|
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. DetailsReviewer can indicate their review by submitting an approval review. |
|
/run-all-tests |
Coverage for changed filesCoverage summaryfull coverage report (for internal network access only) |
| Expression: <cast after aggregation> | ||
| SharedQuery: <restore concurrency> | ||
| ParallelAggregating, max_threads: 10, final: true | ||
| Expression x 10: <before aggregation> |
There was a problem hiding this comment.
projection action to remove useless column.
The output columns of join are {r_a, r_b, join_c, l_a, l_b, join_c}.
The columns agg needed are {r_a, join_c}
There was a problem hiding this comment.
Because now PhysicalPlan::outputAndOptimize can work on Join?
There was a problem hiding this comment.
yes, after optimize, Useless columns can be destructed earlier.
| for (const auto & col : sample_block) | ||
| schema.emplace_back(col.name, col.type); | ||
| return std::make_shared<PhysicalSource>("source", schema, log->identifier(), sample_block, source_streams); | ||
| return std::make_shared<PhysicalSource>(executor_id, schema, log->identifier(), sample_block, source_streams); |
There was a problem hiding this comment.
In order for the physical node to get the executor_id of the actual child
| auto left = popBack(); | ||
|
|
||
| // use for gtest_physical_plan | ||
| if (dagContext().isTest() && right->tp() != PlanType::Source) |
There was a problem hiding this comment.
After DAGQueryBlock removed, it will be removed
| physical_plan.buildSource(input_streams); | ||
| RUNTIME_ASSERT(!input_streams_vec[i].empty(), log, "input streams cannot be empty"); | ||
| assert(query_block.children[i] && query_block.children[i]->root && query_block.children[i]->root->has_executor_id()); | ||
| physical_plan.buildSource(query_block.children[i]->root->executor_id(), input_streams_vec[i]); |
There was a problem hiding this comment.
query_block.children[i]->root->executor_id()
In order for the physical node to get the executor_id of the actual child
| Expression: <cast after aggregation> | ||
| SharedQuery: <restore concurrency> | ||
| ParallelAggregating, max_threads: 10, final: true | ||
| Expression x 10: <before aggregation> |
| Expression: <append join key and join filters for probe side> | ||
| Expression: <final projection> | ||
| MockTableScan | ||
| Expression x 10: <before aggregation> |
| MockExchangeReceiver | ||
| Expression x 20: <remove useless column after join> | ||
| NonJoined: <add stream with non_joined_data if full_or_right_join>)"; | ||
| Expression x 20: <before aggregation> |
| Expression: <append join key and join filters for probe side> | ||
| Expression: <final projection> | ||
| MockExchangeReceiver | ||
| Expression x 20: <before aggregation> |
cfd6b91 to
2dc2a34
Compare
|
/run-all-tests |
Coverage for changed filesCoverage summaryfull coverage report (for internal network access only) |
05fd9bd to
7693e14
Compare
|
/run-all-tests |
Coverage for changed filesCoverage summaryfull coverage report (for internal network access only) |
| RUNTIME_ASSERT(!input_streams.empty(), log, "input streams cannot be empty"); | ||
| physical_plan.buildSource(input_streams); | ||
| RUNTIME_ASSERT(!input_streams_vec[i].empty(), log, "input streams cannot be empty"); | ||
| assert(query_block.children[i] && query_block.children[i]->root && query_block.children[i]->root->has_executor_id()); |
There was a problem hiding this comment.
It is not true for some list based executors(sent from TiSpark), do we plan to support that in Planner?
There was a problem hiding this comment.
for list based executors, input_streams_vec.size() == 0.
So no error will occur here for list based executors :)
There was a problem hiding this comment.
do we plan to support that in Planner?
yes
There was a problem hiding this comment.
why list based executors has input_streams_vec.size() == 0 ?
does it take another interpreter path?
There was a problem hiding this comment.
Because for list based executors , there will only be one query block.
And the source of query block will only be table scan.
There was a problem hiding this comment.
for TableScan, the input_streams_vec must be empty
| Expression: <cast after aggregation> | ||
| SharedQuery: <restore concurrency> | ||
| ParallelAggregating, max_threads: 10, final: true | ||
| Expression x 10: <before aggregation> |
There was a problem hiding this comment.
Because now PhysicalPlan::outputAndOptimize can work on Join?
Co-authored-by: Zhi Qi <30543181+LittleFall@users.noreply.github.com>
|
/merge |
|
@SeaRise: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests You only need to trigger If you have any questions about the PR merge process, please refer to pr process. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
This pull request has been accepted and is ready to merge. DetailsCommit hash: a55ff0a |
|
/merge |
|
@SeaRise: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests You only need to trigger If you have any questions about the PR merge process, please refer to pr process. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
This pull request has been accepted and is ready to merge. DetailsCommit hash: be40431 |
Coverage for changed filesCoverage summaryfull coverage report (for internal network access only) |
What problem does this PR solve?
Issue Number: ref #4739
Problem Summary:
What is changed and how it works?
PhysicalJoinPhysicalSource.executor_id = child_query_block->root->executor_idandFinalProjection.executor_id = child_physical_plan.executor_id, In order for the physical node to get the executor_id of the actual child.Check List
Tests
Side effects
Documentation
Release note