Join should only construct the joined columns that are needed by its parent operator #8296

@windtalker

Enhancement

In TiFlash, the output schema of a join is [left_columns, right_columns]. For example, for the query

select a.id from a join b on a.id = b.id and a.id1 = b.id1 and a.id2 = b.id2 and a.id3 = b.id3

the output schema of the join is [a.id, a.id1, a.id2, a.id3, b.id, b.id1, b.id2, b.id3].

However, only a.id is actually needed by the parent operator. Inside the join, combining the left columns with the right columns is expensive: each left column has to be expanded by calling replicateRange, and each right column has to be filled row by row using AddFound. If the join returns only the columns used by its parent operators, it avoids this unnecessary work and improves join performance.
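To make the idea concrete, here is a minimal sketch of a toy hash join that materializes only the columns its parent asks for. It is not TiFlash code: `hash_join`, the `required` parameter, and the dict-of-lists table layout are all hypothetical, and the list comprehensions in the materialize phase only stand in for the per-column copying that replicateRange and AddFound perform in the real join.

```python
from collections import defaultdict


def hash_join(left, right, key_pairs, required=None):
    """Toy inner hash join over column-oriented tables (dict: name -> list).

    key_pairs: list of (left_col, right_col) equality conditions.
    required:  column names the parent operator needs (hypothetical
               parameter); None means emit every column.
    """
    n_left = len(next(iter(left.values())))
    n_right = len(next(iter(right.values())))

    # Build phase: hash the right side on its join keys.
    table = defaultdict(list)
    for r in range(n_right):
        table[tuple(right[rc][r] for _, rc in key_pairs)].append(r)

    # Probe phase: record matching (left_row, right_row) index pairs only.
    matches = []
    for l in range(n_left):
        key = tuple(left[lc][l] for lc, _ in key_pairs)
        matches.extend((l, r) for r in table.get(key, ()))

    # Materialize phase: copy values only for the columns that are required.
    out = {}
    for name, col in left.items():
        if required is None or name in required:
            out[name] = [col[l] for l, _ in matches]
    for name, col in right.items():
        if required is None or name in required:
            out[name] = [col[r] for _, r in matches]
    return out


a = {"a.id": [1, 2, 3], "a.id1": [10, 20, 30]}
b = {"b.id": [1, 1, 3], "b.id1": [10, 10, 30]}
keys = [("a.id", "b.id"), ("a.id1", "b.id1")]

full = hash_join(a, b, keys)                       # builds all four columns
pruned = hash_join(a, b, keys, required={"a.id"})  # builds a.id only
```

The join keys are still read during the build and probe phases in both calls. The difference is that the pruned call skips the materialize step for every column the parent does not use, which is the cost this issue proposes to avoid.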

Labels: type/enhancement (The issue or PR belongs to an enhancement.)
