Columns: optimize ColumnString filter when selectivity is high (#9987) #10036
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
ref pingcap#6092

Fix a related regression caused by pingcap#9661.

Before: when one query read packs [start, end) from disk and added them to the cache while another query requested the same packs [start, end), the second query had to copy each pack's data into a new column.

Now: the cached column is returned directly.

Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
Co-authored-by: JaySon-Huang <tshent@qq.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
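The caching behavior described in the commit message above can be illustrated with a minimal sketch. It is not TiFlash code: `PackCache`, `readPacks`, the `Column` alias, and the `read_from_disk` callback are hypothetical names chosen for illustration, and a column is assumed to be an immutable vector shared through `std::shared_ptr`. The point is only that a cache hit hands back the same column object instead of copying pack data into a new one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a "column" is an immutable vector of values,
// and a group of packs is identified by a half-open range [start, end).
using Column = std::vector<int>;
using ColumnPtr = std::shared_ptr<const Column>;
using PackRange = std::pair<std::size_t, std::size_t>;

class PackCache
{
public:
    // Returns the cached column for [start, end), or nullptr on a miss.
    // A hit shares ownership of the existing column; no data is copied.
    ColumnPtr get(std::size_t start, std::size_t end) const
    {
        assert(start < end);
        auto it = cache.find(PackRange(start, end));
        if (it == cache.end())
            return nullptr;
        return it->second;
    }

    void put(std::size_t start, std::size_t end, ColumnPtr col)
    {
        assert(start < end);
        cache[PackRange(start, end)] = std::move(col);
    }

private:
    std::map<PackRange, ColumnPtr> cache;
};

// Reads [start, end) through the cache. `read_from_disk` is a placeholder
// for the real I/O path and is only invoked on a cache miss.
template <typename ReadFn>
ColumnPtr readPacks(PackCache & cache, std::size_t start, std::size_t end, ReadFn read_from_disk)
{
    if (ColumnPtr hit = cache.get(start, end))
        return hit; // second reader: same column object, zero copies
    ColumnPtr col = std::make_shared<Column>(read_from_disk(start, end));
    cache.put(start, end, col);
    return col;
}
```

In this sketch, two calls to `readPacks` for the same range trigger one disk read and return pointers to the same column. Sharing is safe here only because the column is exposed as `const`; a real implementation would also need to handle concurrent access to the cache, which is omitted.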
Also cherry-picked #9994 together with this change to fix the performance regression.

In my local testing cluster, the TiFlash table scan shows a performance regression without these two fixes, although the query as a whole shows no regression. The regression is mainly fixed by #9994: with the commit from #9994, the TiFlash table scan is about 200ms faster. Waiting for the result from the benchmark environment.

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JinheLin, Lloyd-Pottiger
Verified that with this PR, the workload on the benchmark env takes ~263s to run all queries. Without this PR, it takes ~326s.

This is an automated cherry-pick of #9987
What problem does this PR solve?
Issue Number: ref #9699, close #10029
Problem Summary:
What is changed and how it works?
Follow-up optimization to #9670.
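The description does not spell out the change itself, so the following is only an illustration of one general way a string-column filter can benefit from high selectivity, not the code in this PR. It assumes a simplified layout (all values concatenated in `chars`, with `offsets[i]` marking the end of row `i`); `StringColumn` and `filterByRuns` are hypothetical names. When most rows pass the filter, selected rows tend to form long contiguous runs, so each run's bytes can be copied in one bulk operation instead of row by row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified string column (an assumption for this sketch):
// `chars` holds all values back to back; offsets[i] is the end of row i.
struct StringColumn
{
    std::vector<char> chars;
    std::vector<std::size_t> offsets;

    void push(const std::string & s)
    {
        chars.insert(chars.end(), s.begin(), s.end());
        offsets.push_back(chars.size());
    }

    std::string at(std::size_t i) const
    {
        const std::size_t begin = (i == 0) ? 0 : offsets[i - 1];
        return std::string(chars.data() + begin, offsets[i] - begin);
    }
};

// Keeps the rows whose mask entry is non-zero. Consecutive selected rows
// are handled as one run: their bytes are copied with a single bulk insert,
// then the offsets of the run are rebased onto the destination buffer.
StringColumn filterByRuns(const StringColumn & src, const std::vector<std::uint8_t> & mask)
{
    assert(mask.size() == src.offsets.size());
    StringColumn dst;
    const std::size_t rows = mask.size();
    std::size_t i = 0;
    while (i < rows)
    {
        if (!mask[i])
        {
            ++i;
            continue;
        }
        const std::size_t run_begin = i;
        while (i < rows && mask[i])
            ++i; // rows [run_begin, i) are all selected

        const std::size_t byte_begin = (run_begin == 0) ? 0 : src.offsets[run_begin - 1];
        const std::size_t byte_end = src.offsets[i - 1];
        const std::size_t dst_base = dst.chars.size();

        // One bulk copy for the whole run instead of one copy per row.
        dst.chars.insert(dst.chars.end(), src.chars.data() + byte_begin, src.chars.data() + byte_end);
        for (std::size_t r = run_begin; r < i; ++r)
            dst.offsets.push_back(dst_base + (src.offsets[r] - byte_begin));
    }
    return dst;
}
```

With a mask that selects nearly every row, the outer loop executes only a handful of bulk copies. With a sparse mask, runs degenerate to single rows and the approach offers no advantage over a per-row copy, which is why such a path would typically be chosen based on selectivity.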
Check List
Tests
Side effects
Documentation
Release note