Columns: optimize ColumnString filter when selectivity is high #9987
ti-chi-bot[bot] merged 6 commits into pingcap:master
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
dbms/src/Columns/filterColumn.cpp
    if (index + length == FILTER_SIMD_BYTES)
        break;
How about adding a predefined mask table?
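#include <array>
#include <cstdint>

// MASKS[i] keeps bits [i, 64) and clears bits [0, i); MASKS[64] is 0.
// A lambda cannot return a C array, so std::array is used (C++17).
constexpr std::array<uint64_t, 65> MASKS = [] {
    std::array<uint64_t, 65> masks{};
    for (int i = 0; i < 64; ++i)
        masks[i] = ~((1ULL << i) - 1);
    masks[64] = 0;
    return masks;
}();
mask &= MASKS[index + length];
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>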
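The table has 65 entries because shifting a 64-bit value by 64 is undefined behavior in C++, so `~((1ULL << n) - 1)` cannot be evaluated for `n == 64`. The following is a minimal sketch of that idea, not code from the PR: the names `LOW_CLEAR` and `clearLowBits` are hypothetical, and it assumes C++17 for the constexpr lambda.

```cpp
#include <array>
#include <cstdint>

// Hypothetical table mirroring the suggestion above:
// LOW_CLEAR[n] keeps bits [n, 64) and clears bits [0, n).
constexpr std::array<uint64_t, 65> LOW_CLEAR = [] {
    std::array<uint64_t, 65> t{};
    for (int i = 0; i < 64; ++i)
        t[i] = ~((1ULL << i) - 1);
    t[64] = 0; // a shift by 64 would be undefined, so the value is stored explicitly
    return t;
}();

// Hypothetical helper: clear the lowest n bits of mask, valid for 0 <= n <= 64.
constexpr uint64_t clearLowBits(uint64_t mask, unsigned n)
{
    return mask & LOW_CLEAR[n];
}

// Compile-time checks, including the n == 64 case that needs no special branch.
static_assert(clearLowBits(0xFFULL, 4) == 0xF0ULL, "low nibble cleared");
static_assert(clearLowBits(~0ULL, 64) == 0, "all bits cleared");
```

With a lookup like this, a run that ends exactly at bit 64 simply leaves the mask at zero, so the surrounding loop can terminate on `mask == 0` instead of an explicit `index + length == FILTER_SIMD_BYTES` check.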
windtalker left a comment
Do we have any existing tests for this?
Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: gengliqi, windtalker
/cherry-pick release-9.0-beta.1
@JaySon-Huang: new pull request created in response to the cherry-pick command.
#10036)
ref #9699, close #10029

Optimize the performance of the ColumnString filter when the selectivity of the filter is high.

For example, when the filter is `0111111111111111011111111111111101111111111111110111111111111111`, the mask will be `11111111111111110111111111111111101111111111111111011111111111111110`. Since the mask does not match `[0]*[1]+` or `[1]+[0]*`, we previously had to copy each selected row one by one. Now we can copy 15 rows at once.

The total elapsed time of TPC-H 50 is reduced from 42.9s to 41.1s.

Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
Signed-off-by: JaySon-Huang <tshent@qq.com>
Co-authored-by: Lloyd-Pottiger <yan1579196623@gmail.com>
Co-authored-by: Lloyd-Pottiger <60744015+Lloyd-Pottiger@users.noreply.github.com>
Co-authored-by: JaySon-Huang <tshent@qq.com>
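The run-based copy described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `filterColumn.cpp`: `countTrailingZeros` and `findRuns` are hypothetical helpers, and bit `i` of the mask is assumed to correspond to row `i` of a 64-row block. Each returned `(index, length)` pair marks a contiguous range of selected rows that could be copied with a single bulk copy instead of row by row.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Portable count of trailing zero bits; x must be non-zero.
inline unsigned countTrailingZeros(uint64_t x)
{
    unsigned n = 0;
    while ((x & 1) == 0)
    {
        x >>= 1;
        ++n;
    }
    return n;
}

// Hypothetical helper: return (index, length) for every maximal run of
// set bits in mask, scanning from the lowest bit upward.
inline std::vector<std::pair<unsigned, unsigned>> findRuns(uint64_t mask)
{
    std::vector<std::pair<unsigned, unsigned>> runs;
    while (mask != 0)
    {
        const unsigned index = countTrailingZeros(mask);
        // Shift the run down to bit 0, then count its consecutive ones
        // by counting the trailing zeros of the inverted value.
        const uint64_t shifted = mask >> index;
        const unsigned length = (~shifted == 0) ? 64 : countTrailingZeros(~shifted);
        runs.emplace_back(index, length);
        if (index + length == 64)
            break; // the run reaches the top bit; nothing is left to scan
        // Clear every bit below the end of this run (shift amount is < 64 here).
        mask &= ~((1ULL << (index + length)) - 1);
    }
    return runs;
}
```

Under the bit-order assumption above, a block in which rows 1 to 15 of every 16 are selected is the mask `0xFFFEFFFEFFFEFFFE`, and `findRuns` yields four runs of length 15 starting at bits 1, 17, 33, and 49, so four bulk copies replace 60 single-row copies.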
What problem does this PR solve?
Issue Number: ref #9699, close #10029
Problem Summary:
What is changed and how it works?
A follow-up optimization to #9670.
Check List
Tests
Side effects
Documentation
Release note