
@luozenglin
Contributor

@luozenglin luozenglin commented Jan 7, 2023

Proposed changes

Issue Number: close #15678

related pr: #15558

Problem summary

Pushing a predicate on a double column down to the storage layer causes a core dump, because there is no PredicateCreator for the double type. It looks like no double column predicates were pushed down to the storage layer before, and I'm not sure why.

@github-actions github-actions bot added the area/planner and kind/test labels Jan 7, 2023
@hello-stephen
Contributor

hello-stephen commented Jan 7, 2023

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 35.34 seconds
load time: 483 seconds
storage size: 17123011688 Bytes
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230107033550_clickbench_pr_75222.html

@Gabriel39
Contributor

PTAL @xiaokang

@xiaokang
Contributor

xiaokang commented Jan 7, 2023

PTAL @xiaokang

I will check it.

@xiaokang
Contributor

xiaokang commented Jan 7, 2023

The root cause is that the get_creator function in be/src/olap/predicate_creator.h has no implementation for float/double. For those types get_creator returns its default value, nullptr, and create_predicate then core dumps when it dereferences that pointer in the following code.

template <PredicateType PT, typename ConditionType>
inline ColumnPredicate* create_predicate(const TabletColumn& column, int index,
                                         const ConditionType& conditions, bool opposite,
                                         MemPool* pool) {
    return get_creator<PT, ConditionType>(column.type())
            ->create(column, index, conditions, opposite, pool);
}
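
To make the failure mode concrete, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in (`FieldType`, `IntegerCreator`, `create_predicate_checked`, `check_examples`, and the simplified signatures are assumptions for illustration, not the real Doris types or the actual fix). It only shows how a type switch with no float/double case falls through to nullptr, and how a caller could check the result instead of dereferencing it.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, simplified stand-ins; NOT the real Doris declarations.
enum class FieldType { INT, BIGINT, FLOAT, DOUBLE };

struct ColumnPredicate {
    std::string description;
};

struct PredicateCreator {
    virtual ~PredicateCreator() = default;
    virtual std::unique_ptr<ColumnPredicate> create(const std::string& cond) const = 0;
};

struct IntegerCreator : PredicateCreator {
    std::unique_ptr<ColumnPredicate> create(const std::string& cond) const override {
        auto predicate = std::make_unique<ColumnPredicate>();
        predicate->description = cond;
        return predicate;
    }
};

// Mirrors the reported gap: FLOAT and DOUBLE have no case, so they hit the
// default branch and the caller receives nullptr.
std::unique_ptr<PredicateCreator> get_creator(FieldType type) {
    switch (type) {
    case FieldType::INT:
    case FieldType::BIGINT:
        return std::make_unique<IntegerCreator>();
    default:
        return nullptr;
    }
}

// Guarded variant (an assumption, not the merged change): report "unsupported"
// by returning nullptr instead of calling ->create() on a null creator.
std::unique_ptr<ColumnPredicate> create_predicate_checked(FieldType type,
                                                          const std::string& cond) {
    auto creator = get_creator(type);
    if (creator == nullptr) {
        return nullptr;
    }
    return creator->create(cond);
}

// Usage: an INT condition yields a predicate, a DOUBLE condition does not.
void check_examples() {
    assert(create_predicate_checked(FieldType::INT, "k1 < 5") != nullptr);
    assert(create_predicate_checked(FieldType::DOUBLE, "k8 < 1.5") == nullptr);
}
```

In this sketch the unguarded call chain `get_creator(type)->create(...)` would dereference nullptr for `FieldType::DOUBLE`, which is the shape of the crash described above.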

@xiaokang
Contributor

xiaokang commented Jan 7, 2023

This PR can be merged as a temporary fix. I'll fix get_creator and add test cases for every data type.

@luozenglin
Contributor Author

The root cause is that the get_creator function in be/src/olap/predicate_creator.h has no implementation for float/double. For those types get_creator returns its default value, nullptr, and create_predicate then core dumps when it dereferences that pointer in the following code.

template <PredicateType PT, typename ConditionType>
inline ColumnPredicate* create_predicate(const TabletColumn& column, int index,
                                         const ConditionType& conditions, bool opposite,
                                         MemPool* pool) {
    return get_creator<PT, ConditionType>(column.type())
            ->create(column, index, conditions, opposite, pool);
}

Yes, but what confuses me is why create_predicate was never implemented for the double type. In other words, was the double type kept out of storage-layer pushdown because of some pitfall?

@xiaokang
Contributor

xiaokang commented Jan 8, 2023

Yes, but what confuses me is why create_predicate was never implemented for the double type. In other words, was the double type kept out of storage-layer pushdown because of some pitfall?

I have confirmed that float/double predicates were not pushed down to the storage layer, both by checking the code in _normalize_conjuncts() and by discussing it with @moonming. The pitfall may be a precision problem. I will research how other databases handle it.
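
As an illustration of the kind of precision problem that can affect floating-point predicates in general (the thread does not confirm that this is the actual reason pushdown was skipped), the sketch below compares a FLOAT value with a DOUBLE literal in two ways. The function names are hypothetical and the code assumes IEEE 754 float and double with ordinary SSE-style evaluation.

```cpp
#include <cassert>

// Widen the stored FLOAT value to DOUBLE before comparing with the literal.
// 0.1f is not exactly 0.1, and widening keeps its float rounding error, so
// the result differs from the double nearest to 0.1.
bool matches_after_widening(float stored, double literal) {
    return static_cast<double>(stored) == literal;
}

// Narrow the DOUBLE literal to FLOAT before comparing with the stored value.
// Both sides are now rounded to float precision in the same way.
bool matches_after_narrowing(float stored, double literal) {
    return stored == static_cast<float>(literal);
}

// Usage: the same logical condition "col = 0.1" gives different answers
// depending on which side is converted.
void check_examples() {
    assert(!matches_after_widening(0.1f, 0.1));
    assert(matches_after_narrowing(0.1f, 0.1));
}
```

If the storage layer and the execution layer chose different conversions for the same predicate, they could disagree on which rows match, which is one reason an engine might handle float/double pushdown conservatively.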

exception "[RUNTIME_ERROR]Argument at index 3 for function split_part must be constant"
}

qt_1 "select split_part(k8, '1', 1), k8, split_part(concat(k8, '12'), '1', 1) from test_query_db.test order by k8 limit 2;"
Contributor

Is this regression test related to the issue? It does not have any pushdown predicates.

Contributor Author

The topn query generates a runtime predicate, a feature newly supported in PR #15558.

Contributor

@Gabriel39 Gabriel39 left a comment

LGTM

@Gabriel39 Gabriel39 merged commit 05f6e4c into apache:master Jan 9, 2023
morningman pushed a commit that referenced this pull request Jan 16, 2023
…scalar type (#15790)

related to #15558 #15693
1. dup key table with 17 scalar datatypes
2. unique key table with mow enabled
3. unique key table with mow disabled
luwei16 pushed a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
Issue Number: close http://jira.selectdb-in.cc/browse/CORE-1462

Describe the overview of changes.

commit e1697741a82f875ca42b0d18caa7972eaa225bee
Author: Kang <kxiao.tiger@gmail.com>
Date:   Thu Jan 19 22:59:29 2023 +0800

    [opt](test) scalar_types_p0 use 100k lines dataset and scalar_types_p2 use 1000k (apache#16104)

commit 33a47e8d02644123ffd8c5c4353653c1c175e96a
Author: Kang <kxiao.tiger@gmail.com>
Date:   Wed Jan 18 14:17:24 2023 +0800

    [testcase](bitmap index)bitmap index testcase (apache#15975)

    * add bitmap index testcases for all scalar types

commit 260a631441834ca7e23da4b77c922eb818eddca7
Author: Kang <kxiao.tiger@gmail.com>
Date:   Mon Jan 16 16:49:59 2023 +0800

    [regression-test](topn)add test cases for nonkey topn query for each scalar type (apache#15790)

    related to apache#15558 apache#15693
    1. dup key table with 17 scalar datatypes
    2. unique key table with mow enabled
    3. unique key table with mow disabled

commit 81cea5219ae86df950f10aa123072df78c7cdf23
Author: Kang <kxiao.tiger@gmail.com>
Date:   Sun Feb 19 23:28:33 2023 +0800

    [bugfix](topn) fix topn read_orderby_key_columns nullptr (apache#16896)

    The SQL `SELECT nationkey FROM regression_test_query_p0_limit.tpch_tiny_nation ORDER BY nationkey DESC LIMIT 5`
    make be core dump since dereference a nullptr `read_orderby_key_columns in VCollectIterator::_topn_next`,
    triggered by skipping _colname_to_value_range init in apache#16818 .

    This PR makes two changes:
    1. avoid read_orderby_key_columns nullptr in TabletReader::_init_orderby_keys_param
    2. return error if read_orderby_key_columns is nullptr unexpected in VCollectIterator::_topn_next to avoid core dump

commit 2fee1d1d79942e49eddaafdc2b49e49b0651b109
Author: Kang <kxiao.tiger@gmail.com>
Date:   Fri Feb 10 12:56:33 2023 +0800

    [Improvement](topn) add limit threashold session variable and fuzzy for topn optimizations (apache#16514)

    1. add limit threshold for topn runtime pushdown and key topn optimization
    2. use unified session variable topn_opt_limit_threshold for all topn optimizations
    3. add fuzzy support for topn_opt_limit_threshold

commit 1696bed39129fcc891f32f64ff1fb43f9531fcd4
Author: Kang <kxiao.tiger@gmail.com>
Date:   Thu Feb 2 09:13:32 2023 +0800

    [bugfix](topn) fix topn runtime predicate getting value bug for decimal type (apache#16331)

    * fix topn runtime predicate getting value bug for decimal type

    * fix cast_to_string bug for TYPE_DECIMALV2

commit d70cdf61521a23417c9bc734a3cdb668265a15b0
Author: Kang <kxiao.tiger@gmail.com>
Date:   Wed Feb 22 16:18:46 2023 +0800

    topn sync doris order by key topn query optimization apache#15663

commit 1df514c8f0b66ae9a8438617163a31848e519949
Author: Kang <kxiao.tiger@gmail.com>
Date:   Wed Feb 22 15:14:43 2023 +0800

    sync with doris runtime prune for topn query apache#15558
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
Development

Successfully merging this pull request may close these issues.

[Bug] [daily] be core when execute split_part query

6 participants