@zhangstar333
Contributor

@zhangstar333 zhangstar333 commented Nov 9, 2023

Proposed changes

Issue Number: close #xxx
Support SQL such as `select count(1) - count(not null) from table`: the `count` aggregation can now be pushed down.
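
As a rough illustration of why this pushdown helps, the sketch below assumes that `count(not null)` means a count over a column declared NOT NULL, and that the storage layer keeps a row count per segment. Under those assumptions both `count(1)` and `count(col)` equal the number of rows, so they can be answered from metadata without reading any values. `Segment`, `num_rows`, and both function names are hypothetical and are not Doris APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical storage segment: column values plus a stored row count."""
    values: List[int]  # values of a column declared NOT NULL (assumption)
    num_rows: int      # row count kept as metadata by the storage layer (assumption)


def count_by_scan(segments: List[Segment]) -> int:
    # Without pushdown: read every row and count it.
    return sum(1 for seg in segments for _ in seg.values)


def count_by_pushdown(segments: List[Segment]) -> int:
    # With pushdown: count(1) and count(<NOT NULL column>) both equal the
    # row count, so the stored metadata is enough and no values are read.
    return sum(seg.num_rows for seg in segments)


segments = [Segment([1, 2, 3], 3), Segment([4, 5], 2)]
count_const = count_by_pushdown(segments)     # stands in for count(1)
count_not_null = count_by_pushdown(segments)  # stands in for count(col)
print(count_const - count_not_null)           # prints 0
```

Both counts come from the same metadata, so the difference in the example query is computed without a full scan.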


@zhangstar333
Contributor Author

run buildall

@doris-robot

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 45.16 seconds
stream load tsv: 551 seconds loaded 74807831229 Bytes, about 129 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.4 seconds inserted 10000000 Rows, about 340K ops/s
storage size: 17162336368 Bytes
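
The throughput figures above can be reproduced from the byte counts and durations. The sketch below assumes the bot divides bytes by seconds, expresses the result in units of 1024 × 1024 bytes, and truncates; that reading is inferred from the numbers only, not from the bot's source. `load_rate` is a hypothetical helper name.

```python
MIB = 1024 * 1024  # the reported "MB" appears to be 1024 * 1024 bytes (assumption)


def load_rate(total_bytes: int, seconds: int) -> int:
    # Truncating division matches every stream load figure in the report.
    return total_bytes // seconds // MIB


loads = {
    "tsv": (74807831229, 551),
    "json": (2358488459, 20),
    "orc": (1101869774, 65),
    "parquet": (861443392, 32),
}

for name, (total_bytes, seconds) in loads.items():
    print(f"stream load {name}: about {load_rate(total_bytes, seconds)} MB/s")

# insert into select: rows per second, shown in thousands
insert_kops = int(10_000_000 / 29.4 / 1000)
print(f"insert into select: about {insert_kops}K ops/s")
```

With decimal megabytes the tsv rate would come out near 135 MB/s rather than 129, which is why the 1024-based unit is assumed here.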

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit bde504e6484a5c1f50156aeffdf0a3f548f65c49, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5205	5051	5112	5051
q2	367	244	241	241
q3	2077	2024	1997	1997
q4	1459	1429	1431	1429
q5	4137	4126	4101	4101
q6	249	127	132	127
q7	2076	1599	1624	1599
q8	2759	2722	2720	2720
q9	10360	10322	10192	10192
q10	3506	3552	3592	3552
q11	365	258	259	258
q12	452	300	300	300
q13	4500	4164	4125	4125
q14	317	294	291	291
q15	607	557	575	557
q16	705	626	609	609
q17	1124	1071	1105	1071
q18	7857	7543	7360	7360
q19	1682	1690	1677	1677
q20	576	373	349	349
q21	4915	4552	4599	4552
q22	539	436	446	436
Total cold run time: 55834 ms
Total hot run time: 52594 ms
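
The result rows carry no column headers. Judging from the numbers alone, each row seems to hold the query name, one cold run, two hot runs, and the faster of the two hot runs, all in milliseconds. That interpretation is an assumption, and the sketch below checks it against the two reported totals for the default-configuration run.

```python
ROWS = """\
q1 5205 5051 5112 5051
q2 367 244 241 241
q3 2077 2024 1997 1997
q4 1459 1429 1431 1429
q5 4137 4126 4101 4101
q6 249 127 132 127
q7 2076 1599 1624 1599
q8 2759 2722 2720 2720
q9 10360 10322 10192 10192
q10 3506 3552 3592 3552
q11 365 258 259 258
q12 452 300 300 300
q13 4500 4164 4125 4125
q14 317 294 291 291
q15 607 557 575 557
q16 705 626 609 609
q17 1124 1071 1105 1071
q18 7857 7543 7360 7360
q19 1682 1690 1677 1677
q20 576 373 349 349
q21 4915 4552 4599 4552
q22 539 436 446 436
"""

cold_total = 0
hot_total = 0
for line in ROWS.splitlines():
    _query, cold, hot1, hot2, best = line.split()
    # Assumed meaning of the last column: the faster of the two hot runs.
    assert int(best) == min(int(hot1), int(hot2))
    cold_total += int(cold)
    hot_total += int(best)

print(cold_total, hot_total)  # prints 55834 52594
```

The sums match the reported "Total cold run time" and "Total hot run time", which supports reading the first numeric column as the cold run and the last as the best hot run.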

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4914	5036	4970	4970
q2	374	250	264	250
q3	3994	4024	3933	3933
q4	2797	2744	2771	2744
q5	6536	6432	6415	6415
q6	239	121	133	121
q7	3110	2663	2751	2663
q8	4781	4759	4791	4759
q9	17847	17704	17664	17664
q10	4101	4185	4188	4185
q11	712	677	660	660
q12	1022	871	841	841
q13	4311	3907	3862	3862
q14	382	350	362	350
q15	647	560	553	553
q16	762	708	698	698
q17	3965	3952	3910	3910
q18	9431	9267	9327	9267
q19	1866	1755	1779	1755
q20	2352	2061	2039	2039
q21	8925	8691	8787	8691
q22	935	895	866	866
Total cold run time: 84003 ms
Total hot run time: 81196 ms

@zhangstar333
Contributor Author

run buildall

@doris-robot

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 45.83 seconds
stream load tsv: 579 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17104376719 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit c36b44c0dae3c67d7ebdc9141b8c347e40d20131, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4903	4654	4600	4600
q2	371	142	134	134
q3	2036	1945	1936	1936
q4	1400	1243	1252	1243
q5	3962	3898	3991	3898
q6	254	130	132	130
q7	1440	891	894	891
q8	2764	2802	2777	2777
q9	20659	9520	9371	9371
q10	3466	3510	3514	3510
q11	376	243	249	243
q12	452	290	302	290
q13	4569	3848	3776	3776
q14	312	310	288	288
q15	588	537	525	525
q16	656	610	580	580
q17	1132	924	898	898
q18	7920	7432	7458	7432
q19	1697	1658	1652	1652
q20	530	302	283	283
q21	4460	4037	3988	3988
q22	482	376	374	374
Total cold run time: 64429 ms
Total hot run time: 48819 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4586	4565	4592	4565
q2	344	245	266	245
q3	4035	4014	4012	4012
q4	2716	2705	2716	2705
q5	9557	9616	9622	9616
q6	245	124	125	124
q7	3042	2488	2519	2488
q8	4449	4417	4382	4382
q9	12972	12853	12823	12823
q10	4043	4132	4174	4132
q11	794	691	664	664
q12	973	814	817	814
q13	4282	3584	3548	3548
q14	395	350	367	350
q15	567	516	531	516
q16	730	666	675	666
q17	3905	3878	3871	3871
q18	9650	9129	9156	9129
q19	1823	1788	1781	1781
q20	2409	2067	2046	2046
q21	8930	8578	8580	8578
q22	906	832	792	792
Total cold run time: 81353 ms
Total hot run time: 77847 ms
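
For a quick comparison of the two TPC-H reports in this thread (commits bde504e… and c36b44c…), the sketch below computes the relative change in total hot run time. Each commit was measured once, so the differences may simply reflect run-to-run noise or other changes on master; nothing here attributes them to this PR. `relative_change` is a hypothetical helper name.

```python
def relative_change(before_ms: int, after_ms: int) -> float:
    """Percentage change from the earlier total to the later one."""
    return (after_ms - before_ms) / before_ms * 100


# (earlier report, later report) total hot run times in ms, copied from the bot comments
hot_totals = {
    "default session variables": (52594, 48819),
    "runtime_filter_mode=off": (81196, 77847),
}

for name, (before_ms, after_ms) in hot_totals.items():
    print(f"{name}: {relative_change(before_ms, after_ms):+.1f}%")
```

The cold totals are left out on purpose: the later report's q9 cold run (20659 ms) is roughly twice its hot runs and dominates that sum.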

@github-actions github-actions bot added the approved label Nov 23, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@zhangstar333 zhangstar333 merged commit 33de92c into apache:master Nov 23, 2023
zhangstar333 added a commit to zhangstar333/incubator-doris that referenced this pull request Nov 23, 2023
…apache#26677

support sql: select count(1)-count(not null) from table, the agg of count could push down.
xiaokang pushed a commit that referenced this pull request Nov 23, 2023
…#26677 (#27499)

support sql: select count(1)-count(not null) from table, the agg of count could push down.
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Nov 27, 2023
…apache#26677 (apache#27499)

support sql: select count(1)-count(not null) from table, the agg of count could push down.
eldenmoon added a commit that referenced this pull request Nov 27, 2023
* [fix](stats) Fix update rows for unique table didn't get updated properly #26968 (#27337)

* [FIX](jsonb) fix jsonb in predict column #27325 (#27424)

* [fix](fe) slots in having clause should be set to need materialized(#27412) (#27429)

* [Bug](insert)fix insert wrong data on mv when stmt have multiple values (#27297) (#27382)

fix insert wrong data on mv when stmt have multiple values

* [fix](fe ut) Fix OlapQueryCacheTest failed (#27305) (#27406)

1.
```
java.lang.NullPointerException: null
        at org.apache.doris.catalog.Env.getCurrentSystemInfo(Env.java:793) ~[classes/:?]
        at org.apache.doris.qe.SimpleScheduler$UpdateBlacklistThread.run(SimpleScheduler.java:206) ~[classes/:?]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]

java.lang.NullPointerException
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:226)
```

2.
```
[ERROR] testSqlCacheKeyWithNestedViewForNereids  Time elapsed: 1.962 s  <<< FAILURE!
java.lang.AssertionError: SELECT command denied to user 'testCluster:testUser'@'192.168.1.1' for table 'internal: testCluster:testDb: appevent'
	at org.apache.doris.qe.OlapQueryCacheTest.parseSqlByNereids(OlapQueryCacheTest.java:579)
	at org.apache.doris.qe.OlapQueryCacheTest.testSqlCacheKeyWithNestedViewForNereids(OlapQueryCacheTest.java:1338)
```

3.
```
[ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.63 s <<< FAILURE! - in org.apache.doris.qe.OlapQueryCacheTest
[ERROR] testCacheModeTable  Time elapsed: 1.657 s  <<< ERROR!
java.lang.IllegalArgumentException: Value of type org.apache.doris.qe.QueryState incompatible with return type org.apache.doris.system.SystemInfoService of org.apache.doris.catalog.Env#getCurrentSystemInfo()
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:156)
```

* [regression test](schema change) add some schema change regression cases (#27112) (#27418)

* [fix](Nereids) result type of add precision is 1 more than expected (#27136) (#27426)

* [fix](Nereids): fill miss slot in having subquery (#27177) (#27394)

* [fix](memory) Fix make_top_consumption_snapshots heap-use-after-free #27434 (#27465)

* [fix](function) make TIMESTAMP function DEPEND_ON_ARGUMENT (#27343) (#27458)

* [fix](test) order by clause in test_map(#27390) (#27391)

pick #27390

* [performance](Planner): optimize getStringValue() in DateLiteral (#27363) (#27470)

- reduce cost of `getStringValue()`
- the original code doesn't consider the `microsecond` part in `getStringValue()`

(cherry picked from commit 044a295)

* [Chore](pick) do not push down agg on aggregate column (#27356) (#27498)

* [fix](stats) table not exists error msg not print objects name #27074 (#27463)

* [improve](nereids) support agg function of count(const value) pushdown #26677 (#27499)

support sql: select count(1)-count(not null) from table, the agg of count could push down.

* [test](fe-ut) fix unstable MysqlServerTest (#27459)

Need to find an unbound port for MysqlServerTest

* [opt](MergedIO) no need to merge large columns (#27315) (#27497)

1. Fix a profile bug in `MergeRangeFileReader`, and add a profile item `ApplyBytes` to show the total bytes of the ranges.
2. There is no need to merge large columns, because `MergeRangeFileReader` would increase the copy time.

* [improvement](drop tablet)  impr gc shutdown tablet lock (#26151) (#27478)

* [doc](stats) SQL manual for stats (#27461)

* [chore](merge-on-write) disable rowid conversion check for mow table by default (#27482) (#27508)

* [fix](regression)Fix hive p2 case (#27466) (#27511)

* [fix](statistics)Fix auto analyze remove finished job bug #27486 (#27510)

* [Bug](bitmap) Fix heap-use-after-free in the bitmap functions #27411 (#27521)

* [Pick](nereids) Pick: partition prune fails in case of NOT expression (#27047) (#27507)

* [fix](clone) Fix engine_clone file exist (#27361) (#27536)

* [chore](case) adjust timeout of broker load case #27540

* Fix auto analyze doesn't filter unsupported type bug. (#27547)

Fix the bug where auto analyze doesn't filter unsupported types.
Catch Throwable in the auto analyze thread for each database; otherwise the thread quits when one database fails to create jobs, and none of the other databases get analyzed.
Change the FE config item full_auto_analyze_simultaneously_running_task_num to auto_analyze_simultaneously_running_task_num.
Backport #27559.

* [chore](fe plugin) Upgrade dependency to doris 2.0-SNAPSHOT #27522 (#27558)

* [Bug](materialized-view) add limitation for duplicate expr on materialized view (#27523) (#27562)

* [fix](planner)join node should output required slot from parent node #27526 (#27551)

* [branch-2.0](hive) enable hive view by default (#27550)

* [pick](nereids) adjust bc join and shuffle join #27113 (#27566)

* [Fix](hive-transactional-table) Fix NPE when query empty hive transactional table. (#27567)

---------

Co-authored-by: AKIRA <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: 谢健 <jianxie0@gmail.com>
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
Co-authored-by: yujun <yu.jun.reach@gmail.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: Jack Drogon <jack.xsuperman@gmail.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
seawinde pushed a commit to seawinde/doris that referenced this pull request Nov 28, 2023
…apache#26677

support sql: select count(1)-count(not null) from table, the agg of count could push down.
gnehil pushed a commit to gnehil/doris that referenced this pull request Dec 4, 2023
…apache#26677 (apache#27499)

support sql: select count(1)-count(not null) from table, the agg of count could push down.
@xiaokang xiaokang mentioned this pull request Dec 4, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
…apache#26677

support sql: select count(1)-count(not null) from table, the agg of count could push down.

Labels

approved dev/2.0.3-merged p0_b reviewed


6 participants