Skip to content

Conversation

@Kikyou1997
Copy link
Contributor

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@Kikyou1997
Copy link
Contributor Author

run buildall

1 similar comment
@Kikyou1997
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.73 seconds
stream load tsv: 566 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 29 seconds loaded 2358488459 Bytes, about 77 MB/s
stream load orc: 68 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17099658733 Bytes

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit f106afb0d7591faab68d2bc75f8556c4b187c9e7, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4907	4671	4623	4623
q2	371	157	158	157
q3	2037	1933	1877	1877
q4	1409	1255	1232	1232
q5	3953	3937	3978	3937
q6	256	134	124	124
q7	1410	855	859	855
q8	2832	2837	2794	2794
q9	9797	9946	9410	9410
q10	3442	3495	3536	3495
q11	373	249	254	249
q12	442	295	297	295
q13	4545	3788	3798	3788
q14	309	280	295	280
q15	585	526	526	526
q16	491	449	463	449
q17	1141	1003	974	974
q18	8001	7536	7559	7536
q19	1666	1675	1669	1669
q20	552	303	298	298
q21	4405	4007	4033	4007
q22	473	365	378	365
Total cold run time: 53397 ms
Total hot run time: 48940 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4573	4562	4555	4555
q2	336	226	244	226
q3	4016	4014	3996	3996
q4	2712	2712	2696	2696
q5	9609	9614	9606	9606
q6	251	123	125	123
q7	2961	2492	2545	2492
q8	4452	4467	4544	4467
q9	12942	12829	12839	12829
q10	4074	4163	4165	4163
q11	747	636	665	636
q12	974	814	833	814
q13	4293	3569	3583	3569
q14	375	352	340	340
q15	570	518	514	514
q16	602	584	585	584
q17	3911	3821	3916	3821
q18	9594	9079	9036	9036
q19	1826	1789	1802	1789
q20	2396	2071	2052	2052
q21	8911	8700	8710	8700
q22	878	828	815	815
Total cold run time: 81003 ms
Total hot run time: 77823 ms

@Kikyou1997
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 43.15 seconds
stream load tsv: 584 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 34 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 29.7 seconds inserted 10000000 Rows, about 336K ops/s
storage size: 17099579372 Bytes

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit e264996101de04ad16fab73fd1547c28347e7db3, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4923	4672	4652	4652
q2	351	161	158	158
q3	1528	1358	1262	1262
q4	1155	1079	958	958
q5	3252	3246	3268	3246
q6	254	130	128	128
q7	1014	517	539	517
q8	2231	2249	2223	2223
q9	6954	6967	6994	6967
q10	3319	3374	3429	3374
q11	340	212	212	212
q12	352	222	222	222
q13	4710	3927	3861	3861
q14	249	221	217	217
q15	579	553	520	520
q16	429	373	404	373
q17	1031	656	639	639
q18	8032	7455	8094	7455
q19	1567	1529	1560	1529
q20	559	333	297	297
q21	3430	2944	2993	2944
q22	362	305	304	304
Total cold run time: 46621 ms
Total hot run time: 42058 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4582	4573	4610	4573
q2	323	208	239	208
q3	3730	3755	3735	3735
q4	2552	2546	2521	2521
q5	6178	6171	6187	6171
q6	250	123	127	123
q7	2558	1967	1971	1967
q8	3733	3666	3690	3666
q9	9456	9373	9362	9362
q10	4064	4175	4174	4174
q11	635	506	507	506
q12	800	621	652	621
q13	4344	3660	3642	3642
q14	278	245	238	238
q15	588	530	524	524
q16	524	471	474	471
q17	2087	2078	2093	2078
q18	9606	8953	9039	8953
q19	1802	1787	1766	1766
q20	2315	1989	1972	1972
q21	7341	7151	6996	6996
q22	650	549	556	549
Total cold run time: 68396 ms
Total hot run time: 64816 ms

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 29, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@morrySnow morrySnow merged commit d00c04c into apache:master Nov 29, 2023
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Dec 3, 2023
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Dec 3, 2023
eldenmoon added a commit that referenced this pull request Dec 4, 2023
* [chore](case) Use correct insert stmt for cold heat separation case #27546 (#27585)

Co-authored-by: AlexYue <yj976240184@gmail.com>

* [enhance](S3) Print the error detail for every s3 operation (#27572) (#27615)

* [nereids] fix stats error when using dateTime type filter #27571 (#27577)

* [fix](planner)sort node should materialized required slots for itself #27605 (#27620)

* [fix](Nereids) non-deterministic expression should not be constant (#27606) (#27631)

* [enhancement](stats) Add process for aggstate type #27640 (#27642)

* [Fix](statistics)Fix bug and improve auto analyze. (#27626) (#27657)

1. Implement needReAnalyzeTable for ExternalTable. For now, external table will not be reanalyzed in 10 days.
2. For HiveMetastoreCache.loadPartitions, handle the empty iterator case to avoid Index out of boundary exception.
3. Wrap handle show analyze loop with try catch, so that when one table failed (for example, catalog dropped so the table couldn't be found anymore), we can still show the other tables.
4. For now, only OlapTable and Hive HMSExternalTable support sample analyze, throw exception for other types of table.
5. In StatisticsCollector, call constructJob after createTableLevelTaskForExternalTable to avoid NPE.

* [profile](bugfix) should not cache profile content because the profile may not be a full profile (#27635)

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>

* [Enhance](fe) Support setting initial root password when FE firstly launch (#27438) (#27603)

* [opt](plan) only lock olap table when query plan #27639 (#27656)

bp #27639

* select coordinator node from user's tag when exec streaming load (#27106) (#27677)

* [fix](statistics)Need to recalculate health value when table row count become 0  #27673 (#27674)

backport #27673

* [fix](statistics)Fix sample min max npe bug  #27702 (#27707)

backport #27702

* [Bug](join) try fix wrong _has_null_in_build_side setted (#27684) (#27710)

* [Fix](show-load)Show load npe(userinfo is null) (#27698) (#27719)

* [pick](nereids)temporary partition is always pruned #27636 (#27722)

* [enhancement](stats) limit bq cap size for analyze task #27685 (#27687)

* [improvement](statistics) Add config for the threshold of column count for auto analyze #27713 (#27723)

* [doc](fix) k8s operator docs fix to 2.0 (#27476)

* [Improvement](planner)support select tablets with nereids optimize #23164 #23365 (#27740)

#23164
#23365

* [FIX](complextype)fix complex type hash equals (#27743)

* [fix](statistics) Fix show auto analyze missing jobs bug (#27761)

* [bugfix](topn) fix coredump in copy_column_data_to_block when nullable mismatch

return RuntimeError if copy_column_data_to_block nullable mismatch to avoid coredump in input_col_ptr->filter_by_selector(sel_rowid_idx, select_size, raw_res_ptr) .

The problem is reported by a doris user but I can not reproduce it, so there is no testcase added currently.

* [opt](stats) Use escape rather than base64 for min/max value #27746 (#27748)

* [refactor](http) disable snapshot and get_log_file api (#27724) (#27770)

* [branch-2.0](pick 27738) Warning log to trace send fragment #27738 (#27760)

* [branch-2.0](pick #27771) Add more detail msg for waitRPC exception (#27773)

* [Bug](pipeline) prevent PipelineFragmentContext destruct early (#27790)

* [deps](compression) Opt gzip decompress by libdeflate on X86 and X86_64 platforms: 1. Add libdeflate lib.  (#27542) (#27711)

Backport from #27542.

* [FIX](case)fix case truncate table first #27792

* [doc](stats) add auto_analyze_table_width_threshold description. (#27818) (#27832)

* [fix](bdbje) Fix bdbje logging level not work (#27597) (#27788)

* `EnvironmentConfig.FILE_LOGGING_LEVEL` only set FileHandlerLevel, we should
   set logger level firstly, otherwise it will not take effect.

* [Opt](compression) Opt gzip decompress by libdeflate on X86 and X86_64 platforms: 2. Opt gzip decompression by libdeflate lib. (#27669) (#27801)

Backport from #27669.

* [branch-2.0](fix) Fix broken exception message #27836

* [Bug](func) coredump in equal for null in function (#27843)

* [minor](stats) Update olap table row count after analyze (#27858)

pick from master #27814

* [fix](stats)min and max return NaN when table is empty (#27863)

fix analyze empty table and min/max null value bug:
1. Skip empty analyze task for sample analyze task. (Full analyze task already skipped).
2. Check sample rows is not 0 before calculate the scale factor.
3. Remove ' in sql template after remove base64 encoding for min/max value.

backport #27862

* [minor](stats) Throw error when sync analyze failed (#27846)

pick from master #27845

* [fix](stats) Don't save colToPartitions anymore to save mem (#27880)

pick from master #27879

* [fix](nereids) set operation's result type is wrong if decimal overflows (#27872)

pick from master #27870

* [Config] Modify the default value of tablet_schema_cache_recycle_interval (#27877)

* [fix](like_func) incorrect result of like with 'NO_BACKSLASH_ESCAPES' mode(#27842) (#27851)

* [fix](fe) Fix show frontends npt in some situations (#27295) (#27789)

```
java.lang.NullPointerException: null
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getMasterSocket(ReplicationGroupAdmin.java:191)
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:607)
    at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getGroup(ReplicationGroupAdmin.java:406)
    at org.apache.doris.ha.BDBHA.getElectableNodes(BDBHA.java:132)
    at org.apache.doris.common.proc.FrontendsProcNode.getFrontendsInfo(FrontendsProcNode.java:84)
    at org.apache.doris.qe.ShowExecutor.handleShowFrontends(ShowExecutor.java:1923)
    at org.apache.doris.qe.ShowExecutor.execute(ShowExecutor.java:355)
    at org.apache.doris.qe.StmtExecutor.handleShow(StmtExecutor.java:2113)
    ...
```

* [branch-2.0](fix) Fix extremely high CPU usage caused by rf merge #27894 (#27895)

* [fix](stacktrace) ignore stacktrace for error code INVALID_ARGUMENT INVERTED_INDEX_NOT_IMPLEMENTED (#27898)

* ignore stacktrace for error INVALID_ARGUMENT INVERTED_INDEX_NOT_IMPLEMENTED

* AndBlockColumnPredicate::evaluate

* [opt](nereids) Branch-2.0: remove partition & histogram from col stats to reduce memory usage #27885 (#27896)

* [pick](Nereids) temporary partition is selected only if user manually specified: Branch-2.0 #27893 (#27905)

* [fix](multi-catalog)support the max compute partition prune (#27154) (#27902)

backport #27154

* [fix](Nereids) should not push down project to the nullable side of outer join #27912 (#27913)

* fix compile

---------

Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: xzj7019 <131111794+xzj7019@users.noreply.github.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: AKIRA <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: yiguolei <676222867@qq.com>
Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: DuRipeng <453243496@qq.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: Calvin Kirs <acm_master@163.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: catpineapple <42031973+catpineapple@users.noreply.github.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
gnehil pushed a commit to gnehil/doris that referenced this pull request Dec 4, 2023
@xiaokang xiaokang mentioned this pull request Dec 4, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/2.0.3-merged p0_b reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants