
Conversation

@starocean999
Contributor

add initOutputSlotIds method in scan node to do correct projection

Issue Number: close #xxx

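In short: before applying its projection, the scan node must collect every slot id its parent still references, so the projection does not prune slots that are consumed later. The following is a minimal, illustrative sketch of that idea only, not the actual Doris FE code; `Slot`, `Expr`, and the helper methods are hypothetical stand-ins:

```
// Illustrative sketch only: collect every slot the parent still needs
// (projected expressions plus conjuncts) so the scan node's projection
// does not prune a slot that is referenced later in the plan.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ScanNodeSketch {
    // Hypothetical stand-ins for Doris's SlotDescriptor / Expr types.
    record Slot(int id) {}
    interface Expr { void collectSlotIds(Set<Integer> out); }

    private final List<Slot> tupleSlots = new ArrayList<>();
    private final Set<Integer> outputSlotIds = new HashSet<>();

    /** Gather slot ids required by the parent's expressions and conjuncts. */
    void initOutputSlotIds(List<Expr> requiredExprs, List<Expr> conjuncts) {
        outputSlotIds.clear();
        for (Expr e : requiredExprs) {
            e.collectSlotIds(outputSlotIds);
        }
        // Conjuncts evaluated above the scan keep their slots alive too.
        for (Expr e : conjuncts) {
            e.collectSlotIds(outputSlotIds);
        }
    }

    /** The projection then keeps only slots the parent actually consumes. */
    List<Slot> projectedSlots() {
        return tupleSlots.stream()
                .filter(s -> outputSlotIds.contains(s.id()))
                .toList();
    }
}
```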

@starocean999 marked this pull request as draft on November 13, 2023 09:15
@starocean999
Contributor Author

run buildall

1 similar comment
@starocean999
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 9f1804c9fb6c7490702ed2ce55a72d8f75d96aba, data reload: false

run tpch-sf100 query with default conf and session variables
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	5280	5168	5208	5168
q2	366	157	160	157
q3	2038	1996	2007	1996
q4	1409	1372	1386	1372
q5	4000	4008	3991	3991
q6	257	128	126	126
q7	1479	890	892	890
q8	2782	2782	2771	2771
q9	9715	9521	9473	9473
q10	3465	3520	3544	3520
q11	394	255	247	247
q12	439	287	289	287
q13	4574	4129	4124	4124
q14	326	283	297	283
q15	617	568	577	568
q16	674	599	585	585
q17	1132	1091	1082	1082
q18	8059	7629	7588	7588
q19	1702	1696	1708	1696
q20	564	303	303	303
q21	4701	4310	4392	4310
q22	507	415	422	415
Total cold run time: 54480 ms
Total hot run time: 50952 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	5079	5017	5063	5017
q2	327	216	224	216
q3	4046	3998	4078	3998
q4	2800	2764	2772	2764
q5	9675	9730	9604	9604
q6	246	120	123	120
q7	3049	2519	2518	2518
q8	4885	4846	4854	4846
q9	13112	12896	12950	12896
q10	4082	4172	4211	4172
q11	769	659	653	653
q12	983	805	818	805
q13	4296	3936	3919	3919
q14	381	356	340	340
q15	642	540	547	540
q16	788	697	705	697
q17	3844	3880	3881	3880
q18	9682	9631	9530	9530
q19	1831	1768	1774	1768
q20	2411	2110	2073	2073
q21	8959	9035	8688	8688
q22	935	862	873	862
Total cold run time: 82822 ms
Total hot run time: 79906 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.88 seconds
stream load tsv: 556 seconds loaded 74807831229 Bytes, about 128 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.2 seconds inserted 10000000 Rows, about 342K ops/s
storage size: 17162357285 Bytes

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.01 seconds
stream load tsv: 557 seconds loaded 74807831229 Bytes, about 128 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162282374 Bytes

@starocean999
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit c6762f5cdb52771efefc853e51ab6c32c112792a, data reload: false

run tpch-sf100 query with default conf and session variables
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	5288	5099	5078	5078
q2	369	146	158	146
q3	2032	2119	2029	2029
q4	1419	1389	1362	1362
q5	3959	3978	3947	3947
q6	254	130	132	130
q7	1464	902	890	890
q8	2778	2778	2764	2764
q9	9609	9556	9485	9485
q10	3493	3533	3522	3522
q11	385	256	251	251
q12	436	289	282	282
q13	4577	4162	4080	4080
q14	312	279	300	279
q15	630	548	551	548
q16	676	589	588	588
q17	1131	1077	1088	1077
q18	8173	7570	7716	7570
q19	1676	1668	1668	1668
q20	527	308	292	292
q21	4719	4326	4387	4326
q22	504	406	410	406
Total cold run time: 54411 ms
Total hot run time: 50720 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	5071	5038	5068	5038
q2	339	215	250	215
q3	4089	4057	4040	4040
q4	2828	2774	2770	2770
q5	9705	9710	9814	9710
q6	246	123	124	123
q7	3019	2504	2496	2496
q8	4865	4864	4839	4839
q9	13005	12950	12844	12844
q10	4080	4176	4190	4176
q11	721	655	648	648
q12	1009	803	822	803
q13	4295	3858	3850	3850
q14	379	357	350	350
q15	609	547	566	547
q16	768	704	688	688
q17	3875	3942	3839	3839
q18	9572	9432	9394	9394
q19	1819	1778	1777	1777
q20	2397	2055	2058	2055
q21	8683	8736	8758	8736
q22	872	840	843	840
Total cold run time: 82246 ms
Total hot run time: 79778 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.54 seconds
stream load tsv: 554 seconds loaded 74807831229 Bytes, about 128 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.2 seconds inserted 10000000 Rows, about 354K ops/s
storage size: 17098165180 Bytes

@starocean999
Contributor Author

run buildall

@starocean999 marked this pull request as ready for review on November 15, 2023 01:39
@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit e5ab7e00075a1a55f2623cb28cdeb82c55d3ee80, data reload: false

run tpch-sf100 query with default conf and session variables
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	5269	5108	5071	5071
q2	354	171	154	154
q3	2028	2007	1991	1991
q4	1407	1362	1363	1362
q5	3958	3999	3980	3980
q6	259	129	128	128
q7	1479	884	883	883
q8	2780	2794	2780	2780
q9	9761	9793	9489	9489
q10	3464	3549	3513	3513
q11	372	256	258	256
q12	428	282	281	281
q13	4554	4151	4147	4147
q14	325	290	290	290
q15	640	552	525	525
q16	669	595	586	586
q17	1135	1097	1063	1063
q18	8080	7657	7721	7657
q19	1692	1665	1670	1665
q20	580	296	314	296
q21	4700	4325	4373	4325
q22	511	402	418	402
Total cold run time: 54445 ms
Total hot run time: 50844 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	4956	5107	4956	4956
q2	321	224	219	219
q3	4034	4007	4024	4007
q4	2760	2761	2766	2761
q5	9612	9579	9645	9579
q6	248	119	124	119
q7	3065	2451	2486	2451
q8	4873	4863	4917	4863
q9	13202	13137	13166	13137
q10	4053	4190	4211	4190
q11	723	643	650	643
q12	979	825	821	821
q13	4302	3893	3876	3876
q14	377	343	346	343
q15	630	555	554	554
q16	754	657	686	657
q17	3904	3876	3895	3876
q18	9541	9496	9291	9291
q19	1858	1786	1774	1774
q20	2408	2055	2067	2055
q21	8804	8787	8802	8787
q22	902	877	860	860
Total cold run time: 82306 ms
Total hot run time: 79819 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.74 seconds
stream load tsv: 551 seconds loaded 74807831229 Bytes, about 129 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 34 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.2 seconds inserted 10000000 Rows, about 354K ops/s
storage size: 17096280051 Bytes

@starocean999
Contributor Author

run buildall

1 similar comment
@starocean999
Contributor Author

run buildall

@starocean999
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.7 seconds
stream load tsv: 572 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.4 seconds inserted 10000000 Rows, about 352K ops/s
storage size: 17098825611 Bytes

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.63 seconds
stream load tsv: 564 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.4 seconds inserted 10000000 Rows, about 352K ops/s
storage size: 17099112633 Bytes

@starocean999
Contributor Author

run clickbench

@wm1581066 added the `usercase` (Important user case type) label on Nov 16, 2023
@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.98 seconds
stream load tsv: 569 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.2 seconds inserted 10000000 Rows, about 342K ops/s
storage size: 17098839207 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit c22dcae4a8a23da825edb4c69f50c2b6779c7be5, data reload: false

run tpch-sf100 query with default conf and session variables
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	4940	4691	4692	4691
q2	359	156	159	156
q3	2042	1898	1908	1898
q4	1384	1276	1262	1262
q5	3975	3960	3985	3960
q6	249	128	128	128
q7	1403	872	883	872
q8	2751	2769	2772	2769
q9	9765	9805	9455	9455
q10	3460	3503	3504	3503
q11	374	239	249	239
q12	438	302	294	294
q13	4617	3767	3777	3767
q14	314	292	290	290
q15	593	531	540	531
q16	663	593	575	575
q17	1133	920	882	882
q18	7755	7280	7375	7280
q19	1674	1689	1689	1689
q20	555	308	291	291
q21	4410	3960	3963	3960
q22	478	374	376	374
Total cold run time: 53332 ms
Total hot run time: 48866 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
Query	Cold run (ms)	Hot run 1 (ms)	Hot run 2 (ms)	Best hot (ms)
q1	4586	4617	4591	4591
q2	336	221	257	221
q3	4007	3985	3990	3985
q4	2685	2701	2699	2699
q5	9547	9565	9471	9471
q6	242	117	126	117
q7	2620	2244	2259	2244
q8	4437	4474	4472	4472
q9	13249	13067	13141	13067
q10	4048	4167	4198	4167
q11	808	632	640	632
q12	981	815	819	815
q13	4277	3537	3593	3537
q14	393	353	338	338
q15	582	522	530	522
q16	738	669	655	655
q17	3857	3907	3872	3872
q18	9533	8961	9052	8961
q19	1793	1770	1783	1770
q20	2413	2083	2040	2040
q21	8738	8664	8605	8605
q22	880	799	806	799
Total cold run time: 80750 ms
Total hot run time: 77580 ms

@github-actions bot added the `approved` (Indicates a PR has been approved by one committer) label on Nov 17, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@starocean999 merged commit 5b8aaf9 into apache:master on Nov 23, 2023
eldenmoon added a commit that referenced this pull request Nov 23, 2023
* [keyword](decimalv2) Add DecimalV2 keyword #26283 (#26319)

* [fix](planner) Fix sample partition table #25912 (#26399)

Previously, sampling a partitioned table relied on two assumptions: 1) data is evenly distributed across partitions; 2) data is evenly distributed across buckets. As a result, the same number of rows was sampled from every partition and bucket.

Now the number of sampled rows is proportional to the row count of each partition and bucket, as the sketch below illustrates.
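A minimal sketch of proportional allocation, assuming a hypothetical `partitionRows` map from partition name to row count (illustrative only, not the planner's actual code):

```
import java.util.LinkedHashMap;
import java.util.Map;

class ProportionalSampler {
    /** Split a total sample-row budget across partitions by their row share. */
    static Map<String, Long> allocate(Map<String, Long> partitionRows, long sampleRows) {
        long total = partitionRows.values().stream().mapToLong(Long::longValue).sum();
        Map<String, Long> plan = new LinkedHashMap<>();
        if (total == 0) {
            return plan; // nothing to sample
        }
        for (Map.Entry<String, Long> e : partitionRows.entrySet()) {
            // Each partition contributes rows proportional to its share of the data.
            plan.put(e.getKey(), Math.round((double) sampleRows * e.getValue() / total));
        }
        return plan;
    }

    public static void main(String[] args) {
        // A partition holding 90% of the data gets ~90% of the sampled rows.
        System.out.println(allocate(Map.of("p1", 900_000L, "p2", 100_000L), 10_000));
    }
}
```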

* [fix](spark-load)fix-Unique-key-with-MOR-by-sparkload #26383 (#26414)

* [fix](nereids)fix bug of select mv in nereids #26235 (#26415)

* [improvement](show trash) Fix be restart slow when too many trash files #26147 (#26417)

* [fix](planner)should keep at least one slot materialized in agg node #26116 (#26419)

* [fix](multi-catalog)add the FAQ for Aliyun DLF and add the fs.xx.impl check #25594 (#26422)

* [coverage](pipeline) Remove unused code and add call method for coverage #25552 (#26423)

Remove unused code and add call method for coverage

* [Fix](statistics)Fix analyze min max sql syntax error. #26240 (#26443)

backport #26240

* [fix](auditlog) fix without lock in QueryStatisticsRecvr find  (#26441)

* [fix](invert index) Fix the timing error when opening the searcher #26401 (#26472)

* [fix](nereids) only enable colocate scan for one phase global partition topn in some condition #26473 (#26481)

* [branch-2.0](cherry-pick) Add more indexed column reader be unit test #25652 (#26430)

* [enhancement](regression) fault injection for segcompaction test (#25709) (#26305)

1. Generalized the debug-point facilities from the docker suites for
   fault-injection/stubbing cases.
2. Added segcompaction fault-injection cases for demonstration.
3. Added a -238 TOO_MANY_SEGMENTS fault-injection case for good measure.

Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>

* [fix](case) rm non-visible character null in out file (#26540)

* [fix](load) fix merged row number miscounting because of race condition (#26516)

Row-number miscounting caused by a race condition can sometimes make a load fail with the warning 'the rows number written doesn't match'.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* [test](regression) Add more regression test for FE (#26539)

* [test](coverage) Improve test coverage for runtime filter (#26314) (#26547)

* [fix](Nereids) RewriteCteChildren not work with cost based rewritter (#26326) (#26530)

We use a map to cache the rewrite results of CTE children so that they are not rewritten twice in the cost-based rewriter. However, outer and inner results were recorded in one map, using null as the key for outer results and the CTE id as the key for inner results. This is wrong: every anchor has an outer plan, yet only one outer result can be recorded under the null key. So when the cache is consulted in the cost-based rewriter, the wrong outer plan is returned, and the error below is thrown (a minimal sketch of the key collision follows the stack trace):

```
Caused by: java.lang.IllegalArgumentException: Stats for CTE: CTEId#1 not found
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143) ~[guava-32.1.2-jre.jar:?]
    at org.apache.doris.nereids.stats.StatsCalculator.visitLogicalCTEConsumer(StatsCalculator.java:1049) ~[classes/:?]
    at org.apache.doris.nereids.stats.StatsCalculator.visitLogicalCTEConsumer(StatsCalculator.java:147) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.logical.LogicalCTEConsumer.accept(LogicalCTEConsumer.java:111) ~[classes/:?]
    at org.apache.doris.nereids.stats.StatsCalculator.estimate(StatsCalculator.java:222) ~[classes/:?]
    at org.apache.doris.nereids.stats.StatsCalculator.estimate(StatsCalculator.java:200) ~[classes/:?]
    at org.apache.doris.nereids.jobs.cascades.DeriveStatsJob.execute(DeriveStatsJob.java:108) ~[classes/:?]
    at org.apache.doris.nereids.jobs.scheduler.SimpleJobScheduler.executeJobPool(SimpleJobScheduler.java:39) ~[classes/:?]
    at org.apache.doris.nereids.jobs.executor.Optimizer.execute(Optimizer.java:51) ~[classes/:?]
    at org.apache.doris.nereids.jobs.rewrite.CostBasedRewriteJob.getCost(CostBasedRewriteJob.java:98) ~[classes/:?]
    at org.apache.doris.nereids.jobs.rewrite.CostBasedRewriteJob.execute(CostBasedRewriteJob.java:64) ~[classes/:?]
    at org.apache.doris.nereids.jobs.executor.AbstractBatchJobExecutor.execute(AbstractBatchJobExecutor.java:119) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.visit(RewriteCteChildren.java:72) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.visit(RewriteCteChildren.java:56) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.visitor.PlanVisitor.visitLogicalSink(PlanVisitor.java:118) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.visitor.SinkVisitor.visitLogicalResultSink(SinkVisitor.java:72) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.logical.LogicalResultSink.accept(LogicalResultSink.java:58) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.visitLogicalCTEAnchor(RewriteCteChildren.java:86) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.visitLogicalCTEAnchor(RewriteCteChildren.java:56) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.logical.LogicalCTEAnchor.accept(LogicalCTEAnchor.java:60) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.visitLogicalCTEAnchor(RewriteCteChildren.java:86) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.visitLogicalCTEAnchor(RewriteCteChildren.java:56) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.logical.LogicalCTEAnchor.accept(LogicalCTEAnchor.java:60) ~[classes/:?]
    at org.apache.doris.nereids.rules.rewrite.RewriteCteChildren.rewriteRoot(RewriteCteChildren.java:67) ~[classes/:?]
    at org.apache.doris.nereids.jobs.rewrite.CustomRewriteJob.execute(CustomRewriteJob.java:58) ~[classes/:?]
    at org.apache.doris.nereids.jobs.executor.AbstractBatchJobExecutor.execute(AbstractBatchJobExecutor.java:119) ~[classes/:?]
    at org.apache.doris.nereids.NereidsPlanner.rewrite(NereidsPlanner.java:275) ~[classes/:?]
    at org.apache.doris.nereids.NereidsPlanner.plan(NereidsPlanner.java:218) ~[classes/:?]
    at org.apache.doris.nereids.NereidsPlanner.plan(NereidsPlanner.java:118) ~[classes/:?]
    at org.apache.doris.nereids.trees.plans.commands.ExplainCommand.run(ExplainCommand.java:81) ~[classes/:?]
    at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:550) ~[classes/:?]
```
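A toy illustration of the key collision described above (hypothetical types, not the rewriter's real cache): using null as the key for every anchor's outer result means a later anchor silently overwrites an earlier one.

```
import java.util.HashMap;
import java.util.Map;

class CteRewriteCacheSketch {
    public static void main(String[] args) {
        // Key: CTE id for inner results; null (shared!) for outer results.
        Map<Integer, String> cache = new HashMap<>();
        cache.put(null, "outer plan of anchor #1");
        cache.put(1, "inner plan of CTEId#1");
        cache.put(null, "outer plan of anchor #2"); // overwrites anchor #1's entry
        // Looking up anchor #1's outer plan now returns anchor #2's plan.
        System.out.println(cache.get(null));
    }
}
```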

* [fix](Nereids) could not run query with repeat node in cte (#26330) (#26531)

pick from master
PR: #26330 
commit id: a89477e

ExpressionDeepCopier did not process VirtualReference, so the inline plan was generated incorrectly.

* [opt](Nereids) remove Nondeterministic trait from date related functions (#26444) (#26568)

* change version to 2.0.3-rc03dev

* [fix](regression-test)  Fix regiressin test syncer suit use master fe directly (#26456) (#26583)

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>

* Revert "[improvement](scanner_schedule) reduce memory consumption of scanner #24199 (#25547)" (#26613)

This reverts commit 9a19581 to investigate ANALYZE TABLE WITH SYNC problem

* [enhancement](Nereids): add LOG info to show the phase of NereidsPlanner. (#26542)

* [opt](regression test) Add string-like column order by test #26379 (#26533)

* [Feature](auditloader) Plugin auditloader use auth token to avoid using cleartext passwords in config (#26278) (#26532)

Doris FE will check whether a stream-load HTTP request carries an auth token after password checking fails; the audit-log loader plugin can use the auth token if the plugin config sets use_auth_token to true.

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>

* [branch-2.0](JdbcCatalog) fix that the predicate column name is not back-quoted when querying JDBC external tables (#26479) (#26560)

master pr: #26479

* [fix](prepare statement) Not supported such prepared statement if prepare a forward master sql (#26512) (#26638)

* [Pick-2.0](regression) add failure injection in inverted index writer #26121 (#26376)

* [fix](regression) fix regression framework bug: if real test result is negative, it will miss check test result #25734 (#25734) (#26551)

* [Branch-2.0](regression-test) Add tvf regression tests #26322 #26455 (#26566)

* [fix](BE)Branch-2.0 unknown runtime filter when get filter from _consumer_map (#26570)

* [regression-test](framework) support Non concurrent mode #26487  (#26574)

* [regression-test](fix) fix case bug #26561  (#26578)

* [fix](backup) Add repo id to local meta/info files to avoid overwriting #26536 (#26622)

The local meta/info files generated during backup are not distinguished
by repo names. If two backup jobs with the same name are submitted to
different repos at the same time, meta/info may be overwritten by another
backup job.

* [cases](regression-test) Add backup & restore test case #26490 #26491 (#26623)

* [case](regression) Adapt show create table and views to 2.0 (#26624)

* [fix](regression-test) add more check to address flaky test_partial_update_with_delete_stmt #26474 (#26628)

* [feature](Nereids): push down topN through join #24720 (#26634)

Push TopN through Join.

The join type can only be left/right outer join or cross join, because for these joins the data of one child cannot be filtered out.

The pushed-down TopN uses (original limit + original offset) as its limit and 0 as its offset; see the sketch below.

(cherry picked from commit 3c9ff7a)
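A sketch of the limit/offset arithmetic (an illustrative record, not Nereids code): the copy pushed below the join must keep limit + offset rows so the original TopN above can still skip its offset.

```
record TopN(long limit, long offset) {
    /** TopN pushed below the join: keep limit + offset rows, skip nothing yet. */
    TopN pushedDownCopy() {
        return new TopN(limit + offset, 0);
    }

    public static void main(String[] args) {
        // ORDER BY ... LIMIT 10 OFFSET 20: the child keeps the first 30 rows;
        // the parent TopN above the join still applies OFFSET 20 on top of them.
        System.out.println(new TopN(10, 20).pushedDownCopy()); // TopN[limit=30, offset=0]
    }
}
```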

* [Test](statistics) Add test cases for external table statistics #26511 (#26636)

1. Test for close and open auto collection for external catalog.
2. Test for analyze table table_name (column) and whole table.

* [fix](runtime filter) append late arrival runtime filters in vfilecanner #25996 (#26640)

`VFileScanner` will try to append late arrival runtime filters in each loop of `ScannerScheduler::_scanner_scan`.  However, `VFileScanner::_get_next_reader` only generates the `_push_down_conjuncts` in the first loop, so the late arrival runtime filters are ignored.
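A generic sketch of the fix's idea (the real code is C++ in VFileScanner; the types here are illustrative): re-check for newly arrived filters at the top of every scan iteration instead of only the first.

```
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ScannerLoopSketch<Row> {
    private final List<Predicate<Row>> pushDownConjuncts = new ArrayList<>();
    private int appliedCount = 0;

    /** Must run at the top of every scan iteration, not only the first one. */
    void tryAppendLateArrivalFilters(List<Predicate<Row>> arrivedFilters) {
        while (appliedCount < arrivedFilters.size()) {
            pushDownConjuncts.add(arrivedFilters.get(appliedCount++));
        }
    }

    boolean accept(Row row) {
        return pushDownConjuncts.stream().allMatch(p -> p.test(row));
    }
}
```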

* [fix](information_schema) fix bug that metadata_name_ids returns a wrong table id, and append information_schema cases #26238 (#26646)

Fixes the bug reported in #24059.
Added tests for these information_schema scanners:
- files
- schema_privileges
- table_privileges
- partitions
- rowsets
- statistics
- table_constraints

With infodb_support_ext_catalog=false, the tests currently cover all tables under the information_schema database.

* [Improve](map) Map implicit cast #26126 (#26654)

* [chore](regression) Do stale resource reclaim before executing cold heat separation p2 case #26596 (#26660)

* fix shrink in topN for complex type #26609 (#26661)

* [fix](planner) Fix decimal precision and scale wrong when create table like #25802 (#26666)

When CREATE TABLE LIKE copies a field with a datatype such as decimal(10, 0), the precision and scale are lost because the scale is 0. This fixes that bug.

**Before fix, create table with following SQL**:
CREATE TABLE IF NOT EXISTS db_test.table_test
(
    `name` varchar COMMENT "1m size",
    `id` SMALLINT COMMENT "[-32768, 32767]",
    `timestamp0` decimal null comment "c0",
    `timestamp1` decimal(38, 0) null comment "c1"
)
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES ('replication_num' = '1');

**and Then run**
CREATE TABLE db_test.table_test_like LIKE db_test.table_test
SHOW CREATE TABLE db_test.table_test_like;

the field `timestamp1` becomes decimal(9, 0), which is wrong. This fixes it.

Co-authored-by: JingDas <114388747+JingDas@users.noreply.github.com>

* [fix](test) fix sql block rule test (#26671)

* [Coverage](BE) Delete vinfo_func in BE #26562 (#26674)

* [Fix](partial update) Fix core when successfully schema change and load during a partial update #26210 (#26518)

* [typo] copy branch master docs to branch-2.0 (#26703)

* [typo] update sql-functions to upper-case (#26706)

* [Bug](cherry-pick) Add status dispose in branch 2.0 beta rowset reader (#26684)

* (selectdb-cloud) Reduce FE db lock range for ShowDataStmt #26588 (#26621)

Reduce read lock critical sections and avoid execution timeouts

* [branch-2.0](pick) use 2 phase agg above union all #26245 (#26664)

* [bug](bitmap) fix bitmap value copy operator not calling reset #26451 (#26681)

When an empty bitmap is assigned to another bitmap, the target bitmap should reset itself first and then set the empty type.

* [fix](planner)isnull predicate can't be safely constant folded in inlineview #25377 (#26685)

* [fix](nereids)unnest in-subquery with agg node in proper condition #25800 (#26687)

* [fix](nereids)add visitMarkJoinReference method in ExpressionDeepCopier #25874 (#26688)

* [fix](nereids)don't normalize column name for base index #26476 (#26690)

* [fix](planner)cast floating point type to bigint for bit functions #26598 (#26691)

* [fix](Nereids) storage later agg rule process agg children by mistake #26101 (#26698)

pick from master
PR #26101
commit id c0ed5f7

Update Project#findProject: an agg function's children can be any expression, not only slots. Project#findProject is used to process them, but previously it could handle only slots. This PR updates the util so it can process expressions of any type.

* [fix](Nereids) time extract function constant folding core (#26292) (#26699)

pick from master
PR: #26292
commit id: 74fd5da

Some time-extract functions changed their return type in the earlier PR #18369, but the FE constant-folding function signatures were not updated accordingly. This change gives them the same signature to avoid a BE core dump.

* [fix](Nereids) only search internal functions when dbName is empty (#26296) (#26700)

pick from master
PR: #26296
commit id: 6892fc9

If a function is called with a database name, we should search only UDFs.

* [fix](Nereids) ban right outer, right anti, full outer with bucket shuffle (#26529) (#26702)

pick from master
PR: #26529
commit id: f80495d

If a left-side bucket has no data, we do not generate an instance for it. These joins should preserve all right-side data, but because the left instance does not exist, the right-side data is discarded since no destination BE is set.

We ban these joins temporarily until the Coordinator can generate all instances for the left side.

* [test](statistics)Add hive statistics all data type p0 test (#26676) (#26715)

* [test](serialisation) Serialise some cases and enable str_to_date tests #26651 (#26716)

1. Enable the str_to_date cases, which had been muted because of interference from parallel configs.
2. Serialise some cases that call `admin set config`.

* Revert "[Coverage](BE) Delete vinfo_func in BE #26562 (#26674)" (#26724)

This reverts commit 22eafa4.

* [fix](regression-test) add tests for jdbc catalog (#26608) (#26719)

* [fix](nereids)SimplifyRange rule may mess up and/or predicate #26304 (#26693)

* [Fix](fs_benchmark_tools) Fix `run_fs_benchmark.sh` classpath issue. (#26183) (#26704)

Backport from #26183.

* [Fix](partial update) Fix core when doing partial update on tables with row column after schema change #26632 (#26695)

* [Opt](orc-reader) Optimize orc string dict filter in not_single_conjunct case. (#26386) (#26696)

Optimize the orc/parquet string dict filter in the not_single_conjunct case: filter the block first by dict code, then by the non-single conjunct. Because dict codes are ints, they filter faster than strings.

For example:
```
select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate  and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
```
`l_receiptdate` and `l_shipmode` use string dict filtering, and `l_commitdate < l_receiptdate` is a non-single conjunct that contains a dict-filtered field. Filtering the block first by dict code and only then evaluating the non-single conjunct is faster, because dict codes are ints; a generic sketch follows the timings below.

Before:
```
mysql> select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate  and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
+----------------------+
| count(l_receiptdate) |
+----------------------+
|             49314694 |
+----------------------+
1 row in set (6.87 sec)
```

After:
```
mysql> select count(l_receiptdate) from lineitem_date_as_string where l_shipmode in ('MAIL', 'SHIP') and l_commitdate < l_receiptdate  and l_receiptdate >= '1994-01-01' and l_receiptdate < '1995-01-01';
+----------------------+
| count(l_receiptdate) |
+----------------------+
|             49314694 |
+----------------------+
1 row in set (4.85 sec)
```
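A language-agnostic sketch of the two-phase filter (the actual reader is C++ in the BE; names are illustrative): only the survivors of the cheap integer dict-code comparison have the more expensive conjunct evaluated on them.

```
import java.util.Set;
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

class DictFilterSketch {
    /**
     * Phase 1: keep rows whose dict code is in the wanted set (int compare).
     * Phase 2: run the remaining, more expensive predicate on survivors only,
     * e.g. the decoded-string comparison l_commitdate < l_receiptdate.
     */
    static int[] selectedRows(int[] dictCodes, Set<Integer> wantedCodes, IntPredicate rowPredicate) {
        return IntStream.range(0, dictCodes.length)
                .filter(i -> wantedCodes.contains(dictCodes[i]))
                .filter(rowPredicate)
                .toArray();
    }
}
```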

* [docs](docs) Update Files of Branch-2.0 (#26737)

* [date](parser) Support DateV1 keyword (#25414) (#26746)

* [Fix](orc-reader) Fix orc complex types when late materialization was turned on by disabling late materialization in this case. (#26548) (#26743)

Fix reading ORC complex types when late materialization is turned on, by disabling late materialization in that case.

* [fix](udf)java udf does not support overloaded evaluate method (#22681) (#26768)

Co-authored-by: HB <hubiao01@corp.netease.com>

* [fix](show_proc) fix show statistic proc dir to ensure that result only contains dbs in internal catalog (#26254) (#26763)

backport #26254
Co-authored-by: caiconghui <55968745+caiconghui@users.noreply.github.com>

* [Enhancement](sql-cache) Use update time of hive to avoid cache miss through multi fe nodes. (#26424) (#26762)

backport #26424

* [Fix](partial update) Fix partial update info loss when the delete bitmaps of the committed transactions are calculated by the compaction #26556 (#26735)

* [hotfix](editlog) Fix upsert replay on follower not contains loadedTableIndexIds (#26597) (#26756)

* [chore](regression-test) Fix error add partition operation due to duplicate partition range #26742 (#26758)

* [Bug](materialized-view) fix some bugs on create mv with percentile_approx (#26528) (#26764)

1. percentile_approx had a wrong symbol.
2. fnCall.getParams() returned obsolete children.

* [Bug](agg-state) fix file load insert wrong data to agg_state (#26581) (#26765)

* [Bug](decimalv2) getCmpType returns decimalv2 when both lhs and rhs types are decimalv2 (#26705) (#26767)

* [fix](Nereids) fix plan shape of query64 unstable  (#26012) (#26775)

don't remove the physical plan after optimizing the plan in dphyper.

* [FIX](complextype) fix array nested struct literal #26270 (#26778)

* [improvement](disk balance) Prevent duplicate disk balance tasks afte… (#25990) (#26745)

* [branch-2.0](transaction) Fix publish txn wait too long when not meet quorum #26659 (#26759)

* [bugfix](clickhouse) fix datetime convert error. (#26128) (#26766)

Co-authored-by: Guangdong Liu <liugddx@gmail.com>

* [Fix](row store) cache invalidate key should not include sequence column #26771 (#26780)

* [branch-2.0](pick) support HTTP request with chunked transfer (#26520) (#26785)

* [feature](nestedType) add nested data type to create table tool (#26787)

* [fix](hudi) fix wrong schema when query hudi table on obs #26789 (#26791)

* [fix](decimal) fix undefined behaviour of divide by zero when cast string to decimal (#26792)

* [fix](refresh) fix priv issue of refresh database and table operation #26793 (#26794)

* [minor] add disable swap command tip (#26798)

* [fix](information_schema) fix test_query_sys_tables schema_privileges  regression case #26753 (#26800)

* [branch-2.0] fix test result (#26801)

Fixes the output error from #26743: on the master branch, the value in a struct field is wrapped in quotes, but on branch 2.0 it is NOT wrapped in quotes.

* fix: restore load job progress before retry load task (#26802)

Co-authored-by: chenboyang.922 <chenboyang.922@bytedance.com>

* [fix](thrift) limit BE and FE thrift server max package size, to avoid accepting erroneous or oversized packages and causing OOM #26179 (#26805)

* [fix](Planner): don't push down isNull predicate into view (#26288) (#26773)

* [opt](scanner) increase the connection num of s3 client #26795 (#26796)

* [enhancement](metrics)  enhance visibility of flush thread pool (#26544) (#26819)

* [fix](regression) move fault-injection data to the right place (#26825)

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>

* [feature](binlog) Add ingest_binlog/http_get_snapshot limit download speed && Add async ingest_binlog (#26323) (#26733)

* [fix](jdbc catalog) fix mysql zero date (#26569) (#26837)

* [ci](pipeline) add tpch sff100 test on branch-2.0 (#26824)

* [pick](nereids) make AGG_SCALAR_SUBQUERY_TO_WINDOW_FUNCTION rewrite rule #25969 (#26852)

* [enhancement](230) print max version and spec version when -230 happens (#26643) (#26854)

* [chore](fs) Don't print the stack for file system and its derived classes #26814 (#26838)

* [compile](gcc) fix gcc compile error #26863

* [test](jdbc) pick some jdbc test from branch master (#26860)

* [pipeline](exec) disable shared scan in default and disable shared scan in limit with where scan (#25952) (#26815)

* [regression](partial update) Add cases when the deleted rows have non nullable columns without default value #26776 (#26848)

* [feature](fe) Add coverage tool for FE UT (#26203) (#26857)

* [fix](map) the implementation of ColumnMap::replicate was incorrect (#26647) (#26868)

* [fix](broker load) pass loadToSingleTablet to olapTableSink (#26680) (#26869)

* [regression-test](framework) Support running tests multiple times and reporting correctly to TeamCity (#26606) (#26871)

* [refactor](stats) refactor collection logic and opt some config #26163 (#26858)

picked from #26163

* [bug](user login) fix PASSWORD_LOCK_TIME set to UNBOUNDED not taking effect #26585 (#26859)

* [Improvement](statistics)Improve stats sample strategy (#26435) (#26890)

backport #26435
Improve the accuracy of sample stats collection. For non-distribution columns, use
`n*d / (n - f1 + f1*n/N)`

where `f1` is the number of distinct values that occur exactly once in our sample of n rows (out of N in total), and `d` is the total number of distinct values in the sample; a worked example follows below.

For distribution columns, use `ndv(n) * fraction of tablets sampled` as the NDV.

For very large tablets, use LIMIT to bound the number of rows scanned (for non-key columns only, because key columns are sorted and LIMIT would bias the estimate).
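A worked example of the estimator above (plain arithmetic, not the FE code):

```
class NdvEstimate {
    /** ndv ~= n*d / (n - f1 + f1*n/N), as quoted above. */
    static double estimate(double n, double N, double d, double f1) {
        return n * d / (n - f1 + f1 * n / N);
    }

    public static void main(String[] args) {
        // Sample n = 10,000 of N = 1,000,000 rows; d = 4,000 distinct values,
        // f1 = 1,000 of them seen exactly once. Many singletons in the sample
        // suggest many values were never sampled, so the estimate exceeds d:
        // 40,000,000 / (10,000 - 1,000 + 10) ~= 4439.5
        System.out.println(estimate(10_000, 1_000_000, 4_000, 1_000));
    }
}
```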

* [fix](partial update) Fix NPE when the query statement of an update statement is a point query in OriginPlanner #26881 (#26900)

* [bug](function) add signature for percentile function (#26867) (#26926)

Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>

* enable pipeline and nereids in test-pipeline (#26918)

* [Fix](Planner) fix varchar does not show real length #25171 (#26850)

* [improvement](statistics)Multi bucket columns using DUJ1 to collect ndv #26950 (#26976)

backport #26950

* [fix](statistics)Fix external table show column stats type bug #26910  (#26921)

backport: #26910

* [minor](stats) rename stats related session variable name #26936 (#26928)

* [nereids](datetime) fix wrong result type of datetime add with interval as first arg (#26957) (#26987)

* [fix](Nereids) column pruning under union broken unexpectedly (#26884) (#26985)

* [fix](catalog) Fix ClickHouse DataTime64 precision parsing (#26980)

* [opt](MergeIO) use equivalent merge size to measure merge effectiveness (#26741) (#26923)

backport #26741

* add defensive code in runtime predicate to avoid crash due to column not in tablet schema #26990 (#26991)

* [fix](stats) fix auto collector always creating sample jobs regardless of table size #26968 (#26972)

* [Enhance](regression)enhance docker network by add docker network subnet (#26872)

* [fix](case) regression-test/suites/show_p0/test_show_statistic_proc.groovy (#26925)

Co-authored-by: stephen <hello-stephen@qq.com>

* [fix](auth) fix overwrite logic of user with domain (#27003)

backport #27002

* [Branch-2.0](Serde) Fix content displayed by complex types in MySQL Client (#26880)

backport #25946 and #26301

* [test](tvf) append tvf read hive_text file  regression case. (#26790) (#26989)

backport #26790

* [test](information_schema)append information_schema external_table_p0 case. (#27029)

backport : #26846

* [fix](parquet) compressed_page_size has the same meaning in page v1 and v2 (#26783) (#26922)

backport #26783

* [BugFix](JDBC Catalog) fix jdbc catalog query bitmap may cause be core sometimes (#26933) (#27018)

* [Enhance](regression) skip test_information_schema_external (#27058)

* [improvement](pipeline) task group scan entity (#19924) (#27040)

Co-authored-by: Lijia Liu <liutang123@yeah.net>

* [opt](pipeline) Return InternalError to FE instead of doing a useless DCHECK in ExecNode #27035 (#27057)

Effect: the client will see an error message like the one below when the BE hits a plan logical error.

ERROR 1105 (HY000): errCode = 2, detailMessage = ([xxx]())[CANCELLED]Logical error during processing VNewOlapScanNode(dr_case_tag), output of projections 2 mismatches with exec node output 3

* [fix](nereids) Fix Nereids failing to parse tablesample (#26982)

backport #26981

* [branch2.0](test) fix external table test case with nested type display (#27092)

* [fix](load) skip cancel already cancelled channels (#27109)

* [fix](Nereids) store user variable in connect context (#26655) (#26920)

pick from master #26655

1. User variables should be case-insensitive.
2. User variables should be cleared after the connection is reset.

* [test](parquet)append parquet reader byte_array_decimal and rle_bool case (#26751) (#27026)

backport #26751

* [fix](nereids) support uncorrelated subquery in join condition (#26893)

pick from master #26672 
commit id: 17b1108

* [Bug](pipeline) try fix the exchange sink buffer result error (#27087)

* [fix](function)return NULL rather than 'null' if path not found #25880 (#26823)

* [enhancement](nereids) make error message more readable when binding logicalRepeat node #26744 (#26895)

* [regression](delete) add delete case for every type (#26961)

* [branch-2.0](paimon)disable paimon decimal case (#26971)

* [regression](partial update) Add row store cases for all existing partial update cases #26924 (#27017)

* [fix](statistics) fix updated rows incorrect due to typo in code #26979 (#27034)

* [fix](typo) Use minutes as auto analyze schedule interval #26968 (#27041)

* [Improvement](function) opt for case when #23068 (#27054)

* [fix](planner)scan node should project all required expr from parent node #26886 (#27096)

* [fix](nereids) count in correlated subquery should not output null value #27064 (#27097)

* [fix](load) add lock in active_memtable_mem_consumption #25207 (#27100)

* [branch-2.0](suites) Enable test_cast_with_scale_type since Nereids is ON (#26986)

* [Fix](multi-catalog) Fix NPE when replaying hms events #26803 (#26997)

Co-authored-by: wangxiangyu <wangxiangyu@360shuke.com>

* [Opt](scanner-scheduler) Optimize `BlockingQueue`, `BlockingPriorityQueue` and change remote scan thread pool #26784 (#27053)

- Optimize `BlockingQueue`, `BlockingPriorityQueue` by swapping `notify` and `unlock` to reduce lock competition. Ref: https://www.boost.org/doc/libs/1_54_0/boost/thread/sync_bounded_queue.hpp
- Change remote scan thread pool to `PriorityQueue`.

* [fix](errmsg) fix multiple FE processes start err msg (#27009) (#27080)

* [FIX](regression-test) fix test_load_with_map_nested_array csv for id #27105 (#27107)

* [FIX](map)fix map nested decimal with element at #27030 (#27110)

* [feature](tvf)(jni-avro)jni-avro scanner add complex data types (#26236) (#26731)

* [fix](nereids) fix bug that querying information_schema.rowsets sends the fragment to only one of multiple BEs (#27025) (#27090)

Fixed incomplete query results when querying information_schema.rowsets with multiple BEs.

The schema scanner sent the scan fragment to only one of the BEs, and each BE queries FE information through RPC. Since rowsets information requires data from all BEs, the scan fragment must be sent to every BE.

* [Config](statistics)Set enable_auto_analyze default value to true. #27146

* [branch2.0](test) fix doris jdbc catalog test case (#27150)

1. Fix doris_jdbc_catalog test case out file
2. Add log to debug 2 unstable test cases: pg_jdbc_catalog and oracle_jdbc_catalog

* [bugfix](tablet)fix the tablet will be deleted when clone due to concurrency #25784 (#26777)

* [fix](sink) crash caused by wild pointer of counter in VDataStreamSender (#26947) (#27148)

If preparation fails, the counter _peak_memory_usage_counter will be a wild pointer.

```
*** SIGSEGV address not mapped to object (@0x454d49545f) received by PID 16992 (TID 18856 OR 0x7f4d05444700) from PID 1296651359; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:417
 1# os::Linux::chained_handler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 4# 0x00007F55C85B9400 in /lib64/libc.so.6
 5# doris::vectorized::VDataStreamSender::close(doris::RuntimeState*, doris::Status) at /root/doris/be/src/vec/sink/vdata_stream_sender.cpp:734
 6# doris::PlanFragmentExecutor::close() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:543
 7# doris::PlanFragmentExecutor::~PlanFragmentExecutor() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:95
 8# doris::FragmentExecState::~FragmentExecState() at /root/doris/be/src/runtime/fragment_mgr.cpp:112
 9# std::_Sp_counted_ptr<doris::FragmentExecState*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() at /root/ldb/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:348
10# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:855
11# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:592
12# doris::PInternalServiceImpl::_exec_plan_fragment_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::PFragmentRequestVersion, bool) at /root/doris/be/src/service/internal_service.cpp:463
13# doris::PInternalServiceImpl::_exec_plan_fragment_in_pthread(google::protobuf::RpcController*, doris::PExecPlanFragmentRequest const*, doris::PExecPlanFragmentResult*, google::protobuf::Closure*) at /root/doris/be/src/service/internal_service.cpp:305
14# doris::WorkThreadPool<false>::work_thread(int) at /root/doris/be/src/util/work_thread_pool.hpp:160
15# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6
```

* [improvement](log) log desensitization without displaying user info (#26912) (#27167)

* [branch-2.0](cherry-pick)  add chunked transfer json test (#26902) (#27164)

* [fix](statistics)Fix alter column stats bug (#27093) (#27189)

backport #27093

* (fix)[schema change] fix incorrect setting of schema change jobstate when replay editlog (#26992) (#27139)

* [fix](jni) avoid BE crash and NPE when close paimon reader #27129 (#27204)

bp #27129

* [enhancement](jdbc catalog) Add lowercase column name mapping to Jdbc data source & optimize database and table mapping #27124 (#27130)

* [case] Load json data with enable_simdjson_reader=false (#26601) (#27158)

Co-authored-by: HowardQin <hao.qin@esgyn.cn>

* [fix](function) fix error when use negative number in explode_numbers #27020 (#27180)

* [fix](iceberg) iceberg uses a custom method to encode special characters of field names (#27108) (#27205)

Fixes two bugs:
1. Missing-column lookup was case-sensitive; change column names to lower case in FE for hive/iceberg/hudi.
2. Iceberg uses a custom method to encode special characters in column names; decode the column name so it matches the right column in the parquet reader.

* [enhancement](binlog)  Add dbName && tableName in CreateTableRecord (#26901) (#27208)

* [Branch2.0](Export) add show export regression tests #27140 (#27160)

* [log](tablet invert)  add preconditition check failed log (#26770) (#27171)

* [branch-2.0](publish version) publish version task no need return VERSION_NOT_EXIST #27005 (#27174)

* [minor](stats) Add start/end time for analyze job, precise to seconds of TableStats update time #27123 (#27185)

* [test](regression) Add more alter stmt regression case (#26988) (#27193)

* [test](external_table_p0)append log in external_table_p0 for debug unknown table case #27212 (#27213)

* [Improve](txn) Add some fuzzy test stub in txn (#26712) (#27144)

* [branch-2.0](fe ut) fix decommission test #27082 (#27175)

* [Fix](multi-catalog) Fix complex type crash when using dict filter facility in the parquet-reader. (#27151) (#27187)

- Fix complex type crash when using the dict filter facility in the parquet-reader by turning off the dict filter facility in this case.
- Add orc complex types regression test.

* [Optimize](point query) clear names to reduce mem consumption and cpu cost related to block column name (#26931) (#27157)

* [fix](fe) Fix `enable_nereids_planner` forward not taking effect (#26782) (#27159)

* The Java reflection method `getFields()` returns only public fields,
  but enable_nereids_planner is private.
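A minimal demonstration of that pitfall (standard Java behavior; the class and field names are made up):

```
import java.util.Arrays;

class SessionVarDemo {
    public boolean enablePipeline = true;        // visible to getFields()
    private boolean enableNereidsPlanner = true; // NOT visible to getFields()

    public static void main(String[] args) {
        // Prints only the public field.
        System.out.println(Arrays.toString(SessionVarDemo.class.getFields()));
        // Prints both fields; forwarding code must use this instead.
        System.out.println(Arrays.toString(SessionVarDemo.class.getDeclaredFields()));
    }
}
```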

* [fix](fe ut) Fix borrow oject throw npe (#27072) (#27207)

Occasional FE UT failure: borrowObject throws an NPE:
```
get agent task request. type: CREATE, signature: 10008, fe addr: null
java.lang.NullPointerException
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.register(GenericKeyedObjectPool.java:1079)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:347)
get agent task request. type: CREATE, signature: 10012, fe addr: TNetworkAddress(hostname:127.0.0.1, port:56072)
	at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:277)
	at org.apache.doris.common.GenericPool.borrowObject(GenericPool.java:99)
	at org.apache.doris.utframe.MockedBackendFactory$DefaultBeThriftServiceImpl$1.run(MockedBackendFactory.java:219)
	at java.lang.Thread.run(Thread.java:750)
```

* [regression](conf) Make checkpoint/clean thread trigger more frequent (#26883) (#27194)

* When running p0, we want some FE checkpoint/clean threads to work more
  frequently.

* [hotfix](priv) Fix restore snapshot user priv with add cluster in UserIdentity (#26969) (#27210)

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>

* [branch-2.0](fe ut) fix unstable test DecommissionBackendTest  (#27173)

* [fix](disk migrate) migrate ignore not exists tablet (#26779) (#27172)

* [fix](build)macos clang 15 version compilation error (#25457)

* [fix](tablet sched) fix sched delete stale remain replica (#27050) (#27179)

* Revert "[Branch2.0](Export) add show export regression testes #27140 (#27160)" (#27217)

This reverts commit d76581d, since it caused test_show_export testcase fail.

* Revert "[test](regression) Add more alter stmt regression case (#26988) (#27193)" (#27216)

This reverts commit 42d4806, since it caused test_alter_table_drop_column and test_alter_table_modify_column testcases fail.

* [fix](nereids)remove literal partition by and order by expression in window function #26899 (#27214)

* [fix](agg) fix coredump of multi distinct of decimal128I (#27014) (#27228)

* [fix](agg) fix coredump of multi distinct of decimal128

* fix

* Revert "[enhancement](jdbc catalog) Add lowercase column name mapping to Jdbc data source & optimize database and table mapping #27124 (#27130)" (#27230)

This reverts commit 087fccd.

* [feature](Nereids): eliminate sort under subquery (#26993) (#27218)

* [fix](ccr) Mark getBinlog,getBinlogLag,getMeta,getBackendMeta as from master (#27211) (#27227)

Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>

* [fix](Nereids): NullSafeEqual should be in HashJoinCondition #27127 (#27232)

* [fix](planner)the data type should be the same between input slot and sort slot #27137 (#27215)

* [branch2.0](nereids)Pick #26873 #25769: partition prune fix (#27222)

* [improvement](fe and broker) support specify broker to getSplits, check isSplitable, file scan for HMS Multi-catalog (#24830) (#27236)

bp #24830

* [fix](fe ut) fix unstable ut TabletRepairAndBalanceTest (#27044) (#27239)

* [minor](stats) Report errors with a friendlier message on timeout #27197 (#27240)

* [fix](build index) Fix inverted index hardlink leak and missing problem #26903 (#27244)

* [fix](multi-catalog)add the max compute fe ut and fix download expired  #27007 (#27220)

bp #27007

* [cherry-pick](regression) add hms catalog broker scan case (#25453) (#27253)

* [cherry-pick](fe) select BE local broker to scan Hive table when 'broker.name' in hms catalog is specified (#27122) (#27252)

Since #24830 introduced `broker.name` in the hms catalog, data scans run on the specified brokers.
The [doris operator](https://github.com/selectdb/doris-operator) supports deploying a BE and a broker in the same pod, and a BE accessing its local broker is the fastest path to the data.
In the previous logic, every inputSplit selected one BE to execute and then a random broker for the actual data access, so the BE and its broker always ended up on separate K8S pods.
This PR optimizes the broker selection strategy to prioritize the BE-local broker when `broker.name` is specified in the hms catalog.

* [improvement](statistics)Use count as ndv for unique/agg olap table single key column (#27186) (#27275)

A single key column of a unique/agg olap table has the same count and NDV; for this kind of column there is no need to calculate NDV, simply use count as NDV.
backport #27186

* [minor](stats) Fix potential npe when loading stats #27200 (#27241)

* [fix](tablesample) Fix computeSampleTabletIds NullPointerException (#27165) (#27258)

* [fix](partial update) keep case insensitivity and use the columns' origin names in partialUpdateCols in origin planner #27223 (#27255)

* [chore](fix) sync check-pr-if-need-run-build.sh with master branch (#27250)

* [fix](compile) fix BE compile failure on Mac (#27206) (#27281)

* [chore](clucene) coverage compilation option added #27162 (#27284)

* [FIX]Fix complex type meta schema in information database  #27203 (#27286)

* [feature](Nereids): Pushdown LimitDistinct Through Join (#25113) (#27288)

Push down limit-distinct through left/right outer join or cross join.
such as select t1.c1 from t1 left join t2 on t1.c1 = t2.c1 order by t1.c1 limit 1;

* [fix](inverted index) reset fs_writer to nullptr before throw exception (#27202) (#27289)

* [fix](planner)output slot should be materialized as intermediate slot in agg node #27282 (#27285)

* [FIX](complextype)Fix complex nested and add regress test #26973 (#27293)

* [fix](test) disable forbid_unknown_col_stats (#27303)

* [fix](stats) Release analyze tasks once job finished #27310 (#27309)

* [doc](fix) a new docs for k8s deploy by operator to 2.0 (#26927)

* [doc](fix) fix date trunc doc (#27320)

* [Fix](statistics)Fix analyze sql including key word bug  #27321 (#27322)

backport #27321

* [cherry-pick](function) improve compoundPred optimization work with children is nullable #26160 (#27354)

* Revert "[improvement](routine-load) add routine load rows check (#25818)" (#27336)

* [refactor](planner) filter empty partitions in a unified location (#27190) (#27256)

* [fix](hms) fix compatibility issue of hive metastore client #27327 (#27328)

* [Branch2.0](Export) add show export regression tests (#27330)

* [fix](stats) Fix thread leaks when doing checkpoint #27334 #27335

* [fix](stats) Fix creating too many tasks on new env (#27362)

* [fix](build index) fix core when build index for a new column which without data (#27276)

* change version to 2.0.3-rc04 (#27392)

* fix merge

* update clucene

---------

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: deardeng <565620795@qq.com>
Co-authored-by: slothever <18522955+wsjz@users.noreply.github.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: zzzxl <33418555+zzzxl1993@users.noreply.github.com>
Co-authored-by: abmdocrt <Yukang.Lian2022@gmail.com>
Co-authored-by: HHoflittlefish777 <77738092+HHoflittlefish777@users.noreply.github.com>
Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: walter <w41ter.l@gmail.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: Kang <kxiao.tiger@gmail.com>
Co-authored-by: Jack Drogon <jack.xsuperman@gmail.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: meiyi <myimeiyi@gmail.com>
Co-authored-by: airborne12 <airborne08@gmail.com>
Co-authored-by: TengJianPing <18241664+jacktengg@users.noreply.github.com>
Co-authored-by: minghong <englefly@gmail.com>
Co-authored-by: shuke <37901441+shuke987@users.noreply.github.com>
Co-authored-by: walter <patricknicholas@foxmail.com>
Co-authored-by: zhannngchen <48427519+zhannngchen@users.noreply.github.com>
Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com>
Co-authored-by: daidai <2017501503@qq.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: seawinde <149132972+seawinde@users.noreply.github.com>
Co-authored-by: JingDas <114388747+JingDas@users.noreply.github.com>
Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
Co-authored-by: bobhan1 <bh2444151092@outlook.com>
Co-authored-by: Jeffrey <color.dove@gmail.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
Co-authored-by: KassieZ <139741991+KassieZ@users.noreply.github.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: HB <hubiao01@corp.netease.com>
Co-authored-by: Pxl <pxl290@qq.com>
Co-authored-by: 谢健 <jianxie0@gmail.com>
Co-authored-by: yujun <yu.jun.reach@gmail.com>
Co-authored-by: Guangdong Liu <liugddx@gmail.com>
Co-authored-by: zfr95 <87513668+zfr9527@users.noreply.github.com>
Co-authored-by: chen <czjourney@163.com>
Co-authored-by: TsukiokaKogane <cby141994@gmail.com>
Co-authored-by: chenboyang.922 <chenboyang.922@bytedance.com>
Co-authored-by: ryanzryu <143597717+ryanzryu@users.noreply.github.com>
Co-authored-by: Siyang Tang <82279870+TangSiyang2001@users.noreply.github.com>
Co-authored-by: zy-kkk <zhongyk10@gmail.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: qiye <jianliang5669@gmail.com>
Co-authored-by: AKIRA <33112463+Kikyou1997@users.noreply.github.com>
Co-authored-by: Liqf <109049295+LemonLiTree@users.noreply.github.com>
Co-authored-by: LiBinfeng <46676950+LiBinfeng-01@users.noreply.github.com>
Co-authored-by: zhangguoqiang <18372634969@163.com>
Co-authored-by: stephen <hello-stephen@qq.com>
Co-authored-by: GoGoWen <82132356+GoGoWen@users.noreply.github.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Lijia Liu <liutang123@yeah.net>
Co-authored-by: Kaijie Chen <ckj@apache.org>
Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
Co-authored-by: lsy3993 <110876560+lsy3993@users.noreply.github.com>
Co-authored-by: zhangdong <493738387@qq.com>
Co-authored-by: Xiangyu Wang <dut.xiangyu@gmail.com>
Co-authored-by: wangxiangyu <wangxiangyu@360shuke.com>
Co-authored-by: wudongliang <46414265+DongLiang-0@users.noreply.github.com>
Co-authored-by: Houliang Qi <neuyilan@163.com>
Co-authored-by: Luwei <814383175@qq.com>
Co-authored-by: HowardQin <hao.qin@esgyn.cn>
Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
Co-authored-by: DuRipeng <453243496@qq.com>
Co-authored-by: catpineapple <42031973+catpineapple@users.noreply.github.com>
Co-authored-by: YueW <45946325+Tanya-W@users.noreply.github.com>
seawinde pushed a commit to seawinde/doris that referenced this pull request Nov 28, 2023
gnehil pushed a commit to gnehil/doris that referenced this pull request Dec 4, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
morrySnow pushed a commit that referenced this pull request Mar 12, 2024
…rialized (#32092)

introduced by #26886

run this sql:
```
SELECT
        caseId
    FROM
        (
            SELECT
                caseId,
                count(judgementDateId)
            FROM
                (
                    SELECT
                        abs(caseId) AS caseId,
                        id as judgementDateId
                    FROM
                        dr_user_test_t2
                ) AGG_RESULT
            GROUP BY
                caseId
        ) TOTAL
        order by 1;
```


will get:

```
ERROR 1105 (HY000): errCode = 2, detailMessage = (172.17.0.1)[INTERNAL_ERROR]couldn't resolve slot descriptor 1, desc: tuples:
Tuple(id=5 slots=[Slot(id=10 type=DOUBLE col=-1, colname=, nullable=1), Slot(id=11 type=VARCHAR col=-1, colname=id, nullable=1)] has_varlen_slots=1)
Tuple(id=4 slots=[Slot(id=8 type=DOUBLE col=-1, colname=, nullable=1)] has_varlen_slots=0)
Tuple(id=2 slots=[Slot(id=4 type=DOUBLE col=-1, colname=caseId, nullable=1)] has_varlen_slots=0)
Tuple(id=0 slots=[Slot(id=0 type=VARCHAR col=-1, colname=caseId, nu
```
nextdreamblue added a commit to nextdreamblue/doris that referenced this pull request Mar 12, 2024
…rialized (apache#32092)

introduced by apache#26886

yiguolei pushed a commit that referenced this pull request Mar 12, 2024
…rialized (#32092)

introduced by #26886

morrySnow pushed a commit that referenced this pull request Mar 12, 2024
…rialized (#32092) (#32133)

cherry-pick from master #27096

introduced by #26886

mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
…rialized (apache#32092) (apache#32133)

cherry-pick from master apache#27096

introduced by apache#26886


Labels: approved, dev/2.0.3-merged, reviewed, usercase
