@englefly
Contributor

What problem does this PR solve?

Per PR #36784, bucket shuffle join is disabled when the number of scan instances for the left table of the join is less than 10.
The value 10 should not be hardcoded; it should be derived from the cluster size. This PR replaces the fixed value 10 with a dynamic value equal to 0.8 times the maximum concurrency of the cluster.
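
For illustration only, the rule described above can be sketched as follows. This is not the Doris implementation: every name is hypothetical, and the PR description does not say how "maximum concurrency" is computed, so the sketch assumes it is the number of backends multiplied by the parallel instances each backend runs.

```python
# Hypothetical names throughout; none of them come from the Doris code base.
OLD_FIXED_THRESHOLD = 10   # the hardcoded value this PR replaces
CONCURRENCY_FACTOR = 0.8   # the factor named in the PR description


def max_cluster_concurrency(backend_count: int, instances_per_backend: int) -> int:
    """Assumed definition: every backend runs the same number of parallel instances."""
    return backend_count * instances_per_backend


def bucket_shuffle_allowed(left_scan_instances: int, max_concurrency: int) -> bool:
    """Return False (disable bucket shuffle join) when the left-table scan
    has fewer instances than 0.8 times the cluster's maximum concurrency."""
    threshold = CONCURRENCY_FACTOR * max_concurrency
    return left_scan_instances >= threshold


if __name__ == "__main__":
    concurrency = max_cluster_concurrency(backend_count=3, instances_per_backend=16)
    # 12 scan instances passed the old fixed rule (12 >= 10) but fall below
    # the dynamic threshold of 0.8 * 48 = 38.4 in this assumed cluster.
    print(12 >= OLD_FIXED_THRESHOLD, bucket_shuffle_allowed(12, concurrency))
```

Under these assumptions the threshold grows with the cluster, so a scan that counted as sufficiently parallel against the fixed value 10 may no longer qualify on a larger cluster.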

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Sep 22, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Contributor Author

run buildall

@englefly englefly marked this pull request as draft September 22, 2025 03:31
@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@englefly englefly marked this pull request as ready for review September 22, 2025 09:40
@englefly
Contributor Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 188638 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 14a8ccea341cedefc7f6737418bddf01bfe88101, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	898	414	412	412
query2	6017	1735	1719	1719
query3	4929	228	234	228
query4	25978	23880	23027	23027
query5	2065	628	489	489
query6	296	263	243	243
query7	458	515	313	313
query8	304	278	256	256
query9	2933	2608	2584	2584
query10	338	349	286	286
query11	15748	15618	14837	14837
query12	173	115	111	111
query13	2371	559	439	439
query14	11086	9581	9305	9305
query15	203	195	174	174
query16	7282	706	510	510
query17	1440	757	615	615
query18	1986	417	321	321
query19	213	201	170	170
query20	133	123	119	119
query21	211	132	116	116
query22	4034	4287	4165	4165
query23	33991	33067	33145	33067
query24	9680	2409	2418	2409
query25	613	562	489	489
query26	1354	271	167	167
query27	2757	508	358	358
query28	4433	2222	2151	2151
query29	790	600	494	494
query30	372	236	198	198
query31	901	814	735	735
query32	120	77	70	70
query33	708	382	340	340
query34	802	856	526	526
query35	906	858	749	749
query36	1016	1036	914	914
query37	121	112	87	87
query38	3541	3590	3493	3493
query39	1523	1426	1428	1426
query40	225	133	126	126
query41	64	65	61	61
query42	134	121	113	113
query43	508	482	463	463
query44	1409	842	834	834
query45	187	183	177	177
query46	880	1014	644	644
query47	1765	1797	1693	1693
query48	386	421	310	310
query49	777	495	419	419
query50	652	707	411	411
query51	3908	3954	3933	3933
query52	116	116	105	105
query53	241	262	200	200
query54	687	585	543	543
query55	99	88	93	88
query56	333	339	320	320
query57	1183	1211	1108	1108
query58	308	315	283	283
query59	2491	2634	2615	2615
query60	343	351	323	323
query61	158	154	156	154
query62	814	740	671	671
query63	236	199	202	199
query64	4166	1164	812	812
query65	4082	3972	3991	3972
query66	1133	450	351	351
query67	15563	15289	15046	15046
query68	8059	944	598	598
query69	511	329	289	289
query70	1362	1343	1250	1250
query71	567	344	318	318
query72	5892	4936	5062	4936
query73	640	605	362	362
query74	9225	9181	8626	8626
query75	3343	3340	2856	2856
query76	3135	1178	764	764
query77	645	414	328	328
query78	9633	9885	8948	8948
query79	1029	844	647	647
query80	638	592	514	514
query81	505	259	227	227
query82	236	165	137	137
query83	275	270	256	256
query84	264	121	95	95
query85	852	463	425	425
query86	343	306	290	290
query87	3790	3778	3720	3720
query88	2888	2258	2253	2253
query89	400	324	298	298
query90	1932	216	216	216
query91	165	165	143	143
query92	83	72	68	68
query93	1430	1014	647	647
query94	694	445	339	339
query95	396	322	316	316
query96	487	571	285	285
query97	2907	2968	2883	2883
query98	268	217	208	208
query99	1357	1424	1310	1310
Total cold run time: 259523 ms
Total hot run time: 188638 ms

@doris-robot

ClickBench: Total hot run time: 30.09 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 14a8ccea341cedefc7f6737418bddf01bfe88101, data reload: false

query	run 1, cold (s)	run 2 (s)	run 3 (s)
query1	0.05	0.05	0.05
query2	0.10	0.05	0.06
query3	0.26	0.09	0.08
query4	1.61	0.12	0.12
query5	0.28	0.28	0.25
query6	1.17	0.68	0.64
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.62	0.53	0.52
query10	0.58	0.57	0.57
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.63	0.62	0.62
query14	1.02	1.01	1.02
query15	0.87	0.86	0.85
query16	0.42	0.40	0.41
query17	1.04	1.07	1.07
query18	0.22	0.20	0.20
query19	1.92	1.84	1.83
query20	0.02	0.02	0.01
query21	15.43	0.91	0.58
query22	0.77	1.04	0.62
query23	15.15	1.41	0.67
query24	6.90	1.32	0.35
query25	0.49	0.22	0.09
query26	0.60	0.16	0.15
query27	0.08	0.06	0.06
query28	9.94	1.39	0.93
query29	12.54	3.87	3.32
query30	0.28	0.14	0.13
query31	2.85	0.60	0.39
query32	3.24	0.57	0.48
query33	3.19	3.14	3.07
query34	16.13	5.49	4.88
query35	4.97	4.91	4.97
query36	0.69	0.52	0.51
query37	0.11	0.08	0.07
query38	0.07	0.05	0.04
query39	0.04	0.02	0.03
query40	0.17	0.15	0.15
query41	0.09	0.03	0.04
query42	0.04	0.04	0.03
query43	0.05	0.03	0.03
Total cold run time: 105.04 s
Total hot run time: 30.09 s
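
The two ClickBench totals are consistent with a simple aggregation: the cold total is the sum of the first timing column, and the hot total is the sum of the faster of the second and third timings for each query. The sketch below shows that arithmetic on four rows copied from the table. It is not the benchmark tooling itself, and the function name `summarize` is hypothetical.

```python
from decimal import Decimal

# Four rows copied from the ClickBench table: query, run 1, run 2, run 3 (seconds).
ROWS = """\
query21 15.43 0.91 0.58
query22 0.77 1.04 0.62
query23 15.15 1.41 0.67
query24 6.90 1.32 0.35
"""


def summarize(table: str) -> tuple[Decimal, Decimal]:
    """Return (cold total, hot total) for whitespace-separated result rows."""
    cold_total = Decimal(0)
    hot_total = Decimal(0)
    for line in table.splitlines():
        _name, cold, hot_a, hot_b = line.split()
        cold_total += Decimal(cold)
        # The hot figure for a query is the faster of its two warm runs.
        hot_total += min(Decimal(hot_a), Decimal(hot_b))
    return cold_total, hot_total


if __name__ == "__main__":
    cold, hot = summarize(ROWS)
    print(f"cold: {cold} s, hot: {hot} s")
```

`Decimal` is used instead of `float` so that two-decimal timings add up exactly.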

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@englefly
Contributor Author

run p0

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@github-actions github-actions bot added the approved label (indicates a PR has been approved by one committer) Sep 29, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@yujun777 yujun777 left a comment

LGTM

@englefly englefly merged commit 6ed6be2 into apache:master Sep 29, 2025
30 of 33 checks passed
@englefly englefly deleted the bs-downgrade branch September 29, 2025 06:43
github-actions bot pushed a commit that referenced this pull request Oct 13, 2025
yiguolei pushed a commit that referenced this pull request Oct 14, 2025
Cherry-picked from #56279

Co-authored-by: minghong <zhouminghong@selectdb.com>
@yiguolei yiguolei mentioned this pull request Nov 5, 2025

Labels

approved (indicates a PR has been approved by one committer), dev/4.0.1-merged, reviewed

9 participants