morningman commented Aug 30, 2025

What problem does this PR solve?

Users may not specify a data format in a broker load, so the format can only be inferred after the files have been listed.
The initialization of the file format properties object therefore has to be deferred until that point.
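
The ordering described above can be sketched as follows. This is only an illustration, not the PR's code: the class `LoadFormatSketch`, both method names, and the suffix-based inference rule with a CSV fallback are assumptions made for the example, since the description does not say how the format is inferred.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

class LoadFormatSketch {
    /** Assumed rule: guess the format from the first listed file's suffix, else fall back to csv. */
    static String inferFormat(List<String> listedPaths) {
        if (listedPaths.isEmpty()) {
            return "csv";
        }
        String path = listedPaths.get(0).toLowerCase(Locale.ROOT);
        if (path.endsWith(".parquet")) {
            return "parquet";
        }
        if (path.endsWith(".orc")) {
            return "orc";
        }
        if (path.endsWith(".json")) {
            return "json";
        }
        return "csv";
    }

    /** An explicit "format" property wins; otherwise the decision waits for the file listing. */
    static String resolveFormat(Map<String, String> properties, List<String> listedPaths) {
        String explicit = properties.get("format");
        return explicit != null ? explicit : inferFormat(listedPaths);
    }
}
```

The point of the sketch is that `resolveFormat` needs the listing as an input, so it cannot run at statement-analysis time when no format was given.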

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

… load without format (apache#55450)

Users may not specify a data format in a broker load, so the format can
only be inferred after the files have been listed. The initialization of
the file format properties object therefore has to be deferred.

---------

Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>

Thearas commented Aug 30, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

morningman marked this pull request as ready for review August 30, 2025 01:55

Copilot AI left a comment

Pull Request Overview

This PR addresses an issue in broker load functionality where users may not specify a data format, requiring the system to infer the format after listing files. The PR introduces a deferred initialization pattern for file format properties to handle this scenario.

Key changes include:

  • Introduction of DeferredFileFormatProperties class for delayed format property initialization
  • Updates to data description classes to use deferred properties when format is not specified
  • MySQL load command refactoring to remove duplicate methods and fix typos
  • Enhanced test coverage for HDFS load with default file format detection

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| hdfs_load_default_file_format.groovy | New regression test for HDFS load with automatic file format detection |
| LoadCommandTest.java | Updated tests to handle DeferredFileFormatProperties |
| MysqlLoadCommand.java | Fixed typo in method name (handleMysqlLoadComand → handleMysqlLoadCommand) |
| MysqlDataDescription.java | Updated to use deferred file format properties |
| LogicalPlanBuilder.java | Explicitly sets CSV format for MySQL load |
| NereidsDataDescription.java | Updated to use deferred file format properties |
| MysqlLoadManager.java | Removed duplicate methods and refactored to use CSV properties directly |
| BrokerLoadPendingTask.java | Added logic to initialize deferred properties after listing files |
| BrokerFileGroup.java | Added method to initialize deferred properties based on file status |
| FileFormatProperties.java | Added factory method for creating deferred properties |
| DeferredFileFormatProperties.java | New wrapper class for deferred initialization of file format properties |
| DataDescription.java | Updated to use deferred file format properties |

Comment on lines 7851 to 7852
// MySQL load only support csv, set it explicitly
properties.put("format", "csv");

Copilot AI Aug 30, 2025

The code is calling put() on an ImmutableMap which will throw an UnsupportedOperationException. Since properties is created as ImmutableMap.of() on line 7848, it cannot be modified. Consider creating a mutable map or using a builder pattern.

Copilot uses AI. Check for mistakes.
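
A minimal sketch of the concern raised in the comment on lines 7851 to 7852, assuming the map really is immutable as that comment claims. It uses `java.util.Map.of()` as a stand-in for Guava's `ImmutableMap.of()` so that it runs without extra dependencies; both reject `put()` with an `UnsupportedOperationException`. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class ImmutablePutSketch {
    /** Returns true if put() on the given map is rejected as unsupported. */
    static boolean rejectsPut(Map<String, String> properties) {
        try {
            properties.put("format", "csv");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    /** Copies the input into a mutable map and forces the csv format on the copy. */
    static Map<String, String> withCsvFormat(Map<String, String> properties) {
        Map<String, String> copy = new HashMap<>(properties);
        copy.put("format", "csv");
        return copy;
    }
}
```

Copying into a `HashMap` is one of the two options the comment mentions; a builder would work equally well when the map should stay immutable afterwards.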
Comment on lines 305 to +306
public boolean isBinaryFileFormat() {
// Must call initDeferredFileFormatPropertiesIfNecessary before

Copilot AI Aug 30, 2025

The method assumes initDeferredFileFormatPropertiesIfNecessary has been called but doesn't verify it. If fileFormatProperties is still a DeferredFileFormatProperties instance that hasn't been initialized, this will incorrectly return false. Consider checking if it's a DeferredFileFormatProperties and either throw an exception or delegate to the actual implementation.

Suggested change
public boolean isBinaryFileFormat() {
// Must call initDeferredFileFormatPropertiesIfNecessary before
// Defensive: check for uninitialized DeferredFileFormatProperties
if (fileFormatProperties instanceof DeferredFileFormatProperties) {
throw new IllegalStateException("DeferredFileFormatProperties must be initialized before calling isBinaryFileFormat()");
}
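
The fail-fast idea in this suggestion can be sketched in isolation. The types below (`FormatProps`, `ParquetProps`, `DeferredProps`) are hypothetical stand-ins, not the classes from the PR, and the sketch combines both options the comment offers: delegate once the real properties exist, throw before that.

```java
class DeferredGuardSketch {
    interface FormatProps {
        boolean isBinary();
    }

    /** Hypothetical concrete format used only for the example. */
    static final class ParquetProps implements FormatProps {
        @Override
        public boolean isBinary() {
            return true;
        }
    }

    /** Hypothetical deferred wrapper: it has no delegate until resolve() is called. */
    static final class DeferredProps implements FormatProps {
        private FormatProps delegate;

        void resolve(FormatProps actual) {
            this.delegate = actual;
        }

        @Override
        public boolean isBinary() {
            if (delegate == null) {
                // Fail loudly instead of silently answering "false" for an unknown format
                throw new IllegalStateException("format properties not initialized");
            }
            return delegate.isBinary();
        }
    }
}
```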

Comment on lines +114 to +115
delegate = FileFormatProperties.createFileFormatProperties(this.formatName);
delegate.analyzeFileFormatProperties(origProperties, false);

Copilot AI Aug 30, 2025

The createFileFormatProperties method can throw an AnalysisException, but deferInit doesn't declare this exception in its signature. This will cause a compilation error. The method signature should include throws AnalysisException.
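
For readers unfamiliar with the rule behind this comment, here is a small sketch, assuming the exception in question is a checked one as the comment implies. `FormatException`, `createFormat`, and the `deferInit` shown here are hypothetical stand-ins and do not reproduce the PR's signatures.

```java
class CheckedInitSketch {
    /** Hypothetical checked exception standing in for the one named in the comment. */
    static class FormatException extends Exception {
        FormatException(String message) {
            super(message);
        }
    }

    /** Factory that rejects unknown format names with a checked exception. */
    static String createFormat(String name) throws FormatException {
        if ("csv".equals(name) || "parquet".equals(name) || "orc".equals(name)) {
            return name;
        }
        throw new FormatException("unknown format: " + name);
    }

    /** Without "throws FormatException" (or a try/catch) this method would not compile. */
    static String deferInit(String name) throws FormatException {
        return createFormat(name);
    }
}
```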

@morningman

run buildall

@doris-robot

TPC-H: Total hot run time: 34144 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 489c670ec3ebb03c49cf97835f30793e78724bf1, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17643	5327	5114	5114
q2	2022	320	211	211
q3	10255	1295	720	720
q4	10252	1017	544	544
q5	7575	2439	2362	2362
q6	188	170	136	136
q7	937	774	633	633
q8	9338	1321	1138	1138
q9	7006	5175	5085	5085
q10	6991	2418	1957	1957
q11	478	323	281	281
q12	371	356	238	238
q13	17781	3663	3085	3085
q14	242	250	220	220
q15	571	495	481	481
q16	433	432	381	381
q17	607	868	360	360
q18	7521	7216	7070	7070
q19	1082	962	569	569
q20	354	334	234	234
q21	3919	2626	2343	2343
q22	1085	1023	982	982
Total cold run time: 106651 ms
Total hot run time: 34144 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5150	5129	5136	5129
q2	247	335	221	221
q3	2159	2699	2287	2287
q4	1359	1766	1335	1335
q5	4245	4278	4563	4278
q6	228	175	148	148
q7	2082	2015	1850	1850
q8	2704	2674	2638	2638
q9	7374	7469	7323	7323
q10	3110	3331	2856	2856
q11	586	516	510	510
q12	708	814	678	678
q13	3704	3862	3361	3361
q14	308	309	296	296
q15	524	481	496	481
q16	462	500	457	457
q17	1233	1570	1389	1389
q18	7925	7644	7655	7644
q19	874	851	844	844
q20	2030	2045	1830	1830
q21	4708	4351	4340	4340
q22	1064	1082	1001	1001
Total cold run time: 52784 ms
Total hot run time: 50896 ms

@doris-robot

TPC-DS: Total hot run time: 187798 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 489c670ec3ebb03c49cf97835f30793e78724bf1, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1069	436	426	426
query2	6593	1773	1763	1763
query3	6751	236	228	228
query4	26144	23470	23268	23268
query5	4388	656	513	513
query6	357	257	224	224
query7	4662	520	310	310
query8	323	286	250	250
query9	8643	2905	2898	2898
query10	514	370	298	298
query11	15586	14955	14796	14796
query12	174	125	126	125
query13	1672	560	431	431
query14	8645	6045	5954	5954
query15	219	194	185	185
query16	7241	674	508	508
query17	1282	781	673	673
query18	2010	421	330	330
query19	211	200	171	171
query20	136	128	125	125
query21	223	129	123	123
query22	4105	4225	3972	3972
query23	34107	33099	33143	33099
query24	8155	2400	2409	2400
query25	593	527	444	444
query26	1264	282	172	172
query27	2722	520	361	361
query28	4382	2312	2254	2254
query29	834	612	498	498
query30	291	225	217	217
query31	903	833	767	767
query32	96	84	80	80
query33	579	404	366	366
query34	816	868	535	535
query35	865	864	779	779
query36	999	1024	945	945
query37	128	113	96	96
query38	4075	4152	4154	4152
query39	1494	1451	1455	1451
query40	221	141	131	131
query41	73	63	63	63
query42	128	114	130	114
query43	529	540	481	481
query44	1372	871	875	871
query45	185	177	184	177
query46	868	1013	652	652
query47	1792	1817	1728	1728
query48	394	449	339	339
query49	750	517	429	429
query50	661	695	406	406
query51	4101	4239	4129	4129
query52	120	120	109	109
query53	245	268	211	211
query54	621	608	545	545
query55	105	91	96	91
query56	382	340	329	329
query57	1221	1219	1136	1136
query58	300	281	285	281
query59	2703	2703	2713	2703
query60	379	358	352	352
query61	168	164	164	164
query62	829	731	716	716
query63	246	201	205	201
query64	4526	1162	862	862
query65	4323	4221	4242	4221
query66	1193	438	371	371
query67	15451	15243	15147	15147
query68	6569	944	603	603
query69	520	414	305	305
query70	1256	1190	1176	1176
query71	409	353	325	325
query72	6025	5057	4947	4947
query73	664	609	364	364
query74	8958	8831	8689	8689
query75	3140	3106	2666	2666
query76	3246	1152	803	803
query77	501	446	342	342
query78	9471	9541	8941	8941
query79	2935	845	627	627
query80	679	603	541	541
query81	506	270	229	229
query82	460	146	118	118
query83	268	277	315	277
query84	260	118	94	94
query85	896	475	433	433
query86	394	317	316	316
query87	4351	4334	4192	4192
query88	3748	2227	2226	2226
query89	392	340	302	302
query90	1906	235	233	233
query91	165	165	136	136
query92	85	82	73	73
query93	2294	973	637	637
query94	750	425	340	340
query95	415	339	335	335
query96	492	573	283	283
query97	2651	2746	2594	2594
query98	256	215	221	215
query99	1353	1425	1290	1290
Total cold run time: 273152 ms
Total hot run time: 187798 ms

@doris-robot

ClickBench: Total hot run time: 32.91 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 489c670ec3ebb03c49cf97835f30793e78724bf1, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.06	0.04	0.05
query2	0.09	0.05	0.05
query3	0.27	0.09	0.08
query4	1.61	0.12	0.12
query5	0.46	0.42	0.42
query6	1.19	0.65	0.66
query7	0.03	0.02	0.02
query8	0.06	0.04	0.05
query9	0.61	0.56	0.52
query10	0.58	0.59	0.58
query11	0.17	0.12	0.11
query12	0.16	0.13	0.12
query13	0.63	0.62	0.62
query14	0.80	0.84	0.83
query15	0.88	0.84	0.88
query16	0.39	0.40	0.39
query17	1.05	1.06	1.05
query18	0.22	0.20	0.19
query19	1.99	1.79	1.85
query20	0.02	0.01	0.02
query21	15.40	0.99	0.59
query22	0.80	1.29	0.78
query23	14.78	1.41	0.64
query24	6.46	2.71	0.96
query25	0.49	0.26	0.09
query26	0.53	0.16	0.13
query27	0.06	0.06	0.06
query28	9.56	0.87	0.44
query29	12.63	3.90	3.27
query30	3.07	3.03	2.95
query31	2.83	0.56	0.39
query32	3.23	0.56	0.47
query33	3.07	3.05	3.16
query34	16.07	5.52	4.82
query35	4.94	4.99	4.94
query36	0.71	0.52	0.51
query37	0.10	0.07	0.08
query38	0.05	0.04	0.05
query39	0.04	0.03	0.03
query40	0.20	0.15	0.14
query41	0.08	0.03	0.02
query42	0.03	0.04	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.44 s
Total hot run time: 32.91 s

@hello-stephen

FE Regression Coverage Report

Increment line coverage 61.54% (64/104) 🎉
Increment coverage report
Complete coverage report

@morningman

run buildall

@doris-robot

TPC-H: Total hot run time: 34193 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 40488c68241fc09159a8cd048ca3dabc463b8fe5, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17634	5269	5314	5269
q2	2007	316	211	211
q3	10280	1307	738	738
q4	10228	1011	546	546
q5	7564	2302	2392	2302
q6	198	170	146	146
q7	926	774	627	627
q8	9349	1380	1116	1116
q9	6956	5078	5148	5078
q10	6946	2404	1976	1976
q11	489	295	281	281
q12	350	357	227	227
q13	17770	3690	3033	3033
q14	235	245	220	220
q15	568	499	492	492
q16	427	444	383	383
q17	602	858	370	370
q18	7377	7181	7032	7032
q19	1292	946	591	591
q20	344	347	238	238
q21	3907	2548	2311	2311
q22	1086	1043	1006	1006
Total cold run time: 106535 ms
Total hot run time: 34193 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5183	5105	5111	5105
q2	255	330	231	231
q3	2125	2691	2250	2250
q4	1328	1774	1323	1323
q5	4196	4239	4594	4239
q6	217	179	139	139
q7	2030	2046	1855	1855
q8	2648	2655	2521	2521
q9	7380	7430	7295	7295
q10	3106	3302	2924	2924
q11	589	523	494	494
q12	694	770	702	702
q13	3574	3939	3344	3344
q14	294	327	277	277
q15	520	476	477	476
q16	450	492	442	442
q17	1195	1552	1415	1415
q18	7958	7723	7629	7629
q19	821	861	1049	861
q20	1994	2055	1851	1851
q21	4868	4335	4321	4321
q22	1108	1072	992	992
Total cold run time: 52533 ms
Total hot run time: 50686 ms

@doris-robot

TPC-DS: Total hot run time: 186656 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 40488c68241fc09159a8cd048ca3dabc463b8fe5, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1071	462	404	404
query2	6565	1733	1756	1733
query3	6754	229	218	218
query4	25820	23115	22934	22934
query5	4399	640	513	513
query6	356	247	222	222
query7	4641	531	305	305
query8	308	255	252	252
query9	9066	2921	2953	2921
query10	516	365	313	313
query11	15669	15066	14962	14962
query12	169	120	121	120
query13	1681	543	437	437
query14	8586	5809	5703	5703
query15	215	189	171	171
query16	7163	669	523	523
query17	1047	731	625	625
query18	2018	432	357	357
query19	199	203	172	172
query20	131	127	119	119
query21	223	131	112	112
query22	4216	4281	4100	4100
query23	33730	32951	32941	32941
query24	8149	2345	2438	2345
query25	581	511	450	450
query26	1238	283	167	167
query27	2733	502	347	347
query28	4406	2298	2267	2267
query29	814	604	478	478
query30	291	223	200	200
query31	904	808	741	741
query32	90	84	77	77
query33	581	396	357	357
query34	807	872	511	511
query35	849	824	753	753
query36	969	1030	953	953
query37	131	119	102	102
query38	3993	4129	3981	3981
query39	1504	1463	1412	1412
query40	234	136	133	133
query41	70	67	67	67
query42	132	162	132	132
query43	520	497	479	479
query44	1355	869	864	864
query45	185	178	167	167
query46	867	1015	650	650
query47	1775	1832	1730	1730
query48	393	421	322	322
query49	744	512	407	407
query50	662	686	417	417
query51	4117	4108	4160	4108
query52	119	109	110	109
query53	256	265	198	198
query54	612	592	539	539
query55	92	92	96	92
query56	342	377	336	336
query57	1188	1201	1126	1126
query58	289	276	280	276
query59	2695	2726	2666	2666
query60	359	359	345	345
query61	169	155	156	155
query62	804	732	659	659
query63	237	192	193	192
query64	4530	1142	800	800
query65	4330	4242	4176	4176
query66	1170	435	361	361
query67	15525	15097	15157	15097
query68	6125	965	585	585
query69	509	340	291	291
query70	1251	1137	1149	1137
query71	544	353	320	320
query72	5847	5073	5163	5073
query73	699	674	368	368
query74	8904	9162	8888	8888
query75	3171	3144	2650	2650
query76	3125	1159	757	757
query77	514	424	348	348
query78	9485	9633	8984	8984
query79	2295	821	605	605
query80	726	616	539	539
query81	508	261	238	238
query82	199	148	121	121
query83	276	265	256	256
query84	293	120	101	101
query85	955	448	432	432
query86	386	311	320	311
query87	4335	4322	4156	4156
query88	3224	2228	2269	2228
query89	397	336	292	292
query90	1996	220	227	220
query91	157	169	134	134
query92	86	81	75	75
query93	2253	1001	635	635
query94	705	408	316	316
query95	413	339	328	328
query96	489	586	284	284
query97	2620	2662	2568	2568
query98	252	227	221	221
query99	1335	1428	1294	1294
Total cold run time: 270428 ms
Total hot run time: 186656 ms

@doris-robot

ClickBench: Total hot run time: 32.58 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 40488c68241fc09159a8cd048ca3dabc463b8fe5, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.05
query2	0.09	0.04	0.05
query3	0.25	0.08	0.08
query4	1.62	0.12	0.12
query5	0.44	0.43	0.41
query6	1.16	0.64	0.64
query7	0.03	0.03	0.02
query8	0.05	0.05	0.04
query9	0.60	0.55	0.52
query10	0.59	0.59	0.58
query11	0.17	0.12	0.11
query12	0.16	0.12	0.12
query13	0.63	0.61	0.62
query14	0.81	0.85	0.85
query15	0.88	0.85	0.86
query16	0.39	0.41	0.40
query17	1.07	1.07	1.02
query18	0.22	0.21	0.20
query19	1.96	1.84	1.76
query20	0.04	0.02	0.01
query21	15.39	0.92	0.59
query22	0.77	1.21	0.85
query23	14.80	1.37	0.63
query24	6.62	2.27	0.49
query25	0.47	0.13	0.10
query26	0.65	0.16	0.13
query27	0.06	0.06	0.05
query28	10.17	0.95	0.42
query29	12.59	3.94	3.26
query30	3.13	3.01	2.99
query31	2.82	0.60	0.39
query32	3.24	0.57	0.49
query33	3.17	3.05	3.06
query34	15.85	5.45	4.91
query35	4.94	4.90	4.95
query36	0.69	0.52	0.51
query37	0.10	0.08	0.08
query38	0.07	0.05	0.05
query39	0.04	0.03	0.03
query40	0.18	0.14	0.15
query41	0.08	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 107.12 s
Total hot run time: 32.58 s

@hello-stephen

FE UT Coverage Report

Increment line coverage 27.19% (31/114) 🎉
Increment coverage report
Complete coverage report

@hello-stephen

FE Regression Coverage Report

Increment line coverage 63.16% (72/114) 🎉
Increment coverage report
Complete coverage report


github-actions bot commented Sep 2, 2025

PR approved by at least one committer and no changes requested.

github-actions bot added the approved and reviewed labels Sep 2, 2025

github-actions bot commented Sep 2, 2025

PR approved by anyone and no changes requested.

morningman merged commit 5c809e4 into apache:master Sep 2, 2025
26 of 28 checks passed
dataroaring pushed a commit that referenced this pull request Sep 24, 2025
Related PR: #55498

Problem Summary:

When importing a file via S3 load, if the file does not exist on S3, the
load will report an error: label has already been used.

2025-09-22 23:00:57,177 WARN (pending-load-task-scheduler-pool-0|483) [LoadTask.exec():110] LOAD_JOB=1758553246283, error_msg={Unexpected failed to execute load task}
java.lang.IllegalStateException: null
        at com.google.common.base.Preconditions.checkState(Preconditions.java:499) ~[guava-33.2.1-jre.jar:?]
        at org.apache.doris.nereids.load.NereidsLoadingTaskPlanner.plan(NereidsLoadingTaskPlanner.java:146) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.loadv2.LoadLoadingTask.init(LoadLoadingTask.java:138) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.loadv2.BrokerLoadJob.createTask(BrokerLoadJob.java:274) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.loadv2.BrokerLoadJob.createLoadingTask(BrokerLoadJob.java:311) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.loadv2.BrokerLoadJob.onPendingTaskFinished(BrokerLoadJob.java:204) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.loadv2.BrokerLoadJob.onTaskFinished(BrokerLoadJob.java:163) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.loadv2.LoadTask.exec(LoadTask.java:102) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.task.MasterTask.run(MasterTask.java:31) ~[doris-fe.jar:1.2-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.lang.Thread.run(Thread.java:833) ~[?:?]
github-actions bot pushed a commit that referenced this pull request Sep 24, 2025
github-actions bot pushed a commit that referenced this pull request Sep 24, 2025
dataroaring pushed a commit that referenced this pull request Sep 28, 2025
### What problem does this PR solve?

Introduced by #55498.

Fixes a failure when executing a copy task:
```
Caused by: org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = Cannot invoke "java.util.Optional.isPresent()" because the return value of "org.apache.doris.nereids.trees.plans.commands.info.CopyFromDesc.getFileFilterExpr()" is null
        ... 13 more
Caused by: java.lang.NullPointerException: Cannot invoke "java.util.Optional.isPresent()" because the return value of "org.apache.doris.nereids.trees.plans.commands.info.CopyFromDesc.getFileFilterExpr()" is null
        at org.apache.doris.nereids.trees.plans.commands.info.CopyIntoInfo.doValidate(CopyIntoInfo.java:262) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.info.CopyIntoInfo.validate(CopyIntoInfo.java:195) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.CopyIntoCommand.run(CopyIntoCommand.java:57) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:668) ~[doris-fe.jar:1.2-SNAPSHOT]
        ... 12 more

```
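
The trace shows an `Optional`-returning getter handing back `null`. How the referenced commit fixes this is not shown here; the sketch below only illustrates the general remedy of never returning `null` from such a getter. `CopyDesc` and `hasFilter` are hypothetical names, and the filter is modelled as a plain `String` for brevity.

```java
import java.util.Optional;

class OptionalGetterSketch {
    /** Hypothetical descriptor whose filter expression may be absent. */
    static final class CopyDesc {
        private final String fileFilterExpr; // null when no filter was given

        CopyDesc(String fileFilterExpr) {
            this.fileFilterExpr = fileFilterExpr;
        }

        /** Wraps the nullable field so callers can always call isPresent() safely. */
        Optional<String> getFileFilterExpr() {
            return Optional.ofNullable(fileFilterExpr);
        }
    }

    static boolean hasFilter(CopyDesc desc) {
        return desc.getFileFilterExpr().isPresent();
    }
}
```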

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
github-actions bot pushed a commit that referenced this pull request Sep 28, 2025
github-actions bot pushed a commit that referenced this pull request Sep 28, 2025

Labels

approved, dev/3.1.0-merged, reviewed
