@github-actions
Contributor

Cherry-picked from #52149

…ceberg Catalog (#52149)

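### What problem does this PR solve?

On an Iceberg catalog backed by a Kerberos-enabled Hive Metastore, such as the one below, `show create database` fails with `Failed to connect to Hive Metastore`: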
```
CREATE CATALOG `docker2_catalog3_iceberg` PROPERTIES (
"warehouse" = "hdfs://hadoop-master-2:8620/user/iceberg/hms/",
"uri" = "thrift://hadoop-master-2:9683",
"type" = "iceberg",
"list-all-tables" = "true",
"io-impl" = "org.apache.doris.datasource.iceberg.fileio.DelegateFileIO",
"iceberg.catalog.type" = "hms",
"hive.metastore.uris" = "thrift://hadoop-master-2:9683",
"hive.metastore.sasl.enabled" = "true",
"hive.metastore.kerberos.principal" = "hive/hadoop-master-2@OTHERREALM.COM",
"hadoop.security.authentication" = "kerberos",
"hadoop.kerberos.principal" = "hdfs/test4@OTHERREALM.COM",
"hadoop.kerberos.keytab" = "/kerberos/keytab/docker2/test4",
"fs.defaultFS" = "hdfs://hadoop-master-2:8620"
);

show create database icebergcatalog.db;
```
```
Caused by: org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = Failed to connect to Hive Metastore
        ... 13 more
Caused by: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:85) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:34) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:143) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:70) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:65) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.hive.HiveCatalog.loadNamespaceMetadata(HiveCatalog.java:643) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.doris.datasource.iceberg.IcebergExternalDatabase.getLocation(IcebergExternalDatabase.java:45) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ShowCreateDatabaseCommand.doRun(ShowCreateDatabaseCommand.java:98) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ShowCommand.run(ShowCommand.java:54) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:769) ~[doris-fe.jar:1.2-SNAPSHOT]
        ... 12 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:112) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
        at org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:60) ~[iceberg-common-1.9.1.jar:?]
        at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:72) ~[iceberg-common-1.9.1.jar:?]
        at org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:189) ~[iceberg-common-1.9.1.jar:?]
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:63) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:34) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:143) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:70) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:65) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.hive.HiveCatalog.loadNamespaceMetadata(HiveCatalog.java:643) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.doris.datasource.iceberg.IcebergExternalDatabase.getLocation(IcebergExternalDatabase.java:45) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ShowCreateDatabaseCommand.doRun(ShowCreateDatabaseCommand.java:98) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ShowCommand.run(ShowCommand.java:54) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:769) ~[doris-fe.jar:1.2-SNAPSHOT]
        ... 12 more
Caused by: java.lang.reflect.InvocationTargetException
        at jdk.internal.reflect.GeneratedConstructorAccessor114.newInstance(Unknown Source) ~[?:?]
        at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
        at java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499) ~[?:?]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:480) ~[?:?]
        at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:112) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
        at org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:60) ~[iceberg-common-1.9.1.jar:?]
        at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:72) ~[iceberg-common-1.9.1.jar:?]
        at org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:189) ~[iceberg-common-1.9.1.jar:?]
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:63) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:34) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:143) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:70) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:65) ~[iceberg-core-1.9.1.jar:?]
        at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.iceberg.hive.HiveCatalog.loadNamespaceMetadata(HiveCatalog.java:643) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
        at org.apache.doris.datasource.iceberg.IcebergExternalDatabase.getLocation(IcebergExternalDatabase.java:45) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ShowCreateDatabaseCommand.doRun(ShowCreateDatabaseCommand.java:98) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ShowCommand.run(ShowCommand.java:54) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:769) ~[doris-fe.jar:1.2-SNAPSHOT]
        ... 12 more
```
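
Reading the trace as a chain of causes makes it easier to see where it stops. The sketch below is not part of the PR: it uses an abbreviated copy of the trace (only the `Caused by` headers and one frame each), and `cause_chain` is a hypothetical helper name introduced for illustration.

```python
import re

# Abbreviated copy of the trace above: each "Caused by" header plus at most one frame.
TRACE = """\
Caused by: org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = Failed to connect to Hive Metastore
        ... 13 more
Caused by: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
        at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:85) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86) ~[hive-catalog-shade-3.0.1.jar:3.0.1]
Caused by: java.lang.reflect.InvocationTargetException
        at jdk.internal.reflect.GeneratedConstructorAccessor114.newInstance(Unknown Source) ~[?:?]
"""

# Exception class name, optionally followed by ": message".
CAUSE = re.compile(r"^Caused by: ([\w.$]+)(?:: (.*))?$")


def cause_chain(trace):
    """Return (exception class, message) pairs in the order they appear."""
    chain = []
    for line in trace.splitlines():
        match = CAUSE.match(line)
        if match:
            chain.append((match.group(1), match.group(2) or ""))
    return chain


chain = cause_chain(TRACE)
for exception_class, message in chain:
    print(exception_class, "-", message)
```

The last entry is `java.lang.reflect.InvocationTargetException`, which only wraps an exception thrown by the reflectively invoked constructor, so the underlying error is not included in the pasted trace.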
Co-authored-by: Tiewei Fang <fangtiewei@selectdb.com>
github-actions bot requested a review from dataroaring as a code owner on June 24, 2025 10:14
@Thearas
Contributor

Thearas commented Jun 24, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

dataroaring reopened this on Jun 24, 2025
@Thearas
Contributor

Thearas commented Jun 24, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 39523 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 137b34870857b93aacbb4db588f457aa2ea49140, data reload: false

------ Round 1 ----------------------------------
q1	17586	6737	6579	6579
q2	2048	169	182	169
q3	10589	1073	1164	1073
q4	10569	755	748	748
q5	7738	2849	2744	2744
q6	214	132	132	132
q7	998	619	593	593
q8	9347	1935	2073	1935
q9	6582	6359	6368	6359
q10	6995	2269	2231	2231
q11	473	272	263	263
q12	396	221	213	213
q13	17796	2996	3030	2996
q14	241	208	208	208
q15	514	475	466	466
q16	471	395	377	377
q17	960	573	528	528
q18	7318	6625	6639	6625
q19	1398	1099	1020	1020
q20	484	214	204	204
q21	3890	3101	3086	3086
q22	1083	974	980	974
Total cold run time: 107690 ms
Total hot run time: 39523 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6584	6660	6586	6586
q2	327	225	229	225
q3	2908	2785	2776	2776
q4	2066	1784	1751	1751
q5	5783	5712	5684	5684
q6	203	133	125	125
q7	2218	1805	1786	1786
q8	3350	3523	3535	3523
q9	8952	8742	8878	8742
q10	3570	3519	3510	3510
q11	596	503	497	497
q12	804	593	595	593
q13	9114	3106	3117	3106
q14	306	286	303	286
q15	526	481	451	451
q16	487	440	423	423
q17	1843	1627	1629	1627
q18	8239	7695	7641	7641
q19	1684	1501	1581	1501
q20	2114	1832	1811	1811
q21	5092	4966	5088	4966
q22	1145	1054	1058	1054
Total cold run time: 67911 ms
Total hot run time: 58664 ms
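
The totals in this report can be cross-checked from the rows. A minimal sketch, assuming the columns are query, cold run, two hot runs, and the faster of the two hot runs (all in ms); the report does not label them, so this reading is inferred from the numbers. `totals` is a hypothetical helper name.

```python
# Round 1 rows copied from the TPC-H report above.
# Assumed column order: query, cold run, hot run 1, hot run 2, best hot run (ms).
ROUND_1 = """\
q1 17586 6737 6579 6579
q2 2048 169 182 169
q3 10589 1073 1164 1073
q4 10569 755 748 748
q5 7738 2849 2744 2744
q6 214 132 132 132
q7 998 619 593 593
q8 9347 1935 2073 1935
q9 6582 6359 6368 6359
q10 6995 2269 2231 2231
q11 473 272 263 263
q12 396 221 213 213
q13 17796 2996 3030 2996
q14 241 208 208 208
q15 514 475 466 466
q16 471 395 377 377
q17 960 573 528 528
q18 7318 6625 6639 6625
q19 1398 1099 1020 1020
q20 484 214 204 204
q21 3890 3101 3086 3086
q22 1083 974 980 974
"""


def totals(table):
    """Sum the cold-run column and the best-hot-run column of a report table."""
    cold_total = 0
    hot_total = 0
    for line in table.splitlines():
        name, cold, hot1, hot2, best = line.split()
        # Check the assumption that the last column is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2)), name
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


cold_ms, hot_ms = totals(ROUND_1)
print(f"cold: {cold_ms} ms, hot: {hot_ms} ms")
```

For Round 1 this reproduces the reported totals of 107690 ms (cold) and 39523 ms (hot).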

@doris-robot

TPC-DS: Total hot run time: 196993 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 137b34870857b93aacbb4db588f457aa2ea49140, data reload: false

query1	1303	926	898	898
query2	6370	1912	1868	1868
query3	10868	4285	4257	4257
query4	61468	29330	23430	23430
query5	5190	458	461	458
query6	428	181	182	181
query7	5509	322	314	314
query8	316	229	225	225
query9	8554	2563	2546	2546
query10	457	274	253	253
query11	17420	15124	15694	15124
query12	167	106	107	106
query13	1465	463	446	446
query14	10472	7294	7279	7279
query15	199	184	180	180
query16	7112	443	438	438
query17	1260	561	552	552
query18	1915	319	302	302
query19	209	161	182	161
query20	113	110	118	110
query21	214	106	104	104
query22	4660	4512	4736	4512
query23	34463	34492	34048	34048
query24	6113	2954	2842	2842
query25	552	429	424	424
query26	649	165	170	165
query27	1778	356	353	353
query28	3833	2168	2139	2139
query29	730	482	453	453
query30	243	159	156	156
query31	973	826	812	812
query32	67	62	59	59
query33	458	306	299	299
query34	903	524	530	524
query35	855	751	721	721
query36	1080	950	951	950
query37	110	72	73	72
query38	4046	4034	4005	4005
query39	1544	1461	1485	1461
query40	203	104	134	104
query41	134	53	47	47
query42	107	98	97	97
query43	510	484	476	476
query44	1133	800	827	800
query45	182	175	173	173
query46	1144	733	722	722
query47	1981	1908	1916	1908
query48	478	392	404	392
query49	740	386	394	386
query50	828	437	442	437
query51	7438	7240	7180	7180
query52	97	90	89	89
query53	258	188	185	185
query54	558	466	457	457
query55	78	79	76	76
query56	288	249	254	249
query57	1329	1180	1191	1180
query58	226	213	212	212
query59	3212	2917	2975	2917
query60	278	263	262	262
query61	118	115	111	111
query62	807	681	705	681
query63	225	194	196	194
query64	1400	660	639	639
query65	3326	3265	3199	3199
query66	717	286	292	286
query67	15976	15719	15851	15719
query68	4181	563	548	548
query69	430	256	254	254
query70	1112	1141	1078	1078
query71	347	257	258	257
query72	6353	4003	3957	3957
query73	744	343	356	343
query74	10444	9282	9339	9282
query75	3358	2619	2659	2619
query76	1987	1071	1090	1071
query77	472	264	284	264
query78	10567	9627	9531	9531
query79	2307	609	612	609
query80	1308	412	419	412
query81	526	220	220	220
query82	1225	88	91	88
query83	160	145	138	138
query84	276	82	84	82
query85	983	302	286	286
query86	397	303	290	290
query87	4349	4253	4194	4194
query88	3722	2373	2348	2348
query89	414	293	291	291
query90	1965	192	220	192
query91	182	148	146	146
query92	63	50	50	50
query93	2749	559	558	558
query94	782	293	299	293
query95	356	267	257	257
query96	622	277	274	274
query97	3294	3155	3171	3155
query98	215	195	196	195
query99	1604	1306	1320	1306
Total cold run time: 315331 ms
Total hot run time: 196993 ms

@doris-robot

ClickBench: Total hot run time: 29.88 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 137b34870857b93aacbb4db588f457aa2ea49140, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.63	0.10	0.10
query5	0.52	0.51	0.50
query6	1.15	0.73	0.72
query7	0.03	0.02	0.02
query8	0.03	0.03	0.03
query9	0.54	0.50	0.50
query10	0.56	0.56	0.54
query11	0.14	0.10	0.10
query12	0.14	0.11	0.12
query13	0.60	0.59	0.58
query14	0.78	0.82	0.80
query15	0.83	0.82	0.81
query16	0.37	0.38	0.38
query17	1.00	1.04	1.02
query18	0.25	0.22	0.22
query19	1.92	1.77	1.85
query20	0.02	0.01	0.01
query21	15.39	0.59	0.59
query22	2.28	1.92	1.63
query23	17.00	0.97	0.92
query24	2.78	1.44	0.80
query25	0.17	0.18	0.09
query26	0.42	0.13	0.14
query27	0.05	0.03	0.05
query28	10.56	0.47	0.49
query29	12.59	3.25	3.25
query30	0.25	0.06	0.06
query31	2.86	0.38	0.37
query32	3.25	0.45	0.46
query33	2.98	3.01	3.02
query34	17.32	4.50	4.48
query35	4.54	4.52	4.49
query36	0.67	0.48	0.48
query37	0.09	0.06	0.06
query38	0.06	0.03	0.03
query39	0.03	0.03	0.02
query40	0.16	0.13	0.13
query41	0.08	0.02	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.02
Total cold run time: 104.44 s
Total hot run time: 29.88 s

Contributor

dataroaring left a comment

LGTM

@github-actions
Contributor Author

github-actions bot commented Jul 8, 2025

PR approved by at least one committer and no changes requested.

github-actions bot added the approved and reviewed labels on Jul 8, 2025
@github-actions
Contributor Author

github-actions bot commented Jul 8, 2025

PR approved by anyone and no changes requested.

dataroaring merged commit 08b7413 into branch-3.0 on Jul 9, 2025
23 of 25 checks passed
github-actions bot deleted the auto-pick-52149-branch-3.0 branch on July 9, 2025 02:22

5 participants