@morningman
Contributor

@morningman morningman commented Sep 24, 2025

What problem does this PR solve?

Iceberg has 3 levels of metadata: catalog, namespace and table, which map to Doris' catalog, database and table.

Iceberg supports nested namespaces, which means the following namespaces are all valid:

ns1
ns1.ns2
ns1.ns2.ns3

So we need to support mapping a nested namespace to a Doris database.

This PR adds a global variable enable_nested_namespace to control this behavior.
The default is false, in which case the existing logic is unchanged.

If it is set to true, Doris supports the following statements:
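
As a rough illustration of the mapping described above, the sketch below flattens a list of Iceberg namespace levels into one dotted database name. It is written in Python for brevity and is not the Doris implementation; the function name `namespace_to_db_name` is hypothetical, and rejecting levels that contain a dot is an assumption made here so the dotted name stays unambiguous.

```python
from typing import Sequence


def namespace_to_db_name(levels: Sequence[str]) -> str:
    """Join namespace levels into one dotted name (hypothetical helper)."""
    if not levels:
        raise ValueError("a namespace needs at least one level")
    for level in levels:
        # Assumption: a level may not be empty or contain the separator itself,
        # otherwise the dotted name could not be split back into levels.
        if not level or "." in level:
            raise ValueError(f"invalid namespace level: {level!r}")
    return ".".join(levels)


for ns in (["ns1"], ["ns1", "ns2"], ["ns1", "ns2", "ns3"]):
    print(namespace_to_db_name(ns))
# prints:
# ns1
# ns1.ns2
# ns1.ns2.ns3
```

Under this scheme every nesting depth shows up as its own flat database name, which matches the shape of the `show databases` output in this description.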

mysql> switch iceberg;
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| nested             |
| nested.db1         |
| nested.db2         |
+--------------------+

mysql> use iceberg.nested.db1;
ERROR 1049 (42000): Only one dot can be in the name: iceberg.nested.db1
mysql> use iceberg.`nested.db1`;
ERROR 5086 (42000): errCode = 2, detailMessage = Unknown catalog 'nested'

mysql> set global enable_nested_namespace=true;

mysql> use iceberg.nested.db1;
Database changed
mysql> select k1 from iceberg.`nested.db1`.nested1;
mysql> select nested1.k1 from `nested.db1`.nested1;
mysql> select `nested.db1`.nested1.k1 from iceberg.`nested.db1`.nested1;
mysql> select iceberg.`nested.db1`.nested1.k1 from nested1;
+------+
| k1   |
+------+
|    1 |
+------+

mysql> refresh catalog iceberg;
mysql> refresh database iceberg.`nested.db1`;
mysql> refresh table iceberg.`nested.db1`.nested1;
Query OK, 0 rows affected (0.01 sec)

However, a statement like the following still cannot be executed:

use iceberg.`nested.db1`;

I don't know why, but the MySQL client behaves strangely when backquotes are added:
the INIT_DB command receives only the nested.db1 part, whereas iceberg.nested.db1 is expected.
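
One possible reading of the errors shown earlier is that the server splits the name it receives at the dots and treats the first segment as the catalog. The Python sketch below models that reading under stated assumptions; `resolve_use_target` and its parameters are hypothetical names, not Doris code, and the real resolution logic may differ.

```python
from typing import Tuple


def resolve_use_target(name: str, current_catalog: str, nested_enabled: bool) -> Tuple[str, str]:
    """Split the argument of USE / INIT_DB into (catalog, database).

    Assumption: the first dot-separated segment names the catalog and the
    remainder names the database; nesting is only accepted when enabled.
    """
    parts = name.split(".")
    if len(parts) == 1:
        # No dot: a database in the current catalog.
        return current_catalog, name
    if len(parts) > 2 and not nested_enabled:
        raise ValueError(f"Only one dot can be in the name: {name}")
    return parts[0], ".".join(parts[1:])


print(resolve_use_target("iceberg.nested.db1", "internal", True))
# ('iceberg', 'nested.db1')
print(resolve_use_target("nested.db1", "iceberg", False))
# ('nested', 'db1')
```

The second call shows why a server that receives only `nested.db1` could report an unknown catalog named `nested`: with a single dot, the first segment is taken to be the catalog.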

This PR also supports creating a nested database name in the internal catalog:

create database `db1.db2`

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Sep 24, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman morningman changed the title from "[opt](code) minor" to "[opt](catalog) support nested namespaces of iceberg" Sep 24, 2025
@morningman morningman marked this pull request as draft September 24, 2025 19:58
@morningman
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 30.54 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 464df8fa9d799625580cf7cc7152c79d737312ab, data reload: false

query1	0.06	0.05	0.06
query2	0.09	0.05	0.07
query3	0.26	0.08	0.08
query4	1.61	0.12	0.11
query5	0.28	0.26	0.25
query6	1.20	0.65	0.64
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.62	0.53	0.51
query10	0.57	0.57	0.59
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.62	0.62	0.61
query14	1.02	1.05	1.04
query15	0.87	0.87	0.87
query16	0.40	0.42	0.39
query17	1.05	1.03	1.05
query18	0.22	0.19	0.20
query19	1.94	1.88	1.83
query20	0.02	0.01	0.01
query21	15.46	0.92	0.58
query22	0.77	1.15	0.66
query23	14.98	1.39	0.64
query24	6.75	2.24	0.98
query25	0.54	0.36	0.08
query26	0.40	0.15	0.14
query27	0.06	0.07	0.05
query28	10.15	1.33	0.93
query29	12.57	3.91	3.27
query30	0.27	0.16	0.10
query31	2.86	0.59	0.38
query32	3.24	0.56	0.47
query33	3.05	3.05	3.15
query34	16.16	5.45	4.88
query35	4.98	4.94	4.96
query36	0.71	0.52	0.50
query37	0.10	0.08	0.08
query38	0.07	0.05	0.04
query39	0.04	0.03	0.02
query40	0.18	0.15	0.13
query41	0.09	0.04	0.03
query42	0.03	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 104.75 s
Total hot run time: 30.54 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 26.67% (12/45) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 42.22% (19/45) 🎉
Increment coverage report
Complete coverage report

@morningman
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 30.61 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 78b1d3aaf08771de6d0f73a759555fad095776a5, data reload: false

query1	0.06	0.04	0.04
query2	0.09	0.05	0.06
query3	0.25	0.08	0.08
query4	1.60	0.12	0.12
query5	0.28	0.27	0.27
query6	1.18	0.66	0.66
query7	0.04	0.03	0.03
query8	0.05	0.04	0.04
query9	0.62	0.55	0.54
query10	0.59	0.57	0.58
query11	0.16	0.12	0.11
query12	0.16	0.12	0.13
query13	0.63	0.62	0.62
query14	1.00	1.04	1.03
query15	0.87	0.86	0.87
query16	0.41	0.39	0.40
query17	1.05	1.06	1.08
query18	0.22	0.20	0.20
query19	1.92	1.84	1.90
query20	0.02	0.01	0.01
query21	15.42	0.91	0.57
query22	0.77	1.18	0.63
query23	15.00	1.38	0.63
query24	6.74	1.43	0.94
query25	0.49	0.11	0.11
query26	0.60	0.18	0.14
query27	0.07	0.05	0.06
query28	9.30	1.37	0.94
query29	12.56	3.92	3.26
query30	0.28	0.13	0.11
query31	2.83	0.60	0.39
query32	3.24	0.56	0.47
query33	3.05	3.05	3.08
query34	16.15	5.43	4.90
query35	4.90	4.90	4.92
query36	0.69	0.52	0.50
query37	0.11	0.07	0.07
query38	0.06	0.05	0.04
query39	0.04	0.03	0.03
query40	0.19	0.15	0.15
query41	0.09	0.04	0.03
query42	0.03	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 103.85 s
Total hot run time: 30.61 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 43.48% (40/92) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 42.39% (39/92) 🎉
Increment coverage report
Complete coverage report

@morningman morningman marked this pull request as ready for review September 26, 2025 17:56
@morningman
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 30.29 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 648cc8912b8567f12c666dbd60e7a3aab6fb13c1, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.06
query3	0.25	0.08	0.08
query4	1.61	0.11	0.12
query5	0.27	0.29	0.25
query6	1.20	0.66	0.64
query7	0.03	0.02	0.02
query8	0.05	0.04	0.05
query9	0.61	0.56	0.52
query10	0.59	0.56	0.58
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.63	0.63	0.62
query14	1.06	1.04	1.02
query15	0.86	0.89	0.88
query16	0.41	0.40	0.39
query17	1.03	1.08	1.07
query18	0.21	0.19	0.20
query19	1.90	1.81	1.85
query20	0.02	0.01	0.01
query21	15.41	0.96	0.60
query22	0.75	1.22	0.75
query23	14.90	1.38	0.62
query24	6.88	2.02	0.60
query25	0.46	0.09	0.22
query26	0.64	0.17	0.13
query27	0.07	0.06	0.06
query28	9.81	1.34	0.93
query29	12.53	4.00	3.25
query30	0.27	0.13	0.12
query31	2.82	0.60	0.39
query32	3.24	0.59	0.47
query33	3.01	3.09	3.14
query34	16.15	5.55	4.87
query35	4.94	4.95	4.89
query36	0.70	0.51	0.51
query37	0.11	0.08	0.08
query38	0.07	0.05	0.04
query39	0.03	0.03	0.03
query40	0.17	0.16	0.13
query41	0.10	0.03	0.03
query42	0.04	0.04	0.04
query43	0.04	0.03	0.04
Total cold run time: 104.32 s
Total hot run time: 30.29 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 43.48% (40/92) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 27.17% (25/92) 🎉
Increment coverage report
Complete coverage report

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 30, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit c76ff7b into apache:master Oct 1, 2025
27 of 28 checks passed
github-actions bot pushed a commit that referenced this pull request Oct 1, 2025
yiguolei pushed a commit that referenced this pull request Oct 1, 2025
#56695)

Cherry-picked from #56415

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
dwdwqfwe pushed a commit to dwdwqfwe/doris that referenced this pull request Oct 4, 2025
morningman added a commit that referenced this pull request Oct 16, 2025
### What problem does this PR solve?

Followup #56415

Problem Summary:

1. The previous `getNamespace` logic was wrong; we should split the
`dbName` by `.` to create the namespace.
2. Allow `oauth.uri` to be left unspecified for an Iceberg REST catalog,
to follow the new spec of IRC (Iceberg REST Catalog).

    So we can connect to Snowflake Open Catalog like this:
    ```
    CREATE CATALOG ice PROPERTIES (
        'type' = 'iceberg',
        'warehouse' = 'yy_external_catalog3',
        'iceberg.catalog.type' = 'rest',
        'iceberg.rest.uri' = 'https://xxx.snowflakecomputing.com/polaris/api/catalog',
        'iceberg.rest.security.type' = 'oauth2',
        'iceberg.rest.oauth2.credential' = 'id:secret',
        'iceberg.rest.oauth2.scope' = 'PRINCIPAL_ROLE:yy_sn_principal_role',
        'iceberg.rest.nested-namespace-enabled' = 'true',
        's3.endpoint' = 'https://s3.us-west-2.amazonaws.com',
        's3.region' = 'us-west-2'
    );
    ```
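
The fix in item 1 can be pictured with a minimal sketch, given here in Python rather than the Java of the Doris frontend. The function `get_namespace` below is a hypothetical stand-in for the idea of splitting `dbName` by `.`, not the actual `getNamespace` method, and the `nested_enabled` parameter is an assumption added for illustration.

```python
from typing import Tuple


def get_namespace(db_name: str, nested_enabled: bool = True) -> Tuple[str, ...]:
    """Turn a Doris database name into Iceberg namespace levels (hypothetical)."""
    if not nested_enabled:
        # Assumption: without nesting, the whole name is a single level.
        return (db_name,)
    levels = tuple(db_name.split("."))
    if any(not level for level in levels):
        raise ValueError(f"empty namespace level in: {db_name!r}")
    return levels


print(get_namespace("nested.db1"))
# ('nested', 'db1')
print(get_namespace("nested.db1", nested_enabled=False))
# ('nested.db1',)
```

Treating the whole string as one level would look up a single namespace literally named `nested.db1`, which is different from the two-level namespace `nested` / `db1`.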
github-actions bot pushed a commit that referenced this pull request Oct 16, 2025
morningman added a commit to morningman/doris that referenced this pull request Oct 16, 2025
morningman added a commit to morningman/doris that referenced this pull request Oct 16, 2025
Labels

approved Indicates a PR has been approved by one committer. dev/3.1.2-merged dev/4.0.0-merged reviewed

7 participants