[fix](hive) use the remote name when get meta from hive system. #52561
run buildall
FE UT Coverage Report: Increment line coverage

run buildall
TPC-H: Total hot run time: 34086 ms
FE UT Coverage Report: Increment line coverage
TPC-DS: Total hot run time: 185456 ms

run buildall
TPC-H: Total hot run time: 34290 ms
TPC-DS: Total hot run time: 185094 ms
ClickBench: Total hot run time: 29.31 s

run buildall
TPC-H: Total hot run time: 34851 ms
TPC-DS: Total hot run time: 189259 ms
ClickBench: Total hot run time: 29.67 s
FE UT Coverage Report: Increment line coverage

Force-pushed from c3e9e1d to cfc391b

run buildall
TPC-H: Total hot run time: 33648 ms
TPC-DS: Total hot run time: 184169 ms
FE UT Coverage Report: Increment line coverage
ClickBench: Total hot run time: 29.99 s

fe/fe-core/src/main/java/org/apache/doris/datasource/ExternalTable.java (outdated)
I see that the `TablePartitionKey` class is not used in many places. If it is only used for comparing local names, would `LocalTablePartitionKey` be a more precise name?
I added a comment to this class, so I think the name is OK.
morningman left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
…he#52561)

### What problem does this PR solve?

#### Background

This PR mainly addresses the issue of case sensitivity for databases and tables in `ExternalCatalog`. By default, the database and table names in Doris are case-sensitive. However, if the user sets the `lower_case_table_name` parameter, the case sensitivity of table names changes. For example, when `lower_case_table_name` is set to 1, all table names are stored in lowercase.

This presents a problem: when accessing external data sources through a client, if the target data source is case-sensitive, using lowercase names may result in failure to retrieve information.

Example:

Assume there is a table named `TABLE` in the external system, and `lower_case_table_name` is set to 1, so the table name stored in Doris is `table`. When using `table` to access the external system, it may result in a "table not found" situation.

Therefore, we need to modify the current code and establish a principle:

1. When accessing external systems, always use the original name.
2. When accessing databases and tables synchronized from Doris itself, use the name recognized by Doris (e.g., lowercase, or any name if case-insensitive).

Currently, Doris records the synchronized name (name) and the original name (RemoteName) in `ExternalDatabase` and `ExternalTable`. We need to utilize this information to sort out all internal and external access interfaces to ensure correct behavior.

#### Main Modifications

1. Added the `NameMapping` class and replaced the original `SimpleTableInfo` class. `NameMapping` records the CatalogId corresponding to a table, the local and original names of the corresponding DB, and the local and original names of the table itself.
2. Modified all `ExternalCache` CacheKeys.

   For most `ExternalCache`, `NameMapping` is passed into the CacheKey. Since the cache typically interacts with external systems to retrieve information, the original name is needed. Meanwhile, some caches need to interact with Doris's internal metadata, so local storage name information is also required. Through `NameMapping`, we can clearly obtain this information to make correct calls.
3. Restructured various interface parameters in the `CatalogIf` class, masking implementation differences between new and old optimizers and reducing redundant code.
4. Fixed some metadata synchronization logic.

   For example, after a truncate table operation, other FEs need to execute `afterTruncateTable` for information synchronization. At this point, we only need to check if the current FE has cached information for the corresponding table; if so, clear the relevant cache. If not, there is no need to access the external system to retrieve this table, as accessing the external system might fail, interrupting metadata replay.
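
The name-mapping and cache-key changes described above can be illustrated with a minimal Java sketch. This is not the code from the PR: `NameMappingSketch`, `SchemaCacheSketch`, their fields, and `invalidateIfCached` are hypothetical names, and the lowercase folding in the constructor simply assumes the `lower_case_table_name = 1` behavior from the example. The sketch only shows the idea that the cache key carries both names, that loads from the external system use the original (remote) name, and that replaying a truncate touches the cache without calling the external system.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for NameMapping: catalog id plus local and remote names. */
final class NameMappingSketch {
    final long catalogId;
    final String localDbName;
    final String localTblName;
    final String remoteDbName;
    final String remoteTblName;

    NameMappingSketch(long catalogId, String remoteDbName, String remoteTblName) {
        this.catalogId = catalogId;
        this.remoteDbName = remoteDbName;
        this.remoteTblName = remoteTblName;
        // Assumption: local names are the remote names folded to lowercase.
        this.localDbName = remoteDbName.toLowerCase(Locale.ROOT);
        this.localTblName = remoteTblName.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof NameMappingSketch)) {
            return false;
        }
        NameMappingSketch other = (NameMappingSketch) o;
        return catalogId == other.catalogId
                && localDbName.equals(other.localDbName)
                && localTblName.equals(other.localTblName)
                && remoteDbName.equals(other.remoteDbName)
                && remoteTblName.equals(other.remoteTblName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(catalogId, localDbName, localTblName, remoteDbName, remoteTblName);
    }
}

/** Hypothetical cache keyed by the mapping; the "remote system" is simulated. */
final class SchemaCacheSketch {
    private final Map<NameMappingSketch, String> cache = new ConcurrentHashMap<>();
    int remoteCalls = 0; // demonstration counter, not thread-safe

    String getSchema(NameMappingSketch key) {
        // Loading goes to the external system, so it uses the remote names.
        return cache.computeIfAbsent(key, k -> loadFromRemote(k.remoteDbName, k.remoteTblName));
    }

    private String loadFromRemote(String db, String tbl) {
        remoteCalls++;
        return "schema of " + db + "." + tbl;
    }

    /** On truncate replay: drop the entry only if it is cached; never call the remote system. */
    boolean invalidateIfCached(NameMappingSketch key) {
        return cache.remove(key) != null;
    }
}
```

Under these assumptions, a key built from `("DB", "TABLE")` exposes `table` as its local table name while still loading `DB.TABLE` from the simulated external system, and `invalidateIfCached` returns `false` without any remote call when nothing is cached.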
### What problem does this PR solve? Related PR: #52561. Fix the CacheException message in loadSnapshot so that test_paimon_catalog.groovy passes.
apache#52817) Follow-up to apache#52561. Also fixes a bug introduced by apache#51471, which lost the function rules in JDBC external tables.
### What problem does this PR solve?

#### Background

This PR mainly addresses the issue of case sensitivity for databases and tables in `ExternalCatalog`. By default, the database and table names in Doris are case-sensitive. However, if the user sets the `lower_case_table_name` parameter, the case sensitivity of table names changes. For example, when `lower_case_table_name` is set to 1, all table names are stored in lowercase.

This presents a problem: when accessing external data sources through a client, if the target data source is case-sensitive, using lowercase names may result in failure to retrieve information.

Example:

Assume there is a table named `TABLE` in the external system, and `lower_case_table_name` is set to 1, so the table name stored in Doris is `table`. When using `table` to access the external system, it may result in a "table not found" situation.

Therefore, we need to modify the current code and establish a principle:

1. When accessing external systems, always use the original name.
2. When accessing databases and tables synchronized from Doris itself, use the name recognized by Doris (e.g., lowercase, or any name if case-insensitive).
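
As a rough illustration of the failure described in the example, the following Java sketch models a case-sensitive external system with a plain map. Everything in it (`CaseSensitiveLookupDemo`, `REMOTE_TABLES`, `toLocalName`, `lookupRemote`) is hypothetical and exists only for this demonstration; it is not Doris code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class CaseSensitiveLookupDemo {
    // Hypothetical stand-in for a case-sensitive external metastore.
    static final Map<String, String> REMOTE_TABLES = new HashMap<>();

    static {
        REMOTE_TABLES.put("TABLE", "schema of TABLE");
    }

    // Assumption: with lowercase folding enabled, the local name is the lowercased remote name.
    static String toLocalName(String remoteName) {
        return remoteName.toLowerCase(Locale.ROOT);
    }

    // Returns null when the external system has no table with exactly this name.
    static String lookupRemote(String name) {
        return REMOTE_TABLES.get(name);
    }

    public static void main(String[] args) {
        String remoteName = "TABLE";
        String localName = toLocalName(remoteName);
        // Looking up with the local (lowercased) name misses the table.
        System.out.println(lookupRemote(localName));
        // Looking up with the original (remote) name finds it.
        System.out.println(lookupRemote(remoteName));
    }
}
```

The lookup with the folded name returns `null`, which models the "table not found" case, while the lookup with the original name succeeds.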
Currently, Doris records the synchronized name (name) and the original name (RemoteName) in `ExternalDatabase` and `ExternalTable`. We need to utilize this information to sort out all internal and external access interfaces to ensure correct behavior.

#### Main Modifications

1. Added the `NameMapping` class and replaced the original `SimpleTableInfo` class. `NameMapping` records the CatalogId corresponding to a table, the local and original names of the corresponding DB, and the local and original names of the table itself.
2. Modified all `ExternalCache` CacheKeys.

   For most `ExternalCache`, `NameMapping` is passed into the CacheKey. Since the cache typically interacts with external systems to retrieve information, the original name is needed. Meanwhile, some caches need to interact with Doris's internal metadata, so local storage name information is also required. Through `NameMapping`, we can clearly obtain this information to make correct calls.
3. Restructured various interface parameters in the `CatalogIf` class, masking implementation differences between new and old optimizers and reducing redundant code.
4. Fixed some metadata synchronization logic.

   For example, after a truncate table operation, other FEs need to execute `afterTruncateTable` for information synchronization. At this point, we only need to check if the current FE has cached information for the corresponding table; if so, clear the relevant cache. If not, there is no need to access the external system to retrieve this table, as accessing the external system might fail, interrupting metadata replay.

### Release note

None

### Check List (For Author)

- Test
- Behavior changed:
- Does this need documentation?

### Check List (For Reviewer who merge this PR)