Report native ClickHouse data types and pupulate missing type information#478
Report native ClickHouse data types and pupulate missing type information#478slabko merged 5 commits intoClickHouse:unique-type-names-betafrom slabko:unique-type-names
Conversation
|
discussed offline: we need to test how the existing dashboards react to the data type mapping changes |
|
Unfortunately, after testing, we noticed that this change breaks the MS Power BI Direct Query aggregation functions, such as |
The RowBinary format is not fully supported in the ODBC driver, and maintaining feature parity while advancing and improving our ODBCDriver2 format gets more and more difficult. For now, we stop any development of the RowBinary format, as it has never been officially supported, and focus solely on the more advanced ODBCDriver2 format.
Update SQLGetTypeInfo results to comply with [SQLGetTypeInfo](https://learn.microsoft.com/en-us/sql/odbc/reference/syntax/sqlgettypeinfo-function). Applications must use the type names returned in the TYPE_NAME column of the SQLGetTypeInfo result set in ALTER TABLE and CREATE TABLE statements
| }; | ||
|
|
||
| std::vector<TypeMappingTestEntry> types = { | ||
| {"Bool", "0", SQL_VARCHAR}, |
There was a problem hiding this comment.
Bools are currently represented as strings: Bool columns are represented as a string. This is a known issue and should definitely be fixed. However, this extends the scope of this PR a bit too much, so it will be addressed in a different one.
| {"Int16", "0", SQL_SMALLINT}, | ||
| {"UInt16", "0", SQL_SMALLINT}, | ||
| {"Int32", "0", SQL_INTEGER}, | ||
| {"UInt32", "0", SQL_BIGINT}, |
There was a problem hiding this comment.
Should be SQL_INTEGER (although we used to support some Perl's weird behavior, that should no longer be a problem)
There was a problem hiding this comment.
pyodbc, and probably other libraries in other languages, binds SQL_INTEGER to int32_t producing -1 for the max value of UInt32: 4294967295. So UInt32 have to be SQL_BIGINT to avoid this problem. Not sure if this is even a bug.
| {"String", "0", SQL_VARCHAR}, | ||
| {"FixedString(1)", "'0'", SQL_VARCHAR}, | ||
| {"Date", "0", SQL_TYPE_DATE}, | ||
| {"Date32", "0", SQL_VARCHAR}, |
There was a problem hiding this comment.
Should be SQL_TYPE_DATE
There was a problem hiding this comment.
Note, , Date32DateTime64 are missing in our Python integration tests
There was a problem hiding this comment.
The driver does not currently support Date32: Date32 in not supported. The missing test for Date32 will be added as part of the fix for this issue.
| {"LowCardinality(String)", { | ||
| SQL_VARCHAR, "String", msz, 0, na, na, false, SQL_VARCHAR, na, msz }}, | ||
| // TODO(slabko): Low cardinality nullable types do not detect nullability correctly | ||
| // Notice `null` is set to `false` here, but must be `true` |
There was a problem hiding this comment.
Create an issue in GitHub for that
There was a problem hiding this comment.
There was a problem hiding this comment.
Copilot reviewed 4 out of 19 changed files in this pull request and generated no comments.
Files not reviewed (15)
- driver/CMakeLists.txt: Language not supported
- driver/api/impl/impl.cpp: Language not supported
- driver/api/odbc.cpp: Language not supported
- driver/api/sql_columns_resultset_mutator.cpp: Language not supported
- driver/api/sql_columns_resultset_mutator.h: Language not supported
- driver/environment.cpp: Language not supported
- driver/result_set.cpp: Language not supported
- driver/test/CMakeLists.txt: Language not supported
- driver/test/result_set_reader.hpp: Language not supported
- driver/test/statement_parameters_it.cpp: Language not supported
- driver/test/utils_ut.cpp: Language not supported
- driver/utils/sql_encoding.h: Language not supported
- driver/utils/type_info.cpp: Language not supported
- driver/utils/type_info.h: Language not supported
- driver/utils/utils.h: Language not supported
Comments suppressed due to low confidence (4)
.github/workflows/macOS.yml:104
- Removing 'ClickHouse DSN (ANSI, RBWNAT)' from the TEST_DSN_LIST may reduce overall test coverage. Please verify that this DSN configuration is no longer required in any test scenarios.
+ -DTEST_DSN_LIST="ClickHouse DSN (ANSI);ClickHouse DSN (Unicode)"
.github/workflows/macOS.yml:168
- The removal of the DSN mapping for 'ClickHouse DSN (ANSI, RBWNAT)' should be double-checked to ensure no critical testing scenarios are lost.
- ClickHouse DSN (ANSI, RBWNAT) = ClickHouse ODBC Driver (ANSI)
.github/workflows/Windows.yml:81
- Ensure that removing 'ClickHouse DSN (ANSI, RBWNAT)' from the TEST_DSN_LIST in Windows does not omit necessary test coverage for specific DSN configurations.
+ -DTEST_DSN_LIST="ClickHouse DSN (ANSI);ClickHouse DSN (Unicode)"
.github/workflows/Linux.yml:102
- Verify that the removal of 'ClickHouse DSN (ANSI, RBWNAT)' from the TEST_DSN_LIST does not exclude important test cases for DSN connectivity on Linux.
+ -DTEST_DSN_LIST="ClickHouse DSN (ANSI);ClickHouse DSN (Unicode)"
The driver previously bound SQL_TYPE_TIMESTMP exclusively to `DateTime`, which meant it could not bind values with fractions of seconds. Consequently, it was not possible to use precise timestamps in WHERE clauses. This change updates the driver to bind SQL_TYPE_TIMESTMP to `DateTime64`. This allows inserting values into both `DateTime` and `DateTime64` columns and using precise values in WHERE clauses.
|
/windsurf-review |
|
I ran into an unexpected issue while reviewing this PR. Please try again later. |
|
/windsurf-review |
|
I ran into an unexpected issue while reviewing this PR. Please try again later. |
MS Power BI and other applications were not able to recognize the types of uncommon SQL-world data types, such as
UInt8orInt64. It was unable to parse the data types fromSQLColumnsbecause the dataset returned by this function did not comply with the requirements. This is fixed by changing theSQLColumnsdataset.Additionally,
SQLGetTypeInforeturned SQL types such asINT,SMALLINT,BIGINT, etc. Each of these types had to have two entries in theSQLGetTypeInforesult: one for signed and one for unsigned types. This duplication confused MS Power BI. The fix for this issue involved following the documentation and using database-native types such asInt32,UInt64, etc.