more TIME_NS support (v1.4)#19613
Conversation
pdet
left a comment
There was a problem hiding this comment.
Hi @jraymakers thanks for the PR, is looking good!
Just had a couple minor comments.
| // time_ns columns | ||
| REQUIRE_NO_FAIL(tester.Query("CREATE TABLE time_ns_values(d TIME_NS)")); | ||
| REQUIRE_NO_FAIL(tester.Query("INSERT INTO time_ns_values VALUES ('12:34:56.789123456'), (NULL), ('12:34:56')")); | ||
|
|
||
| result = tester.Query("SELECT * FROM time_ns_values ORDER BY d"); | ||
| REQUIRE_NO_FAIL(*result); | ||
| REQUIRE(result->IsNull(0, 0)); | ||
| duckdb_time_ns time_ns_val = result->Fetch<duckdb_time_ns>(0, 1); | ||
| REQUIRE(time_ns_val.nanos == 45296000000000); | ||
| REQUIRE(result->Fetch<string>(0, 1) == Value::TIME_NS(dtime_ns_t(45296000000000)).ToString()); | ||
| time_ns_val = result->Fetch<duckdb_time_ns>(0, 2); | ||
| REQUIRE(time_ns_val.nanos == 45296789123456); | ||
| REQUIRE(result->Fetch<string>(0, 2) == Value::TIME_NS(dtime_ns_t(45296789123456)).ToString()); | ||
|
|
There was a problem hiding this comment.
Can we add tests for the other conversion values as well?
There was a problem hiding this comment.
Which other conversion values are you referring to? I tried to cover all the same cases as the similar code above, testing TIME.
There was a problem hiding this comment.
This PR implements the conversion for many different precisions, no?
There was a problem hiding this comment.
Ah, I assume you mean the new case for TIME_NS I added to ColumnArrowToDuckDB.
It's true this case, like the one for TIME it is patterned after, implements conversions from all four precision levels defined by Arrow (second, millisecond, microsecond, and nanosecond). But, since DuckDB only writes microseconds for TIME, and (now) nanoseconds for TIME_NS, it's not clear to me how to exercise the other code paths. The existing "Arrow roundtrip" test infrastructure can only converts from Arrow to DuckDB what is converted from DuckDB to Arrow, and I'm not aware of any way to generate the second or millisecond precision levels through this path. There don't appear to be any existing tests for these paths for TIME.
With some effort, it might be possible to write additional test infrastructure that could test non-roundtrip Arrow conversions, such as writing microseconds (using TIME) and then reading nanoseconds (using TIME_NS), or vice versa. But I don't feel confident in my ability to implement that infrastructure; in this PR, I'm just trying to plug some gaps in TIME_NS support by closely following existing patterns for TIME.
f7be3d5 to
0bb38f4
Compare
|
@pdet Thanks for the review. I'd love to get this in for 1.4.2 (scheduled for next week), to unblock TIME_NS support in Node Neo (and other users of the C API). Let me know if you think the remaining issue about the test coverage of Arrow conversions for other precision levels is a blocker, or whether it can wait for later. |
| case VariantLogicalType::TIME_NANOS: { | ||
| auto val = Value::TIME_NS(Load<dtime_ns_t>(ptr)); | ||
| auto val_str = val.ToString(); | ||
| return yyjson_mut_strncpy(doc, val_str.c_str(), val_str.size()); |
There was a problem hiding this comment.
This is covered by test/sql/types/variant/test_all_types.test. Without this addition, that test (now) fails with INTERNAL Error: VariantLogicalType(21) not handled, because of the addition of time_ns to test_all_types (in this PR).
|
|
||
| statement ok | ||
| CREATE TABLE all_types AS SELECT * FROM test_all_types(); | ||
| CREATE TABLE all_types AS SELECT * EXCLUDE(time_ns) FROM test_all_types(); |
There was a problem hiding this comment.
I don't think just excluding it is the right move
|
Hi @jraymakers, after chatting with @Tishj, the basic conclusion is that we need to remove all test exclusions of As it stands, there are untested code paths with potential undesirable side effects. |
|
I hear you that the exclusions are not desirable. They were not my first choice; I added them because I couldn't figure out another path forward. I want to point out that the exclusions don't reduce test coverage. They are only necessary because of the change in this PR that adds a Here are some possible paths forward:
I'm open to any of these paths, but I'm doubtful I'll be unable to accomplish the first on my own. |
|
Note that my reason for expanding the scope of this PR to include the addition to |
Hi @jraymakers, I think your reasoning makes perfect sense. Either @Tishj or I can pick it up and fix the test for all types. The last thing is that it’s not possible to get this PR into v1.4.2, considering it’s a feature and we still need to implement the aforementioned missing tests first. I understand it’s necessary for TIME_NS support in Node Neo, but again, this is not a bug, and we should refrain from adding features in minor releases. I think the way to move forward is to get your latest changes to #19606, close this PR and continue from there. |
|
I'd argue there is a bug, or several:
But, it's your call, and I can understand how adding a new column to |
|
I've updated the |
|
Thanks @jraymakers I'll close this one for now in favor of #19606 |
#19606 rebased to
v1.4-andium.Differences:
main).main(parquet/variant) didn't need changes in this PR.