
Nightly test encryption fixes#20450

Closed
ccfelius wants to merge 4504 commits into duckdb:v1.4-andium from ccfelius:encryption_test_fixes

Conversation


@ccfelius ccfelius commented Jan 8, 2026

Fixes https://github.com/duckdblabs/duckdb-internal/issues/7080 and other potential nightly test failures related to USE + restart in encryption tests.

Same cause as here #20409

yan-alex and others added 30 commits December 19, 2025 15:10
…, and grab a vacuum lock when initiating a delete operation
  The original code had an invalid check:
```c++
  if (s_ele.__isset.num_children && s_ele.num_children > 0) { // inner node
```
This incorrectly assumed that `num_children == 0` or `!__isset.num_children` meant the schema element was a leaf node. According to the Parquet specification:
- Leaf nodes are defined by having a type field set
- Inner nodes (groups) are defined by NOT having a type field set

This caused DuckDB to fail when reading parquet files with schemas that
contain empty groups, mistakenly stating that the file wasn't following
the parquet specification.
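A minimal Python sketch of the corrected classification, assuming a simplified dict-based schema element rather than DuckDB's actual Thrift structures:

```python
# Hypothetical sketch: classify Parquet schema elements per the spec.
# The dict fields below stand in for Thrift's SchemaElement; they are
# illustrative, not DuckDB's actual data structures.

def is_inner_node(element):
    # Per the Parquet spec, groups (inner nodes) are exactly the elements
    # with no physical type set; the child count is irrelevant.
    return element.get("type") is None

# An empty group has num_children == 0 but is still an inner node:
empty_group = {"name": "g", "num_children": 0}
leaf = {"name": "v", "type": "INT64"}
```

With the original check, `empty_group` would have been misclassified as a leaf and the file rejected as non-conforming.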
… make appending to "removed_data_during_checkpoint" conditional
…b#19930)

Note that I currently kept the checkpointing in the destructor as a last back-up, but I am not sure if we should keep it. Technically, that should never happen now - maybe we can instead log an error there? See my `FIXME` in `src/main/attached_database.cpp`.

Fixes duckdblabs/duckdb-internal#6643
- fix `make generate-files` to also work in case the root of the repo is
not checked out in directory `duckdb`
- removed the deprecated `asyncore` import, no longer supported since
`Python 3.12`
  - see: https://docs.python.org/3/library/asyncore.html
The PR fixes big-endian related issues with these:
- FSST compression ("gracefully" merges the latest of
https://github.com/cwida/fsst to pick up [BE
support](cwida/fsst#36));
- arrow conversion;
- GEOMETRY and HASH types;
- md5 functions.
Basically all automatic, BUT for
duckdb@1cc1ab7,
which needs to be reverted via
duckdb@d29f8cb.

Note: on 1.4.4 and subsequent minor releases on v1.4-andium branch, same
dance again.

See more details at
duckdb#20227 (comment) and linked
conversations.
lnkuiper and others added 24 commits January 6, 2026 15:04
duckdb#20107)

This PR introduces a new scalar function `parse_formatted_bytes(VARCHAR)` to DuckDB, which converts a human-readable byte size (e.g. "16 KiB") into a numeric (UBIGINT) number of bytes.

Changes:
- Added `parse_formatted_bytes` (throws errors) and
`try_parse_formatted_bytes` (returns `NULL` on errors)
- Implementation of parsing logic in `StringUtil` (originally taken from
`DBConfig`)
- Updated the existing component (DBConfig) that relied on the old
parsing logic to use the new `StringUtil` version.
- Tests for the new scalar function

## Why?
The initial motivation came from the implementation of
duckdb#19726, which aimed to expose
another column representing byte sizes.
During the review, @Mytherin suggested exposing a dedicated function
instead, keeping `duckdb_settings` lean while providing a more generally
useful utility.
`try_parse_formatted_bytes` can now be reused across various use cases
where parsing byte-size strings is needed.

Note: `try(parse_formatted_bytes)` acted differently than expected; it seems it doesn't handle runtime `Exception`s. Exposing `try_parse_formatted_bytes` was my workaround.
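As an illustration, here is a minimal Python sketch of this kind of parsing; the unit table and error behavior are assumptions, and DuckDB's actual `StringUtil` implementation may differ:

```python
# Hypothetical sketch of byte-size parsing comparable to
# parse_formatted_bytes; decimal (KB) and binary (KiB) units supported.
UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30, "TIB": 2**40,
}

def parse_formatted_bytes(text):
    s = text.strip().upper()
    # Split into the numeric prefix and the unit suffix.
    idx = len(s)
    while idx > 0 and not (s[idx - 1].isdigit() or s[idx - 1] == "."):
        idx -= 1
    number, unit = s[:idx].strip(), s[idx:].strip()
    if not number or unit not in UNITS:
        raise ValueError(f"cannot parse byte size: {text!r}")
    return int(float(number) * UNITS[unit])
```

For example, `parse_formatted_bytes("16 KiB")` yields 16384; a `try_` variant would catch the `ValueError` and return `NULL` instead.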
…uckdb#20272)

Fixes duckdb#20187

## Description
- This PR fixes the internal error "Pivot aggregate count mismatch" that
occurs when using PIVOT with duplicate aggregates on ranges >= 21.
- Previously, the binder did not correctly duplicate aggregate
expressions for repeated aggregates in the USING clause.

## Changes
- Modified `BindBoundPivot()` in
`src/planner/binder/tableref/bind_pivot.cpp`
- Added logic to detect and handle duplicate aggregates.
- Expanded the deduplicated aggregates to match the user's expected
column count.

## Testing
Added test cases in
`test/sql/pivot/test_pivot_duplicate_aggregates.test` covering mixed
duplicate and unique aggregates and different range sizes.
I ran into this issue when I tried to wrap `JsonReader`'s filesystem with the caching filesystem.

The current implementation of `CachingFileSystemWrapper` is buggy for `CanSeek`:
- There are two types of filesystem instances in DuckDB
+ One is the raw filesystem, which interacts directly with storage backends, for example the local filesystem and the S3 filesystem
+ The other is the wrapper filesystem, which wraps filesystem(s) and provides additional features (e.g., the virtual FS provides an abstraction layer over multiple FSs, the caching FS enables a read cache)
- A file handle's invocation [delegates to](https://github.com/duckdb/duckdb/blob/041f4cac6889afa6604c8deee9e6574b8d6ae3aa/src/common/file_system.cpp#L748-L750) its filesystem's implementation
- For the caching fs wrapper, that is the [cache-wrapped internal filesystem](https://github.com/duckdb/duckdb/blob/041f4cac6889afa6604c8deee9e6574b8d6ae3aa/src/storage/caching_file_system_wrapper.cpp#L339-L341), which could be either a raw filesystem or a wrapped filesystem, and the two should be treated separately

~~Proposed solution implemented in this PR:~~
~~- Seek-ability is an immutable attribute for a filesystem~~
~~+ For raw file systems, `CanSeek` (the one with no arguments provided)
has been already implemented~~
~~+ For wrapped filesystems, the function call should be properly
delegated to internal filesystem instances, just as what we do for other
APIs~~

Update: follow Lauren's suggestion to integrate `CachingMode` into
`FileOpenFlags` to get the caching work handled inside of
VirtualFileSystem.
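The delegation issue can be sketched in Python; the class and method names here are hypothetical stand-ins, not DuckDB's actual API:

```python
# Hypothetical sketch of the wrapper-delegation pattern described above.

class LocalFileSystem:
    def can_seek(self):
        return True  # raw filesystems know their own seek-ability

class HTTPFileSystem:
    def can_seek(self):
        return False  # plain HTTP streams are not seekable

class CachingFileSystemWrapper:
    def __init__(self, inner):
        self.inner = inner

    def can_seek(self):
        # Delegate to the wrapped filesystem instead of assuming a fixed
        # answer; the inner fs may itself be another wrapper.
        return self.inner.can_seek()
```

The point is that a wrapper must forward such queries to whatever it wraps, raw or wrapped, rather than answering on its own.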
This PR resolves several Windows‑specific inconsistencies in DuckDB’s
formatting scripts, ensuring they behave deterministically across
Windows and Unix‑like environments.

### **Improvements**
- **Force Linux‑style line endings (`\n`)** when writing temporary
formatted files.
The `open_utf8` helper now forwards `newline='\n'` (and any additional
keyword arguments), preventing Windows from injecting `\r\n` and causing
spurious diffs in `scripts/format.py`.

- **Normalize all paths to POSIX (`/`) form** when generating test
metadata.
`format_test_benchmark.py` now uses `Path(...).as_posix()` to ensure
stable, platform‑independent `# name:` headers.

  **Example:**
  ```
  good
  # name: benchmark/appian_benchmarks/q01.benchmark
  # description: Run query 01 from the appian benchmarks

  bad
  # name: benchmark\appian_benchmarks\q01.benchmark
  # description: Run query 01 from the appian benchmarks
  ```
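A minimal Python sketch of the two normalizations, assuming a simplified `open_utf8` helper (the real one in `scripts/` differs in detail); `PureWindowsPath` is used in the header example only so the separator conversion is visible on any OS:

```python
from pathlib import PureWindowsPath

def open_utf8(path, mode="r", **kwargs):
    # Force '\n' line endings so Windows does not inject '\r\n'.
    kwargs.setdefault("newline", "\n")
    return open(path, mode, encoding="utf-8", **kwargs)

def posix_header_name(path):
    # POSIX-style separators keep '# name:' headers platform-independent.
    return "# name: " + PureWindowsPath(path).as_posix()
```

In the scripts themselves, `Path(...).as_posix()` achieves the same on the platform's native paths.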

### **Why This Matters**
These changes eliminate Windows‑specific formatting noise, making the
formatting scripts fully deterministic across developer environments.
This improves contributor experience and prevents CI failures caused by
platform‑dependent EOL or path differences.

### **Scope**
- No functional changes to DuckDB itself  
- Only affects developer tooling under `scripts/`
When the environment variable DUCKDB_FORMAT_SKIP_VERSION_CHECKS is defined,
skip failing on a specific version.

This is out of spite - I could/should fix up my local setup, but I also
think a way to skip the checks makes sense.
This PR tries to fix issue duckdb#20233.
Hi duckdb team, I added the `DoesColumnAliasExist` function to
`QualifyBinder` based on the code in the `where_binder.cpp` file.
Testing shows that it works correctly, but I'm not sure if it will
affect other modules. Thanks~
…uckdb#20407)

This PR removes unused and redundant headers across the storage subsystem.

The cleanup exposed missing explicit includes that were previously satisfied via transitive dependencies; these have been fixed by adding the required headers in the appropriate locations, improving include hygiene and Windows/MSVC compatibility.

All CI workflows (Linux, macOS, Windows) pass on this branch.
Fixes duckdblabs/duckdb-internal#6956 and
duckdblabs/duckdb-internal#7005

The test failed due to usage of `USE db_name` + `restart`. This pattern
fails in any case, e.g.

```
loop i 1 2

statement ok
ATTACH '__TEST_DIR__/test.db' as test;

statement ok
USE test

restart

endloop
```

also fails with 

```
Catalog Error: SET search_path: No catalog + schema named "test.main" found.
```

Removing `USE` and using explicit `db_name.tbl` references solved the issue.

Not 100% sure if this is desired behavior, but I would assume so.
Picking up duckdb#19606, solving merge
conflict, and removing unnecessary exclusions of `time_ns` type.

We are still missing the tests for the different precision types of
`TIME`, these should be added in our Python client after this code gets
merged, as we need arrow to produce these precision types.
…ion and improve ingest name resolution (duckdb#20369)

Add support for `adbc.ingest.target_catalog` in the DuckDB ADBC ingest
path.

Close duckdb#20128

This allows ingesting into attached databases (catalogs) by propagating
the catalog into both the CREATE/DROP SQL and the appender.

When only target_catalog is provided (no target_db_schema), the driver
defaults the schema to `main` and uses a fully-qualified 3-part name
(`catalog_name.main.table_name`) to avoid DuckDB’s catalog/schema
ambiguity with 2-part names.

Temporary ingestion is also updated to align with common ADBC
expectations: temporary tables are created in the `temp` schema and are
distinct from persistent tables with the same name.
`temporary` remains incompatible with target_db_schema / target_catalog
at execution time, but enabling `temporary` after setting schema/catalog
clears those options so ingest can proceed (mirroring behavior in the
Postgres driver in the arrow-adbc repo).

cc @lidavidm
`FileOpener` is needed for HTTP utils, which should be propagated from
virtual filesystem, to caching wrapper, to caching filesystem.

Checked with previously failed SQL and confirmed to work:
```sql
memory D select
             unnest(data) as customers
         from
             read_json('https://non.existant/endpoint');
IO Error:
Could not establish connection error for HTTP HEAD to 'https://non.existant/endpoint'

LINE 4:     read_json('https://non.existant/endpoint');
            ^
```
We may want to eliminate the NOT operator for efficiency. This can
1. save an operator
2. benefit statistics propagation
3. make ternary (three-valued) processing more efficient

In this PR, the rule eliminates the NOT operator for the following tree
patterns:
1. Nested NOT
```
NOT NOT col1 > 1   TO    col1 > 1
NOT NOT NOT col1 > 1   TO    NOT col1 > 1

``` 
2. NOT with IS_NOT_NULL/IS_NULL
```
NOT IS_NOT_NULL/IS_NULL ==> IS_NULL/IS_NOT_NULL
```
3. NOT with AND
```
NOT (col1 > 1 AND col2 <= 2)   TO  col1 <= 1 OR col2 > 2
```
4. NOT with OR
```
NOT (col1 > 1 OR col2 <= 2)   TO  col1 <= 1 AND col2 > 2
```
Please let me know if there is anything wrong with the above
understanding :), thanks
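These rewrites can be sketched in Python over a toy tuple-based expression tree; the representation is hypothetical, and DuckDB's optimizer of course works on its own expression classes:

```python
# Hypothetical sketch of NOT elimination: double negation, NULL-check
# flipping, and De Morgan's laws over (op, *args) tuples.

NEGATED = {">": "<=", "<=": ">", "<": ">=", ">=": "<",
           "=": "<>", "<>": "=",
           "IS_NULL": "IS_NOT_NULL", "IS_NOT_NULL": "IS_NULL"}

def eliminate_not(expr):
    op, *args = expr
    if op == "NOT":
        return negate(eliminate_not(args[0]))
    if op in ("AND", "OR"):
        return (op, *[eliminate_not(a) for a in args])
    return expr

def negate(expr):
    op, *args = expr
    if op == "NOT":                # NOT NOT x  ->  x
        return args[0]
    if op in NEGATED:              # flip comparisons and NULL checks
        return (NEGATED[op], *args)
    if op == "AND":                # De Morgan: NOT (a AND b) -> NOT a OR NOT b
        return ("OR", *[negate(a) for a in args])
    if op == "OR":                 # De Morgan: NOT (a OR b) -> NOT a AND NOT b
        return ("AND", *[negate(a) for a in args])
    return ("NOT", expr)           # fall back: keep the NOT
```

For instance, `NOT (col1 > 1 AND col2 <= 2)` rewrites to `col1 <= 1 OR col2 > 2`, matching the patterns listed above.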
@ccfelius ccfelius changed the title Nightly encryption test fixes Nightly test encryption fixes Jan 8, 2026
@ccfelius ccfelius changed the base branch from main to v1.4-andium January 9, 2026 07:13
@ccfelius ccfelius closed this Jan 9, 2026
@ccfelius ccfelius deleted the encryption_test_fixes branch January 9, 2026 07:50
lnkuiper added a commit that referenced this pull request Jan 12, 2026
Fixes duckdblabs/duckdb-internal#7080 and
other potential nightly test failures related to USE + restart in
encryption tests.

Same cause as here #20409

Was #20450, now targeted to the
right branch
