Nightly test encryption fixes#20450
Closed
ccfelius wants to merge 4504 commits into duckdb:v1.4-andium from
Conversation
…, and grab a vacuum lock when initiating a delete operation
The original code had an invalid check:
```c++
if (s_ele.__isset.num_children && s_ele.num_children > 0) { // inner node
```
This incorrectly assumed that `num_children == 0 or
!__isset.num_children` meant the schema element was a leaf node.
According to the Parquet
specification:
- Leaf nodes are defined by having a type field set
- Inner nodes (groups) are defined by NOT having a type field set
This caused DuckDB to fail when reading Parquet files whose schemas
contain empty groups, mistakenly reporting that the file did not follow
the Parquet specification.
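The corrected decision can be sketched in Python against a minimal model of the Thrift `SchemaElement`; the class below is illustrative, not the generated Thrift code, but the field names mirror the C++ snippet above:

```python
from dataclasses import dataclass
from typing import Optional

# Minimal stand-in for the Thrift-generated SchemaElement (assumption
# for illustration): unset Thrift fields are modeled as None.
@dataclass
class SchemaElement:
    type: Optional[str] = None           # set on leaf nodes only
    num_children: Optional[int] = None   # set on groups, may be 0

def is_inner_node(s_ele: SchemaElement) -> bool:
    # Per the Parquet spec: an element is a leaf iff its type field is
    # set. A group with num_children == 0 (an empty group) is still an
    # inner node, which the old num_children-based check got wrong.
    return s_ele.type is None
```

With this predicate, an empty group (`num_children == 0`, no type) is correctly classified as an inner node instead of being rejected as an invalid leaf.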
removed in Python 3.12, see: https://docs.python.org/3/library/asyncore.html
…s not named 'duckdb'
* Fix CI failures
… make appending to "removed_data_during_checkpoint" conditional
…b#19930) Note that I currently kept the checkpointing in the destructor as a last backup, but I am not sure if we should. Technically that should never happen now; maybe we can instead log an error there? See my `FIXME` in `src/main/attached_database.cpp`. Fixes duckdblabs/duckdb-internal#6643
) * Fix CI failures in duckdb#20081
- fix `make generate-files` to also work when the root of the repo is not checked out in a directory named `duckdb`
- removed the deprecated `asyncore` import, no longer supported since Python 3.12 - see: https://docs.python.org/3/library/asyncore.html
The PR fixes big-endian related issues with these:
- FSST compression ("gracefully" merges the latest of
https://github.com/cwida/fsst to pick up [BE
support](cwida/fsst#36));
- arrow conversion;
- GEOMETRY and HASH types;
- md5 functions.
Basically all automatic, BUT for duckdb@1cc1ab7 that needs to be reverted via duckdb@d29f8cb. Note: on 1.4.4 and subsequent minor releases on v1.4-andium branch, same dance again. See more details at duckdb#20227 (comment) and linked conversations.
duckdb#20107) This PR introduces a new scalar function `parse_formatted_bytes(VARCHAR)` to DuckDB, which converts a human-readable byte size (e.g. "16 KiB") into a numeric (UBIGINT) number of bytes.

Changes:
- Added `parse_formatted_bytes` (throws errors) and `try_parse_formatted_bytes` (returns `NULL` on errors)
- Implementation of the parsing logic in `StringUtil` (originally taken from `DBConfig`)
- Updated the existing component (`DBConfig`) that relied on the old parsing logic to use the new `StringUtil` version
- Tests for the new scalar function

## Why?
The initial motivation came from the implementation of duckdb#19726, which aimed to expose another column representing byte sizes. During the review, @Mytherin suggested exposing a dedicated function instead, keeping `duckdb_settings` lean while providing a more generally useful utility. `try_parse_formatted_bytes` can now be reused across various use cases where parsing byte-size strings is needed.

Note: `try(parse_formatted_bytes)` acted differently than expected; it seems it doesn't handle runtime `Exception`s. Exposing `try_parse_formatted_bytes` was my workaround.
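The throwing/NULL-returning pair described above can be sketched in Python. This is a hypothetical re-creation of the behavior, not DuckDB's `StringUtil` implementation; the unit table and rounding are assumptions:

```python
import re

# Assumed unit table: decimal (KB) and binary (KiB) multipliers.
UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30, "TIB": 2**40,
}

def parse_formatted_bytes(text: str) -> int:
    # Throwing variant: raise on anything that does not parse.
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*([A-Za-z]+)?\s*", text)
    if not match:
        raise ValueError(f"cannot parse byte size: {text!r}")
    number, unit = match.groups()
    multiplier = UNITS.get((unit or "B").upper())
    if multiplier is None:
        raise ValueError(f"unknown unit in byte size: {text!r}")
    return int(float(number) * multiplier)

def try_parse_formatted_bytes(text: str):
    # NULL-on-error variant, mirroring the SQL function pair above.
    try:
        return parse_formatted_bytes(text)
    except ValueError:
        return None
```

For example, `parse_formatted_bytes("16 KiB")` yields 16384, while `try_parse_formatted_bytes("oops")` yields `None` instead of raising.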
…uckdb#20272) Fixes duckdb#20187

## Description
- This PR fixes the internal error "Pivot aggregate count mismatch" that occurs when using PIVOT with duplicate aggregates on ranges >= 21.
- Previously, the binder was not correctly duplicating aggregate expressions for repeated aggregates in the USING clause.

## Changes
- Modified `BindBoundPivot()` in `src/planner/binder/tableref/bind_pivot.cpp`
- Added logic to detect and handle duplicate aggregates
- Expanded the deduplicated aggregates to match the user's expected column count

## Testing
Added test cases in `test/sql/pivot/test_pivot_duplicate_aggregates.test` covering mixed duplicate and unique aggregates and different range sizes.
I met this issue when I tried to wrap `JsonReader`'s filesystem with a caching filesystem. The current implementation of `CachingFileSystemWrapper` is buggy for `CanSeek`:
- There are two types of filesystem instances in duckdb
  + One is a raw filesystem, which interacts directly with storage backends, for example the local filesystem and the S3 filesystem
  + Another is a wrapper filesystem, which wraps filesystem(s) and provides additional features (i.e., a virtual FS provides an abstraction layer over multiple FSes, a caching FS enables a read cache)
- A file handle's invocation [delegates to](https://github.com/duckdb/duckdb/blob/041f4cac6889afa6604c8deee9e6574b8d6ae3aa/src/common/file_system.cpp#L748-L750) the filesystem's
- For the caching fs wrapper, it's the [cache-wrapped internal filesystem](https://github.com/duckdb/duckdb/blob/041f4cac6889afa6604c8deee9e6574b8d6ae3aa/src/storage/caching_file_system_wrapper.cpp#L339-L341), which could be either a raw filesystem or a wrapped filesystem, and they should be treated separately

~~Proposed solution implemented in this PR:~~
~~- Seek-ability is an immutable attribute of a filesystem~~
~~+ For raw filesystems, `CanSeek` (the one with no arguments) is already implemented~~
~~+ For wrapped filesystems, the call should be properly delegated to the internal filesystem instance, just as we do for other APIs~~

Update: follow Lauren's suggestion to integrate `CachingMode` into `FileOpenFlags` so the caching work is handled inside VirtualFileSystem.
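The raw-vs-wrapper distinction above can be illustrated with a small Python sketch; the class and method names are assumptions for illustration, not DuckDB's actual API:

```python
class LocalFileSystem:
    """Raw filesystem: talks to the storage backend directly."""
    def can_seek(self) -> bool:
        return True  # local files are seekable

class HTTPFileSystem:
    """Raw filesystem over HTTP."""
    def can_seek(self) -> bool:
        return False  # a plain HTTP stream is not seekable

class CachingFileSystem:
    """Wrapper filesystem: adds a read cache around another filesystem."""
    def __init__(self, inner):
        self.inner = inner  # may itself be another wrapper

    def can_seek(self) -> bool:
        # The bug class described above: a wrapper must delegate the
        # question to its inner filesystem rather than assume a fixed
        # answer, since the inner FS may be raw or wrapped.
        return self.inner.can_seek()
```

A wrapper around a local filesystem then reports seekable, while a (possibly nested) wrapper around an HTTP filesystem correctly reports non-seekable.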
This PR resolves several Windows-specific inconsistencies in DuckDB's formatting scripts, ensuring they behave deterministically across Windows and Unix-like environments.

### **Improvements**
- **Force Linux-style line endings (`\n`)** when writing temporary formatted files. The `open_utf8` helper now forwards `newline='\n'` (and any additional keyword arguments), preventing Windows from injecting `\r\n` and causing spurious diffs in `scripts/format.py`.
- **Normalize all paths to POSIX (`/`) form** when generating test metadata. `format_test_benchmark.py` now uses `Path(...).as_posix()` to ensure stable, platform-independent `# name:` headers.

**Example:**
```
good
# name: benchmark/appian_benchmarks/q01.benchmark
# description: Run query 01 from the appian benchmarks

bad
# name: benchmark\appian_benchmarks\q01.benchmark
# description: Run query 01 from the appian benchmarks
```

### **Why This Matters**
These changes eliminate Windows-specific formatting noise, making the formatting scripts fully deterministic across developer environments. This improves contributor experience and prevents CI failures caused by platform-dependent EOL or path differences.

### **Scope**
- No functional changes to DuckDB itself
- Only affects developer tooling under `scripts/`
When the environment variable DUCKDB_FORMAT_SKIP_VERSION_CHECKS is defined, skip failing on a specific version. This is out of spite; I could/should fix up my local setup, but I also think a way to skip the checks makes sense.
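A minimal sketch of such an escape hatch: the variable name comes from the PR text, but the surrounding function names and the version-check itself are illustrative assumptions:

```python
import os

def should_skip_version_checks() -> bool:
    # "Defined" is taken to mean present in the environment at all,
    # regardless of its value (an assumption for this sketch).
    return os.environ.get("DUCKDB_FORMAT_SKIP_VERSION_CHECKS") is not None

def check_formatter_version(found: str, required: str):
    # Hypothetical check site: bail out only when the user has not
    # opted out via the environment variable.
    if should_skip_version_checks():
        return
    if found != required:
        raise SystemExit(f"formatter version {required} required, found {found}")
```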
This PR fixes issue duckdb#20233. Hi duckdb team, I added the `DoesColumnAliasExist` function to `QualifyBinder` based on the code in the `where_binder.cpp` file. Testing shows that it works correctly, but I'm not sure if it will affect other modules. Thanks!
…uckdb#20407) This PR removes unused and redundant headers across the storage subsystem. The cleanup exposed missing explicit includes that were previously satisfied via transitive dependencies; these have been fixed by adding the required headers in the appropriate locations, improving include hygiene and Windows/MSVC compatibility. All CI workflows (Linux, macOS, Windows) pass on this branch.
…#20336) See issue duckdb#20335 for details.
Fixes duckdblabs/duckdb-internal#6956 and duckdblabs/duckdb-internal#7005

Test failed due to usage of `USE db_name` + `restart`. This pattern fails in any case, i.e.

```
loop i 1 2

statement ok
ATTACH '__TEST_DIR__/test.db' as test;

statement ok
USE test

restart

endloop
```

also fails with

```
Catalog Error: SET search_path: No catalog + schema named "test.main" found.
```

Removing `USE` and using declarative `db_name.tbl` solved the issue. Not 100% sure if this is desired behavior, but I would assume so.
Picking up duckdb#19606, resolving the merge conflict, and removing unnecessary exclusions of the `time_ns` type. We are still missing tests for the different precision types of `TIME`; these should be added in our Python client after this code gets merged, as we need Arrow to produce these precision types.
…ion and improve ingest name resolution (duckdb#20369)

Add support for `adbc.ingest.target_catalog` in the DuckDB ADBC ingest path. Closes duckdb#20128

This allows ingesting into attached databases (catalogs) by propagating the catalog into both the CREATE/DROP SQL and the appender. When only target_catalog is provided (no target_db_schema), the driver defaults the schema to `main` and uses a fully-qualified 3-part name (`catalog_name.main.table_name`) to avoid DuckDB's catalog/schema ambiguity with 2-part names.

Temporary ingestion is also updated to align with common ADBC expectations: temporary tables are created in the `temp` schema and are distinct from persistent tables with the same name. `temporary` remains incompatible with target_db_schema / target_catalog at execution time, but enabling `temporary` after setting schema/catalog clears those options so ingest can proceed (mirroring behavior in the Postgres driver in the arrow-adbc repo). cc @lidavidm
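The name-resolution rule described above can be sketched as a small helper; this is an assumed illustration of the rule, not the driver's actual code:

```python
def qualified_ingest_name(table, target_catalog=None, target_db_schema=None):
    # With only a catalog, default the schema to 'main' and emit a
    # 3-part name to avoid DuckDB's 2-part catalog/schema ambiguity.
    if target_catalog:
        schema = target_db_schema or "main"
        return f"{target_catalog}.{schema}.{table}"
    if target_db_schema:
        return f"{target_db_schema}.{table}"
    return table
```

So `qualified_ingest_name("t", target_catalog="db1")` yields `db1.main.t` rather than the ambiguous 2-part `db1.t`.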
`FileOpener` is needed for HTTP utils, which should be propagated from
virtual filesystem, to caching wrapper, to caching filesystem.
Checked with the previously failing SQL and confirmed it works:
```sql
memory D select
unnest(data) as customers
from
read_json('https://non.existant/endpoint');
IO Error:
Could not establish connection error for HTTP HEAD to 'https://non.existant/endpoint'
LINE 4: read_json('https://non.existant/endpoint');
^
```
We may need to eliminate the NOT operator for efficiency. It can: 1. save an operator, 2. benefit statistics propagation, 3. make ternary processing more efficient.

In this PR, the rule eliminates the NOT operator for the following tree patterns:
1. Nested NOT
```
NOT NOT col1 > 1 TO col1 > 1
NOT NOT NOT col1 > 1 TO NOT col1 > 1
```
2. NOT with IS_NOT_NULL/IS_NULL
```
NOT IS_NOT_NULL/IS_NULL ==> IS_NULL/IS_NOT_NULL
```
3. NOT with AND
```
NOT (col1 > 1 AND col2 <= 2) TO col1 <= 1 OR col2 > 2
```
4. NOT with OR
```
NOT (col1 > 1 OR col2 <= 2) TO col1 <= 1 AND col2 > 2
```
Please let me know if there is anything wrong with the above understanding :), thanks
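The rewrite patterns above can be sketched on a toy expression tree; the tuple-based node representation is an assumption for illustration, not DuckDB's optimizer structures:

```python
# Each operator maps to its negation: comparisons flip, null tests swap.
NEGATION = {">": "<=", "<=": ">", "<": ">=", ">=": "<",
            "IS_NULL": "IS_NOT_NULL", "IS_NOT_NULL": "IS_NULL"}

def eliminate_not(expr):
    """Recursively push NOT down and out of a tuple expression tree."""
    op, *args = expr
    if op != "NOT":
        return (op, *[eliminate_not(a) if isinstance(a, tuple) else a
                      for a in args])
    child = args[0]
    cop, *cargs = child
    if cop == "NOT":                      # NOT NOT x  ->  x
        return eliminate_not(cargs[0])
    if cop in NEGATION:                   # flip comparison / null test
        return (NEGATION[cop], *cargs)
    if cop in ("AND", "OR"):              # De Morgan's laws
        flipped = "OR" if cop == "AND" else "AND"
        return (flipped, *[eliminate_not(("NOT", a)) for a in cargs])
    return ("NOT", eliminate_not(child))  # cannot simplify further
```

For example, `NOT (col1 > 1 AND col2 <= 2)` becomes `col1 <= 1 OR col2 > 2`, matching pattern 3 above.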
lnkuiper added a commit that referenced this pull request on Jan 12, 2026
Fixes duckdblabs/duckdb-internal#7080 and other potential nightly test failures related to USE + restart in encryption tests. Same cause as in #20409. Was #20450, now retargeted to the right branch.