Skip to content

Make convertToFullIfNeeded recursive for compound column types#97493

Merged
alexey-milovidov merged 6 commits intomasterfrom
fix-convertToFullIfNeeded-recursive
Feb 22, 2026
Merged

Make convertToFullIfNeeded recursive for compound column types#97493
alexey-milovidov merged 6 commits intomasterfrom
fix-convertToFullIfNeeded-recursive

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

Summary

  • IColumn::convertToFullIfNeeded now recursively converts inner columns of compound types (ColumnTuple, ColumnArray, ColumnNullable, ColumnMap, etc.) using forEachSubcolumn
  • Previously only the outermost column was converted (const/replicated/sparse/LC), leaving inner sparse columns intact
  • This caused assertion failures when Set::appendSetElements or MergeTreeIndexAggregatorSet::update processed batches from different MergeTree parts with different sparse serialization profiles for inner tuple elements

Closes #97474
Closes #97335

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97431&sha=6fe0e8c44dd5c6a12583390b4f8fc45e226a3cb4&name_0=PR&name_1=Stateless%20tests%20%28arm_asan%2C%20azure%2C%20parallel%29

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC) or LOGICAL_ERROR

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix assertion failure (exception in debug/sanitizer builds) in Set and MergeTreeIndexSet when processing columns with inner sparse subcolumns (e.g., Tuple columns from MergeTree parts with different sparse serialization profiles).

🤖 Generated with Claude Code

`IColumn::convertToFullIfNeeded` only converted the outermost column
(const, replicated, sparse, low cardinality) but did not recurse into
subcolumns of compound types like `ColumnTuple`, `ColumnArray`,
`ColumnNullable`, etc. This caused assertion failures in debug/sanitizer
builds when `Set::appendSetElements` or `MergeTreeIndexAggregatorSet::update`
processed batches from different MergeTree parts that had different sparse
serialization profiles for inner tuple elements.

The fix uses `forEachSubcolumn` to recursively convert all inner columns,
which generically handles all compound column types.

Closes #97474
Closes #97335

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97431&sha=6fe0e8c44dd5c6a12583390b4f8fc45e226a3cb4&name_0=PR&name_1=Stateless%20tests%20%28arm_asan%2C%20azure%2C%20parallel%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Feb 20, 2026

Workflow [PR], commit [b039c03]

Summary:

The test creates a MergeTree table with a Tuple column where different
parts have different sparse serialization profiles for inner elements,
then uses `IN` with a subquery to trigger `Set::appendSetElements` which
previously failed with an assertion because `ColumnTuple` inner columns
were not recursively converted from sparse to full.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -0,0 +1,27 @@
-- Tags: no-random-merge-tree-settings
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not reproduce.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, now it does.

@alexey-milovidov alexey-milovidov marked this pull request as draft February 20, 2026 16:04
alexey-milovidov and others added 2 commits February 20, 2026 17:08
The assertion in `ColumnTuple::doInsertRangeFrom` only fires when the
destination is non-sparse (`ColumnVector`) and the source is sparse
(`ColumnSparse`), because `ColumnSparse::doInsertRangeFrom` handles
both cases. Swap insert order so the non-sparse part (all_1_1_0) is
read first into `set_elements`, and the sparse part comes second.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`Set::appendSetElements` is only called when `fill_set_elements` is true,
which happens when `KeyCondition` calls `buildOrderedSetInplace` for index
evaluation. The IN column must be part of the ORDER BY key for this to happen.

Changed `ORDER BY key` to `ORDER BY val` so that the tuple column is in the
primary key and `KeyCondition` triggers the `buildOrderedSetInplace` path.

Verified: aborts on unpatched debug build (exit code 134), passes on patched.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@alexey-milovidov alexey-milovidov marked this pull request as ready for review February 20, 2026 18:15
@Avogar Avogar self-assigned this Feb 20, 2026
@alexey-milovidov alexey-milovidov added this pull request to the merge queue Feb 22, 2026
Merged via the queue into master with commit 4913faf Feb 22, 2026
148 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-convertToFullIfNeeded-recursive branch February 22, 2026 15:39
@robot-clickhouse robot-clickhouse added the pr-synced-to-cloud The PR is synced to the cloud repo label Feb 22, 2026
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Feb 22, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Feb 22, 2026
Cherry pick #97493 to 25.11: Make convertToFullIfNeeded recursive for compound column types
robot-clickhouse added a commit that referenced this pull request Feb 22, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Feb 22, 2026
Cherry pick #97493 to 25.12: Make convertToFullIfNeeded recursive for compound column types
robot-clickhouse added a commit that referenced this pull request Feb 22, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Feb 22, 2026
Cherry pick #97493 to 26.1: Make convertToFullIfNeeded recursive for compound column types
robot-clickhouse added a commit that referenced this pull request Feb 22, 2026
clickhouse-gh bot added a commit that referenced this pull request Feb 22, 2026
Backport #97493 to 25.12: Make convertToFullIfNeeded recursive for compound column types
clickhouse-gh bot added a commit that referenced this pull request Feb 22, 2026
Backport #97493 to 25.11: Make convertToFullIfNeeded recursive for compound column types
alexey-milovidov added a commit that referenced this pull request Feb 22, 2026
The `concat` function's `executeFormatImpl` called `convertToFullIfNeeded`
which, after it was made recursive in #97493, strips `LowCardinality` from
inside compound column types like `Variant`. However, the argument type is
not updated to match, so `SerializationLowCardinality` tries to cast a
non-LC column, triggering a LOGICAL_ERROR exception in debug builds.

Fix by using only top-level column conversions (Const, Sparse, LowCardinality)
instead of the recursive `convertToFullIfNeeded`, keeping the column
structure consistent with the type's serialization.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97581&sha=320e7c9d8876b04a971bd26214b9ac0ab433c250&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Feb 22, 2026
alexey-milovidov added a commit that referenced this pull request Feb 23, 2026
The `concat` function's `executeFormatImpl` called `convertToFullIfNeeded`
which, after it was made recursive in #97493, strips `LowCardinality` from
inside compound column types like `Variant`. However, the argument type is
not updated to match, so `SerializationLowCardinality` tries to cast a
non-LC column, triggering a LOGICAL_ERROR exception in debug builds.

Fix by using only top-level column conversions (Const, Sparse, LowCardinality)
instead of the recursive `convertToFullIfNeeded`, keeping the column
structure consistent with the type's serialization.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97581&sha=320e7c9d8876b04a971bd26214b9ac0ab433c250&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Algunenano added a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
…ke `Tuple`

`convertToFullIfNeeded` (made recursive in ClickHouse#97493) strips `LowCardinality`
from inner subcolumns of compound types (Tuple, Array, etc.) in
`Set::insertFromColumns`, but `set_elements_types` in `Set::setHeader`
only stripped top-level `LowCardinality`. This created a column/type
mismatch: `ColumnVector<Int8>` in the column vs `LowCardinality(Int8)`
in the type.

When `KeyCondition::tryPrepareSetColumnsForIndex` unpacked the tuple to
build index conditions for `IN` subqueries, `castColumn` tried to
interpret the plain column as `ColumnLowCardinality`, causing a
LOGICAL_ERROR exception in debug builds.

Fix by using `recursiveRemoveLowCardinality` for `set_elements_types` in
both `Set::setHeader` and the static `Set::getElementTypes`, matching
what `convertToFullIfNeeded` does to columns.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97831&sha=6f56284303c16ecc5f5a437491c661d5019f2977&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Avogar added a commit that referenced this pull request Feb 24, 2026
Backport #97493 to 26.1: Make convertToFullIfNeeded recursive for compound column types
Algunenano pushed a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
…IfNeeded-recursive

Make convertToFullIfNeeded recursive for compound column types
Algunenano pushed a commit to Algunenano/ClickHouse that referenced this pull request Feb 24, 2026
The `concat` function's `executeFormatImpl` called `convertToFullIfNeeded`
which, after it was made recursive in ClickHouse#97493, strips `LowCardinality` from
inside compound column types like `Variant`. However, the argument type is
not updated to match, so `SerializationLowCardinality` tries to cast a
non-LC column, triggering a LOGICAL_ERROR exception in debug builds.

Fix by using only top-level column conversions (Const, Sparse, LowCardinality)
instead of the recursive `convertToFullIfNeeded`, keeping the column
structure consistent with the type's serialization.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=97581&sha=320e7c9d8876b04a971bd26214b9ac0ab433c250&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

4 participants