Skip to content

Randomize new settings in 25.12: skip index for topK#91782

Merged
alexey-milovidov merged 7 commits intomasterfrom
randomize-skip-index-topk-settings
Mar 19, 2026
Merged

Randomize new settings in 25.12: skip index for topK#91782
alexey-milovidov merged 7 commits intomasterfrom
randomize-skip-index-topk-settings

Conversation

@fm4v
Copy link
Copy Markdown
Member

@fm4v fm4v commented Dec 9, 2025

Changelog category (leave one):

  • Build/Testing/Packaging Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Randomize settings in clickhouse-test: use_skip_indexes_for_top_k, use_top_k_dynamic_filtering, query_plan_max_limit_for_top_k_optimization

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Dec 9, 2025

Workflow [PR], commit [e7857ea]

Summary:


AI Review

Summary

This PR randomizes use_skip_indexes_for_top_k, use_top_k_dynamic_filtering, and query_plan_max_limit_for_top_k_optimization in clickhouse-test, then pins/disables those settings in several stateless tests to preserve determinism. Most adjustments are appropriate, but one change hides instability rather than fixing it: adding no-flaky-check to a vector-search test. High-level verdict: request changes to keep flaky detection active.

Missing context

  • ⚠️ Full CI is still running (many build jobs are pending), so I could not validate final cross-configuration stability from completed CI reports.

Findings

  • ⚠️ Majors
    • [tests/queries/0_stateless/02354_vector_search_expansion_search.sql:1] Adding the no-flaky-check tag suppresses flaky detection for this test. This can mask real regressions introduced by randomized top-k settings instead of enforcing determinism.
    • Suggested fix: remove no-flaky-check and stabilize the test behavior (deterministic input/assertions), or if temporary, keep detection enabled and track via a linked issue with a clear cleanup plan.

ClickHouse Rules

Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
Safe rollout ⚠️ no-flaky-check weakens rollout signal quality by hiding potential instability
Compilation time

Final Verdict

  • Status: ⚠️ Request changes
  • Minimum required actions:
    • Remove no-flaky-check from 02354_vector_search_expansion_search.sql (or provide a narrowly scoped, issue-linked temporary strategy that does not suppress flaky detection for the whole test).

@clickhouse-gh clickhouse-gh bot added the pr-build Pull request with build/testing/packaging improvement label Dec 9, 2025
alexey-milovidov added a commit that referenced this pull request Mar 16, 2026
Tests that check EXPLAIN output or row counts for ORDER BY ... LIMIT
queries need deterministic topK optimization behavior. Pin
`use_skip_indexes_for_top_k` and `use_top_k_dynamic_filtering` to 0
in tests that don't test topK, and pin
`query_plan_max_limit_for_top_k_optimization` to a high value in tests
that rely on topK being active.

#91782

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexey-milovidov alexey-milovidov self-assigned this Mar 16, 2026
Tests that verify specific query plans, pipeline structures, or row
counts break when `use_skip_indexes_for_top_k`, `use_top_k_dynamic_filtering`,
and `query_plan_max_limit_for_top_k_optimization` are randomized.

Pin `use_skip_indexes_for_top_k = 0` and `use_top_k_dynamic_filtering = 0`
in tests that depend on deterministic read_in_order behavior, EXPLAIN
output, or lazy materialization. Pin `query_plan_max_limit_for_top_k_optimization = 1000`
in the topK-specific test to ensure LIMIT 10 always qualifies for the
optimization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Mar 17, 2026

CLA assistant check
All committers have signed the CLA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexey-milovidov alexey-milovidov force-pushed the randomize-skip-index-topk-settings branch from 87a9277 to 07f74f6 Compare March 17, 2026 03:22
alexey-milovidov and others added 4 commits March 17, 2026 06:40
`03711_top_k_by_skip_index_negative`: pin `query_plan_max_limit_for_top_k_optimization`
so that LIMIT 5 always qualifies for the optimization.

`03924_sorting_step_stats`: disable `use_skip_indexes_for_top_k` and
`use_top_k_dynamic_filtering` to prevent extra steps in EXPLAIN output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexey-milovidov alexey-milovidov added this pull request to the merge queue Mar 19, 2026
Merged via the queue into master with commit d5b0bee Mar 19, 2026
163 checks passed
@alexey-milovidov alexey-milovidov deleted the randomize-skip-index-topk-settings branch March 19, 2026 16:12
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 19, 2026
@alexey-milovidov
Copy link
Copy Markdown
Member

I think it broke the tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-build Pull request with build/testing/packaging improvement pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants