Optimize index analysis by applying query condition cache earlier and caching filtered ranges #82380
```cpp
}

/// Fill query condition cache with ranges excluded by index analysis.
if (reader_settings.use_query_condition_cache && query_info_.filter_actions_dag)
```
Function filterPartsByQueryConditionCache also considers the PREWHERE info:

```cpp
if (const auto & prewhere_info = select_query_info.prewhere_info)
{
    for (const auto * outputs : prewhere_info->prewhere_actions.getOutputs())
    {
        if (outputs->result_name == prewhere_info->prewhere_column_name)
        {
            auto stats = drop_mark_ranges(outputs);
            LOG_DEBUG(log,
                "Query condition cache has dropped {}/{} granules for PREWHERE condition {}.",
                stats.granules_dropped,
                stats.total_granules,
                prewhere_info->prewhere_column_name);
            break;
        }
    }
}
if (const auto & filter_actions_dag = select_query_info.filter_actions_dag)
{
    const auto * output = filter_actions_dag->getOutputs().front();
    auto stats = drop_mark_ranges(output);
    LOG_DEBUG(log,
        "Query condition cache has dropped {}/{} granules for WHERE condition {}.",
        stats.granules_dropped,
        stats.total_granules,
        filter_actions_dag->getOutputs().front()->result_name);
}
```

…whereas here, we don't. Should we?
filter_actions_dag is a superset of the prewhere information: it encompasses the full filtering context, including WHERE clauses, row-level security policies, and additional filters. It is the only source used for index analysis, as prewhere alone is incomplete and insufficient for that purpose.
Note: To support skip indexes that rely on ORDER BY ... LIMIT semantics (e.g., vector search TopN), the filter will need to be extended to capture ordering and limit information as well. This enhancement is currently a TODO.
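To illustrate the "superset" point above, the filter DAG can be modeled as the conjunction of every filter source, so passing it alone covers everything PREWHERE alone would, and more. This is a minimal sketch with hypothetical names, not the actual ClickHouse API:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: index analysis consumes a single predicate source,
// built as the conjunction of PREWHERE, WHERE, and row-policy conditions.
// Each conjunct is represented here as an opaque string for simplicity.
std::vector<std::string> combinedFilter(const std::vector<std::string> & prewhere,
                                        const std::vector<std::string> & where,
                                        const std::vector<std::string> & row_policy)
{
    std::vector<std::string> conjuncts;
    conjuncts.insert(conjuncts.end(), prewhere.begin(), prewhere.end());
    conjuncts.insert(conjuncts.end(), where.begin(), where.end());
    conjuncts.insert(conjuncts.end(), row_policy.begin(), row_policy.end());
    return conjuncts;
}
```

Anything PREWHERE alone would prove about a granule is also provable from the combined conjunction, which is why the single source suffices for index analysis.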
```cpp
auto data_part = remaining_ranges.data_part;
String part_name = data_part->isProjectionPart() ? fmt::format("{}:{}", data_part->getParentPartName(), data_part->name)
                                                 : data_part->name;
query_condition_cache->write(
```
So with this PR, the same query now potentially writes to the cache more than once (during index analysis and at the end of the scan). It is not clear to me which entry will prevail in the cache.
There's no actual duplication in cache writes. Suppose the index analysis didn't happen at all: PREWHERE or WHERE would still evaluate the predicate, potentially drop the entire granule, and update the cache accordingly. So the write during index analysis isn't extra; it just happens earlier in the pipeline.
As a side note, the current QCC implementation tracks granule states at the read task level rather than per granule. While this could be refined, I didn't observe any measurable performance gain on ClickBench, so it's left as a TODO for now.
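One way to see why the earlier write cannot conflict with the later one: if writes only ever clear "may match" bits, cache entries are monotone and write order doesn't matter. The following is a toy model under that assumption (the struct and names are illustrative, not the actual QCC implementation):

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy QCC model: (part name, condition hash) -> per-mark "may match" flags.
// The first write for a key stores the flags; subsequent writes AND into the
// existing entry, so flags can only go from true to false. A write during
// index analysis and a later write after the scan therefore compose to the
// same result regardless of order.
struct Cache
{
    std::map<std::pair<std::string, std::uint64_t>, std::vector<bool>> entries;

    void write(const std::string & part, std::uint64_t condition_hash, const std::vector<bool> & may_match)
    {
        auto key = std::make_pair(part, condition_hash);
        auto it = entries.find(key);
        if (it == entries.end())
        {
            entries[key] = may_match;
            return;
        }
        for (size_t i = 0; i < may_match.size(); ++i)
            it->second[i] = it->second[i] && may_match[i];
    }
};
```

Under this model the index-analysis write and the end-of-scan write never contradict each other; the cache just converges to the intersection of surviving marks.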
As far as I can tell, the current QCC implementation does not properly recognize function determinism. For example, it caches predicates involving non-deterministic functions.

Additionally, it's unclear how the system behaves in the event of a hash collision. While such collisions are unlikely in practice, they are theoretically possible and should be accounted for.
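The determinism concern above calls for a guard before caching. A sketch of such a check, where the function names and the deny list are illustrative assumptions rather than actual ClickHouse code:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical guard: a predicate whose expression tree contains a
// non-deterministic function (one that can return different results across
// executions) must not have its outcome cached, since a cached "no match"
// could become stale on the very next run of the same query text.
bool isCacheablePredicate(const std::set<std::string> & functions_used)
{
    // Illustrative deny list; a real implementation would query each
    // function's own determinism property instead of a hard-coded set.
    static const std::set<std::string> non_deterministic = {"now", "rand", "randomString", "today"};
    for (const auto & f : functions_used)
        if (non_deterministic.count(f))
            return false;
    return true;
}
```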
```cpp
/// Fill query condition cache with ranges excluded by index analysis.
if (condition_hash)
{
    RangesInDataParts remaining;
```
I had the same question as Robert about this code block, but I get it now: we don't want to repeat PK analysis on the ranges that were already rejected by the first PK analysis for a predicate. Hence we update the query condition cache with the excluded ranges.

If a part is fully skipped, the CPU cost will be negligible. But if only some of a part's ranges are selected, does the computation of the remaining ranges take more than a bit of CPU? And will this code block be executed on every execution of a query with the same predicate?
Yes, but the impact should be negligible. While it does a small diff calculation on the part ranges every time, this calculation has linear complexity and is insignificant compared to the I/O cost that each part range may incur later.
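The linear-complexity diff mentioned here can be sketched as one forward pass over two sorted, non-overlapping mark-range lists. This is a simplified model assuming the post-analysis ranges are a subset of the originals; the names are illustrative, not ClickHouse's:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct MarkRange
{
    std::size_t begin;
    std::size_t end;
    bool operator==(const MarkRange & o) const { return begin == o.begin && end == o.end; }
};

// Marks present in `before` but missing from `after` (i.e. excluded by index
// analysis). Both inputs are sorted and non-overlapping and `after` is a
// subset of `before`, so a single forward pass over each list suffices:
// O(n + m) in the number of ranges.
std::vector<MarkRange> excludedRanges(const std::vector<MarkRange> & before,
                                      const std::vector<MarkRange> & after)
{
    std::vector<MarkRange> excluded;
    std::size_t j = 0;
    for (const auto & b : before)
    {
        std::size_t cur = b.begin;
        while (j < after.size() && after[j].end <= b.end)
        {
            if (after[j].begin > cur)
                excluded.push_back({cur, after[j].begin});  // gap before kept range
            cur = after[j].end;
            ++j;
        }
        if (cur < b.end)
            excluded.push_back({cur, b.end});  // tail of `b` not kept
    }
    return excluded;
}
```

Since each range from both lists is visited exactly once, re-running this on every query with a cached predicate costs far less than reading even one granule from disk.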
Vector search queries don't use the query condition cache, since the condition matching logic does not analyze

Thanks for the PR! Please refresh the PR and I will complete the review.
Yes. It's #82380 (comment)
Sure. Will do it today.
@shankar-iyer Could you help take another look? It seems the CH Inc sync has some test failures, and it likely needs
The failure is in ASAN tests; it looks like the runtime limit was exceeded. I will get back after confirming.
Sure, refreshed. |
The fast test failed. @amosbird Please check the latest merge, especially for tests prefixed with
Something went wrong in a80044b
@amosbird Could you clarify why |
Because these tests check the |
Issue for broken |
This may work worse for simple queries - https://pastila.nl/?043a6f47/7cfc432f86e6fd44263b0b6a3960eb71#k/dvV6ZpLnWx3No9s8pmvA== |
This doesn't look related to this change, but I see multiple integration test failures in #88477 where I explicitly disabled use_query_condition_cache and ran all the tests.
This could mean lots of things, but in your case it is because you enabled this setting for the old server as well (see tests with a non-standard tag). P.S. You can find all the details in the artifacts.
Yes, too many parts to do index analysis + only 1 range per part matches the predicate + not much data to scan. There is some contention and CPU cost in looking up and updating the cache. The query condition cache is best when there is a mix of PK & non-PK predicates + lots of ranges to scan (with or without skip indexes) + low selectivity of the predicates. I think we need to look at #87498 and consider this: would a PK-only predicate always be better evaluated by binary search / exclusion scan plus the cost of reading the primary key, versus looking up the query condition cache for that predicate?
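For reference, the PK-only baseline being weighed against a cache lookup can be sketched as two binary searches over a single-column mark index. This is a deliberately simplified model (real ClickHouse PK analysis handles tuple keys, open ranges, and more); all names are illustrative:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct MarkRange
{
    std::size_t begin;
    std::size_t end;
};

// `index[i]` holds the first PK value of mark i, so mark i covers PK values in
// [index[i], index[i+1]). For the predicate `pk >= lo AND pk < hi`, two binary
// searches yield the candidate mark range in O(log n) comparisons, which is
// the cost a cache lookup has to beat for a PK-only predicate.
MarkRange pkMarkRange(const std::vector<int> & index, int lo, int hi)
{
    // Last mark whose first value is <= lo may still contain lo.
    auto b = std::upper_bound(index.begin(), index.end(), lo);
    std::size_t begin = (b == index.begin()) ? 0 : static_cast<std::size_t>(b - index.begin() - 1);

    // Marks whose first value is >= hi cannot contain any value < hi.
    auto e = std::lower_bound(index.begin(), index.end(), hi);
    std::size_t end = static_cast<std::size_t>(e - index.begin());

    return {begin, end};
}
```

Since the in-memory index makes this search nearly free once loaded, the open question above is essentially whether a hash computation plus a shared-cache lookup (with its contention) can ever beat two binary searches.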
…ry condition cache

PR #82380 moved Query Condition Cache filtering before the distributed index analysis code path. QCC can split a single mark range into multiple ranges when it has cached data about which marks don't match. PR #98269 removed the same assertion from `distributedIndexAnalysis.cpp` but missed this one in `ReadFromMergeTree.cpp`. The assertion is unnecessary since the ranges are immediately replaced on the next line.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=98770&sha=ad7e094fdd4b057b0bac0acb5b505270f79f3b3d&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ref: ClickHouse/ClickBench#412
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Improved query performance by refactoring the order and integration of the Query Condition Cache (QCC) with index analysis. QCC filtering is now applied before primary key and skip index analysis, reducing unnecessary index computation. Index analysis has been extended to support multiple range filters, and its filtering results are now stored back into the QCC. This significantly speeds up queries where index analysis dominates execution time, especially those relying on skip indexes (e.g. vector or inverted indexes).
Resolves #85779.