Fixes row-count inconsistency in MergeTreeReaderIndex when the part’s only or final granule contains fewer rows than index_granularity.#89555
Conversation
|
Workflow [PR], commit [a1cf4a8] Summary: ❌
|
Ergus
left a comment
There was a problem hiding this comment.
Please add more details about the issue and steps on how to reproduce it.
I would recommend to open a new issue describing the problem and reference it as Fixes in this PR description.
Also if you open the issue you can rename the Test as: 03534_skip_index_bug####.sql to have all the information if something similar happens again in the future or if the tests starts failing with other changes. Please check other similar tests.
| -- { echo ON } | ||
|
|
||
| SET use_skip_indexes_on_data_read = 1; | ||
|
|
There was a problem hiding this comment.
Please remove these spaces between sets
|
|
||
| SET merge_tree_read_split_ranges_into_intersecting_and_non_intersecting_injection_probability=0; | ||
|
|
||
|
|
There was a problem hiding this comment.
The style checker will complain if there are two empty lines
| CREATE TABLE tab | ||
| ( | ||
| `id` Int64, | ||
| `trace_id` FixedString(16) CODEC(ZSTD(1)), |
There was a problem hiding this comment.
Is the compression really needed to reproduce the issue?
There was a problem hiding this comment.
You’re right — compression isn’t required to reproduce the issue. I’ve simplified the test accordingly.
|
|
||
| insert into tab select 100, unhex('31E5C3CAFA300A8DE5A84B740E2F2DB0'), 'aaa'; | ||
|
|
||
| SELECT id,text FROM tab WHERE trace_id = unhex('31E5C3CAFA300A8DE5A84B740E2F2DB0'); |
There was a problem hiding this comment.
Considering that this is intended to address two different issues, could you add a test for each? And a comment is very welcome.
There was a problem hiding this comment.
Makes sense — I’ll add individual tests for each case and include a clarifying comment in the code.
| } | ||
|
|
||
| void MergeTreeRangeReader::ReadResult::adjustLastGranule() | ||
| void MergeTreeRangeReader::ReadResult::adjustLastGranule(std::optional<size_t> actual_num_read_rows) |
There was a problem hiding this comment.
Just a Suggestion: If you use a ssize_t with -1 as default instead of an std::optional it will be simpler and lighter. In simple use cases like this, using an optional is a bit overkill IMO ;)
There was a problem hiding this comment.
Ok, thanks for your suggestions.
@Ergus Thanks a lot for the detailed review and suggestions! I’ll open a new issue describing the problem in detail, including clear reproduction steps and context about the affected settings. |
Ergus
left a comment
There was a problem hiding this comment.
Hi @fastio
If you created the issue related with this; please, explain it briefly in the test header and rename your new test like: 02346_text_index_bugNNNN.sql similar to other tests. So we keep all the information in the future if this test fails ;)
| @@ -0,0 +1,31 @@ | |||
| -- Tags: no-parallel-replicas | |||
| -- no-parallel-replicas: use_skip_indexes_on_data_read is not supported with parallel replicas. | |||
There was a problem hiding this comment.
we need to check this because it has changed. IIRC use_skip_indexes_on_data_read now works with parallel replicas
|
|
||
| SET use_skip_indexes_on_data_read = 1; | ||
| SET use_skip_indexes = 1; | ||
| SET use_query_condition_cache=0; |
There was a problem hiding this comment.
Very Minor: use_query_condition_cache = 0
| SET use_skip_indexes_on_data_read = 1; | ||
| SET use_skip_indexes = 1; | ||
| SET use_query_condition_cache=0; | ||
| SET merge_tree_read_split_ranges_into_intersecting_and_non_intersecting_injection_probability=0; |
|
Hi @fastio Many fast tests seem to be failing. It is not obvious for me how are them related with your changes. Could you give it a look? |
Thanks. This PR currently fixes two separate bugs. I think it would be better to split them. |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fixes a row-count mismatch in MergeTreeReaderIndex when the part's only or last granule has fewer rows than index_granularity. Resolves #89691
Details
This PR addresses two distinct corner cases in the read path, both related to row count inconsistencies when MergeTreeReaderIndex is the first reader in the MergeTreeReadersChain.
Case 1 — Single-granule part with fewer rows than index_granularity
When a part consists of only one granule, and its total number of rows is less than the configured index_granularity, the current implementation of MergeTreeReaderIndex unconditionally returns index_granularity rows.
The subsequent reader in the chain (e.g., MergeTreeReader) then reads the actual number of rows from the part.
This discrepancy leads to a mismatch during the continueReadingChain consistency check and triggers a LOGICAL_ERROR.
How to reproduce it?
https://fiddle.clickhouse.com/73b7c1a9-ec7f-462e-9e79-b507e1b2b948
Case 2 — The last granule with fewer rows than index_granularity
A similar issue occurs when the final granule of a part has fewer rows than index_granularity.
In this case as well, MergeTreeReaderIndex reports index_granularity rows, while the following reader produces the true row count, again causing an internal row-count mismatch and resulting in a LOGICAL_ERROR.
How to reproduce it?
https://fiddle.clickhouse.com/5f030c29-feaa-4f47-89c3-6c3beafbbfc7
Resolves #89691