Fixes row-count inconsistency in MergeTreeReaderIndex when the part’s only or final granule contains fewer rows than index_granularity. by fastio · Pull Request #89555 · ClickHouse/ClickHouse

fastio · 2025-11-05T13:16:35Z

Changelog category (leave one):

Critical Bug Fix (crash, data loss, RBAC) or LOGICAL_ERROR

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fixes a row-count mismatch in MergeTreeReaderIndex when the part's only or last granule has fewer rows than index_granularity. Resolves #89691

Details

This PR addresses two distinct corner cases in the read path, both related to row count inconsistencies when MergeTreeReaderIndex is the first reader in the MergeTreeReadersChain.

Case 1 — Single-granule part with fewer rows than index_granularity

When a part consists of only one granule, and its total number of rows is less than the configured index_granularity, the current implementation of MergeTreeReaderIndex unconditionally returns index_granularity rows.
The subsequent reader in the chain (e.g., MergeTreeReader) then reads the actual number of rows from the part.
This discrepancy leads to a mismatch during the continueReadingChain consistency check and triggers a LOGICAL_ERROR.

How to reproduce it?

https://fiddle.clickhouse.com/73b7c1a9-ec7f-462e-9e79-b507e1b2b948

Case 2 — The last granule with fewer rows than index_granularity

A similar issue occurs when the final granule of a part has fewer rows than index_granularity.
In this case as well, MergeTreeReaderIndex reports index_granularity rows, while the following reader produces the true row count, again causing an internal row-count mismatch and resulting in a LOGICAL_ERROR.

How to reproduce it?

https://fiddle.clickhouse.com/5f030c29-feaa-4f47-89c3-6c3beafbbfc7
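
Both cases reduce to the same arithmetic. The following is a minimal sketch, not the actual ClickHouse code: `rowsInGranule` and its parameters are hypothetical names introduced for illustration, and a fixed (non-adaptive) index_granularity is assumed. It shows how the row count reported for a granule would be capped by the rows remaining in the part:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of ClickHouse): number of rows actually stored
// in granule `granule_index` of a part that holds `total_rows` rows, assuming
// a fixed `index_granularity`.
std::size_t rowsInGranule(std::size_t total_rows, std::size_t index_granularity, std::size_t granule_index)
{
    assert(index_granularity > 0);

    const std::size_t first_row = granule_index * index_granularity;
    if (first_row >= total_rows)
        return 0; // the granule lies past the end of the part

    // The last (or only) granule may hold fewer rows than index_granularity.
    return std::min(index_granularity, total_rows - first_row);
}
```

With index_granularity = 8192, a part holding a single row yields `rowsInGranule(1, 8192, 0) == 1`. Reporting 8192 unconditionally for that granule is the kind of discrepancy that the consistency check in continueReadingChain rejects.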

Resolves #89691

clickhouse-gh · 2025-11-05T13:25:36Z

Workflow [PR], commit [a1cf4a8]

Summary: ❌

job_name	test_name	status	info
Fast test		failure
	02233_set_enable_with_statement_cte_perf	FAIL	cidb
	03275_matview_with_union	FAIL	cidb
	03166_skip_indexes_vertical_merge_1	FAIL	cidb
	01710_projections_partial_optimize_aggregation_in_order	FAIL	cidb
	03408_limit_by_rows_before_limit	FAIL	cidb
	02131_used_row_policies_in_query_log	FAIL	cidb
	03100_lwu_27_update_after_on_fly_mutations	FAIL	cidb
	02751_parallel_replicas_bug_chunkinfo_not_set	FAIL	cidb
	00804_test_custom_compression_codecs	FAIL	cidb
	03008_optimize_equal_ranges	FAIL	cidb
	490 more test cases not shown
Build (amd_debug)		dropped
Build (amd_asan)		dropped
Build (amd_tsan)		dropped
Build (amd_msan)		dropped
Build (amd_ubsan)		dropped
Build (amd_binary)		dropped
Build (arm_asan)		dropped
Build (arm_binary)		dropped
Build (arm_tsan)		dropped

Ergus

Please add more details about the issue and the steps to reproduce it.

I would recommend opening a new issue describing the problem and referencing it as Fixes in this PR description.

Also, if you open the issue, you can rename the test to 03534_skip_index_bug####.sql so that we have all the information if something similar happens again in the future or if the test starts failing with other changes. Please check other similar tests.

Ergus · 2025-11-06T12:52:23Z

tests/queries/0_stateless/03534_skip_index_on_data_reading_crash.sql

+-- { echo ON }
+
+SET use_skip_indexes_on_data_read = 1;
+


Please remove these spaces between sets

Ok, got it.

Ergus · 2025-11-06T12:53:04Z

tests/queries/0_stateless/03534_skip_index_on_data_reading_crash.sql

+
+SET merge_tree_read_split_ranges_into_intersecting_and_non_intersecting_injection_probability=0;
+
+


The style checker will complain if there are two empty lines

Thanks, I'll fix it.

Ergus · 2025-11-06T12:54:01Z

tests/queries/0_stateless/03534_skip_index_on_data_reading_crash.sql

+CREATE TABLE tab
+(
+    `id` Int64,
+    `trace_id` FixedString(16) CODEC(ZSTD(1)),


Is the compression really needed to reproduce the issue?

You’re right — compression isn’t required to reproduce the issue. I’ve simplified the test accordingly.

Ergus · 2025-11-06T12:55:26Z

tests/queries/0_stateless/03534_skip_index_on_data_reading_crash.sql

+
+insert into tab select 100, unhex('31E5C3CAFA300A8DE5A84B740E2F2DB0'), 'aaa';
+
+SELECT id,text FROM tab WHERE trace_id = unhex('31E5C3CAFA300A8DE5A84B740E2F2DB0');


Considering that this is intended to address two different issues, could you add a test for each? And a comment is very welcome.

Makes sense — I’ll add individual tests for each case and include a clarifying comment in the code.

Ergus · 2025-11-06T13:01:04Z

src/Storages/MergeTree/MergeTreeRangeReader.cpp

 }

-void MergeTreeRangeReader::ReadResult::adjustLastGranule()
+void MergeTreeRangeReader::ReadResult::adjustLastGranule(std::optional<size_t> actual_num_read_rows)


Just a suggestion: if you use an ssize_t with -1 as the default instead of a std::optional, it will be simpler and lighter. In simple use cases like this, using an optional is a bit overkill IMO ;)

Ok, thanks for your suggestions.
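
For comparison, the two parameter styles under discussion can be sketched as follows. This is an illustration only: `resolveRowsOptional`, `resolveRowsSentinel` and the `fallback_rows` parameter are hypothetical and do not reproduce the body of adjustLastGranule.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <sys/types.h> // ssize_t (POSIX)

// Variant from the diff: "no value supplied" is expressed with std::optional.
std::size_t resolveRowsOptional(std::size_t fallback_rows, std::optional<std::size_t> actual_num_read_rows = std::nullopt)
{
    return actual_num_read_rows.value_or(fallback_rows);
}

// Variant suggested in review: -1 is the "no value supplied" sentinel.
std::size_t resolveRowsSentinel(std::size_t fallback_rows, ssize_t actual_num_read_rows = -1)
{
    assert(actual_num_read_rows >= -1); // -1 is the only valid negative value
    return actual_num_read_rows < 0 ? fallback_rows : static_cast<std::size_t>(actual_num_read_rows);
}
```

The sentinel variant avoids the optional wrapper, at the cost of relying on callers never passing any negative value other than -1.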

fastio · 2025-11-07T03:13:31Z

Please add more details about the issue and the steps to reproduce it.

I would recommend opening a new issue describing the problem and referencing it as Fixes in this PR description.

Also, if you open the issue, you can rename the test to 03534_skip_index_bug####.sql so that we have all the information if something similar happens again in the future or if the test starts failing with other changes. Please check other similar tests.

@Ergus Thanks a lot for the detailed review and suggestions!

I’ll open a new issue describing the problem in detail, including clear reproduction steps and context about the affected settings.
I’ll push the updates shortly.

Ergus

Hi @fastio

If you created the issue related to this, please explain it briefly in the test header and rename your new test to something like 02346_text_index_bugNNNN.sql, similar to other tests. That way we keep all the information in case this test fails in the future ;)

Ergus · 2025-11-17T13:19:12Z

tests/queries/0_stateless/03534_skip_index_bug_in_read_chain.sql

@@ -0,0 +1,31 @@
+-- Tags: no-parallel-replicas
+-- no-parallel-replicas: use_skip_indexes_on_data_read is not supported with parallel replicas.


We need to check this because it has changed. IIRC, use_skip_indexes_on_data_read now works with parallel replicas.

Ergus · 2025-11-17T13:20:54Z

tests/queries/0_stateless/03534_skip_index_bug_in_read_chain.sql

+
+SET use_skip_indexes_on_data_read = 1;
+SET use_skip_indexes = 1;
+SET use_query_condition_cache=0;


Very Minor: use_query_condition_cache = 0

Ergus · 2025-11-17T13:21:03Z

tests/queries/0_stateless/03534_skip_index_bug_in_read_chain.sql

+SET use_skip_indexes_on_data_read = 1;
+SET use_skip_indexes = 1;
+SET use_query_condition_cache=0;
+SET merge_tree_read_split_ranges_into_intersecting_and_non_intersecting_injection_probability=0;


Same as above.

Ergus · 2025-11-17T23:22:44Z

Hi @fastio

Many fast tests seem to be failing. It is not obvious to me how they are related to your changes. Could you take a look?

fastio · 2025-11-18T02:20:04Z

Hi @fastio

Many fast tests seem to be failing. It is not obvious to me how they are related to your changes. Could you take a look?

Thanks. This PR currently fixes two separate bugs. I think it would be better to split them.
I’ll close this PR and submit separate PRs for each issue.

Fix MergeTreeReaderIndex row count mismatch for last granule

92ff1a5

nikitamikhaylov added the "can be tested" label Nov 5, 2025

clickhouse-gh bot added the pr-critical-bugfix and pr-must-backport labels Nov 5, 2025

add test case

785de72

fastio changed the title ~~Fix crash when use_skip_indexes_on_data_read=1 and last granule has fewer rows~~ Fix MergeTreeReaderIndex row count mismatch for last granule Nov 6, 2025

Ergus self-assigned this Nov 6, 2025

Ergus reviewed Nov 6, 2025

fastio changed the title ~~Fix MergeTreeReaderIndex row count mismatch for last granule~~ Fixes row-count inconsistency in MergeTreeReaderIndex when the part’s only or final granule contains fewer rows than index_granularity. Nov 7, 2025

test cases

66c72df

fastio requested a review from Ergus November 17, 2025 11:49

fastio marked this pull request as draft November 17, 2025 12:14

fastio marked this pull request as ready for review November 17, 2025 13:09

Merge branch 'ClickHouse:master' into bugfix-MergeTreeReaderIndexForLastGranule

a1cf4a8

Ergus reviewed Nov 17, 2025

fastio closed this Nov 18, 2025

This was referenced Nov 18, 2025

Fixbug merge tree reader rows incorrect #90254

Merged

Fix row count mismatch in MTReaderIndex when the last granule contains fewer rows than index_granularity #90614

Closed

