Reduce redundant pack reads during building bitmap filter for delta merge case (#9876) by ti-chi-bot · Pull Request #10458 · pingcap/tiflash

ti-chi-bot · 2025-09-25T14:26:04Z

This is an automated cherry-pick of #9876

What problem does this PR solve?

Issue Number: close #9875

Problem Summary:
See #9875
This optimization can provide a significant performance boost when the data being read is not updated frequently.

What is changed and how it works?

Reduce redundant pack reads during bitmap filter building for delta merge case

Run CHBenchmark 1500: Overall 26.5% Improvement

mysql> select sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) delta_rows, TIDB_TABLE from INFORMATION_SCHEMA.TIFLASH_SEGMENTS where TIDB_DATABASE='chbench_delta' group by TIDB_TABLE order by delta_rows;
+------------+------------+
| delta_rows | TIDB_TABLE |
+------------+------------+
|          0 | item       |
|         15 | region     |
|         75 | nation     |
|       4704 | warehouse  |
|      20000 | supplier   |
|      49790 | district   |
|     183924 | new_order  |
|    1022080 | history    |
|    1193288 | orders     |
|    1996224 | customer   |
|    7977425 | stock      |
|   41277141 | order_line |
+------------+------------+
12 rows in set (0.09 sec)

mysql> select sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) delta_rows, sum(stable_rows) stable_rows, 100 * sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) / sum(stable_rows) delta_percent from INFORMATION_SCHEMA.TIFLASH_SEGMENTS where TIDB_DATABASE='chbench_delta';
+------------+-------------+---------------+
| delta_rows | stable_rows | delta_percent |
+------------+-------------+---------------+
|   53724666 |  2283695006 |        2.3525 |
+------------+-------------+---------------+
1 row in set (0.09 sec)

The above table shows the delta rows for each table in chbench_delta database.

Question	Original	Optimization	Improvement
Q1	3.30	1.17	182.05%
Q2	0.90	0.88	2.27%
Q3	4.34	1.86	133.33%
Q4	5.11	2.88	77.43%
Q5	14.07	11.56	21.71%
Q6	2.98	1.00	198.00%
Q7	8.01	5.62	42.53%
Q8	5.05	2.83	78.45%
Q9	46.22	44.01	5.02%
Q10	27.34	25.09	8.97%
Q11	0.56	0.46	21.74%
Q12	5.34	3.22	65.84%
Q13	0.75	0.39	92.31%
Q14	4.05	2.04	98.53%
Q15	13.11	9.36	40.06%
Q16	10.24	10.26	-0.19%
Q17	6.88	2.49	176.31%
Q18	35.46	32.14	10.33%
Q19	0.17	0.04	325.00%
Q20	3.72	1.59	133.96%
Q21	11.53	6.66	73.12%
Q22	0.44	0.10	340.00%
Sum	209.57	165.65	26.51%

All Stable Performance Comparison

The following table presents the performance improvements for All Stable compared to both the original version (with Delta) and the optimized version (with Delta):

After compacting the data, All Stable only shows a 5.86% improvement compared to the current optimized version with delta data.
However, compared to the original version with delta data, All Stable shows a significant 33.93% improvement.

The chbenchmark data was inserted into another database. Then the compact table command is executed. As a result, the data might differ. So there are some differences in the query plans. For example, in Q20, the query plan uses hash partition join and includes late materialization for order_line table in the All Stable version, while original version uses broadcast join and no late materialization for order_line table. This causes Q20 to run slower in All Stable compared to the original version due to exchange overhead and low selectivity.

Query	All Stable	All Stable Improvement over Original	All Stable Improvement over Optimization
Q1	1.00	230.00%	17.00%
Q2	0.87	3.45%	1.15%
Q3	1.39	212.23%	33.81%
Q4	2.71	88.56%	6.27%
Q5	11.10	26.76%	4.14%
Q6	0.76	292.11%	31.58%
Q7	5.47	46.44%	2.74%
Q8	2.39	111.30%	18.41%
Q9	39.85	15.98%	10.44%
Q10	24.50	11.59%	2.41%
Q11	0.51	9.80%	-9.80%
Q12	2.99	78.60%	7.69%
Q13	0.38	97.37%	2.63%
Q14	1.86	117.74%	9.68%
Q15	9.02	45.34%	3.77%
Q16	10.32	-0.78%	-0.58%
Q17	1.55	343.87%	60.65%
Q18	32.01	10.78%	0.41%
Q19	0.04	325.00%	0.00%
Q20	1.81	105.52%	-12.15%
Q21	5.88	96.09%	13.27%
Q22	0.07	528.57%	42.86%
Sum	156.48	33.93%	5.86%

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

Improve the performance of `TableScan` by skipping unnecessary row reads

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot · 2025-09-25T14:26:09Z

@gengliqi This PR has conflicts, I have hold it.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

ti-chi-bot · 2025-09-25T14:26:11Z

@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot · 2025-09-25T14:26:16Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign fzhedu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

JaySon-Huang · 2025-09-25T14:27:08Z

Replaced by #10293

ti-chi-bot · 2025-09-25T14:35:46Z

@ti-chi-bot: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-integration-test	`984716f`	link	true	`/test pull-integration-test`
pull-unit-test	`984716f`	link	true	`/test pull-unit-test`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

This is an automated cherry-pick of pingcap#9876

984716f

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot mentioned this pull request Sep 25, 2025

Reduce redundant pack reads during building bitmap filter for delta merge case #9876

Merged

12 tasks

ti-chi-bot bot added the do-not-merge/cherry-pick-not-approved label Sep 25, 2025

ti-chi-bot assigned gengliqi Sep 25, 2025

JaySon-Huang closed this Sep 25, 2025

ti-chi-bot bot added cherry-pick-approved Cherry pick PR approved by release team. and removed do-not-merge/cherry-pick-not-approved labels Sep 26, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce redundant pack reads during building bitmap filter for delta merge case (#9876)#10458

Reduce redundant pack reads during building bitmap filter for delta merge case (#9876)#10458
ti-chi-bot wants to merge 1 commit intopingcap:release-8.5from
ti-chi-bot:cherry-pick-9876-to-release-8.5

ti-chi-bot commented Sep 25, 2025

Uh oh!

ti-chi-bot commented Sep 25, 2025

Uh oh!

ti-chi-bot bot commented Sep 25, 2025

Uh oh!

ti-chi-bot bot commented Sep 25, 2025

Uh oh!

JaySon-Huang commented Sep 25, 2025

Uh oh!

ti-chi-bot bot commented Sep 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ti-chi-bot commented Sep 25, 2025

What problem does this PR solve?

What is changed and how it works?

Run CHBenchmark 1500: Overall 26.5% Improvement

All Stable Performance Comparison

Check List

Release note

Uh oh!

ti-chi-bot commented Sep 25, 2025

Uh oh!

ti-chi-bot bot commented Sep 25, 2025

Uh oh!

ti-chi-bot bot commented Sep 25, 2025

Uh oh!

JaySon-Huang commented Sep 25, 2025

Uh oh!

ti-chi-bot bot commented Sep 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants