Skip to content

Reduce redundant pack reads during building bitmap filter for delta merge case (#9876)#10458

Closed
ti-chi-bot wants to merge 1 commit intopingcap:release-8.5from
ti-chi-bot:cherry-pick-9876-to-release-8.5
Closed

Reduce redundant pack reads during building bitmap filter for delta merge case (#9876)#10458
ti-chi-bot wants to merge 1 commit intopingcap:release-8.5from
ti-chi-bot:cherry-pick-9876-to-release-8.5

Conversation

@ti-chi-bot
Copy link
Member

This is an automated cherry-pick of #9876

What problem does this PR solve?

Issue Number: close #9875

Problem Summary:
See #9875
This optimization can provide a significant performance boost when the data being read is not updated frequently.

What is changed and how it works?

Reduce redundant pack reads during bitmap filter building for delta merge case

Run CHBenchmark 1500: Overall 26.5% Improvement

mysql> select sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) delta_rows, TIDB_TABLE from INFORMATION_SCHEMA.TIFLASH_SEGMENTS where TIDB_DATABASE='chbench_delta' group by TIDB_TABLE order by delta_rows;
+------------+------------+
| delta_rows | TIDB_TABLE |
+------------+------------+
|          0 | item       |
|         15 | region     |
|         75 | nation     |
|       4704 | warehouse  |
|      20000 | supplier   |
|      49790 | district   |
|     183924 | new_order  |
|    1022080 | history    |
|    1193288 | orders     |
|    1996224 | customer   |
|    7977425 | stock      |
|   41277141 | order_line |
+------------+------------+
12 rows in set (0.09 sec)

mysql> select sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) delta_rows, sum(stable_rows) stable_rows, 100 * sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) / sum(stable_rows) delta_percent from INFORMATION_SCHEMA.TIFLASH_SEGMENTS where TIDB_DATABASE='chbench_delta';
+------------+-------------+---------------+
| delta_rows | stable_rows | delta_percent |
+------------+-------------+---------------+
|   53724666 |  2283695006 |        2.3525 |
+------------+-------------+---------------+
1 row in set (0.09 sec)

The above table shows the delta rows for each table in chbench_delta database.

Question Original Optimization Improvement
Q1 3.30 1.17 182.05%
Q2 0.90 0.88 2.27%
Q3 4.34 1.86 133.33%
Q4 5.11 2.88 77.43%
Q5 14.07 11.56 21.71%
Q6 2.98 1.00 198.00%
Q7 8.01 5.62 42.53%
Q8 5.05 2.83 78.45%
Q9 46.22 44.01 5.02%
Q10 27.34 25.09 8.97%
Q11 0.56 0.46 21.74%
Q12 5.34 3.22 65.84%
Q13 0.75 0.39 92.31%
Q14 4.05 2.04 98.53%
Q15 13.11 9.36 40.06%
Q16 10.24 10.26 -0.19%
Q17 6.88 2.49 176.31%
Q18 35.46 32.14 10.33%
Q19 0.17 0.04 325.00%
Q20 3.72 1.59 133.96%
Q21 11.53 6.66 73.12%
Q22 0.44 0.10 340.00%
Sum 209.57 165.65 26.51%

All Stable Performance Comparison

The following table presents the performance improvements for All Stable compared to both the original version (with Delta) and the optimized version (with Delta):

  • After compacting the data, All Stable only shows a 5.86% improvement compared to the current optimized version with delta data.
  • However, compared to the original version with delta data, All Stable shows a significant 33.93% improvement.

The chbenchmark data was inserted into another database. Then the compact table command is executed. As a result, the data might differ. So there are some differences in the query plans. For example, in Q20, the query plan uses hash partition join and includes late materialization for order_line table in the All Stable version, while original version uses broadcast join and no late materialization for order_line table. This causes Q20 to run slower in All Stable compared to the original version due to exchange overhead and low selectivity.

Query All Stable All Stable Improvement over Original All Stable Improvement over Optimization
Q1 1.00 230.00% 17.00%
Q2 0.87 3.45% 1.15%
Q3 1.39 212.23% 33.81%
Q4 2.71 88.56% 6.27%
Q5 11.10 26.76% 4.14%
Q6 0.76 292.11% 31.58%
Q7 5.47 46.44% 2.74%
Q8 2.39 111.30% 18.41%
Q9 39.85 15.98% 10.44%
Q10 24.50 11.59% 2.41%
Q11 0.51 9.80% -9.80%
Q12 2.99 78.60% 7.69%
Q13 0.38 97.37% 2.63%
Q14 1.86 117.74% 9.68%
Q15 9.02 45.34% 3.77%
Q16 10.32 -0.78% -0.58%
Q17 1.55 343.87% 60.65%
Q18 32.01 10.78% 0.41%
Q19 0.04 325.00% 0.00%
Q20 1.81 105.52% -12.15%
Q21 5.88 96.09% 13.27%
Q22 0.07 528.57% 42.86%
Sum 156.48 33.93% 5.86%

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Improve the performance of `TableScan` by skipping unnecessary row reads

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot ti-chi-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR. labels Sep 25, 2025
@ti-chi-bot
Copy link
Member Author

@gengliqi This PR has conflicts, I have hold it.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2025

@ti-chi-bot: ## If you want to know how to resolve it, please read the guide in TiDB Dev Guide.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign fzhedu for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@JaySon-Huang
Copy link
Contributor

Replaced by #10293

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Sep 25, 2025

@ti-chi-bot: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-integration-test 984716f link true /test pull-integration-test
pull-unit-test 984716f link true /test pull-unit-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ti-chi-bot ti-chi-bot bot added cherry-pick-approved Cherry pick PR approved by release team. and removed do-not-merge/cherry-pick-not-approved labels Sep 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherry-pick-approved Cherry pick PR approved by release team. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants