Skip to content

raftstore: remove stale ranges by DeleteByKeys rather than ingesting.#18040

Merged
ti-chi-bot[bot] merged 18 commits intotikv:masterfrom
LykxSassinator:opt_remove_stale_range
Jan 21, 2025
Merged

raftstore: remove stale ranges by DeleteByKeys rather than ingesting.#18040
ti-chi-bot[bot] merged 18 commits intotikv:masterfrom
LykxSassinator:opt_remove_stale_range

Conversation

@LykxSassinator
Copy link
Contributor

@LykxSassinator LykxSassinator commented Dec 23, 2024

What is changed and how it works?

Issue Number: Close #18107, Ref #18042

This PR mainly contains the following parts for optimization on scaling, used to mitigate the impacts, introduced by unnecessary ingesting sst files:

  • Directly clearing stale ranges by DeleteByKeys during the balancing regions process.
  • Do not clear data of offline stores during the scale-in process, as this data will be automatically cleared when the corresponding node goes offline.

What's Changed:

Optimizing the processing of clearing stale-ranges by DeleteByKeys, rather than ingesting.

This optimization is used to mitigate the impacts, introduced by unnecessary ingesting sst
files on latency when scaling.

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Taking the grpc messages duration in TiKV metrics panel as examples, positive performance feedbacks on reducing the long-tail latency can be reviewed from the following comparison results:

Commit Comparisons on grpc messages duration
master image
This PR image

Meanwhile, the following E2E long-tail reduction also proves that this PR makes positive improvements effects:

Commit Scale-Out (Before -> Scaling) Scale-in (Before -> Scaling)
master P99: 7.2ms -> ~ 11.3 ms P999: 62ms -> ~89.2 ms P99: 7.2ms -> ~ 16.1 ms P999: 62ms -> ~100 ms
This PR P99: 7.2 ms -> ~10.6 ms P999: 62 ms -> ~76.9ms P99: 7.2ms -> ~ 15.3 ms P999: 62ms -> ~96 ms

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

Optimizing the processing of clearing stale-ranges by DeleteByKeys to mitigate the impacts
on latency.

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Dec 23, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has signed the dco. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 23, 2024
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/needs-linked-issue do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Dec 23, 2024
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
@LykxSassinator LykxSassinator marked this pull request as ready for review January 6, 2025 02:13
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 6, 2025
@LykxSassinator
Copy link
Contributor Author

/retest

@LykxSassinator
Copy link
Contributor Author

/cc @hhwyt ptal

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jan 7, 2025

@LykxSassinator: GitHub didn't allow me to request PR reviews from the following users: hhwyt, ptal.

Note that only tikv members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @hhwyt ptal

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

)
} else {
(
DeleteStrategy::DeleteByWriter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it becomes dead code after this PR, could you remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can add additional annotations here for reviewers and developers, as this approach could be applied to other delete operations if necessary.

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
}

fn delete_all_in_range(&self, ranges: &[Range<'_>]) -> Result<()> {
fn delete_all_in_range(&self, ranges: &[Range<'_>], forcely_by_key: bool) -> Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems all places pass forcely_by_key = true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, introducing this extra parameter serves as a reminder to retain the DeleteByWriter strategy, which might be applied to other delete operations if necessary.

@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jan 8, 2025
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Jan 21, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Connor1996, overvenus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [Connor1996,overvenus]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
@LykxSassinator LykxSassinator removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 21, 2025
@ti-chi-bot ti-chi-bot bot merged commit effe615 into tikv:master Jan 21, 2025
1 check passed
@ti-chi-bot ti-chi-bot bot added this to the Pool milestone Jan 21, 2025
@LykxSassinator LykxSassinator added the needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. label May 16, 2025
ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request May 16, 2025
ref tikv#18042, close tikv#18107

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.5: #18461.
But this PR has conflicts, please resolve them!

LykxSassinator added a commit to LykxSassinator/tikv that referenced this pull request Jun 6, 2025
LykxSassinator added a commit that referenced this pull request Jun 6, 2025
ti-chi-bot bot pushed a commit that referenced this pull request Jun 10, 2025
 

Rollup of following sst ingestion optimizations

* raftstore: directly write kvs rather than ingestion when merging small regions. (#17408) (#18518)
* raftstore: remove stale ranges by DeleteByKeys rather than ingesting. (#18040) (#18519)
* raftstore: support rocksdb writes during ingestion #18096 (#18520)
* storage: support online config for flow-control module (#17396). (#18523)
* sst_importer: allow write during ingesting sst (#18514) (#18522)

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
Signed-off-by: Neil Shen <overvenus@gmail.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
@hhwyt
Copy link
Contributor

hhwyt commented Jun 19, 2025

/cherry-pick 8.1

@hhwyt hhwyt added the affects-8.1 This bug affects the 8.1.x(LTS) versions. label Jun 19, 2025
@ti-chi-bot
Copy link
Member

@hhwyt: cannot checkout 8.1: error checking out 8.1: exit status 1. output: error: pathspec '8.1' did not match any file(s) known to git

Details

In response to this:

/cherry-pick 8.1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@hhwyt
Copy link
Contributor

hhwyt commented Jun 19, 2025

/cherry-pick release-8.1

ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Jun 19, 2025
ref tikv#18042, close tikv#18107

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

@hhwyt: new pull request created to branch release-8.1: #18569.
But this PR has conflicts, please resolve them!

Details

In response to this:

/cherry-pick release-8.1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Jul 2, 2025
ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request Jul 2, 2025
ref tikv#18042, close tikv#18107

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #18622.
But this PR has conflicts, please resolve them!

ti-chi-bot bot pushed a commit that referenced this pull request Jul 7, 2025
…#18040) (#18622)

ref #18042, close #18107

Optimizing the processing of clearing stale-ranges by DeleteByKeys, rather than ingesting.

This optimization is used to mitigate the impacts, introduced by unnecessary ingesting sst
files on latency when scaling.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
ti-chi-bot bot pushed a commit that referenced this pull request Jul 9, 2025
…#18040) (#18461)

ref #18042, close #18107

Optimizing the processing of clearing stale-ranges by DeleteByKeys, rather than ingesting.

This optimization is used to mitigate the impacts, introduced by unnecessary ingesting sst
files on latency when scaling.

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
ti-chi-bot bot pushed a commit that referenced this pull request Jul 9, 2025
…#18040) (#18569)

ref #18042, close #18107

Optimizing the processing of clearing stale-ranges by DeleteByKeys, rather than ingesting.

This optimization is used to mitigate the impacts, introduced by unnecessary ingesting sst
files on latency when scaling.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects-8.1 This bug affects the 8.1.x(LTS) versions. approved dco-signoff: yes Indicates the PR's author has signed the dco. lgtm needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scaling: optimize the clearing peer mechanism to mitigate the impacts to tail latency.

5 participants