raftstore: calculate the slow score by considering individual disk performance factors #17801
Merged
ti-chi-bot[bot] merged 23 commits into tikv:master on Nov 29, 2024
…isk performance factors. Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
Contributor, Author:
/cc @hbisheng ptal, thx
hbisheng reviewed on Nov 12, 2024
ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request on Dec 2, 2024:
close tikv#17884 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Member:
In response to a cherrypick label: new pull request created to branch
ti-chi-bot[bot] pushed a commit that referenced this pull request on Dec 2, 2024:
…rformance factors (#17801) (#17912) close #17884 This pr introduces an extra and individual inspector to detect whether there exists I/O hung issues on kvdb disk, if the kvdb is deployed with a separate mount path. Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io> Signed-off-by: lucasliang <nkcs_lykx@hotmail.com> Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
ti-chi-bot pushed a commit to ti-chi-bot/tikv that referenced this pull request on Dec 11, 2024:
close tikv#17884 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot[bot] pushed a commit that referenced this pull request on Dec 11, 2024:
…rformance factors (#17801) (#17980) close #17884 This pr introduces an extra and individual inspector to detect whether there exists I/O hung issues on kvdb disk, if the kvdb is deployed with a separate mount path. Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io> Signed-off-by: lucasliang <nkcs_lykx@hotmail.com> Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
This was referenced Jul 18, 2025
okJiang added a commit to okJiang/tikv that referenced this pull request on Oct 13, 2025:
…idering individual disk performance factors.(tikv#17801) (tikv#17901)" This reverts commit 8b006a5. Signed-off-by: okjiang <819421878@qq.com>
okJiang pushed a commit to okJiang/tikv that referenced this pull request on Oct 13, 2025:
…rformance factors (tikv#17801) close tikv#17884 This pr introduces an extra and individual inspector to detect whether there exists I/O hung issues on kvdb disk, if the kvdb is deployed with a separate mount path. Signed-off-by: lucasliang <nkcs_lykx@hotmail.com> Co-authored-by: Bisheng Huang <hbisheng@gmail.com> Signed-off-by: okjiang <819421878@qq.com>
What is changed and how it works?
Issue Number: Close #17884
What's Changed:
TiKV's current SlowScore mechanism can detect and handle I/O latency issues on Raft disks. As an improvement, the write latency of KV disks can also be taken into account to identify abnormal conditions on KV disks.
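As a rough illustration of the idea developed in the rest of this description, the sketch below times a small file write and folds the result into a per-disk score. Everything in it is an assumption made for illustration: the names `probe_write_latency`, `DiskScore`, and `overall_score`, the doubling/decay update rule, and the choice to report the worst per-disk score are hypothetical and are not taken from the PR's code or from TiKV's actual SlowScore formula.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;
use std::time::{Duration, Instant};

/// Hypothetical probe: append a short string to `path`, sync it to disk,
/// and return how long the whole operation took.
fn probe_write_latency(path: &Path) -> io::Result<Duration> {
    let start = Instant::now();
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(b"inspect\n")?;
    // Force the data to the device so the disk itself is exercised.
    file.sync_all()?;
    Ok(start.elapsed())
}

/// Hypothetical per-disk score in [1.0, 100.0]; higher means slower.
struct DiskScore {
    value: f64,
    timeout: Duration,
}

impl DiskScore {
    fn new(timeout: Duration) -> Self {
        DiskScore { value: 1.0, timeout }
    }

    /// Illustrative update rule (an assumption, not TiKV's formula):
    /// double the score when a probe exceeds the timeout, otherwise decay by 1.
    fn record(&mut self, latency: Duration) {
        if latency > self.timeout {
            self.value = (self.value * 2.0).min(100.0);
        } else {
            self.value = (self.value - 1.0).max(1.0);
        }
    }
}

/// Assumed aggregation: report the worst per-disk score. `kv` is `None`
/// when kvdb shares the Raft disk and has no separate inspector.
fn overall_score(raft: &DiskScore, kv: Option<&DiskScore>) -> f64 {
    match kv {
        Some(kv) => raft.value.max(kv.value),
        None => raft.value,
    }
}
```

In this sketch a caller would run `probe_write_latency` on a timer for each disk, pass each result to the matching `DiskScore::record`, and read `overall_score` when reporting store health.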

Accordingly, this PR introduces an extra, independent inspector, triggered by inspect-kvdb-interval, that inspects the I/O latency of kvdb when kvdb is deployed on a separate mount path. The inspector periodically sends the measured latency to the SlowScore algorithm so that it can detect I/O hangs on the kvdb disk.

Additionally, to mitigate the effects of the complex foreground and background I/O operations triggered by RocksDB, the inspector simply writes a string to a designated file and records the time this operation takes, logging it as the apply_process_duration. Testing showed this approach to be valid and more accurate than directly recording the duration of applying to RocksDB.

Moreover, if raft-engine and kvdb are deployed on the same mount path, the new inspector is not created, and disk health is inspected on the schedule set by inspect-interval, as before.

Related changes
pingcap/docs/pingcap/docs-cn:

Check List
Tests
The following example shows that this mechanism works as expected when I/O delays are injected into the kvdb disk under tpcc-1k workloads:

Side effects
Release note