Unsafe Recovery: Fix TiFlash blocks commit idx forwarding in unsafe recovery#18211
Unsafe Recovery: Fix TiFlash blocks commit idx forwarding in unsafe recovery#18211ti-chi-bot[bot] merged 5 commits intotikv:masterfrom
Conversation
Signed-off-by: Yang Zhang <yang.zhang@pingcap.com>
|
/test pull-unit-test |
Signed-off-by: Yang Zhang <yang.zhang@pingcap.com>
| // Learners like TiFlash are ignored, because they may be tombstoned in earlier | ||
| // stage of the unsafe recovery process, so that forced leader | ||
| // cannot append logs to them. | ||
| if failed_stores.contains(&peer.get_store_id()) || peer.get_role() == PeerRole::Learner |
There was a problem hiding this comment.
we can't simply pass learner, considering the case where all left peers are learners. We should check whether it is on tiflash by checking the label. I think you can get the store from StoreGroup
There was a problem hiding this comment.
I considered this case. This change still works in this case, doesn't it? After all, Raft itself only update commit idx based on voters' progresses.
There was a problem hiding this comment.
Yes, but in the case of unsafe recovery, learner itself could be the forced leader
There was a problem hiding this comment.
Updated, only ignoring TiFlash nodes.
But, just FYI, using StoreGroup has a potential bug. In non-dr-auto-sync deployment, any newly provisioned TiFlash node after the regular TiKV nodes started won't be ignored, since labels are only updated when switching replication status.
I still think learner is better, even if the learner itself could be the forced leader, it is still OK, since let mut replicated_idx = self.raft_group.raft.raft_log.persisted; replicated_idx is initialized with its own log index.
51a1396 to
4eea002
Compare
[LGTM Timeline notifier]Timeline:
|
|
/retest |
|
@v01dstar: Your PR was out of date, I have automatically updated it for you. If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Connor1996, glorv The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
In response to a cherrypick label: new pull request created to branch |
|
In response to a cherrypick label: new pull request created to branch |
What is changed and how it works?
Issue Number: Close #18197
What's Changed:
Related changes
pingcap/docs/pingcap/docs-cn:Check List
Tests
Side effects
Release note