raftstore: Execute recovery plan via raft#12022

Merged
ti-chi-bot merged 59 commits into tikv:master from v01dstar:recover_via_raft
May 19, 2022

Conversation

@v01dstar
Member

@v01dstar v01dstar commented Feb 25, 2022

Signed-off-by: v01dstar yang.zhang@pingcap.com

What is changed and how it works?

Issue Number: ref #10483

What's Changed:

This PR makes TiKV execute the unsafe recovery plan through Raft.

Related changes

N/A

Check List

Tests

  • Integration Test

Side effects

N/A

Release note

None

@ti-chi-bot
Member

ti-chi-bot commented Feb 25, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • 5kbpers
  • Connor1996

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Details

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. contribution This PR is from a community contributor. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 25, 2022
@Connor1996 Connor1996 changed the title Recover via raft raftstore: Execute recovery plan via raft Feb 28, 2022
@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 28, 2022
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 1, 2022
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 3, 2022
@NingLin-P
Member

NingLin-P commented Mar 4, 2022

Removing peers one by one may be problematic. Consider the following case:

  1. a region has the peer list (1, 2, 3, 4, 5, 6, 7), where 4, 5, 6 and 7 are on down stores, so we try to remove them one by one
  2. after removing 6 and 7, the three remaining alive peers 1, 2, 3 can form a quorum of the intermediate configuration (1, 2, 3, 4, 5), so they may (and can) elect a new leader
  3. there may then be two leaders. Although they have different terms, the force_leader can commit logs without replicating to a majority of peers, which may produce conflicting committed logs and cause data corruption.

Using joint consensus to atomically remove multiple peers with one command (one log entry) would avoid this issue, and would also be more efficient.
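The quorum arithmetic behind this hazard can be sketched in plain Rust (a simplified illustration; `quorum` and `can_elect` are hypothetical helpers, not TiKV code):

```rust
// Majority quorum over a voter list.
fn quorum(voters: &[u64]) -> usize {
    voters.len() / 2 + 1
}

// Can the alive peers elect a leader under the given voter configuration?
fn can_elect(voters: &[u64], alive: &[u64]) -> bool {
    let alive_voters = voters.iter().filter(|&&v| alive.contains(&v)).count();
    alive_voters >= quorum(voters)
}

fn main() {
    let alive = [1, 2, 3]; // stores holding peers 4..=7 are down

    // Original configuration (1..=7): 3 alive < quorum 4, so no second leader.
    assert!(!can_elect(&[1, 2, 3, 4, 5, 6, 7], &alive));

    // Intermediate configuration after removing 6 and 7 one by one:
    // 3 alive >= quorum 3, so peers 1, 2, 3 can elect a new leader while the
    // force leader is still committing logs without majority replication.
    assert!(can_elect(&[1, 2, 3, 4, 5], &alive));

    // An atomic joint-consensus removal jumps straight from (1..=7) to
    // (1, 2, 3) and never exposes this unsafe intermediate configuration.
}
```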

@Connor1996
Member

3. there may then be two leaders. Although they have different terms, the force_leader can commit logs without replicating to a majority of peers, which may produce conflicting committed logs and cause data corruption.

Reasonable; joint consensus is safer. You can propose the leave-joint command just as this PR proposes other commands. @v01dstar

@Connor1996
Member

@NingLin-P BTW, can we use auto_leave == true for this scenario, so we don't need to propose leave joint explicitly?

@NingLin-P
Member

NingLin-P commented Mar 7, 2022

@NingLin-P BTW, can we use auto_leave == true for this scenario, so we don't need to propose leave joint explicitly?

lgtm. Previously we didn't use auto_leave because we still needed a retry mechanism for the LeaveJoint command anyway, so auto_leave didn't help much. But in the unsafe recovery use case, the LeaveJoint command won't fail, because there won't be any new leader until the majority of alive voters have left the joint state.

@v01dstar
Member Author

v01dstar commented Mar 8, 2022

auto_leave

I don't think this is going to work. After investigating the auto-leave API (ConfChangeTransition::Implicit, the external switch for the auto-leave functionality of raft joint consensus), I found that auto leave works by having Raft append an empty v2 conf change once the enter-joint entry is applied and auto leave is turned on. The entry appended by Raft itself does not have its context field filled in, so the request generated in the apply FSM lacks necessary metadata such as the epoch, and is therefore blocked here. I don't think we should manually fill in that metadata either, since it may bring more problems.

To conclude, the current design of Raftstore does not support auto leave (ConfChangeTransition::Implicit). To make it work, I guess we would need to change both the Raftstore and Raft code.

For now, I suggest using ConfChangeV2 + ConfChangeTransition::Explicit.
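As a rough sketch of this suggestion (using simplified stand-in types, not the actual raft-rs eraftpb protobufs), the plan would batch all removals into one ConfChangeV2 entry with an explicit transition, then propose an empty ConfChangeV2 later to leave the joint state:

```rust
// Stand-in types mirroring the shape of the raft-rs v2 conf-change protos.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConfChangeType {
    RemoveNode,
    // The actual recovery plan may demote voters to learners instead.
    AddLearnerNode,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ConfChangeTransition {
    Auto,
    Implicit, // auto leave: unsupported by Raftstore, per the comment above
    Explicit, // caller proposes the leave-joint entry itself
}

struct ConfChangeSingle {
    change_type: ConfChangeType,
    node_id: u64,
}

struct ConfChangeV2 {
    transition: ConfChangeTransition,
    changes: Vec<ConfChangeSingle>,
}

// Build one atomic command covering every peer on a failed store.
fn enter_joint(failed_peers: &[u64]) -> ConfChangeV2 {
    ConfChangeV2 {
        transition: ConfChangeTransition::Explicit,
        changes: failed_peers
            .iter()
            .map(|&id| ConfChangeSingle {
                change_type: ConfChangeType::RemoveNode,
                node_id: id,
            })
            .collect(),
    }
}

// In raft's v2 conf-change semantics, an empty ConfChangeV2 is LeaveJoint.
fn leave_joint() -> ConfChangeV2 {
    ConfChangeV2 {
        transition: ConfChangeTransition::Auto,
        changes: Vec::new(),
    }
}

fn main() {
    let cc = enter_joint(&[4, 5, 6, 7]);
    assert_eq!(cc.changes.len(), 4); // one atomic entry, not four
    assert!(leave_joint().changes.is_empty());
}
```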

@Connor1996
Member

Connor1996 commented Mar 9, 2022

auto_leave

For now, I suggest using ConfChangeV2 + ConfChangeTransition::Explicit.

Okay, let's use ConfChangeV2 + ConfChangeTransition::Explicit.

@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 16, 2022
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 18, 2022
CalvinNeo pushed a commit to pingcap/tidb-engine-ext that referenced this pull request May 18, 2022
* raftstore: Introduce force leader state (tikv#11932)

close tikv#6107, ref tikv#10483

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

* raftstore: Wait ticks for hibernated peer when doing force leader (tikv#12364)

ref tikv#10483

Force leader is rejected on a peer who is already a leader. For the hibernated leader,
it doesn't step down to follower when quorum is missing due to not ticking election. 
So wait ticks in force leader process for hibernated peers to make sure election ticking
is performed.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* raftstore: Make unsafe recovery wait apply cover snapshot apply cases ref tikv#10483 (tikv#12308)

ref tikv#10483

Signed-off-by: v01dstar <yang.zhang@pingcap.com>

* raftstore: Execute recovery plan via raft (tikv#12022)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Co-authored-by: Yang Zhang <yang.zhang@pingcap.com>
@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 18, 2022
v01dstar and others added 9 commits May 18, 2022 18:38
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
Signed-off-by: v01dstar <yang.zhang@pingcap.com>
@Connor1996
Member

/test

@Connor1996
Member

/merge

@ti-chi-bot
Member

@Connor1996: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Details

Commit hash: 91b17c0

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label May 19, 2022
@ti-chi-bot
Member

@v01dstar: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit 5dd92a5 into tikv:master May 19, 2022
@v01dstar v01dstar deleted the recover_via_raft branch May 24, 2022 04:21
ti-chi-bot added a commit that referenced this pull request May 27, 2022
…afe recovery state (#12657)

ref #12022, close #12644

Cleaning up the unsafe recovery state after exiting previous joint state before proposing the recovery demotion which may return early if any error happens and leave the state untouched.

Signed-off-by: v01dstar <yang.zhang@pingcap.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot pushed a commit that referenced this pull request May 27, 2022
…afe recovery state (#12657) (#12675)

ref #12022, close #12644, ref #12657

Cleaning up the unsafe recovery state after exiting previous joint state before proposing the recovery demotion which may return early if any error happens and leave the state untouched.

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

Co-authored-by: Yang Zhang <yang.zhang@pingcap.com>
joccau pushed a commit to joccau/tikv that referenced this pull request Jun 23, 2022
…afe recovery state (tikv#12657) (tikv#12675)

ref tikv#12022, close tikv#12644, ref tikv#12657

Cleaning up the unsafe recovery state after exiting previous joint state before proposing the recovery demotion which may return early if any error happens and leave the state untouched.

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

Co-authored-by: Yang Zhang <yang.zhang@pingcap.com>
Signed-off-by: joccau <zak.zhao@pingcap.com>
DiskFullOpt::AllowedOnAlmostFull,
);

if !*failed.lock().unwrap() {
Member

Should we abort this plan if the proposal fails? Because it's still in the joint state and the syncer has been dropped.

Member

@Connor1996 Connor1996 Jul 25, 2023

The plan should be aborted only when there is already a plan being executed. When something goes wrong, like the peer not being in force-leader state or a proposal failing, just print the error and let the syncer continue to trigger store reports. The PD side will then see that the state hasn't changed and retry in the next round.
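That policy can be summarized with a small decision function (a hypothetical sketch; the names are illustrative, not TiKV's actual code):

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    Execute,          // proceed with the recovery plan
    Abort,            // reject: a plan is already being executed
    LogAndRetryLater, // print the error; PD retries in its next round
}

// Decide how to handle an incoming unsafe recovery plan.
fn handle_plan(already_executing: bool, is_force_leader: bool, proposal_ok: bool) -> Outcome {
    if already_executing {
        // Abort only when another plan is in flight.
        return Outcome::Abort;
    }
    if !is_force_leader || !proposal_ok {
        // Transient failure: keep the syncer reporting so PD sees the
        // unchanged state and retries in the next round.
        return Outcome::LogAndRetryLater;
    }
    Outcome::Execute
}

fn main() {
    assert_eq!(handle_plan(true, true, true), Outcome::Abort);
    assert_eq!(handle_plan(false, false, true), Outcome::LogAndRetryLater);
    assert_eq!(handle_plan(false, true, false), Outcome::LogAndRetryLater);
    assert_eq!(handle_plan(false, true, true), Outcome::Execute);
}
```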

Labels

contribution This PR is from a community contributor. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants