Skip to content

raftstore: Wait ticks for hibernated peer when doing force leader#12364

Merged
ti-chi-bot merged 9 commits intotikv:masterfrom
Connor1996:force-leader-on-hibernate
Apr 20, 2022
Merged

raftstore: Wait ticks for hibernated peer when doing force leader#12364
ti-chi-bot merged 9 commits intotikv:masterfrom
Connor1996:force-leader-on-hibernate

Conversation

@Connor1996
Copy link
Member

@Connor1996 Connor1996 commented Apr 15, 2022

What is changed and how it works?

Issue Number: Ref #10483

What's Changed:

Force leader is rejected on a peer who is already a leader. For the hibernated leader,
it doesn't step down to follower when quorum is missing due to not ticking election. 
So wait ticks in force leader process for hibernated peers to make sure election ticking
is performed.

Check List

Tests

  • Unit test

Release note

None

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Apr 15, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • BusyJay
  • tabokie

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Details

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 15, 2022
@Connor1996 Connor1996 requested a review from BusyJay April 15, 2022 01:48
@Connor1996
Copy link
Member Author

PTAL @v01dstar

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996 Connor1996 force-pushed the force-leader-on-hibernate branch from 23a9a75 to 5ab3918 Compare April 15, 2022 02:28
);
if *ticks == 0 {
let s = mem::take(failed_stores);
self.on_enter_pre_force_leader(s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calling back on on_enter_pre_force_leader() maybe it looks cleaner if we separate the on_enter_pre_force_leader() into several functions like

SignificantMsg::EnterForceLeaderState => {
    self.on_enter_force_leader();
}

fn on_enter_force_leader() {
    if state.is_some() {
        ...
    }
    if is_leader || (promotable && has_leader){
        self.leader_absence_check()
    } else {
        self.voter_liveness_check()
    }
}

fn check_force_leader() {
    if leader_absence_check_passes {
        self.voter_liveness_check()
    }
    if voter_liveness_check_passes {
        ...
    }
}

Copy link
Member Author

@Connor1996 Connor1996 Apr 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After waiting ticks, we need to re-check all the conditions in on_enter_pre_force_leader. It may still be the leader due to it doesn't lose the quorum. Reusing is straightforward.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also something we can optimize here, instead of waiting for certain ticks, and then check the leader absence in the on_enter_pre_force_leader(), maybe we can proactively check whether the leader is absent or not in each tick at check_force_leader()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't. For leader, CheckQuorum needs to wait for a round to see if peers are recent active; for follower, only after an election timeout, it can make sure the leader lease is expired. Anyway, both of them need to wait for some ticks to get the correct information.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996 Connor1996 force-pushed the force-leader-on-hibernate branch from 67ec4bf to a41677f Compare April 15, 2022 06:20
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if we assert the target region is hibernated here. Something like assert_eq!(is_hibernated(region.get_id()), ture);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't, it's not exported

@Connor1996 Connor1996 requested a review from BusyJay April 19, 2022 06:26
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996 Connor1996 force-pushed the force-leader-on-hibernate branch from 9be6b92 to 94449b7 Compare April 19, 2022 06:27
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 19, 2022
@ti-chi-bot
Copy link
Member

@v01dstar: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 20, 2022
@Connor1996
Copy link
Member Author

/merge

@ti-chi-bot
Copy link
Member

@Connor1996: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

DetailsCommit hash: 94449b7

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Apr 20, 2022
@ti-chi-bot
Copy link
Member

@Connor1996: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit 8b509a7 into tikv:master Apr 20, 2022
@Connor1996 Connor1996 deleted the force-leader-on-hibernate branch April 20, 2022 11:14
Connor1996 added a commit to Connor1996/tikv that referenced this pull request May 18, 2022
…kv#12364)

ref tikv#10483

Force leader is rejected on a peer who is already a leader. For the hibernated leader,
it doesn't step down to follower when quorum is missing due to not ticking election. 
So wait ticks in force leader process for hibernated peers to make sure election ticking
is performed.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Connor1996 added a commit to Connor1996/tikv that referenced this pull request May 18, 2022
…kv#12364)

ref tikv#10483

Force leader is rejected on a peer who is already a leader. For the hibernated leader,
it doesn't step down to follower when quorum is missing due to not ticking election. 
So wait ticks in force leader process for hibernated peers to make sure election ticking
is performed.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
CalvinNeo pushed a commit to pingcap/tidb-engine-ext that referenced this pull request May 18, 2022
* raftstore: Introduce force leader state (tikv#11932)

close tikv#6107, ref tikv#10483

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

* raftstore: Wait ticks for hibernated peer when doing force leader (tikv#12364)

ref tikv#10483

Force leader is rejected on a peer who is already a leader. For the hibernated leader,
it doesn't step down to follower when quorum is missing due to not ticking election. 
So wait ticks in force leader process for hibernated peers to make sure election ticking
is performed.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>

* raftstore: Make unsafe recovery wait apply cover snapshot apply cases ref tikv#10483 (tikv#12308)

ref tikv#10483

Signed-off-by: v01dstar <yang.zhang@pingcap.com>

* raftstore: Execute recovery plan via raft (tikv#12022)

Signed-off-by: Connor1996 <zbk602423539@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Co-authored-by: Yang Zhang <yang.zhang@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants