Leader cannot recover corrupted follower after data cleanup #924

@mattisonchao

Motivation

When a follower's data is corrupted and then cleaned up, the leader cannot recover the follower: replication keeps failing with "invalid status". There are two causes:

  1. Follower remains in NOT_MEMBER status: InstallSnapshot() has no status check and does not update the follower's serving status. The follower stays NOT_MEMBER throughout, so subsequent AppendEntries calls are rejected with ErrInvalidStatus.

  2. Leader's follower cursor has a stale position: The leader's FollowerCursor retains the cursor position from before the follower restarted with clean data. shouldSendSnapshot() returns false because the cursor position indicates the follower already has enough data, so it skips the snapshot and goes straight to streamEntries, which fails repeatedly.
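The two causes above can be reproduced with a toy model. This is only a sketch: `ToyFollower`, `ToyCursor`, the offset fields, and the "behind the oldest retained entry" rule are hypothetical stand-ins, not the project's actual types or logic. Only the names `InstallSnapshot`, `AppendEntries`, `ErrInvalidStatus`, and `NOT_MEMBER` come from this report.

```go
package main

import "errors"

// Status is a toy stand-in for the follower's serving status.
type Status int

const (
	NotMember Status = iota // follower restarted with clean data
	Follower                // follower is serving and accepts entries
)

// ErrInvalidStatus reuses the error name from this issue; the value is illustrative.
var ErrInvalidStatus = errors.New("invalid status")

// ToyFollower is a hypothetical, heavily simplified follower.
type ToyFollower struct {
	Status       Status
	CommitOffset int64
}

// InstallSnapshot models cause 1: no status check and no status update.
func (f *ToyFollower) InstallSnapshot(snapshotOffset int64) {
	f.CommitOffset = snapshotOffset
	// Status is left untouched, so a NOT_MEMBER follower stays NOT_MEMBER.
}

// AppendEntries rejects every call unless the node is serving as a follower.
func (f *ToyFollower) AppendEntries(offset int64) error {
	if f.Status != Follower {
		return ErrInvalidStatus
	}
	f.CommitOffset = offset
	return nil
}

// ToyCursor is a hypothetical view of the leader-side cursor state.
type ToyCursor struct {
	AckOffset int64 // last offset the leader believes the follower holds
}

// ShouldSendSnapshot models cause 2 under an assumed rule: send a snapshot
// only when the cached position is behind the oldest entry still retained.
func (c *ToyCursor) ShouldSendSnapshot(oldestRetained int64) bool {
	return c.AckOffset < oldestRetained
}
```

In this model, a cursor still holding a pre-restart `AckOffset` makes `ShouldSendSnapshot` return false even though the follower is empty, and every `AppendEntries` call on the `NotMember` follower returns `ErrInvalidStatus`, whether or not a snapshot was installed first.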

Expected behavior

When a follower restarts with clean data (NOT_MEMBER), the leader should automatically detect this and send a full snapshot to recover it.
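One possible shape for that behavior is sketched below. It is not the proposed patch: `NextAction`, `CompleteSnapshot`, `CursorState`, and the idea that the leader learns the follower's status (for example from a rejected request) are all assumptions made for illustration, since the issue does not say how detection should work.

```go
package main

// FollowerStatus is a toy stand-in for the status a follower reports.
type FollowerStatus int

const (
	StatusNotMember FollowerStatus = iota // restarted with clean data
	StatusFollower                        // serving, accepts entries
)

// Action is what the leader does next for this follower.
type Action int

const (
	SendSnapshot Action = iota
	StreamEntries
)

// NoOffset marks a cursor that holds no trusted position (hypothetical sentinel).
const NoOffset int64 = -1

// CursorState is a hypothetical view of the leader-side cursor.
type CursorState struct {
	AckOffset int64
}

// NextAction decides how to bring the follower up to date.
// Assumption: the leader has some way to observe the follower's status.
func NextAction(c *CursorState, reported FollowerStatus, oldestRetained int64) Action {
	if reported == StatusNotMember {
		// The cached position predates the data cleanup, so discard it
		// and force a full snapshot.
		c.AckOffset = NoOffset
		return SendSnapshot
	}
	if c.AckOffset < oldestRetained {
		return SendSnapshot
	}
	return StreamEntries
}

// CompleteSnapshot models a successful install: the cursor moves to the
// snapshot offset and the follower starts serving.
func CompleteSnapshot(snapshotOffset int64, c *CursorState) FollowerStatus {
	c.AckOffset = snapshotOffset
	return StatusFollower
}
```

The design choice here is to treat a reported NOT_MEMBER status as stronger evidence than the cached cursor position, which addresses both causes at once: the stale position is dropped, and the follower leaves NOT_MEMBER once the snapshot is installed.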
