Assertion failure and inconsistent Raft state after replica transitions from anonymous #11938

@sergepetrenko

Bug description:

Raft has its own instance_id field, which is set during the initial box.cfg() call, once recovery or join finishes.

For an anonymous replica this field remains 0, because an anonymous replica's id is 0. However, the field isn't updated when the anonymous replica registers. This triggers an assertion failure in a debug build and leaves Raft in an inconsistent state (with the wrong instance id) in a release build.
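
To make the stale-id problem concrete, here is a minimal toy model in C. It is not Tarantool code: every `toy_*` name is hypothetical, `TOY_VCLOCK_MAX = 32` is an assumed stand-in for `VCLOCK_MAX`, and `toy_register_synced()` only illustrates the general idea of keeping the two ids in sync, not an actual or proposed patch. The range check is the same condition that appears in the failing assertion; how Raft's own id ends up in that check is not covered by this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for Tarantool's VCLOCK_MAX; hypothetical value. */
enum { TOY_VCLOCK_MAX = 32 };

/* Toy stand-in for the Raft state; not the real struct raft. */
struct toy_raft {
    uint32_t self; /* instance id as Raft sees it */
};

/* Toy stand-in for the instance. */
struct toy_instance {
    uint32_t id; /* replica id; 0 while the replica is anonymous */
    struct toy_raft raft;
};

/* Initial box.cfg(): Raft copies the instance id once. */
static void
toy_cfg(struct toy_instance *inst, uint32_t id)
{
    inst->id = id;
    inst->raft.self = id;
}

/* Registration as described above: the replica id changes, Raft's copy does not. */
static void
toy_register_buggy(struct toy_instance *inst, uint32_t new_id)
{
    inst->id = new_id;
}

/* Hypothetical variant that also refreshes Raft's copy of the id. */
static void
toy_register_synced(struct toy_instance *inst, uint32_t new_id)
{
    inst->id = new_id;
    inst->raft.self = new_id;
}

/* Same range condition as in the failing assertion. */
static bool
toy_id_is_valid(uint32_t id)
{
    return id > 0 && id < TOY_VCLOCK_MAX;
}

/* Walks through the scenario: anonymous start, then registration. */
static void
toy_demo(void)
{
    struct toy_instance inst;
    toy_cfg(&inst, 0);                        /* anonymous replica */
    toy_register_buggy(&inst, 2);             /* gets a real id */
    assert(inst.id == 2);
    assert(!toy_id_is_valid(inst.raft.self)); /* Raft still holds 0 */
    toy_register_synced(&inst, 2);
    assert(toy_id_is_valid(inst.raft.self));  /* ids agree again */
}
```

Calling `toy_demo()` runs the whole sequence. After the buggy registration the instance has id 2 while the Raft copy is still 0, which is the mismatch a release build would silently carry on with.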

Steps to reproduce:
Start two Tarantool instances in interactive mode and run the steps in order:

-- Instance 1.
-- Step 1.
box.cfg{
    listen = 3301,
    election_mode = 'candidate',
}
box.schema.user.grant('guest', 'replication')

-- Instance 2.
-- Step 2.
box.cfg{
    replication = 3301,
    replication_anon = true,
    read_only = true,
}

-- Step 3.
box.cfg{replication_anon = false}

In a debug build, instance 2 fails with an assertion:

./src/lib/raft/raft.c:476: raft_leader_see: Assertion `source > 0 && source < VCLOCK_MAX' failed.

In a release build, the instance id mismatch is silently ignored, so Raft keeps running with the wrong instance id, which can cause failures later on.

Metadata

Labels

2.11 (Target is 2.11 and all newer release/master branches), bug (Something isn't working), raft (RAFT protocol), replication
