Conversation
raft/progress.go
Outdated
| // received entry. | ||
| ins *inflights | ||
|
|
||
| isLearner bool |
There was a problem hiding this comment.
do we need to export this, I guess maybe user may need it later
/cc @xiang90
raft/raft.go
Outdated
| step stepFunc | ||
|
|
||
| // voterCount caches the total voter which can vote. | ||
| voterCount int |
There was a problem hiding this comment.
i still want to try to separate the learners out of the prs. so we can keep the old logic without adding this "cache". i feel the code can be cleaner overall too.
can you give it a try at least?
|
it looks good to me overall. just want to another approach to keep track of learners. |
|
I have separated the learner progress, but the code seems not be simpler. PTAL @xiang90 |
raft/raft.go
Outdated
| for id := range r.prs { | ||
| if id == r.id { | ||
| continue | ||
| f := func(prs map[uint64]*Progress) { |
There was a problem hiding this comment.
make this a method forEachProgress(prs [], prs[], f())? so we do not need to repeat this quite a few times.
|
thanks i will give this a closer look tomorrow. |
raft/progress.go
Outdated
| // received entry. | ||
| ins *inflights | ||
|
|
||
| // IsLearner is true if the the follower is a learner. |
There was a problem hiding this comment.
if this progress is tracked for a learner.
raft/raft.go
Outdated
| // used for testing right now. | ||
| peers []uint64 | ||
|
|
||
| // learners can not promote or vote. |
There was a problem hiding this comment.
// learners contains the IDs of all leaner nodes (including self if the local node is a leaner) in the raft cluster.
// learners only receives entries from the leader node. It does not vote or promote itself.
|
|
||
| state StateType | ||
|
|
||
| isLearner bool |
There was a problem hiding this comment.
// ieLearner is true if the local raft node is a learner.
raft/raft.go
Outdated
| } | ||
| for _, p := range learners { | ||
| if _, ok := r.prs[p]; ok { | ||
| panic(fmt.Sprintf("cannot specify both Voter and Learner for node: %x", p)) |
There was a problem hiding this comment.
node %x is in both learner and peer list.
| // r.bcastAppend). | ||
| func (r *raft) maybeCommit() bool { | ||
| // TODO(bmizerany): optimize.. Currently naive | ||
| mis := make(uint64Slice, 0, len(r.prs)) |
There was a problem hiding this comment.
is this change really needed in this PR?
| // All other message types require a progress for m.From (pr). | ||
| pr, prOk := r.prs[m.From] | ||
| if !prOk { | ||
| pr := r.getProgress(m.From) |
There was a problem hiding this comment.
getProgress can return *p,ok, just like what the previous map struct does. it minimizes code changes. but not a big problem anyway.
There was a problem hiding this comment.
If here returns *Progress, bool, I will check the ok boolean in the places which originally use getProgess().xxx to replace prs[id].xxx, and this may cause code complex.
Another way is to add a mustGetProgress function, but I think it is not necessary.
raft/raft.go
Outdated
| r.logger.Debugf("%x failed to send message to %x because it is unreachable [%s]", r.id, m.From, pr) | ||
| case pb.MsgTransferLeader: | ||
| if pr.IsLearner { | ||
| r.logger.Debugf("%x is Learner. Ignored transferring leadership", r.id) |
raft/raft.go
Outdated
| if pr != nil { | ||
| if isLearner && !pr.IsLearner { | ||
| // can only change Learner to Voter | ||
| r.logger.Infof("%x ignore addLearner for existing node %x [%s]", r.id, id, pr) |
There was a problem hiding this comment.
r.id ignored addLeaner: do not support changing from raft peer to learner.
raft/raft.go
Outdated
| pr.IsLearner = false | ||
| r.prs[id] = pr | ||
| } else { | ||
| r.setProgress(id, 0, r.raftLog.lastIndex()+1, isLearner) |
There was a problem hiding this comment.
we should always put the short case first in if...else... statement for readability. so move this case first, then change the pr exist case.
raft/raft.go
Outdated
| func (r *raft) setProgress(id, match, next uint64, isLearner bool) { | ||
| if isLearner { | ||
| if _, ok := r.prs[id]; ok { | ||
| panic(fmt.Sprintf("%x can't change voter to learner for %x", r.id, id)) |
There was a problem hiding this comment.
unexpected changing from x to x.
raft/raft.go
Outdated
| } | ||
|
|
||
| delete(r.learnerPrs, id) | ||
| r.prs[id] = &Progress{Next: next, Match: match, ins: newInflights(r.maxInflight)} |
There was a problem hiding this comment.
i would handle not learner case first, since it is easier. so it improves readability.
|
PTAL @xiang90 |
| } | ||
| } | ||
|
|
||
| func TestLearnerElectionTimeout(t *testing.T) { |
There was a problem hiding this comment.
we need to add a description for this test.
raft/raft_test.go
Outdated
|
|
||
| nt := newNetwork(n1, n2) | ||
|
|
||
| // Learner can't start election |
There was a problem hiding this comment.
n2 is learner. Learner should not start election even when times out.
raft/raft_test.go
Outdated
| for i := 0; i < n2.electionTimeout; i++ { | ||
| n2.tick() | ||
| } | ||
| if n1.state != StateFollower { |
There was a problem hiding this comment.
no need to check the state of n1 here.
| t.Errorf("peer 2 state: %s, want %s", n2.state, StateFollower) | ||
| } | ||
|
|
||
| // n1 should become leader |
There was a problem hiding this comment.
check n1 is not leader here.
|
|
||
| nt.send(pb.Message{From: 1, To: 1, Type: pb.MsgBeat}) | ||
|
|
||
| n1.addNode(2) |
There was a problem hiding this comment.
well... this tests that a learner can be promoted to a normal node. we should separate this into another test.
raft/raft_test.go
Outdated
| } | ||
|
|
||
| // n1 should become leader | ||
| nt.send(pb.Message{From: 1, To: 1, Type: pb.MsgHup}) |
There was a problem hiding this comment.
why we do use tick to trigger timeout above, but use msghub here?
| } | ||
| } | ||
|
|
||
| func TestRestoreWithLearner(t *testing.T) { |
raft/raft_test.go
Outdated
| t.Errorf("nodes = %v, want %v", nodes, wnodes) | ||
| } | ||
| if !r.learnerPrs[2].IsLearner { | ||
| t.Fatalf("node 2 has suffrage %t, want %t", r.prs[2].IsLearner, true) |
There was a problem hiding this comment.
use t.errorf. rename suffrage to learner.
raft/raft_test.go
Outdated
| } | ||
|
|
||
| if ok := sm.restore(s); ok { | ||
| t.Fatal("restore succeed, want fail") |
There was a problem hiding this comment.
use t.errorf here for checking purpose. also doc string about why do we expect this to fail.
|
Here are a minimal list of tests I want to see initially:
|
|
@siddontang lgtm. there are a couple of things left though, but we can do them in a follow up PR:
|
|
Please file some issues so I can't forget it. I will try to use this feature in TiKV at first. |
d20cbf6 to
35926cc
Compare
|
Done @xiang90 |
|
Yeah could you fix this line https://travis-ci.org/coreos/etcd/jobs/299973342#L841? |
35926cc to
509c9bc
Compare
|
Hi @gyuho I update the format but still find https://travis-ci.org/coreos/etcd/jobs/300033398 failed, but I don't know why. |
|
@siddontang Oh that was fixed just today. If you rebase from current master, it should pass. Thanks! |
509c9bc to
63e5b66
Compare
|
I rebase the master but the CI still failed. |
raft/raft_test.go
Outdated
| sm := newRaft(cfg) | ||
| npeers[id] = sm | ||
| case *raft: | ||
| learners := make(map[uint64]bool, 0) |
There was a problem hiding this comment.
learners := make(map[uint64]bool, len(v.learnerPrs))? Our static analysis tool complains about this :0
63e5b66 to
c6f2db2
Compare
|
Jenkins failed but the failure seems irrelevant to Raft learner. |
|
CI failures are not relevant. Merging. |
|
I reviewed some codes for this pr and found there may be an issue which can affect the Since we add new However, from the code above, we return all the nodes and learners written into the |
|
Can you please open an issue for this? Ideally with a test case to
illustrate this problem. Thanks.
…On Thu, Dec 28, 2017 at 4:43 AM Vincent Lee ***@***.***> wrote:
I reviewed some codes for this pr and found there may be an issue which
can affect the
ApplyConfChange in the raft node.
// ApplyConfChange applies config change to the local node.
// Returns an opaque ConfState protobuf which must be recorded
// in snapshots. Will never return nil; it returns a pointer only
// to match MemoryStorage.Compact.
ApplyConfChange(cc pb.ConfChange) *pb.ConfState
Since we add new Learners to pb.ConfState, we should return the
pb.ConfState with Learners in the raft.
case cc := <-n.confc:
if cc.NodeID == None {
r.resetPendingConf()
select {
case n.confstatec <- pb.ConfState{Nodes: r.nodes()}:
case <-n.done:
}
break
}
switch cc.Type {
case pb.ConfChangeAddNode:
r.addNode(cc.NodeID)
case pb.ConfChangeAddLearnerNode:
r.addLearner(cc.NodeID)
case pb.ConfChangeRemoveNode:
// block incoming proposal when local node is
// removed
if cc.NodeID == r.id {
propc = nil
}
r.removeNode(cc.NodeID)
case pb.ConfChangeUpdateNode:
r.resetPendingConf()
default:
panic("unexpected conf type")
}
select {
case n.confstatec <- pb.ConfState{Nodes: r.nodes()}:
case <-n.done:
}
However, from the code above, we return all the nodes and learners written
into the Nodes and leave the Learners empty. Is that by designed or a
potential bug?
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
<#8751 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AERbywOgweP8c2eRFR5qyBDmmguHTr0Cks5tE2K7gaJpZM4QDxdR>
.
|
At the time of writing, we don't allow configuration changes to change voters to learners directly, but note that a snapshot may compress multiple changes to the configuration into one: the voter could have been removed, then readded as a learner and the snapshot reflects both changes. In that case, a voter receives a snapshot telling it that it is now a learner. In fact, the node has to accept that snapshot, or it is permanently cut off from the Raft log. I think this just wasn't realized in the original work, but this is just my guess since there generally is very little rationale on the various decisions made. I also generally haven't been able to figure out whether the decision to prevent voters from becoming learners without first having been removed was motivated by some particular concern, or if it just wasn't deemed necessary. I suspect it is the latter because demoting a voter seems perfectly safe. See etcd-io#8751 (comment).
At the time of writing, we don't allow configuration changes to change voters to learners directly, but note that a snapshot may compress multiple changes to the configuration into one: the voter could have been removed, then readded as a learner and the snapshot reflects both changes. In that case, a voter receives a snapshot telling it that it is now a learner. In fact, the node has to accept that snapshot, or it is permanently cut off from the Raft log. I think this just wasn't realized in the original work, but this is just my guess since there generally is very little rationale on the various decisions made. I also generally haven't been able to figure out whether the decision to prevent voters from becoming learners without first having been removed was motivated by some particular concern, or if it just wasn't deemed necessary. I suspect it is the latter because demoting a voter seems perfectly safe. See etcd-io#8751 (comment).
At the time of writing, we don't allow configuration changes to change voters to learners directly, but note that a snapshot may compress multiple changes to the configuration into one: the voter could have been removed, then readded as a learner and the snapshot reflects both changes. In that case, a voter receives a snapshot telling it that it is now a learner. In fact, the node has to accept that snapshot, or it is permanently cut off from the Raft log. I think this just wasn't realized in the original work, but this is just my guess since there generally is very little rationale on the various decisions made. I also generally haven't been able to figure out whether the decision to prevent voters from becoming learners without first having been removed was motivated by some particular concern, or if it just wasn't deemed necessary. I suspect it is the latter because demoting a voter seems perfectly safe. See etcd-io#8751 (comment).
This is a derived PR from #8605 which I think we can close now.
I just cache the voter count in this PR.
Using two processes may be easy, but I tried and found that many codes need to be changed, seem no big benefit now.
PTAL @xiang90