Conversation
|
More q's based on the discussion:
|
multiraft/multiraft.go
Outdated
There was a problem hiding this comment.
@bdarnell looked for this in the raft paper and in the protobufs but no dice. Is there an explanation of what this does anywhere?
There was a problem hiding this comment.
I'm not sure. This may be for updating the cc.Context field (which we don't use). I don't think we need to support this.
|
We need similar coalescing/fan-out for MsgHeartbeatResp as well. This will be similar on the receiving sides (except that we fan out heartbeat responses to the groups that the receiving node is leader of instead of those that the sending node leads). For sending, we should just keep track of which nodes we have sent MsgHeartbeatResp and not send to the same node twice (per call to handleRaftReady). |
|
In the works. For testing, I would replace the local transport's client by one that supports intercepting (and holding back) outgoing messages, and probably add a demux for consuming it during tests (akin to the eventsDemux). With that we should be able to get pretty good testability. |
|
Yeah, we should test this with a custom transport (see also blockingStorage in storage_test.go). I'm not sure whether the eventsDemux pattern makes sense here but if it turns out to be useful in your tests go for it. |
a8b39fd to
2f262eb
Compare
|
Ok. I'll continue development in this PR, as I wouldn't want to merge anything before I can test it anyways. |
93c0ec4 to
645fbe9
Compare
|
@bdarnell Can you take another look? |
There was a problem hiding this comment.
I looked into combining this with fanoutHeartbeatRequest, but in the end I liked it better as is.
645fbe9 to
8aa4329
Compare
multiraft/transport.go
Outdated
There was a problem hiding this comment.
I don't think we should build this into localRPCTransport. There should be a separate interceptingTransport type that either wraps an existing Transport or simply is a transport itself (I think the latter is probably better - you don't need the RPC machinery, just have Go call the ServerInterface method directly). Then this can all go in transport_test.go unless and until we have a use for it outside this package.
There was a problem hiding this comment.
Ok, will go for the thin version without RPC then. Can you explain what bothers you about the current modifications? I thought about creating an extra client but then it just seemed silly to write the same thing twice with a feature that the other one might also want to use at times.
There was a problem hiding this comment.
My concern is mainly with the dramatically different implementation (including a change to blocking behavior) depending on whether EnableEvents has been called. Since EnableEvents can only be called immediately after creation it's effectively a different type anyway.
|
@bdarnell did you see this above: |
|
I had missed that note the first time it appeared. It's good to be concerned about this; I think there are some scenarios in which these limited-information heartbeats could cause problems. For example, consider a group containing nodes A, B, and C:
We might be able to solve this on a case-by-case basis (e.g. I think two-round elections would help in the above scenario), but to be safe we should probably add a payload to the coalesced heartbeat messages to indicate which groups the sending node considers itself to be leader of. This thread on raft-dev discusses some related issues: https://groups.google.com/forum/?fromgroups#!topic/raft-dev/VgF47vIsezg. The term and log index is occasionally useful (but not, I think, strictly required. Etcd is not currently sending this information with their heartbeats). |
7b3de30 to
bc34681
Compare
|
@bdarnell new transport, refactored test inside. not complete yet but that's what i would build on for the rest of the tests. edit: will also take a look at your last comment above while doing that. probably some progress this WE. |
There was a problem hiding this comment.
If heartbeatCountMap contained pointers instead of struct values then you could mutate them without going through this read-then-write pattern.
There was a problem hiding this comment.
It's convenient though because I can avoid the usual v, ok := ...; if !ok { m[key] = ... } code.
|
LGTM |
Extracted from PR cockroachdb#280; I also need this for another change.
|
cc @cockroachdb/developers |
|
The new test LGTM. It's complicated, but I don't have any ideas for simplifying it. |
* outgoing heartbeats originating from raft are ignored * instead, each node will periodically send coalesced heartbeats to each of the other nodes. * upon receipt of a heartbeat (which is coalesced, according to the above), individual heartbeats are fed into raft for each group that the sending node belongs to and follows a leader from the origin node. * this necessitated adding a list of groupIDs to the internal `node` type. * heartbeat responses: Raft's (potentially many) generated responses are bundled up and only a single response sent. * fan out heartbeat responses to all groups considering themselves leader of a group of the sending node
* elect a leader and have it tick a number of times * count and compare the heartbeats sent and responses received TODO: add test that has followers tick as well; a closer interplay between ticks and acknowledged messages would be needed to prevent followers from turning into candidates.
currently this still tests only one group, but it sets a good baseline for basic sanity. Even writing this test took a lot of care; next step will be unittests and some multi-group tests with fairly explicit (i.e. unparametrized) setups.
|
@bdarnell I've added an extra ping and rebased. Do we want to merge it and tackle the safety considerations above separately? I have a feeling that sending around the groups could be tricky as well - thinking of millions of ranges... |
d628f27 to
c8dec08
Compare
|
Yes, let's go ahead and merge. |
currently work in progress, see #229 for early discussions (further discussions should be held here).
closes #229
cc @cockroachdb/developers