WIP: blockchain/reactor: respondWithNoResponseMessage for missing height#648
WIP: blockchain/reactor: respondWithNoResponseMessage for missing height#648odeke-em wants to merge 1 commit intotendermint:developfrom orijtech:react-to-no-block-found
Conversation
|
@ebuchman PTAL at this work in progress, it's super tricky trying to test this out because setting up the reactor itself is quite involved, the only part that's left is to figure out how to add the blocks in despite the requester being unset at the *BlockchainReactor.pool(*blockPool).requests. Am off to bed for now but I'd appreciate some feedback and will take a look at it. Thank you! |
ebuchman
left a comment
There was a problem hiding this comment.
The code looks fine (other than 2 comments). The testing is obviously quite difficult given the current structure of reactors. It can be simplified a bunch as is, but to do it right we probably need a peer interface.
blockchain/reactor.go
Outdated
| // Otherwise peer is asking for things we don't have. | ||
| // According to the Tendermint spec, if the peer asks for | ||
| // something that we don't have have, some of the nodes aren't honest | ||
| bcR.Logger.Debug("Respond", "src", src, "flagging", "potentially malicious peer", |
There was a problem hiding this comment.
Let's simplify this message. Should be Info level and msg can just be Peer asking for block we don't have. Tags can just be src and height
There was a problem hiding this comment.
Roger that, thanks. I'll update that.
| return cmn.Fmt("[bcBlockRequestMessage %v]", m.Height) | ||
| } | ||
|
|
||
| type bcNoBlockResponseMessage struct { |
There was a problem hiding this comment.
Needs to be registered with go-wire. See call to RegisterInterface
There was a problem hiding this comment.
Roger that, updating, thanks.
blockchain/reactor_test.go
Outdated
| "github.com/tendermint/tmlibs/log" | ||
| ) | ||
|
|
||
| type bcBlockRequestMessage struct { |
blockchain/reactor_test.go
Outdated
| fmt.Printf("% x\n", buf.Bytes()) | ||
| } | ||
|
|
||
| func newBlockchainReactor(logger log.Logger, blockNumbers ...int) (*blockchain.BlockchainReactor, *sm.State) { |
There was a problem hiding this comment.
We shouldn't need such complex setup. See how we setup consensus tests for reference.
We make whole networks in process here: https://github.com/tendermint/tendermint/blob/master/consensus/common_test.go#L357
And just the state machine here: https://github.com/tendermint/tendermint/blob/master/consensus/common_test.go#L232
We can hard code an inproc dummy app. No need for handshaker. No need for tx indexing. No need to log if we're a validator. Fast sync should be on. etc
There was a problem hiding this comment.
Awesome, thank you @ebuchman for the reference. Yeah, it's mostly out of lack of familiarity, and it yeah I was also telling myself that y'all def wrote simpler test setups, but TIL!
blockchain/reactor_test.go
Outdated
| } | ||
|
|
||
| tests := [...]struct { | ||
| heightBytes []byte |
There was a problem hiding this comment.
why is this a byte array ?
There was a problem hiding this comment.
It is a byte array, preserialized by go-wire so that we can pass them directly in *BlockchainReactor.Receive(byte, *p2p.Peer, []byte)
blockchain/reactor_test.go
Outdated
| for i, tt := range tests { | ||
| msgBytes := append([]byte{0x10, 0x02}, tt.heightBytes...) | ||
| bcr.Receive(chID, peer, msgBytes) | ||
| gotErr := bytes.Contains(logBuf.Bytes(), nonExistentSubstr) |
There was a problem hiding this comment.
Oh boy. This is unfortunate. This is why I opened #631.
| } | ||
| } else { | ||
| // TODO peer is asking for things we don't have. | ||
| if queued := bcR.respondToPeer(msg, src); !queued { |
There was a problem hiding this comment.
why move this out into its own function ?
There was a problem hiding this comment.
It seemed quite noisy in there, plus the extra logic for responding to non-existent blocks would have crowded it more.
|
This should be easier to fix up the tests for now that #653 (peer interfaces) is merged. |
blockchain/reactor.go
Outdated
| // According to the Tendermint spec, if all nodes are honest, no node should | ||
| // be requesting for a block that's non-existent. | ||
| func (bcR *BlockchainReactor) respondToPeer(msg *bcBlockRequestMessage, src *p2p.Peer) (queued bool) { | ||
| // Got a request for a block. Respond with block if we have it. |
There was a problem hiding this comment.
I think this comment is unnecessary. the code looks simple enough
blockchain/reactor.go
Outdated
| return src.TrySend(BlockchainChannel, struct{ BlockchainMessage }{msg}) | ||
| } | ||
|
|
||
| // Otherwise peer is asking for things we don't have. |
There was a problem hiding this comment.
Again, no need to duplicate the log message
Otherwise peer is asking for things we don't have.
Peer asking for a block we don't have
According to the Tendermint spec,
do we need to focus on this? maybe we can skip this part? i.e. just if the peer asks ...
There was a problem hiding this comment.
I've removed it entirely, thanks.
| } else { | ||
| // TODO peer is asking for things we don't have. | ||
| if queued := bcR.respondToPeer(msg, src); !queued { | ||
| // Unfortunately not queued since the queue is full. |
There was a problem hiding this comment.
I like this emotional touch :) haha
Codecov Report
@@ Coverage Diff @@
## develop #648 +/- ##
===========================================
+ Coverage 55.66% 57.15% +1.49%
===========================================
Files 80 79 -1
Lines 8356 8306 -50
===========================================
+ Hits 4651 4747 +96
+ Misses 3317 3156 -161
- Partials 388 403 +15 |
|
push to next non-breaking release |
| type bcrTestPeer struct { | ||
| key string | ||
| ch chan *keyValue | ||
| kvm map[byte]interface{} |
There was a problem hiding this comment.
not seeing where it is being used...
There was a problem hiding this comment.
Right, thanks for this catch. It is a vestige from the previous round of review.
| // Currently the repsonses take asynchronous paths so | ||
| // the results are skewed. However, we can ensure that | ||
| // the (expectedHeight, expectedExistence) all tally up. | ||
| heightTally := map[int]bool{} |
There was a problem hiding this comment.
could you help me understand why we need this tally stuff and the code below?
can't we just test that if we call Receive(bcBlockRequestMessage{-100}) we will get bsNoBlockResponseMessage?
(p testPeer) TrySend() {
map[channel] = message
}
bcr.Receive(bcBlockRequestMessage{-100})
assert.Equal(p.map[blockchainChannel], bcNoBlockResponse)
There was a problem hiding this comment.
For sure, so in the tests we use different heights https://github.com/tendermint/tendermint/pull/648/files#diff-4b45e760192d8a11e9b4e278a8c8ea32R72.
Responses are sent with different heights echoed back, the channels might be the same for the same type of response, but the heights vary thus for different heights we strictly expect a response back. Does that make sense?
|
Ok - I have dramatically simplified the test code. @odeke-em please tell me if I'm failing to test something there that you were testing here. #714 The test in this PR was hard to read. It may have been correct, but too difficult to follow. We need to try to pare down our tests to be as simple as possible so it's easy to understand what's going on. If that means we have to refactor an entire package (as we did for this), so be it. And even then, we need to continue to ensure the tests are simple :) |
|
Thanks for the simplification @ebuchman. I agree that we should aim for tests that are simple and easy to follow. |
|
Hmm, so since your comment I force pushed a change to check the height.
This seems to contradict what's now implemented, where we explicitly check the height in the response message, and it matches the tt.height. Can you check again please? Am I still missing something?
I don't understand this part. How so ? We are calling Receive synchronously in a loop, which calls TrySend, which just loads the msg onto a buffered channel, and then we read it off immediately after. There are no go routines. How could the messages come out of order?
Valid point. We can add a final select statement to pull on the peers internal channel and ensure theres no rogue messages. |
Interesting, I think I caused the confusion myself, my apologies. So when adding blocks, I'd get blockStatusResponse messages which skewed the results. However, I compensated for that with https://github.com/tendermint/tendermint/pull/648/files#diff-4b45e760192d8a11e9b4e278a8c8ea32R198 and never simplified the tally code, of which this you did for me, thank you! In short, thank you for the PR, I hadn't thought through it to propagate the simplifications I previously encountered; my last comments were outdated and the PR update that you made fixed it. LGTM! |
Fixes #514
Replaces #540
If a peer requests a block with a height that we don't have
respond with a bcNoBlockResponseMessage.
However, according to the Tendermint spec, if all nodes are honest
this condition shouldn't occur, so this is a possible hint of an
dishonest node.