WIP: blockchain/reactor: respondWithNoResponseMessage for missing height by odeke-em · Pull Request #648 · tendermint/tendermint

odeke-em · 2017-09-07T10:17:01Z

Fixes #514
Replaces #540

If a peer requests a block with a height that we don't have
respond with a bcNoBlockResponseMessage.
However, according to the Tendermint spec, if all nodes are honest
this condition shouldn't occur, so this is a possible hint of an
dishonest node.

odeke-em · 2017-09-07T10:19:58Z

@ebuchman PTAL at this work in progress, it's super tricky trying to test this out because setting up the reactor itself is quite involved, the only part that's left is to figure out how to add the blocks in despite the requester being unset at the *BlockchainReactor.pool(*blockPool).requests.

Am off to bed for now but I'd appreciate some feedback and will take a look at it. Thank you!

ebuchman

The code looks fine (other than 2 comments). The testing is obviously quite difficult given the current structure of reactors. It can be simplified a bunch as is, but to do it right we probably need a peer interface.

ebuchman · 2017-09-11T18:24:47Z

blockchain/reactor.go

+	// Otherwise peer is asking for things we don't have.
+	// According to the Tendermint spec, if the peer asks for
+	// something that we don't have have, some of the nodes aren't honest
+	bcR.Logger.Debug("Respond", "src", src, "flagging", "potentially malicious peer",


Let's simplify this message. Should be Info level and msg can just be Peer asking for block we don't have. Tags can just be src and height

Roger that, thanks. I'll update that.

ebuchman · 2017-09-11T18:26:13Z

blockchain/reactor.go

 	return cmn.Fmt("[bcBlockRequestMessage %v]", m.Height)
 }

+type bcNoBlockResponseMessage struct {


Needs to be registered with go-wire. See call to RegisterInterface

Roger that, updating, thanks.

Added, thank you!

ebuchman · 2017-09-11T18:29:43Z

blockchain/reactor_test.go

+	"github.com/tendermint/tmlibs/log"
+)
+
+type bcBlockRequestMessage struct {


Why do we need this ?

ebuchman · 2017-09-11T18:33:34Z

blockchain/reactor_test.go

+	fmt.Printf("% x\n", buf.Bytes())
+}
+
+func newBlockchainReactor(logger log.Logger, blockNumbers ...int) (*blockchain.BlockchainReactor, *sm.State) {


We shouldn't need such complex setup. See how we setup consensus tests for reference.

We make whole networks in process here: https://github.com/tendermint/tendermint/blob/master/consensus/common_test.go#L357

And just the state machine here: https://github.com/tendermint/tendermint/blob/master/consensus/common_test.go#L232

We can hard code an inproc dummy app. No need for handshaker. No need for tx indexing. No need to log if we're a validator. Fast sync should be on. etc

Awesome, thank you @ebuchman for the reference. Yeah, it's mostly out of lack of familiarity, and it yeah I was also telling myself that y'all def wrote simpler test setups, but TIL!

ebuchman · 2017-09-11T18:35:51Z

blockchain/reactor_test.go

+	}
+
+	tests := [...]struct {
+		heightBytes []byte


why is this a byte array ?

It is a byte array, preserialized by go-wire so that we can pass them directly in *BlockchainReactor.Receive(byte, *p2p.Peer, []byte)

ebuchman · 2017-09-11T18:36:58Z

blockchain/reactor_test.go

+	for i, tt := range tests {
+		msgBytes := append([]byte{0x10, 0x02}, tt.heightBytes...)
+		bcr.Receive(chID, peer, msgBytes)
+		gotErr := bytes.Contains(logBuf.Bytes(), nonExistentSubstr)


Oh boy. This is unfortunate. This is why I opened #631.

ebuchman · 2017-09-11T18:38:24Z

blockchain/reactor.go

-			}
-		} else {
-			// TODO peer is asking for things we don't have.
+		if queued := bcR.respondToPeer(msg, src); !queued {


why move this out into its own function ?

It seemed quite noisy in there, plus the extra logic for responding to non-existent blocks would have crowded it more.

ebuchman · 2017-09-16T04:32:11Z

This should be easier to fix up the tests for now that #653 (peer interfaces) is merged.

melekes · 2017-09-18T20:23:29Z

blockchain/reactor.go

+// According to the Tendermint spec, if all nodes are honest, no node should
+// be requesting for a block that's non-existent.
+func (bcR *BlockchainReactor) respondToPeer(msg *bcBlockRequestMessage, src *p2p.Peer) (queued bool) {
+	// Got a request for a block. Respond with block if we have it.


I think this comment is unnecessary. the code looks simple enough

melekes · 2017-09-18T20:26:09Z

blockchain/reactor.go

+		return src.TrySend(BlockchainChannel, struct{ BlockchainMessage }{msg})
+	}
+
+	// Otherwise peer is asking for things we don't have.


Again, no need to duplicate the log message

Otherwise peer is asking for things we don't have. Peer asking for a block we don't have

According to the Tendermint spec,

do we need to focus on this? maybe we can skip this part? i.e. just if the peer asks ...

I've removed it entirely, thanks.

melekes · 2017-09-18T20:27:19Z

blockchain/reactor.go

-		} else {
-			// TODO peer is asking for things we don't have.
+		if queued := bcR.respondToPeer(msg, src); !queued {
+			// Unfortunately not queued since the queue is full.


I like this emotional touch :) haha

Fixes #514 Replaces #540 If a peer requests a block with a height that we don't have respond with a bcNoBlockResponseMessage. However, according to the Tendermint spec, if all nodes are honest this condition shouldn't occur, so this is a possible hint of an dishonest node.

odeke-em · 2017-09-19T05:17:13Z

Alright, I've updated the code, thanks for the initial reviews @ebuchman and @melekes, PTAL.

codecov-io · 2017-09-19T05:35:27Z

Codecov Report

Merging #648 into develop will increase coverage by 1.49%.
The diff coverage is 75%.

@@             Coverage Diff             @@
##           develop     #648      +/-   ##
===========================================
+ Coverage    55.66%   57.15%   +1.49%     
===========================================
  Files           80       79       -1     
  Lines         8356     8306      -50     
===========================================
+ Hits          4651     4747      +96     
+ Misses        3317     3156     -161     
- Partials       388      403      +15

ebuchman · 2017-09-21T18:26:52Z

push to next non-breaking release

melekes · 2017-09-21T18:42:59Z

blockchain/reactor_test.go

+type bcrTestPeer struct {
+	key string
+	ch  chan *keyValue
+	kvm map[byte]interface{}


not seeing where it is being used...

Right, thanks for this catch. It is a vestige from the previous round of review.

melekes · 2017-09-21T18:51:41Z

blockchain/reactor_test.go

+	// Currently the repsonses take asynchronous paths so
+	// the results are skewed. However, we can ensure that
+	// the (expectedHeight, expectedExistence) all tally up.
+	heightTally := map[int]bool{}


could you help me understand why we need this tally stuff and the code below?

can't we just test that if we call Receive(bcBlockRequestMessage{-100}) we will get bsNoBlockResponseMessage?

(p testPeer) TrySend() { map[channel] = message } bcr.Receive(bcBlockRequestMessage{-100}) assert.Equal(p.map[blockchainChannel], bcNoBlockResponse)

For sure, so in the tests we use different heights https://github.com/tendermint/tendermint/pull/648/files#diff-4b45e760192d8a11e9b4e278a8c8ea32R72.
Responses are sent with different heights echoed back, the channels might be the same for the same type of response, but the heights vary thus for different heights we strictly expect a response back. Does that make sense?

ebuchman · 2017-10-04T22:06:36Z

Ok - I have dramatically simplified the test code. @odeke-em please tell me if I'm failing to test something there that you were testing here. #714

The test in this PR was hard to read. It may have been correct, but too difficult to follow. We need to try to pare down our tests to be as simple as possible so it's easy to understand what's going on. If that means we have to refactor an entire package (as we did for this), so be it. And even then, we need to continue to ensure the tests are simple :)

odeke-em · 2017-10-04T22:44:43Z

Thanks for the simplification @ebuchman. I agree that we should aim for tests that are simple and easy to follow.
The problem though is that if we want to be thorough enough, IMHO we have to bite the bullet.
The latest simplification just checks that a block was received back, but it doesn't check that the height that was sent back was the one for the test. However, the responses are sent out of order which is what I commented here https://github.com/tendermint/tendermint/pull/648/files#diff-4b45e760192d8a11e9b4e278a8c8ea32R94
and that's why there was the complexity of tallying up heights with the respective responses and at the end ensuring that no other response came out https://github.com/tendermint/tendermint/pull/648/files#diff-4b45e760192d8a11e9b4e278a8c8ea32R143. The current test runs just len(test) times but doesn't check that we have no rogue responses out and also the heights most definitely don't match with those at i, tt, you can confirm this by printing out the responding height along with the wanted height.

ebuchman · 2017-10-05T04:58:36Z

Hmm, so since your comment I force pushed a change to check the height.

you can confirm this by printing out the responding height along with the wanted height.

This seems to contradict what's now implemented, where we explicitly check the height in the response message, and it matches the tt.height. Can you check again please? Am I still missing something?

the responses are sent out of order which is what I commented here

I don't understand this part. How so ? We are calling Receive synchronously in a loop, which calls TrySend, which just loads the msg onto a buffered channel, and then we read it off immediately after. There are no go routines. How could the messages come out of order?

The current test runs just len(test) times but doesn't check that we have no rogue responses

Valid point. We can add a final select statement to pull on the peers internal channel and ensure theres no rogue messages.

odeke-em · 2017-10-05T20:31:00Z

@ebuchman

  This seems to contradict what's now implemented, where we explicitly check the height in the 
  response message, and it matches the tt.height. Can you check again please? Am I still missing 
  something?

I don't understand this part. How so ? We are calling Receive synchronously in a loop, which calls TrySend, which just loads the msg onto a buffered channel, and then we read it off immediately after. There are no go routines. How could the messages come out of order?

Interesting, I think I caused the confusion myself, my apologies. So when adding blocks, I'd get blockStatusResponse messages which skewed the results. However, I compensated for that with https://github.com/tendermint/tendermint/pull/648/files#diff-4b45e760192d8a11e9b4e278a8c8ea32R198 and never simplified the tally code, of which this you did for me, thank you!

In short, thank you for the PR, I hadn't thought through it to propagate the simplifications I previously encountered; my last comments were outdated and the PR update that you made fixed it. LGTM!

odeke-em requested a review from ebuchman September 7, 2017 10:17

odeke-em changed the base branch from master to develop September 7, 2017 10:17

ebuchman reviewed Sep 11, 2017

View reviewed changes

ebuchman mentioned this pull request Sep 12, 2017

[blockchain] respond with nil on block request to indicate no block (… #540

Closed

melekes reviewed Sep 18, 2017

View reviewed changes

odeke-em mentioned this pull request Sep 19, 2017

blockchain: BlockchainReactor.AddPeer doesn't add peer to its pool #666

Closed

odeke-em mentioned this pull request Sep 19, 2017

doc.go, provide examples and high level description of go-wire and how to use it tendermint/go-amino#32

Closed

ebuchman modified the milestone: 0.11.0 Sep 21, 2017

melekes reviewed Sep 21, 2017

View reviewed changes

ebuchman mentioned this pull request Oct 4, 2017

Feature/no block response #714

Merged

ebuchman closed this Oct 4, 2017

odeke-em deleted the react-to-no-block-found branch May 11, 2021 13:11

Conversation

odeke-em commented Sep 7, 2017

Uh oh!

odeke-em commented Sep 7, 2017

Uh oh!

ebuchman left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ebuchman commented Sep 16, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

melekes Sep 18, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

odeke-em commented Sep 19, 2017

Uh oh!

codecov-io commented Sep 19, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ebuchman commented Sep 21, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

melekes Sep 21, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ebuchman commented Oct 4, 2017

Uh oh!

odeke-em commented Oct 4, 2017

Uh oh!

ebuchman commented Oct 5, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

odeke-em commented Oct 5, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

melekes Sep 18, 2017 •

edited

Loading

codecov-io commented Sep 19, 2017 •

edited

Loading

melekes Sep 21, 2017 •

edited

Loading

ebuchman commented Oct 5, 2017 •

edited

Loading