Conversation
Ok, so the reactor refactor itself is pretty much done. I don't know how long the tests will take me to refactor -- they seem very... messy. But I'm hopeful I can get most of them done this week. In the interest of time, feel free to just review the core changes (i.e.
Great stuff, will have a look later today!
Thanks @erikgrinaker. I think this is pretty standard stuff review-wise compared to the other refactors (i.e. it's pretty much the same). What I would probably spend the most effort reviewing is the usage of
erikgrinaker
left a comment
Great that the consensus reactor is getting some ❤️, it's quite a bit easier to navigate now that it's split up like this.
Codecov Report
@@ Coverage Diff @@
## master #5969 +/- ##
==========================================
+ Coverage 60.72% 60.78% +0.05%
==========================================
Files 275 276 +1
Lines 25345 25521 +176
==========================================
+ Hits 15392 15513 +121
- Misses 8348 8401 +53
- Partials 1605 1607 +2
cmwaters
left a comment
utACK. Really good work! 🚀 (my comments are all questions)
In terms of the design, I couldn't think of any scenarios where we might leak goroutines. The ordering of the execution of processes seems sound. I also like the extraction of peer state into its own file. I guess the only qualm I have is that we have two stats that we use to keep track of peer behavior (votes and blocks), but the peer manager is never informed when anything bad happens, e.g. an invalid proposal or invalid vote.
// NOTE: We need a dedicated stateCloseCh channel for signaling closure of
// the StateChannel due to the fact that the StateChannel message handler
// performs a send on the VoteSetBitsChannel. This is an antipattern, so having
// this dedicated channel, stateCloseCh, is necessary in order to avoid data races.
What's the reason behind the state channel sending stuff also over the vote set bits channel? Is it a timing thing? Like only on certain conditions do we send vote set bits?
Also just so I understand correctly, we need the separate channel because we need to make sure we close the state channel before the rest of the channels, right?
What's the reason behind the state channel sending stuff also over the vote set bits channel? Is it a timing thing? Like only on certain conditions do we send vote set bits?
Good question. To me, it seems like a code smell, i.e. you should only send and receive messages on dedicated channels and not mix-n-match. As to the why, unfortunately I don't know. This is an artifact of how this reactor was designed/implemented. Perhaps @ebuchman can elaborate on why we do this?
Also just so I understand correctly, we need the separate channel because we need to make sure we close the state channel before the rest of the channels, right?
Yes, we need it to coordinate closure of specific goroutines and ensure there are no data races.
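To illustrate the ordered-shutdown idea being discussed, here is a minimal, self-contained sketch of a dedicated close channel. The names (stateCloseCh, closeCh) mirror the reactor's comment above, but the goroutine bodies and the shutdownOrder helper are purely illustrative, not the reactor's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// shutdownOrder sketches the dedicated stateCloseCh pattern: the
// StateChannel processor is stopped and waited on before the remaining
// channel processors, so its sends on VoteSetBitsChannel cannot race
// with shutdown.
func shutdownOrder() []string {
	stateCloseCh := make(chan struct{}) // dedicated: stops the state processor only
	closeCh := make(chan struct{})      // shared: stops everything else

	var stateWG, restWG sync.WaitGroup
	var mu sync.Mutex
	var order []string

	record := func(s string) {
		mu.Lock()
		order = append(order, s)
		mu.Unlock()
	}

	stateWG.Add(1)
	go func() { // stand-in for the StateChannel processor
		defer stateWG.Done()
		<-stateCloseCh
		record("state stopped")
	}()

	restWG.Add(1)
	go func() { // stand-in for the VoteSetBitsChannel processor
		defer restWG.Done()
		<-closeCh
		record("vote-set-bits stopped")
	}()

	close(stateCloseCh)
	stateWG.Wait() // guarantee the state processor has exited first
	close(closeCh)
	restWG.Wait()
	return order
}

func main() {
	fmt.Println(shutdownOrder())
}
```

Because the state goroutine is waited on before closeCh is closed, "state stopped" always precedes "vote-set-bits stopped"; a single shared close channel would not give that guarantee.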
maxMsgSize = 1048576 // 1MB; NOTE: keep in sync with types.PartSet sizes.

blocksToContributeToBecomeGoodPeer = 10000
votesToContributeToBecomeGoodPeer  = 10000
What's the role of something like this with respect to the new framework?
Good question. So in peerStatsRoutine we use these variables to invoke the (legacy) MarkPeerAsGood API on the switch, which internally does some logic within the address book.
In the new framework, the PeerManager has a notion of a "score" for each peer. These scores are currently binary (0 or 100). While it's not clear how exactly to implement a more sophisticated scoring system, I think the idea will be to either still have these APIs or piggy-back off of the PeerUpdate mechanism to send scoring updates.
tl;dr MarkPeerAsGood -> Update the peer score. How the score is updated remains to be determined.
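As a rough sketch of that mapping, the legacy thresholds could feed a score function. Everything here is hypothetical: PeerScore, the scoreGood/scoreDefault values, and scoreFor are illustrative stand-ins, not the actual PeerManager API, which (as noted above) remains to be determined:

```go
package main

import "fmt"

// PeerScore mirrors the currently binary (0 or 100) scores mentioned
// above; the names and values are assumptions for illustration.
type PeerScore uint8

const (
	blocksToContributeToBecomeGoodPeer = 10000
	votesToContributeToBecomeGoodPeer  = 10000

	scoreDefault PeerScore = 0
	scoreGood    PeerScore = 100
)

// peerStats is a stub for the per-peer counters tracked by the reactor.
type peerStats struct {
	votes      int
	blockParts int
}

// scoreFor returns the score a peer would be assigned based on its
// accumulated stats, mirroring the legacy MarkPeerAsGood thresholds
// (marked good every N votes/block parts contributed).
func scoreFor(s peerStats) PeerScore {
	if s.votes > 0 && s.votes%votesToContributeToBecomeGoodPeer == 0 {
		return scoreGood
	}
	if s.blockParts > 0 && s.blockParts%blocksToContributeToBecomeGoodPeer == 0 {
		return scoreGood
	}
	return scoreDefault
}

func main() {
	fmt.Println(scoreFor(peerStats{votes: 10000}))
	fmt.Println(scoreFor(peerStats{votes: 42}))
}
```

The result of such a function could then be delivered through the PeerUpdate mechanism rather than a direct MarkPeerAsGood call on the switch.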
Indeed this is true, but I'm not sure where/if this happens. I did a quick skim of the code and I couldn't see where this happens in the reactor or the state. Do you happen to know?
Yeah, look at this function here. In it we have
melekes
left a comment
There was a problem hiding this comment.
nice work 👍 left a few questions, otherwise LGTM
for {
	if !r.IsRunning() {
		r.Logger.Info("stopping peerStatsRoutine")
		return
should we flush statsMsgQueue before exiting?
We can execute close(statsMsgQueue) during State#OnStop, so it's closed when the reactor closes. WDYT?
Then again, the State type doesn't close any of its channels. Not sure if its worth it?
I don't think we should close statsMsgQueue; leaving it open keeps us from panicking on a send to a closed channel. You mean the old implementation does not do flushing? Then I guess it's fine. But this is what I would've done to ensure all goroutines that write to statsMsgQueue terminate.
You mean the old implementation does not do flushing?
Yes, correct. In fact, peerStatsRoutine was largely left untouched.
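For reference, the "flush before exiting" alternative raised above could look something like this. It avoids closing the channel (so concurrent senders can't panic) and instead drains whatever is buffered once the stop signal fires. drainOnStop, the channel types, and the stop mechanism are illustrative assumptions, not the reactor's actual code:

```go
package main

import "fmt"

// drainOnStop consumes messages from statsMsgQueue until stopCh is
// closed, then drains any remaining buffered messages before returning.
// It returns the number of messages processed. Note the queue is never
// closed, so writers cannot panic on send.
func drainOnStop(statsMsgQueue chan int, stopCh chan struct{}) int {
	processed := 0
	for {
		select {
		case <-stopCh:
			// Stop signal received: flush whatever is still buffered.
			for {
				select {
				case m := <-statsMsgQueue:
					_ = m // a real routine would record the stat here
					processed++
				default:
					return processed
				}
			}
		case m := <-statsMsgQueue:
			_ = m // a real routine would record the stat here
			processed++
		}
	}
}

func main() {
	q := make(chan int, 4)
	q <- 1
	q <- 2
	q <- 3
	stop := make(chan struct{})
	close(stop)
	fmt.Println(drainOnStop(q, stop))
}
```

Every buffered message is handled either in the main loop or in the drain branch, so nothing queued before the stop signal is lost.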
This would seem like the most logical place to return the error. I don't understand why we return the entire message back to the reactor when we're just recording votes and block parts. I would almost be tempted to just have a new struct for reporting peer stats/behavior. The other thing that bothers me is this: Why do we continue to broadcast invalid votes?
@cmwaters can you comment on the issue I've created? You bring up good points and I don't want them to get buried in this PR (as to also not bike-shed).
This reverts commit 16bbe8c.
Description
Prepares the consensus reactor for the newly designed p2p changes per ADR 062.
Moves PeerState to peer_state.go to improve readability and separate concerns in reactor.go. Adds running, closer, and broadcastWG fields to synchronize spawned goroutines when adding/removing a peer. Removes the use of p2p.Peer#Get|Set by making Reactor have an internal peers *cmap.CMap where we now store PeerState.

Note to reviewers: To simplify review, I would mainly focus on changes to reactor.go and peer_state.go and perhaps some of the message changes.

ref: #5670
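To illustrate the reactor-owned peers map described above, here is a minimal sketch. sync.Map stands in for tendermint's cmap.CMap, and PeerState is reduced to a stub; the method names are illustrative, not the PR's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// PeerState is a stub for the consensus peer state that the real
// reactor keeps in peer_state.go.
type PeerState struct {
	Height int64
}

// Reactor keeps its own concurrent map of peer states instead of
// stashing them on the peer via p2p.Peer#Get/Set.
type Reactor struct {
	peers sync.Map // peer ID -> *PeerState
}

// AddPeer registers a fresh PeerState for the peer if one does not
// already exist.
func (r *Reactor) AddPeer(id string) {
	r.peers.LoadOrStore(id, &PeerState{})
}

// GetPeerState returns the stored state for a peer, if any.
func (r *Reactor) GetPeerState(id string) (*PeerState, bool) {
	v, ok := r.peers.Load(id)
	if !ok {
		return nil, false
	}
	return v.(*PeerState), true
}

func main() {
	r := &Reactor{}
	r.AddPeer("peer-1")
	ps, ok := r.GetPeerState("peer-1")
	fmt.Println(ok, ps.Height)
}
```

Keeping the map on the reactor (rather than on the peer object) means the reactor fully owns the lifecycle of PeerState, which is what makes the add/remove synchronization via running, closer, and broadcastWG tractable.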