p2p: PeerBehaviour implementation (#3539) #3552
p2p: PeerBehaviour implementation (#3539) #3552ebuchman merged 29 commits intotendermint:developfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #3552 +/- ##
===========================================
+ Coverage 64.01% 64.05% +0.04%
===========================================
Files 213 214 +1
Lines 17381 17390 +9
===========================================
+ Hits 11127 11140 +13
+ Misses 5323 5322 -1
+ Partials 931 928 -3
|
xla
left a comment
There was a problem hiding this comment.
Real good process, great to see such quick turnaround on the proposal.
Couple of things we can iterate on. 👹
+ Structs with embeded locks must specify all methods with pointer
receivers to avoid creating a copy of the embeded lock.
+ Decouple PeerBehaviour interface from Peer by parametrizing
methods with `p2p.ID` instead of `p2p.Peer`
+ Setter methods wrapped in a write lock
+ Getter methods wrapped in a read lock
+ Getter methods on storedPeerBehaviour now take a `p2p.ID`
+ Getter methods return a copy of underlying stored behaviours
+ Add doc strings to public types and methods
+ Decouple PeerBehaviour interface from Peer by parametrizing
methods with `p2p.ID` instead of `p2p.Peer`
+ Setter methods wrapped in a write lock
+ Getter methods wrapped in a read lock
+ Getter methods on storedPeerBehaviour now take a `p2p.ID`
+ Getter methods return a copy of underlying stored behaviours
+ Add doc strings to public types and methods
9b1985a to
a0eca4a
Compare
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
Co-Authored-By: brapse <brapse@gmail.com>
p2p/peer_behaviour.go
Outdated
| const ( | ||
| ErrorPeerBehaviourUnknown = iota | ||
| ErrorPeerBehaviourBadMessage | ||
| ErrorPeerBehaviourMessageOutofOrder |
There was a problem hiding this comment.
How do reactors specify reactor specific values for this, and should they need to? We can try to capture the common things, like timeouts and invalid messages, but presumably we may want to distinguish between those happening in the SecretConnection/MConn from those happening in specific reactors.
There was a problem hiding this comment.
Initially the reason for the introduction of the behaviour is to avoid disparate places in the codebase stopping a peer for a non-formalised error. We need to enumerate all possible errors in one place, ideally the place which decides if the error is severe enough to punish a peer. If we strive for a reactor centric definition of those errors we end up with the same kid of dynamic glue we currently try to reduce.
Do you have ideas how it could be expressed?
There was a problem hiding this comment.
I don't think we can avoid having some reactor-centric feature here without losing significant functionality or over-coupling our domains.
At the least, we could standardize where in the respective reactor packages such errors are defined. For instance it could be part of channel registration: ie https://github.com/tendermint/tendermint/blob/develop/consensus/reactor.go#L125
There was a problem hiding this comment.
What if we move this to a higher level package as mentioned in
#3552 comment? We could then follow a similar pattern as specified in ADR-30 to provide reactor-centric features:
Implementation: behaviour/peer_behaviour.go:
type PeerBehaviour interface {}
type PeerBehaviourVote struct {...}
...
type PeerBehaviourReporter interface {
Report(peerID p2p.ID, behaviour PeerBehaviour) error
}
...
func (spbr *PeerBehaviourReporterXXX)Report(peerID ID, behaviour PeerBehaviour) error {
...
switch behaviour.(type) {
case PeerBehaviourVote:
...
}
}Example consensus/reactor.go:
peerReporter.Report(peerID, PeerBehaviourVote{...})Thoughts?
p2p/peer_behaviour.go
Outdated
|
|
||
| const ( | ||
| GoodPeerBehaviourVote = iota + 100 | ||
| GoodPeerBehaviourBlockPart |
There was a problem hiding this comment.
This kind of knowledge shouldn't be in the peer package - ie. the p2p package shouldn't depend on (or have knowledge of) consensus reactor behaviour.
Do we need per-channel peer behaviour? Then we can let these behaviours be defined where the channels are set up (eg. in the specific reactors)
(Recall each reactor can utilize multiple channels, where each channel is a stream multiplexed on the underlying MConn)
There was a problem hiding this comment.
We can't really express this with the go type system and it might likely lead to hard to debug code paths again, we wouldn't gain much, compared to the current implementation.
One approach is to extrat all possible behvaviours into it's own package.
There was a problem hiding this comment.
One approach is to extrat all possible behvaviours into it's own package.
This actually might not be a bad idea. A higher level package that deals with all these things. My main concern is just to avoid having reactor-specific ideas trickle into the p2p package.
p2p/peer_behaviour.go
Outdated
| // PeerBehaviour provides an interface for reactors to signal the behaviour | ||
| // of peers synchronously to other components. | ||
| type PeerBehaviour interface { | ||
| Behaved(peerID ID, reason GoodPeerBehaviour) error |
There was a problem hiding this comment.
I'd go for something stronger than behaved. What we currently have is more like BehavedForALongTime, but maybe there's a better word for that.
There was a problem hiding this comment.
Is it important to encode the temporal part? Sounds very narrow to me.
There was a problem hiding this comment.
Behaved is the normal behaviour. Many peers are behaving even if we're not calling this method. We only call this method once a peer has behaved for a sufficiently long period of time. It's actually stronger than behave ... currently it's a peer we've been receiving valuable unique information from. I'm not sure what a better name would be, behave is probably sufficient for now, but we can almost certainly do better :)
There was a problem hiding this comment.
Also the interface is called PeerBehaviour so Behaved doesn't feel informative. How did it behave? Maybe this would be better with just Good and Bad methods?
There was a problem hiding this comment.
I agree with of issues raised here around naming. The go guidelines suggest that interfaces should have -er post fix indicating what they do. If we are reporting peer behaviour the interface should probably be PeerReporter. In which case the only thing the PeerReporter needs to do is ...Report. This frees the caller from distinguishing between good and bad behaviour. Here is one path we could take:
type PeerBehaviour int
const (
BadMessage = iota
MessageOutOfOrder
Vote
BlockPart
)
type PeerReporter interface {
Report(peerID ID, reason PeerBehaviour) error
}
type SwitchPeerReporter struct {
sw *Switch
}
func (spb *SwitchPeerReporter) Report(peerID ID, behaviour PeerBehaviour) error {
peer := spb.sw.Peers().Get(peerID)
if peer == nil {
return errors.New("Peer not found")
}
switch behaviour {
case Vote:
spb.sw.MarkPeerAsGood(peer)
case BlockPart:
spb.sw.MarkPeerAsGood(peer)
case BadMessage:
spb.sw.StopPeerForError(peer, "Bad message")
case MessageOutOfOrder:
spb.sw.StopPeerForError(peer, "Message out of order")
default:
return errors.New("Unknown behaviour")
}
return nil
}This approach still produces a central set of peer behaviours. I tend to agree with @xla that there is value in having central authority of stopping peers. However, I honestly don't have a good grasp on the extent that coupling the p2p package with reactors at this level would have on current and future plans.
There was a problem hiding this comment.
Maybe PeerBehaviourReporter? (it's a bit long but tells what it reports about too). And I wanted to throw in yet another verb/iface name: track / Tracker. As the thing keeps track of peer behaviours 🤔
I like what @brapse and @xla proposed above. You should probably collapse the cases in the actual code.
There was a problem hiding this comment.
From the Reactors perspective, I can imagine them taking on responsibility for reporting peers when they witness some behaviour. Tracking does have a nice ring to it but to me it implies some process is being triggered which might not be the case.
PeerBehaviourReporter also sounds nice, and would help distinguish between other reporters. (Metrics perhaps?).
There was a problem hiding this comment.
Other possibilites: PeerMarker, PeerEvaluator, PeerRater, PeerAppraiser, etc.
+ Change naming of reporting behaviour to `Report`
+ Remove the responsibility of distinguishing between the categories
of good and bad behaviour and instead focus on reporting behaviour.
| @@ -0,0 +1,115 @@ | |||
| package p2p | |||
There was a problem hiding this comment.
As far as I can see the tests below only use the public API 👍 If you want you can rename this test package here to p2p_test.
| } | ||
|
|
||
| if behaviours[0] != PeerBehaviourBadMessage { | ||
| t.Errorf("Expected PeerBehaviourBadMessage to have been reported") |
There was a problem hiding this comment.
Nit: not necessary to use Errorf (no formatting directive in the string).
|
|
||
| for _, items := range behaviourScript { | ||
| if !equalBehaviours(pr.GetBehaviours(items.PeerID), items.Behaviours) { | ||
| t.Errorf("Expected peer %s to have behaved \n", items.PeerID) |
There was a problem hiding this comment.
Nit: In case this fails: instead of only shoding the pperID here, wouldn't it be nicer if the test showed you what "script" was expected/wanted and what the actual/got behaviour observed was?
liamsi
left a comment
There was a problem hiding this comment.
The changes LGTM! 🚀
We might want to give some further thoughts, where the behaviour should live: #3552 (comment) (I also prefer a separate package)
Address the remaining test issues mentioned in #tendermint#3552 notably: + switch to p2p_test package + Use `Error` instead of `Errorf` when not using formatting + Add expected/got errors to `TestMockPeerBehaviourReporterConcurrency` test
* Peer behaviour test tweaks:
Address the remaining test issues mentioned in ##3552 notably:
+ switch to p2p_test package
+ Use `Error` instead of `Errorf` when not using formatting
+ Add expected/got errors to
`TestMockPeerBehaviourReporterConcurrency` test
* Peer behaviour equal behaviours test
+ slices of PeerBehaviours should be compared as histograms to
ensure they have the same set of PeerBehaviours at the same
freequncy.
* TestEqualPeerBehaviours:
+ Add tests for the equivalence between sets of PeerBehaviours
* p2p: initial implementation of peer behaviour * [p2p] re-use newMockPeer * p2p: add inline docs for peer_behaviour interface * [p2p] Align PeerBehaviour interface (tendermint#3558) * [p2p] make switchedPeerHebaviour private * [p2p] make storePeerHebaviour private * [p2p] Add CHANGELOG_PENDING entry for PeerBehaviour * [p2p] Adjustment naming for PeerBehaviour * [p2p] Add coarse lock around storedPeerBehaviour * [p2p]: Fix non-pointer methods in storedPeerBehaviour + Structs with embeded locks must specify all methods with pointer receivers to avoid creating a copy of the embeded lock. * [p2p] Thorough refactoring based on comments in tendermint#3552 + Decouple PeerBehaviour interface from Peer by parametrizing methods with `p2p.ID` instead of `p2p.Peer` + Setter methods wrapped in a write lock + Getter methods wrapped in a read lock + Getter methods on storedPeerBehaviour now take a `p2p.ID` + Getter methods return a copy of underlying stored behaviours + Add doc strings to public types and methods * [p2p] make structs public * [p2p] Test empty StoredPeerBehaviour * [p2p] typo fix * [p2p] add TestStoredPeerBehaviourConcurrency + Add a test which uses StoredPeerBehaviour in multiple goroutines to ensure thread-safety. * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour_test.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * Update p2p/peer_behaviour.go Co-Authored-By: brapse <brapse@gmail.com> * [p2p] field ordering convention * p2p: peer behaviour refactor + Change naming of reporting behaviour to `Report` + Remove the responsibility of distinguishing between the categories of good and bad behaviour and instead focus on reporting behaviour. * p2p: rename PeerReporter -> PeerBehaviourReporter
Address the remaining test issues mentioned in #tendermint#3552 notably: + switch to p2p_test package + Use `Error` instead of `Errorf` when not using formatting + Add expected/got errors to `TestMockPeerBehaviourReporterConcurrency` test
Implementation of ADR-039 #3539
StoredPeerBehaviour