p2p: add prototype peer lifecycle manager by erikgrinaker · Pull Request #5882 · tendermint/tendermint

erikgrinaker · 2021-01-09T18:49:42Z

This adds a prototype peer lifecycle manager, peerManager, which stores peer data in an internal peerStore. The overall idea here is to have methods for peer lifecycle events which exchange a very narrow subset of peer data, and to keep all of the peer metadata (i.e. the peerInfo struct) internal, to decouple this from the router and simplify concurrency control. See peerManager GoDoc for more information.

The router is still responsible for actually dialing and accepting peer connections, and routing messages across them, but the peer manager is responsible for determining which peers to dial next, preventing multiple connections being established for the same peer (e.g. both inbound and outbound), and making sure we don't dial the same peer several times in parallel. Later it will also track retries and exponential backoff, as well as peer and address quality. It also assumes responsibility for peer updates subscriptions.

It's a bit unclear to me whether we want the peer manager to take on the responsibility of actually dialing and accepting connections as well, or if it should only be tracking peer state for the router while the router is responsible for all transport concerns. Let's revisit this later.

codecov · 2021-01-09T18:57:21Z

Codecov Report

Merging #5882 (3f65e83) into master (5cbb826) will increase coverage by 0.11%.
The diff coverage is 64.96%.

@@            Coverage Diff             @@
##           master    #5882      +/-   ##
==========================================
+ Coverage   59.93%   60.04%   +0.11%     
==========================================
  Files         269      269              
  Lines       24548    24591      +43     
==========================================
+ Hits        14712    14765      +53     
+ Misses       8237     8229       -8     
+ Partials     1599     1597       -2

Impacted Files	Coverage Δ
config/config.go	`76.26% <ø> (ø)`
config/toml.go	`60.86% <ø> (ø)`
p2p/router.go	`64.22% <53.57%> (-5.04%)`	⬇️
p2p/peer.go	`56.35% <71.28%> (+3.23%)`	⬆️
libs/events/events.go	`92.94% <0.00%> (-5.89%)`	⬇️
statesync/snapshots.go	`91.59% <0.00%> (-1.69%)`	⬇️
mempool/reactor.go	`80.83% <0.00%> (-1.67%)`	⬇️
blockchain/v2/reactor.go	`32.73% <0.00%> (-1.44%)`	⬇️
proxy/multi_app_conn.go	`48.05% <0.00%> (ø)`
consensus/state.go	`67.75% <0.00%> (+0.09%)`	⬆️
... and 13 more

melekes

👍

looks good to me, but I have questions about all those FIXME's

melekes · 2021-01-11T06:32:33Z

p2p/router.go

+// done, since delivery is guaranteed and will block peer
+// connection/disconnection otherwise.
+//
+// FIXME: Consider having callers just use peerManager.Subscribe() directly, if


p2p/peer.go

melekes · 2021-01-11T06:36:33Z

p2p/peer.go

+// is too expensive and also causes difficulties in tests where we may want to
+// consume peer updates and send messages sequentially).
+//
+// FIXME: This possibly indicates an architectural problem. Should the peerManager


could you expand on this?

Ready() exists only because the flow has to be like this:

Call peerManager.Connected() to check whether the connection is allowed (i.e. whether we already have a connection to this peer).

Set up router queue.

Call peerManager.Ready() to trigger peer update.

Otherwise, if we send the peer update during Connected() (which would be more logical), the reactor can send a message to the peer before the router has set up its queue, thus dropping that message. This detailed interplay between the manager and router suggests that the abstraction may be wrong, since they really shouldn't care too much about each other's internals. However, I don't think there's any way around it, and this approach is probably simpler to understand than e.g. introducing an onConnected() callback that the router could use for queue registration.

alexanderbez · 2021-01-11T16:57:54Z

It's a bit unclear to me whether we want the peer manager to take on the responsibility of actually dialing and accepting connections as well, or if it should only be tracking peer state for the router while the router is responsible for all transport concerns. Let's revisit this later.

I'm inclined to stick with the latter. Currently reviewing ...

p2p/peer.go

alexanderbez · 2021-01-11T17:18:42Z

p2p/peer.go

+	}
+}
+
+// Add adds a peer to the manager, given as an address. If the peer already


Looks like the peer address is (attempted to be) added regardless if the peer exists in the store.

Yes, this is how we get new peers into the store in the first place. peerManager.Add() is used to add peers to the store, or to add an address to an existing peer, and the rest of Tendermint identifies peers by a peer address (e.g. in configuration files and via PEX).

Yup, but I was referring to the last sentence in the godoc :)

Ah. Yes, that's also accurate I believe -- if the peer exists, we try to add the address to it (in case it's a new address for an existing peer).

p2p/peer.go

alexanderbez · 2021-01-11T17:43:36Z

p2p/peer.go

-	defer s.mtx.Unlock()
-	delete(s.claimed, id)
+// List retrieves all peers.
+func (s *peerStore) List() ([]*peerInfo, error) {


Is there a reason to return references to peerInfo objects?

Just to avoid copying the struct all over the place. This needs to be revisited, since we have to make sure the caller can't inadvertently change data in an in-memory database or in other callers, but we'll need to explicitly copy in that case anyway since e.g. slices and maps aren't copied automatically.

p2p/router.go

erikgrinaker added 4 commits January 8, 2021 18:06

split out peerManager

0a23fd8

rework peerManager design

d06058e

move peer updates into peerManager

37f44c0

comment tweak

d28365b

erikgrinaker self-assigned this Jan 9, 2021

erikgrinaker requested review from ebuchman, melekes and tessr as code owners January 9, 2021 18:49

erikgrinaker requested a review from alexanderbez January 9, 2021 18:49

comment update

ebeb071

erikgrinaker changed the title ~~p2p: app prototype peer lifecycle manager~~ p2p: add prototype peer lifecycle manager Jan 9, 2021

fix stray mutex unlocks

f9847c4

erikgrinaker closed this Jan 9, 2021

erikgrinaker reopened this Jan 9, 2021

melekes approved these changes Jan 11, 2021

View reviewed changes

alexanderbez reviewed Jan 11, 2021

View reviewed changes

erikgrinaker added 2 commits January 18, 2021 17:54

address review comments

0eaf32d

Merge branch 'master' into erik/p2p-peer-mgmt

edc0704

alexanderbez approved these changes Jan 18, 2021

View reviewed changes

fix early mutex unlock

3f65e83

erikgrinaker added the S:automerge Automatically merge PR when requirements pass label Jan 18, 2021

erikgrinaker merged commit 96215a0 into master Jan 18, 2021

erikgrinaker deleted the erik/p2p-peer-mgmt branch January 18, 2021 18:56

erikgrinaker mentioned this pull request Jan 21, 2021

p2p: internal refactor and architecture redesign #5670

Closed

57 tasks

Conversation

erikgrinaker commented Jan 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Jan 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

melekes left a comment

Choose a reason for hiding this comment

Uh oh!

melekes Jan 11, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

melekes Jan 11, 2021

Choose a reason for hiding this comment

Uh oh!

erikgrinaker Jan 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexanderbez commented Jan 11, 2021

Uh oh!

Uh oh!

Uh oh!

alexanderbez Jan 11, 2021

Choose a reason for hiding this comment

Uh oh!

erikgrinaker Jan 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexanderbez Jan 18, 2021

Choose a reason for hiding this comment

Uh oh!

erikgrinaker Jan 18, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

alexanderbez Jan 11, 2021

Choose a reason for hiding this comment

Uh oh!

erikgrinaker Jan 18, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

erikgrinaker commented Jan 9, 2021 •

edited

Loading

codecov bot commented Jan 9, 2021 •

edited

Loading

erikgrinaker Jan 18, 2021 •

edited

Loading

erikgrinaker Jan 18, 2021 •

edited

Loading