-
Notifications
You must be signed in to change notification settings - Fork 780
Description
Migrated from tendermint/tendermint#9548.
The method ensurePeers defines when a node should dial peer addresses in order to establish outbound peer connections. It dials new peers when the number of outbound peers plus the number of ongoing connections attempts is lower than the configured MaxNumOutboundPeers parameter. If the node defines it has to dial peers but it cannot find, in its address book, any suitable peer to dial, this method will submit a PEX request to the configured seed nodes in order to retrieve candidate peer addresses.
The ensurePeers method is invoked periodically, every ensurePeersPeriod = 30s. This means that between requesting peer addresses to a seed node and actually attempting to dial them, a node will wait for this interval, which was considered too long (tendermint/tendermint#2093). A fix was proposed in tendermint/tendermint#2115, where ensurePeers was immediately invoked when a node receives peer addresses provided by a seed node. This solution introduced another issue (tendermint/tendermint#3338): by executing ensurePeers in intervals below ensurePeersPeriod, a node could end up submitting PEX requests too often, which consists on a misbehavior.
To solve this issue, tendermint/tendermint#3762 replaced the invocation of ensurePeers, when receiving addresses from a seed node, by go routines that attempt connecting to all peer addresses received from a seed node. The introduced code excerpt, however, does not takes into consideration the number of connected peers or of connection attempts, nor the MaxNumOutboundPeers parameter. In other words, the logic implemented in ensurePeers to limit the number of outbound peers is ignored. As a result, upon resorting on a seed node to retrieve candidate peers to dial, a node may end up connected to all peers informed by the seed node, which tends to be more than the configured MaxNumOutboundPeers, with default value of 10.
This breach in the limits of outbound peers was identified when writing the spec of the p2p layer (tendermint/tendermint#9089). In testnet experiments (tendermint/tendermint#9499), it was observed that nodes had more peers than the maximum configured number (MaxNumOutboundPeers + MaxNumInboundPeers, 50 by default). This issue represents the suspect that the reason for this is the breach above detailed. Reinforces this suspect the fact that nodes in the testnet resort by default to seed nodes to learn addresses of potential peers.
Suggested fix: restore the solution introduced by tendermint/tendermint#2115, while creating another way to prevent sending PEX requests to often to connected peers or seed nodes.