Conversation
Codecov Report
@@ Coverage Diff @@
## p2p-docs-compare #1057 +/- ##
===================================================
Coverage ? 59.95%
===================================================
Files ? 116
Lines ? 10667
Branches ? 0
===================================================
Hits ? 6395
Misses ? 3701
Partials ? 571 |
|
|
||
| `--p2p.seed_mode` | ||
|
|
||
| The node operates in seed mode. It will kick incoming peers after sharing some peers. |
There was a problem hiding this comment.
What about: The node operates in seed mode. In seed mode, a node continuously crawls the network for peers, and upon incoming connection shares some peers and disconnect.
| Note that the auto-redial uses exponential backoff and will give up | ||
| after a day of trying to connect. | ||
|
|
||
| NOTE: If `dial_seeds` and `persistent_peers` intersect, |
There was a problem hiding this comment.
Is it seeds instead of dial_seeds?
|
|
||
| `MConnection` is a multiplex connection: | ||
|
|
||
| __multiplex__ *noun* a system or signal involving simultaneous transmission of |
There was a problem hiding this comment.
I don't understand this definition. If I am not wrong any TCP connection satisfies this definition as you can send several messages over it.
There was a problem hiding this comment.
This is the stock google definition.
Afaik, TCP is typically considered a single stream - you read and write one stream of messages. This is an abstraction to support reading/writing on multiple streams
|
|
||
| Each `MConnection` handles message transmission on multiple abstract communication | ||
| `Channel`s. Each channel has a globally unique byte id. | ||
| The byte id and the relative priorities of each `Channel` are configured upon |
There was a problem hiding this comment.
If I understand well, MConnection is different than "simple" TCP connection as it provides support for different quality of service guarantees (priorities, bandwidth provided, etc) for different channels sent over it. Is this right? Maybe it can be phrased slightly different so it's more clear what is special about MConnection.
| ### Ping and Pong | ||
|
|
||
| The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively | ||
|
|
There was a problem hiding this comment.
. missing at the end of previous sentence.
| The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively | ||
|
|
||
| When we haven't received any messages on an `MConnection` in a time `pingTimeout`, we send a ping message. | ||
| When a ping is received on the `MConnection`, a pong is sent in response. |
There was a problem hiding this comment.
Do we always send pong or only if there is no msg to be sent? Also we probably want to have some timeout between two consecutive pongs, i.e., we don't want to react upon every ping as this might be DDoS attack.
There was a problem hiding this comment.
this is a very good question
|
|
||
| ### Msg | ||
|
|
||
| Messages in channels are chopped into smaller msgPackets for multiplexing. |
There was a problem hiding this comment.
Not clear why this is done? TCP already provides some support cutting big messages in smaller chunks. I guess it's part of MConnection way of providing different QoS for different channels?
There was a problem hiding this comment.
Yes I believe that's right
|
|
||
| Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending | ||
| of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels. | ||
| Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time. |
There was a problem hiding this comment.
Why only one unsent message at a time? We might be missing some high level design goal regarding multiplexing. We should maybe try to take advantage with writing this spec to state what we want and why it matters, and then the code will be the current best way of how to provide it. By stating what and why we basically open call for innovation (different ways of implementing it) by community.
| for the channel with the given id byte `chID`. The message `msg` is serialized | ||
| using the `tendermint/wire` submodule's `WriteBinary()` reflection routine. | ||
|
|
||
| `TrySend(chID, msg)` is a nonblocking call that returns false if the channel's |
There was a problem hiding this comment.
What about: is a nonblocking call that queues the message msg in the channel with the given id byte chID if the queue is not full; otherwise it returns false immediately.
|
|
||
| ### PexReactor/AddrBook | ||
|
|
||
| A `PEXReactor` reactor implementation is provided to automate peer discovery. |
| ... | ||
|
|
||
| // Send a random message to all outbound connections | ||
| for _, peer := range switch.Peers().List() { |
There was a problem hiding this comment.
What is illustrated with the following code snippet?
There was a problem hiding this comment.
just an example of using the library. in this case to send a random message to all outbound connections (ie. peers that we dialed)
|
|
||
| Restarted full nodes can run the `blockchain` or `consensus` reactor protocols to sync up | ||
| to the latest state of the blockchain, assuming they aren't too far behind. | ||
| If they are too far behind, they may need to validate a recent `H` and `HASH` out-of-band again. |
There was a problem hiding this comment.
Don't understand this. In order to validate recent hash it needs to start from the genesis, right? Isn't it always slower then catching up?
There was a problem hiding this comment.
"validate recent hash" is a social excercise. the idea here is if you're far enough behind, you are vulnerable to nothing-at-stake attacks, so when you sync up to some recent block, you need to validate its hash socially to know you're on the right chain
| ## Sentry Node | ||
|
|
||
| Sentry nodes are guardians of a validator node and provide it access to the rest of the network. | ||
| Sentry nodes may be dynamic, but should maintain persistent connections to some evolving random subset of each other. |
There was a problem hiding this comment.
Sentry node can also probably connect to Full Nodes.
|
|
||
| ## Peer Identity | ||
|
|
||
| Tendermint peers are expected to maintain long-term persistent identities in the form of a private key. |
| ## Peer Identity | ||
|
|
||
| Tendermint peers are expected to maintain long-term persistent identities in the form of a private key. | ||
| Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto. |
There was a problem hiding this comment.
Should it be peer.ID == peer.PubKey.Address()?
| Tendermint peers are expected to maintain long-term persistent identities in the form of a private key. | ||
| Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto. | ||
|
|
||
| Peer ID's must come with some Proof-of-Work; that is, |
There was a problem hiding this comment.
Can be removed as PoW for peer id is dropped.
| This ensures they are not too easy to generate. To begin, let `target == 2^240`. | ||
|
|
||
| A single peer ID can have multiple IP addresses associated with it. | ||
| For simplicity, we only keep track of the latest one. |
There was a problem hiding this comment.
If I am not wrong, there was a case mentioned on meeting where we benefit from keeping multiple keys.
|
|
||
| Peers can also be connected to without specifying an ID, ie. just `<IP>:<PORT>`. | ||
| In this case, the peer must be authenticated out-of-band of Tendermint, | ||
| for instance via VPN |
There was a problem hiding this comment.
. is missing at the end of the sentence.
| - flip the last bit of nonce1 to get nonce2 | ||
| - if we had the smaller ephemeral pubkey, use nonce1 for receiving, nonce2 for sending; | ||
| else the opposite | ||
| - all communications from now on are encrypted using the shared secret and the nonces, where each nonce |
There was a problem hiding this comment.
It looks like sentence is not complete.
|
|
||
| ### Peer Filter | ||
|
|
||
| Before continuing, we check if the new peer has the same ID as ourselves or |
There was a problem hiding this comment.
As we don't assume some kind of central CA to exist for id/key management, it can happen (although with very small probability I guess) that two nodes exist in the network with the same id (public key). Is this a problem?
There was a problem hiding this comment.
No - that would require a private key collision and we consider those sufficiently improbable. If we had to worry about it here, we'd have to worry about it in the app layer too, and cryptocurrencies wouldn't work
|
|
||
| We also check the peer's address and public key against | ||
| an optional whitelist which can be managed through the ABCI app - | ||
| if the whitelist is enabled and the peer does not qualigy, the connection is |
There was a problem hiding this comment.
Typo: qualify instead of qualigy.
|
|
||
| ``` | ||
| type NodeInfo struct { | ||
| PubKey crypto.PubKey `json:"pub_key"` |
|
|
||
| `--p2p.seeds “1.2.3.4:466656,2.3.4.5:4444”` | ||
|
|
||
| Dials these seeds when we need more peers. They will return a list of peers and then disconnect. |
There was a problem hiding this comment.
will -> should?? i.e. what if one node from this list is not in "seed mode"?
| - all communications from now on are encrypted using the shared secret and the nonces, where each nonce | ||
| - we now have an encrypted channel, but still need to authenticate | ||
| increments by 2 every time it is used | ||
| - generate a common challenge to sign: |
There was a problem hiding this comment.
why this is called a challenge? i.e. I do not see any challenge here? what's challenging about signing a msg?
There was a problem hiding this comment.
The challenge is: "prove you have the privkey for this pubkey by signing this msg"
|
thanks for the feedback folks. see updates in https://github.com/tendermint/tendermint/pull/1076/files |
|
Looks great! |
caffix
left a comment
There was a problem hiding this comment.
Overall, the information provided within these files is quite useful. Great!
| Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending | ||
| of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels. | ||
| Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time. | ||
| Messages are chosen for a batch one a time from the channel with the lowest ratio of recently sent bytes to channel priority. |
There was a problem hiding this comment.
Typo: I believe it was supposed to say, "Messages are chosen for a batch one at a time ..."
| # Tendermint Peers | ||
|
|
||
| This document explains how Tendermint Peers are identified, how they connect to one another, | ||
| and how other peers are found. |
There was a problem hiding this comment.
This file does not yet appear to discuss how peers are found as this section claims. Perhaps further discussion is in order, or this claim should be removed?
| There are various cases where we decide a peer has misbehaved and we disconnect from them. | ||
| When this happens, the peer is removed from the address book and black listed for | ||
| some amount of time. We call this "Disconnect and Mark". | ||
| Note that the bad behaviour may be detected outside the PEX reactor itseld |
There was a problem hiding this comment.
Typo: "itseld" was probably meant to be "itself"
| ## Trust Metric | ||
|
|
||
| The quality of peers can be tracked in more fine-grained detail using a | ||
| Proportional-Integral-Derrivative (PID) controller that incorporates |
There was a problem hiding this comment.
Typo: "Derrivative" should be "Derivative"
| Behaviours are defined as one of: | ||
| - fatal - something outright malicious. we should disconnect and remember them. | ||
| - bad - any kind of timeout, msgs that dont unmarshal, or fail other validity checks, or msgs we didn't ask for or arent expecting | ||
| - neutral - normal correct behaviour. unknown channels/msg types (version upgrades). |
There was a problem hiding this comment.
We discussed neutral behavior being used for unknown channels, etc, and a separate type of behavior named "Correct" being used as the opposite of Bad. This way, good is reserved for behavior that is exceptionally positive.
| @@ -0,0 +1,16 @@ | |||
|
|
|||
| The trust metric tracks the quality of the peers. | |||
There was a problem hiding this comment.
Additional detail could be provided in this file using information from the architecture documents if you wish.
…#1311) * node/state:bootstrap state api (tendermint#1057) Co-authored-by: HuangYi <huang@crypto.com> Co-authored-by: yihuang <yi.codeplayer@gmail.com> Co-authored-by: Thane Thomson <connect@thanethomson.com> Co-authored-by: Sergio Mena <sergio@informal.systems> (cherry picked from commit 49127d2) # Conflicts: # state/mocks/store.go * Fixed merge conflict --------- Co-authored-by: Jasmina Malicevic <jasmina.dustinac@gmail.com>
* add bootstrap state cmd * add a missing line * Initial API impl * Added error message for missing appHash * Added changelog, removed cli * Added PR number * Unified hex encoding with rest of the code * Applied PR review comments * Proper blockstore initialization in case of offline statesync * Reverted forcing blocksync, not needed for correct operation * Added changelog and comments * Removed printfs, added check for empty state store * Fixed linter * Apply minor suggestions from code review Co-authored-by: Thane Thomson <connect@thanethomson.com> * Moved the appHash check up * Apply minor suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Apply suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Fixed linter * Do not look for VE when starting up after offline statesync * Extracted check for offline statesync outside load commit * Reconstruct seen commit after offline statesync * Call reconstructSeenCommit from reconstructLastCommit * Reading offline statesync height only once and passing it as a parameter * Moved up option initialization to make sure offline statesync is enabled * Added error to panic message * Update consensus/state.go Co-authored-by: Sergio Mena <sergio@informal.systems> * Apply suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Adjusted new lines * Added unit test to test int conversion and fixed linter * Apply suggestions from code review Co-authored-by: Thane Thomson <connect@thanethomson.com> * Replaced closing ifs with defer, added errors to error messages * linter fix * Adapted bootstrap code to use proper genesis file functions * Reverted genesis doc changes * Moved deferred closing before checking for whether the store is empty * Moved deferred close before error check --------- Co-authored-by: HuangYi <huang@crypto.com> Co-authored-by: yihuang <yi.codeplayer@gmail.com> Co-authored-by: Thane Thomson <connect@thanethomson.com> Co-authored-by: Sergio Mena <sergio@informal.systems>
* node/state:bootstrap state api (tendermint#1057) * add bootstrap state cmd * add a missing line * Initial API impl * Added error message for missing appHash * Added changelog, removed cli * Added PR number * Unified hex encoding with rest of the code * Applied PR review comments * Proper blockstore initialization in case of offline statesync * Reverted forcing blocksync, not needed for correct operation * Added changelog and comments * Removed printfs, added check for empty state store * Fixed linter * Apply minor suggestions from code review Co-authored-by: Thane Thomson <connect@thanethomson.com> * Moved the appHash check up * Apply minor suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Apply suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Fixed linter * Do not look for VE when starting up after offline statesync * Extracted check for offline statesync outside load commit * Reconstruct seen commit after offline statesync * Call reconstructSeenCommit from reconstructLastCommit * Reading offline statesync height only once and passing it as a parameter * Moved up option initialization to make sure offline statesync is enabled * Added error to panic message * Update consensus/state.go Co-authored-by: Sergio Mena <sergio@informal.systems> * Apply suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Adjusted new lines * Added unit test to test int conversion and fixed linter * Apply suggestions from code review Co-authored-by: Thane Thomson <connect@thanethomson.com> * Replaced closing ifs with defer, added errors to error messages * linter fix * Adapted bootstrap code to use proper genesis file functions * Reverted genesis doc changes * Moved deferred closing before checking for whether the store is empty * Moved deferred close before error check --------- Co-authored-by: HuangYi <huang@crypto.com> Co-authored-by: yihuang <yi.codeplayer@gmail.com> Co-authored-by: Thane Thomson <connect@thanethomson.com> Co-authored-by: Sergio Mena <sergio@informal.systems> * Fixed cherry pick conflicts * Removed go API changes in state store * Remove breaking go API changes in the blocksync reactor * Removed breaking changelog and fixed mocks * [pair programming] Found a more readable way to add the methods needed for the store * Removed duplicated code from blocksync reactor * Apply suggestions from code review Co-authored-by: Sergio Mena <sergio@informal.systems> * Removed comments * Removed unused variable from state --------- Co-authored-by: HuangYi <huang@crypto.com> Co-authored-by: yihuang <yi.codeplayer@gmail.com> Co-authored-by: Thane Thomson <connect@thanethomson.com> Co-authored-by: Sergio Mena <sergio@informal.systems>
These docs are already on develop but should have gone through PR for review tools