Skip to content

For Review: P2P docs#1057

Closed
ebuchman wants to merge 2 commits intop2p-docs-comparefrom
p2p-docs
Closed

For Review: P2P docs#1057
ebuchman wants to merge 2 commits intop2p-docs-comparefrom
p2p-docs

Conversation

@ebuchman
Copy link
Contributor

@ebuchman ebuchman commented Jan 4, 2018

These docs are already on develop but should have gone through PR for review tools

@ebuchman ebuchman requested a review from melekes as a code owner January 4, 2018 17:34
@ebuchman ebuchman mentioned this pull request Jan 4, 2018
@codecov-io
Copy link

codecov-io commented Jan 4, 2018

Codecov Report

❗ No coverage uploaded for pull request base (p2p-docs-compare@008de93). Click here to learn what that means.
The diff coverage is n/a.

@@                 Coverage Diff                 @@
##             p2p-docs-compare    #1057   +/-   ##
===================================================
  Coverage                    ?   59.95%           
===================================================
  Files                       ?      116           
  Lines                       ?    10667           
  Branches                    ?        0           
===================================================
  Hits                        ?     6395           
  Misses                      ?     3701           
  Partials                    ?      571


`--p2p.seed_mode`

The node operates in seed mode. It will kick incoming peers after sharing some peers.
Copy link
Contributor

@milosevic milosevic Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about: The node operates in seed mode. In seed mode, a node continuously crawls the network for peers, and upon incoming connection shares some peers and disconnect.

Note that the auto-redial uses exponential backoff and will give up
after a day of trying to connect.

NOTE: If `dial_seeds` and `persistent_peers` intersect,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it seeds instead of dial_seeds?


`MConnection` is a multiplex connection:

__multiplex__ *noun* a system or signal involving simultaneous transmission of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this definition. If I am not wrong any TCP connection satisfies this definition as you can send several messages over it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the stock google definition.

Afaik, TCP is typically considered a single stream - you read and write one stream of messages. This is an abstraction to support reading/writing on multiple streams


Each `MConnection` handles message transmission on multiple abstract communication
`Channel`s. Each channel has a globally unique byte id.
The byte id and the relative priorities of each `Channel` are configured upon
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand well, MConnection is different than "simple" TCP connection as it provides support for different quality of service guarantees (priorities, bandwidth provided, etc) for different channels sent over it. Is this right? Maybe it can be phrased slightly different so it's more clear what is special about MConnection.

### Ping and Pong

The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. missing at the end of previous sentence.

The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively

When we haven't received any messages on an `MConnection` in a time `pingTimeout`, we send a ping message.
When a ping is received on the `MConnection`, a pong is sent in response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always send pong or only if there is no msg to be sent? Also we probably want to have some timeout between two consecutive pongs, i.e., we don't want to react upon every ping as this might be DDoS attack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a very good question

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a TODO in code


### Msg

Messages in channels are chopped into smaller msgPackets for multiplexing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear why this is done? TCP already provides some support cutting big messages in smaller chunks. I guess it's part of MConnection way of providing different QoS for different channels?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I believe that's right


Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
Copy link
Contributor

@milosevic milosevic Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why only one unsent message at a time? We might be missing some high level design goal regarding multiplexing. We should maybe try to take advantage with writing this spec to state what we want and why it matters, and then the code will be the current best way of how to provide it. By stating what and why we basically open call for innovation (different ways of implementing it) by community.

for the channel with the given id byte `chID`. The message `msg` is serialized
using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.

`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about: is a nonblocking call that queues the message msg in the channel with the given id byte chID if the queue is not full; otherwise it returns false immediately.


### PexReactor/AddrBook

A `PEXReactor` reactor implementation is provided to automate peer discovery.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What PEX stands for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Peer Exchange

...

// Send a random message to all outbound connections
for _, peer := range switch.Peers().List() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is illustrated with the following code snippet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just an example of using the library. in this case to send a random message to all outbound connections (ie. peers that we dialed)


Restarted full nodes can run the `blockchain` or `consensus` reactor protocols to sync up
to the latest state of the blockchain, assuming they aren't too far behind.
If they are too far behind, they may need to validate a recent `H` and `HASH` out-of-band again.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand this. In order to validate recent hash it needs to start from the genesis, right? Isn't it always slower then catching up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"validate recent hash" is a social excercise. the idea here is if you're far enough behind, you are vulnerable to nothing-at-stake attacks, so when you sync up to some recent block, you need to validate its hash socially to know you're on the right chain

## Sentry Node

Sentry nodes are guardians of a validator node and provide it access to the rest of the network.
Sentry nodes may be dynamic, but should maintain persistent connections to some evolving random subset of each other.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sentry node can also probably connect to Full Nodes.


## Peer Identity

Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be public key?

## Peer Identity

Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be peer.ID == peer.PubKey.Address()?

Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto.

Peer ID's must come with some Proof-of-Work; that is,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed as PoW for peer id is dropped.

This ensures they are not too easy to generate. To begin, let `target == 2^240`.

A single peer ID can have multiple IP addresses associated with it.
For simplicity, we only keep track of the latest one.
Copy link
Contributor

@milosevic milosevic Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not wrong, there was a case mentioned on meeting where we benefit from keeping multiple keys.


Peers can also be connected to without specifying an ID, ie. just `<IP>:<PORT>`.
In this case, the peer must be authenticated out-of-band of Tendermint,
for instance via VPN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. is missing at the end of the sentence.

- flip the last bit of nonce1 to get nonce2
- if we had the smaller ephemeral pubkey, use nonce1 for receiving, nonce2 for sending;
else the opposite
- all communications from now on are encrypted using the shared secret and the nonces, where each nonce
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like sentence is not complete.


### Peer Filter

Before continuing, we check if the new peer has the same ID as ourselves or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we don't assume some kind of central CA to exist for id/key management, it can happen (although with very small probability I guess) that two nodes exist in the network with the same id (public key). Is this a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - that would require a private key collision and we consider those sufficiently improbable. If we had to worry about it here, we'd have to worry about it in the app layer too, and cryptocurrencies wouldn't work


We also check the peer's address and public key against
an optional whitelist which can be managed through the ABCI app -
if the whitelist is enabled and the peer does not qualigy, the connection is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: qualify instead of qualigy.


```
type NodeInfo struct {
PubKey crypto.PubKey `json:"pub_key"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove json stuff.


`--p2p.seeds “1.2.3.4:466656,2.3.4.5:4444”`

Dials these seeds when we need more peers. They will return a list of peers and then disconnect.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will -> should?? i.e. what if one node from this list is not in "seed mode"?

- all communications from now on are encrypted using the shared secret and the nonces, where each nonce
- we now have an encrypted channel, but still need to authenticate
increments by 2 every time it is used
- generate a common challenge to sign:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is called a challenge? i.e. I do not see any challenge here? what's challenging about signing a msg?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The challenge is: "prove you have the privkey for this pubkey by signing this msg"

@ebuchman
Copy link
Contributor Author

ebuchman commented Jan 7, 2018

thanks for the feedback folks. see updates in https://github.com/tendermint/tendermint/pull/1076/files

@milosevic
Copy link
Contributor

Looks great!

Copy link
Contributor

@caffix caffix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the information provided within these files is quite useful. Great!

Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
Messages are chosen for a batch one a time from the channel with the lowest ratio of recently sent bytes to channel priority.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: I believe it was supposed to say, "Messages are chosen for a batch one at a time ..."

# Tendermint Peers

This document explains how Tendermint Peers are identified, how they connect to one another,
and how other peers are found.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does not yet appear to discuss how peers are found as this section claims. Perhaps further discussion is in order, or this claim should be removed?

There are various cases where we decide a peer has misbehaved and we disconnect from them.
When this happens, the peer is removed from the address book and black listed for
some amount of time. We call this "Disconnect and Mark".
Note that the bad behaviour may be detected outside the PEX reactor itseld
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "itseld" was probably meant to be "itself"

## Trust Metric

The quality of peers can be tracked in more fine-grained detail using a
Proportional-Integral-Derrivative (PID) controller that incorporates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "Derrivative" should be "Derivative"

Behaviours are defined as one of:
- fatal - something outright malicious. we should disconnect and remember them.
- bad - any kind of timeout, msgs that dont unmarshal, or fail other validity checks, or msgs we didn't ask for or arent expecting
- neutral - normal correct behaviour. unknown channels/msg types (version upgrades).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed neutral behavior being used for unknown channels, etc, and a separate type of behavior named "Correct" being used as the opposite of Bad. This way, good is reserved for behavior that is exceptionally positive.

@@ -0,0 +1,16 @@

The trust metric tracks the quality of the peers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional detail could be provided in this file using information from the architecture documents if you wish.

@ebuchman
Copy link
Contributor Author

Thanks @caffix . Fixes in #1123

@ebuchman ebuchman closed this Jan 19, 2018
@ebuchman ebuchman deleted the p2p-docs branch January 21, 2018 05:39
firelizzard18 pushed a commit to AccumulateNetwork/tendermint that referenced this pull request Nov 24, 2023
…#1311)

* node/state:bootstrap state api (tendermint#1057)



Co-authored-by: HuangYi <huang@crypto.com>
Co-authored-by: yihuang <yi.codeplayer@gmail.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Sergio Mena <sergio@informal.systems>
(cherry picked from commit 49127d2)

# Conflicts:
#	state/mocks/store.go

* Fixed merge conflict

---------

Co-authored-by: Jasmina Malicevic <jasmina.dustinac@gmail.com>
firelizzard18 pushed a commit to AccumulateNetwork/tendermint that referenced this pull request Feb 1, 2024
* add bootstrap state cmd

* add a missing line

* Initial API impl

* Added error message for missing appHash

* Added changelog, removed cli

* Added PR number

* Unified hex encoding with rest of the code

* Applied PR review comments

* Proper blockstore initialization in case of offline statesync

* Reverted forcing blocksync, not needed for correct operation

* Added changelog and comments

* Removed printfs, added check for empty state store

* Fixed linter

* Apply minor suggestions from code review

Co-authored-by: Thane Thomson <connect@thanethomson.com>

* Moved the appHash check up

* Apply minor suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Apply suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Fixed linter

* Do not look for VE when starting up after offline statesync

* Extracted  check for offline statesync outside load commit

* Reconstruct seen commit after offline statesync

* Call reconstructSeenCommit from reconstructLastCommit

* Reading offline statesync height only once and passing it as a parameter

* Moved up option initialization to make sure offline statesync is enabled

* Added error to panic message

* Update consensus/state.go

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Apply suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Adjusted new lines

* Added unit test to test int conversion and fixed linter

* Apply suggestions from code review

Co-authored-by: Thane Thomson <connect@thanethomson.com>

* Replaced closing ifs with defer, added errors to error messages

* linter fix

* Adapted bootstrap code to use proper genesis file functions

* Reverted genesis doc changes

* Moved deferred closing before checking for whether the store is empty

* Moved deferred close before error check

---------

Co-authored-by: HuangYi <huang@crypto.com>
Co-authored-by: yihuang <yi.codeplayer@gmail.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Sergio Mena <sergio@informal.systems>
cboh4 pushed a commit to scrtlabs/tendermint that referenced this pull request Apr 7, 2025
* node/state:bootstrap state api (tendermint#1057)

* add bootstrap state cmd

* add a missing line

* Initial API impl

* Added error message for missing appHash

* Added changelog, removed cli

* Added PR number

* Unified hex encoding with rest of the code

* Applied PR review comments

* Proper blockstore initialization in case of offline statesync

* Reverted forcing blocksync, not needed for correct operation

* Added changelog and comments

* Removed printfs, added check for empty state store

* Fixed linter

* Apply minor suggestions from code review

Co-authored-by: Thane Thomson <connect@thanethomson.com>

* Moved the appHash check up

* Apply minor suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Apply suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Fixed linter

* Do not look for VE when starting up after offline statesync

* Extracted  check for offline statesync outside load commit

* Reconstruct seen commit after offline statesync

* Call reconstructSeenCommit from reconstructLastCommit

* Reading offline statesync height only once and passing it as a parameter

* Moved up option initialization to make sure offline statesync is enabled

* Added error to panic message

* Update consensus/state.go

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Apply suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Adjusted new lines

* Added unit test to test int conversion and fixed linter

* Apply suggestions from code review

Co-authored-by: Thane Thomson <connect@thanethomson.com>

* Replaced closing ifs with defer, added errors to error messages

* linter fix

* Adapted bootstrap code to use proper genesis file functions

* Reverted genesis doc changes

* Moved deferred closing before checking for whether the store is empty

* Moved deferred close before error check

---------

Co-authored-by: HuangYi <huang@crypto.com>
Co-authored-by: yihuang <yi.codeplayer@gmail.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Sergio Mena <sergio@informal.systems>

* Fixed cherry pick conflicts

* Removed go API changes in state store

* Remove breaking go API changes in the blocksync reactor

* Removed breaking changelog and fixed mocks

* [pair programming] Found a more readable way to add the methods needed for the store

* Removed duplicated code from blocksync reactor

* Apply suggestions from code review

Co-authored-by: Sergio Mena <sergio@informal.systems>

* Removed comments

* Removed unused variable from state

---------

Co-authored-by: HuangYi <huang@crypto.com>
Co-authored-by: yihuang <yi.codeplayer@gmail.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Sergio Mena <sergio@informal.systems>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants