Lite Client Spec by tac0turtle · Pull Request #3796 · tendermint/tendermint

tac0turtle · 2019-07-14T08:39:10Z

Opening this PR to spur discussions here

Referenced an issue explaining the need for the change
Updated all relevant documentation in docs
Updated all code comments where relevant
Wrote tests
Updated CHANGELOG_PENDING.md

zmanian · 2019-07-14T18:13:13Z

docs/spec/consensus/light-client.md

-blockchain header, and further the corresponding Merkle proofs.
+## Lite client requirements from Tendermint and Proof of Stake modules
+
+Before explaining lite client mechanisms and operations we need to define some requirements expected from the Tendermint blockchain. Tendermint provides a deterministic, Byzantine fault-tolerant, source of time (called


So I don't think that BFT time has a huge amount to do with lite clients.

Weak subjectivity is defined entirely in terms of the client's local clock.

It is slightly relevant in an IBC context where BFT time is the local clock because the lite client is another blockchain.

I think we need to constrain drifts between local time and BFT time to be able to come up with some guarantees. Otherwise (at least in theory) clocks can drift arbitrarily.

I thought we're going to remove BFT time, no? #2840

The plan isn't to remove BFT time but to make it proposer-based, so that an attack to accelerate time requires 2f+1 instead of f+1.

cwgoes

Commented! I think this PR, #3710, and #3795 should be combined.

cwgoes · 2019-07-15T14:57:52Z

docs/spec/consensus/light-client.md

+
+Before explaining lite client mechanisms and operations we need to define some requirements expected from the Tendermint blockchain. Tendermint provides a deterministic, Byzantine fault-tolerant, source of time (called
+[BFT Time](/Users/zarkomilosevic/go-workspace/src/github.com/tendermint/tendermint/docs/spec/consensus/bft-time.md)).
+BFT time is monotonically increasing and in case of at most 1/3 of voting power equivalent of faulty validators guaranteed to be close to the wall time of correct validators. For correct functioning of lite client we need a guarantee that BFT time does not drift more than some known parameter BFT_TIME_DRIFT_BOUND (that should normally be measured in hours, maybe even days) from client wall time. Note that this requirement currently only holds in case 


What is "close"?

It is made more concrete below.

cwgoes · 2019-07-15T14:59:22Z

docs/spec/consensus/light-client.md

+BFT time is monotonically increasing and in case of at most 1/3 of voting power equivalent of faulty validators guaranteed to be close to the wall time of correct validators. For correct functioning of lite client we need a guarantee that BFT time does not drift more than some known parameter BFT_TIME_DRIFT_BOUND (that should normally be measured in hours, maybe even days) from client wall time. Note that this requirement currently only holds in case 
+at most 1/3 of voting power equivalent of validators report wrong time, but we might need to strengthen this requirement further to also be able to tolerate time-related misbehavior of more than 1/3 voting power equivalent of validators (https://github.com/tendermint/tendermint/issues/2653, https://github.com/tendermint/tendermint/issues/2840).  
+
+Furthermore, lite client security is tightly coupled with the notion of UNBONDING_PERIOD that is at the core of the security of proof of stake blockchain systems (for example Cosmos Hub). UNBONDING_PERIOD is period of time that needs to pass from the withdraw event until stake is liquid. During this period unbonded validator cannot participate in the consensus protocol (and is therefore not rewarded) but can be slashed for misbehavior (done either before withdraw event or during UNBONDING_PERIOD). This is used to protect against a validator attacking the network and then immediately withdrawing his stake. Cosmos Hub is currently enforcing a 21-day UNBONDING_PERIOD. Note that UNBONDING_PERIOD is measured with respect to BFT time and that this has significant effect on the security of lite client operation as validators are not slashable outside their UNBONDING_PERIOD. There is a hidden implicit assumptions regarding the UNBONDING_PERIOD: we assume that Tendermint will always generate blocks within duration of UNBONDING_PERIOD. If chain halts for the duration of UNBONDING_PERIOD security of lite clients are jeopardized. Probably more secure solution would be defining UNBONDING_PERIOD as a hybrid of wall time and logical time (number of block heights). In that case UNBONDING_PERIOD is over when the both conditions are true. In that case no assumption is being made on the chain progress (which is in theory hard to make as Tendermint operate in partially synchronous system model), and system is secure (including lite clients) even in case of long halts.   


The proof-of-stake system is defined in the Cosmos SDK; it's not (necessarily) particular to the Cosmos Hub.

Validators are unbonding during the unbonding period.

We also assume synchrony of evidence submission.

cwgoes · 2019-07-15T14:59:28Z

docs/spec/consensus/light-client.md

+obtain `ResultValidators` that contains validators that has committed the block h. Then we check if MerkleRoot
+of the validator set is equal to the trusted validator set hash. If verification failed, initialization exits with error, otherwise it proceeds. 
+
+Next step is determining if the block at hight h is correctly signed by the obtained validator set. This is achieved by 


Suggested change

Next step is determining if the block at hight h is correctly signed by the obtained validator set. This is achieved by

Next step is determining if the block at height h is correctly signed by the obtained validator set. This is achieved by

cwgoes · 2019-07-15T15:00:05Z

docs/spec/consensus/light-client.md

-Tendermint RPC:
+`header.Time + UNBONDING_PERIOD <= Now() - BFT_TIME_DRIFT_BOUND`.
+
+Note that outside this time window lite client cannot trust validator set as validators could potentially unbonded its stake so security of the lite client does not hold as they are not slashable for its actions. Therefore, they can eclipse client and cheat about the system state without risk of being punished for such misbehavior. 


Suggested change

Note that outside this time window lite client cannot trust validator set as validators could potentially unbonded its stake so security of the lite client does not hold as they are not slashable for its actions. Therefore, they can eclipse client and cheat about the system state without risk of being punished for such misbehavior.

Note that outside this time window lite client cannot trust validator set as validators could potentially have unbonded their stake so security of the lite client does not hold as they are not slashable for its actions. Therefore, they can eclipse client and cheat about the system state without risk of being punished for such misbehavior.

I don't understand what eclipsing and the unbonding period have to do with one another. If the light client is eclipsed for the full duration of the unbonding period it doesn't matter whether the validators are unbonding or not, since the light client can't submit evidence anyways. Outside the unbonding period, it doesn't matter if the client is eclipsed since the state machine will reject the evidence even if it is submitted.

I assume here that being eclipsed is not a permanent state of affairs, since at any point you can switch to a different full node or connect to an additional one. In that sense it matters whether you are operating within the unbonding period: inside it you still have a chance of connecting to a correct node, detecting a fork (for example) and submitting evidence, whereas outside it there is no recourse if you have been cheated.

cwgoes · 2019-07-15T15:01:50Z

docs/spec/consensus/light-client.md

+Note that outside this time window lite client cannot trust validator set as validators could potentially unbonded its stake so security of the lite client does not hold as they are not slashable for its actions. Therefore, they can eclipse client and cheat about the system state without risk of being punished for such misbehavior. 
+
+Note that this formula shows a fundamental dependence of lite client security on the wall time. If UNBONDING_PERIOD
+would be defined only in terms of logical time (block heights), lite client will not have a way to know if trusted validator set is still withing its UNBONDING_PERIOD as it does not have a way of reliably determining top of the chain.


I think we should do this! (define unbonding period as a minimum of time passed and of blocks passed)

It's not very hard at all.

I feel the same. I think complexity is significantly added and we increase security of lite client significantly.

cwgoes · 2019-07-15T15:02:03Z

docs/spec/consensus/light-client.md

+
+Note that this formula shows a fundamental dependence of lite client security on the wall time. If UNBONDING_PERIOD
+would be defined only in terms of logical time (block heights), lite client will not have a way to know if trusted validator set is still withing its UNBONDING_PERIOD as it does not have a way of reliably determining top of the chain.
+Therefore, it seems that having BFT time in sync with standard notions of time (for example NTP) is necessarily for correct operations of the system. 


Suggested change

Therefore, it seems that having BFT time in sync with standard notions of time (for example NTP) is necessarily for correct operations of the system.

Therefore, it seems that having BFT time in sync with standard notions of time (for example NTP) is necessary for correct operations of the system.

(and see above comment)

cwgoes · 2019-07-15T15:02:34Z

docs/spec/consensus/light-client.md

+would be defined only in terms of logical time (block heights), lite client will not have a way to know if trusted validator set is still withing its UNBONDING_PERIOD as it does not have a way of reliably determining top of the chain.
+Therefore, it seems that having BFT time in sync with standard notions of time (for example NTP) is necessarily for correct operations of the system. 
+
+Lite client security depends also on the guarantee that faulty validator behavior will be punished. Therefore if a client detect faulty behavior we need to guarantee that proof of misbehavior evidence transaction will be committed within UNBONDING_PERIOD of faulty validators so it can be slashed. This can be achieved by having client considering


Suggested change

Lite client security depends also on the guarantee that faulty validator behavior will be punished. Therefore if a client detect faulty behavior we need to guarantee that proof of misbehavior evidence transaction will be committed within UNBONDING_PERIOD of faulty validators so it can be slashed. This can be achieved by having client considering

Lite client security depends also on the guarantee that faulty validator behavior will be punished. Therefore if a client detects faulty behavior we need to guarantee that proof of misbehavior evidence transaction will be committed within UNBONDING_PERIOD of faulty validators so it can be slashed. This can be achieved by having client considering

cwgoes · 2019-07-15T15:03:10Z

docs/spec/consensus/light-client.md

-validator set sequence number and the validator set init time.
-The core of the light client logic is captured by the VerifyAndUpdate function that is used to 1) verify if the given header is valid,
-and 2) update the validator set (when the given header is valid and it is more recent than the seen headers).
+To be able to validate a Merkle proof, a light client needs to validate the blockchain header that contains the root app hash.Validating a blockchain header in Tendermint consists in verifying that the header is committed (signed) by >2/3 of the voting power of the corresponding validator set. As the validator set is a dynamic set (it is changing), one of the core functionality of the lite client is updating the current validator set, that is then used to verify the


Suggested change

To be able to validate a Merkle proof, a light client needs to validate the blockchain header that contains the root app hash.Validating a blockchain header in Tendermint consists in verifying that the header is committed (signed) by >2/3 of the voting power of the corresponding validator set. As the validator set is a dynamic set (it is changing), one of the core functionality of the lite client is updating the current validator set, that is then used to verify the

To be able to validate a Merkle proof, a light client needs to validate the blockchain header that contains the root app hash. Validating a blockchain header in Tendermint consists in verifying that the header is committed (signed) by >2/3 of the voting power of the corresponding validator set. As the validator set is a dynamic set (it is changing), one of the core functionalities of the lite client is updating the current validator set, which is then used to verify the
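The ">2/3 of the voting power" condition in this paragraph can be sketched as follows. The example only tallies voting power; signature verification is assumed to have happened already, and the `validator` type and `hasTwoThirdsMajority` function are hypothetical names rather than Tendermint's actual types.

```go
package main

// validator is a simplified stand-in for a validator set entry.
type validator struct {
	Address     string
	VotingPower int64
}

// hasTwoThirdsMajority reports whether the validators marked in signed hold
// strictly more than 2/3 of the total voting power of vals.
// Overflow of the int64 products is ignored in this sketch.
func hasTwoThirdsMajority(vals []validator, signed map[string]bool) bool {
	var total, signedPower int64
	for _, v := range vals {
		total += v.VotingPower
		if signed[v.Address] {
			signedPower += v.VotingPower
		}
	}
	// Compare 3*signed > 2*total to avoid integer division rounding.
	return total > 0 && signedPower*3 > total*2
}
```

Note that exactly 2/3 is not enough: with three validators of equal power, two signatures do not satisfy the check, while three out of four do.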

cwgoes · 2019-07-15T15:03:20Z

docs/spec/consensus/light-client.md

-The core of the light client logic is captured by the VerifyAndUpdate function that is used to 1) verify if the given header is valid,
-and 2) update the validator set (when the given header is valid and it is more recent than the seen headers).
+To be able to validate a Merkle proof, a light client needs to validate the blockchain header that contains the root app hash.Validating a blockchain header in Tendermint consists in verifying that the header is committed (signed) by >2/3 of the voting power of the corresponding validator set. As the validator set is a dynamic set (it is changing), one of the core functionality of the lite client is updating the current validator set, that is then used to verify the
+blockchain header, and further the corresponding Merkle proofs.


Suggested change

blockchain header, and further the corresponding Merkle proofs.

blockchain header, and subsequently the corresponding Merkle proofs.

cwgoes · 2019-07-15T15:04:30Z

docs/spec/consensus/light-client.md

-i.e., that it will always be used to verify more recent headers. In case a light client needs to be used to verify older
-headers (go backward) the same mechanisms and similar logic can be used. In case a call to the FullNode or subsequent
-checks fail, a light client need to implement some recovery strategy, for example connecting to other FullNode.
+In case a call to the FullNode or subsequent checks fail, a light client need to implement some recovery strategy, for example connecting to other FullNode.


The light client should always try to use a randomized load balancing strategy, considering the possibility of malevolent eclipse attacks (or just innocuous but inconvenient stale full node data).
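One way to read the randomized strategy suggested here is to try the known full nodes in a random order and fall back to the next one on failure. The sketch below assumes a caller-supplied `query` function; `firstResponsive` and `errNoFullNode` are hypothetical names.

```go
package main

import (
	"errors"
	"math/rand"
)

// errNoFullNode is returned when every candidate full node failed.
var errNoFullNode = errors.New("no full node answered")

// firstResponsive tries the given full node addresses in a random order and
// returns the first address for which query succeeds.
// On Go versions before 1.20 the global math/rand source should be seeded
// by the caller, otherwise the order repeats across runs.
func firstResponsive(nodes []string, query func(addr string) error) (string, error) {
	for _, i := range rand.Perm(len(nodes)) {
		if err := query(nodes[i]); err == nil {
			return nodes[i], nil
		}
	}
	return "", errNoFullNode
}
```

Randomizing the order spreads load and makes it harder for a single node to be the client's only source of data, though it does not by itself detect stale or malicious responses.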

tac0turtle · 2019-07-22T13:33:11Z

Closing this PR and #3710, as @milosevic & @josef-widder will open a new PR with these two documents combined.

Zarko Milosevic added 2 commits July 10, 2019 17:07

Initial version of time related concerns

ef8e18a

Add init and verifyUpdate funcs

e32fa44

zmanian reviewed Jul 14, 2019

cwgoes suggested changes Jul 15, 2019

tac0turtle closed this Jul 22, 2019

tac0turtle mentioned this pull request Jul 22, 2019

docs: Add check validators spec #3710

Closed

4 tasks
