Description
I'm not entirely sure whether this issue belongs in Tendermint or the SDK: it results from the interaction between the median BFT time calculation in Tendermint and the bonded proof-of-stake model implemented in the SDK, so it has to be considered with both in mind. Putting it here for now.
Concern 1: Timewarp attack
I'm concerned that the current model of BFT time & the unbonding period substantially changes the Byzantine attack surface of the Cosmos Hub — in particular, it gives too much power to 34% (just more than a third) of the stake.
Presently, if we assume a time oracle, 34% of coordinating stake can:
- Halt the chain by refusing to sign blocks (no network control required, no stake at risk)
- Network partition the other 2/3, double-sign and cause a fork (complete network control of other 2/3 required, stake at risk)
Halting the chain - although unfortunate - is easily detectable by "humans watching the system" in practice, can easily be fixed by forking out the offending stake, and doesn't lead to double-spends for any other services connected to the chain or other blockchains connected over IBC.
The second attack is more problematic, but it requires complete network control (in practice difficult). Once complete network control breaks, double-sign proofs will be submitted from both forks to each other and the offending 34% will be slashed on both (or both will halt, but either case is attributable). Likewise for the current IBC model - proof-of-double-sign can be submitted to IBC contracts on the other chains and the contracts can immediately lock assets / prevent further value transfer.
However, with our current median BFT time plus the unbonding period which utilizes it, I think 34% of stake can do the following:
- Censor all other proposers so that the 34% cabal exclusively controls the included vote set of each block, and include votes from only one other 1/3 of stake (totaling just over 2/3, so enough to commit blocks) - thus completely controlling the median timestamp, since the 34% will comprise a majority (about 51%) of the voting power included in each block.
- Double-sign a block at some height h, but wait to publish the signatures
- In block h+1 or h+2, increase the timestamp by three weeks
- Submit the double-signed block to an IBC connection or light client a few headers behind, and voila - double-spend with no punishment, because the SDK will reject the evidence as being too old
This attack requires no ability to partition the other 2/3 of validators, puts no stake at risk, and can happen in a matter of a few blocks before anyone notices. It is still attributable, but not in-protocol - governance would have to elect to slash the offending validators, which could be controversial, takes time, doesn't work with IBC, etc etc.
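The timestamp-control step of this attack can be illustrated with a minimal sketch. It assumes a simplified voting-power-weighted median over the votes included in a block; `weighted_median`, the stake split, and the timestamps are hypothetical illustrations, not Tendermint's actual code.

```python
THREE_WEEKS = 21 * 24 * 60 * 60  # seconds


def weighted_median(votes):
    """Return the timestamp at which cumulative voting power, taken in
    timestamp order, first reaches half of the total included power.

    votes: list of (timestamp_seconds, voting_power) tuples.
    This is a simplified stand-in for a BFT-time median, not the real one.
    """
    if not votes:
        raise ValueError("no votes included")
    total = sum(power for _, power in votes)
    seen = 0
    for timestamp, power in sorted(votes):
        seen += power
        if 2 * seen >= total:
            return timestamp


real_time = 1_000_000                    # arbitrary "true" time, in seconds
warped_time = real_time + THREE_WEEKS    # timestamp reported by the cabal

# All votes included: the honest 66% controls the median.
all_votes = [(real_time, 33), (real_time, 33), (warped_time, 34)]

# Cabal censors one honest third: 34 of the 67 included units are theirs.
censored_votes = [(real_time, 33), (warped_time, 34)]

print(weighted_median(all_votes) == real_time)         # True
print(weighted_median(censored_votes) == warped_time)  # True
```

With every vote included the median stays at the honest time; once the included set shrinks to 67 units of power, the cabal's 34 units decide it.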
The SDK could choose not to reject evidence based on its timestamp, but then the 34% cabal could instead increase the timestamp past the evidence rejection threshold enforced at the Tendermint P2P layer.
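Whichever layer applies it, the kind of age check being exploited can be sketched as follows. This is an assumption-laden illustration: `evidence_expired` and `MAX_EVIDENCE_AGE` are hypothetical names, and the maximum age is assumed here to equal a three-week unbonding period.

```python
THREE_WEEKS = 21 * 24 * 60 * 60   # seconds
MAX_EVIDENCE_AGE = THREE_WEEKS    # assumption: max age equals the unbonding period


def evidence_expired(evidence_time, block_time, max_age=MAX_EVIDENCE_AGE):
    """Evidence is rejected once the chain's own clock says it is too old."""
    return block_time - evidence_time > max_age


double_sign_time = 1_000_000                 # time of the double-signed block
honest_block_time = double_sign_time + 12    # a couple of blocks later, real time
warped_block_time = honest_block_time + THREE_WEEKS  # same height, warped clock

print(evidence_expired(double_sign_time, honest_block_time))  # False
print(evidence_expired(double_sign_time, warped_block_time))  # True
```

The evidence is only seconds old in real time, but the check has no reference other than the block timestamp the cabal controls.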
In practice this seems like a much worse attack than either of the two above — it doesn't require network control, allows double-spending, isn't necessarily attributable or slashable, and happens almost instantly.
Concern 2: Inflationary incentives
Separate from the Byzantine case above, I think rational self-interested validators who are not explicitly colluding (which is our threat model) might be incentivized to lie about time.
What does the timestamp do in the Cosmos Hub incentive model? Two things:
- Timestamp controls the unbonding period - oldest age of valid evidence and how fast unbonding stake is unlocked
- Timestamp controls inflation (the annual target inflation rate is applied incrementally every so often according to elapsed time)
In different cases I could see lying about time in both ways being rational, but I'm more concerned about the "fast time" case. Because timestamp controls inflation, stakers control their own rate of payment for staking on the network. As a validator - even one who isn't colluding at all - the later the timestamp I pick, the more the median slightly shifts and the (slightly) more I get paid. As a rational delegator, I'll vote for validators which pick later timestamps and increase (slightly) my rewards.
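To make the incentive concrete, here is a rough sketch of time-based inflation, assuming tokens are minted in proportion to the elapsed block time. The function `minted`, the supply, and the 7% rate are hypothetical values chosen for illustration, not the Hub's actual parameters or minting code.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def minted(supply, annual_rate, prev_block_time, block_time):
    """Tokens minted for the interval between two block timestamps,
    applying the annual rate pro rata to the elapsed (reported) time."""
    elapsed = block_time - prev_block_time
    return supply * annual_rate * elapsed / SECONDS_PER_YEAR


supply = 100_000_000   # hypothetical total supply
annual_rate = 0.07     # hypothetical annual inflation target

honest = minted(supply, annual_rate, 0, 3600)   # one real hour reported as one hour
fast = minted(supply, annual_rate, 0, 7200)     # same real hour reported as two

print(fast > honest)  # True
```

Under this model, reporting time as passing twice as fast roughly doubles the tokens minted per unit of real time.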
In the otherwise-honest model (where the only "non-protocol-compliant" thing validators are doing is lying about time) this does still require 51% of stake to lie in this way to actually be a problem - otherwise the timestamp will just be too far ahead, but by a constant amount since the honest 51% control the median and are just setting their time from an external oracle. But since there's no punishment for lying and a (slight) benefit even as a single validator who changes only their action, I'm not sure we have sufficiently strong reasons to expect that 51% of stake would be consistently honest.
In the Byzantine model, the 34% attack - without double-signing - applies here as well: a 34% cabal can censor half the other votes, control the timestamp, and speed up the inflation rate by any factor they like. (this might be even worse because I think they can also selectively censor precommits, ref cosmos/cosmos-sdk#2522)
In general, it seems to me like we have not thought enough about the ramifications of utilizing a timestamp completely controlled by the validator set for core protocol security state machine logic. I think we:
- should think more about it and sketch out the security model more concretely
- should consider using, or additionally using, a time metric which has at least some real-world logistical constraints - if we additionally require a minimum number of blocks for unbonding, for example, the 34% cabal attack is far less effective, even if that minimum is a number of blocks we expect to take only half an unbonding period, since the cabal can't speed up the rate of block production
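The block-count suggestion in the last point could look roughly like the sketch below. It assumes a three-week unbonding period and an expected block time of about 6 seconds; both figures, and the names `unbonding_complete` and `MIN_UNBONDING_BLOCKS`, are hypothetical.

```python
THREE_WEEKS = 21 * 24 * 60 * 60   # seconds
EXPECTED_BLOCK_SECONDS = 6        # assumption for illustration only

# Minimum block count sized to take roughly half an unbonding period.
MIN_UNBONDING_BLOCKS = (THREE_WEEKS // 2) // EXPECTED_BLOCK_SECONDS


def unbonding_complete(start_time, start_height, now_time, now_height,
                       period=THREE_WEEKS, min_blocks=MIN_UNBONDING_BLOCKS):
    """Require both enough reported time and enough blocks to have passed."""
    enough_time = now_time - start_time >= period
    enough_blocks = now_height - start_height >= min_blocks
    return enough_time and enough_blocks


# Timewarp: three weeks of reported time pass within two blocks.
warped = unbonding_complete(0, 100, THREE_WEEKS + 12, 102)

# Normal operation: the full period and well over the minimum blocks pass.
normal = unbonding_complete(0, 100, THREE_WEEKS, 100 + 2 * MIN_UNBONDING_BLOCKS)

print(warped)  # False
print(normal)  # True
```

The time condition alone is satisfied by the warped timestamp; the block condition is what the cabal cannot shortcut, since it cannot produce blocks faster.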
Let me know if the above explanations are clear or if I missed anything.