Signature-Free BFT Consensus in Three Steps

2026-06-16T00:00:00+00:00

Update, June 16, 2026:</strong> Revised the framing after first publication.</p>

Update, June 20, 2026:</strong> Simplified the protocol (merged witnesses and cores).</p>

Introduction</h1>
Recently, Abraham et al. presented Forget-IT</a>, a new partially-synchronous, signature-free BFT consensus protocol for $n=3f+1$ parties among which $f$ may be Byzantine. Forget-IT achieves two properties never achieved in combination before 1</a></sup>: (a) when the network is synchronous, it commits a correct leader’s proposal in just three message delays, and (b) even in the worst case, it sends only $O(n^2)$ bits per view. Having tried and failed to obtain such a protocol before, I set out to understand the core idea behind it.</p>
I initially framed this blog post as distilling Forget-IT down to the core mechanism that allows it to achieve points (a) and (b). However, after first publishing this blog post, I realized that I must have had elements of IT-Kuplex</a> in mind too. So, let us say that we are going to sketch how to obtain properties (a) and (b) using simple quorum patterns inspired by both works.</p>
As Eli Gafni would say, the key to solving consensus is to first solve the adopt-commit problem 2</a></sup>. In fact, solving our consensus problem reduces to solving adopt-commit with a good-case latency of two message delays while sending $O(n^2)$ bits in the worst case. Essentially, each consensus view can be implemented using two consecutive instances of adopt-commit: the first to try to commit the leader’s proposal, the second to lock the committed value, if any, before the next view (we leave the details as an exercise to the reader). So we will now focus on solving adopt-commit, and this will clearly expose some interesting quorum patterns found in Forget-IT and IT-Kuplex.</p>
Adopt-Commit</h1>
Adopt-commit is a single-shot abstraction that, unlike consensus, is solvable asynchronously even with failures.</p>
In the adopt-commit problem, each party receives an input value and must eventually commit</em> or adopt</em> a unique value, such that:</p>

Validity: if a correct party commits or adopts $v$, then $v$ is the input of a correct party.</li>
Agreement: if a correct party commits $v$, then no correct party commits or adopts a different value.</li>
Unanimity: if all correct parties have the same input value, then they all commit that value.</li> </ul>
Here we consider a slightly relaxed variant of adopt-commit, which we call adopt-commit*, where parties do not terminate after adopting or committing a value and where we allow a party to adopt and then commit the same value. This simplifies the solution and works fine if the goal is to use it to implement consensus.</p>
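To make the three properties concrete, here is a minimal Python sketch of a checker that inspects the outcome of a finished execution. The function name `check_adopt_commit` and the data layout (dictionaries keyed by correct party) are assumptions made for illustration only; they do not come from Forget-IT or IT-Kuplex. Since unanimity is an eventual property, the check for it is only meaningful once the execution has run to completion.

```python
def check_adopt_commit(inputs, outputs):
    """Return the names of the adopt-commit properties violated by an outcome.

    inputs:  dict mapping each correct party to its input value.
    outputs: dict mapping each correct party to a list of
             ("commit" | "adopt", value) pairs (a list, because the relaxed
             variant lets a party adopt and later commit the same value).
    """
    violated = set()
    input_values = set(inputs.values())
    all_outputs = [out for outs in outputs.values() for out in outs]

    # Validity: every output value is the input of some correct party.
    if any(value not in input_values for _, value in all_outputs):
        violated.add("validity")

    # Agreement: if v is committed, nobody commits or adopts another value.
    committed = {value for kind, value in all_outputs if kind == "commit"}
    if any(value != c for c in committed for _, value in all_outputs):
        violated.add("agreement")

    # Unanimity: with a single common input, every correct party commits it.
    if len(input_values) == 1:
        (v,) = input_values
        if any(("commit", v) not in outputs.get(p, []) for p in inputs):
            violated.add("unanimity")

    return violated
```

For example, an outcome in which one party commits `"a"` while another adopts `"b"` is reported as an agreement violation, whereas adopting `"a"` and then committing `"a"` is accepted.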
Adopt-Commit in two message delays and $O(n^2)$ communication</h1>
Now let us consider a message-passing system consisting of $n$ parties among which at most $f=\lfloor\frac{n-1}{3}\rfloor$ may be Byzantine while the others are “correct” (meaning they follow the protocol and keep taking steps). We assume that the system is asynchronous: parties have no local clocks and message delay is unbounded. However, the network is reliable: every message sent from a correct party to a correct party is eventually received.</p>
Our task is to solve the adopt-commit* problem such that:</p>

If all correct parties receive the same input (the good case</em>), then they all commit in two message delays.</li>
In the worst case, correct parties together send at most $O(n^2)$ bits in total.</li> </ol>
Here is an algorithm. It evolves in two phases. First, each party broadcasts a vote for its input. Second, parties exchange three types of message to let each other know what votes they received:</p>

Commit messages.</em> A party sends a commit message for a value $v$ when it has received votes for $v$ from $n-f$ parties and it has not previously sent a no-core message or a commit or candidate message for a different value.</li>
Candidate messages.</em> A party sends a candidate message for a value $v$ when it has received votes for $v$ from $f+1$ parties (excluding parties that sent multiple votes) and it has not previously sent a commit message for a different value.</li>
No-core messages.</em> A party sends a no-core message when it has received votes from $n-f$ parties (excluding parties that sent multiple votes) and, among those votes, no value got votes from a core of $f+1$ parties, and the party has not previously sent a commit message.</li> </ul>
Finally, parties output according to the following rules:</p>

If a party receives commit messages for $v$ from $n-f$ parties, it commits $v$.</li>
If a party has not output yet and it receives commit or candidate messages for $v$ from $n-f$ parties, it adopts $v$.</li>
If a party has not output yet and it receives no-core messages from $n-f$ parties, it adopts its own input.</li> </ul>
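The sending rules of the second phase can be sketched as a small state machine for a single party. This is only an illustration under stated assumptions: the class `Party`, the method `receive_vote`, and the tuple encoding of messages are hypothetical names, equivocating voters are excluded from all three counts (the text above only requires this for candidate and no-core messages), and the no-core rule is evaluated over all votes counted so far.

```python
from collections import Counter


class Party:
    """Second-phase sending rules of one party (illustrative sketch)."""

    def __init__(self, n, f):
        self.n, self.f = n, f
        self.votes = {}              # sender -> value, non-equivocating senders
        self.equivocators = set()    # senders seen voting for two values
        self.commit_sent = None      # value of the commit message sent, if any
        self.candidates_sent = set() # values for which a candidate was sent
        self.no_core_sent = False

    def receive_vote(self, sender, value):
        """Record a vote; return the messages to broadcast in response."""
        if sender in self.equivocators:
            return []
        if sender in self.votes:
            if self.votes[sender] != value:
                # Conflicting votes: stop counting this sender altogether.
                del self.votes[sender]
                self.equivocators.add(sender)
            return []
        self.votes[sender] = value
        return self._apply_rules()

    def _apply_rules(self):
        out = []
        counts = Counter(self.votes.values())
        quorum, core = self.n - self.f, self.f + 1
        for v, c in counts.items():
            # Commit: n-f votes for v, and no no-core message and no
            # commit/candidate message for a different value sent before.
            if (c >= quorum and self.commit_sent is None
                    and not self.no_core_sent and self.candidates_sent <= {v}):
                self.commit_sent = v
                out.append(("commit", v))
            # Candidate: f+1 votes for v, no commit for a different value.
            if (c >= core and v not in self.candidates_sent
                    and self.commit_sent in (None, v)):
                self.candidates_sent.add(v)
                out.append(("candidate", v))
        # No-core: n-f votes counted, no value backed by f+1 of them,
        # and no commit message sent before.
        if (len(self.votes) >= quorum and max(counts.values()) < core
                and self.commit_sent is None and not self.no_core_sent):
            self.no_core_sent = True
            out.append(("no-core",))
        return out
```

With `n = 4` and `f = 1`, three votes for the same value produce a candidate message after the second vote and a commit message after the third, while three votes for three different values produce a no-core message.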
Correctness proof</h1>
We must show the following:</p>

The algorithm satisfies the validity and agreement properties.</li>
Every correct party eventually adopts or commits a value.</li>
If all correct parties have the same input, then they all commit in two message delays.</li>
Correct parties together send $O(n^2)$ bits in the worst case.</li> </ul>
First, let us define some vocabulary. Call sets of at least $n-f$ parties quorums</em> and sets of at least $f+1$ parties cores</em>. Say a set is correct when all its members are correct. The algorithm depends on the following properties of quorums and cores.</p>

Quorum intersection:</em> Every two quorums have a correct party in common (because $2(n-f)-n>f$).</li>
Quorum availability:</em> Correct parties form a quorum.</li>
Quorum validity:</em> Every quorum contains a core of correct parties (because $(n-f)-f=n-2f\geq f+1$).</li>
Core validity:</em> Every core contains a correct party (because a core has $f+1$ parties and at most $f$ are Byzantine).</li>
Core/quorum intersection:</em> Every correct core set and every quorum have a correct party in common (because $(f+1)+(n-f)-n>0$).</li> </ul>
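These five facts are easy to sanity-check by brute force for small systems. The sketch below is an illustration rather than part of the protocol: the function name `threshold_facts_hold` is made up, and it relies on two simplifying observations, namely that it is enough to consider quorums of exactly $n-f$ parties and cores of exactly $f+1$ parties (larger sets only intersect more), and that, by symmetry, any fixed set of $f$ parties can play the Byzantine role.

```python
from itertools import combinations


def threshold_facts_hold(n, f=None):
    """Check the five quorum/core facts for n parties, f of them Byzantine."""
    if f is None:
        f = (n - 1) // 3
    everyone = frozenset(range(n))
    byzantine = frozenset(range(f))  # by symmetry, any f parties will do
    correct = everyone - byzantine
    quorums = [frozenset(q) for q in combinations(everyone, n - f)]
    cores = [frozenset(c) for c in combinations(everyone, f + 1)]
    return all([
        # Quorum intersection: two quorums share a correct party.
        all((q1 & q2) - byzantine for q1 in quorums for q2 in quorums),
        # Quorum availability: the correct parties form a quorum.
        len(correct) >= n - f,
        # Quorum validity: every quorum contains a core of correct parties.
        all(len(q - byzantine) >= f + 1 for q in quorums),
        # Core validity: every core contains a correct party.
        all(c - byzantine for c in cores),
        # Core/quorum intersection: a correct core meets every quorum.
        all(c & q for c in cores if c <= correct for q in quorums),
    ])
```

Calling `threshold_facts_hold(n)` for small `n` exercises the default $f=\lfloor\frac{n-1}{3}\rfloor$, and passing a larger `f` explicitly (for instance `n=3, f=1`) shows quorum intersection failing.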
Validity and Agreement</h2>
Validity is trivial. Agreement is not too hard: Note that no correct party broadcasts both a commit message for a value $v$ and either a commit/candidate message for a different value or a no-core message (this follows purely from the local rules a party follows). Thus, by quorum intersection, it is impossible for a correct party to commit a value $v$ and another correct party to commit or adopt a different value.</p>
Unanimity and good-case latency</h2>
Next, let us show that, if all correct parties have the same input, then they all commit in two message delays. This covers both the unanimity property and the good-case latency. Suppose all correct parties have the same input $v$. It follows that:</p>

By core validity, no correct party ever broadcasts a candidate message for a value other than $v$.</li>
By quorum validity, no correct party ever broadcasts a commit message for a value other than $v$.</li>
By quorum validity, no correct party ever broadcasts a no-core message.</li> </ul>
Thus, by quorum availability, every correct party will eventually broadcast a commit message for $v$ and will eventually commit $v$ after two message delays.</p>
Liveness</h2>
It remains to show that every correct party eventually commits or adopts a value. Consider the following two exhaustive cases.</p>
First, assume there is a value $v$ such that at least $f+1$ correct parties (a core set) have input $v$. Then, by core/quorum intersection, no correct party broadcasts a commit message for a value other than $v$. Moreover, every correct party eventually receives those core votes and broadcasts a candidate message for $v$; by quorum availability, every correct party adopts $v$ unless it has already adopted a value.</p>
Second, assume there is no value $v$ such that at least $f+1$ correct parties have input $v$. Then, by quorum validity, no correct party ever broadcasts a commit message. Moreover, every correct party eventually receives the votes from all correct parties, which form a quorum in which no core voted for the same value. Hence, every correct party broadcasts a no-core message and then adopts its own input, unless it has already adopted a value.</p>
In both cases, we concluded that every correct party adopts a value, and so we are done.</p>
Communication complexity</h2>
Finally, we must show that correct parties only send $O(n^2)$ bits in total. Assuming values are of constant size, it suffices to observe that each party broadcasts at most six messages: one vote message, one no-core message, one commit message, and three candidate messages (each distinct candidate needs a disjoint core of $f+1$ non-equivocating voters; since $n\leq 3(f+1)$, there can be at most three).</p>
Note that, if we used $f<\lfloor\frac{n-1}{3}\rfloor$, we would increase the number of possible candidate messages. In the general case of $f<\frac{n}{3}$ we may get a linear number of candidate messages and an overall bit complexity of $O(n^3)$.</p>
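The counting argument can be spelled out as simple arithmetic. The helper names below are hypothetical; the sketch just evaluates how many pairwise-disjoint cores of $f+1$ voters fit among $n$ parties, which bounds the number of distinct candidate values, and adds the vote, no-core, and commit messages.

```python
def max_candidate_values(n, f):
    """Most values that can each gather f+1 votes from disjoint voter sets."""
    return n // (f + 1)


def worst_case_messages_per_party(n, f):
    """One vote, one no-core, one commit, plus one candidate per value."""
    return 3 + max_candidate_values(n, f)
```

With $f=\lfloor\frac{n-1}{3}\rfloor$ the first function never exceeds three, whereas fixing a small `f` while `n` grows (for instance `n=100, f=1`) yields a number of candidate values linear in `n`.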
Mechanically-checked proofs in Ivy</h2>
A formalization in Ivy is included below, and mechanically-checked proofs of safety and liveness are available at https://github.com/nano-o/2-step-sf-bft-adopt-commit</a>.</p>
Epilogue</h1>
So what is the key insight? I think it is the 3-way case split and the three corresponding message types:</p>

If all correct parties have the same input $v$, then every correct party gets a quorum of “commit” messages for $v$ in two message delays.</li>
If some value $v$ has a correct core ($f+1$ correct parties with the same input), then that core prevents commits for conflicting values, and we eventually get a quorum of candidate messages for $v$.</li>
If no correct core has the same input, then we eventually get a quorum of “no-core” messages.</li> </ul>
On top of that, local exclusion rules prevent a party from sending both a commit message and either a commit/candidate message for a different value or a no-core message; thus, by quorum intersection, we have agreement.</p>
Appendix: Ivy protocol model</h1>
For reference, here is the core Ivy model of the protocol.</p>
#lang ivy1.7

################################################################################
# Types
################################################################################

type party
type val

# The set types below abstract the threshold certificates from the writeup.
# Cardinalities are not encoded directly; the axioms capture the threshold facts
# used by the protocol for n >= 3f+1 and at most f Byzantine parties.

# A quorum represents at least n-f distinct senders.
type quorum

# A core represents at least f+1 distinct senders.
type core

################################################################################
# Immutable state and quorum theory
################################################################################

relation faulty(P:party)
relation quorum_member(Q:quorum, P:party)
relation core_member(S:core, P:party)

# Quorum intersection
axiom [quorum_intersection]
    exists P. ~faulty(P) & quorum_member(Q1, P) & quorum_member(Q2, P)

# Quorum availability
axiom [non_faulty_quorum]
    exists Q. forall P. quorum_member(Q, P) -> ~faulty(P)

# Every quorum contains a core of nonfaulty members.
axiom [quorum_contains_non_faulty_core]
    exists S. forall P. core_member(S, P) -> ~faulty(P) & quorum_member(Q, P)

# Every core has a nonfaulty member.
axiom [core_has_non_faulty_member]
    exists P. ~faulty(P) & core_member(S, P)

# A nonfaulty core intersects a quorum in a nonfaulty party.
axiom [core_quorum_intersection]
    (forall P. core_member(S, P) -> ~faulty(P)) ->
    exists P. ~faulty(P) & core_member(S, P) & quorum_member(Q, P)

################################################################################
# Mutable protocol state
################################################################################

# First-round value votes.
relation vote(P:party, V:val)

# Second-round support.
relation candidate(P:party, V:val)
relation commit(P:party, V:val)
relation no_core(P:party)

# Output state. Parties may both commit and adopt (the same value, obviously),
# and keep sending messages after output.
relation commit_out(P:party, V:val)
relation adopt_support_out(P:party, V:val)
relation adopt_no_core_out(P:party, V:val)

################################################################################
# Derived predicates
################################################################################

relation adopt_out(P:party, V:val)
definition adopt_out(P, V) = adopt_support_out(P, V) | adopt_no_core_out(P, V)

relation output(P:party, V:val)
definition output(P, V) = commit_out(P, V) | adopt_out(P, V)

relation started(P:party)
definition started(P) = exists V. vote(P, V)

relation correct_input(V:val)
definition correct_input(V) = exists P. ~faulty(P) & vote(P, V)

################################################################################
# Initialization
################################################################################

after init {
    vote(P, V) := false;
    candidate(P, V) := false;
    commit(P, V) := false;
    no_core(P) := false;
    commit_out(P, V) := false;
    adopt_support_out(P, V) := false;
    adopt_no_core_out(P, V) := false;
}

################################################################################
# Protocol actions
################################################################################

action start_step(p:party, v:val) = {
    require ~vote(p, V);
    vote(p, v) := true;
}

action commit_step(p:party, v:val, q:quorum) = {
    require quorum_member(q, P) -> vote(P, v);
    require ~commit(p, V);
    require ~no_core(p); # TODO: necessary?
    require candidate(p, V) -> V = v;
    commit(p, v) := true;
}

action candidate_step(p:party, v:val, w:core) = {
    require core_member(w, P) -> vote(P, v);
    require commit(p, V2) -> V2 = v;
    candidate(p, v) := true;
}

action no_core_step(p:party, q:quorum) = {
    require ~commit(p, V);
    require quorum_member(q, P) -> started(P);
    require (forall P . core_member(B, P) -> quorum_member(q, P)) ->
            (exists P . core_member(B, P) & ~vote(P,V));
    no_core(p) := true;
}

action output_commit_step(p:party, v:val, q:quorum) = {
    require quorum_member(q, P) -> commit(P, v);
    commit_out(p, v) := true;
}

action output_adopt_step(p:party, v:val, q:quorum) = {
    require ~(adopt_out(p, V) | commit_out(p, V));
    require quorum_member(q, P) -> candidate(P, v) | commit(P, v);
    adopt_support_out(p, v) := true;
}

action output_no_core_step(p:party, v:val, q:quorum) = {
    require ~(adopt_out(p, V) | commit_out(p, V));
    require vote(p, v);
    require quorum_member(q, P) -> no_core(P);
    adopt_no_core_out(p, v) := true;
}

# Byzantine parties may equivocate and may change their sent-message relations
# arbitrarily.
action byz_party(p:party) = {
    require faulty(p);
    vote(p, V) := *;
    candidate(p, V) := *;
    commit(p, V) := *;
    no_core(p) := *;
}
</code></pre>
A solution that commits in three message delays in the good case but sends $O(n^3)$ bits per view in the worst case appears in Chapter 3 of the PhD thesis of Miguel Castro</a>. Solutions with $O(n^2)$ bits per view but more than three message delays in the good case include TetraBFT</a> and IT-HS</a>.</p> </li>
Adopt-commit was first presented at PODC 1998 by Eli Gafni in Round-by-round fault detectors: unifying synchrony and asynchrony</em></a>.</p> </li> </ol> </section>

Streamlet in TLA+

2022-01-04T00:00:00+00:00

In this blog post, we will see how to specify the Streamlet algorithm in PlusCal/TLA+ with a focus on writing simple specifications that are amenable to model-checking of both safety and liveness properties with TLC.</p>

You can find the source code at https://github.com/nano-o/streamlet</a>.</p>

Context and results</h1>
The Streamlet blockchain-consensus algorithm</a> is arguably one of the simplest blockchain-consensus algorithms. What makes Streamlet simple is that there are only two types of messages (leader proposals and votes) and processes repeat the same, simple propose-vote pattern ad infinitum. In contrast, protocols like Paxos or PBFT alternate between two sub-protocols: one for the normal case, when things go well, and a view-change sub-protocol to recover from failures.</p>
The proofs in the Streamlet paper use the operational reasoning style, where we consider an entire execution and try to reason about the possible ordering of events in order to show by case analysis that the algorithm is correct. In my experience, this proof style is very error-prone, and it is not easy to make sure that no case was overlooked. I would prefer a proof based on inductive invariants, but that’s a discussion that’s off-topic for this post.</p>
Instead, in this post, we will specify the Streamlet algorithm in PlusCal/TLA+ and use the TLC model-checker to verify its safety and liveness properties in small but non-trivial configurations. Moreover, the specifications I present are also examples of:</p>

how to use non-determinism to obtain simple specifications</li>
how to exploit the commutativity of actions to speed-up model-checking by sequentializing the specification.</li> </ul>
I was able to exhaustively check the safety and liveness properties of (crash-stop) Streamlet in interesting configurations:</p>

with 3 processes, 2 block payloads, and 7 asynchronous epochs;</li>
with 3 processes, 2 block payloads, and 5 asynchronous epochs followed by 4 synchronous epochs (i.e. “GST” happens at the start of epoch 6).</li> </ul>
Those results give me very high confidence that Streamlet satisfies its claimed properties.</p>
We’ll also see that TLC verifies that, in all configurations checked, Streamlet guarantees that a new block gets finalized in 4 synchronous epochs. This is better than the bound of 5 epochs proved in the Streamlet paper, and I believe that a bound of 4 holds in general.</p>
The Streamlet algorithm</h1>
The goal of the Streamlet algorithm is to enable a fixed set of N</code> processes in a message-passing network to iteratively construct a unique and ever-growing blockchain. Although many such algorithms existed before Streamlet, Streamlet is striking because of the simplicity of the rules that processes must follow.</p>
Streamlet can tolerate malicious, Byzantine processes, but, to simplify things, here we consider only crash-stop faults and we assume that, in every execution, a strict majority of the processes do not fail. As is customary, we refer to a strict majority as a quorum.</p>
The protocol evolves in consecutive epochs numbered 1,2,3,… during which processes vote for blocks according to the rules below.</p>
A block consists of a hash of a previous block, an epoch number, and a payload (e.g. a set of transactions); moreover, a special, unique genesis block has epoch number 0. Thus a set of blocks forms a directed graph such that (b1,b2)</code> is an edge if and only if b2</code> contains the hash of block b1</code>. We say that a set of blocks forms a valid block tree when the directed graph formed by the blocks is a tree rooted at the genesis block. A valid blockchain (or simply a chain for short) is a valid block tree in which every block has at most one successor, i.e. in which there are no forks.</p>
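As an illustration of these definitions, here is a minimal Python sketch that checks whether a set of blocks forms a valid block tree or a chain. The representation is an assumption made for the example: each non-genesis block is identified by a string, the dictionary `parent` maps a block to the block whose hash it contains, and `GENESIS` is a made-up identifier for the genesis block. Epoch numbers and payloads are left out because they play no role in these two definitions.

```python
from collections import Counter

GENESIS = "genesis"  # hypothetical identifier standing in for the genesis block


def is_valid_block_tree(parent):
    """parent maps each non-genesis block to the block whose hash it contains.

    The set is a valid block tree if every block reaches the genesis block
    by following parent links through blocks that are in the set.
    """
    for block in parent:
        seen = set()
        while block != GENESIS:
            if block in seen or block not in parent:
                return False  # a cycle, or an ancestor missing from the set
            seen.add(block)
            block = parent[block]
    return True


def is_chain(parent):
    """A chain is a valid block tree in which no block has two successors."""
    successors = Counter(parent.values())
    return is_valid_block_tree(parent) and all(
        count <= 1 for count in successors.values())
```

For instance, `{"b1": GENESIS, "b2": "b1", "b3": "b1"}` is a valid block tree with a fork at `b1`, so it is not a chain, while `{"b2": "b1"}` is not a valid block tree because `b1` is missing.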
Each epoch e</code> has a unique, pre-determined leader (e.g. process (e mod N)+1</code>), and processes in epoch e must follow the following rules:</p>
The leader proposes a new block with epoch number e</code> that extends one of the longest notarized chains that the leader knows of (where notarized is defined below).</li>
Every process votes for the leader’s proposal as long as the proposal is longer than the longest notarized chain that the process ever voted to extend.1</a></sup></li>
A block is notarized when it has gathered votes from a quorum in the same epoch, and a chain is notarized when all its blocks, except the genesis block, are notarized.</li>
When a notarized chain includes three adjacent blocks with consecutive epoch numbers, the prefix of the chain up to the second of those 3 blocks is considered final.</li> </ul>
Processes proceed from one epoch to the next through unspecified means. In practice, a process may increment its epoch using a real-time clock (e.g. each epoch lasting 2 seconds), or, even though I don’t think this is discussed in the Streamlet paper, processes may use a synchronizer sub-protocol. The synchronizer approach is more robust than simply relying on clocks, and it is used by many deployed protocols. Surprisingly, it dates back to the pioneering work of Dwork, Lynch, and Stockmeyer</a> in the 1980s. For a recent treatment, see Gotsman et al.</a>.</p>
Example scenario</h2>
The following drawing illustrates a possible blocktree built by the Streamlet algorithm. The tree vertices represent blocks with their epoch number and payloads omitted (we can assume that payloads are all different). A vertex with a circle around it represents a notarized block, while a vertex without a circle is a block that has been voted for by at least one process but is not notarized. 
The longest finalized blockchain is represented with full arrows, while other arrows are dotted.</p>
We can see that the block with epoch 2 is notarized, thus finalizing the block with epoch 1, but then the leader of epoch 3 did not notice that block 2 was notarized and instead extended the block with epoch 1 with a block with epoch 3. This causes a fork in the tree of notarized blocks, but not a fork in the finalized blockchain. Blocks with epoch number 3 and 4 in the finalized chain are final because of the notarized block with epoch 5.</p>
Safety guarantee</h2>
The algorithm guarantees that if two chains are final, then one is a prefix of the other. This is the consistency property of the algorithm.</p>
Note that the consistency guarantee holds even if processes proceed through epochs at different speeds and may not be in the same epoch at the same time.</p>
Liveness guarantee</h2>
In an asynchronous network, Streamlet cannot guarantee that a block will ever get finalized. This is because the consensus problem is famously unsolvable in an asynchronous network. Instead, to guarantee liveness properties, we must make additional assumptions. To do so, first define a synchronous epoch as an epoch in which all non-faulty processes receive each other’s messages before the end of the epoch, and in which the leader is not faulty.</p>
We can now state Streamlet’s liveness guarantee: The Streamlet algorithm guarantees that, after 4 consecutive synchronous epochs, a new block gets finalized.</p>
Note that, according to the usual definitions of liveness and safety used in the academic field of distributed computing, this is a safety property because it can be violated in a bounded execution. 
But, as in the Streamlet paper, we’ll call it liveness anyway.</p>
Note that the Streamlet paper proves that 5 synchronous epochs suffice to guarantee finalizing one more block, but I believe this is overly conservative and that 4 epochs suffice.</p>
Streamlet in PlusCal/TLA+</h1>
Blocks</h2>
Processes in the Streamlet algorithm can be seen as building a block tree out of which emerges a unique, finalized blockchain. Thus, we need to model blocks, block trees, and blockchains.</p>
Except for the genesis block, a block consists of the hash of its parent block, an epoch, and a payload. Thus, assuming that there are no hash collisions, a block uniquely determines all its ancestors up to the genesis block, or, equivalently, a unique sequence of epoch-payload pairs. We could model a block as a recursive data structure containing its parent. However, to make things simpler, we model a block as a sequence of pairs, each containing an epoch and a payload. No information is lost in the process: in both cases, a block determines a unique sequence of epoch-payload pairs.</p>
For example, this is a block in TLA+ notation:</p>
<< <<1, tx1>>, <<3, tx3>>, <<4, tx4>> >> </code></pre>
This TLA+ block models a real block consisting of epoch 4, payload tx4</code>, and the hash of a previous block with epoch 3, payload tx3</code>, and a hash of a previous block with epoch 1, payload tx1</code>, and the hash of the genesis block.</p>
In this model of blocks, a block tree is a prefix-closed set of blocks, and a blockchain is a block tree without branching. Moreover, we can extend a block b</code> just by appending an epoch-payload tuple to it. Finally, the genesis block is the empty sequence <<>></code>.</p>
We now define the epoch of a block b</code> as 0</code> if b</code> is the genesis block and otherwise as the epoch found in the last tuple in b</code>. 
In TLA+, this translates to:</p>
Epoch(b) == IF b = Genesis THEN 0 ELSE b[Len(b)][1] </code></pre>
Moreover, the parent of a block b</code> is the genesis block if b</code> has length 1, and otherwise it’s the block obtained by removing the last element of b</code>. In TLA+:</p>
Parent(b) == IF Len(b) = 1 THEN Genesis ELSE SubSeq(b, 1, Len(b)-1) </code></pre>
First specification</h2>
We start with a specification that makes use of non-determinism in order to eschew irrelevant details and capture the essence of how Streamlet ensures safety. The specification, appearing below and in Streamlet.tla</a>, is very short. It consists of a mere 44 numbered lines of PlusCal; the display below uses a few unnumbered continuation lines to keep the listing readable.</p>
Listing 1: A compact PlusCal model of Streamlet safety</figcaption>
    CONSTANTS
          P \* The set of processes
        , Tx \* Transaction sets (the payload in a block)
        , MaxEpoch
        , Quorum \* The set of quorums
        , Leader(_) \* Leader(e) is the leader of epoch e
 1  --algorithm Streamlet {
 2      variables
 3          vote = [p \in P |-> {}], \* the votes cast by the processes
 4          proposal = [e \in E |-> <<>>];
 5      define {
 6          E == 1..MaxEpoch \* the set of epochs
 7          Genesis == <<>>
 8          Epoch(b) == \* the epoch of a block
 9              IF b = Genesis
10              THEN 0 \* the root is by convention a block with epoch 0
11              ELSE b[Len(b)][1]
12          Parent(b) == \* the parent of a block
                IF Len(b) = 1 THEN Genesis ELSE SubSeq(b, 1, Len(b)-1)
13          Blocks == UNION {vote[p] : p \in P}
14          Notarized == {Genesis} \cup \* Genesis is considered notarized by default
15              { b \in Blocks : \E Q \in Quorum : \A p \in Q : b \in vote[p] }
16          Final(b) ==
17              /\ \E tx \in Tx : Append(b, <<Epoch(b)+1, tx>>) \in Notarized
18              /\ Epoch(Parent(b)) = Epoch(b)-1
19          Safety == \A b1,b2 \in {b \in Blocks : Final(b)} :
20              Len(b1) <= Len(b2) => b1 = SubSeq(b2, 1, Len(b1))
21      }
22      process (proc \in P)
23          variables
24              epoch = 1, \* the current epoch of p
25              height = 0; \* height of the longest notarized chain that p voted to extend
26      {
27  l1:     while (epoch \in E) {
28              \* if leader, make a proposal:
29              if (Leader(epoch) = self) {
30                  with ( parent \in { b \in Notarized : height <= Len(b) /\ Epoch(b) <= epoch },
31                          tx \in Tx, b = Append(parent, <<epoch, tx>>) )
32                      proposal[epoch] := b
33              };
34              \* next, either vote for the leader's proposal or skip:
35              either {
36                  when Len(proposal[epoch]) > height;
37                  vote[self] := @ \cup {proposal[epoch]};
38                  height := Len(proposal[epoch])-1
39              } or skip;
40              \* finally, go to the next epoch:
41              epoch := epoch + 1;
42          }
43      }
44  } </code></pre> </figure>
Let me now describe the specification informally.</p>
We have two global variables, vote</code> and proposal</code>, and two process-local variables, epoch</code> and height</code>, with the following meaning:</p>
For every process p</code>, vote[p]</code> is the set of all votes cast by p</code> so far.</li>
For every epoch e</code>, proposal[e]</code> is the leader’s proposal for epoch e</code> unless proposal[e]</code> is empty (i.e. equal to the empty sequence <<>></code>).</li>
For every process, the local variable epoch</code> is the current epoch of the process.</li>
For every process, the local variable height</code> is the height of the longest notarized block that the process ever voted to extend.</li> </ul>
Lines 6 to 20, we make a few useful definitions, most notably:</p>
Line 14, a block is notarized when a quorum unanimously voted for it.</li>
Line 16, a block b</code> is final when:
line 17, b</code> has a notarized child with epoch Epoch(b)+1</code>, and</li>
line 18, the epoch of b</code>’s parent is Epoch(b)-1</code></li> </ul> </li> </ul>
Finally, line 19, the algorithm is safe when every two final blocks are the same up to the length of the shortest.</li> </ul>
Now consider a process we will call self</code> at line 27, when self</code> is just starting its current epoch epoch</code>.</p>
First, lines 29 to 33, if self</code> is the leader of the current epoch, self</code> picks a notarized block whose length is at least height</code>, extends it with a new payload tx</code>, and proposes it for the epoch. This is an example of how we use non-determinism to abstract over irrelevant details: In the original Streamlet algorithm, a process creates a proposal by extending one of the longest notarized chains it knows of. Here we abstract over this rule by allowing the process to pick an arbitrary notarized block. This is a sound abstraction because it does not restrict the behaviors of the algorithm (in fact, it may add new behaviors).</p>
Next, line 35, self</code> either votes for the leader’s proposal or skips voting in this epoch:</p>
Line 36, self</code> checks that the proposal extends a block whose height is at least the height of the last block that self</code> voted to extend. 
If so, self</code> votes for the proposal (line 37) and updates height</code> to reflect the fact that it just voted to extend a block of height equal to the length of the proposal minus one.</li>
Alternatively, line 39, self</code> skips voting in this epoch. This models the case in which self</code> did not receive the leader’s proposal, or the proposal is not at least of height equal to the height of the longest notarized block that self</code> ever voted to extend.</li> </ul>
Finally, line 41, self</code> goes to the next epoch.</p>
Model-checking results</h3>
With TLC, I was able to exhaustively model-check that the Safety</code> property holds for 3 processes, 2 payloads, and 6 epochs. This was done on a 24 core Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz</code> with 40GB of memory allocated to TLC, and it took about 4 hours and 20 minutes.</p>
I think this is an interesting configuration because we have multiple quorums (sets of 2 processes at least), branching even within a single epoch (because of the 2 payloads), and enough epochs to obtain finalized chains containing non-consecutive epoch numbers and having notarized but non-final branches. Thus, the model-checking results give me high confidence that the Streamlet algorithm is indeed safe.</p>
Liveness and sequentialization</h2>
To model-check the liveness property of Streamlet, we must modify our specification and introduce synchronous epochs. Moreover, we’ll need to be able to exhaustively model-check the specification for more than 6 epochs to get meaningful results. This is because the liveness property states that a new block must be finalized after 5 synchronous epochs. To truly test this claim, we need to check that it holds even after a few asynchronous epochs have had the chance to wreak as much havoc as possible. Just taking 3 asynchronous epochs gives us 3+5=8 epochs to check. 
Thus, we had better find a way to speed up model-checking.</p>
As we will next see, we can take advantage of the commutativity of some steps in the protocol to reduce the problem to checking only a restricted set of canonical executions. This is very effective and will allow us to exhaustively check that, even after 5 totally asynchronous epochs, Streamlet guarantees that a new block is finalized after 5 synchronous epochs. In fact, we’ll see that it only takes 4 synchronous epochs for a new block to be finalized. I believe that this is true in general, and that the bound of 5 proved in the Streamlet paper is overly conservative.</p>
Sequentialization</h3>
Consider an execution e</code> of Streamlet and two steps s1</code> and s2</code> of two different processes p1</code> and p2</code> such that s1</code> occurs right before s2</code>, s1</code> is a step of epoch e1</code>, s2</code> is a step of epoch e2</code>, and e2<e1</code>. Note that the global state written by s1</code> is never read by s2</code> because a process in epoch e2</code> only uses information from epochs smaller or equal to e2</code>. Moreover, if step s1</code> can take place at some point in an execution, adding more steps of other processes in epochs lower than e1</code> before s1</code> never prevents s1</code> from taking place (i.e. the protocol is monotonic).</p>
The previous paragraph shows that we can reorder all steps in an execution e1</code> to obtain a new execution e2</code> in which all steps of epoch 1</code> happen first, then all steps of epoch 2, then all steps of epoch 3, etc. Crucially, the end states of the system in e1</code> and e2</code> are the same. Thus, if we prove that all executions like e2</code>, which we call sequentialized executions, are safe and live, then we can conclude that all executions are safe and live. 
This is because we express safety and liveness as state predicates, and, by our crucial observation above, restricting ourselves to sequentialized executions does not change the set of reachable states.</p>
Moreover, with a slightly more complex justification, we can also reorder the steps of different processes within the same epoch as long as the leader always takes the first step. This means that we can schedule processes completely deterministically, as in a sequential program, without losing any reachable states.</p>
This is what we do in the specification SequentializedStreamlet.tla</code></a>. There, we specify a scheduler that schedules all processes deterministically. The result is that the set of behaviors that the TLC model checker must explore is drastically reduced. For example, it takes only about 15 minutes to exhaustively explore all executions with 3 processes, 2 payloads, and 6 epochs; in contrast, it took about 4 hours and 20 minutes with the previous specification.</p>
Note that this style of reduction is well-known and was used by Dwork, Lynch, and Stockmeyer</a> in 1984 in order to simplify reasoning about their algorithms. Several recent frameworks use this type of reduction to help engineers design and verify their algorithms. For example, PSync</a> provides a programming language to develop consensus algorithms directly in a model somewhat similar to the sequential model we used, and an efficient runtime system to deploy such algorithms. We have taken a rather ad hoc and informally justified approach to our sequentialization of Streamlet. In contrast, methods such as inductive sequentialization</a>, supported by the Civl verifier</a>, offer a principled approach to applying such reductions.</p>
Expressing the liveness property</h3>
To check the liveness property, we must first have a way to specify that epochs become synchronous after a given, fixed epoch. 
To this end, we introduce a constant GSE</code> (for global synchronization epoch) and we add constraints that model the fact that all epochs from GSE</code> onward are synchronous.</p>
Remember that in a synchronous epoch, every node receives the leader’s proposal and every node receives all the votes of the other nodes. This is reflected as follows in the PlusCal/TLA+ specification:</p>
In epoch GSE</code> and after, nodes do not skip the epoch and vote for the leader’s proposal if the proposal is longer than the longest chain that the node has ever voted to extend.</li>
In epoch GSE</code>, the leader makes a proposal, but the proposal doesn’t necessarily extend a longest notarized chain because, even though the leader must receive all previous votes by the end of epoch GSE</code>, it might not yet have received them by the time it makes its proposal.</li>
In epoch GSE+1</code> and above, the leader proposes to extend one of the longest notarized chains.</li> </ul>
Given the above, we now state the liveness property as:</p>
Liveness == (epoch = GSE+4) => \E b \in Blocks : Final(b) /\ Epoch(b) >= GSE-1 </code></pre>
In English, this states that by the beginning of epoch GSE+4</code>, there is a final block whose epoch is greater than or equal to GSE-1</code>. You might wonder where this constraint on the block’s epoch comes from. The answer is that we want to show that a new</em> block, i.e. a block which was not final when epoch GSE</code> started, is now final. It is easy to see that no block with an epoch greater than or equal to e-1</code> can be final when epoch e</code> starts. Thus, any final block with an epoch greater than or equal to GSE-1</code> was not final when epoch GSE</code> started and can be considered “new”.</p>
The sequentialized specification with liveness</h3>
Omitting definitions that are the same as before, here is the sequentialized specification of Streamlet. 
You can also find it in SequentializedStreamlet.tla</a></p>
Listing 2: Sequentialized Streamlet with the liveness check</figcaption>
--algorithm Streamlet {
    variables
        height = [p \in P |-> 0], \* height of the longest notarized chain p voted to extend
        votes = [p \in P |-> {}], \* the votes cast by the processes
        epoch = 1, \* the current epoch
        scheduled = {}, \* processes already scheduled in the current epoch
        proposal = <<>>; \* the leader's proposal for the current epoch
    define {
        NextProc ==
            IF scheduled = {}
            THEN CHOOSE p \in P : Leader(epoch) = p
            ELSE CHOOSE p \in P : \neg p \in scheduled
        \* It takes at most 4 epochs to finalize a new block:
        Liveness == (epoch = GSE+4) => \E b \in Blocks : Final(b) /\ Epoch(b) >= GSE-1
    }
    process (scheduler \in {"sched"})
    {
l1:     while (epoch \in E) {
            with (proc = NextProc) {
                \* if proc is leader, make a proposal:
                if (Leader(epoch) = proc)
                    with ( parent \in { b \in Notarized : height[proc] <= Len(b) /\ Epoch(b) <= epoch },
                           tx \in Tx, b = Append(parent, <<epoch, tx>>) ) {
                        \* After the first synchronous epoch, the leader can pick a highest notarized block:
                        when epoch > GSE => \A b2 \in Notarized : Len(b2) <= Len(parent);
                        proposal := b
                    };
                \* next, if possible, vote for the leader's proposal:
                either if (height[proc] <= Len(proposal)-1) {
                    votes[proc] := @ \cup {proposal};
                    height[proc] := Len(proposal)-1
                }
                or {
                    when epoch < GSE; \* Before GSE, we may miss the leader's proposal
                    skip
                };
                \* go to the next epoch if all processes have been scheduled:
                if (scheduled \cup {proc} = P) {
                    scheduled := {};
                    epoch := epoch+1
                }
                else
                    scheduled := scheduled \cup {proc}
            }
        }
    }
}
</code></pre> </figure>
Model-checking results</h3>
I was able to exhaustively check the liveness property with 3 crash-stop processes, 2 block payloads, and 9 epochs among which the first 5 are asynchronous while the remaining 4 are synchronous (i.e. 
“GST” happens before epoch 6).</p>
As before, this was done on a 24-core Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz</code> with 40GB of memory reserved for TLC. It took one hour and four minutes to complete and TLC found 320,821,303 distinct states for a depth of 29 steps.</p>
I was also able to exhaustively check safety for 3 processes, 2 payloads, and 7 asynchronous epochs. This took 16 hours and TLC found 3,262,833,142 distinct states for a depth of 23 steps.</p>
Related work</h1>
There is another excellent PlusCal/TLA+ specification of the crash-fault Streamlet algorithm described in Murat’s blog</a>.</p>
Compared to the present specification, this earlier specification uses a shared-whiteboard model of messages in which all processes receive a given message at the same time. This means that processes always have the same view of the system, which precludes some interesting behaviors of the Streamlet algorithm. In contrast, the specifications that I present reflect the fact that processes have different, partial views of which blocks have been notarized.</p>
Shir Cohen and Dahlia Malkhi compare Streamlet and HotStuff in the following blog post</a>. They note that Streamlet lacks some of the qualities of an engineering-ready protocol like HotStuff. While their blog post is very interesting, I do not agree with the claim that Streamlet requires synchronized epochs. The original Streamlet presentation indeed stipulates that processes proceed through epochs in lock-step (using synchronized real-time clocks). However, any synchronizer that guarantees that epochs become synchronous after GST (in the sense that we have used in the current post) should work too.</p>
Other notes</h1>
The rule that the leader uses to pick a block to extend can be slightly improved. In the original Streamlet, the leader proposes a new block that extends one of the longest notarized chains that the leader knows of. 
However, some executions would finalize a new block faster if the leader instead picked the notarized block with the highest epoch that it knows of.</p>
This deviates slightly from the original formulation and makes the specification simpler without any downsides ↩</a></p> </li> </ol> </section>
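To illustrate the leader rule discussed under "Other notes", here is a minimal Python sketch (not part of the PlusCal/TLA+ specifications) that contrasts the two ways of choosing a parent block. As in the specification, a block is modelled as the whole chain of (epoch, payload) pairs. The helper names `epoch_of`, `longest_rule`, and `highest_epoch_rule`, the convention that the empty chain has epoch 0, and the example chains are all assumptions made for illustration.

```python
from typing import Set, Tuple

# A block is modelled as the chain it ends: a tuple of (epoch, payload) pairs.
Block = Tuple[Tuple[int, str], ...]


def epoch_of(block: Block) -> int:
    """Epoch of the last entry; the empty chain gets epoch 0 (an assumption)."""
    return block[-1][0] if block else 0


def longest_rule(known: Set[Block]) -> Set[Block]:
    """Original rule: candidate parents are the longest notarized chains known."""
    max_len = max(len(b) for b in known)
    return {b for b in known if len(b) == max_len}


def highest_epoch_rule(known: Set[Block]) -> Set[Block]:
    """Suggested variant: candidate parents are the known notarized chains of highest epoch."""
    max_epoch = max(epoch_of(b) for b in known)
    return {b for b in known if epoch_of(b) == max_epoch}


# Hypothetical set of notarized chains known to the leader of epoch 5:
# x was notarized in epoch 1, and both y (epoch 3) and z (epoch 4) extend x.
genesis: Block = ()
x: Block = ((1, "a"),)
y: Block = x + ((3, "c"),)
z: Block = x + ((4, "d"),)
known = {genesis, x, y, z}

print(longest_rule(known) == {y, z})     # both chains of length 2 are candidates
print(highest_epoch_rule(known) == {z})  # only the chain ending in epoch 4 is a candidate
```

In this example the original rule lets the leader of epoch 5 extend either `y` or `z`, while the variant forces `z`. Recall that Streamlet finalizes on three adjacent blocks with consecutive epochs. Extending `z` gives a chain ending in epochs 4 and 5, so one more notarized block in epoch 6 is enough. Extending `y` leaves a gap between epochs 3 and 5, so finalization would have to wait for epochs 5, 6, and 7.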

Giuliano Losa - Blog


Mechanically-checked proofs in Ivy</h2> A formalization in Ivy is included below, and mechanically-checked proofs of safety and liveness are available at https://github.com/nano-o/2-step-sf-bft-adopt-commit</a>.</p>

Appendix: Ivy protocol model</h1> For reference, here is the core Ivy model of the protocol.</p>

Streamlet in TLA+

Streamlet in PlusCal/TLA+</h1>
