Breaking Down Vlad’s Talk on CBC Consensus and Safety Proofs

Original Video


Vlad Zamfir is a researcher and "consensus protocol engineer" who works for the Ethereum Foundation. He's the lead project manager of the Casper Client, which will be responsible for the transition from Proof of Work (PoW) to Proof of Stake (PoS).

Casper the Friendly Finality Gadget (FFG) is being developed by Vitalik, and is the hybrid PoW/PoS system that we will likely see before full-PoS Casper the Friendly GHOST (Greedy Heaviest-Observed Sub-Tree), which is what Vlad's working on.

You will want to know about PoW and PoS in order to understand the benefit of what Vlad is describing in this video, but it's not necessary to understand the protocol itself. You will also want to know what a consensus protocol is, and why they are important in the context of blockchains.

Correct-by-Construction (CBC) Consensus is a branch of research Vlad is working on that helps to define consensus protocols with basic minimum properties that allow for proof of consensus safety (more on this later). The idea is to create a framework from which generic consensus protocols can be derived and modified without having to re-prove their base functionality.

The timestamps are used to anchor my explanation to roughly where Vlad is describing the concept in the video.

The Estimator

All CBC protocols satisfy this safety proof.


First, Vlad defines two things: objects (protocol states), and morphisms between objects (protocol state transitions). It is stated that if a protocol state A has a protocol state transition to B and then again to state C, (A -> B -> C), then there is a protocol state transition A -> C. Also, every protocol state has an inherent protocol state transition to itself (A->A).

To bring this into reality, we can envision a protocol state where address A has 1 Ether in it. The first protocol state transition has address A sending .5 Ether to address B (A -> B). Note that the protocol state B says: address A has .5 Ether and address B has .5 Ether. From state B, we have another state transition where B sends C .25 Ether. State C says A has .5, B has .25, and C has .25. What Vlad is saying here is that there is also a transition directly from A -> C directly.


Next, Vlad describes "The Estimator", a theoretical map between the protocol states & transitions to propositions about the consensus- ε : |Σ| -> Prop(C)

ε is the symbol for the estimator. The : reads as "such that", |Σ| is defined as being all the protocol states and protocol state transitions (as described above), and -> Prop(C) reads "leads to propositions about the consensus, C".

If we put it all together we get: "The estimator ε exists such that : the protocol states and protocol state transitions |Σ| lead to propositions about the consensus -> Prop(C).

What this means is that the estimator is able to look at the protocol state & state transitions and make statements about the state of the consensus. These statements can be something like "the consensus is zero, or the consensus is 1. The block at this height has this hash, or the block at that height has that hash", or they can be weaker statements like "the value of the consensus has some property".

The estimator is sort of like the fork-choice rule that maps sets of blocks to a blockchain. In other words, the estimator represents the "guesses" a node makes about the value of the consensus on any given protocol state.

The CBC Safety Proof


Assuming the estimator is producing propositions about the consensus of protocol states, Vlad defines the notion of "safety" of these propositions – S(p,σ) <=> ∀ σ', σ->σ' ε(σ') -> P

This reads: "the safety S that some proposition p at some protocol state σ, or S(p,σ), is safe if and only if (iff) <=> the following holds: for any future protocol state σ', where the current protocol state can evolve to this future state σ -> σ', the property P also holds for that state ε(σ') -> P is true.

In other words, if there is a proposition that a property that holds for every protocol state in the future, then that proposition is going to be called "safe", or "a value of the estimator is "safe" iff it holds for all future protocol states".

For example, if a block at height 10 has some hash H, and that block has the same hash H in all future states of the protocol H' = H, the proposition "the block at height 10 has hash H" is safe.


Imagine we have some present state and a safe proposition about this state S(p,σ_1). If we look at a future state that evolves from the present one σ_1->σ', we have a lemma (intermediary proof) that lets us know that the proposition at this future state is also safe S(p,σ') is true. This means that in future states of this future state σ'->σ'', the safety of the proposition still holds (since future states of future states are still just future states of the original, as defined by what we said above about A->B->C = A->C).

Now imagine we have another current state σ_2 that also evolves to the same future state, σ_2 -> σ'. The difference this time is that the proposition about σ_2 is the negation of our original proposition ¬p. Since it's impossible to have a state where both p and ¬p are true (a property of the estimator itself), the safety of this negated proposition is false S(¬p, σ_2) is false, which is the same as ¬S(¬p, σ_2) is true.

We can use this to look at some state that has safe propositions, and determine that past states that make propositions that negate the propositions of the current state are not safe.

Thus, if we have two current states σ_1 and σ_2 that both point to a future state σ' (which is safe), but the propositions of σ_2 negate the propositions of σ_1, and σ_1's propositions are safe, we know σ_2's propositions are not safe. This information is extremely important because all of the consensus protocols in the CBC paradigm only operate on states with safety.

The whole point of this proof is to show a case where non-safety can be determined. We need to know which values are not safe to determine which ones are safe, and we determine that a future state is not safe if it includes two contradictory propositions from two different current states.

The next part of this is to guarantee that nodes have a common protocol future as long as there are less than some number of byzantine faults (BFs). For reference, a byzantine fault is essentially an issue where an outcome can appear to be failed to some observers and successful to others. It's difficult to declare an error unless consensus is reached about where the fault is. See this.

If we can guarantee that nodes have a common protocol state given some tolerance to BFs, then we can use this protocol state to make safe propositions, thus ruling out contradictory past protocol states.

10:02 – a quick summary

This is pretty much the whole basic setup. From here you can start to define how the estimator works, what protocol states even are, etc., and you still have this notion of safety. In Vlad's words, "it's pretty cool".


12:00 – What is Vlad doing and why is it important

Vlad has essentially created a process to create consensus protocols that help guarantee safe estimates. These types of consensus protocols are very useful for things like Proof of Stake – which has lots of benefits. "When you have consensus protocols, you don't need to think in the distributed fashion as much when you're designing decentralized systems."

13:21 – How does this apply to "long-range" attacks

The long-range attack is an economic problem moreso than a consensus problem. Nodes that are un-bonded are not incentivized to not double-spend, so we expect the BFT rates to be much higher for old states than new ones.

Vlad then clarifies that the consensus safety proof applies to local consensus rather than distributed consensus with high rates of BFs. That's because a system with a high BF rate can result in bivalent outcomes – where there are two consensus safe outcomes (locally), but the overall consensus has failed. This is where the economic incentives to punish malicious BFs comes in (slashing conditions in PoS).

17:03 – is there a formal framework for this that includes the economic aspect

There is the protocol design, and then there's the analysis of the protocol is used in the real world. We know that there are limits to the amount of stuff we can account for in the protocol design itself, but models help to determine worse-case scenarios and these can be used to best incentivize consensus. An example of where the protocol fails is when you have a smart contract that wants to pay Alice to send a message to Bob, but if Alice fails to send the message or Bob fails to submit proof that he receive the message, the contract doesn't know who to blame.

Part of properly incentivizing consensus is to penalize BFs that result in consensus failure, but not penalize BFs that don't. If you penalize all BFs, you discourage participation which lowers the safety of the network, so you need to find a way to distinguish malicious BFs from regular ones.

21:05 – how does this differ from Casper FFG

Vlad has thought about proving the safety of Casper FFG, and it seems could work, but that's not really the point of the CBC Consensus framework.


22:50 – How to make sure nodes have common protocol futures as long as there's less than some number of BFs

I can't really represent this in text form very well, and Vlad's explanation is pretty clear here if you've followed along so far. It essentially boils down to the fact that we need to identify states that have a number of BFs above some threshold, and then we delete those states. By removing some states, you make the decision-making process non-trivial (really important for making things irreversible without consequence).

28:50 – a question about efficiency of the protocol

31:20 – types of faults

You have two kinds of faults that can cause safety failures. These faults are also distinguishable from network latency: invalid messages and equivocations. In asynchronously safe consensus protocols, liveness faults are indistinguishable from network latency (you don't know if it's dead or just slow), but these cannot cause safety failures. I think this part is fairly clear to anybody who understands a little bit about BFs, so the video is probably better than anything I can rephrase.

35:11 – why would you want to tolerate a certain number of BFs & how many

Vlad states that you almost always want your tolerance of BFs to be above zero, usually around 1/3rd, to allow for natural failures in distributed communication. In the CBC consensus framework, each node can actually set their own BF tolerance level. It's in their interest to set their BF tolerance level as high as possible while still operating with high security.

Vlad then talks a bit about Sharding, and if you understood everything thus far nothing here is particularly complex.

I'm sure I made mistakes in here somewhere, so please help me correct this as needed if you have better insight than I do. I'll try to answer questions if anybody has any, but I'd also urge people to watch the full video first.

Submitted February 09, 2018 at 08:13PM }
via reddit