This is an interesting topic that gets into some core features of how Cardano’s blockchain works.
Why do forks occur on Cardano?
Forks occur naturally in Cardano’s blockchain for a couple of reasons:
- When two pools are allocated the same slot as leader. Both produce a block, and at most one can be adopted into the canonical chain.
- When the next leader doesn’t receive the last block in time. It will produce its block upon the second-last block and thereby create a fork.
How does Cardano resolve forks?
In the current Ouroboros implementation, chain forks are resolved as follows (in order):
- Longest chain wins.
- If both alternative forks are equal length then prefer the chain fork where the final block has the lowest VRF value. Note: This is what we will call the “block VRF” value. (Importantly, the “Leader VRF” value is different.) For further reference see this GitHub discussion.
A very important thing to realise regarding the above is that the “block VRF” value is used to deterministically resolve equal length forks irrespective of each block’s slot. In other words, if a fork arises due to another pool not receiving the last block in time, then this will cause a fork where the terminal block in each fork has a different slot number. And, this fork will still be resolved deterministically based on the lowest “block VRF” wins.
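To make the rule concrete, here is a minimal sketch of the two-step comparison described above. The `Tip` type and the `prefer` function are hypothetical names of mine, not anything from cardano-node, and a real node compares full chain fragments rather than a three-field summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tip:
    """Hypothetical summary of a candidate chain: its length and its final block."""
    length: int     # number of blocks in the chain
    slot: int       # slot of the final block (deliberately unused by the rule)
    block_vrf: int  # "block VRF" value of the final block, as an integer


def prefer(current: Tip, candidate: Tip) -> Tip:
    """Return the chain a node keeps under the rule described above."""
    # Step 1: longest chain wins.
    if candidate.length != current.length:
        return max(current, candidate, key=lambda t: t.length)
    # Step 2: equal length, so the lowest block VRF wins, whatever the slots are.
    return min(current, candidate, key=lambda t: t.block_vrf)


# Two equal-length forks whose final blocks sit in different slots.
on_time = Tip(length=100, slot=5000, block_vrf=0x9F)
late = Tip(length=100, slot=5001, block_vrf=0x1A)
winner = prefer(on_time, late)
```

In this example `winner` is `late`: it has the lower block VRF, and the fact that its slot is one later plays no part in the decision.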
What changed in Aug 2022?
Prior to cardano-node version 1.35.0, equal length forks were resolved deterministically based on the lowest “leader VRF” value wins. However, with the rollout of 1.35.0 this changed to using the lowest “block VRF” value. See this discussion on Cardano Forum, and this discussion on GitHub.
What are the different types of “battles”?
Terminology can be confusing, so I think it is important to call all fork battles, well… “fork battles”. This is because it doesn’t matter whether the terminal blocks in equal length forks have different slot numbers or not. All types of equal length forks are settled deterministically using the “block VRF” value.
I think that calling some of them “slot battles” (when the slot number is identical) just confuses people. Sure, if 2 pools are leader for the exact same slot, they can’t avoid producing a fork. However, the result is still a fork if another pool is awarded leader just 1 slot after another pool and doesn’t receive that pool’s block in time. The second pool still produces a block upon the same prior block causing a fork, and this fork is settled deterministically based on the lowest “block VRF” wins. It is irrelevant that the slot numbers differ under the current Ouroboros implementation.
How are the block VRF and leader VRF calculated?
Each block producer calculates the “block VRF” using the following inputs:
- Epoch nonce. (A random value based upon all blocks in the previous epoch up to “stability window” blocks from the end of the epoch.)
- Slot number
- Pool VRF private key
The “leader VRF” value is calculated from the “block VRF” value, in simple terms, by hashing it with an “L” (for “leader”) in front. See this part of the Ouroboros consensus code.
The “leader VRF” value is used to determine if a pool is a leader to produce a block for that slot, but it is not used (any more) to settle equal length “fork battles”.
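The relationship between the two values can be sketched as below. This is only an illustration under loud assumptions: the `block_vrf` function uses an HMAC as a stand-in for the real VRF evaluation (a real VRF also produces a proof that others can verify with the pool's public VRF key), the byte layout of the input is made up, and the choice of Blake2b-256 for the "L"-prefixed hash is my assumption rather than something stated above.

```python
import hashlib
import hmac


def block_vrf(epoch_nonce: bytes, slot: int, vrf_private_key: bytes) -> bytes:
    """Stand-in for the VRF evaluation over (epoch nonce, slot).

    NOT a real VRF: HMAC-SHA512 is used only so the sketch is runnable
    and deterministic for a given key, nonce and slot.
    """
    message = epoch_nonce + slot.to_bytes(8, "big")  # assumed input layout
    return hmac.new(vrf_private_key, message, hashlib.sha512).digest()


def leader_vrf(block_vrf_output: bytes) -> bytes:
    """Derive the leader value by hashing the block VRF output with an "L" prefix."""
    # Blake2b-256 is an assumption for this sketch.
    return hashlib.blake2b(b"L" + block_vrf_output, digest_size=32).digest()


# Hypothetical inputs, for illustration only.
nonce = bytes(32)
key = b"example-vrf-private-key"
block_value = block_vrf(nonce, 5000, key)
leader_value = leader_vrf(block_value)
```

The point of the sketch is the dependency direction: the leader value is a pure function of the block value, so both are fixed once the epoch nonce, slot and key are known, yet a low block value tells you nothing obvious about the leader value.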
Some important consequences?
There are at least a couple of important consequences that arise due to the current Ouroboros consensus implementation:
- Pools that are physically remote suffer increased network delays, so they are more likely to produce a fork when they don’t receive the previous pool’s block in time, or when the next pool doesn’t receive their block in time.
Ouroboros parameters are set so that blocks occur on average every 20 seconds. This arises because slots are 1 second long and the probability of any slot being “active” is 5% (1 in 20). Thus if your pool is awarded slot leader for a particular slot, the odds that another pool is also awarded slot leader for that same slot are likewise roughly 5%.
Most pools are housed in the USA / Europe and experience network delays of less than 1 second, which means that these pools will see roughly 5% of their blocks involved in true “slot battles”, and they will lose half of those based on the lowest “block VRF” wins rule. Thus most pools will get an orphan rate of roughly 2.5%.
However, pools that are physically remote and experience network delays of just 1 second will get 3 times the number of “battles”, because the chance that another pool had the slot before is 5% and the chance that another pool was awarded the slot after is also 5%. Therefore there is roughly a 15% chance that another pool got either the same slot, the slot before, or the slot after. So these remote pools will lose about 7.5% of their blocks in “battles”. That is 3x the orphan rate!
Importantly, that works against decentralisation, because it incentivises pool operators to house their pools in a data centre in the USA / Europe where they can get the network delay below 1 second.
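The arithmetic behind those orphan rates can be written out as a small calculation. This is a sketch of the simplified model used above, not a measurement: it assumes each exposed slot independently has a 5% chance of containing a competing block and that half of all battles are lost. The function name and the `exposed_slots` parameter are mine. It reports both the simple sum used in the text and the slightly lower figure you get when the slots are treated as independent events.

```python
ACTIVE_SLOT_PROBABILITY = 0.05  # the 5% figure quoted above


def orphan_rate(exposed_slots: int, p: float = ACTIVE_SLOT_PROBABILITY) -> dict:
    """Estimate battle and orphan rates for a pool exposed to `exposed_slots` slots.

    exposed_slots = 1: only a same-slot competitor can cause a battle.
    exposed_slots = 3: the slot before, the same slot and the slot after all count.
    """
    approx_battle = exposed_slots * p            # simple sum, as in the text
    exact_battle = 1 - (1 - p) ** exposed_slots  # at least one competitor
    return {
        "approx_battle": approx_battle,
        "approx_orphan": approx_battle / 2,      # assume half of battles are lost
        "exact_battle": exact_battle,
        "exact_orphan": exact_battle / 2,
    }


well_connected = orphan_rate(exposed_slots=1)
remote = orphan_rate(exposed_slots=3)
```

With these inputs the simple sum gives 5% and 15% battle rates (2.5% and 7.5% orphan rates), while the independent-events version for the remote pool comes out a little lower, at about 14.3% battles. Either way the remote pool's orphan rate is close to three times that of the well-connected pool.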
- An attack is enabled whereby the deterministic resolving of “fork battles” is used to deliberately “orphan” other pools’ blocks. For example, consider the following scenario: Let’s say that Amazon decided they wanted to control Cardano’s ledger. Amazon then negotiates with a number of their data centre users to form a coalition group. Let’s say that this “Amazon group” amounts to 32% of Cardano’s current active stake and thereby block production. This group then modify their cardano-node software to implement a small change whereby, when minting a block, they look at the previous block’s VRF value and compare it to their own value. If their block VRF value is lower, and the previous block was minted by an “outsider”, then they deliberately produce their block upon the second-last one. This results in a fork that they know they will win since they have the lower “block VRF” value.
It is easy to simulate such an attack and the outcome is that with 32% of the controlled stake, this “Amazon group” can achieve just over 50% of the final blocks adopted to the chain. Furthermore, what will Cardano stakers do when they see that pools in the “Amazon group” are earning higher rewards compared to other pools? Will they move their stake to “Amazon pools” for increased rewards thus providing Amazon with even more control over the Cardano blockchain?
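The “small change” in the scenario above amounts to a single extra condition at minting time. Here is a sketch of just that condition, to show how little logic is involved. The `Block` type and `should_fork_off_tip` are hypothetical names, and this is not a simulation: it says nothing about the resulting block share, which depends on the rest of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical view of the current tip as seen by a coalition member."""
    block_vrf: int             # "block VRF" value of the tip, as an integer
    minted_by_coalition: bool  # True if a coalition pool minted the tip


def should_fork_off_tip(tip: Block, own_block_vrf: int) -> bool:
    """True if the modified node would build on the tip's parent instead of the tip.

    Both conditions from the scenario must hold: the tip was minted by an
    outsider, and our own block VRF is lower, so the equal-length fork is
    one we already know we will win.
    """
    return (not tip.minted_by_coalition) and own_block_vrf < tip.block_vrf


outsider_tip = Block(block_vrf=0x80, minted_by_coalition=False)
attack = should_fork_off_tip(outsider_tip, own_block_vrf=0x10)
behave = should_fork_off_tip(outsider_tip, own_block_vrf=0xF0)
```

Here `attack` is true and `behave` is false: the modified node only deviates when the tie-break is guaranteed to go its way, which is exactly what the deterministic rule makes possible.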
I believe the proper fix for both of these problems is to:
- Increase the slot duration to 4 seconds. This removes the incentive to centralise due to network delay reasons. After all, decentralisation is also about where your pool is physically controlled and the government and legal rules of that jurisdiction.
- Only resolve “fork battles” deterministically by the block VRF where the slot number is identical. Other forks should be resolved by each node preferring the block they received first. With a slot duration of 4 seconds this will not produce a centralising force because 4 seconds should be enough time for full block propagation across the network, even for physically remote pools.
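As a sketch of how the second proposal would change chain selection, the comparison could look like the following. Again `Tip` and `prefer_proposed` are hypothetical names, and “received first” is modelled simply by treating `current` as the chain the node already holds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tip:
    """Hypothetical summary of a candidate chain: its length and its final block."""
    length: int     # number of blocks in the chain
    slot: int       # slot of the final block
    block_vrf: int  # "block VRF" value of the final block, as an integer


def prefer_proposed(current: Tip, candidate: Tip) -> Tip:
    """Proposed rule; `current` is the chain this node received first."""
    # Longest chain still wins.
    if candidate.length != current.length:
        return candidate if candidate.length > current.length else current
    # Equal length and identical slot: a true slot battle, lowest block VRF wins.
    if candidate.slot == current.slot:
        return candidate if candidate.block_vrf < current.block_vrf else current
    # Equal length but different slots: keep whichever chain arrived first.
    return current


held = Tip(length=100, slot=5000, block_vrf=0x9F)
late_low_vrf = Tip(length=100, slot=5004, block_vrf=0x1A)
same_slot_low_vrf = Tip(length=100, slot=5000, block_vrf=0x1A)
kept = prefer_proposed(held, late_low_vrf)
switched = prefer_proposed(held, same_slot_low_vrf)
```

Under this rule `kept` is the chain the node already held, even though the late block has a lower VRF, while `switched` is the same-slot competitor. A block minted on the second-last block can no longer win just by having a lower VRF value, which removes the guaranteed win the attack relies on.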