Summary: Formal Methods (Philipp Kant)


#1

Philipp Kant is the Director of Formal Methods at IOHK. You can find further details on his background and experience here.

  • His group focuses on introducing formal methods into the software development methodology
  • The idea of this group is to make sure that the results from researchers who write papers and prove aspects of the protocol at an abstract level can be translated into something that is machine-executable and can make the software work
  • IOHK is unique in applying formal methods into blockchain technology
  • Formal Methods is usually applied in industries like aerospace or medical products: systems where there is high value or harm to life at stake
  • blockchain systems do have large value so that is the reason behind applying this

In this video, Dr. Kant starts with an introduction on Cardano and the Proof-of-Stake protocol and then leads into a presentation on Formal Methods. Note that this is just a summary as certain sections would be better understood by watching the presentation alongside his prepared slides (codes and graphs).

What is PoS about?

  • What do you want to achieve with a cryptocurrency?
    • decentralization: a currency without a central authority, and in the hands of the users
    • permission-less: everyone should be able to join, no barrier to entry
    • a stable ledger
    • persistence: transactions that are put into the ledger should remain there and cannot be deleted
    • liveness: submitted transactions should end up in the ledger after some time
    • all these requirements are hard to fulfill at the same time – you need something that is very stable that everyone can trust, but you don’t want a central body that needs to decide and be the source of that trust

How trust is created on Bitcoin Blockchain

  • users of Bitcoin send transactions to the network
  • transactions are signed with a private key, so no one can spend other people’s money
  • people want to make sure transactions sent are included and transactions shown do not vanish at a later time
  • transactions sent are grouped into “blocks” which are like pages in a ledger
  • then they are ordered by putting blocks into order: each block references the previous block by including that block’s hash
  • ordering is important because if someone tries to double-spend their money, if there is an order then it will be required to choose one valid transaction
  • achieve trust by having everyone take turns, instead of having one central person
  • by taking turns, even if there are a few who don’t play by the rules, the effect will be that there is a stable ledger where transactions get included
  • prevent transactions from being deleted with the longest chain rule since someone would need to revert to blocks further down the chain to override any transactions
  • being permission-less without identity checks means one can register multiple times and this is a problem when there is a system that takes turns
  • this is called the Sybil attack
  • the solution that Bitcoin uses for this problem is adding a cost for registering
  • the cost is in the form of an arbitrary mathematical puzzle
  • what Bitcoin does is adding an arbitrary string of data to the block and hash of the block has a number of leading zero’s
  • since the hash is not reversible, you can’t just compute what these numbers are and must try and try again
  • you then must spend hashing/computing power to try out some data, compute the hash of the block and if it doesn’t fit, you have to try again
  • this cost gets expensive to attack the network as you would need to own more computers than everybody else in the system

How does PoW work

  • there is a randomized leader election, where for every CPU added to the system, you get one vote
  • chance of being voted is proportional to the CPU power you put into the system
  • winners get to write a block and there are rewards associated with that
  • because of ‘longest chain wins’ rule, it is hard to revert old blocks (unless you somehow have more than 50% of CPU in the system)
  • Problems with Bitcoin:
    • hashing power has huge energy consumption
    • consumption of the Bitcoin network is comparable to a small national state
    • and all for a fairly low transaction rate (~10 Tx/second)
    • hard to scale the system into something with higher transaction rate
    • format of the race leads to a winner-takes-all system
    • to maximize chance of getting any reward, people work together and form pools and these pools lead to centralization

Proof of Stake

  • different leader selection: weighted by stake!
  • for each time slot, the protocol randomly picks one coin and the owner of that coin gets to produce the next block
  • in order to have a higher chance of being elected, you need to buy more of the currency and that will become increasingly expensive
  • but this has the additional advantage because it is tied to the system itself
  • if you are a large stakeholder and you have lots of this currency, you have a better chance of subverting it and deleting transactions but you wouldn’t want to do so and if people noticed this, it will devalue that currency so you are incentivized to be honest, thereby creating trust in the system
  • the reason it has not been used so far is that you need randomness for the election process
  • you need an explicit random number generator → you all need to agree on the source of randomness but then, if you have a central authority for the random generator then that authority controls a large part of the system
  • the other reason it has not been used is that many people thought it was not as safe because of the possibility of someone predicting the random number generator
  • how it works is that coins in the system would be lined up, a random number is produced and the corresponding nth coin would be picked
  • if you could predict this, then what you could do is send money across the system to try and get a distribution so that your coins are in place to where they will be chosen
  • so what you need is: a decentralized random number generator and one that cannot be predicted in advance

Ouroboros

  • one result from the researchers at IOHK is the Ouroboros protocol, the first provably secure PoS protocol
  • this research has been submitted for peer-review and approved
  • the way it works: split time into slots and each slot has a block > for each slot, one stakeholder will be randomly elected to be the leader
  • for that, you need randomness
  • to do that, group the slots into epochs and before the start of an epoch, all the stakeholders must agree on the seed for the next random number generator
  • basically the stakeholders themselves privately roll a dice and produce randomness (and this is not revealed to other parties) and at the end, they send proof that they already know what they rolled and can no longer tamper with it
  • once the stake for the next epoch is clearly stated, all the random numbers are revealed and combined in a specified way and now you have a new seed
  • this seed is something that everybody has the ability to influence but nobody can predict since it is based on many random numbers that are shared
  • this protocol is proven secure against any adversary with less than 50% of stake (amount of Ada in the system)

Ouroboros Praos

  • extension of the protocol that deals with the question of “what happens when messages are delayed?”
  • since the PoS protocol deals with time and time slots and there is a block for each slot…in order for the chain to work, the information of the last block must propagate across network before the next slot leader has their turn, otherwise you would get forks automatically
  • in real networks, you can have delays in messages (like outages)
  • Praos extends adversary model to include network delays
  • this is what is currently being implemented for future versions of Cardano

From Paper to Implementation

  • translating research papers written in English language and mathematical formulae into something that can be executed by a machine
  • there is a gap between these two as the paper is written for understanding by humans and glosses over details that are not needed for discussion but needed for running a protocol
  • Publication = high level of abstraction, written in plain English and mathematical formulae, has proofs of security
  • vs. Code = deals with all the details and precision, written in Haskell, if you have code can you also transfer the proofs into the implementation?

Small Steps – no Big Leaps

  • small steps will be easier to make sure that each of those steps is correct
  • first step is to translate the algorithm into a formal language (without adding any detail at all – staying at the same level of abstraction as the paper) but getting to something that can be read and interpret by a machine
  • once you have this precise but abstract piece, you can start refining and adding details
  • steps should be small enough that you can verify and test that these added steps are correct and do not change the protocol in the wrong way
  • along the way at every step, simulations and testing is done

Process Calculi

  • first step is translating into formal language
  • and what should that language be?
  • in blockchain, you have lots of computers communicating and perform a distributed calculation of the ledger
  • In computer science, the process calculi are a diverse family of related approaches for formally modelling concurrent systems
  • the process calculus chosen is called Psi Calculus (a parametric family of process calculi)

Embedded Domain Specific Languages (EDSLs)

  • want to embed this process language into Haskell
  • you can basically write in any language, but it is easier to reason if the language fits the domain
  • examples of this: PostScript for describing pages of text or graphics, SQL for databases
  • for an embedded DSL → create the language inside another language and you can use the syntax and structures of the host language and then have the program itself as a data term within that language
  • embedded DSL allows for multiple interpretations

Embedding Psi in Haskell

  • implement Psi Calculus as an EDSL in Haskell, then write Ouroboros Praos in this language
  • then by writing different interpreters for the language you can:
    • get simulators for the program
    • or export to a syntax of a proof system (like Isabelle or Coq)
    • or have an interpreter that runs the whole thing and add details to and will ultimately be the production code

Modelling Performance (and Failure)

  • what we need in PoS protocol is a good handle of performance of the system
  • before the launch, benchmarks were set
  • have nodes running on different continents and increasing pressure by sending transactions, and finding where parameters could be tuned to increase throughput
  • IOHK wanted a handle not just from experimenting but also from theory on how well this code should run
  • with experimenting, you get benchmarking data but you need to better understand why that is the case or what the optimal is
  • then you can answer for the protocol questions like:
    • How long does it take for transactions to be recorded?
    • How long does it take to join the network?
    • Can blocks propagate through the whole network in a single slot?
    • What are the resource requirements for a node?

Impairment of Quality: ΔQ

  • time vs. probability by a certain time, some event will have occurred
  • in perfect systems, you will have a line that approaches 1 at some point
  • real world systems can fail, will typically not reach 1 and saturate at a lower level and the difference between this line and 1 will be your chance of failure if you wait for an arbitrary amount of time = improper Cumulative Distribution Function
  • this is something that can be modeled in Haskell
  • this is done by setting ΔQ be a function that takes the state of the random number generator
  • ΔQ formulae can be used for simulating but also for reasoning with it algebraically
  • this is used when looking at different scenarios like parallel or sequential composition

High Assurance Blockchain Implementations

  • cryptocurrencies carry large value
  • so IOHK thinks its proper to do things with a large degree of assurance
  • researchers prove things about the protocol but also need to make sure proofs are not lost in translations
  • there are also proposals for using blockchains for critical infrastructure like land deeds
  • therefore, we need to have systems that are fit for purpose that are designed for performance while being provably secure

#2

Thanks for your efforts :slight_smile: