Summary: Formal Methods (Philipp Kant)

maki.mukai · 30 May 2018 16:58

Philipp Kant is the Director of Formal Methods at IOHK. You can find further details on his background and experience here.

His group focuses on introducing formal methods into the software development methodology
The idea of this group is to make sure that the results from researchers who write papers and prove aspects of the protocol at an abstract level can be translated into something that is machine-executable and can make the software work
IOHK is unique in applying formal methods into blockchain technology
Formal Methods is usually applied in industries like aerospace or medical products: systems where there is high value or harm to life at stake
blockchain systems do have large value so that is the reason behind applying this

In this video, Dr. Kant starts with an introduction on Cardano and the Proof-of-Stake protocol and then leads into a presentation on Formal Methods. Note that this is just a summary as certain sections would be better understood by watching the presentation alongside his prepared slides (codes and graphs).

What is PoS about?

What do you want to achieve with a cryptocurrency?
- decentralization: a currency without a central authority, and in the hands of the users
- permission-less: everyone should be able to join, no barrier to entry
- a stable ledger
- persistence: transactions that are put into the ledger should remain there and cannot be deleted
- liveness: submitted transactions should end up in the ledger after some time
- all these requirements are hard to fulfill at the same time – you need something that is very stable that everyone can trust, but you don’t want a central body that needs to decide and be the source of that trust

How trust is created on Bitcoin Blockchain

users of Bitcoin send transactions to the network
transactions are signed with a private key, so no one can spend other people’s money
people want to make sure transactions sent are included and transactions shown do not vanish at a later time
transactions sent are grouped into “blocks” which are like pages in a ledger
then they are ordered by putting blocks into order: each block references the previous block by including that block’s hash
ordering is important because if someone tries to double-spend their money, if there is an order then it will be required to choose one valid transaction
achieve trust by having everyone take turns, instead of having one central person
by taking turns, even if there are a few who don’t play by the rules, the effect will be that there is a stable ledger where transactions get included
prevent transactions from being deleted with the longest chain rule since someone would need to revert to blocks further down the chain to override any transactions
being permission-less without identity checks means one can register multiple times and this is a problem when there is a system that takes turns
this is called the Sybil attack
the solution that Bitcoin uses for this problem is adding a cost for registering
the cost is in the form of an arbitrary mathematical puzzle
what Bitcoin does is adding an arbitrary string of data to the block and hash of the block has a number of leading zero’s
since the hash is not reversible, you can’t just compute what these numbers are and must try and try again
you then must spend hashing/computing power to try out some data, compute the hash of the block and if it doesn’t fit, you have to try again
this cost gets expensive to attack the network as you would need to own more computers than everybody else in the system

How does PoW work

there is a randomized leader election, where for every CPU added to the system, you get one vote
chance of being voted is proportional to the CPU power you put into the system
winners get to write a block and there are rewards associated with that
because of ‘longest chain wins’ rule, it is hard to revert old blocks (unless you somehow have more than 50% of CPU in the system)
Problems with Bitcoin:
- hashing power has huge energy consumption
- consumption of the Bitcoin network is comparable to a small national state
- and all for a fairly low transaction rate (~10 Tx/second)
- hard to scale the system into something with higher transaction rate
- format of the race leads to a winner-takes-all system
- to maximize chance of getting any reward, people work together and form pools and these pools lead to centralization

Proof of Stake

different leader selection: weighted by stake!
for each time slot, the protocol randomly picks one coin and the owner of that coin gets to produce the next block
in order to have a higher chance of being elected, you need to buy more of the currency and that will become increasingly expensive
but this has the additional advantage because it is tied to the system itself
if you are a large stakeholder and you have lots of this currency, you have a better chance of subverting it and deleting transactions but you wouldn’t want to do so and if people noticed this, it will devalue that currency so you are incentivized to be honest, thereby creating trust in the system
the reason it has not been used so far is that you need randomness for the election process
you need an explicit random number generator → you all need to agree on the source of randomness but then, if you have a central authority for the random generator then that authority controls a large part of the system
the other reason it has not been used is that many people thought it was not as safe because of the possibility of someone predicting the random number generator
how it works is that coins in the system would be lined up, a random number is produced and the corresponding nth coin would be picked
if you could predict this, then what you could do is send money across the system to try and get a distribution so that your coins are in place to where they will be chosen
so what you need is: a decentralized random number generator and one that cannot be predicted in advance

Ouroboros

one result from the researchers at IOHK is the Ouroboros protocol, the first provably secure PoS protocol
this research has been submitted for peer-review and approved
the way it works: split time into slots and each slot has a block > for each slot, one stakeholder will be randomly elected to be the leader
for that, you need randomness
to do that, group the slots into epochs and before the start of an epoch, all the stakeholders must agree on the seed for the next random number generator
basically the stakeholders themselves privately roll a dice and produce randomness (and this is not revealed to other parties) and at the end, they send proof that they already know what they rolled and can no longer tamper with it
once the stake for the next epoch is clearly stated, all the random numbers are revealed and combined in a specified way and now you have a new seed
this seed is something that everybody has the ability to influence but nobody can predict since it is based on many random numbers that are shared
this protocol is proven secure against any adversary with less than 50% of stake (amount of Ada in the system)

Ouroboros Praos

extension of the protocol that deals with the question of “what happens when messages are delayed?”
since the PoS protocol deals with time and time slots and there is a block for each slot…in order for the chain to work, the information of the last block must propagate across network before the next slot leader has their turn, otherwise you would get forks automatically
in real networks, you can have delays in messages (like outages)
Praos extends adversary model to include network delays
this is what is currently being implemented for future versions of Cardano

From Paper to Implementation

translating research papers written in English language and mathematical formulae into something that can be executed by a machine
there is a gap between these two as the paper is written for understanding by humans and glosses over details that are not needed for discussion but needed for running a protocol
Publication = high level of abstraction, written in plain English and mathematical formulae, has proofs of security
vs. Code = deals with all the details and precision, written in Haskell, if you have code can you also transfer the proofs into the implementation?

Small Steps – no Big Leaps

small steps will be easier to make sure that each of those steps is correct
first step is to translate the algorithm into a formal language (without adding any detail at all – staying at the same level of abstraction as the paper) but getting to something that can be read and interpret by a machine
once you have this precise but abstract piece, you can start refining and adding details
steps should be small enough that you can verify and test that these added steps are correct and do not change the protocol in the wrong way
along the way at every step, simulations and testing is done

Process Calculi

first step is translating into formal language
and what should that language be?
in blockchain, you have lots of computers communicating and perform a distributed calculation of the ledger
In computer science, the process calculi are a diverse family of related approaches for formally modelling concurrent systems
the process calculus chosen is called Psi Calculus (a parametric family of process calculi)

Embedded Domain Specific Languages (EDSLs)

want to embed this process language into Haskell
you can basically write in any language, but it is easier to reason if the language fits the domain
examples of this: PostScript for describing pages of text or graphics, SQL for databases
for an embedded DSL → create the language inside another language and you can use the syntax and structures of the host language and then have the program itself as a data term within that language
embedded DSL allows for multiple interpretations

Embedding Psi in Haskell

implement Psi Calculus as an EDSL in Haskell, then write Ouroboros Praos in this language
then by writing different interpreters for the language you can:
- get simulators for the program
- or export to a syntax of a proof system (like Isabelle or Coq)
- or have an interpreter that runs the whole thing and add details to and will ultimately be the production code

Modelling Performance (and Failure)

what we need in PoS protocol is a good handle of performance of the system
before the launch, benchmarks were set
have nodes running on different continents and increasing pressure by sending transactions, and finding where parameters could be tuned to increase throughput
IOHK wanted a handle not just from experimenting but also from theory on how well this code should run
with experimenting, you get benchmarking data but you need to better understand why that is the case or what the optimal is
then you can answer for the protocol questions like:
- How long does it take for transactions to be recorded?
- How long does it take to join the network?
- Can blocks propagate through the whole network in a single slot?
- What are the resource requirements for a node?

Impairment of Quality: ΔQ

time vs. probability by a certain time, some event will have occurred
in perfect systems, you will have a line that approaches 1 at some point
real world systems can fail, will typically not reach 1 and saturate at a lower level and the difference between this line and 1 will be your chance of failure if you wait for an arbitrary amount of time = improper Cumulative Distribution Function
this is something that can be modeled in Haskell
this is done by setting ΔQ be a function that takes the state of the random number generator
ΔQ formulae can be used for simulating but also for reasoning with it algebraically
this is used when looking at different scenarios like parallel or sequential composition

High Assurance Blockchain Implementations

cryptocurrencies carry large value
so IOHK thinks its proper to do things with a large degree of assurance
researchers prove things about the protocol but also need to make sure proofs are not lost in translations
there are also proposals for using blockchains for critical infrastructure like land deeds
therefore, we need to have systems that are fit for purpose that are designed for performance while being provably secure

ishleh · 30 May 2018 20:23

Thanks for your efforts

CardanoHerz · 25 June 2021 13:11

Excellent work ! Here you can find the bulgarian translation of the article - > 🇧🇬 Формални методи - обобщение (Филип Кант)

Topic		Replies	Views
Philipp Kant from IOHK talks about Formal Methods News and Announcements	1	2444	23 May 2018
:DE: Zusammenfassung: Formale Methoden (Philipp Kant) Bildung translation , translated-content	0	680	27 March 2021
Resumen: Métodos formales (Philipp Kant). Traducción al español :es: General translation , español-🇪🇸	0	1042	30 January 2019
IOHK Philipp Kant의 Formal Method설명 영상 Cardano 프로젝트	0	708	23 May 2018
Hi, i am from hamburg, germany Introductions	4	1073	26 February 2019