Shelley Reward formula improvements: the Skewed and Curved reward function extensions

Hello, I’d like to share some ideas with the dev community to improve the current reward formula.
I have written a document that gathers and describes these ideas and shows how they can solve most of the problems that have appeared, and that will continue to grow as factor k increases in the near future.

Idea #1: Skewed reward factor b0: the goal here is to improve the fairness of the pledge benefit by making the benefit depend on the leverage factor

Idea #2: Curved reward functions (as introduced in a CIP by Shawn McMurdo)

Idea #3: combine the Skewed and Curved reward concepts to solve both problems in a single function

Thanks in advance for your feedback!
I hope these ideas can benefit the dev community and inspire some actual improvements to the protocol, making it even more resistant to Sybil attacks and even more fairly decentralized.

Laurent.

Link to another discussion post I originally made on this topic:

Link to the documentation I wrote for the proposals:


Great analysis and I’m glad to see more people interested in improving the rewards function! I think that some form of the skewed and curved function would be the most ‘equitable’ of them. I have two critiques:

  1. It makes the rewards function quite complex. People are already very confused about how it works, so it would likely become very difficult to agree on what the optimal parameters should be.
  2. Because it retains a0, it does nothing to solve the problem of coupling the rewards function with economic expansion. (see the edit remark in An Alternative to a0 and k - #4 by Serotonin )

While you’re at it, I would love to hear what you think about using the alternative that I suggested in the main article of the link above (and feel free to give any suggestions for improvements).

Thanks for your feedback, Serotonin! I think the reward formula is an important topic, and I’m happy to see other people concerned about it as well.

Regarding the two points you raised:

  1. I agree that the skewed version, the curved version, and even more so the skewed+curved combination make the function more complex than the original one, but I am afraid a “single parameter” function will always have some flaws, whether it shapes a linear or a non-linear reward profile. That said, if we wanted to simplify it a bit, I would drop the (subtle) adjustment (1 - P*(1 - S)) from the original formula, which leads to a simpler, yet still rich, form:

            Reward = R * Sigma * (1 + (b0 + a0 * P) * S) / (1 + a0 + b0 / P)
    

with Sigma the pool stake fraction, P the pledge ratio (i.e. pledge amount / pool stake), and S the pool saturation ratio. The “curved” version would simply replace P by c(P), with c the curve sub-function.
The a0 factor gives some benefit to higher pledge, whereas the b0 factor penalizes overly leveraged pools (i.e. pools where Leverage = 1 / P is too high).
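To make the simplified skewed form above concrete, here is a minimal Python sketch. The parameter values for a0 and b0 are purely illustrative defaults, not tuned protocol parameters:

```python
def skewed_reward(R, sigma, S, P, a0=0.3, b0=0.05):
    """Simplified skewed reward (the (1 - P*(1 - S)) adjustment dropped).

    R     -- total reward pot available
    sigma -- pool stake fraction
    S     -- pool saturation ratio
    P     -- pledge ratio (pledge amount / pool stake), must be > 0:
             as P -> 0, b0/P -> infinity and the reward tends to 0,
             which is the intended penalization of non-pledged pools.
    a0, b0 are illustrative values only.
    """
    return R * sigma * (1 + (b0 + a0 * P) * S) / (1 + a0 + b0 / P)
```

A quick check of the intended shape: with everything else fixed, a higher pledge ratio P yields a strictly higher reward, while P close to zero collapses the reward through the b0 / P term.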

  2. If we wish to make the function evolve, I am pretty convinced it should be done in a “continuous” way, i.e. ensuring that the new formula is backward-compatible with the existing one. That’s why I kept the a0 factor, even though I also think it should not necessarily be tuned back to 0. If we did set it back to zero in the skewed (simplified) version written above, we would get something like:

                 Reward = R * Sigma * P * (1 + b0 * S) / (P + b0)
    

For small P the reward is then mainly proportional to P (an incentive to grow pledge rather than keep it small), but as P increases, the divisor (P + b0) amortizes this benefit, mitigating the advantage given to very highly pledged pools. Combined with the curved c(P), this control over the high-pledge benefit is emphasized even further, countering the “flat” marginal benefit phenomenon arising with the a0-only formula.
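The amortization effect of the (P + b0) divisor can be checked numerically with a small sketch (b0 = 0.05 is an illustrative value only):

```python
def skewed_reward_a0_zero(R, sigma, S, P, b0=0.05):
    # Skewed form with a0 set back to 0:
    #   ~ proportional to P for P << b0,
    #   flattening toward R * sigma * (1 + b0 * S) for P >> b0.
    return R * sigma * P * (1 + b0 * S) / (P + b0)
```

Comparing equal-sized pledge increments at low and high P shows the diminishing marginal benefit: the step from P = 0.05 to P = 0.10 raises the reward far more than the step from P = 0.85 to P = 0.90.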

If I understand your proposal correctly, the reward function would be defined as (with notations equivalent to the above):

        Reward = R * Sigma * min(1, (L * PledgeAmount + B) / TotalPoolStakeAmount)

or, with my notations:

           Reward = R * Sigma * min(1, L * P + B / (S * GlobalStake))

which, if you redefine your baseline B as a percentage of global stake instead of an absolute ada amount, leads to:

              Reward = R * Sigma * min(1, L * P + B(%) / S)

or

              Reward = R * z0 * min(S, L * P * S + B(%))
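To make the behavior of this min-capped alternative concrete, here is a minimal Python sketch using the notations above. The values chosen for L and B% in the checks are purely illustrative:

```python
def alt_reward(R, sigma, P, S, B_pct, L=100):
    """Alternative reward, rewritten with the thread's notations.

    P     -- pledge ratio (pledge amount / pool stake)
    S     -- pool saturation ratio
    B_pct -- baseline B expressed as a fraction of global stake
    L     -- leverage parameter (default value is illustrative only)
    """
    return R * sigma * min(1.0, L * P + B_pct / S)
```

The min(...) term acts as a reward “efficiency”: for a fixed pledge ratio it falls as S grows (because B% / S shrinks), and it is pinned at 1 whenever S is small enough relative to B%.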

I like the proportionality with respect to P and the fact that above a certain level of P there is no additional benefit (in that way it is similar to what the curve function would bring, even though it is a bit more abrupt here).

What seems more problematic is the presence of the term B / S, whatever value you choose for B (unless you set it to zero, which would lead to an oversimplified formula).

  • The first problem I see is that the reward “efficiency” is a decreasing function of S, i.e. a pool is naturally penalized for growing its saturation ratio. This is a bit unnatural in terms of incentives.
  • The second problem: the reward efficiency is optimal (maximal) as soon as the saturation is low enough (S << B%), which is also somewhat counter-intuitive.

Still, beyond this pure “efficiency” argument, if we look at the effective reward it does not look too bad and is quite well balanced: a pool with zero pledge would still have a theoretical reward Reward(P = 0) = R * z0 * min(S, B(%)), and as long as you choose a value for B that is not too high, this should not give too much of an advantage to very small pools.

Conclusion: all the proposals look interesting! I like Skewed & Curved for the flexibility it offers in terms of function shape, for its strict penalization of non-pledged pools, and for its backward-compatibility with the current formula. Despite its lack of backward-compatibility, I like your proposal as well, as it embeds in a very simple form the ability to “curve” the reward benefit of high pledge (even though it does so a bit less smoothly than the curved reward), while still penalizing low-pledged pools “significantly” (as long as B is not too high).

Question now: how could we convince the “devs in charge” to implement these proposals, thoroughly test them in a testnet environment, and eventually decide (through a CIP vote?) to put one of them into production?

This is correct, but I don’t think it can be translated into your notation exactly, because z0 still resides in your parameters (remember this is an alternative to a0 and k, so I ditched both of them, and therefore z0 as well). Allow me to reorganize using T as the total global stake…

f = R * σ * min(1, (L * s * T + B) / (T * σ)) = R * σ * min(1, (1/σ) * (L * s + B / T))

Now, if we want to create a similar pledge ratio P′ ≡ s/σ, with B/T = B%, then this becomes

f = R * σ * min(1, L * P′ + B% / σ)

or perhaps in a simpler form that I should have stated from the beginning…

f = R * min(σ, L * s + B%)

In other words, in most cases the reward is simply R * σ, unless the pool is below its own saturation limit, which is determined by the amount of individual pledge to the pool. Therefore your efficiency term is 100% for all pools, and all pools that create a block receive an equitable share of the rewards.
The disincentive, in this scenario, is that a pool’s saturation is dependent on its pledge s. Therefore, pools with low pledge have a low saturation, and the maximum reward they can get, regardless of the number of delegates, is capped depending on their pledge.
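The “variable saturation limit” behavior of f = R * min(σ, L * s + B%) can be sketched as follows (the L and B% values in the checks are purely illustrative):

```python
def alt_reward_simple(R, sigma, s, B_pct, L=100):
    """f = R * min(sigma, L*s + B%): a fixed relative reward with a
    variable saturation limit L*s + B% set by the pool's own pledge
    fraction s of the total global stake. L and B_pct defaults are
    illustrative only."""
    return R * min(sigma, L * s + B_pct)
```

A pool whose stake fraction exceeds its pledge-determined cap earns only up to that cap, no matter how many delegates it attracts; below the cap, its reward is simply R * σ.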

You may be right that many would prefer to see the function evolve in a “continuous” way. This, however, isn’t so much a technical argument as a psychological one; any function will still be backwards compatible with the protocol, it is just a bigger mental leap to go from a fixed saturation limit with a variable relative reward to a fixed relative reward with a variable saturation limit.

As for how we could go about convincing the “devs in charge” to implement one of these proposals, your guess is as good as mine. The CIP process relies on gaining people’s interest to discuss and back proposals. Unfortunately, in the 3-4 months since I put my suggestion forward, I’ve only received a few comments, so I don’t know if there is great interest there. I know there have been a few people very outspoken about the rewards function, but they may only be a small vocal group. I don’t know if it’s a general awareness problem, if the topic appears too complex, or if people are just happy with things the way they are. Either way, I think that the back-and-forth discussion helps our cause, and anyone willing to contribute is welcome in my book.


Oh, OK, I misread your definition of Sigma; I thought it was still capped by z0. So there is no z0 and no k in your approach; it changes a lot of concepts at the same time.

In that case, I am a bit afraid this will encourage the biggest actors to create a single gigantic pool with a massive stake amount, the only limit being their capacity to pledge enough ada, since operational costs will be small.

I am not sure that is an efficient way of improving decentralization, as it would then be super easy for a handful of actors to form a cartel with the majority of stake power concentrated in a few pools, without bringing any security to the network…

That is exactly what skewed and curved rewards combined try to solve (skew penalizes too low a pledge, to limit the number of pools run by a single actor, and curve penalizes too high a pledge advantage, to avoid having a small number of private pools).

Yet I still like your proposal if you restrict it with z0, as I first understood it :slight_smile:

My counter to that argument is that skewed and curved rewards encourage large pools as well, since the efficiency is higher for them. Large pools still dominate the space, and that will continue to happen even with skewed and curved rewards; the only difference is that it takes the form of pool splitting. The extra operational cost to spin up a new pool, while finite, is essentially negligible for large SPOs. It happened from the get-go with IOHK/IOG, 1PCT, ZZZ, and many others, and has become a more ubiquitous strategy as k has increased. One of the major points of “An Alternative to a0 and k” is just to call a spade a spade and admit that there are going to be large players in the space. I don’t see much point in technically having more pools even though many of them have the same operator…in that case, k is just a vanity metric. There’s nothing inherently wrong with having a large (or small) pool, I just don’t think that large pools should receive a higher reward efficiency just because they have more money (Ada). If you can prevent Sybil attack and still reward everyone equally (on a percentage basis), then why not do it?


Maybe my document is not clear enough, but here are the two main assertions that need to be kept in mind:

  1. the skewed function prevents large pools from splitting into multiple pools and dividing their pledge into tiny chunks, by penalizing a very low pledge ratio (i.e. low pledge vs total pool stake)
  2. the curved function penalizes the efficiency of highly pledged pools thanks to the crossover factor

I gave some numerical simulations and charts so that people can see the effect of the different parameters and function variants.
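The two assertions can be illustrated together in a small sketch. Note that the curve sub-function c below is a hypothetical saturating curve with a crossover parameter, chosen purely for illustration; it is not the exact curve defined in the CIP:

```python
def curve(P, crossover=0.5):
    # Hypothetical saturating sub-function (NOT the exact CIP curve):
    # grows ~linearly for small P and flattens past the crossover point.
    return P / (1.0 + P / crossover)

def skewed_curved_reward(R, sigma, S, P, a0=0.3, b0=0.05, crossover=0.5):
    # Skewed + curved combination: replace P by c(P) in the skewed form.
    # a0, b0 and crossover values are illustrative only. P must be > 0:
    # as P -> 0, b0/c(P) -> infinity and the reward tends to 0 (assertion 1);
    # the flattening of c(P) caps the high-pledge benefit (assertion 2).
    cP = curve(P, crossover)
    return R * sigma * (1 + (b0 + a0 * cP) * S) / (1 + a0 + b0 / cP)
```

Comparing equal pledge-ratio increments at low and high P shows both effects at once: the reward climbs steeply away from P = 0 (penalizing near-zero pledge) but the marginal benefit of extra pledge shrinks sharply at high P.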

Finding a well-balanced PoS incentive system is a difficult task, as it is always a trade-off between network security (Sybil attack resistance) and fairness of the incentive distribution across operators.

« There’s nothing inherently wrong with having a large (or small) pool, I just don’t think that large pools should receive a higher reward efficiency just because they have more money (Ada) »

I fully agree with that statement, and that is exactly what my proposals try to solve.
I think we should also encourage having as many unique operators as possible to guarantee a proper level of decentralization of the network. Mixing the curved and skewed formulas appears to me as a well-balanced way to achieve that purpose (encouraging small and medium operators, discouraging big operators from splitting their pledge power across too many pools, and discouraging them from running single, fully pledged private pools).

Let’s hope progress will be made on this topic in the near future, one way or the other.

Thanks again for your feedback; it’s always good to discuss and share ideas on this important subject. It would be good to see even more ideas :slight_smile:
