Evaluation of Propagation Times Ranking - Peer Review wanted

Markus-VITAL · 28 March 2022 16:02

Hi!

I recently ran an analysis of propagation times to evaluate how well my pool performs in terms of connectivity. The goal is to achieve the lowest possible propagation times and so decrease the risk of ghosted blocks.

It would be really great if someone could take a look at my approach and comment on whether this is a proper way to measure “Best Pools by Receiver Time of Random Blocks over some time range”.

That’s how I did it:

1. Download the list of pools for lookup purposes (ID, ticker, active_stake):
   KOIOS /pool_list
   KOIOS /pool_info
2. Get a list of blocks through the KOIOS API (needed to build the URL for the PoolTool block statistics):
   KOIOS /blocks
3. For each block, fetch the propagation statistics from PoolTool (includes rawTips = the propagation delays reported by each individual pool):
   https://s3-us-west-2.amazonaws.com/data.pooltool.io/blockdata/7053/dda50a24c9d63c5797fbb3ea4db37067d22760b661849fbe6f887364306ab98e.json
4. Insert the rawTips of every reported block into a Postgres DB (block_height, producer_pool, receiver_pool, propagation_time)

Query the average propagation_time grouped by receiver (excluding data points that are <= 0 ms or >= 10,000 ms, i.e. 10 s):

SELECT row_number() OVER (ORDER BY avg(pt.propagation_time) ASC) AS ranking,
       p.ticker AS receiver_ticker,
       pt.receiver_pool,
       avg(pt.propagation_time) AS avg_propagation_time
FROM proptimes pt
JOIN pools p ON pt.receiver_pool = p.id_hex
WHERE pt.propagation_time > 0
  AND pt.propagation_time < 10000
GROUP BY pt.receiver_pool, p.ticker
ORDER BY 4 ASC;
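
For anyone who wants to sanity-check the ranking outside the database, here is a minimal Python sketch of the same aggregation. It assumes the rows have already been exported as (receiver_pool, propagation_time) pairs with the time in milliseconds. The function name rank_receivers and the sample values are made up for illustration and are not part of the original analysis.

```python
from collections import defaultdict

def rank_receivers(rows, min_ms=0, max_ms=10_000):
    """Rank receivers by average propagation time (ms), lowest first.

    rows: iterable of (receiver_pool, propagation_time_ms) pairs.
    Values outside the open interval (min_ms, max_ms) are ignored,
    mirroring the WHERE clause of the SQL query.
    """
    totals = defaultdict(lambda: [0, 0])  # receiver -> [sum_ms, count]
    for receiver, ms in rows:
        if min_ms < ms < max_ms:
            totals[receiver][0] += ms
            totals[receiver][1] += 1
    averages = [(receiver, s / n) for receiver, (s, n) in totals.items()]
    averages.sort(key=lambda item: item[1])
    return [(rank, receiver, avg)
            for rank, (receiver, avg) in enumerate(averages, start=1)]

# Hypothetical sample data, not real measurements.
sample = [
    ("POOL_A", 400), ("POOL_A", 600),
    ("POOL_B", 300), ("POOL_B", 12_000),  # the 12 s outlier is dropped
    ("POOL_C", -50),                      # negative value (clock skew) is dropped
]
ranking = rank_receivers(sample)
for row in ranking:
    print(row)
```

Note that a plain average is sensitive to how many blocks each receiver reported, so a pool with only a handful of reports can land at the top by chance. Adding a minimum report count per receiver would be one way to guard against that.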

Other useful queries available:

Find producers for which a specific pool receives the blocks with long latency (helpful for tuning custom peers)
Distribution of propagation times in full seconds (helpful to estimate Height Battle Probabilities)
Calculate VRF winning chance (total win chance against an average block based on the individual pool_size)
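
As an illustration of the second query in that list, the distribution in full seconds can be sketched in a few lines of Python. This assumes propagation times in milliseconds. The function name seconds_histogram and the sample values are hypothetical.

```python
from collections import Counter

def seconds_histogram(times_ms):
    """Return {full_second: share_of_reports} for non-negative times in ms."""
    counts = Counter(ms // 1000 for ms in times_ms if ms >= 0)
    total = sum(counts.values())
    return {sec: counts[sec] / total for sec in sorted(counts)}

# Hypothetical sample data, not real measurements.
sample_ms = [250, 800, 950, 1200, 1900, 3100, 7000, 400]
hist = seconds_histogram(sample_ms)
for sec, share in hist.items():
    print(f"{sec}s to {sec + 1}s: {share:.1%}")
```

The share of reports at or above a given second is then a rough indicator of how often a block arrives too late for the next slot leader to build on it.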

My ask:
Please let me know if you see this as a proper way to run this analysis (see goal at the top)

7.4d4 · 30 March 2022 10:55

It is great that you are attempting to quantify the effect that slot battles and single block forks have on rewards and the advantage that small pools have.

Let's look at “ghosted” blocks caused by single-block forks resulting from propagation delays. This is where a pool produces a block without having seen a block produced by another pool some slots prior. Both of these blocks will have the same block number and will likely contain many of the same transactions. Only one of these blocks can be accepted; the other is “orphaned” (or “ghosted”). When the fork is only 1 block high, the consensus mechanism resolves it in favour of the block with the lowest VRF score.

Importantly, increasing your network speed and connectivity to decrease propagation times will not affect this. However, there is a proposal to limit this “VRF score wins” decision to only a 5 second window. But currently it is open ended. For example, my pool got an orphaned block when another pool’s block was delayed by 30 seconds!

On the other hand the situation becomes more dependent on propagation times if another pool produces a block that adds to the other fork before your block is transmitted to the network. For example, consider this scenario:

Pool A produces block 0100 for slot 0000
Pool B produces block 0101 for slot 0005 - appending to Pool A’s block for slot 0000
Pool C produces block 0101 for slot 0006 - appending to Pool A’s block for slot 0000
Pool D produces block 0102 for slot 0010 - appending to Pool B’s block for slot 0005 (because it hadn’t received Pool C’s block for slot 0006 yet)
At time 0011 Pool E receives both versions of block 0101 from pools B and C (slots 0005 and 0006 respectively). Let's say that Pool C’s block has the lower VRF score: Pool E will then prefer C’s block and extend its tip with the block from Pool C (slot 0006). At this point Pool B’s block is orphaned.
At time 0012 Pool E receives block 0102 from Pool D for slot 0010. It notices that this block builds upon the fork from Pool B, so this fork is now 2 blocks high whereas the fork produced by Pool C is only 1 block high. Pool E will therefore switch its tip to block 0102 from Pool D and orphan the block produced by Pool C. Now Pool C’s block is orphaned and Pool B’s block is no longer orphaned.

So in scenarios like this, propagation times do become decisive for the outcome.
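
The scenario can be replayed from Pool E’s point of view with a small Python sketch. It is a deliberately simplified model (longer chain first, then lower VRF), not the real node logic. The Block class, the prefer and chain helpers, and the VRF values are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    producer: str
    number: int                   # block number, i.e. chain length at this block
    vrf: float                    # illustrative leader VRF value; lower wins a tie
    parent: "Block | None" = None

def prefer(current, candidate):
    """Return the tip a node keeps: longer chain first, then lower VRF."""
    if candidate.number > current.number:
        return candidate
    if candidate.number == current.number and candidate.vrf < current.vrf:
        return candidate
    return current

def chain(tip):
    """List the producers on the chain ending at tip, oldest first."""
    producers = []
    while tip is not None:
        producers.append(tip.producer)
        tip = tip.parent
    return producers[::-1]

a = Block("A", 100, 0.50)
b = Block("B", 101, 0.70, parent=a)
c = Block("C", 101, 0.20, parent=a)   # same height as B's block, lower VRF
d = Block("D", 102, 0.90, parent=b)   # built on B's block

tip = a
history = []
for received in (b, c, d):            # order in which Pool E sees the blocks
    tip = prefer(tip, received)
    history.append(tip.producer)

print(history)      # tip after each received block
print(chain(tip))   # final chain as seen by Pool E
```

In this model Pool E’s tip moves from B to C on the VRF tie-break and then to D on chain length, so the final chain contains B’s block and not C’s, which matches the description above.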

If you wonder why the consensus mechanism uses this VRF score:

I recall that during the ITN stage the consensus mechanism simply preferred the first block received. However, this resulted in pool operators all wanting to co-locate their servers in a particular European ISP’s rooms in order to get the fastest connections to the majority of other pools, so that their blocks were maximally accepted. This was obviously not good for decentralisation. This is why the consensus mechanism was changed to be determined by the VRF score rather than just the first block received. The VRF score provides a degree of unpredictability to the result and also favours smaller pools. This is better for decentralisation.

Remember that for decentralisation we do want stake pools located in all the nooks and crannies of the world, even if their connectivity is not as good as it is in America or Europe. Therefore we need to tolerate a reasonable amount of propagation delay.

Markus-VITAL · 30 March 2022 11:46

Thanks for your additional insight. This is very helpful!

If a height battle is evaluated, it will be based on VRF. That’s understood. Lower propagation times might avoid this scenario. So if I get more blocks within 1 s, there is a lower risk of building a block on an old previous block.

This is really important, as there are blocks which sometimes need 7 s to propagate. If such a block “steals” a subsequent block from another pool, it’s not really fair.

7.4d4 · 30 March 2022 12:12

Duncan Coutts outlines the consensus mechanism as follows:

The current ordering rule is the lexicographic combination of the following, in order:

compare chain length
when (same slot number) then (compare if we produced the block ourselves) else equal
when (same block issuer) then (compare operational certificate issue number) else equal
compare (descending) leader VRF value

Then he says:

The suggestion is to change the last one:

when (both blocks’ slots are within Delta of now) then (compare (descending) leader VRF value) else equal.

This suggested change has no definite plan for implementation yet that I can see.

The Delta value he is referring to is 5 seconds. This is the amount of time that IOG has assessed can be spent on propagating blocks, given that a block is currently produced roughly every 20 seconds.
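
To make the ordering rule and the suggested change concrete, here is a rough Python sketch of the comparison. It is only an illustration of the rules as quoted, not the actual node implementation. The Tip class, the compare function and the sample values are hypothetical. It also assumes that a higher operational certificate issue number is preferred, that one slot lasts one second (so Delta = 5 slots), and that “equal” means a node simply keeps its current tip.

```python
from dataclasses import dataclass

DELTA_SLOTS = 5  # assumption: 1 slot = 1 second, so Delta = 5 seconds

@dataclass(frozen=True)
class Tip:
    chain_length: int
    slot: int
    issuer: str
    opcert_counter: int
    vrf: int              # leader VRF value; lower is preferred
    ours: bool = False    # True if this node produced the block itself

def _cmp(x, y):
    return (x > y) - (x < y)

def compare(a, b, now_slot=None):
    """Return >0 if a is preferred, <0 if b is preferred, 0 if equal.

    With now_slot=None the current rule is applied. With now_slot set,
    the VRF tie-break only applies when both slots are within Delta of now.
    """
    result = _cmp(a.chain_length, b.chain_length)            # rule 1
    if result == 0 and a.slot == b.slot:                     # rule 2
        result = _cmp(a.ours, b.ours)
    if result == 0 and a.issuer == b.issuer:                 # rule 3 (assumed: higher wins)
        result = _cmp(a.opcert_counter, b.opcert_counter)
    if result == 0:                                          # rule 4
        in_window = now_slot is None or all(
            abs(now_slot - t.slot) <= DELTA_SLOTS for t in (a, b))
        if in_window:
            result = _cmp(b.vrf, a.vrf)  # descending: lower VRF preferred
    return result

# Hypothetical case: another pool's block for slot 995 arrives 30 s late.
mine = Tip(chain_length=101, slot=1000, issuer="MINE", opcert_counter=3, vrf=900)
late = Tip(chain_length=101, slot=995, issuer="OTHER", opcert_counter=7, vrf=100)

print(compare(mine, late))                  # current rule
print(compare(mine, late, now_slot=1025))   # suggested rule, evaluated 30 s later
```

Under the current rule the late block wins on VRF. Under the suggested rule the two blocks compare as equal once they fall outside the Delta window, so in this model the node would keep the tip it already has.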

See this link: Consensus should favor expected slot height to ward off delay attack · Issue #2913 · input-output-hk/ouroboros-network · GitHub

So when you say a “height battle”, I interpret this as a fork with a different number of blocks in each branch. The consensus mechanism will prefer the longest chain, based on rule 1 above.

I agree. But we need to be careful because we DO want pool operators in different countries across the globe and maybe eventually even in space. And we DON’T want every pool housed in one ISP’s data centre in the USA for quickest propagation. However, I think there do need to be limits, and a 30 second delay is ridiculous. Furthermore, my pool was penalised for this other pool’s poor network connectivity, which is a bit perverse.
