How to gauge the BP node performance?

adalf1 · 2 February 2022 15:38

If an expert is looking at the LiveView snapshot below, what would be her conclusion? A BP node is on par, so-so, not good? If somebody could comment on the BLOCK PROPAGATION section and “Missed slot leader checks”, it would be great.

┌────────────────────────────────┬────────────┬────────────────────────┐
│ Uptime: 4d 20:12:52            │ Port: 6000 │ Guild LiveView v1.25.1 │
│--------------------------------└────────────┴────────────────────────┤
│ Epoch 318 [54.7%], 2d 06:18:21 remaining                             │
│ ▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▌▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖ │
│                                                                      │
│ Block      : 6832877   Tip (ref)  : 52249299  Forks      : 861       │
│ Slot       : 52249296  Tip (diff) : 3 :)      Total Tx   : 0         │
│ Slot epoch : 236496    Density    : 4.536     Pending Tx : 0/0K      │
│- CONNECTIONS --------------------------------------------------------│
│ P2P        : disabled  Incoming   : 2         Outgoing   : 2         │
│- BLOCK PROPAGATION --------------------------------------------------│
│ Last Delay : 1.49s     Served     : 517       Late (>5s) : 23        │
│ Within 1s  : 3.89%     Within 3s  : 94.44%    Within 5s  : 99.72%    │
│- NODE RESOURCE USAGE ------------------------------------------------│
│ CPU node   : 18.9%     Mem (Live) : 4.6G      GC Minor   : 263254    │
│ Mem (RSS)  : 9.6G      Mem (Heap) : 9.6G      GC Major   : 1063      │
├─ CORE ───────────────────────────────────────────────────────────────┤
│ KES current/remaining             : 403 / 44                         │
│ KES expiration date               : 2022-04-09 05:44:51 EDT          │
│ Missed slot leader checks         : 783 (0.1887 %)                   │
│- BLOCK PRODUCTION ---------------------------------------------------│
│ Leader     : 0         Adopted    : 0         Missed     : 0         │
│ Ideal      : 0.01      Confirmed  : 0         Ghosted    : 0         │
│ Luck       : 0.0%      Invalid    : 0         Stolen     : 0         │
└──────────────────────────────────────────────────────────────────────┘

georgem1976 · 3 February 2022 09:14

Block propagation does not look good. It is >99.5% for all my nodes.
Missed slot leader checks looks pretty bad. If the hosting were of good quality, it would now be around 130 (during the last epoch transition, “Missed slot leader checks” came in between 125 and 130 on a few nodes for which I and other SPOs gathered statistics), and a block producer on good hosting will have 0 missed slot leader checks for the period between epoch transitions. Cheap hosting will probably not be good hosting.
But I’ve seen a lot worse, I’ve heard of more than 1%, too.
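
As a rough sanity check of the numbers in the snapshot, here is a minimal Python sketch. It assumes the percentage LiveView prints is simply missed checks divided by total checks (an assumption, not something confirmed in this thread), and it uses the ~130 epoch-transition figure quoted above as a baseline. The function names are hypothetical.

```python
def implied_total_checks(missed: int, missed_pct: float) -> int:
    """Back out the approximate number of slot leader checks.

    Assumes missed_pct = 100 * missed / total, which is a guess at how
    the LiveView percentage is derived.
    """
    return round(missed / (missed_pct / 100.0))


def excess_missed(missed: int, transition_baseline: int = 130) -> int:
    """Missed checks beyond what one epoch transition is said to cost.

    The default baseline of 130 is the rule-of-thumb figure from this
    thread, not an official constant.
    """
    return max(missed - transition_baseline, 0)


# Values from the first snapshot: 783 missed (0.1887 %).
total = implied_total_checks(783, 0.1887)
excess = excess_missed(783)
print(f"implied total checks: {total}")
print(f"missed beyond one epoch transition: {excess}")
```

The implied total is in the same range as the node's uptime in seconds (4d 20:12:52), which fits one check per one-second slot. Since that uptime spans a single epoch transition, anything beyond the baseline is what the hosting would have to explain.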

adalf1 · 3 February 2022 13:32

Thank you for your reply! Just wanted to clarify “Block propagation does not look good. It is >99.5% for all my nodes”. What measurement is >99.5% on your nodes? Within 1s? Within 3s?

georgem1976 · 3 February 2022 14:12

Sorry, within 3 seconds. I thought I wrote it.

adalf1 · 3 February 2022 21:04

Great, thank you!
This brings the next two questions.

What is a good enough hosting as far as a Cardano pool is concerned?
My BP and two relays are virtual servers. All three have 6 cores (2.8 GHz), 16GB memory, 400 SSD, IP address, 32TB traffic.

The pings to cnn.com, yahoo.com, bbc.co.uk are 1.5, 10.0, 1.6 ms respectively, which seems OK. What else can I measure to estimate the hosting quality? Some benchmark utility maybe?

Another aspect of this is Haskell runtime parameters. If anybody had success tweaking those, please share. The only parameter I changed is this: CPU_CORES=6.

georgem1976 · 3 February 2022 21:22

Your configurations are very good. And the CPU_CORES setting is OK.
You cannot really know if your hosting provider has a good quality before trying it. Price might be a good hint: if it is cheap, it is probably not good, because it is probably over-provisioning the resources.
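
One thing that can be measured from inside a virtual server is CPU steal time, which is the time the hypervisor withheld the CPU from the guest and so tends to rise on over-provisioned hosts. The following is a minimal Python sketch, assuming a Linux guest where the aggregate "cpu" line of /proc/stat carries the steal counter as its eighth numeric field. The function names are hypothetical, and what counts as "too much" steal is left open here.

```python
import time


def read_cpu_fields(stat_text: str) -> list[int]:
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(x) for x in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before: str, after: str) -> float:
    """Share of CPU time stolen by the hypervisor between two samples."""
    a, b = read_cpu_fields(before), read_cpu_fields(after)
    if len(a) < 8 or len(b) < 8:
        raise ValueError("steal field not present in /proc/stat")
    deltas = [y - x for x, y in zip(a, b)]
    # Field order: user nice system idle iowait irq softirq steal ...
    # Guest time is already counted in user/nice, so sum only the first 8.
    total = sum(deltas[:8])
    return 100.0 * deltas[7] / total if total > 0 else 0.0


def sample_steal(interval_s: float = 5.0) -> float:
    """Take two /proc/stat samples interval_s apart (Linux only)."""
    with open("/proc/stat") as f:
        first = f.read()
    time.sleep(interval_s)
    with open("/proc/stat") as f:
        second = f.read()
    return steal_percent(first, second)
```

Calling sample_steal() on the BP at different times of day, for example around epoch transitions, gives a number that can be compared between providers.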

adalf1 · 5 February 2022 14:34

The numbers are slightly better if I run the node with --nonmoving-gc. This clearly comes at the expense of higher memory consumption, which is expected given the nature of the flag: memory consumption increased by ~40%. Also, cncli.sh leaderlog kills the node because there is not enough memory for both cnode and cncli leaderlog. Rolling back to the original parameters.

┌────────────────────────────────┬────────────┬────────────────────────┐
│ Uptime: 1d 15:17:04            │ Port: 6000 │ Guild LiveView v1.25.1 │
│--------------------------------└────────────┴────────────────────────┤
│ Epoch 319 [13.8%], 4d 07:21:48 remaining                             │
│ ▌▌▌▌▌▌▌▌▌▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖▖ │
│                                                                      │
│ Block      : 6844867   Tip (ref)  : 52504692  Forks      : 261       │
│ Slot       : 52504627  Tip (diff) : 65        Total Tx   : 0         │
│ Slot epoch : 59827     Density    : 4.640     Pending Tx : 0/0K      │
│- CONNECTIONS --------------------------------------------------------│
│ P2P        : disabled  Incoming   : 2         Outgoing   : 2         │
│- BLOCK PROPAGATION --------------------------------------------------│
│ Last Delay : 1.12s     Served     : 322       Late (>5s) : 12        │
│ Within 1s  : 11.94%    Within 3s  : 98.52%    Within 5s  : 100.00%   │
│- NODE RESOURCE USAGE ------------------------------------------------│
│ CPU node   : 18.6%     Mem (Live) : 4.5G      GC Minor   : 76337     │
│ Mem (RSS)  : 15.2G     Mem (Heap) : 15.4G     GC Major   : 98        │
├─ CORE ───────────────────────────────────────────────────────────────┤
│ KES current/remaining             : 405 / 42                         │
│ KES expiration date               : 2022-04-09 05:44:51 EDT          │
│ Missed slot leader checks         : 203 (0.1443 %)                   │
│- BLOCK PRODUCTION ---------------------------------------------------│
│ Leader     : 0         Adopted    : 0         Missed     : 0         │
│ Ideal      : -         Confirmed  : 0         Ghosted    : 0         │
│ Luck       : -         Invalid    : 0         Stolen     : 0         │
└──────────────────────────────────────────────────────────────────────┘
