No more missed slots (during epoch) after changing one setting

Shouldn’t ‘+RTS -N -RTS’ be set before the …run? That is, “cardano-node +RTS -N -RTS run …”

I run it this way and it works!

@mini1pool
It doesn’t really matter.
+RTS tells the runtime that the RTS options start here.
-RTS marks the end of the RTS options.
In theory, you should be able to put them anywhere on the command line as long as you wrap them in +RTS and -RTS.
That said, when I was testing, setting them before the run errored out for me for some reason. I don’t know why. Setting them after the run worked.
Go figure…
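
To make the delimiter idea concrete, here is a toy model in shell. It is only a sketch of the rule described above, not the real GHC runtime parser, and the file name topology.json is a placeholder. It shows that, in this model, both placements give the same split; it does not explain the error reported above.

```shell
#!/bin/sh
# Toy model of the +RTS/-RTS delimiters; NOT the real GHC RTS parser.
# It ignores "--RTS" and "--", which the real runtime also understands.
# Prints the runtime options on line 1 and the program arguments on line 2.
split_rts_args() {
  rts_args=""
  prog_args=""
  in_rts=0
  for arg in "$@"; do
    case "$arg" in
      +RTS) in_rts=1 ;;
      -RTS) in_rts=0 ;;
      *)
        if [ "$in_rts" -eq 1 ]; then
          rts_args="${rts_args:+$rts_args }$arg"
        else
          prog_args="${prog_args:+$prog_args }$arg"
        fi
        ;;
    esac
  done
  printf '%s\n%s\n' "$rts_args" "$prog_args"
}

# Both placements yield the same split in this model.
split_rts_args +RTS -N4 -RTS run --topology topology.json
split_rts_args run --topology topology.json +RTS -N4 -RTS
```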

Not sure if it’s related, but we recently went through a similar issue with missed slots, which we solved by increasing CPU. We believe it was caused by io_wait while the node was taking ledger snapshots. The missed slots line up with the snapshot-taking entries in the logs.

The issue is on GitHub.

So during the epoch switch, right? You were able to completely avoid missing slots during the first 5 min of the epoch by increasing CPU? Can you explain a bit more, please?
And the GitHub issue says it will be looked at after Alonzo.
Could you please share the GitHub issue link?

Yes, looks like it. But we need to phrase it differently, otherwise the solution might be misunderstood. Just increasing the number of CPUs doesn’t cut it. The Haskell runtime will still use only one core by default, no matter how many physical CPUs you have. You actually need to get the binary to use the available cores. Setting the RTS options finally did it for me.

I can confirm that after setting the RTS core options, I have no more missing slots during the epoch and only 62 missed slots during the epoch change.

For a CoinCashew setup, make the change in your startBlockProducingNode.sh script.

Open it in vi or your favorite editor and add the RTS options to the cardano-node line:
vi startBlockProducingNode.sh
cardano-node +RTS -N4 -RTS run --topology ${TOPOLOGY}…

Then restart the service:
sudo systemctl restart cardano-node
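
After a restart it can be worth confirming that the running process really received the flags. The helper below is a hypothetical sketch: it only inspects a command line passed on stdin and assumes -N is the first option after +RTS, as in the examples in this thread.

```shell
#!/bin/sh
# Succeeds if the command line read from stdin contains "+RTS -N" or
# "+RTS -N<number>". Assumes -N directly follows +RTS.
cmdline_has_rts_n() {
  grep -Eq '\+RTS[[:space:]]+-N[0-9]*([[:space:]]|$)'
}

# Possible usage on Linux (procps ps), printing the arguments of the
# running node and checking them:
#   ps -C cardano-node -o args= | cmdline_has_rts_n && echo "RTS -N is set"
```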

I took per-second snapshots of the 8-core BP (4 cores dedicated to Cardano) during the epoch change when I was missing blocks.

Started 09:45:51 - 09:47:27 UTC

Happy to share the logs with you, but there was no %iowait during this period. One question, though: you mentioned seeing ledger snapshots being taken in the logs. Which logs were you looking at?

I posted three days ago that I was going to try updating the setting, and so far it has worked for me. No missed slots since then other than those missed at the epoch change. Fingers crossed there are no missed slots this epoch.

And fewer missed slots during today’s epoch change compared to the last

. /home/xxxxx/.profile
CPU_CORES=4
[[ -n ${CPU_CORES} ]] && CPU_RUNTIME=( "+RTS" "-N${CPU_CORES}" "-RTS" ) || CPU_RUNTIME=()
/home/xxxxx/.local/bin/cardano-node "${CPU_RUNTIME[@]}" run \

You don’t need to specify the number of cores. Just use “-N” without a number and the runtime will use all available cores.
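
A small helper can cover both cases, a fixed core count or a bare -N. This is a sketch only; the function name rts_flags is made up for illustration.

```shell
#!/bin/sh
# Print the RTS option group for a given core count.
# With no argument (or an empty one) it prints a bare -N, which leaves the
# choice of core count to the runtime.
rts_flags() {
  cores="${1:-}"
  case "$cores" in
    "")
      printf '%s\n' "+RTS -N -RTS"
      ;;
    *[!0-9]*)
      echo "rts_flags: core count must be a whole number" >&2
      return 1
      ;;
    *)
      if [ "$cores" -gt 0 ]; then
        printf '%s\n' "+RTS -N${cores} -RTS"
      else
        echo "rts_flags: core count must be greater than zero" >&2
        return 1
      fi
      ;;
  esac
}

# Possible usage (left unquoted on purpose so the flags split into words):
#   cardano-node $(rts_flags 4) run ...
```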

I am not talking about the epoch cutover. The node keeps the chain in memory and dumps it to disk at a regular interval; this is called a ledger snapshot. The default is about every 75 minutes or so. That is when I would sometimes see one or two missed slots.

The settings seem to work fine. I got a few missed blocks after epoch change (150), but the number is not increasing, I’ll monitor the value along the epoch to see.
I have a question: since everyone is experiencing missing blocks at epoch change, does that mean that systematically the first few blocks of every epoch are not minted by anyone? What happens to the TXs belonging to those blocks?

They are carried over to the next blocks that get minted.

Great work on all xSPO alliance members. :smiling_face_with_three_hearts:

Got it! Thanks for the explanation. This is exactly what I can confirm too. I had regular missed slots during the epoch. I increased the cores and never had any again, except at the epoch switch.

VRF values and the slot leader schedule are known only to the pool, unless it mints and propagates a block. In that case the pool’s VRF value for that slot (and only that slot) is included in the block, alongside a proof that it’s the right value.

So in your example, the other pool’s block will get adopted and no one will ever know that there was another pool elected as slot leader for that same slot with a lower VRF value. Except that other pool’s SPO (who will be very frustrated) and everyone he/she told.

On a side note, while playing with RTS, it’s interesting to also read this post about memory usage: Solving the Cardano node huge memory usage - done

Hi,

Can you please elaborate on the snapshot? What do you mean by it keeping the chain in memory?
Did you mean it caches a portion of it (like any other database) and flushes it to disk periodically? Otherwise I don’t see how it is possible. Average memory consumption per node instance is around 7.5 GB. The Cardano db size is 12 GB at the moment.

Just wondering whether multicore has to be enabled by passing the above options to cabal (i.e. recompiling cardano-node with these params), or will setting the runtime options “+RTS -N4 -RTS” work without recompiling?

Thanks

I think he means the collection of UTxOs (and stake address balances), because that’s enough for the snapshot. The whole chain isn’t needed for that. Also, the actual memory usage is much less than 7.5 GB; a big chunk of it is due to how GHC manages memory and garbage collection. See the link in my previous reply to this post.

No, it will not work.
The program will run concurrently without any options, but the threads will not run in parallel on multiple physical cores. You need to pass -threaded at compile time so the binary is linked against the threaded runtime library.
You can also set the number of cores at compile time and not bother setting it at runtime. In that case you will need:
-rtsopts
-threaded
-with-rtsopts=-N
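
To see what an already built binary was linked with, the GHC runtime can print its own build information via +RTS --info. The sketch below parses that output; it assumes the usual key names "RTS way" and "Flag -with-rtsopts" and that threaded builds report a way containing "thr" (for example rts_thr). The function names are made up for illustration.

```shell
#!/bin/sh
# Both functions read the text printed by "<binary> +RTS --info -RTS" on stdin.

# Succeeds if the "RTS way" entry mentions "thr", which is assumed here to
# indicate the threaded runtime.
is_threaded_rts() {
  grep '"RTS way"' | grep -q 'thr'
}

# Prints the RTS options baked in at compile time with -with-rtsopts
# (prints nothing if the entry is missing).
baked_rtsopts() {
  sed -n 's/.*("Flag -with-rtsopts", "\(.*\)").*/\1/p'
}

# Possible usage:
#   cardano-node +RTS --info -RTS | is_threaded_rts && echo "threaded runtime"
#   cardano-node +RTS --info -RTS | baked_rtsopts
```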

Since today I personally use the following in cabal.project.local:

executable cardano-node
  ghc-options:          -O3
                        -threaded
                        -rtsopts
                        "-with-rtsopts=-N -qg1"

package cardano-crypto-praos
  flags: -external-libsodium-vrf

Maybe this is helpful for someone :slight_smile:

Note that the -threaded option is there by default in cardano-node.cabal:

executable cardano-node
  hs-source-dirs:      app
  main-is:             cardano-node.hs
  default-language:    Haskell2010
  ghc-options:         -threaded
                       -Wall
                       -rtsopts
                       -Wno-unticked-promoted-constructors
  if arch(arm)
    ghc-options:         "-with-rtsopts=-T -I0 -N1 -A16m"
  else
    ghc-options:         "-with-rtsopts=-T -I0 -N2 -A16m"

Why do you disable the parallel GC in generation 0?
