Cardano node is crashing after several hours of running

Hello folks, I set up a relay server (4 GB RAM; 2 vCPU) a few weeks ago, running on mainnet. I seem to be running into an issue where the cardano-node process dies after several hours. The block-producing node, however, appears to crash at epoch transitions. This is what I noticed in the logs:

`bash[3066]: cardano-node: internal error: Unable to commit 1048576 bytes of memory`
`bash[3066]: (GHC version 8.10.2 for x86_64_unknown_linux)`
`bash[3066]: Please report this as a GHC bug:  https://www.haskell.org/ghc/reportabug`

After a little research, I ran into this bug, which identifies a memory leak in ouroboros:

I have a few follow-up questions:

  1. Have any of you experienced this issue? If so, how did you resolve it?
  2. I am running v1.25.1 of cardano-node, but how do I patch the memory leak that’s referenced in the GitHub issue I listed above?
  3. Has anyone tried increasing the memory from 4 GB to 8 GB, and do you know whether that alleviates the problem (or just postpones it)?

I appreciate any pointers to help resolve the issue.

Thanks,
Sumanth

With the recent increase in Tx load, 4 GB is not enough under normal running conditions, neither for the relay nor for the BP.
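To see how close a node actually gets to the limit before it dies, one option is to sample the process's resident memory from `/proc`. The following is only a minimal sketch, assuming a Linux host; the function names `parse_kb_field`, `to_mib` and `memory_report` are made up for illustration and are not part of cardano-node or any of its tooling. Finding the PID of the node process is left to you.

```python
import re
from pathlib import Path
from typing import Optional


def parse_kb_field(text: str, field: str) -> Optional[int]:
    """Return the kB value of a 'Field:   12345 kB' line, or None if absent.

    Works for both /proc/<pid>/status (e.g. VmRSS) and /proc/meminfo
    (e.g. MemAvailable), which share this line format on Linux.
    """
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) if match else None


def to_mib(kb: Optional[int]) -> str:
    """Format a kB value as MiB, or 'n/a' when the field was missing."""
    return "n/a" if kb is None else f"{kb / 1024:.0f} MiB"


def memory_report(pid: int) -> str:
    """Build a one-line summary for a running process (hypothetical helper).

    Assumption: pid is the PID of the node process you want to watch.
    """
    status = Path(f"/proc/{pid}/status").read_text()
    meminfo = Path("/proc/meminfo").read_text()
    rss = parse_kb_field(status, "VmRSS")
    available = parse_kb_field(meminfo, "MemAvailable")
    return f"pid {pid}: RSS {to_mib(rss)}, system available {to_mib(available)}"
```

Calling `memory_report(pid)` every minute or so (from cron, for example) and logging the result should show whether resident memory climbs steadily toward the 4 GB ceiling or only spikes around epoch transitions.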

Thank you for the clarification, @tomdx. I plan to increase it to 8 GB.

If you don’t mind me asking, in your experience, what might be the ideal memory/CPU config for the current load? Is there any documentation or website you could point me to that would help me better manage the pool servers?

Sumanth

I run this config on Docker

Relay: Intel, 4 CPU, 8GB RAM, 200 Mbit/s, Docker.
Block Producer: ARM64, 4 CPU, 8GB RAM, 200 Mbit/s, Docker.


Thank you @tomdx … much appreciated!
