New producer unstable - memory spikes to 100%, then crashes repeatedly

Hi all, I’m a long-time observer of Cardano, finally trying to get a stake pool up and running. While I have plenty of development experience, I don’t use Linux much, so although I’ve read a lot, I’m still learning!

I have 1 relay up on a DigitalOcean VPS (2 CPUs, 4 GB). It’s showing 16 peers in, is talking to my producer, and is running topologyUpdater (“glad you’re staying with us”).

The producer (on a separate DigitalOcean VPS, same spec as the relay) looks OK intermittently: it is registered and shows increasing Tx in gLiveView when it’s running, but it experiences rapid memory and CPU spikes to 100% and then restarts. There is so much clutter in syslog from all the cncli and UFW messages that it’s hard to see what is setting off the problems.

Any suggestions would be welcomed - I haven’t seen similar issues on the forums so have hit a bit of a wall! Thanks in advance.

Yes, turn off cncli script

Cheers,

Hi, thanks for the prompt response.
I’ve got it all hooked up with systemd services. What is the best way to turn off cncli?

  1. If you want to disable some of the deployed services, run sudo systemctl disable <service>
  • cnode-cncli-sync.service
  • cnode-cncli-leaderlog.service
  • cnode-cncli-validate.service
  • cnode-cncli-ptsendtip.service
  • cnode-cncli-ptsendslots.service
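
For anyone following along, here is a minimal sketch of disabling all five units in one go. It assumes the unit names are exactly the ones listed above and that you also want them stopped right away (that is what --now adds to systemctl disable). DRY_RUN is a hypothetical switch added for this sketch so the commands can be reviewed before anything is changed.

```shell
# Unit names taken from the list above.
CNCLI_SERVICES="cnode-cncli-sync.service cnode-cncli-leaderlog.service cnode-cncli-validate.service cnode-cncli-ptsendtip.service cnode-cncli-ptsendslots.service"

# Print the systemctl command for each cncli unit (DRY_RUN=1, the default),
# or actually run it when DRY_RUN=0. DRY_RUN is an assumption of this sketch.
disable_cncli_services() {
  for svc in $CNCLI_SERVICES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "sudo systemctl disable --now $svc"
    else
      sudo systemctl disable --now "$svc"
    fi
  done
}

disable_cncli_services
```

Run it with DRY_RUN=0 to apply the changes. Plain disable only removes a unit from startup; --now also stops a unit that is currently running.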

On a 2 CPU, 4 GB memory system you will get a lot of CPU and memory spikes as a result of the TraceMempool setting in your config. This has been discussed in other forums, and hopefully the next version of the node will help fix it. In the meantime, you can turn off the TraceMempool trace in config.json to reduce CPU usage, but you will lose the “Transaction Processed” metric on the node. It is a trade-off for sure. If you are sure that your block producer node is getting the TXs, then turn off TraceMempool logging for the relay. Also, make sure that you create a swap file for your relay and BP. That should help on a machine with few cores and only 4 GB of memory.
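
A minimal sketch of the config change described above, assuming the node's config.json contains an entry of the form "TraceMempool": true. The function name and the example path in the comment are hypothetical, so adjust them to your own setup. It keeps a .bak copy of the original and uses plain sed rather than a JSON tool.

```shell
# Set "TraceMempool" to false in the given config file.
# The original file is preserved next to it with a .bak suffix.
disable_trace_mempool() {
  config_file="$1"
  cp "$config_file" "$config_file.bak" || return 1
  # Read from the backup and rewrite the live file; only a true value is changed.
  sed 's/"TraceMempool"[[:space:]]*:[[:space:]]*true/"TraceMempool": false/' \
    "$config_file.bak" > "$config_file"
}

# Example call (hypothetical path, use wherever your config.json lives):
# disable_trace_mempool /opt/cardano/cnode/files/config.json
```

The node reads its config at startup, so it needs a restart before the change takes effect.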

Yep thanks, I’d tried switching off TraceMempool prior to reaching out. It appeared to lower my steady memory use but didn’t appear to affect the mem spikes/shutdowns.

Hmm, that appears to have done the trick. The producer has been running steadily, with no mem spikes and no crashes, for a couple of hours.
So I guess one of the cncli services must have been leaking mem?
To be honest I haven’t deeply researched the utility of the various cncli tools. Is it particularly recommended to use any or all of them, or is it better to keep the producer installation clean?

Yes, because it is consuming a lot of resources.

So you’d recommend more capacity on the producer in a production environment?
I’m assuming 8 GB would do it for RAM. Are extra CPUs recommended as well?

You can also upgrade the CPU; 4 vCPUs should be enough.

Many thanks for your assistance - much appreciated!

You’re welcome, anytime!