New producer unstable - spiking mem to 100% then crash repeatedly

PaulD · 28 February 2021 22:31

Hi all, I’m a long-time observer of Cardano, finally trying to get a stake pool up and running. While I have plenty of development experience, I don’t use Linux much, so although I’ve read a lot, I’m still learning!

I have 1 relay up on a DigitalOcean VPS (2 CPUs, 4GB) - it’s showing 16 peers in, is talking to my producer, and is running topologyUpdater (“glad you’re staying with us”).

The producer (on a separate DigitalOcean VPS, same spec as the relay) looks OK intermittently: it is registered and shows increasing Tx in gLiveView when it’s running, but it is experiencing rapid memory and CPU spikes to 100% and then restarting. There is so much clutter in syslog from all the cncli and UFW messages that it’s hard to see what is setting off the problems.

Any suggestions would be welcomed - I haven’t seen similar issues on the forums so have hit a bit of a wall! Thanks in advance.
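
One way to cut through the syslog clutter described above is to filter out the UFW and cncli lines before reading. This is a minimal sketch, assuming the log lives at /var/log/syslog (the Ubuntu default), that UFW entries carry a "[UFW " tag, and that the cncli lines contain the string "cncli". The names filter_noise and LOG_FILE are made up for this example.

```shell
#!/bin/sh
# Drop UFW firewall lines and cncli lines from a log stream so the remaining
# entries (service restarts, kernel messages) are easier to spot.
# filter_noise is an illustrative helper name, not part of any tool.
filter_noise() {
  grep -v -e '\[UFW ' -e 'cncli'
}

# Assumption: the system log is at /var/log/syslog; override with LOG_FILE=...
LOG_FILE="${LOG_FILE:-/var/log/syslog}"

# Only read the log if it exists and is readable, then show the last 50
# lines that survive the filter.
if [ -r "$LOG_FILE" ]; then
  filter_noise < "$LOG_FILE" | tail -n 50
fi
```

If the kernel is killing the node for lack of memory, the remaining lines will typically include an "Out of memory" entry naming the killed process.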

Alexd1985 · 28 February 2021 22:38

Yes, turn off the cncli script.

Cheers,

PaulD · 28 February 2021 22:46

Hi, Thanks for the prompt response,
I’ve got it all hooked up with systemd services - what is the best way to turn off cncli?

Alexd1985 · 28 February 2021 22:49

If you want to disable some of the deployed services, run sudo systemctl disable <service>

cnode-cncli-sync.service
cnode-cncli-leaderlog.service
cnode-cncli-validate.service
cnode-cncli-ptsendtip.service
cnode-cncli-ptsendslots.service
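
Note that systemctl disable only stops a unit from starting at boot; adding --now also stops it if it is currently running. Below is a small sketch that applies this to the units listed above, assuming those unit names match your install. DRY_RUN and disable_cncli_services are illustrative names, and with DRY_RUN=1 (the default here) the script only prints the commands so you can review them first.

```shell
#!/bin/sh
# Stop and disable the cncli-related units from the list above.
# DRY_RUN=1 (default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Unit names as given in this thread; adjust if your install differs.
SERVICES="cnode-cncli-sync.service
cnode-cncli-leaderlog.service
cnode-cncli-validate.service
cnode-cncli-ptsendtip.service
cnode-cncli-ptsendslots.service"

disable_cncli_services() {
  for svc in $SERVICES; do
    if [ "$DRY_RUN" = "1" ]; then
      # Review mode: show what would be executed.
      echo "sudo systemctl disable --now $svc"
    else
      # --now stops the running unit in addition to disabling it at boot.
      sudo systemctl disable --now "$svc"
    fi
  done
}

disable_cncli_services
```

Run it with DRY_RUN=0 to actually apply the change.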

MantisPool-MANT · 28 February 2021 23:55

On a 2 CPU, 4GB memory system you will get a lot of CPU and memory spikes as a result of the TraceMempool setting in your config. This has been discussed in other forums, and hopefully the next version of the node will help fix it. In the meantime, you can turn off the TraceMempool trace in config.json to reduce CPU usage, but you will lose the “Transaction Processed” metric on the node. It is a trade-off for sure. If you are sure that your block producer node is getting the TXs, then turn off the TraceMempool logging for the **relay**. Also, make sure that you create a swap file for your relay and BP. That should help with the limited cores and 4GB of memory.
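
As a rough sketch of the TraceMempool change: the function below backs up a config file and flips "TraceMempool": true to false. It assumes the key appears in that form in your config.json. disable_trace_mempool is a hypothetical helper name, and the config path depends on your setup.

```shell
#!/bin/sh
# Set "TraceMempool" to false in a node config file, keeping a .bak copy.
# Usage: sh disable-trace-mempool.sh /path/to/config.json
# disable_trace_mempool is an illustrative helper, not part of any tool.
disable_trace_mempool() {
  cfg="$1"
  # Keep the original so the change is easy to revert.
  cp "$cfg" "$cfg.bak" || return 1
  # Rewrite from the backup; tolerate varying whitespace around the colon.
  sed 's/"TraceMempool"[[:space:]]*:[[:space:]]*true/"TraceMempool": false/' \
    "$cfg.bak" > "$cfg"
}

# When a path is given on the command line, apply the change and show the result.
if [ -n "${1:-}" ]; then
  disable_trace_mempool "$1" && grep '"TraceMempool"' "$1"
fi
```

The node reads its config at startup, so it needs a restart before the change takes effect.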

PaulD · 1 March 2021 00:44

Yep thanks, I’d tried switching off TraceMempool prior to reaching out - it appeared to lower my steady memory use but didn’t appear to impact the mem spikes/shutdowns.

PaulD · 1 March 2021 00:51

Hmm - disabling the cncli services appears to have done the trick. The producer has been running steadily, with no mem spikes and no crashes, for a couple of hours.
So I guess one of the cncli services must have been leaking mem?
To be honest, I haven’t deeply researched the utility of the various cncli tools - is it particularly recommended to use any/all of them, or is it better to keep the producer installation clean?

Alexd1985 · 1 March 2021 00:55

Yes, because it is consuming a lot of resources.

PaulD · 1 March 2021 01:36

So you’d recommend more capacity on the producer in a production environment?
I’m assuming 8GB would do it for RAM - are extra CPUs recommended as well?

Alexd1985 · 1 March 2021 01:43

You can also upgrade the CPU… 4 vCPUs should be enough.

PaulD · 1 March 2021 01:46

Many thanks for your assistance - much appreciated!

Alexd1985 · 1 March 2021 01:47

You’re welcome, anytime!
