Relay - Failed with result 'signal'

Hi all,

Every 24 hours, one of my nodes shuts down with the following message:

cnode.service: Failed with result 'signal'.
Stopped Cardano Node.
Started Cardano Node.

After this crash/restart the node runs for another 24 hours and then does it again. My other nodes don’t get this error, and I can’t figure out what the problem might be.

Could someone explain to me what might be happening here or how to solve it?

Kind regards,
Ideal

It could be the topologyUpdater or a cron job that restarts the node once every 24 hours.
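A couple of quick ways to check for that (standard crontab/systemd commands; the exact service and timer names depend on how you installed the node):

crontab -l                   # cron jobs for your node user
sudo crontab -l              # cron jobs for root
systemctl list-timers --all  # systemd timers that might restart the node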


My other nodes also restart, but they don’t show this message. The node is still performing well, so it doesn’t seem to be a serious issue. Still weird, though…

Did you check whether there is a cron job that kills the node every 24 hours?

I have been monitoring the node for a while now. There is nothing on crontab and there are no other unusual messages in the journal. The node performs more than fine so I am kind of clueless now.
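For reference, this is roughly how I scanned the journal around the restarts (nothing stood out):

journalctl -u cnode.service --since "2 days ago" | grep -iE "signal|killed|error"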

Is it when the memory gets exhausted? Do you have swap enabled?

You can check memory usage with:

free
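If you are unsure about swap, these show the totals in a friendlier format (plain procps/util-linux commands):

free -h        # RAM and swap in human-readable units
swapon --show  # active swap devices/files; empty output means no swap configured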

If you run the following you will get an output showing your current RTS settings, missed slots, time your node has been running, and memory usage:

printf "%0.s-" {1..70}; echo; startsec=$(curl -s -H 'Accept: application/json' http:/localhost:12788 | jq '.cardano.node.metrics.nodeStartTime.int.val'); startdate="$(date -d @${startsec})"; nowsec=$(date '+%s'); nowdate="$(date)"; runhrs=$(( (nowsec - startsec) / 3600 )); runmins=$(( (nowsec - startsec) % 3600 / 60 )); rtsconf="$(ps aux | grep -Po "cardano-node\s.*\+RTS\s.*\-RTS")"; missedslots=$(curl -s -H 'Accept: application/json' http:/localhost:12788 | jq '.cardano.node.metrics.slotsMissedNum.int.val'); echo "Node Started: ${startdate} (Running: ${runhrs} hrs ${runmins} mins)"; echo "RTS settings: ${rtsconf}"; echo "Missed slots: ${missedslots}"; echo "Memory use:"; free;

Maybe that will give you some clues?


Thanks for the suggestion. This node still has 15GB of available RAM while operating. So, I don’t believe RAM is an issue. The last command also returns that the node hasn’t missed a slot so far.

I will leave it as is since I can’t seem to find the cause and the node is performing well. Thanks for the help!

Check the restart mechanism on each node and see whether there are differences in how the nodes are restarted:

sudo systemctl status cnode-tu-restart.service
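It can also help to compare how systemd last stopped the problem node versus a healthy one; something like this, assuming the service is called cnode.service on all of them:

systemctl show cnode.service -p Result -p ExecMainCode -p ExecMainStatus
journalctl -u cnode.service --since "2 days ago" | grep -iE "stopp|signal|killed"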

The reason I was wondering about the memory and swap was because I had a similar issue some months ago.

When the Haskell cardano-node runs its garbage collector it needs to allocate a lot of memory. If I recall correctly, the Haskell runtime was briefly allocating roughly double its previous amount of memory, which pushed it above my 16G RAM limit and caused the operating system to kill the process with an “out of memory” (OOM) error.
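If that ever turns out to be the problem here, the RTS options passed to cardano-node can be tuned. The flags below are standard GHC RTS options and the values are only illustrative, not a recommendation:

cardano-node run ... +RTS -N --nonmoving-gc -RTS   # non-moving old-generation collector, avoids the big copying spike
cardano-node run ... +RTS -N -H3G -RTS             # or suggest a larger heap so major collections happen less often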

I recall the log file showing kernel messages about killing the cardano-node process due to this OOM error. Did you see any such errors in your logs?
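If it happened, the kernel usually logs something like “Out of memory: Killed process … (cardano-node)”; a grep over the kernel journal should surface it (standard journalctl/dmesg usage):

journalctl -k --since "2 days ago" | grep -iE "out of memory|oom"
dmesg -T | grep -iE "out of memory|oom"   # alternative if the journal doesn’t go back far enough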

Once I allocated 16G of swap the operating system didn’t kill the process any more.
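In case it helps, this is the usual way to add a swap file on an Ubuntu/Debian-style install (adjust the size and path as needed):

sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab   # keep it across reboots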