Relay - Failed with result 'signal'

Hi all,

Every 24 hours, one of my nodes shuts down with the following message:

cnode.service: Failed with result 'signal'.
Stopped Cardano Node.
Started Cardano Node.

After this crash/restart the node runs for another 24 hours and then does it again. My other nodes don’t get this error, and I can’t figure out what the problem might be.

Could someone explain to me what might be happening here or how to solve it?

Kind regards,
Ideal

It could be the topologyUpdater or a cron job that restarts the node once every 24 hours.
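A couple of quick ways to check for that (standard crontab/systemd commands; the exact service and timer names depend on how you installed the node):

crontab -l                   # cron jobs for your node user
sudo crontab -l              # cron jobs for root
systemctl list-timers --all  # systemd timers that might restart the node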


My other nodes also restart, but they don’t show this message. The node is still performing well, so it doesn’t seem to be a serious issue. Still weird, though…

Did you check whether there is a cron job that kills the node every 24 hours?

I have been monitoring the node for a while now. There is nothing on crontab and there are no other unusual messages in the journal. The node performs more than fine so I am kind of clueless now.
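For reference, this is roughly how I scanned the journal around the restarts (nothing stood out):

journalctl -u cnode.service --since "2 days ago" | grep -iE "signal|killed|error"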

Is it when the memory gets exhausted? Do you have swap enabled?

You can check memory usage with:

free
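If you are unsure about swap, these show the totals in a friendlier format (plain procps/util-linux commands):

free -h        # RAM and swap in human-readable units
swapon --show  # active swap devices/files; empty output means no swap configured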

If you run the following you will get an output showing your current RTS settings, missed slots, time your node has been running, and memory usage:

printf "%0.s-" {1..70}; echo; startsec=$(curl -s -H 'Accept: application/json' http:/localhost:12788 | jq '.cardano.node.metrics.nodeStartTime.int.val'); startdate="$(date -d @${startsec})"; nowsec=$(date '+%s'); nowdate="$(date)"; runhrs=$(( (nowsec - startsec) / 3600 )); runmins=$(( (nowsec - startsec) % 3600 / 60 )); rtsconf="$(ps aux | grep -Po "cardano-node\s.*\+RTS\s.*\-RTS")"; missedslots=$(curl -s -H 'Accept: application/json' http:/localhost:12788 | jq '.cardano.node.metrics.slotsMissedNum.int.val'); echo "Node Started: ${startdate} (Running: ${runhrs} hrs ${runmins} mins)"; echo "RTS settings: ${rtsconf}"; echo "Missed slots: ${missedslots}"; echo "Memory use:"; free;

Maybe that will give you some clues?


Thanks for the suggestion. This node still has 15GB of available RAM while operating. So, I don’t believe RAM is an issue. The last command also returns that the node hasn’t missed a slot so far.

I will leave it as is since I can’t seem to find the cause and the node is performing well. Thanks for the help!

Check the restart mechanism on each node and see whether there are differences in how the nodes are restarted:

sudo systemctl status cnode-tu-restart.service
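It can also help to compare how systemd last stopped the problem node versus a healthy one; something like this, assuming the service is called cnode.service on all of them:

systemctl show cnode.service -p Result -p ExecMainCode -p ExecMainStatus
journalctl -u cnode.service --since "2 days ago" | grep -iE "stopp|signal|killed"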

The reason I was wondering about the memory and swap was because I had a similar issue some months ago.

When the Haskell cardano-node runs its garbage collector it needs to allocate a lot of memory. If I recall correctly, the Haskell runtime was briefly allocating roughly double its previous amount of memory, which pushed it above my 16G RAM limit and caused the operating system to kill the process with an “out of memory” (OOM) error.
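If that ever turns out to be the problem here, the RTS options passed to cardano-node can be tuned. The flags below are standard GHC RTS options and the values are only illustrative, not a recommendation:

cardano-node run ... +RTS -N --nonmoving-gc -RTS   # non-moving old-generation collector, avoids the big copying spike
cardano-node run ... +RTS -N -H3G -RTS             # or suggest a larger heap so major collections happen less often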

I recall the log file showing kernel messages about killing the cardano-node process due to this OOM error. Did you see any such errors in your logs?
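If it happened, the kernel usually logs something like “Out of memory: Killed process … (cardano-node)”; a grep over the kernel journal should surface it (standard journalctl/dmesg usage):

journalctl -k --since "2 days ago" | grep -iE "out of memory|oom"
dmesg -T | grep -iE "out of memory|oom"   # alternative if the journal doesn’t go back far enough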

Once I allocated 16G of swap the operating system didn’t kill the process any more.
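In case it helps, this is the usual way to add a swap file on an Ubuntu/Debian-style install (adjust the size and path as needed):

sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab   # keep it across reboots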