My relay keeps restarting every 24 hours

For some strange reason, some of my relays randomly crash. They run on a Google Cloud VPS and were fine since 1.26.1; when I upgraded to 1.26.2, I started again from scratch using the Alex CNTools tutorial because it makes the upgrade easier.
Given my configuration below, how can I check the logs to see what happened?
[Screenshot: node configuration, 2021-04-24 15:30]

You can check the logs:

journalctl -e -f -u cnode.service

Apr 24 14:38:17 cardano-relay-2-migrate systemd[1]: cnode.service: Main process exited, code=killed, status=2/INT

Apr 24 14:38:17 cardano-relay-2-migrate systemd[1]: cnode.service: Succeeded.

Apr 24 14:38:17 cardano-relay-2-migrate systemd[1]: Stopped Cardano Node.

Apr 24 14:38:23 cardano-relay-2-migrate systemd[1]: Started Cardano Node.

Apr 24 14:38:24 cardano-relay-2-migrate cnode[6690]: Failed to query protocol-parameters from node, not yet fully started?

Apr 24 14:38:24 cardano-relay-2-migrate cnode[6690]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.

Apr 24 14:38:25 cardano-relay-2-migrate cnode[6690]: Listening on http://0.0.0.0:12798

Apr 24 14:43:13 cardano-relay-2-migrate systemd[1]: cnode.service: Current command vanished from the unit file, execution of the command list won’t be resumed.

Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: Stopping Cardano Node…

Apr 24 14:46:49 cardano-relay-2-migrate cnode[6690]: Shutting down…

Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: cnode.service: Main process exited, code=killed, status=2/INT

Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: cnode.service: Succeeded.

Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: Stopped Cardano Node.

Apr 24 14:46:55 cardano-relay-2-migrate systemd[1]: Started Cardano Node.

Apr 24 14:46:55 cardano-relay-2-migrate cnode[9124]: Failed to query protocol-parameters from node, not yet fully started?

Apr 24 14:46:55 cardano-relay-2-migrate cnode[9124]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.

Apr 24 14:46:56 cardano-relay-2-migrate cnode[9124]: Listening on http://0.0.0.0:12798

Apr 24 15:19:22 cardano-relay-2-migrate cnode[9124]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 9197 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --port ${CNODE_PORT} "${host_addr[@]}"

Apr 24 15:19:22 cardano-relay-2-migrate systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a

Apr 24 15:19:22 cardano-relay-2-migrate systemd[1]: cnode.service: Failed with result ‘exit-code’.

Apr 24 15:19:28 cardano-relay-2-migrate systemd[1]: cnode.service: Service RestartSec=5s expired, scheduling restart.

Apr 24 15:19:28 cardano-relay-2-migrate systemd[1]: cnode.service: Scheduled restart job, restart counter is at 1.

Apr 24 15:19:28 cardano-relay-2-migrate systemd[1]: Stopped Cardano Node.

Apr 24 15:19:36 cardano-relay-2-migrate systemd[1]: Started Cardano Node.

Type top and check the CPU and memory usage.

I don't have problems with memory now. By the way, both relays have 8 GB of RAM; I can't understand what's going on. I only have Grafana and Prometheus installed besides cardano-node. Right now it has been running for 4 hours without crashing, but it usually happens after running for more than 12 hours. Is there any way to reduce the number of peers? I have more than 15 peers in the topology from the topology updater - is that really needed?
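For what it's worth, recent versions of the guild-operators topologyUpdater.sh expose a MAX_PEERS user variable that caps how many peers the fetched topology contains - a sketch, assuming that variable name and the default CNTools install path (check your copy of the script first):

```shell
# Assumption: topologyUpdater.sh lives at the CNTools default path and
# defines a MAX_PEERS variable. Lower the peer cap, e.g. to 10:
sed -i 's/^MAX_PEERS=.*/MAX_PEERS=10/' /opt/cardano/cnode/scripts/topologyUpdater.sh
```

The new cap takes effect the next time the updater fetches a topology.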

If both nodes have the same issue, try setting TraceMempool to false in the configuration file on one of them (and keep it under monitoring).
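For reference, one way to flip that flag - the config path here is the CNTools default and an assumption, and jq must be installed:

```shell
# Assumption: the node config lives at the CNTools default path; adjust as needed.
CONFIG=/opt/cardano/cnode/files/config.json
# Set TraceMempool to false, then restart the node so the change takes effect.
jq '.TraceMempool = false' "$CONFIG" > "$CONFIG.tmp" && mv "$CONFIG.tmp" "$CONFIG"
sudo systemctl restart cnode.service
```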

Good, I'll try that on one of them and see what happens. Thank you.

Hi!

Any update on this?

I had a similar issue with exit code 137 after upgrading the relay to 1.26.2. I had to upgrade to an 8 GB instance with 100 GB storage.

Yes, exit code 137 can indicate an out-of-memory kill.

I see you are using a Google Cloud machine, which by default comes without a swap partition! Having swap avoids the node crashing when it runs out of RAM (I tested this). I suggest either adding a bit more RAM to cope with the spikes, or finding out how to add a swap partition.
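In case it helps, a common way to add a swap file on a Linux VM (the 2 GiB size is my choice, not from this thread - tune it to your memory spikes):

```shell
# Create and enable a 2 GiB swap file (requires root).
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Make it permanent across reboots:
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```

You can verify it with `swapon --show` or `free -h` afterwards.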

I have no idea how to do that - can you show me how you did it?
I had 50 GB of disk and increased it to 100 GB; I already have 8 GB of RAM.

I'll accept that as the solution. I did it just now and haven't had any problems since.
Thanks


Sounds good - you may try rebooting the instance. 8 GB should be enough memory for the relay to boot; 4 GB will definitely throw exit code 137. I tested it on Amazon and Google.


Sorry, I don't get it. My relay runs fine for about 23 hours and then just restarts… my BP doesn't have the same issue even though it follows the same guide and the same version of node and CLI; the relay keeps restarting.
It has swap set up:
              total        used        free      shared  buff/cache   available
Mem:          7.8Gi       5.5Gi       1.5Gi       0.0Ki       822Mi       2.1Gi
Swap:         2.0Gi       0.0Ki       2.0Gi

Did you get any log about the reason for the crash?
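If it was the kernel OOM killer again (status 137 is a SIGKILL), the kernel log usually records it - a quick sketch for checking:

```shell
# Look for OOM-killer entries in the kernel ring buffer and the journal.
sudo dmesg -T | grep -iE 'oom|killed process'
journalctl -k --since "1 day ago" | grep -i 'out of memory'
```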

Sorry, I don't…

How do you run the cardano-node - as a service?

It's a service; it crashed after 23 hours of running… my Grafana shows me the time when it crashed.
I checked gLiveView and its uptime had reset, so the service restarted.
[Screenshot: Grafana dashboard, 2021-05-06 17:29]