For some strange reason, some of my relays randomly crash. They are Google Cloud VPSes running 24/7 and had been running fine since 1.26.1. When 1.26.2 came out, I started again from scratch using Alex's CNTools tutorial, because it makes the upgrade easier.
Given my configuration, how can I see logs of what may have happened?
You can check the logs:
journalctl -e -f -u cnode.service
Apr 24 14:38:17 cardano-relay-2-migrate systemd[1]: cnode.service: Main process exited, code=killed, status=2/INT
Apr 24 14:38:17 cardano-relay-2-migrate systemd[1]: cnode.service: Succeeded.
Apr 24 14:38:17 cardano-relay-2-migrate systemd[1]: Stopped Cardano Node.
Apr 24 14:38:23 cardano-relay-2-migrate systemd[1]: Started Cardano Node.
Apr 24 14:38:24 cardano-relay-2-migrate cnode[6690]: Failed to query protocol-parameters from node, not yet fully started?
Apr 24 14:38:24 cardano-relay-2-migrate cnode[6690]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Apr 24 14:38:25 cardano-relay-2-migrate cnode[6690]: Listening on http://0.0.0.0:12798
Apr 24 14:43:13 cardano-relay-2-migrate systemd[1]: cnode.service: Current command vanished from the unit file, execution of the command list won't be resumed.
Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: Stopping Cardano Node…
Apr 24 14:46:49 cardano-relay-2-migrate cnode[6690]: Shutting down…
Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: cnode.service: Main process exited, code=killed, status=2/INT
Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: cnode.service: Succeeded.
Apr 24 14:46:49 cardano-relay-2-migrate systemd[1]: Stopped Cardano Node.
Apr 24 14:46:55 cardano-relay-2-migrate systemd[1]: Started Cardano Node.
Apr 24 14:46:55 cardano-relay-2-migrate cnode[9124]: Failed to query protocol-parameters from node, not yet fully started?
Apr 24 14:46:55 cardano-relay-2-migrate cnode[9124]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Apr 24 14:46:56 cardano-relay-2-migrate cnode[9124]: Listening on http://0.0.0.0:12798
Apr 24 15:19:22 cardano-relay-2-migrate cnode[9124]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 9197 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --port ${CNODE_PORT} "${host_addr[@]}"
Apr 24 15:19:22 cardano-relay-2-migrate systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
Apr 24 15:19:22 cardano-relay-2-migrate systemd[1]: cnode.service: Failed with result 'exit-code'.
Apr 24 15:19:28 cardano-relay-2-migrate systemd[1]: cnode.service: Service RestartSec=5s expired, scheduling restart.
Apr 24 15:19:28 cardano-relay-2-migrate systemd[1]: cnode.service: Scheduled restart job, restart counter is at 1.
Apr 24 15:19:28 cardano-relay-2-migrate systemd[1]: Stopped Cardano Node.
Apr 24 15:19:36 cardano-relay-2-migrate systemd[1]: Started Cardano Node.
Type top and check the CPU and memory usage.
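If top scrolls by too fast, free gives a one-shot snapshot of memory and swap (a minimal sketch; -h just prints human-readable sizes):

free -h   # one line each for RAM and swap: total, used, free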
I don't have problems with memory now; by the way, both relays have 8 GB of RAM. I can't understand what's going on. Besides the Cardano node, I only have Grafana and Prometheus installed. Right now it has been running for 4 hours without crashing, but the crash happens after it runs for more than 12 hours. Is there any way to reduce the number of peers? I have more than 15 peers from the topology updater; is that really needed?
If both nodes have the same issue, try setting TraceMempool to false in the configuration file on one of them (and keep it under monitoring).
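For reference, under the default guild-operators layout the node config is a JSON file, so flipping the flag and restarting could look roughly like this (the /opt/cardano/cnode/files path and the cnode.service name are assumptions based on the logs above; adjust to your setup):

jq '.TraceMempool = false' /opt/cardano/cnode/files/config.json > /tmp/config.json \
  && sudo mv /tmp/config.json /opt/cardano/cnode/files/config.json   # turn mempool tracing off
sudo systemctl restart cnode.service   # restart the node so the change takes effect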
Good, I'll try that on one of them and see what happens. Thank you!
Hi!
Any update on this?
I had a similar issue with exit code 137 after upgrading the relay to 1.26.2. I had to upgrade to an 8 GB instance with 100 GB of storage.
Exit code 137 can indicate an out-of-memory kill, yes.
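To confirm that it really was the OOM killer, check the kernel log (a sketch; the exact message wording varies by kernel version):

journalctl -k | grep -iE 'out of memory|oom|killed process'   # kernel-side evidence of an OOM kill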
I see you are using a Google Cloud machine, which by default comes without a swap partition! Having a swap partition prevents the node from crashing when it runs out of RAM (I tested this). I suggest you either add a bit more RAM to cope with the spikes, or find out how to add a swap partition.
I have no idea how to do that. Can you point me to how you did it?
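For anyone else who needs it, adding a swap file on a stock Linux image usually looks roughly like this (a sketch; the 4G size and the /swapfile path are assumptions, pick what suits your instance):

sudo fallocate -l 4G /swapfile   # reserve space for the swap file
sudo chmod 600 /swapfile   # swap must not be readable by other users
sudo mkswap /swapfile   # format it as swap space
sudo swapon /swapfile   # enable it immediately
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab   # persist it across reboots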
I had 50 GB of disk and increased it to 100 GB; I already have 8 GB of RAM.
I'll accept that as the solution. I did it just now and haven't had any problems since.
Thanks!
Sounds good. You may also try rebooting the instance. 8 GB should be enough memory to allow the relay to boot; 4 GB will definitely throw exit code 137. I tested this on both Amazon and Google.
Sorry, I don't get it. My relay runs fine for about 23 hours and then just restarts… My BP doesn't have the same issue, even though I followed the same guide and run the same node/CLI versions; the relay keeps restarting.
It has swap set up:
total used free shared buff/cache available
Mem: 7.8Gi 5.5Gi 1.5Gi 0.0Ki 822Mi 2.1Gi
Swap: 2.0Gi 0.0Ki 2.0Gi
Did you get any log showing the reason for the crash?
Sorry, I don't…
How do you run cardano-node, as a service? Or some other way?
It's a service; it crashed after 23 hours of running… My Grafana dashboard shows me the time when it crashed.
I checked gLiveView and the running time had reset, so the service restarted.
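If the service restarted but journalctl shows nothing useful, systemd keeps a restart counter, and the kernel log around the Grafana timestamp may still reveal an OOM kill (a sketch reusing the cnode.service name from the logs above):

systemctl show cnode.service --property=NRestarts   # how many times systemd has restarted the unit
journalctl -k --since "1 day ago" | grep -iE 'out of memory|oom'   # kernel OOM messages from the last day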