Node is restarting almost every hour by itself!

Hi there, I’ve updated my node with software_upgrade.sh. My node is syncing, but very slowly. Also, I’ve noticed that my cnode is stopping and starting about every hour. What is happening here?

My Specs:

Node version: 1.35.3

*cabal --version*
cabal-install version 3.6.2.0
compiled using version 3.6.2.0 of the Cabal library

*ghc --version*
The Glorious Glasgow Haskell Compilation System, version 8.10.7

*free -m*
               total        used        free      shared  buff/cache   available
Mem:           15977        2188       11147           2        2642       13484
Swap:          12006           0       12006

Screenshot 2022-09-06 at 08.53.05

You’re probably running out of memory so the system kills the node automatically.

Can you paste the result of this command?

sudo tail -f /var/log/syslog | grep "Out of memory"

Your command produces no output. I also ran:

sudo cat /var/log/syslog | grep "Out of memory"

No output either.
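
Editor's note: `tail -f` starts from the last few lines of the file and then waits for new ones, so it cannot find events that are already in the log. A minimal sketch for searching an existing log for typical kernel OOM-killer messages follows. The function name `find_oom_events` and the search pattern are assumptions for illustration, not part of any Cardano tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search an existing log file for OOM-killer traces.
# Usage: find_oom_events [logfile]   (defaults to /var/log/syslog)
find_oom_events() {
  local logfile="${1:-/var/log/syslog}"
  # -i: case-insensitive, -E: extended regex.
  # The pattern covers common kernel messages such as
  # "Out of memory: Killed process ..." and "... invoked oom-killer".
  grep -iE 'out of memory|oom-killer|oom_reaper' "$logfile"
}
```

The function exits with a non-zero status when nothing matches, which here would suggest the restarts have another cause. Reading /var/log/syslog may require root.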

But I see entries like the ones below in the log files. Can you spot anything?

Sep  4 06:13:23 aha-momo systemd[1]: Stopped Cardano Node Submit API.
Sep  4 06:13:23 aha-momo systemd[1]: Started Cardano Node Submit API.
Sep  4 06:13:24 aha-momo cnode-submit-api[1962802]: #033[31mLooks like cardano-node is running with socket-path as #033[94m/opt/cardano/cnode/sockets/node0.socket#033[31m, but the actual socket file does not exist.
Sep  4 06:13:24 aha-momo cnode-submit-api[1962802]: This could occur if the node hasnt completed startup or if a second instance of node startup was attempted!
Sep  4 06:13:24 aha-momo cnode-submit-api[1962802]: If this does not resolve automatically in a few minutes, you might want to restart your node and try again.#033[0m
Sep  4 06:13:24 aha-momo cnode-submit-api[1962911]: TERM environment variable not set.
Sep  4 06:13:24 aha-momo cnode-submit-api[1962802]: ERROR: Could not locate socket file at /opt/cardano/cnode/sockets/node0.socket, the node may not have completed startup !!
Sep  4 06:13:24 aha-momo systemd[1]: cnode-submit-api.service: Main process exited, code=exited, status=1/FAILURE
Sep  4 06:13:24 aha-momo systemd[1]: cnode-submit-api.service: Failed with result 'exit-code'.
Sep  4 06:13:28 aha-momo systemd[1]: cnode-tu-blockperf.service: Scheduled restart job, restart counter is at 3598.
Sep  4 06:13:28 aha-momo systemd[1]: Stopped Cardano Node - Block Performance.

Your log entries mean the node is either not running or has just started and hasn’t entered syncing mode yet. May I know the specs of your server?

My node is syncing, as the screenshots show: 24.0% this morning and 28.3% now. But the sync is very slow.

I’ve put my server specs above. What else do you want to know?
Screenshot 2022-09-06 at 16.00.29

Mine took 2 days to get to 91%. :slight_smile:

Try setting up the swap space.

Thanks for your encouragement.

Do you have any idea what the message below means?

Sep 06 16:52:09 aha-momo systemd[1]: /etc/systemd/system/cnode.service:17: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.

You can speed up the sync by downloading the db from here; it should take only 2-3 hours. But the question is why it is re-syncing after the software upgrade… did you delete the db?

You should have enough RAM. Is this the relay or the BP? Do you run cncli on this node, or any other services?

Hi @Alexd1985, yes, I tried to download the db from csnapshots.io a few times on Sunday, but every time it failed after about an hour with an error like "stream not closed properly" (something like that).

No, I did not delete the db after the software update.

This is my relay node; it has 16 GB RAM. My BP node is also updated and is at about the same progress, but it doesn’t have this issue.

There is this error message now:

COULD NOT CONNECT TO A RUNNING INSTANCE, 3 FAILED ATTEMPTS IN A ROW!

But why did it restart… can you check with

journalctl -e -f -u cnode
journalctl -e -f -u cnode | grep kill

Question… when you first set up the nodes, did you use coincashew or cntools?

If you updated the nodes with my script (which is for a cntools setup), then you probably have 2 nodes now…

Try

sudo systemctl status cnode
sudo systemctl status cardano-cnode

I am using cntools.

Output from sudo systemctl status cnode:

● cnode.service - Cardano Node
     Loaded: loaded (/etc/systemd/system/cnode.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2022-09-06 21:07:52 CEST; 13min ago
   Main PID: 901941 (bash)
      Tasks: 17 (limit: 19090)
     Memory: 2.6G
        CPU: 19min 14.948s
     CGroup: /system.slice/cnode.service
             ├─901941 bash /opt/cardano/cnode/scripts/cnode.sh
             └─902051 /home/carda/.cabal/bin/cardano-node run --topology /opt/cardano/cnode/files/topology.json --con>

Sep 06 21:07:52 aha-momo.ch systemd[1]: Started Cardano Node.
Sep 06 21:07:52 aha-momo.ch cnode[901941]: INFO: Cleaned-up stale socket file
Sep 06 21:07:53 aha-momo.ch cnode[902051]: Listening on http://127.0.0.1:12798

Output from sudo systemctl status cardano-cnode:

Unit cardano-cnode.service could not be found.

Output from journalctl -e -f -u cnode | grep kill:

Sep 05 09:16:51 aha-momo.ch cnode[3277424]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3277508 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 10:16:38 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 11:16:39 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 12:16:40 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 12:16:54 aha-momo.ch cnode[3322176]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3322261 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 13:16:40 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 13:31:28 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 13:32:25 aha-momo.ch cnode[3345433]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3345517 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 14:16:56 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 15:16:57 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 15:17:06 aha-momo.ch cnode[3552598]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3552683 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 16:16:58 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 16:17:10 aha-momo.ch cnode[3568817]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3568902 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 17:17:10 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 17:17:20 aha-momo.ch cnode[3583531]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3583614 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 18:17:11 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 18:17:28 aha-momo.ch cnode[3599612]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 3599697 ) is not listening on this port. Please update the config or kill the conflicting process first.
Sep 05 19:17:11 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 20:17:11 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 21:17:12 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Sep 05 21:35:02 aha-momo.ch systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
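
Editor's note: the `code=killed, status=2/INT` entries above recur roughly one hour apart, which points to a scheduled restart rather than a crash. A minimal sketch for making that interval explicit follows. The function name `restart_gaps` is hypothetical, and the script assumes the default journalctl short timestamp format (`Sep 05 10:16:38 ...`) with all entries in the same month.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read journal lines on stdin and print the gap in
# minutes between consecutive "code=killed, status=2/INT" entries.
# Assumes timestamps like "Sep 05 10:16:38" and a single calendar month.
restart_gaps() {
  awk '/code=killed, status=2\/INT/ {
    split($3, t, ":")                                   # HH:MM:SS
    now = $2 * 86400 + t[1] * 3600 + t[2] * 60 + t[3]   # seconds since start of month
    if (seen) printf "%s %s %s  +%d min\n", $1, $2, $3, (now - prev) / 60
    prev = now
    seen = 1
  }'
}
```

One possible use is `journalctl -u cnode --no-pager | restart_gaps`. Gaps that cluster around 60 minutes would be consistent with an hourly timer.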

Relay:
Relay

BP node:
BPNode

hmm… try this

./deploy-as-systemd.sh and choose N for topology updater

Then keep monitoring it.

I ran ./deploy-as-systemd.sh this morning as you suggested, and since then the restart issue seems to have disappeared. My node has now been running for more than 5 hours.
Screenshot 2022-09-07 at 12.34.11

That means perhaps you had set the topology updater to restart the node hourly?

Activate it again, but keep the default timer of 86400 seconds (24 hours), and since this is a relay, press yes only for the topology updater.
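
Editor's note: a minimal sketch for sanity-checking the timer value and seeing which timers systemd has scheduled. The helper name `secs_to_hours` and the `cnode` name filter are assumptions based on the unit names that appear in this thread; the actual timer unit names on a given install may differ.

```shell
#!/usr/bin/env bash
# Convert a timer value in seconds to whole hours.
secs_to_hours() {
  echo $(( $1 / 3600 ))
}

secs_to_hours 86400   # the default mentioned above: one restart per day
secs_to_hours 3600    # the value an hourly restart would correspond to

# List systemd timers whose name contains "cnode" (assumed filter);
# skipped when systemctl is not available.
if command -v systemctl >/dev/null 2>&1; then
  systemctl list-timers --all --no-pager 2>/dev/null | grep -i cnode || true
fi
```

If a cnode-related timer shows a next run less than an hour away and a last run about an hour ago, that would match the restart pattern reported earlier.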

Cheers,

Thank you @Alexd1985, I will do it again and press Yes for the topology updater.
