Odd errors in code log, missed a block last night

Missed a block late last night, I think that may have been because my relay’s disk was completely full of logs. I sorted that out, and will be adding a second relay soon to add some redundancy.

However, while I was trying to figure out what was going on, I noticed these errors in the PRODUCER’s logs:

-- Reboot --
Oct 24 01:06:50 staking01 systemd[1]: Started Cardano Node.
**Oct 24 01:06:56 staking01 cnode[996]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 1318 ) is not listening on this port. Please update the config or kill the conflicting process first.**
Oct 24 01:06:56 staking01 cnode[996]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
**Oct 24 01:06:58 staking01 cnode[1402]: mkdir: cannot create directory ‘/archive’: Permission denied**
Oct 24 01:07:02 staking01 cnode[1406]: Listening on http://127.0.0.1:12798
Oct 24 01:07:02 staking01 cnode[1406]: Shutting down..
**Oct 24 01:07:02 staking01 cnode[1406]: cardano-node: FatalError {fatalErrorMessage = "Cardano.Node.Run.handleSimpleNode.readTopologyFile: Cardano.Node.Configuration.Topology.readTopologyFile: : openBinaryFile: does not exist (No such file or directory)"}**
Oct 24 01:07:02 staking01 systemd[1]: cnode.service: Main process exited, code=exited, status=1/FAILURE
Oct 24 01:07:02 staking01 systemd[1]: cnode.service: Failed with result 'exit-code'.
Oct 24 01:07:07 staking01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 1.
Oct 24 01:07:13 staking01 systemd[1]: Stopped Cardano Node.
Oct 24 01:07:13 staking01 systemd[1]: Started Cardano Node.
Oct 24 01:07:15 staking01 cnode[2634]: Listening on http://127.0.0.1:12798

EDIT: tried to bold the errors, but apparently that doesn’t work in a code quote block

It seems that the producer is starting the node, it’s crashing immediately with errors complaining about the EKG port and mkdir permissions errors, and then it’s starting up with no error the second time. What is going on? The producer is still connecting to the relay, so it MUST be able to read topology.json, right? I’m assuming the /archive folder it’s trying to create is the one inside the logs directory, but it’s already there…

The relay seemed to be rebooting more often than it should be, but there weren’t any useful errors in code’s logs… just “shutting down” “stopped” “starting” and “listening” over and over. deleting the old logs seems to have cleared that up, but I’m going to try deleting the db folder on there just to be safe.

Also, is there any way to cap the size of the logs folder?

update: deleting the db folder did not fix the errors, so I restored it. I didn’t think it would, but that step seems to solve half the problems I read about on this forum so I figured it was worth a shot.

next time use journalctl -e -f -u cardano-node

1 Like

$ journalctl -e -f -u cardano-node
– Logs begin at Mon 2021-07-26 21:02:00 CDT. –

Are u using cntools?

then try journalctl -e -f -u cnode

yeah, cntools install.
those logs were the ones from cntools.
everything above it is just more of the same.
which Cardano-node points to the one in the .cabal folder, which looks to be the only one on the node other than what’s in the git folder.
the cncli and cardano-node versions both show as 1.30

Restart the node and show me the output for
journalctl -e -f -u cnode

Oct 24 03:28:53 staking01 systemd[1]: Started Cardano Node.
Oct 24 03:28:53 staking01 cnode[38153]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 38375 ) is not listening on this port. Please update the config or kill the conflicting process first.
Oct 24 03:28:53 staking01 cnode[38153]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Oct 24 03:28:53 staking01 cnode[38443]: mkdir: cannot create directory ‘/archive’: Permission denied
Oct 24 03:28:54 staking01 cnode[38447]: Listening on http://127.0.0.1:12798
Oct 24 03:28:55 staking01 cnode[38447]: Shutting down…
Oct 24 03:28:55 staking01 cnode[38447]: cardano-node: FatalError {fatalErrorMessage = “Cardano.Node.Run.handleSimpleNode.readTopologyFile: Cardano.Node.Configuration.Topology.readTopologyFile: : openBinaryFile: does not exist (No such file or directory)”}
Oct 24 03:28:55 staking01 systemd[1]: cnode.service: Main process exited, code=exited, status=1/FAILURE
Oct 24 03:28:55 staking01 systemd[1]: cnode.service: Failed with result ‘exit-code’.
Oct 24 03:29:00 staking01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 1.
Oct 24 03:29:00 staking01 systemd[1]: Stopped Cardano Node.
Oct 24 03:29:00 staking01 systemd[1]: Started Cardano Node.
Oct 24 03:29:02 staking01 cnode[39316]: Listening on http://127.0.0.1:12798
Oct 24 05:57:35 staking01 systemd[1]: Stopping Cardano Node…
Oct 24 05:57:35 staking01 systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Oct 24 05:57:35 staking01 cnode[39316]: Shutting down…
Oct 24 05:57:36 staking01 systemd[1]: cnode.service: Failed with result ‘signal’.
Oct 24 05:57:36 staking01 systemd[1]: Stopped Cardano Node.
Oct 24 05:57:36 staking01 systemd[1]: Started Cardano Node.
Oct 24 05:57:37 staking01 cnode[323906]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Oct 24 05:57:38 staking01 cnode[324370]: Listening on http://127.0.0.1:12798

Do u have the topology file inside files folder?

same place it’s always been, with -rw-rw-r–* permissions

Try to restart the all server and check again with journal…

Oct 24 05:57:38 staking01 cnode[324370]: Listening on http://127.0.0.1:12798
Oct 24 06:14:52 staking01 systemd[1]: Stopping Cardano Node…
Oct 24 06:14:52 staking01 systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Oct 24 06:14:52 staking01 cnode[324370]: Shutting down…
Oct 24 06:14:53 staking01 systemd[1]: cnode.service: Failed with result ‘signal’.
Oct 24 06:14:53 staking01 systemd[1]: Stopped Cardano Node.
– Reboot –
Oct 24 06:15:28 staking01 systemd[1]: Started Cardano Node.
Oct 24 06:15:36 staking01 cnode[985]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Oct 24 06:15:41 staking01 cnode[1653]: Listening on http://127.0.0.1:12798

looks normal this time… gonna try again

same message with the topology file? Check above… if u press top u should see the CPU ~100%

Here’s both reboots for clarity:

– Reboot –
Oct 24 06:15:28 staking01 systemd[1]: Started Cardano Node.
Oct 24 06:15:36 staking01 cnode[985]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Oct 24 06:15:41 staking01 cnode[1653]: Listening on http://127.0.0.1:12798
Oct 24 06:19:32 staking01 systemd[1]: Stopping Cardano Node…
Oct 24 06:19:32 staking01 systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Oct 24 06:19:32 staking01 cnode[1653]: Shutting down…
Oct 24 06:19:32 staking01 systemd[1]: cnode.service: Failed with result ‘signal’.
Oct 24 06:19:32 staking01 systemd[1]: Stopped Cardano Node.
– Reboot –
Oct 24 06:20:15 staking01 systemd[1]: Started Cardano Node.
Oct 24 06:20:21 staking01 cnode[996]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 1321 ) is not listening on this port. Please update the config or kill the conflicting process first.
Oct 24 06:20:21 staking01 cnode[996]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Oct 24 06:20:25 staking01 cnode[1409]: mkdir: cannot create directory ‘/archive’: Permission denied
Oct 24 06:20:27 staking01 cnode[1413]: Listening on http://127.0.0.1:12798
Oct 24 06:20:28 staking01 cnode[1413]: Shutting down…
Oct 24 06:20:28 staking01 cnode[1413]: cardano-node: FatalError {fatalErrorMessage = “Cardano.Node.Run.handleSimpleNode.readTopologyFile: Cardano.Node.Configuration.Topology.readTopologyFile: : openBinaryFile: does not exist (No such file or directory)”}
Oct 24 06:20:28 staking01 systemd[1]: cnode.service: Main process exited, code=exited, status=1/FAILURE
Oct 24 06:20:28 staking01 systemd[1]: cnode.service: Failed with result ‘exit-code’.
Oct 24 06:20:33 staking01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 1.
Oct 24 06:20:33 staking01 systemd[1]: Stopped Cardano Node.

Don’t really see Cardano-node doing much in htop… except using 1.0T of VIRT mem???
and what looks like 12 instances of it running? EDIT: guess it just runs 2x core count, since I have 6 cores on this VM.

let’s download the default files again

cd ~/tmp
./prereqs.sh - f

restart the node and check again with journalctl

(Do not edit yet the topology file or env)

1 Like

Roger. will update when it’s finished

Oct 24 06:37:10 staking01 systemd[1]: Started Cardano Node.
Oct 24 06:37:11 staking01 cnode[39203]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Oct 24 06:37:12 staking01 cnode[39652]: Listening on http://127.0.0.1:12798

looks normal, trying a couple more times to be sure…

last time didn’t look so pretty:

Oct 24 06:38:20 staking01 cnode[42034]: Listening on http://127.0.0.1:12798
Oct 24 06:38:33 staking01 systemd[1]: Stopping Cardano Node…
Oct 24 06:38:33 staking01 systemd[1]: cnode.service: Control process exited, code=killed, status=2/INT
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: State ‘stop-sigterm’ timed out. Killing.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 41782 (bash) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42034 (cardano-node) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42035 (ghc_ticker) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42036 (cardano-node:w) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42038 (cardano-node:w) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42039 (cardano-node:w) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42040 (n/a) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42159 (cardano-node:w) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Main process exited, code=killed, status=9/KILL
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Killing process 42159 (cardano-node:w) with signal SIGKILL.
Oct 24 06:38:38 staking01 systemd[1]: cnode.service: Failed with result ‘signal’.
Oct 24 06:38:38 staking01 systemd[1]: Stopped Cardano Node.
Oct 24 06:38:38 staking01 systemd[1]: Started Cardano Node.
Oct 24 06:38:40 staking01 cnode[42866]: Listening on http://127.0.0.1:12798

maybe I killed it too fast that time?

Something is not ok … ohh wait what are the specs of the server? How many RAM do u have?

16 GB, 8 GB swap. 120GB storage, 6 xeon cores.

EDIT: 32 GB RAM on this one. 16 on the relay.

Ha! I completely agree!