Hello guys, I upgraded to version 1.27 and everything went fine. The BP started normally and worked for 3 days. Last night, starting at about 3 AM PST, I noticed the cnode service restarting every 10 minutes or so. Live view shows “Starting…” for about 5 minutes, then the node comes up, processes 30-50 transactions, and goes back to “Starting…” again. Please advise where to start looking for why the service keeps restarting itself.
Upgrade the nodes
BP and relays are all running 1.27
I meant upgrade the hardware for the nodes
What is the actual hardware configuration?
This is a VM in Azure with 4 vCPUs and 8 GB RAM; everything was running fine for 3 months.
It ran, but starting with 1.27.0 you will need more resources. Perhaps the next version will consume less, but until then you will need the upgrade, or you can go to the configuration file and set TraceMempool to false.
You will not see processed transactions in gLiveView, but at least the server should not restart anymore.
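For anyone who prefers to script the change suggested above, here is a minimal sketch. It assumes the node configuration is a JSON file with a top-level `TraceMempool` key; the function names and the example path are hypothetical and need to be adapted to your own setup.

```python
import json
from pathlib import Path


def disable_trace_mempool(config: dict) -> dict:
    """Return a copy of the node config with mempool tracing switched off."""
    updated = dict(config)
    updated["TraceMempool"] = False
    return updated


def update_config_file(path: Path) -> None:
    """Rewrite the JSON config at `path` with TraceMempool set to false."""
    original = json.loads(path.read_text())
    # Keep a backup next to the original before overwriting it.
    path.with_name(path.name + ".bak").write_text(json.dumps(original, indent=2))
    path.write_text(json.dumps(disable_trace_mempool(original), indent=2))


# Hypothetical usage; the path below is an assumption, adjust it to your setup:
# update_config_file(Path("/opt/cardano/cnode/files/config.json"))
```

The node only reads its configuration at startup, so the service has to be restarted afterwards for the change to take effect.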
Do you have the specs?
I am looking here: Releases · input-output-hk/cardano-node · GitHub
It looks like they are still the same. How do you know it needs more?
- An Intel or AMD x86 processor with two or more cores, at 1.6GHz or faster (2GHz or faster for a stake pool or relay)
- 8GB of RAM
- 10GB of free storage (20GB for a stake pool)
@Alexd1985 Disabled it on the BP and restarted the service; waiting for it to come back up.
@Alexd1985 TraceMempool is disabled. CPU is at 17%, RAM at 37%. The service keeps restarting.
journalctl -e -f -u cardano-node
Which file are the logs going to?
-- Logs begin at Wed 2021-02-24 04:33:09 UTC. --
May 24 09:45:54 SPCORENODE01 cnode[1067]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 1841 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --shelley-kes-key "${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}" --shelley-vrf-key "${POOL_DIR}/${POOL_VRF_SK_FILENAME}" --shelley-operational-certificate "${POOL_DIR}/${POOL_OPCERT_FILENAME}" --port ${CNODE_PORT} "${host_addr[@]}"
May 24 09:45:53 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
May 24 09:45:53 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'exit-code'.
May 24 09:45:59 SPCORENODE01 systemd[1]: cnode.service: Service hold-off time over, scheduling restart.
May 24 09:45:59 SPCORENODE01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 1.
May 24 09:45:59 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 09:45:59 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 09:45:59 SPCORENODE01 cnode[13596]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
May 24 09:46:01 SPCORENODE01 cnode[13596]: Listening on http://0.0.0.0:12798
May 24 09:55:27 SPCORENODE01 cnode[13596]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 14144 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --shelley-kes-key "${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}" --shelley-vrf-key "${POOL_DIR}/${POOL_VRF_SK_FILENAME}" --shelley-operational-certificate "${POOL_DIR}/${POOL_OPCERT_FILENAME}" --port ${CNODE_PORT} "${host_addr[@]}"
May 24 09:55:27 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
May 24 09:55:27 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'exit-code'.
May 24 09:55:32 SPCORENODE01 systemd[1]: cnode.service: Service hold-off time over, scheduling restart.
May 24 09:55:32 SPCORENODE01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 2.
May 24 09:55:32 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 09:55:32 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 09:55:33 SPCORENODE01 cnode[1974]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
May 24 09:55:35 SPCORENODE01 cnode[1974]: Listening on http://0.0.0.0:12798
May 24 10:05:05 SPCORENODE01 cnode[1974]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 2459 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --shelley-kes-key "${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}" --shelley-vrf-key "${POOL_DIR}/${POOL_VRF_SK_FILENAME}" --shelley-operational-certificate "${POOL_DIR}/${POOL_OPCERT_FILENAME}" --port ${CNODE_PORT} "${host_addr[@]}"
May 24 10:05:05 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
May 24 10:05:05 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'exit-code'.
May 24 10:05:11 SPCORENODE01 systemd[1]: cnode.service: Service hold-off time over, scheduling restart.
May 24 10:05:11 SPCORENODE01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 3.
May 24 10:05:11 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 10:05:11 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 10:05:11 SPCORENODE01 cnode[22898]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
May 24 10:05:13 SPCORENODE01 cnode[22898]: Listening on http://0.0.0.0:12798
May 24 10:14:42 SPCORENODE01 cnode[22898]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 23478 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --shelley-kes-key "${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}" --shelley-vrf-key "${POOL_DIR}/${POOL_VRF_SK_FILENAME}" --shelley-operational-certificate "${POOL_DIR}/${POOL_OPCERT_FILENAME}" --port ${CNODE_PORT} "${host_addr[@]}"
May 24 10:14:42 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
May 24 10:14:42 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'exit-code'.
May 24 10:14:47 SPCORENODE01 systemd[1]: cnode.service: Service hold-off time over, scheduling restart.
May 24 10:14:47 SPCORENODE01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 4.
May 24 10:14:47 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 10:14:47 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 10:14:48 SPCORENODE01 cnode[11440]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
May 24 10:14:50 SPCORENODE01 cnode[11440]: Listening on http://0.0.0.0:12798
May 24 10:24:08 SPCORENODE01 cnode[11440]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 11969 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --shelley-kes-key "${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}" --shelley-vrf-key "${POOL_DIR}/${POOL_VRF_SK_FILENAME}" --shelley-operational-certificate "${POOL_DIR}/${POOL_OPCERT_FILENAME}" --port ${CNODE_PORT} "${host_addr[@]}"
May 24 10:24:08 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
May 24 10:24:08 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'exit-code'.
May 24 10:24:13 SPCORENODE01 systemd[1]: cnode.service: Service hold-off time over, scheduling restart.
May 24 10:24:13 SPCORENODE01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 5.
May 24 10:24:13 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 10:24:13 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 15:18:46 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 15:18:47 SPCORENODE01 cnode[32392]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
May 24 15:18:48 SPCORENODE01 cnode[32392]: Listening on http://0.0.0.0:12798
May 24 15:28:20 SPCORENODE01 cnode[32392]: /opt/cardano/cnode/scripts/cnode.sh: line 57: 459 Killed cardano-node "${CPU_RUNTIME[@]}" run --topology "${TOPOLOGY}" --config "${CONFIG}" --database-path "${DB_DIR}" --socket-path "${CARDANO_NODE_SOCKET_PATH}" --shelley-kes-key "${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}" --shelley-vrf-key "${POOL_DIR}/${POOL_VRF_SK_FILENAME}" --shelley-operational-certificate "${POOL_DIR}/${POOL_OPCERT_FILENAME}" --port ${CNODE_PORT} "${host_addr[@]}"
May 24 15:28:20 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=exited, status=137/n/a
May 24 15:28:20 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'exit-code'.
May 24 15:28:25 SPCORENODE01 systemd[1]: cnode.service: Service hold-off time over, scheduling restart.
May 24 15:28:25 SPCORENODE01 systemd[1]: cnode.service: Scheduled restart job, restart counter is at 8.
May 24 15:28:25 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 15:28:25 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 15:28:26 SPCORENODE01 cnode[29347]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
May 24 15:28:28 SPCORENODE01 cnode[29347]: Listening on http://0.0.0.0:12798
May 24 15:30:54 SPCORENODE01 systemd[1]: Stopping Cardano Node...
May 24 15:30:59 SPCORENODE01 systemd[1]: cnode.service: State 'stop-sigterm' timed out. Killing.
May 24 15:30:59 SPCORENODE01 systemd[1]: cnode.service: Killing process 29347 (cnode.sh) with signal SIGKILL.
May 24 15:30:59 SPCORENODE01 systemd[1]: cnode.service: Killing process 29966 (cardano-node) with signal SIGKILL.
May 24 15:30:59 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=killed, status=9/KILL
May 24 15:30:59 SPCORENODE01 systemd[1]: cnode.service: Killing process 29966 (cardano-node) with signal SIGKILL.
May 24 15:30:59 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'timeout'.
May 24 15:30:59 SPCORENODE01 systemd[1]: Stopped Cardano Node.
May 24 15:30:59 SPCORENODE01 systemd[1]: Started Cardano Node.
May 24 15:31:01 SPCORENODE01 cnode[3283]: Listening on http://0.0.0.0:12798
May 24 15:33:07 SPCORENODE01 systemd[1]: Stopping Cardano Node...
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: State 'stop-sigterm' timed out. Killing.
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Killing process 3283 (cnode.sh) with signal SIGKILL.
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Killing process 3749 (cardano-node) with signal SIGKILL.
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Main process exited, code=killed, status=9/KILL
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Killing process 3749 (cardano-node) with signal SIGKILL.
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Failed with result 'timeout'.
May 24 15:33:12 SPCORENODE01 systemd[1]: Stopped Cardano Node.
-- Reboot --
Something is killing the node
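A note on reading `status=137` in the log above: a shell such as the one running cnode.sh reports 128 + N when the command it launched was terminated by signal N, which matches the "Killed" message from line 57 of the script. The small sketch below does that arithmetic; the function name is made up for illustration, and it assumes a Linux host.

```python
import signal
from typing import Optional


def signal_from_exit_status(status: int) -> Optional[str]:
    """Map a shell-style exit status (128 + N) to the name of signal N."""
    if status <= 128:
        return None  # ordinary exit code, not a signal death
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(signal_from_exit_status(137))  # 137 - 128 = 9
```

On Linux, signal 9 is SIGKILL. When nobody sent it by hand, one common sender is the kernel's out-of-memory killer; that is only a guess for this case, and the kernel log (for example `journalctl -k`) would be the place to confirm or rule it out.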
May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Killing process 3749
How do I know which process that is?
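In those systemd lines, the number is the PID of the process being killed and the name in parentheses is that process, so 3749 is cardano-node itself. The sender in this part of the log is systemd, right after "Stopping Cardano Node..." and the 'stop-sigterm' timeout. A rough sketch for pulling these fields out of the journal text follows; the regular expression and function name are my own, and the sample lines are copied from the log above with the line-wrap spaces removed.

```python
import re
from typing import List, Tuple

# Matches systemd messages of the form:
#   "Killing process <pid> (<name>) with signal <SIGNAL>."
KILL_LINE = re.compile(
    r"Killing process (?P<pid>\d+) \((?P<name>[^)]+)\) with signal (?P<sig>\w+)"
)


def killed_processes(lines: List[str]) -> List[Tuple[int, str, str]]:
    """Return (pid, process name, signal) for each 'Killing process' line."""
    found = []
    for line in lines:
        match = KILL_LINE.search(line)
        if match:
            found.append((int(match["pid"]), match["name"], match["sig"]))
    return found


sample = [
    "May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Killing process 3283 (cnode.sh) with signal SIGKILL.",
    "May 24 15:33:12 SPCORENODE01 systemd[1]: cnode.service: Killing process 3749 (cardano-node) with signal SIGKILL.",
]
print(killed_processes(sample))
```

These systemd-initiated kills at 15:30 and 15:33 are a different event from the earlier `status=137` exits, where the log does not say who sent the signal.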
Try restarting the server. I don’t know why mem + swap shows as full on Cacti.
I have rebooted many times; it does not help.
Swap shows N/A because I do not have a swap file, so that is fine.
One of the disks is full, but that is the temp disk; that is just the way Azure works.
My production drive is at 92%.
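If it helps to keep an eye on that drive, here is a small sketch that reports how full a filesystem is. The helper names are hypothetical, and "/" is only a placeholder for whichever mount holds the node database. The figure can differ slightly from what `df` prints, since `df` accounts for reserved blocks.

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Percentage of a filesystem that is in use, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * used / total, 1)


def report(mount_point: str) -> str:
    """Describe usage of the filesystem that holds `mount_point`."""
    usage = shutil.disk_usage(mount_point)  # (total, used, free) in bytes
    free_gib = usage.free / 2**30
    return (
        f"{mount_point}: {percent_used(usage.total, usage.used)}% used, "
        f"{free_gib:.1f} GiB free"
    )


# "/" is a placeholder; pass the mount that holds the node database instead.
print(report("/"))
```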
Do you have the possibility to upgrade one node as a test?