1.33 BP Restarts Randomly

Not sure what's happening. My BP was running perfectly last epoch until the last day. I rebuilt the node and copied the db over, but I'm still having the same issue: the node won't start properly. I have not changed anything since the upgrade. RAM: 16 GB, disk space: 320 GB.

Jan 27 02:20:14 xxxxxx cnode[2201806]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.

Jan 27 02:20:17 xxxxxxx1 cnode[2202400]: Node configuration: NodeConfiguration {ncNodeIPv4Addr = Just 0.0.0.0, ncNodeIPv6Addr = Nothing, ncNodePortNumber = Just 6000, ncConfigFile = “/opt/cardano/cnode/files/config.json”, ncTopologyFile = “/opt/cardano/cnode/files/topology.json”, ncDatabaseFile = “/opt/cardano/cnode/db”, ncProtocolFiles = ProtocolFilepaths {byronCertFile = Nothing, byronKeyFile = Nothing, shelleyKESFile = Just “/opt/cardano/cnode/priv/pool/pool01/hot.skey”, shelleyVRFFile = Just “/opt/cardano/cnode/priv/pool/pool01/vrf.skey”, shelleyCertFile = Just “/opt/cardano/cnode/priv/pool/pool01/op.cert”, shelleyBulkCredsFile = Nothing}, ncValidateDB = False, ncShutdownIPC = Nothing, ncShutdownOnSlotSynced = NoMaxSlotNo, ncProtocolConfig = NodeProtocolConfigurationCardano (NodeByronProtocolConfiguration {npcByronGenesisFile = “/opt/cardano/cnode/files/byron-genesis.json”, npcByronGenesisFileHash = Nothing, npcByronReqNetworkMagic = RequiresNoMagic, npcByronPbftSignatureThresh = Nothing, npcByronApplicationName = ApplicationName {unApplicationName = “cardano-sl”}, npcByronApplicationVersion = 1, npcByronSupportedProtocolVersionMajor = 3, npcByronSupportedProtocolVersionMinor = 0, npcByronSupportedProtocolVersionAlt = 0}) (NodeShelleyProtocolConfiguration {npcShelleyGenesisFile = “/opt/cardano/cnode/files/genesis.json”, npcShelleyGenesisFileHash = Nothing}) (NodeAlonzoProtocolConfiguration {npcAlonzoGenesisFile = “/opt/cardano/cnode/files/alonzo-genesis.json”, npcAlonzoGenesisFileHash = Just “7e94a15f55d1e82d10f09203fa1d40f8eede58fd8066542cf6566008068ed874”}) (NodeHardForkProtocolConfiguration {npcTestEnableDevelopmentHardForkEras = False, npcTestShelleyHardForkAtEpoch = Nothing, npcTestShelleyHardForkAtVersion = Nothing, npcTestAllegraHardForkAtEpoch = Nothing, npcTestAllegraHardForkAtVersion = Nothing, npcTestMaryHardForkAtEpoch = Nothing, npcTestMaryHardForkAtVersion = Nothing, npcTestAlonzoHardForkAtEpoch = Nothing, npcTestAlonzoHardForkAtVersion = 
Nothing}), ncSocketPath = Just “/opt/cardano/cnode/sockets/node0.socket”, ncDiffusionMode = InitiatorAndResponderDiffusionMode, ncSnapshotInterval = DefaultSnapshotInterval, ncTestEnableDevelopmentNetworkProtocols = False, ncMaxConcurrencyBulkSync = Nothing, ncMaxConcurrencyDeadline = Just 2, ncLoggingSwitch = True, ncLogMetrics = True, ncTraceConfig = TracingOn (TraceSelection {traceVerbosity = NormalVerbosity, traceAcceptPolicy = OnOff {isOn = False}, traceBlockFetchClient = OnOff {isOn = True}, traceBlockFetchDecisions = OnOff {isOn = True}, traceBlockFetchProtocol = OnOff {isOn = True}, traceBlockFetchProtocolSerialised = OnOff {isOn = True}, traceBlockFetchServer = OnOff {isOn = True}, traceBlockchainTime = OnOff {isOn = False}, traceChainDB = OnOff {isOn = True}, traceChainSyncBlockServer = OnOff {isOn = True}, traceChainSyncClient = OnOff {isOn = True}, traceChainSyncHeaderServer = OnOff {isOn = True}, traceChainSyncProtocol = OnOff {isOn = True}, traceConnectionManager = OnOff {isOn = False}, traceConnectionManagerCounters = OnOff {isOn = True}, traceDebugPeerSelectionInitiatorTracer = OnOff {isOn = False}, traceDebugPeerSelectionInitiatorResponderTracer = OnOff {isOn = False}, traceDiffusionInitialization = OnOff {isOn = False}, traceDnsResolver = OnOff {isOn = False}, traceDnsSubscription = OnOff {isOn = True}, traceErrorPolicy = OnOff {isOn = True}, traceForge = OnOff {isOn = True}, traceForgeStateInfo = OnOff {isOn = True}, traceHandshake = OnOff {isOn = False}, traceInboundGovernor = OnOff {isOn = False}, traceInboundGovernorCounters = OnOff {isOn = True}, traceIpSubscription = OnOff {isOn = True}, traceKeepAliveClient = OnOff {isOn = False}, traceLedgerPeers = OnOff {isOn = False}, traceLocalChainSyncProtocol = OnOff {isOn = True}, traceLocalConnectionManager = OnOff {isOn = False}, traceLocalErrorPolicy = OnOff {isOn = True}, traceLocalHandshake = OnOff {isOn = False}, traceLocalInboundGovernor = OnOff {isOn = False}, traceLocalMux = OnOff {isOn = 
False}, traceLocalRootPeers = OnOff {isOn = False}, traceLocalServer = OnOff {isOn = False}, traceLocalStateQueryProtocol = OnOff {isOn = False}, tListening on http://127.0.0.1:12798

I caught this error in the journal as well. I have not modified env at all, but somehow the node looks like it is listening on 12798, the Prometheus port.

Jan 27 02:57:11 xxxxxx cnode[2257119]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 2257504 ) is not listening on this port. Please update the config or kill the conflicting process first.

Jan 27 02:57:11 xxxxxxxx cnode[2257119]: ERROR: Failed to load common env file

Jan 27 02:57:11 xxxxxxxx cnode[2257119]: Please verify set values in ‘User Variables’ section in env file or log an issue on GitHub
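To see which of the two ports mentioned in the log (12788 and 12798) actually has a listener, the output of `ss -tln` can be filtered. This is only a sketch: the helper name `port_is_listening` is made up for illustration, and it assumes the usual `ss -tln` column layout (state in the first column, local address:port in the fourth).

```shell
# Succeeds if the given TCP port shows up as a local listening port in
# `ss -tln` style output read from stdin.
# port_is_listening is a hypothetical helper, not part of any cnode script.
port_is_listening() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      # The port is whatever follows the last ":" of the local address.
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on the node host:
#   ss -tln | port_is_listening 12788 && echo "12788 is listening"
#   ss -tln | port_is_listening 12798 && echo "12798 is listening"
```

If only 12798 reports a listener, that matches the "Listening on http://127.0.0.1:12798" line in the journal.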

Type

journalctl -e -f -u cnode

Do you see any "killing" or "killed" messages?

Nothing, it's strange. The above is the full journalctl entry.

Ok, try to restart the server

In fact I did; it has been running for several hours, stuck in “starting”. Yesterday I recompiled cardano-node, deleted the db, and copied it from the relay. It appeared to work for about an hour, then restarted at random. syslog showed the process was killed but gave no reason, and there is nothing in journalctl except that, on restart, it reports the previous process was not shut down cleanly…
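When a process is killed with no reason given in the service log, the kernel log is one place to look, because an out-of-memory kill is normally recorded there. The sketch below filters kernel messages for the usual fragments; the helper name `oom_events` is made up for illustration, and reading the previous boot with `-b -1` assumes the journal is stored persistently.

```shell
# Keep only the kernel log lines that typically accompany an
# out-of-memory kill. Reads log text from stdin.
# oom_events is a hypothetical helper name.
oom_events() {
  grep -i -E 'out of memory|oom-kill|invoked oom-killer'
}

# Usage on the node host:
#   sudo journalctl -k -b -1 | oom_events   # kernel log of the previous boot
#   sudo dmesg -T | oom_events              # kernel log of the current boot
```

No output means none of these patterns were found in the log that was read, which does not by itself rule out a memory problem.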

Did you install 1.33.0 or 1.33.1? (1.33.0 is the latest official release.) Also run free -m so we can check the RAM.

cardano-node --version

cardano-node 1.33.0 - linux-x86_64 - ghc-8.10

git rev e9de7a2cf70796f6ff26eac9f9540184ded0e4e6

              total        used        free      shared  buff/cache   available
Mem:          16008         410        5763           1        9834       15298
Swap:             0           0           0
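The figures that matter in the free -m output are total memory, available memory, and total swap. A small sketch for pulling just those three out; the helper name `mem_summary` is made up for illustration, and it assumes the column order shown above, with "available" as the seventh field of the Mem: row.

```shell
# Summarise `free -m` output read from stdin: total and available memory
# plus total swap, all in MiB.
# mem_summary is a hypothetical helper name.
mem_summary() {
  awk '
    $1 == "Mem:"  { total = $2; avail = $7 }
    $1 == "Swap:" { swap = $2 }
    END {
      printf "mem_total_mb=%d mem_available_mb=%d swap_total_mb=%d\n",
             total, avail, swap
    }
  '
}

# Usage on the node host:
#   free -m | mem_summary
```

A swap_total_mb of 0 means the machine has nothing to fall back on once RAM is exhausted.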

just checked journal again and saw this

Jan 27 11:01:07 xxxxxxx cnode[2698199]: /opt/cardano/cnode/scripts/cnode.sh: line 138: 2698729 Killed “${CNODEBIN}” “${CPU_RUNTIME[@]}” run --topology “${TOPOLOGY}” --config “${CONFIG}” --database-path “${DB_DIR}” --socket-path “${CARDANO_NODE_SOCKET_PATH}” --shelley-kes-key “${POOL_DIR}/${POOL_HOTKEY_SK_FILENAME}” --shelley-vrf-key “${POOL_DIR}/${POOL_VRF_SK_FILENAME}” --shelley-operational-certificate “${POOL_DIR}/${POOL_OPCERT_FILENAME}” --port ${CNODE_PORT} “${host_addr[@]}”

OK, did you restart the server?
Also, did you modify the cnode.sh script (did you add any lines to it in the past)?

The server restarted itself; uptime is showing only 12 minutes.

Definitely no modifications to the cnode.sh script… I have been running the pool for almost a year, and this is the first time I have had a serious issue like this.

restart the machine

-- Reboot --

Jan 27 12:33:56 xxxxxx cnode[1320]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 1504 ) is not listening on this port. Please update the config or kill the conflicting process first.

Jan 27 12:34:00 xxxxxx cnode[1690]: Node configuration: NodeConfiguration {ncNodeIPv4Addr = Just 0.0.0.0, ncNodeIPv6Addr = Nothing, ncNodePortNumber = Just 6000, ncConfigFile = “/opt/cardano/cnode/files/config.json”, ncTopologyFile = “/opt/cardano/cnode/files/topology.json”, ncDatabaseFile = “/opt/cardano/cnode/db”, ncProtocolFiles = ProtocolFilepaths {byronCertFile = Nothing, byronKeyFile = Nothing, shelleyKESFile = Just “/opt/cardano/cnode/priv/pool/pool01/hot.skey”, shelleyVRFFile = Just “/opt/cardano/cnode/priv/pool/pool01/vrf.skey”, shelleyCertFile = Just “/opt/cardano/cnode/priv/pool/pool01/op.cert”, shelleyBulkCredsFile = Nothing}, ncValidateDB = False, ncShutdownIPC = Nothing, ncShutdownOnSlotSynced = NoMaxSlotNo, ncProtocolConfig = NodeProtocolConfigurationCardano (NodeByronProtocolConfiguration {npcByronGenesisFile = “/opt/cardano/cnode/files/byron-genesis.json”, npcByronGenesisFileHash = Nothing, npcByronReqNetworkMagic = RequiresNoMagic, npcByronPbftSignatureThresh = Nothing, npcByronApplicationName = ApplicationName {unApplicationName = “cardano-sl”}, npcByronApplicationVersion = 1, npcByronSupportedProtocolVersionMajor = 3, npcByronSupportedProtocolVersionMinor = 0, npcByronSupportedProtocolVersionAlt = 0}) (NodeShelleyProtocolConfiguration {npcShelleyGenesisFile = “/opt/cardano/cnode/files/genesis.json”, npcShelleyGenesisFileHash = Nothing}) (NodeAlonzoProtocolConfiguration {npcAlonzoGenesisFile = “/opt/cardano/cnode/files/alonzo-genesis.json”, npcAlonzoGenesisFileHash = Just “7e94a15f55d1e82d10f09203fa1d40f8eede58fd8066542cf6566008068ed874”}) (NodeHardForkProtocolConfiguration {npcTestEnableDevelopmentHardForkEras = False, npcTestShelleyHardForkAtEpoch = Nothing, npcTestShelleyHardForkAtVersion = Nothing, npcTestAllegraHardForkAtEpoch = Nothing, npcTestAllegraHardForkAtVersion = Nothing, npcTestMaryHardForkAtEpoch = Nothing, npcTestMaryHardForkAtVersion = Nothing, npcTestAlonzoHardForkAtEpoch = Nothing, npcTestAlonzoHardForkAtVersion = Nothing}), 
ncSocketPath = Just “/opt/cardano/cnode/sockets/node0.socket”, ncDiffusionMode = InitiatorAndResponderDiffusionMode, ncSnapshotInterval = DefaultSnapshotInterval, ncTestEnableDevelopmentNetworkProtocols = False, ncMaxConcurrencyBulkSync = Nothing, ncMaxConcurrencyDeadline = Just 2, ncLoggingSwitch = True, ncLogMetrics = True, ncTraceConfig = TracingOn (TraceSelection {traceVerbosity = NormalVerbosity, traceAcceptPolicy = OnOff {isOn = False}, traceBlockFetchClient = OnOff {isOn = True}, traceBlockFetchDecisions = OnOff {isOn = True}, traceBlockFetchProtocol = OnOff {isOn = True}, traceBlockFetchProtocolSerialised = OnOff {isOn = True}, traceBlockFetchServer = OnOff {isOn = True}, traceBlockchainTime = OnOff {isOn = False}, traceChainDB = OnOff {isOn = True}, traceChainSyncBlockServer = OnOff {isOn = True}, traceChainSyncClient = OnOff {isOn = True}, traceChainSyncHeaderServer = OnOff {isOn = True}, traceChainSyncProtocol = OnOff {isOn = True}, traceConnectionManager = OnOff {isOn = False}, traceConnectionManagerCounters = OnOff {isOn = True}, traceDebugPeerSelectionInitiatorTracer = OnOff {isOn = False}, traceDebugPeerSelectionInitiatorResponderTracer = OnOff {isOn = False}, traceDiffusionInitialization = OnOff {isOn = False}, traceDnsResolver = OnOff {isOn = False}, traceDnsSubscription = OnOff {isOn = True}, traceErrorPolicy = OnOff {isOn = True}, traceForge = OnOff {isOn = True}, traceForgeStateInfo = OnOff {isOn = True}, traceHandshake = OnOff {isOn = False}, traceInboundGovernor = OnOff {isOn = False}, traceInboundGovernorCounters = OnOff {isOn = True}, traceIpSubscription = OnOff {isOn = True}, traceKeepAliveClient = OnOff {isOn = False}, traceLedgerPeers = OnOff {isOn = False}, traceLocalChainSyncProtocol = OnOff {isOn = True}, traceLocalConnectionManager = OnOff {isOn = False}, traceLocalErrorPolicy = OnOff {isOn = True}, traceLocalHandshake = OnOff {isOn = False}, traceLocalInboundGovernor = OnOff {isOn = False}, traceLocalMux = OnOff {isOn = False}, 
traceLocalRootPeers = OnOff {isOn = False}, traceLocalServer = OnOff {isOn = False}, traceLocalStateQueryProtocol = OnOff {isOn = False}, tListening on http://127.0.0.1:12798

Something is off. I am not modifying env or anything, and it looks like the node is listening on the Prometheus port:

http://127.0.0.1:12798

Here is syslog output. I checked sockets dir and there are no files there.

Jan 27 12:42:07 xxxxx systemd[1]: cnode-logmonitor.service: Scheduled restart job, restart counter is at 23.

Jan 27 12:42:07 xxxxx systemd[1]: Stopped Cardano Node - Log Monitor.

Jan 27 12:42:07 xxxxxx systemd[1]: Started Cardano Node - Log Monitor.

Jan 27 12:42:08 xxxxxx cnode-cncli-sync[1324]: #033[31mLooks like cardano-node is running with socket-path as #033[94m/opt/cardano/cnode/sockets/node0.socket#033[31m, but the actual socket file does not exist.

Jan 27 12:42:08 xxxxx cnode-cncli-sync[1324]: This could occur if the node hasnt completed startup or if a second instance of node startup was attempted!

Jan 27 12:42:08 xxxxx cnode-cncli-sync[1324]: If this does not resolve automatically in a few minutes, you might want to restart your node and try again.#033[0m

Jan 27 12:42:08 xxxxx cnode-cncli-sync[1324]: sleeping for 10s and testing again...

Jan 27 12:42:09 xxxxx cnode-logmonitor[15990]: #033[31mLooks like cardano-node is running with socket-path as #033[94m/opt/cardano/cnode/sockets/node0.socket#033[31m, but the actual socket file does not exist.

Jan 27 12:42:09 xxxxxx cnode-logmonitor[15990]: This could occur if the node hasnt completed startup or if a second instance of node startup was attempted!

Jan 27 12:42:09 xxxxx cnode-logmonitor[15990]: If this does not resolve automatically in a few minutes, you might want to restart your node and try again.#033[0m

Jan 27 12:42:09 xxxxx systemd[1]: cnode-logmonitor.service: Main process exited, code=exited, status=1/FAILURE

Jan 27 12:42:09 xxxxx systemd[1]: cnode-logmonitor.service: Failed with result 'exit-code'.

That's why I told you to try rebooting the machine.

This is all after the reboot.

And now, if you check with sudo systemctl status cnode, what is the output?

cardano@xxxxx:/opt/cardano/cnode/scripts$ sudo systemctl status cnode
[sudo] password for cardano:
● cnode.service - Cardano Node
Loaded: loaded (/etc/systemd/system/cnode.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2022-01-27 12:33:55 UTC; 32min ago
Main PID: 1320 (bash)
Tasks: 28 (limit: 19169)
Memory: 14.6G
CGroup: /system.slice/cnode.service
├─1320 bash /opt/cardano/cnode/scripts/cnode.sh
└─1690 /home/cardano/.local/bin/cardano-node +RTS -N8 -RTS run --topology /opt/cardano/cnode/files/topology.json --config /opt/cardano/cnode/files/config.json --database-path /opt/cardano/cnode/db --socket-path /opt/cardano/cnode/sockets/node0.socket --sh>

Jan 27 12:33:55 xxxxx systemd[1]: Started Cardano Node.
Jan 27 12:33:56 xxxxx cnode[1320]: ERROR: You specified 12788 as your EKG port, but it looks like the cardano-node (PID: 1504 ) is not listening on this port. Please update the config or kill the conflicting process first.
Jan 27 12:34:00 xxxxx cnode[1690]: Node configuration: NodeConfiguration {ncNodeIPv4Addr = Just 0.0.0.0, ncNodeIPv6Addr = Nothing, ncNodePortNumber = Just 6000, ncConfigFile = “/opt/cardano/cnode/files/config.json”, ncTopologyFile = "/opt/cardano/cnode/files/topo>
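The Memory: figure in the status output above can also be read as a plain number, which is easier to watch over time. A minimal sketch, assuming memory accounting is enabled for the unit so that systemd reports a numeric MemoryCurrent value in bytes; the helper name `bytes_to_mib` is made up for illustration.

```shell
# Convert a byte count to whole MiB using integer arithmetic.
# bytes_to_mib is a hypothetical helper name.
bytes_to_mib() {
  echo $(( $1 / 1048576 ))
}

# Usage on the node host (prints the service's current memory use in MiB):
#   bytes_to_mib "$(systemctl show cnode --property=MemoryCurrent --value)"
```

If the property comes back as something other than a number, the arithmetic fails, so this only works where accounting is active.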

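I understand now. The cnode service is using 14.6G of memory on a 16 GB machine and there is no swap, so the node is most likely being killed when memory runs out.

So you should configure a swap file: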

sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

Now, to make it permanent, run
sudo nano /etc/fstab

and add this at the end, as a new line:
/swapfile swap swap defaults 0 0
Save the file (Ctrl + X, then Y, then ENTER).
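As an alternative to editing the file by hand, the same line can be appended from a script that first checks whether it is already there, so running it twice does not create a duplicate entry. This is a sketch only; the function name `add_swap_entry` is made up for illustration, and it needs a root shell when pointed at the real /etc/fstab.

```shell
# Append the swap entry to an fstab-style file unless a line for /swapfile
# is already present. Defaults to /etc/fstab.
# add_swap_entry is a hypothetical helper name.
add_swap_entry() {
  fstab="${1:-/etc/fstab}"
  entry='/swapfile swap swap defaults 0 0'
  if grep -q '^/swapfile[[:space:]]' "$fstab"; then
    echo "swap entry already present in $fstab"
  else
    printf '%s\n' "$entry" >> "$fstab"
    echo "swap entry added to $fstab"
  fi
}

# Usage (in a root shell):
#   add_swap_entry
# Or try it on a copy first:
#   cp /etc/fstab /tmp/fstab.test && add_swap_entry /tmp/fstab.test
```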

Check that the configuration was successful:

free -m

then try
sudo systemctl restart cnode