Relay restarts every few hours, with some log errors I don't understand, e.g. "IpSubscription:Error"

hi,

I notice that my relays (but not the BP) restart quite often. The logs show things like this:

    Oct 14 16:06:17 ubuntu cnode[1637733]:
        cardano.node.IpSubscription:Error:275693] [2021-10-14 14:06:17.25 UTC]
        IPs: 0.0.0.0:0 [... long list of IPs] Application Exception: 192.46.XXX.XX:6001 SubscriberError {seType = SubscriberWorkerCancelled, seMessage = "SubscriptionWorker exiting", seStack = []}
    Oct 14 16:06:17 ubuntu cnode[1637733]:
        cardano.node.IpSubscription:Error:280705] [... long list of IPs] Application Exception: 18.119.XX.XX:9201 SubscriberError {seType = SubscriberWorkerCancelled, seMessage = "SubscriptionWorker exiting", seStack = []}
    Oct 14 16:06:17 ubuntu cnode[1637733]:
        cardano.node.DiffusionInitializationTracer:Info:5] [2021-10-14 14:06:17.25 UTC] DiffusionErrored user interrupt
    Oct 14 16:06:18 ubuntu systemd[1]: Stopped Cardano Node.

Any ideas? Tnx!

Try journalctl -e -f -u cardano-node

Do you see any kill message?

Hi @hamish, I'd ask a few additional questions:

  • Was this pool running ok before?
  • What version are you running?
  • Can you confirm the topology file is correct (or maybe paste it here while masking the BP address)?
  • Can you confirm the required ports are accessible and not blocked?
  • Do you have the latest config files (Cardano Configurations)?

hi Alex,
no, there’s nothing about a kill signal…

hi!

  • the pool appears fine in all respects that I can figure, except that the relay systemd services restart every few hours
  • it is on 1.30.1 now
  • topology: I’m using the guild updater on the relays, which seems to work fine
  • config: I updated for 1.30.1

best, h

Please also check the syslog for an oom kill, or generally for errors.

sudo tail -n 1000 /var/log/syslog | grep -i kill | more

or just inspect the last, say, 300 lines:
sudo tail -n 300 /var/log/syslog
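
In case it's useful, the same check can be wrapped in a small function so it can be pointed at any log file. This is only a sketch: `find_oom_events` is a made-up helper name, the default path assumes an Ubuntu/Debian-style `/var/log/syslog`, and the patterns are the usual kernel OOM-killer phrases rather than an exhaustive list.

```shell
# find_oom_events is a hypothetical helper name, not part of any tool.
# It prints lines that look like kernel OOM-killer activity.
find_oom_events() {
    # Default path assumes Ubuntu/Debian-style syslog; pass another file if needed.
    grep -i -E 'out of memory|oom-kill|killed process' "${1:-/var/log/syslog}"
}

# Example: scan the current syslog if this user can read it.
if [ -r /var/log/syslog ]; then
    find_oom_events /var/log/syslog || echo "no OOM-kill lines found"
fi
```

If syslog is not readable by your user, run it from a root shell or pass a copy of the log as the first argument.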

Also check RAM consumption with: cat /proc/meminfo
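
`/proc/meminfo` is long, so here is a rough sketch that pulls out just the three numbers that matter for this question (total RAM, available RAM, and whether any swap exists). `mem_summary` is a hypothetical helper name; the field names are the standard ones from `/proc/meminfo`, whose values are reported in kB.

```shell
# mem_summary is a hypothetical helper name, not part of any tool.
# It reads a meminfo-style file (default: /proc/meminfo) and prints MiB values.
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { swap  = $2 }
        END {
            # Values in the file are in kB; divide by 1024 for MiB.
            printf "MemTotal=%dMiB MemAvailable=%dMiB SwapTotal=%dMiB\n",
                   total / 1024, avail / 1024, swap / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Example: summarise this machine if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

A `SwapTotal` of 0 means the machine has no swap configured at all.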

yes, no memory kills

I’ve got 16GB RAM and cnode usage doesn’t seem to run above 10G

still puzzled :slight_smile:

You are starting the node with host-addr 0.0.0.0, right?

yes, via cnode.sh

I can’t find anything related… are there no other messages if you run journalctl -e -f -u cnode?

There is a huge amount of log messages from the node (see examples above).

The last ones before the restart included “IpSubscription:Error”… does that indicate something that would kill the node?

This was fixed by adding a 1GB swap file :slight_smile:

I’m unsure why, as the machine has 16GB RAM and reports that there is plenty of memory to spare, but it seems that something somewhere expects swap to exist.
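
For anyone wanting to try the same thing, this is roughly how a 1GB swap file is usually set up on Ubuntu. It's a sketch, not the exact steps used here: the `/swapfile` path and the `swap_commands` helper name are assumptions. The function only prints the commands so they can be reviewed before running anything as root.

```shell
# Assumed location of the swap file; not taken from the thread.
SWAPFILE=/swapfile

# swap_commands is a hypothetical helper: it prints the commands, it does not run them.
swap_commands() {
    cat <<EOF
sudo fallocate -l 1G $SWAPFILE
sudo chmod 600 $SWAPFILE
sudo mkswap $SWAPFILE
sudo swapon $SWAPFILE
swapon --show
EOF
}

swap_commands
```

Paste the printed lines one at a time once they look right. On some filesystems `fallocate` does not produce a file that `swapon` accepts, in which case `dd` is the usual alternative. To keep the swap after a reboot, an `/etc/fstab` entry such as `/swapfile none swap sw 0 0` is also needed.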

(Thanks to Stefan of CO2 pool for the suggestion!)

Thanks all, this forum is extremely useful! Have a good one.

I have the same issue and can’t find anything related. I have checked my system, and there is plenty of free memory (values in MB):

                  total        used        free      shared  buff/cache   available
    Mem:          32075       13659        3639           0       14776       17998
    Swap:         20479          85       20394

The node logs show only this:

    {"app":[],"at":"2023-07-28T00:42:45.18Z","data":{"addBlock":"6207b6d12cae3611a69b398d4655d614d62f051b46f009d069a7bae2e908e1e3@96699369","blockingRead":false,"kind":"ChainSyncServerEvent.TraceChainSyncServerUpdate","tip":{"block":"52f6c601c047ef22f6c69dec7c3bcd2bda842818577e76abcd08853ae9224533","blockNo":{"unBlockNo":9084986},"slot":98938670}},"env":"1.35.5:8762a","host":"blockcha","loc":null,"msg":"","ns":["cardano.node.ChainSyncBlockServer"],"pid":"3219555","sev":"Info","thread":"758"}
    {"app":[],"at":"2023-07-28T00:42:46.54Z","data":{"domain":"\"cerp-relay2.cerp.dev\"","event":"Application Exception: 183.88.13.11:36208 SubscriberError {seType = SubscriberWorkerCancelled, seMessage = \"SubscriptionWorker exiting\", seStack = []}","kind":"SubscriptionTrace"},"env":"1.35.5:8762a","host":"blockcha","loc":null,"msg":"","ns":["cardano.node.DnsSubscription"],"pid":"3219555","sev":"Error","thread":"2000"}

Do you have any idea?

Hi,

Did you set up anything in crontab or cncli to restart the node automatically?
Also, which version of the node do you run?

Cheers,

@Alexd1985, I haven’t set up crontab or cncli to restart the node. My version is cardano-node 1.35.5. I found the same issue at [BUG] - SubscriptionWorker exiting · Issue #1714 · input-output-hk/cardano-node · GitHub, but I couldn’t work out a fix from it. Please help.

Are you on mainnet?
You should upgrade the nodes to 8.1.2.

Cheers,