Producer & Relay fail to start subscriptions/Connects during sync [IpSubscription:Info:56]

I started up my nodes for the first time and am waiting for the sync to complete. Syncing itself looks fine so far, but communication between the nodes is pretty much one-way. I am attempting to deploy to mainnet.

I have been getting this error where my producer node fails to start all required subscriptions.

Current topology is 1 producer and 1 relay. I am working on building a third relay as we speak.

What I find interesting is that at the beginning everything seems to work correctly. All peers show as connected and data is flowing. But once the producer node reaches around 14% synced, this issue starts.

Sync status at the time of writing:
Relay Node: 95.5%
Producer Node: 14.6% (rebuilt the producer as one of my attempts to repair)

Node vHardware:

8GB RAM | 4 vCPU | 160GB Storage

cabal-install version
compiled using version of the Cabal library

GHCUP version 8.10.4

cardano-node 1.27.0 - linux-x86_64 - ghc-8.10
git rev 8fe46140a52810b6ca456be01d652ca08fe730bf

cardano-cli 1.27.0 - linux-x86_64 - ghc-8.10
git rev 8fe46140a52810b6ca456be01d652ca08fe730bf

Log Errors are as follows:
Producer Node

May 30 23:00:40 node-ada cardano-node[811]: [ada-node:cardano.node.IpSubscription:Info:56] [2021-05-31 03:00:40.06 UTC] IPs: [RelayIP:Port] Skipping peer RelayIP:Port
May 30 23:00:40 node-ada cardano-node[811]: [ada-node:cardano.node.IpSubscription:Error:56] [2021-05-31 03:00:40.06 UTC] IPs: [RelayIP:Port] Failed to start all required subscriptions
May 30 23:00:41 node-ada cardano-node[811]: [ada-node:cardano.node.IpSubscription:Info:56] [2021-05-31 03:00:41.06 UTC] IPs: [RelayIP:Port] Restarting Subscription after 1.001295749s desired valency 1 current valency 0

Relay Node
May 31 03:34:52 relay-node cardano-node[810]: [ada-node:cardano.node.IpSubscription:Info:30821] [2021-05-31 03:34:52.69 UTC] IPs: [producerIP:Port] Closed socket to producerIP:Port
May 31 03:34:53 relay-node cardano-node[810]: [ada-node:cardano.node.IpSubscription:Info:1338] [2021-05-31 03:34:53.74 UTC] IPs: [producerIP:Port] Restarting Subscription after 2.110583293s desired valency 1 current valency 0
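
For anyone watching the same lines scroll by, here is a minimal sketch that pulls the valency numbers out of "Restarting Subscription" entries so you can see how often a node falls short of its desired peers. The regular expression is based only on the log lines pasted above, and the function name is made up for illustration; other node versions may word the message differently.

```python
import re

# Pattern taken from the "Restarting Subscription" lines quoted in this thread.
RESTART_RE = re.compile(
    r"Restarting Subscription after (?P<delay>[\d.]+)s "
    r"desired valency (?P<desired>\d+) current valency (?P<current>\d+)"
)


def valency_shortfalls(log_lines):
    """Return (delay_seconds, desired, current) for restarts where current < desired."""
    results = []
    for line in log_lines:
        match = RESTART_RE.search(line)
        if match is None:
            continue  # not a restart line (e.g. "Skipping peer")
        desired = int(match.group("desired"))
        current = int(match.group("current"))
        if current < desired:
            results.append((float(match.group("delay")), desired, current))
    return results


sample = [
    "[ada-node:cardano.node.IpSubscription:Info:56] [2021-05-31 03:00:40.06 UTC] "
    "IPs: [RelayIP:Port] Skipping peer RelayIP:Port",
    "[ada-node:cardano.node.IpSubscription:Info:56] [2021-05-31 03:00:41.06 UTC] "
    "IPs: [RelayIP:Port] Restarting Subscription after 1.001295749s "
    "desired valency 1 current valency 0",
]
print(valency_shortfalls(sample))
```

Feeding it the output of journalctl for the cardano-node unit should work the same way, one line per list element.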

Here is the connection testing I have completed on the Relay Node:

1. netcat -zvn producerIP Port# (result: Connection to producerIP PORT# port [tcp/*] succeeded!)
2. rm -rf cardano-node/db && sudo systemctl reload-or-restart cardano-node
3. Even rebuilt from source towards the end
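
The netcat check in step 1 can also be scripted. This is a rough sketch using only the Python standard library; the function name is hypothetical, and the host and port are whatever your own producer or relay uses. Keep in mind that a successful TCP connect only shows the port is reachable. It says nothing about whether the node-to-node subscription will succeed, which matches what I saw here.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call with placeholder values (replace with your own node address):
# can_connect("producerIP", 3001)
```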

Might be a memory issue. I’ve had a similar problem.

Search the forum for "No more Peers Suddenly" to find the discussion.

I’d go with a 4-core CPU and 16GB RAM to be future-proof. Anything less than that can cause problems.

Woah! I have to bump the memory up even further than 8GB? I am running this on digitalocean. That would bring my costs to $240 for a base level 1 producer and 2 relays.

I don’t believe it’s memory for me as my machine has plenty of memory left.

Top results:
MiB Mem : 15.8/7961.8

Do you think there is a mechanism in the cardano build that enforces a minimum amount of memory to run? I haven’t seen any memory errors in my logs. I could totally be wrong, but there is too much free memory left for me to think this is the issue.

I’ll see if I can find anything indicating this.

Thank you for your insight. :bowing_man: I’ll keep chipping at this.

I haven’t dug that deep into it. I had two r610 with 48gb (12*4gb sticks) and had problems.

You can also try setting TraceMempool to false in config.json
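
If you would rather not hand-edit the file, here is a small sketch of that change. It assumes the key is spelled "TraceMempool" in your config, so check your own file first; the function name is just for illustration.

```python
import json


def disable_mempool_tracing(config_text):
    """Return the config JSON text with mempool tracing switched off."""
    config = json.loads(config_text)
    # Assumed key name; verify the spelling against your own config.json.
    config["TraceMempool"] = False
    return json.dumps(config, indent=2)


example = '{"TraceMempool": true, "Protocol": "Cardano"}'
print(disable_mempool_tracing(example))
```

Read your config file into a string, pass it through the function, write the result back, then restart the node so it picks up the change.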

I upgraded to new server with 2*32gb and had no problem.

I’ll give this a test soon. Only thing stopping me is really that I just scaled up all my nodes from 4gb to 8gb. I will try to see later if scaling up my producer only to 16gb will help out.

If this is really going to need 16GB to run, I will have to rethink my cloud approach and upgrade my home network to run this on-prem. I’ve got a few big boy machines in the closet I can run KVM/QEMU on :slight_smile:

I’ll report back here if this fixed it.


Updated my producer to 16GB, but no changes.

The error is now IpSubscription:Info:63 after scaling up.

ada-node:cardano.node.IpSubscription:Info:63] [2021-05-31 18:21:54.21 UTC] IPs: [,] Restarting Subscription after 1.066310531s desired valency 2 current valency 0

Stop the relay and BP nodes. Delete the database. Restart and resync with the default topology file.
Start your relays first, and make sure the BP is only connected to your relays. Then monitor the nodes.
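
To illustrate "BP only connected to your relays", here is a sketch that builds a producer topology listing nothing but your own relays. It assumes the "Producers" list layout with addr, port and valency fields used by the stock mainnet-topology.json of this node generation. The relay addresses and port are placeholders, not real values.

```python
import json


def producer_topology(relays):
    """Build a topology dict that lists only the given (addr, port) relays."""
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": 1}
            for addr, port in relays
        ]
    }


# Placeholder relay addresses; substitute your own relay IPs and ports.
topology = producer_topology([("10.0.0.2", 3001), ("10.0.0.3", 3001)])
print(json.dumps(topology, indent=2))
```

The relays themselves keep their public peers plus an entry pointing back at the BP.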

It did that for me for a good minute until I came back to check and it was working fine.

Ok, got it working.

This was caused by downloading incompatible JSON config files. The mainnet* files were updated for the new Alonzo update, which left the original coincashew docs out of alignment with the latest files.

I just had to rerun the wget commands in my cardano-node/ directory to download and replace the .json config files. Also remember to update your mainnet-topology.json afterwards. I forgot that I had replaced that file as well. Oh, the laugh I had after realizing I had shot myself in the foot.
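
A quick sanity check after re-downloading, as a sketch only: it looks for config keys ending in "GenesisFile" and reports any whose file is missing next to the config. That key naming is an assumption based on the stock config files, the function name is hypothetical, and the check will not catch every kind of version mismatch.

```python
import json
import os


def missing_genesis_files(config, base_dir=".", exists=os.path.exists):
    """Return (key, filename) pairs for referenced genesis files that are absent."""
    missing = []
    for key, value in sorted(config.items()):
        # Assumption: genesis references are string values under "*GenesisFile" keys.
        if key.endswith("GenesisFile") and isinstance(value, str):
            if not exists(os.path.join(base_dir, value)):
                missing.append((key, value))
    return missing


def load_config(path):
    """Read a node config file from disk."""
    with open(path) as handle:
        return json.load(handle)
```

Call load_config on your config file and pass the result, together with the directory holding the files, to missing_genesis_files. An empty list means every referenced genesis file is present.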


Hope this helps someone else.