Producer & Relay fail to start subscriptions/Connects during sync [IpSubscription:Info:56]

eddycarra · 31 May 2021 03:43

I started up my nodes for the first time and am waiting for syncing to complete. Syncing itself looks fine so far, but communication between the nodes is pretty much one-way. I am attempting to deploy to mainnet.

I have been getting this error where my producer node fails to start all required subscriptions.

Current topology is 1 producer and 1 relay. I am working on building a third relay as we speak.

What I find interesting is that at the beginning everything seems to work correctly. All peers show as connected and data is flowing. But once the producer node reaches around 14% synced, this issue starts.

Syncing status as of writing this:
Relay Node: 95.5%
Producer Node: 14.6% (rebuilt the producer as one of my attempts to repair this)

Node vHardware:

8GB RAM | 4 vCPU | 160GB Storage

cabal-install version 3.4.0.0
compiled using version 3.4.0.0 of the Cabal library

GHC version 8.10.4 (installed via ghcup)

cardano-node 1.27.0 - linux-x86_64 - ghc-8.10
git rev 8fe46140a52810b6ca456be01d652ca08fe730bf

cardano-cli 1.27.0 - linux-x86_64 - ghc-8.10
git rev 8fe46140a52810b6ca456be01d652ca08fe730bf

Log Errors are as follows:
Producer Node

May 30 23:00:40 node-ada cardano-node[811]: [ada-node:cardano.node.IpSubscription:Info:56] [2021-05-31 03:00:40.06 UTC] IPs: 0.0.0.0:0 [RelayIP:Port] Skipping peer RelayIP:Port
May 30 23:00:40 node-ada cardano-node[811]: [ada-node:cardano.node.IpSubscription:Error:56] [2021-05-31 03:00:40.06 UTC] IPs: 0.0.0.0:0 [RelayIP:Port] Failed to start all required subscriptions
May 30 23:00:41 node-ada cardano-node[811]: [ada-node:cardano.node.IpSubscription:Info:56] [2021-05-31 03:00:41.06 UTC] IPs: 0.0.0.0:0 [RelayIP:Port] Restarting Subscription after 1.001295749s desired valency 1 current valency 0

Relay Node
May 31 03:34:52 relay-node cardano-node[810]: [ada-node:cardano.node.IpSubscription:Info:30821] [2021-05-31 03:34:52.69 UTC] IPs: 0.0.0.0:0 [producerIP:Port] Closed socket to producerIP:Port
May 31 03:34:53 relay-node cardano-node[810]: [ada-node:cardano.node.IpSubscription:Info:1338] [2021-05-31 03:34:53.74 UTC] IPs: 0.0.0.0:0 [producerIP:Port] Restarting Subscription after 2.110583293s desired valency 1 current valency 0
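In these lines, "desired valency" is the number of connections the node wants for that peer group and "current valency" is how many it actually holds, so "desired valency 1 current valency 0" means the subscription has no live connection at all. A minimal sketch for pulling those lines out of a long log, assuming the message format shown above (the function name valency_gaps is made up for illustration):

```shell
valency_gaps() {
  # Reads log lines on stdin and prints one summary line for every
  # "Restarting Subscription" message whose current valency is below the desired one.
  awk '
    /desired valency [0-9]+ current valency [0-9]+/ {
      for (i = 1; i < NF; i++) {
        if ($i == "desired" && $(i + 1) == "valency") want = $(i + 2)
        if ($i == "current" && $(i + 1) == "valency") have = $(i + 2)
      }
      # Compare as numbers, not strings
      if (have + 0 < want + 0) printf "below target: %d of %d\n", have, want
    }'
}
```

With the systemd unit used in this thread, something like journalctl -u cardano-node | valency_gaps should show how often the node is short of peers. It only summarises the symptom and says nothing about the cause.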

Here is the connection testing I have completed on the Relay Node:

1. netcat -zvn producerIP Port# (result: Connection to producerIP PORT# port [tcp/*] succeeded!)
2. rm -rf cardano-node/db && sudo systemctl reload-or-restart cardano-node
3. Even rebuilt from source towards the end
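The netcat test above only covers the relay-to-producer direction, while the failing subscription in the producer log goes the other way. A rough sketch of the same check that can be run from the producer against the relay, assuming bash and the coreutils timeout command are available (can_connect is a hypothetical helper name, and RelayIP and Port stand for your own values):

```shell
can_connect() {
  # $1 = host, $2 = port
  # Succeeds only if a TCP connection can be opened within 3 seconds.
  # Uses bash's built-in /dev/tcp redirection, so no netcat is needed.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Usage on the producer would look like: can_connect RelayIP Port && echo reachable || echo unreachable. A successful TCP connection only rules out firewall and routing problems; the node can still refuse to keep the connection for other reasons.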

Active-LLC · 31 May 2021 05:44

Might be a memory issue. I’ve had a similar problem.

Search the forum for "No more Peers Suddenly" to find the discussion.

I’d go with a 4-core CPU and 16GB RAM to be future proof. Anything less than that can cause problems.

eddycarra · 31 May 2021 05:51

Woah! I have to bump the memory up even further than 8GB? I am running this on DigitalOcean. That would bring my costs to $240 for a base-level setup of 1 producer and 2 relays.

I don’t believe it’s memory for me as my machine has plenty of memory left.

Top results:
MiB Mem : 15.8/7961.8

Do you think there is a mechanism in the cardano build that enforces a minimum amount of memory to run? I haven’t seen any memory errors in my logs. I could totally be wrong, but there is so much memory left that I find it hard to believe this is the issue.

I’ll see if I can find anything indicating this.

Thank you for your insight. I’ll keep chipping at this.

Active-LLC · 31 May 2021 06:04

I haven’t dug that deep into it. I had two r610 with 48gb (12*4gb sticks) and had problems.

You can also try turning off mempool tracing by setting TraceMempool to false in config.json
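A small sketch of that change, assuming the key is spelled TraceMempool and currently set to true, as in the stock mainnet-config.json. The function name disable_mempool_trace is made up for illustration, and the path you pass in is whatever config file your node is started with:

```shell
disable_mempool_trace() {
  # $1 = path to the node config file (for example mainnet-config.json)
  cfg=$1
  # Keep a backup next to the original before touching it
  cp "$cfg" "$cfg.bak" || return 1
  # Flip only the TraceMempool setting; every other line is copied unchanged
  sed 's/"TraceMempool"[[:space:]]*:[[:space:]]*true/"TraceMempool": false/' "$cfg.bak" > "$cfg"
}
```

The node only reads its config at startup, so a restart (sudo systemctl restart cardano-node with the unit used in this thread) is needed afterwards.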

I upgraded to new server with 2*32gb and had no problem.

eddycarra · 31 May 2021 06:08

I’ll give this a test soon. Only thing stopping me is really that I just scaled up all my nodes from 4gb to 8gb. I will try to see later if scaling up my producer only to 16gb will help out.

If this is really going to need 16gb to run I will have to rethink my cloud approach and upgrade my home network to run this on-prem. I got a few big boy machines in the closet I can run KVM/QEMU on.

I’ll report back here if this fixed it.

eddycarra · 31 May 2021 18:23

Updated my producer to 16GB, but no changes.

The error is now IpSubscription:Info:63 after scaling up.

[ada-node:cardano.node.IpSubscription:Info:63] [2021-05-31 18:21:54.21 UTC] IPs: 0.0.0.0:0 [xxx.xxx.xxx.xxx:6000,xxx.xxx.xxx.xxx:6000] Restarting Subscription after 1.066310531s desired valency 2 current valency 0

Active-LLC · 31 May 2021 20:14

Stop the relay and BP nodes. Delete the database. Restart and resync with the default topology file.
Start your relays first, and make sure the BP is only connected to your relays. Then monitor the nodes.

It did that for me for a good minute until I came back to check and it was working fine.
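For the "BP is only connected to your relays" part, the producer's topology file should list the relays and nothing else. A sketch that prints such a file in the legacy "Producers" format used by node versions of this era (make_topology is a hypothetical helper, and the addresses in the usage line are placeholders):

```shell
make_topology() {
  # Each argument is one relay given as host:port.
  # Prints a legacy-format topology with one entry per relay to stdout.
  printf '{\n  "Producers": [\n'
  n=$#
  i=0
  for peer in "$@"; do
    i=$((i + 1))
    host=${peer%:*}    # everything before the last colon
    port=${peer##*:}   # everything after the last colon
    sep=','
    if [ "$i" -eq "$n" ]; then sep=''; fi   # no trailing comma on the last entry
    printf '    { "addr": "%s", "port": %s, "valency": 1 }%s\n' "$host" "$port" "$sep"
  done
  printf '  ]\n}\n'
}
```

For example, make_topology 10.0.0.1:6000 10.0.0.2:6000 > mainnet-topology.json would write a two-relay topology for the producer. The relays themselves keep the default public entry and add the producer's address.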

eddycarra · 4 June 2021 05:02

Ok, got it working.

This was caused by downloading incompatible JSON config files. The mainnet* files were updated for the new Alonzo update, and the original CoinCashew docs were out of alignment with the latest files.

I just had to rerun these wget commands in my cardano-node/ directory to download and replace the .json config files. Also remember to update your mainnet-topology.json. I forgot that I replaced that file as well. Oh the laugh I had after realizing I shot myself in the foot.

wget https://hydra.iohk.io/build/6198010/download/1/mainnet-config.json
wget https://hydra.iohk.io/build/6198010/download/1/mainnet-byron-genesis.json
wget https://hydra.iohk.io/build/6198010/download/1/mainnet-shelley-genesis.json
wget https://hydra.iohk.io/build/6198010/download/1/mainnet-topology.json
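After re-downloading, a quick sanity check is to confirm that every genesis file named in mainnet-config.json is actually present next to it. A minimal sketch, assuming the config refers to them through keys ending in GenesisFile (such as ByronGenesisFile and ShelleyGenesisFile) with paths relative to the config's directory and without spaces; check_genesis_files is a made-up name:

```shell
check_genesis_files() {
  # $1 = path to the node config file
  cfg=$1
  dir=$(dirname "$cfg")
  rc=0
  # Pull out the value of every "...GenesisFile" key and test that the file exists
  for f in $(grep -o '"[A-Za-z]*GenesisFile"[[:space:]]*:[[:space:]]*"[^"]*"' "$cfg" \
             | sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/'); do
    if [ -f "$dir/$f" ]; then
      echo "ok      $f"
    else
      echo "missing $f"
      rc=1
    fi
  done
  return $rc
}
```

Running check_genesis_files cardano-node/mainnet-config.json lists each referenced file as ok or missing. It cannot tell whether the files match your node version, only whether the set you downloaded is complete.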

Hope this helps someone else.
