Failed to start all required subscriptions error

Finally resorting to posting after scouring this forum and using it to troubleshoot over the past week. I apologize because I am inexperienced in Linux and am learning as I go.
I have basically made it through CoinCashew’s guide through step 8, trying to get the nodes running before getting block-producer keys.

I have 2 nodes set up on VirtualBox machines running Ubuntu 20.04 on my Windows 10 PC. I have managed to get both synced with the blockchain. However, when I edit my block-producing node’s topology file to point only to port 6000 on my relay node, I get the error code “DNS subscription warning 71: failed to start all required subscriptions”.

The two nodes do seem to be able to connect to one another, because when I pull up the gLiveView monitor, I see the relay’s In peer count go from 0 to 1, and the block-producing node’s Out peer count is 1.

I have allowed port forwarding of 6000 on my router and also within the virtual machines and they register as open.

I pored through the topology and env files for errors, but everything seems to point to the right place.
Anybody care to help me with my next troubleshooting step?

Thanks!
-Sully

Hi!

Please provide some more information:
The whole text of the warning; perhaps an IP address was mentioned in it as well.
Could you check which interface the nodes are listening on, using sudo netstat -lnaopt
And can you check that these listening ports are open to the other virtual machine, using netcat -znv <other_nodeIP> <other_node_port>
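
For anyone without netcat at hand, the reachability check suggested above can be sketched in Python. This is only an illustration: the function name port_is_open is made up for this example and is not part of any Cardano tooling.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (similar to `netcat -z`)."""
    try:
        # create_connection performs the TCP handshake; we close it right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A call such as port_is_open("192.168.1.2", 6000), run from the other virtual machine, would correspond to the netcat command above; the address and port here are only placeholders for your own values.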

OK, the error when running the node is:

Mar 22 14:49:51 sullynode-VirtualBox bash[666]: [sullynod:cardano.node.DnsSubscription:Warning:415] [2021-03-22 18:49:51.63 UTC] Domain: “<sully.mypublicdnsname.com>” Failed to start all required subscriptions

The relevant part of my netstat report:

tcp 0 0 127.0.0.1:12798 0.0.0.0:* LISTEN 666/cardano-node off (0.00/0/0)
tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN 666/cardano-node off (0.00/0/0)
tcp 0 0 127.0.0.1:12788 0.0.0.0:* LISTEN 666/cardano-node off (0.00/0/0)
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 466/systemd-resolve off (0.00/0/0)
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 511/cupsd off (0.00/0/0)
tcp 0 0 10.0.2.15:46294 99.84.176.113:443 TIME_WAIT - timewait (4.66/0/0)
tcp 0 0 10.0.2.15:44856 172.217.164.163:80 TIME_WAIT - timewait (3.65/0/0)

It also looks like it is listening on port 6000 on the relay node.
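
As a reading aid for the netstat output above: the local address column shows which interface each port is bound to. A rough Python sketch of that interpretation follows; the helper name listening_scope is hypothetical, and it assumes the column layout shown in this post.

```python
def listening_scope(netstat_line: str):
    """Classify one netstat TCP line; returns (port, scope) or None if not LISTEN."""
    fields = netstat_line.split()
    # Expected columns: proto, recv-q, send-q, local address, foreign address, state, ...
    if len(fields) < 6 or fields[5] != "LISTEN":
        return None
    address, _, port = fields[3].rpartition(":")
    if address in ("0.0.0.0", "::"):
        scope = "all interfaces"      # reachable from other machines
    elif address.startswith("127.") or address == "::1":
        scope = "loopback only"       # reachable only from the same machine
    else:
        scope = address               # bound to one specific interface
    return int(port), scope
```

Read this way, 0.0.0.0:6000 means the node accepts connections on every interface, while the 127.0.0.1 entries (12798, 12788) are only reachable from inside the same virtual machine.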

I am having difficulty knowing exactly which IP and host address to use for each configuration. I have an external IP that I am mapping with a free DDNS service. Then I have the IP address of the physical machine, 192.168.1.2. Then I have the IP address I get when I query ‘ip a’ in a terminal within each virtual machine. This is the same for both the BP node and the relay (10.0.2.15), though each is a different machine. I have disabled all firewalls for testing, and forwarded port 6000 on my main router as well as within each virtual machine.
I do plan on eventually running node and relay on different machines, but I was trying to prove I could get things running before making any investments.
I’m sure my virtual machine setup is causing some extra headaches, but I still feel like I’m close to making this work. Thanks for your help, guys!
-Sully

You don’t need to use your public hostname in the topology.json; please use 192.168.1.2 instead. But first try to reach the listening port with netcat -zvn 192.168.1.2 6000 from the other virtual machine.
Also, if you have trouble with the NAT address, you can assign a bridged network to your virtual machine. In that case the virtual machine will sit on the host’s network (192.168.1.x) instead of on its own NAT network (10.0.2.15).

Well this is embarrassing. In re-editing my topology file, I realized that I had left in the < > symbols around the ip address in my block-producing node. When I deleted those and changed it to 192.168.1.2, it worked fine. And then I changed it back to my public DDNS address and it still worked. So thanks for helping me work through that part.
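
A leftover placeholder character like this is easy to miss by eye, so a quick check of the topology file can help. Below is a minimal Python sketch, assuming the topology uses the "Producers" list with "addr", "port" and "valency" keys as in the guide's examples; the function name find_bad_addresses and the loose hostname pattern are assumptions made for illustration.

```python
import ipaddress
import json
import re
from typing import List

# Loose hostname pattern (an assumption): letters, digits, dots and hyphens only.
HOSTNAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$")

def find_bad_addresses(topology_text: str) -> List[str]:
    """Return every 'addr' value that is neither a valid IP nor a plausible hostname."""
    topology = json.loads(topology_text)
    bad = []
    for producer in topology.get("Producers", []):
        addr = str(producer.get("addr", ""))
        try:
            ipaddress.ip_address(addr)
            continue  # valid IPv4 or IPv6 literal
        except ValueError:
            pass  # not an IP literal; fall through to the hostname check
        if not HOSTNAME_RE.match(addr):
            bad.append(addr)
    return bad
```

Passing the file contents to find_bad_addresses would flag an entry such as "<192.168.1.2>" while accepting 192.168.1.2 or a plain DNS name.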

I’m still a bit confused as to when to use the main public address (the one assigned by my ISP) vs an internal machine address. As I think through it, if I have my relay and block producer in the same home network (eventually different machines) I can just use the machine addresses in the topology? But if I move a relay off-site, then I would need to use the external public IP to contact it?

Also, in this situation of the two virtual machines, the block node AND relay node topology have the same ip address (192.168.1.2 port 6000). How does that not confuse the whole set up when the nodes are referring to themselves and each other at the same time? And though they seem to be communicating, does this defeat the whole purpose of having the block producing node separate from the relay node for security reasons?

Thanks for your help in teaching me network 101 and better understanding the node process.
-Sully

Yes, the public address should be used when the node is not in the same internal network.
Good point about the address. Actually they should have different addresses, since both are now behind NAT (for example 10.0.2.15:6000 and 10.0.2.16:6000), but if both are forwarded to the same port on the host network address 192.168.1.2, it will mess things up. So it would be better to run them on different ports.
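
The rule of thumb here (internal address inside the same network, public address otherwise) can be illustrated with Python's standard ipaddress module. This is a minimal sketch; the helper name is_internal is made up for this example.

```python
import ipaddress

def is_internal(addr: str) -> bool:
    """True for addresses in private or otherwise non-public ranges
    (e.g. 10.0.0.0/8, 192.168.0.0/16, loopback), which are not reachable
    from the internet without port forwarding."""
    return ipaddress.ip_address(addr).is_private
```

Both 192.168.1.2 and 10.0.2.15 count as internal by this check, so a relay hosted off-site could not reach them directly and would need the public address plus a port forward.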

But when I use the ip a command, both node and relay seem to have the same IP of 10.0.2.15. And looking at the peer screen of gLiveView, the block node has 192.168.1.2:6000 as the only Out peer. My relay node has the same address as one of the Out peers, as well as a couple of others from the public relays in the topology file. Then there are two IPs (10.0.2.2:57586 and 10.0.2.2:57593) on the In side.
Things seem to be running correctly, but still not sure if the two nodes have the same address and port. Should I change the relay node to a different port to be safe?

-Sully

I think the network of the relay node is 10.0.2.2 then…
Could you verify that the listening port of the block node is reachable from the relay node, using netcat -vnz?

So netcat to 10.0.2.2 6000 from the relay node was a success. Trying ports 57586 and 57593 failed.

OK, that means that the relay node listens on 10.0.2.2 6000. But can you reach the block node from the relay? netcat -vnz <block ip> <block port>

Yes, netcat is a success on 192.168.1.2 6000 and on 10.0.2.15 6000 as well. And that is true in both directions, relay to node and node to relay. It is also true of port 6000 on my external IP.

OK, let’s do another test to see who is getting the data…
Stop both nodes, then start listening with netcat on both machines on the same port: netcat -l 6000

Then, from outside the virtual machines, in a Windows shell, try to telnet into the opened port: telnet 192.168.1.2 6000. If you get a successful connection, just type something into the telnet session and hit enter. That message should arrive at the console of one of the virtual machines.
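
If netcat is not installed, the listening side of this test could be approximated in Python. This is a rough sketch rather than a replacement for netcat; the function name receive_once is hypothetical.

```python
import socket

def receive_once(port: int, host: str = "0.0.0.0") -> str:
    """Listen on host:port, accept one connection, and return the first chunk of text received."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        # Allow quick restarts on the same port between test runs.
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, peer = srv.accept()  # blocks until someone connects
        with conn:
            data = conn.recv(1024)
            return f"{peer[0]}: {data.decode(errors='replace').strip()}"
```

Running print(receive_once(6000)) on each virtual machine (with the nodes stopped) and then connecting from the Windows host would show which machine actually receives the forwarded traffic.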

“I have allowed port forwarding of 6000 on my router and also within the virtual machines and they register as open.”

You wrote this in the first post. Your setup can actually work if only the relay virtual machine has the open port…

OK, that was interesting. When I type into the telnet session, my text comes up in the virtual machine that I have most recently clicked on. So it is not mirrored by both node and relay simultaneously. Whichever of the virtual machines is in the foreground gets the text; the other receives nothing.

Yes, since only one can get it. So if you close the port forward on the block-producing node, it will be a good setup: the nodes can communicate on the 10.0.2.0/24 subnet, and the public relay nodes will communicate via 192.168.1.2 6000, which is forwarded to the relay node…

But I still have 192.168.1.2 6000 as the relay’s address in the block producing topology, as well as the same address for the block node in the relay’s topology. Should I change this?
-Sully

Yes, use the internal address. Keep in mind that only 1 node is reachable on 192.168.1.2.

Because I just turned off port forwarding of port 6000 on the block node, right?
OK. I’m starting to better understand network pathing now. Thanks so much for your time on this!
You’ve been so helpful, I might come back to this thread if I have more questions, but I’ll give you a break for now.
Thanks again!
-Sully

No problem! Consider marking one of the answers as the solution, to indicate to others that this topic no longer needs attention.