Problems after Alonzo Hard Fork, about to miss a block

I’m having some major issues after the Hard Fork.
I have one relay connecting and synced fine.
The other doesn’t seem to be connecting to the peers in its topology:

ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 44> closed when reading data, waiting on next header True"))) 20s 20s
Sep 12 19:37:34 sully-VirtualBox bash[3776]: [sully-Vi:cardano.node.ErrorPolicy:Warning:121] [2021-09-12 23:37:34.34 UTC] IP 10.0.2.2:40027 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 28> closed when reading data, waiting on next header True"))) 20s 20s
Sep 12 19:37:34 sully-VirtualBox bash[3776]: [sully-Vi:cardano.node.ErrorPolicy:Warning:121] [2021-09-12 23:37:34.86 UTC] IP 10.0.2.2:42395 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError (MuxIOException Network.Socket.recvBuf: resource vanished (Connection reset by peer)) "(recv errored)"))) 20s 20s
Sep 12 19:37:35 sully-VirtualBox bash[3776]: [sully-Vi:cardano.node.ErrorPolicy:Warning:121] [2021-09-12 23:37:35.05 UTC] IP 10.0.2.2:33553 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 27> closed when reading data, waiting on next header True"))) 20s 20s
Sep 12 19:37:36 sully-VirtualBox bash[3776]: [sully-Vi:cardano.node.ErrorPolicy:Warning:121] [2021-09-12 23:37:36.39 UTC] IP 10.0.2.2:34665 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError (MuxIOException Network.Socket.recvBuf: resource vanished (Connection reset by peer)) "(recv errored)"))) 20s 20s
Sep 12 19:37:36 sully-VirtualBox bash[3776]: [sully-Vi:cardano.node.ErrorPolicy:Warning:121] [2021-09-12 23:37:36.75 UTC] IP 10.0.2.2:36543 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 28> closed when reading data, waiting on next header True"))) 20s 20s
Sep 12 19:37:37 sully-VirtualBox bash[3776]: [sully-Vi:cardano.node.ErrorPolicy:Warning:121] [2021-09-12 23:37:37.90 UTC] IP 10.0.2.2:32569 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 27> closed when reading data, waiting on next header True"))) 20s 20s

And then my block producing node is coming up with this, but never syncing…

Sep 12 19:38:46 sully-OptiPlex-7050 bash[2092]: [sully-Op:cardano.node.LeadershipCheck:Info:74] [2021-09-12 23:38:46.40 UTC] {"kind":"TraceStartLeadershipCheck","chainDensity":5.074388e-2,"slot":39923635,"delegMapSize":827585,"utxoSize":3228794,"credentials":"Cardano"}
Sep 12 19:38:48 sully-OptiPlex-7050 bash[2092]: [sully-Op:cardano.node.Forge:Info:74] [2021-09-12 23:38:48.27 UTC] fromList [("val",Object (fromList [("kind",String "TraceNodeNotLeader"),("slot",Number 3.9923635e7)])),("credentials",String "Cardano")]
Sep 12 19:38:48 sully-OptiPlex-7050 bash[2092]: [sully-Op:cardano.node.LeadershipCheck:Info:74] [2021-09-12 23:38:48.27 UTC] {"kind":"TraceStartLeadershipCheck","chainDensity":5.074388e-2,"slot":39923637,"delegMapSize":827585,"utxoSize":3228794,"credentials":"Cardano"}

Not sure where to go next. I updated the topology of my non-syncing relay, but still no luck…
-Sully


I’m not sure if this will help, but I was having some issues after the hard fork as well, with relays dropping off. Once I set TraceMempool to false in mainnet-config.json on the relays and restarted the cardano-node service, that seemed to fix things.
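
For anyone who wants to script that change across several relays, here is a minimal Python sketch. It assumes the config is plain JSON with a top-level TraceMempool key, as in the stock mainnet-config.json; `set_trace_mempool` is a made-up helper name, and the file path is whatever your own setup uses.

```python
import json
from pathlib import Path


def set_trace_mempool(config_path, enabled=False):
    """Set TraceMempool in a node config file and return the previous value.

    Assumes the file is plain JSON with TraceMempool as a top-level key.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    previous = config.get("TraceMempool")  # None if the key was absent
    config["TraceMempool"] = enabled
    # Key order is preserved; indentation is normalised to two spaces.
    path.write_text(json.dumps(config, indent=2) + "\n")
    return previous
```

Calling `set_trace_mempool("mainnet-config.json")` on a copy of the file first is a sensible precaution. The node still needs a restart afterwards to pick up the change.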


I already had that set to false. Still having problems after rebooting everything.

I’ve been forced to reset the db on one of my relay nodes. I had one synced briefly, but now it won’t start either. Apparently some others had db corruption during the hard fork as well. My db re-sync is going to take about 4 days. I need some help to salvage this epoch or I’m going to miss a bunch of blocks.


So, any updates?

I have one relay node that is synced.
I have one relay that reached 99.9% synced and then got stuck overnight. After deleting and recopying the db and restarting, it is stuck in “starting” and giving me this in the journal, as if the topologies are bad…

Sep 13 06:05:46 sully-VirtualBox bash[17807]: [sully-Vi:cardano.node.DnsSubscription:Error:1333] [2021-09-13 10:05:46.31 UTC] Domain: "relays-new.cardano-mainnet.iohk.io" Application Exception: 3.132.200.230:3001 MuxError MuxBearerClosed "<socket: 48> closed when reading data, waiting on next header True"
Sep 13 06:05:46 sully-VirtualBox bash[17807]: [sully-Vi:cardano.node.ErrorPolicy:Warning:432] [2021-09-13 10:05:46.31 UTC] IP 3.132.200.230:3001 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 48> closed when reading data, waiting on next header True"))) 20s 20s
Sep 13 06:05:46 sully-VirtualBox bash[17807]: [sully-Vi:cardano.node.DnsSubscription:Error:1342] [2021-09-13 10:05:46.80 UTC] Domain: "relay1.888pool.io" Application Exception: 62.171.180.213:6000 MuxError MuxBearerClosed "<socket: 51> closed when reading data, waiting on next header True"
Sep 13 06:05:46 sully-VirtualBox bash[17807]: [sully-Vi:cardano.node.ErrorPolicy:Warning:459] [2021-09-13 10:05:46.80 UTC] IP 62.171.180.213:6000 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 51> closed when reading data, waiting on next header True"))) 20s 20s
Sep 13 06:05:47 sully-VirtualBox bash[17807]: [sully-Vi:cardano.node.DnsSubscription:Error:1343] [2021-09-13 10:05:47.29 UTC] Domain: "relay2.adaocean.com" Connection Attempt Exception, destination 23.227.207.90:6000 exception: Network.Socket.connect: <socket: 50>: does not exist (Connection refused)

My blocknode is stuck in “starting” and sending this in the journal:

Sep 13 06:04:01 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.Forge:Info:416] [2021-09-13 10:04:01.96 UTC] fromList [("val",Object (fromList [("kind",String "TraceNodeNotLeader"),("slot",Number 3.9961149e7)])),("credentials",String "Cardano")]
Sep 13 06:04:01 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.LeadershipCheck:Info:416] [2021-09-13 10:04:01.96 UTC] {"kind":"TraceStartLeadershipCheck","chainDensity":5.074388e-2,"slot":39961150,"delegMapSize":827585,"utxoSize":3228794,"credentials":"Cardano"}
Sep 13 06:04:02 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.IpSubscription:Info:426] [2021-09-13 10:04:02.27 UTC] IPs: 0.0.0.0:0 [192.168.1.2:6000,192.168.1.41:6001] Trying to connect to 192.168.1.2:6000
Sep 13 06:04:02 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.IpSubscription:Info:3347185] [2021-09-13 10:04:02.27 UTC] IPs: 0.0.0.0:0 [192.168.1.2:6000,192.168.1.41:6001] Connection Attempt Start, destination 192.168.1.2:6000
Sep 13 06:04:02 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.IpSubscription:Notice:426] [2021-09-13 10:04:02.27 UTC] IPs: 0.0.0.0:0 [192.168.1.2:6000,192.168.1.41:6001] Waiting 0.025s before attempting a new connection
Sep 13 06:04:02 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.IpSubscription:Info:3347185] [2021-09-13 10:04:02.27 UTC] IPs: 0.0.0.0:0 [192.168.1.2:6000,192.168.1.41:6001] Connection Attempt End, destination 192.168.1.2:6000 outcome: ConnectSuccess
Sep 13 06:04:02 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.IpSubscription:Error:3347185] [2021-09-13 10:04:02.27 UTC] IPs: 0.0.0.0:0 [192.168.1.2:6000,192.168.1.41:6001] Application Exception: 192.168.1.2:6000 MuxError (MuxIOException Network.Socket.recvBuf: resource vanished (Connection reset by peer)) "(recv errored)"
Sep 13 06:04:02 sully-OptiPlex-7050 bash[2674586]: [sully-Op:cardano.node.IpSubscription:Info:3347185] [2021-09-13 10:04:02.27 UTC] IPs: 0.0.0.0:0 [192.168.1.2:6000,192.168.1.41:6001] Closed socket to 192.168.1.2:6000

I can ping the blocknode from the working relay and vice versa.

-Sully
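
A side note on the ping test: ping only shows ICMP reachability, while the nodes talk to each other over TCP on the ports in the topology file (192.168.1.2:6000 and 192.168.1.41:6001 in the journal above). A minimal Python sketch for checking the TCP side follows; `tcp_port_open` is a hypothetical helper name, not part of cardano-node. Keep in mind the journal already shows ConnectSuccess followed by a reset, so an open port alone does not rule out a problem later in the connection.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from the block producer, `tcp_port_open("192.168.1.2", 6000)` would tell you whether the relay is accepting TCP connections on that port at all.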

To start your producer, you can add the IOHK relays to the topology file and restart the producer.

Well, I tried that on the relay that isn’t syncing and it hasn’t worked.

I was getting this:

Sep 13 06:14:20 sully-VirtualBox bash[28112]: [sully-Vi:cardano.node.DnsSubscription:Notice:508] [2021-09-13 10:14:20.91 UTC] Domain: "relays-new.cardano-mainnet.iohk.io" Connection Attempt End, destination 3.129.158.233:3001 outcome: ConnectSuccessLast
Sep 13 06:14:20 sully-VirtualBox bash[28112]: [sully-Vi:cardano.node.ErrorPolicy:Notice:94] [2021-09-13 10:14:20.91 UTC] IP 18.180.136.78:3001 ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s
Sep 13 06:14:20 sully-VirtualBox bash[28112]: [sully-Vi:cardano.node.ErrorPolicy:Notice:94] [2021-09-13 10:14:20.91 UTC] IP 204.236.161.163:3001 ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s

But let me put that into both topologies and restart.

The relays should start… Did you upgrade to 1.29.0? Do you have enough RAM?

Yes, upgraded to 1.29.0, and I have 16 GB of RAM.

What does "Parallel connection cancelled" mean? I keep seeing that…

Hmm. For your relays, try:
sudo systemctl status cardano-node
journalctl -e -f -u cardano-node

Do you see any errors?

Connection Attempt Exception, destination 23.227.207.90:6000 exception: Network.Socket.connect: <socket: 46>: does not exist (Connection refused)
ySuspendConsumer (Just (ConnectionExceptionTrace Network.Socket.connect: <socket: 46>: does not exist (Connection refused))) 20s
m" Failed to start all required subscriptions
spendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 27> closed when reading data, waiting on next header True"))) 20s 20s
spendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 27> closed when reading data, waiting on next header True"))) 20s 20s
spendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 35> closed when reading data, waiting on next header True"))) 20s 20s
com" Application Exception: 95.111.252.107:3001 ExceededTimeLimit (Handshake) (ServerAgency TokConfirm)
cySuspendConsumer (Just (ApplicationExceptionTrace ExceededTimeLimit (Handshake) (ServerAgency TokConfirm))) 20s
spendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed "<socket: 41> closed when reading data, waiting on next header True"))) 20s 20s
com" Failed to start all required subscriptions

I’m getting a lot of "waiting on next header" errors.
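
To see which peers and which error types dominate, rather than eyeballing the journal, something like the Python sketch below could help. The error substrings are copied from the journal lines quoted in this thread and are not an exhaustive list; `summarize` is a made-up name, and taking the last IP:port on a line as the peer is an assumption that happens to hold for the lines shown here.

```python
import re
from collections import Counter

# Substrings taken from the journal lines in this thread; extend as needed.
ERROR_KINDS = [
    "MuxBearerClosed",
    "Connection reset by peer",
    "Connection refused",
    "ExceededTimeLimit",
    "Parallel connection cancelled",
]

# Matches IPv4 address:port pairs such as 10.0.2.2:40027.
PEER_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\b")


def summarize(lines):
    """Count (error kind, peer IP) pairs across journal lines."""
    counts = Counter()
    for line in lines:
        kind = next((k for k in ERROR_KINDS if k in line), None)
        if kind is None:
            continue  # not one of the errors we are tracking
        peers = PEER_RE.findall(line)
        # Assumption: the last IP:port on the line is the affected peer.
        peer = peers[-1][0] if peers else "unknown"
        counts[(kind, peer)] += 1
    return counts
```

With the journal saved to a file (for example via `journalctl -u cardano-node --no-pager`), `summarize(open(path)).most_common(10)` lists the noisiest peer and error combinations.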

Can you show me the topology file for this node?

{
  "resultcode": "201",
  "networkMagic": "764824073",
  "ipType": 4,
  "requestedIpVersion": "4",
  "max": "15",
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    },
    {
      "addr": "node1.acmestaking.com",
      "port": 55444,
      "valency": 1,
      "distance": 251,
      "continent": "NA",
      "country": "US",
      "region": "SC"
    },
    {
      "addr": "relay2.adaocean.com",
      "port": 6000,
      "valency": 1,
      "distance": 651,
      "continent": "NA",
      "country": "US",
      "region": "DE"
    },
    {
      "addr": "209.151.152.96",
      "port": 50005,
      "valency": 1,
      "distance": 801,
      "continent": "NA",
      "country": "US",
      "region": "NJ"
    },
    {
      "addr": "45.46.196.228",
      "port": 6002,
      "valency": 1,
      "distance": 987,
      "continent": "NA",
      "country": "US",
      "region": "NY"
    },
    {
      "addr": "relay3.cardanosky.com",
      "port": 3001,
      "valency": 1,
      "distance": 1203,
      "continent": "NA",
      "country": "US",
      "region": "MO"
    },
    {
      "addr": "143.244.169.112",
      "port": 6000,
      "valency": 1,
      "distance": 1580,
      "continent": "NA",
      "country": "US",
      "region": "KS"
    },
    {
      "addr": "34.125.10.190",
      "port": 3001,
      "valency": 1,
      "distance": 3375,
      "continent": "NA",
      "country": "US",
      "region": "NV"
    },
    {
      "addr": "gys-relay1.growyourstake.com",
      "port": 6000,
      "valency": 1,
      "distance": 3946,
      "continent": "NA",
      "country": "US",
      "region": "WA"
    },
    {
      "addr": "193.193.115.187",
      "port": 6007,
      "valency": 1,
      "distance": 6143,
      "continent": "EU",
      "country": "GB",
      "region": "ENG"
    },
    {
      "addr": "188.34.184.168",
      "port": 6000,
      "valency": 1,
      "distance": 7105,
      "continent": "EU",
      "country": "DE",
      "region": "BY"
    },
    {
      "addr": "relay1.staking-ada.de",
      "port": 3001,
      "valency": 1,
      "distance": 7127,
      "continent": "EU",
      "country": "DE",
      "region": "BE"
    },
    {
      "addr": "192.168.1.27",
      "port": 6000,
      "valency": 1
    }
  ]
}

I might have left out some… I was trying to copy and paste from nano
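
Since copying out of nano can easily drop a brace or a comma, a quick sanity check on the file itself may be worthwhile. The Python sketch below assumes the layout shown above (a top-level "Producers" list whose entries have "addr", "port" and "valency"); `check_topology` is a hypothetical helper, not a cardano-node tool.

```python
import json
from collections import Counter


def check_topology(text):
    """Parse topology JSON text and return (producers, duplicates, total_valency).

    Raises json.JSONDecodeError if the paste is not valid JSON, and KeyError
    if an entry is missing "addr", "port" or "valency".
    """
    topology = json.loads(text)
    producers = topology.get("Producers", [])
    # Count how often each (addr, port) pair appears to spot duplicates.
    seen = Counter((p["addr"], p["port"]) for p in producers)
    duplicates = sorted(key for key, n in seen.items() if n > 1)
    total_valency = sum(p["valency"] for p in producers)
    return producers, duplicates, total_valency
```

Reading the real file and passing its contents in, for example `check_topology(open(path).read())`, either fails loudly on a broken paste or reports how many producers the node would actually see.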

Understood… so you have enough peers. How is the node: started, starting, or syncing?

Stuck in "starting" for 30 mins.

Sep 13 06:43:00 sully-OptiPlex-7050 bash[1304413]: [sully-Op:cardano.node.DnsSubscription:Error:244524] [2021-09-13 10:43:00.73 UTC] Domain: "relays-new.cardano-mainnet.iohk.io" Application Exception: 18.158.211.17:3001 MuxError MuxBearerClosed "<socket: 27> closed when reading data, waiting on next header True"

This is the error that keeps repeating on both nodes. What does it signify?

OK, remove the IOHK nodes, save the file and restart the node… then show me the gLiveView output.

On the blocknode?