Issues between relay and BP on 1.33.0

Hi,

I updated my relay and BP to 1.33.0. My relay seems to be running fine, connecting to external peers, synced chain, etc. but my BP and relay won’t talk to each other after the update. I am getting a number of errors, e.g.

:cardano.node.IpSubscription:Info:138] [2022-01-09 09:00:58.05 UTC] IPs: 0.0.0.0:0 [X.X.X.X:XXXX] Restarting Subscription after 1.001100604s desired valency 1 current valency 0

cardano.node.Forge:Error:128] [2022-01-09 09:01:03.00 UTC] fromList [(“val”,Object (fromList [(“kind”,String “TraceNoLedgerView”),(“slot”,Number 5.0152572e7)])),(“credentials”,String “Cardano”)

The BP shows as core, KES displays right, but is stuck on “starting” and won’t connect to the relay (it has been more than enough hours for the ledger rebuild). It was obviously working fine before the update. I can ping between the servers, and in the relay log, it says connection successful sometimes, but then it closes the connection. I have updated the config files as well, no difference.

The BP tip ref is right in gLiveView.

Has anyone had this issue with 1.33.0, or does anyone have any ideas…

Share the gLiveView output for the BP and also the startup script for the BP.

Standard coincashew startup for BP.

DIRECTORY=$NODE_HOME
PORT=6000
HOSTADDR=0.0.0.0
TOPOLOGY=${DIRECTORY}/${NODE_CONFIG}-topology.json
DB_PATH=${DIRECTORY}/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/${NODE_CONFIG}-config.json
KES=${DIRECTORY}/kes.skey
VRF=${DIRECTORY}/vrf.skey
CERT=${DIRECTORY}/node.cert
/usr/local/bin/cardano-node run +RTS -N -A16m -qg -qb -RTS --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG} --shelley-kes-key ${KES} --shelley-vrf-key ${VRF} --shelley-operational-certificate ${CERT}

I also tried changing the HOSTADDR to the BP IP, but no difference. I’m just waiting for the BP to start again as I restarted it.
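For anyone following along: a successful ping only shows that the two servers can reach each other over ICMP, not that the node port is open. Below is a minimal sketch of a TCP check that could be run from the relay, assuming bash (for its /dev/tcp redirection) and the coreutils `timeout` command are available. The function name `port_open` and the IP address in the example are placeholders, not something from the guide.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port can be opened within 3 seconds.
# Relies on bash's /dev/tcp redirection and the coreutils `timeout` command.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example, run from the relay (10.0.0.2 is a placeholder for the BP address,
# 6000 is the PORT value from the startup script):
# port_open 10.0.0.2 6000 && echo "BP port reachable" || echo "BP port NOT reachable"
```

If the port shows as unreachable while ping works, a firewall rule or the listening address is a more likely suspect than the node itself.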

Ok, let’s change the topology file.

Replace your current topology file with this one (IOHK nodes, for mainnet) and restart the BP:

wget https://hydra.iohk.io/job/Cardano/cardano-node/cardano-deployment/latest-finished/download/1/mainnet-topology.json

hang on, I am going to tackle this on a fresh brain, it’s 11.30pm here and I think I should sleep before I get frustrated…I’ll try with the new topology file in the morning…

I’m getting a different error now, which doesn’t look good.

cardano.node.DnsSubscription:Error:7134] [2022-01-09 19:43:41.62 UTC] Domain: “relays-new.cardano-mainnet.iohk.io” Application Exception: 3.128.217.217:3001 InvalidBlock (At (Block {blockPointSlot = SlotNo 50004870, blockPointHash = e2b612ad73d579946f6af4c359326e5820a7cdfa2a2e139ca515ffc6afac6016})) (ValidationError (ExtValidationErrorHeader (HeaderProtocolError (HardForkValidationErrFromEra S (S (S (S (Z…(error continues with more)

Thoughts…?

It could be an error with the db… do you have the possibility to download it from another node (relay)?

Yeah, I think I’ll have to do that.

I checked the logs on my relay and it had some errors like "remoteAddress = 95.216.202.156:6000}) (Left FetchDeclineChainNotPlausible)".

I think both dbs are corrupted. I will let them resync from scratch. Now that they are both syncing, I can see connections between them.
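In case it helps someone else, here is a rough sketch of how a db can be set aside so the node resyncs from scratch on its next start. It assumes the node is already stopped and that the database lives in a `db` directory under the node home, as in the startup script. The function name `set_aside_db` is made up for illustration, and moving rather than deleting means the old copy still uses disk space until it is removed.

```shell
#!/usr/bin/env bash
# Move a node's db directory to a timestamped backup so the node rebuilds it.
# Assumption: the node process has been stopped before calling this.
set_aside_db() {
  local node_home="$1"
  local backup="${node_home}/db.bak.$(date +%Y%m%d%H%M%S)"
  if [ -d "${node_home}/db" ]; then
    mv "${node_home}/db" "$backup"
    echo "$backup"   # print where the old db went
  else
    echo "no db directory under ${node_home}" >&2
    return 1
  fi
}

# Example:
# set_aside_db "$NODE_HOME"
```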

Looking back, when I upgraded to 1.33.0 I did both relay and BP at about the same time. I think I should have let the relay fully rebuild the ledger and then done the BP after that, instead of doing them concurrently. Do you reckon that could have been the issue?

I’ll let them sync and report back.

Thanks @Alexd1985 for your help so far, you’re always reliable

You are welcome :beers:

Hello @jeremyisme

I’m worried. I also use the Coincashew guide. The guide is not up to date and I am afraid.

Is your problem solved?

Thanks.

I’m still waiting for the sync to finish. I haven’t seen anyone else with a problem, so I think it might have been my fault for upgrading both at the same time, rather than waiting for one to finish and then starting on the next one.

I don’t think there is any urgency to upgrade to 1.33.0 just yet. It is just an improvement release, so you’re safe to stay on one of the other recent versions for now.

I should have an update in a few hours once I am resynced.

Hey Jeremy, Alex taught me this trick with my relay way back. Toss this in your topology:


{
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    }
  ]
}

Once it’s synced up, you can change your relay’s topology back to normal and direct it to your block producer IPv4 again. At this point, the BP should begin to sync as well (if it hasn’t already). Good luck brother :+1:
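If you want to script the swap, something like the sketch below would work: it backs up the existing topology file, writes the IOHK-only one shown above, and restores the original afterwards. The function names and the path in the example are hypothetical, so adjust them to your own setup, and remember to restart the node after each change.

```shell
#!/usr/bin/env bash
# Back up the current topology file and replace it with one that points
# only at the IOHK relays (same content as the JSON posted above).
use_iohk_topology() {
  local topology="$1"
  if [ -f "$topology" ]; then
    cp "$topology" "${topology}.bak"   # keep the original for later
  fi
  cat > "$topology" <<'EOF'
{
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    }
  ]
}
EOF
}

# Put the original topology back once the node has synced.
restore_topology() {
  local topology="$1"
  mv "${topology}.bak" "$topology"
}

# Example (path is an assumption based on the startup script variables):
# use_iohk_topology "$NODE_HOME/mainnet-topology.json"
# ...restart the node, wait for the sync, then:
# restore_topology "$NODE_HOME/mainnet-topology.json"
```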

Thanks, I’ve got both BP and relay syncing now. But I think the IOHK servers don’t like me syncing both concurrently, so one is well behind the other (errors in the relay log saying the IOHK servers rejected the connection due to concurrent connections). I’m on bare metal, so it is all coming from one IP address as far as they are concerned. I am hoping it picks up speed once the BP is fully synced.

KES dates aren’t showing on the BP anymore, but I will wait until it has synced before I try to fix that. It could just be a sync problem.

sigh only a few hours to go I think. There were no blocks scheduled for this epoch anyway, so I’m not missing anything.

you will not see the KES date while the node is syncing

A bit of a late update, but after deleting and resyncing the db the relay and BP are talking to each other.

The lesson, I think, is to let each machine fully start (i.e. rebuild the ledger) before you upgrade the next one. That’s what might have caused my problem, but I’m not sure.
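For what it’s worth, here is a small sketch of how that waiting step could be automated before touching the next machine. It assumes `cardano-cli query tip --mainnet` prints JSON containing a field like "syncProgress": "100.00" (that is how I understand recent versions to report it, so check your own output first) and that CARDANO_NODE_SOCKET_PATH is set. The helper names `sync_progress` and `is_synced` are made up for this example.

```shell
#!/usr/bin/env bash
# Extract the syncProgress value from `cardano-cli query tip` JSON on stdin.
# Assumption: the JSON contains a field such as "syncProgress": "100.00".
sync_progress() {
  sed -n 's/.*"syncProgress"[[:space:]]*:[[:space:]]*"\([0-9.]*\)".*/\1/p'
}

# Return 0 only when the reported progress is exactly 100.00.
is_synced() {
  [ "$(sync_progress)" = "100.00" ]
}

# Example loop, run on the node that was just upgraded. While the ledger is
# still rebuilding the query fails, so the loop simply keeps waiting.
# until cardano-cli query tip --mainnet 2>/dev/null | is_synced; do sleep 60; done
# echo "Node is synced; OK to upgrade the next machine."
```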