Issues between relay and BP on 1.33.0

jeremyisme · 9 January 2022 09:09

Hi,

I updated my relay and BP to 1.33.0. My relay seems to be running fine, connecting to external peers, synced chain, etc. but my BP and relay won’t talk to each other after the update. I am getting a number of errors, e.g.

:cardano.node.IpSubscription:Info:138] [2022-01-09 09:00:58.05 UTC] IPs: 0.0.0.0:0 [X.X.X.X:XXXX] Restarting Subscription after 1.001100604s desired valency 1 current valency 0

cardano.node.Forge:Error:128] [2022-01-09 09:01:03.00 UTC] fromList [("val",Object (fromList [("kind",String "TraceNoLedgerView"),("slot",Number 5.0152572e7)])),("credentials",String "Cardano")

The BP shows as core, KES displays right, but is stuck on “starting” and won’t connect to the relay (it has been more than enough hours for the ledger rebuild). It was obviously working fine before the update. I can ping between the servers, and in the relay log, it says connection successful sometimes, but then it closes the connection. I have updated the config files as well, no difference.

The BP tip ref is right in gLiveView.

Has anyone had this issue with 1.33.0, or does anyone have any ideas…
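On the ping test mentioned above: a successful ping only shows that ICMP gets through, not that the node's TCP port is reachable. A minimal sketch for checking the port itself, assuming bash and coreutils `timeout` are available on both servers. `check_port` is a hypothetical helper name, and the addresses and the relay port in the usage comments are placeholders.

```shell
# Sketch: report whether a TCP port accepts connections.
# check_port is a hypothetical helper, not part of cardano-node or gLiveView.
check_port() {
  local host="$1" port="$2"
  # bash's /dev/tcp pseudo-device attempts a TCP connect; timeout caps the wait at 5s.
  if timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Usage (placeholders, substitute your own values):
#   check_port <BP_IP> 6000               # run on the relay
#   check_port <RELAY_IP> <RELAY_PORT>    # run on the BP
```

An "open" result only shows that the TCP connection is accepted. It says nothing about whether the nodes then keep the connection up.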

Alexd1985 · 9 January 2022 09:59

Share the gLiveView output for the BP and also the startup script for the BP.

jeremyisme · 9 January 2022 10:11

Standard coincashew startup for BP.

DIRECTORY=$NODE_HOME
PORT=6000
HOSTADDR=0.0.0.0
TOPOLOGY=\${DIRECTORY}/${NODE_CONFIG}-topology.json
DB_PATH=\${DIRECTORY}/db
SOCKET_PATH=\${DIRECTORY}/db/socket
CONFIG=\${DIRECTORY}/${NODE_CONFIG}-config.json
KES=\${DIRECTORY}/kes.skey
VRF=\${DIRECTORY}/vrf.skey
CERT=\${DIRECTORY}/node.cert
/usr/local/bin/cardano-node run +RTS -N -A16m -qg -qb -RTS --topology \${TOPOLOGY} --database-path \${DB_PATH} --socket-path \${SOCKET_PATH} --host-addr \${HOSTADDR} --port \${PORT} --config \${CONFIG} --shelley-kes-key \${KES} --shelley-vrf-key \${VRF} --shelley-operational-certificate \${CERT}

I also tried changing the HOSTADDR to the BP IP, but no difference. I’m just waiting for the BP to start again as I restarted it.

Alexd1985 · 9 January 2022 10:17

Ok, let’s change the topology file

Replace your current topology file with this one (IOHK nodes) and restart the BP


wget https://hydra.iohk.io/job/Cardano/cardano-node/cardano-deployment/latest-finished/download/1/mainnet-topology.json

For mainnet

jeremyisme · 9 January 2022 10:24

hang on, I am going to tackle this on a fresh brain, it’s 11.30pm here and I think I should sleep before I get frustrated…I’ll try with the new topology file in the morning…

jeremyisme · 9 January 2022 19:46

I’m getting a different error now, which doesn’t look good.

cardano.node.DnsSubscription:Error:7134] [2022-01-09 19:43:41.62 UTC] Domain: "relays-new.cardano-mainnet.iohk.io" Application Exception: 3.128.217.217:3001 InvalidBlock (At (Block {blockPointSlot = SlotNo 50004870, blockPointHash = e2b612ad73d579946f6af4c359326e5820a7cdfa2a2e139ca515ffc6afac6016})) (ValidationError (ExtValidationErrorHeader (HeaderProtocolError (HardForkValidationErrFromEra S (S (S (S (Z…(error continues with more)

Thoughts…?

Alexd1985 · 9 January 2022 19:50

It could be an error with the db… are you able to copy the db over from another node (the relay)?

jeremyisme · 9 January 2022 19:59

Yeah, I think I’ll have to do that.

jeremyisme · 9 January 2022 20:25

I checked the logs on my relay and it had some errors of " remoteAddress = 95.216.202.156:6000}) (Left FetchDeclineChainNotPlausible)"

I think both dbs are corrupted. I will let them resync from scratch. Now that they are both syncing, I can see connections between them.
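For anyone wanting to do the same, a minimal sketch of forcing a resync by moving the db directory aside rather than deleting it, assuming the layout from the startup script above (db under the node home directory). `set_aside_db` is a hypothetical helper name, and the node must be stopped before running it.

```shell
# Sketch: move the node's db directory aside so the next start resyncs from scratch.
# set_aside_db is a hypothetical helper; stop cardano-node before calling it.
set_aside_db() {
  node_home="$1"
  stamp="$(date +%Y%m%d-%H%M%S)"
  if [ ! -d "$node_home/db" ]; then
    echo "no db directory in $node_home" >&2
    return 1
  fi
  # Keep the old db under a timestamped name instead of deleting it outright.
  mv "$node_home/db" "$node_home/db.corrupt-$stamp"
  # Print the new location so it can be removed once the resync has finished.
  echo "$node_home/db.corrupt-$stamp"
}

# Usage (with the node stopped):
#   set_aside_db "$NODE_HOME"
```

Keeping the old copy needs enough free disk space for two dbs until it is removed; deleting it straight away is the alternative if space is tight.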

Looking back, when I upgraded to 1.33.0 I did both relay and BP at about the same time. I think I should have let the relay fully rebuild the ledger and then done the BP after that, instead of doing them concurrently. Do you reckon that could have been an issue?

I’ll let them sync and report back.

Thanks @Alexd1985 for your help so far, you’re always reliable

Alexd1985 · 9 January 2022 20:44

You are welcome

dotMaxi · 10 January 2022 02:53

Hello @jeremyisme

I’m worried. I also use the Coincashew guide. The guide is not up to date and I am afraid.

Is your problem solved?

Thanks.

jeremyisme · 10 January 2022 03:29

I’m still waiting for the sync to finish. I haven’t seen anyone else with a problem, so I think it might have been my fault for upgrading both at the same time, rather than waiting for one to finish and then starting on the next one.

I don’t think there is any urgency to upgrade to 1.33.0 just yet. It is just an improvement release, so you’re safe to stay on one of the other recent versions for now.

I should have an update in a few hours once I am resynced.

J_Sal · 10 January 2022 05:00

Hey Jeremy, Alex taught me this trick with my relay way back. Toss this in your topology:


{
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    }
  ]
}

Once it’s synced up you can change your relay’s topology back to normal and direct it to your block producer IPv4 again. At this point, the BP should begin to sync as well (if it hasn’t already). Good luck brother
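The swap and swap-back described above could be scripted roughly like this. It is only a sketch: `use_iohk_topology` and `restore_topology` are hypothetical helper names, the `.bak` suffix is an arbitrary choice, and the node still has to be restarted by hand after each change.

```shell
# Sketch: temporarily point a topology file at the IOHK relays, keeping a backup.
# Both function names are hypothetical helpers, not part of any Cardano tooling.
use_iohk_topology() {
  topology="$1"
  # Keep the current topology so it can be put back after the sync.
  cp "$topology" "$topology.bak" || return 1
  cat > "$topology" << 'EOF'
{
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    }
  ]
}
EOF
}

restore_topology() {
  topology="$1"
  # Put the original topology (the one pointing at the BP) back in place.
  mv "$topology.bak" "$topology"
}

# Usage (path is a placeholder for your relay's topology file):
#   use_iohk_topology "$NODE_HOME/mainnet-topology.json"   # then restart the relay
#   restore_topology "$NODE_HOME/mainnet-topology.json"    # once synced, then restart again
```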

jeremyisme · 10 January 2022 05:14

Thanks, I’ve got both BP and relay syncing now. But I think the IOHK servers don’t like me syncing both concurrently, so one is well behind the other (errors on the relay say the IOHK servers rejected it due to concurrent connections). I’m on bare metal, so it is all coming from one IP address as far as they are concerned. I am hoping it picks up speed once the BP is fully synced.

KES dates aren’t showing on the BP anymore but I will wait till it has synced before I try to fix that. It could just be a sync problem.

sigh only a few hours to go I think. There were no blocks scheduled for this epoch anyway, so I’m not missing anything.

Alexd1985 · 10 January 2022 08:09

you will not see the KES date while the node is syncing

jeremyisme · 14 January 2022 19:07

A bit of a late update, but after deleting and resyncing the db the relay and BP are talking to each other.

The lesson, I think, is to let each machine fully start (i.e. rebuild the ledger) before you upgrade the next one. That’s what might have caused my problem, but I’m not sure.
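One way to make "fully started" checkable before moving on to the next machine is to poll the node's tip. A minimal sketch, assuming `cardano-cli query tip --mainnet` prints JSON with a `syncProgress` field that reads "100.00" when caught up, and that `CARDANO_NODE_SOCKET_PATH` is set on the machine. `is_synced` is a hypothetical helper name.

```shell
# Sketch: succeed only when the tip JSON read from stdin reports full sync.
# is_synced is a hypothetical helper; the "100.00" format is an assumption
# about cardano-cli's output.
is_synced() {
  # Pull the value of "syncProgress" out of the JSON without needing jq.
  progress="$(sed -n 's/.*"syncProgress"[[:space:]]*:[[:space:]]*"\([0-9.]*\)".*/\1/p')"
  [ "$progress" = "100.00" ]
}

# Usage: block until the upgraded node has caught up, then upgrade the next one.
# While the ledger is still rebuilding the query fails, so the loop keeps waiting.
#   until cardano-cli query tip --mainnet 2>/dev/null | is_synced; do sleep 60; done
```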
