1.35.3 stuck in epoch 364

Fixed! Thanks all for the contributions.

@lauris helped me get on the right path.
Will write up what was going on shortly.

When gLiveView lies to you… Bringing LOXE stake pool back online.

In retrospect, the solution was simple.

As pointed out throughout this thread: Just use the correct version of cardano-node.

All of the revisions that I had pulled down and/or compiled should have worked.

There’s no difference between git commit 950c4e222086fed5ca53564e642434ce9307b0b9 and 80b5637a5520648d50b0763d7677ddbe374cd598. The latter was the version that IO shipped as binaries and that was installed on our block producer and relay nodes on August 20th. The difference between those commits and ea6d78c7 was already shown in this thread and is also not an issue. ea6d78c7 is the version that ended up running and producing our first block in the Babbage era as well. Pooltool shows that 9% of 1.35.3 nodes are running this version.

So all of those versions should work, and gLiveView, one of the tools we use to monitor our nodes, reported that we had been running 1.35.3 (in various revisions) since August 20th. Then what was the issue?

gLiveView (v1.27.2) reports the first cardano-node version it finds, which was not the same binary we use to actually run the node.

When I copied the new cardano-node binary to the relay node back in August, I did not replace the one in /usr/local/bin, which our systemd script uses. So gLiveView was reporting on the cardano-node binary that it found in its own directory, not the one that’s actually running.
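
For anyone who wants to rule this out on their own machine, here is a minimal sketch (Linux only, since it relies on /proc) that prints the on-disk path of the executable behind each running process with a given name. The function name running_binary is made up for this example, and it assumes pgrep is installed. If the node runs as a different user, you may need to run it with sudo.

```shell
#!/bin/sh
# Sketch: show which executable file each running process of a given name was started from.
# running_binary is a hypothetical helper name, not part of cardano-node or gLiveView.
running_binary() {
  name="$1"
  # pgrep -x matches the process name exactly and prints one PID per line.
  for pid in $(pgrep -x "$name"); do
    # /proc/<pid>/exe is a symlink to the executable the process was started from.
    printf '%s %s\n' "$pid" "$(readlink "/proc/$pid/exe")"
  done
}

running_binary cardano-node
```

Running --version on the path this prints tells you what the live process really is, independent of what any monitoring tool finds first.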

We no longer run our stake pool on the (legacy) testnet, which has been “catastrophically broken”, so we no longer run different cardano-node processes on the same machine. I have therefore replaced the cardano-node binary in the directory next to gLiveView with a symbolic link to the one cardano-node binary on the system, in /usr/local/bin.
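
The symlink step could look roughly like the sketch below. The helper name link_to_system_binary is invented for illustration, and the location of the stale copy depends on where gLiveView lives on your machine, so both paths are passed in as arguments rather than hard-coded. The old copy is kept as a .bak file in case it is needed again.

```shell
#!/bin/sh
# Sketch: replace a stray copy of a binary with a symlink to the one the service runs.
# link_to_system_binary is a hypothetical helper name used only in this example.
link_to_system_binary() {
  target="$1"   # the binary the service actually starts, e.g. /usr/local/bin/cardano-node
  link="$2"     # the stale copy to replace (path depends on your setup)
  [ -x "$target" ] || { echo "not executable: $target" >&2; return 1; }
  # Keep the old regular file around instead of deleting it.
  if [ -e "$link" ] && [ ! -L "$link" ]; then
    mv "$link" "$link.bak"
  fi
  # -s: symbolic link, -f: replace an existing link, -n: do not follow an existing link.
  ln -sfn "$target" "$link"
}
```

After this, any tool that picks up the binary in that directory ends up at the same file systemd starts, so the reported version and the running version can no longer drift apart.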

Thanks everyone for your comments on the thread; they helped me find the issue.

Thank you for digging deeper.

Here is a comprehensive guide for those having issues updating the binary in /usr/local/bin.

Cardano Stackexchange - 1.35.3 not syncing since epoch 365

I think I have the exact same issue, as my relay bombed out at the end of 364… but since restarting it seems to have 0 outgoing connections and I’m not sure how to interpret the errors :expressionless:

Any help would be GREATLY appreciated…

output from: sudo systemctl status cardano-node

gLiveView

versions:
image

To see whether it is the same issue, check which cardano-node binary your systemd unit starts. It does not have to be the same one your shell finds (the one reporting 1.35.3, which gLiveView apparently also relies on).

The most bare-bones way is to look in the /etc/systemd/system/cardano-node.service file and check whether ExecStart uses the full path to a different binary, whether Environment sets a $PATH, …
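
As a rough sketch of that check, the snippet below pulls out just the lines of a unit file that decide which binary gets started. The function name unit_start_lines is made up for this example; the unit path is the one mentioned in this thread and may differ on your system.

```shell
#!/bin/sh
# Sketch: print the lines of a systemd unit file that determine what is executed.
# unit_start_lines is a hypothetical helper name used only in this example.
unit_start_lines() {
  unit_file="$1"
  # ExecStart= holds the command; Environment= and EnvironmentFile= may override PATH.
  grep -E '^[[:space:]]*(ExecStart|Environment|EnvironmentFile)=' "$unit_file"
}

# Unit path as mentioned in this thread; adjust if yours lives elsewhere.
unit=/etc/systemd/system/cardano-node.service
if [ -r "$unit" ]; then
  unit_start_lines "$unit"
fi
```

If ExecStart points at a wrapper script rather than at cardano-node itself, the same question simply moves one level down: open that script and look at the path it calls.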

As usual, you can find out which binary your shell uses (the one that reports the seemingly correct version) with which cardano-node or command -v cardano-node.

I’m sorry I’m struggling with this, but here are the contents of the cardano-node.service file:

and the output of which cardano-node
image

What’s your output for grep 'cardano-node' /home/blink/cardano-my-node/startRelayNode1.sh?

This should be it:

#!/bin/bash
DIRECTORY=/home/blink/cardano-my-node
PORT=6000
HOSTADDR=0.0.0.0
TOPOLOGY=${DIRECTORY}/topology.json
DB_PATH=/media/blink/Cardano/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/mainnet-config.json
/usr/local/bin/cardano-node run --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG}

I assume /usr/local/bin/cardano-node --version will not report 1.35.3 as your version.

You can either copy your cardano-node binary from /home/blink/.local/bin/cardano-node to /usr/local/bin/cardano-node or change your /home/blink/cardano-my-node/startRelayNode1.sh to use the cardano-node binary in your /home/blink/.local/bin directory.
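
For the first option, a cautious way to do the copy might look like the sketch below. The helper name replace_binary is invented for this example. It keeps a backup of the old binary and moves the new one into place by rename, because copying directly over an executable that is currently running can fail with "Text file busy".

```shell
#!/bin/sh
# Sketch: install a newer binary over the one a service starts, keeping a backup.
# replace_binary is a hypothetical helper name used only in this example.
replace_binary() {
  src="$1"   # the new binary, e.g. /home/blink/.local/bin/cardano-node
  dst="$2"   # the binary the service starts, e.g. /usr/local/bin/cardano-node
  [ -x "$src" ] || { echo "not executable: $src" >&2; return 1; }
  # Back up the current binary so the change can be rolled back.
  if [ -e "$dst" ]; then
    cp -p "$dst" "$dst.bak"
  fi
  # Copy to a temporary name first, then rename over the old file.
  cp "$src" "$dst.new" && chmod 755 "$dst.new" && mv "$dst.new" "$dst"
}
```

With the paths from this thread, and run with sufficient privileges, the call would be replace_binary /home/blink/.local/bin/cardano-node /usr/local/bin/cardano-node, followed by a restart of the service so that the new binary is actually the one running.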

I created a tiny shell script to help you find out which cardano-node versions your systems see.
This is by no means failsafe, as you can still have a direct reference to a cardano-node via an absolute path that isn’t in your $PATH, but it may help if you have more than one cardano-node in your $PATH, which has been the case for many.

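#!/bin/sh
# Print the version of every cardano-node binary found on $PATH.
for node in $(which -a cardano-node); do
  echo "$node has version:"
  # Quote the path so it is passed to the shell as a single word.
  "$node" --version
done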

https://gist.githubusercontent.com/manonthemat/e30c30ae8083daeef8b84431ae1d9e73/raw/452d871d13d7631a41c8bedb16e0c4253018e5e2/cnode-versions.sh
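
To cover the absolute-path case that the script above misses, a complementary sketch could scan a start script for absolute paths ending in /cardano-node and report the version of each. The function names binaries_in_script and report_versions are made up for this example, and it only catches paths that appear as plain, unquoted words.

```shell
#!/bin/sh
# Sketch: find absolute cardano-node paths referenced in a start script and report their versions.
# binaries_in_script and report_versions are hypothetical helper names.
binaries_in_script() {
  # Print every whitespace-separated word that is an absolute path ending in cardano-node.
  awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\/(.*\/)?cardano-node$/) print $i }' "$1" | sort -u
}

report_versions() {
  for bin in $(binaries_in_script "$1"); do
    echo "$bin has version:"
    if [ -x "$bin" ]; then
      "$bin" --version
    else
      echo "  (not found or not executable)"
    fi
  done
}
```

For the relay in this thread, the call would be report_versions /home/blink/cardano-my-node/startRelayNode1.sh.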

Looks like there’s a 1.34 in there somewhere?

I’ll try to follow your first step above and report back!

yeah you’re right o.0

As mentioned above then, you could change your /home/blink/cardano-my-node/startRelayNode1.sh to look like this:

#!/bin/bash
DIRECTORY=/home/blink/cardano-my-node
PORT=6000
HOSTADDR=0.0.0.0
TOPOLOGY=${DIRECTORY}/topology.json
DB_PATH=/media/blink/Cardano/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/mainnet-config.json
/home/blink/.local/bin/cardano-node run --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG}

Then restart the service.

Apologies for the delay, I had to do several restarts and fix one small issue with my topology file.

I copied the binary over as you suggested (on both nodes) and it appears the BP and relay are working now!!!

Relay:

BP:

Thank you so much, sincerely.
