1.35.3 stuck in epoch 364

Yes, cardano-node and cardano-cli are both version 1.35.3, as also verified with the --version argument. My systemd script uses the same (and only) cardano-node binary that’s installed on the machine.

Do you have this revision number?
950c4e222086fed5ca53564e642434ce9307b0b9

No, as mentioned above, I’m on ea6d78c7.

This shouldn’t matter, as is evident from the output of git diff 950c4e222086fed5ca53564e642434ce9307b0b9 ea6d78c7:

diff --git a/docker-compose.yml b/docker-compose.yml
index c85bc17c8..f6d6f9c70 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -2,7 +2,7 @@ version: "3.5"
 
 services:
   cardano-node:
-    image: inputoutput/cardano-node:${CARDANO_NODE_VERSION:-1.35.3}
+    image: inputoutput/cardano-node:${CARDANO_NODE_VERSION:-1.35.3-configs}
     environment:
       - NETWORK=${NETWORK:-mainnet}
     volumes:
@@ -15,7 +15,7 @@ services:
         max-file: "10"
 
   cardano-submit-api:
-    image: inputoutput/cardano-submit-api:${CARDANO_SUBMIT_API_VERSION:-1.35.3}
+    image: inputoutput/cardano-submit-api:${CARDANO_SUBMIT_API_VERSION:-1.35.3-configs}
     environment:
       - NETWORK=${NETWORK:-mainnet}
     depends_on:
diff --git a/flake.lock b/flake.lock
index 4855ece8f..751268fcc 100644
--- a/flake.lock
+++ b/flake.lock
@@ -4521,11 +4521,11 @@
         ]
       },
       "locked": {
-        "lastModified": 1653579289,
-        "narHash": "sha256-wveDdPsgB/3nAGAdFaxrcgLEpdi0aJ5kEVNtI+YqVfo=",
+        "lastModified": 1661431587,
+        "narHash": "sha256-KsWH17SMi+In5J9xkGfwxa8971GAsNi9Cp7zNpEtZdM=",
         "owner": "input-output-hk",
         "repo": "iohk-nix",
-        "rev": "edb2d2df2ebe42bbdf03a0711115cf6213c9d366",
+        "rev": "b8376719bf495694f538acf3e7eed5086cde303c",
         "type": "github"
       },
       "original": {

Since IO’s shipped binaries (80b5637a5520648d50b0763d7677ddbe374cd598) ceased to work, I’m compiling the binaries on my kvm host machine and pulling them via scp onto my kvm clients, so there’s no rust build machinery there and thus no .cargo directory. Of course, I still had to install some build tools on the kvm clients for the shared libraries (namely libsodium).

I downloaded the snapshot, but after running into the telescope length error, I’ve removed files that fall within the last few days (epoch 365). My nodes are still replaying the blocks. :sleeping:

Actually, I think you’re right and I messed up… I will stop my relay, pull down another snapshot and test my hypothesis… will report back shortly. :crossed_fingers:

Ok, I thought 1.35.3 config is not compatible with mainnet

Release 1.35.3-configs

This release provides configuration file updates for the new Cardano test environments (Preview, Pre-Production)

Fixed! Thanks all for the contributions.

@lauris helped me get on the right path.
Will write up what was going on shortly.

When gLiveView lies to you… Bringing LOXE stake pool back online.

In retrospect, the solution was simple.

As pointed out throughout this thread: Just use the correct version of cardano-node.

All of the revisions that I had pulled down and/or compiled should have worked.

There’s no difference between git commit 950c4e222086fed5ca53564e642434ce9307b0b9 and 80b5637a5520648d50b0763d7677ddbe374cd598. The latter was the version that IO shipped as binaries and that was installed on our block producer and relay nodes on August 20th. The difference between those commits and ea6d78c7 was already shown in this thread and is also not an issue. ea6d78c7 is the version that ended up running and producing our first block in the Babbage era as well. Pooltool shows that 9% of 1.35.3 nodes are running this version.

So all of those versions should work and gLiveView, which is one of the tools we use to monitor our nodes, reported that we have been running 1.35.3 (in various revisions since August 20th). Then what was the issue?

gLiveView (v1.27.2) reports the first cardano-node version it finds, which was not the same binary we use to actually run the node.

When I copied the new cardano-node binary to the relay node back in August, I did not replace the one in /usr/local/bin, which our systemd script uses. So gLiveView was reporting on the cardano-node that it found in the same directory - not the one that’s actually running.

Our stake pool has not been running on the (legacy) testnet since it became “catastrophically broken”, so we no longer run different cardano-node processes on the same machine. I have therefore replaced the cardano-node binary in the gLiveView directory with a symbolic link to the one cardano-node binary on the system, in /usr/local/bin.
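
For anyone wanting to do the same, here is a minimal sketch of that symlink replacement. The function name and the example gLiveView directory in the usage note are my own assumptions, not paths from this thread; adjust them to your setup.

```shell
#!/bin/sh
# Sketch: replace a stray copy of a binary with a symlink to the canonical one.
# link_to_canonical is a hypothetical helper name, not part of any Cardano tool.
link_to_canonical() {
  canonical=$1   # the binary your systemd unit actually starts
  stray=$2       # the extra copy that a monitoring tool may pick up

  # Refuse to link to something that is not an executable file.
  [ -x "$canonical" ] || { echo "not executable: $canonical" >&2; return 1; }

  # Keep a backup of the stray copy instead of deleting it outright.
  if [ -e "$stray" ] && [ ! -L "$stray" ]; then
    mv "$stray" "$stray.bak" || return 1
  fi

  # Create (or refresh) the symlink.
  ln -sf "$canonical" "$stray"
}
```

Usage would look something like `link_to_canonical /usr/local/bin/cardano-node /path/to/gliveview-dir/cardano-node`, where the second path is a placeholder for wherever the extra copy lives. The old copy is kept as cardano-node.bak so you can still check its version afterwards.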

Thanks everyone for your comments on the thread, they helped me find the issue.

Thank you for digging deeper.

Here is a comprehensive guide for those having issues updating the binary in /usr/local/bin.

Cardano Stackexchange - 1.35.3 not syncing since epoch 365

I think I have the exact same issue, as my relay bombed out at the end of 364… but since restarting it seems to have 0 outgoing connections and I’m not sure how to interpret the errors :expressionless:

Any help would be GREATLY appreciated…

Screenshots (not preserved): output from sudo systemctl status cardano-node, gLiveView, and versions.

To see if it is the same issue, you have to check which cardano-node binary your systemd unit starts. This does not have to be the same one your shell finds (the one reporting 1.35.3, which gLiveView apparently also relies on).

The most bare-bones way is to look in the /etc/systemd/system/cardano-node.service file and check whether ExecStart uses the full path to a different binary, whether it sets a $PATH in Environment, …
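
As a rough sketch of that check, the snippet below prints the ExecStart command line(s) from a unit file. The function name is made up for illustration; the unit path is the one mentioned above.

```shell
#!/bin/sh
# Sketch: print what a systemd unit file starts.
# unit_exec_start is a hypothetical helper name.
unit_exec_start() {
  # $1: path to the unit file.
  # Strip the "ExecStart=" key and print the remainder of each matching line.
  sed -n 's/^[[:space:]]*ExecStart=//p' "$1"
}
```

For example, `unit_exec_start /etc/systemd/system/cardano-node.service`. If the result is a wrapper script rather than cardano-node itself, open that script and look for the path it calls. `systemctl cat cardano-node` should show the same unit file along with any drop-in overrides, assuming your unit is named cardano-node.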

As usual, you can find out which binary your shell uses (the one that reports the seemingly correct version) with which cardano-node or command -v cardano-node.

I’m sorry I’m struggling with this, but here are the contents of the cardano-node.service file and the output of which cardano-node (screenshots not preserved).

What’s your output for grep 'cardano-node' /home/blink/cardano-my-node/startRelayNode1.sh?

This should be it:

#!/bin/bash
DIRECTORY=/home/blink/cardano-my-node
PORT=6000
HOSTADDR=0.0.0.0
TOPOLOGY=${DIRECTORY}/topology.json
DB_PATH=/media/blink/Cardano/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/mainnet-config.json
/usr/local/bin/cardano-node run --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG}

I assume /usr/local/bin/cardano-node --version will not report 1.35.3 as your version.

You can either copy your cardano-node binary from /home/blink/.local/bin/cardano-node to /usr/local/bin/cardano-node or change your /home/blink/cardano-my-node/startRelayNode1.sh to use the cardano-node binary in your /home/blink/.local/bin directory.
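
To confirm the mismatch before changing anything, you could compare the two binaries directly by absolute path. This is only a sketch; report_versions is a name I made up, and it prints just the first line of each --version output.

```shell
#!/bin/sh
# Sketch: print the first line of `--version` for each binary path given.
# report_versions is a hypothetical helper name.
report_versions() {
  for bin in "$@"; do
    if [ -x "$bin" ]; then
      # Only the first line of the version output is shown.
      printf '%s: %s\n' "$bin" "$("$bin" --version 2>/dev/null | head -n 1)"
    else
      printf '%s: not found or not executable\n' "$bin"
    fi
  done
}
```

With the paths from this thread that would be `report_versions /usr/local/bin/cardano-node /home/blink/.local/bin/cardano-node`. If the two lines differ, the service is starting the older binary.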

I created a tiny shell script to help you find out which cardano-node versions your system sees.
This is by no means failsafe: a script or unit file can still reference a cardano-node via an absolute path that isn’t in your $PATH. But if the stray binary is in your $PATH, which has been the case for many, it may help you find it.

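#!/bin/sh
# Print the version of every cardano-node found in $PATH.
# Reading line by line keeps paths with spaces intact.
which -a cardano-node | while IFS= read -r node; do
  echo "$node has version:"
  "$node" --version </dev/null
done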

https://gist.githubusercontent.com/manonthemat/e30c30ae8083daeef8b84431ae1d9e73/raw/452d871d13d7631a41c8bedb16e0c4253018e5e2/cnode-versions.sh

Looks like there’s a 1.34 in there somewhere?

I’ll try to follow your first step above and report back!

yeah you’re right o.0

As mentioned above then, you could change your /home/blink/cardano-my-node/startRelayNode1.sh to look like this:

#!/bin/bash
DIRECTORY=/home/blink/cardano-my-node
PORT=6000
HOSTADDR=0.0.0.0
TOPOLOGY=${DIRECTORY}/topology.json
DB_PATH=/media/blink/Cardano/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/mainnet-config.json
/home/blink/.local/bin/cardano-node run --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG}

Then restart the service.
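
After the restart, it may be worth checking which binary the running process is actually using, rather than trusting what a monitoring tool reports. A minimal sketch, assuming Linux with /proc mounted; running_binary is a made-up helper name.

```shell
#!/bin/sh
# Sketch: resolve the executable behind a running process ID (Linux only).
# running_binary is a hypothetical helper name.
running_binary() {
  # /proc/<pid>/exe is a symlink to the executable of that process.
  readlink "/proc/$1/exe"
}

# Possible usage once the service is back up (run as root or as the
# user that owns the process, otherwise the link cannot be read):
#   sudo systemctl restart cardano-node
#   for pid in $(pgrep -x cardano-node); do running_binary "$pid"; done
```

If the printed path is not the binary you expect, the service is still starting a different cardano-node than the one you updated.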

Apologies for the delay, I had to do several restarts and fix one small issue with my topology file.

I copied the binary over as you suggested (on both nodes) and it appears the BP and relay are working now!!!

Relay:

BP:

Thank you so much, sincerely.