Cardano node gets stuck while syncing

I recently set up my own Cardano node on a rented server. Everything went smoothly until I actually launched the node. For background, I followed the official Cardano documentation for installing the node and then running it on a testnet.

I ran it as a systemd service and used gLiveView.sh to monitor the node as it began syncing. For some reason, the ‘Syncing’ field gets stuck at exactly 83.3% no matter what I do. The first time this happened, I cleared the db directory and made sure I was using the correct pre-production configuration files, but it still froze at 83.3% after a restart and about an hour of syncing.

I should note that the field Tip (ref) on gLiveView does in fact change, increasing by 2-3 every second or two. I’m not sure if this means that the node is indeed syncing, albeit slowly, because I don’t actually understand what that field means.

When I use cardano-cli query tip --testnet-magic 1 I get the following:

{
    "block": 883584,
    "epoch": 66,
    "era": "Babbage",
    "hash": "7a049cd5d67887518e284990c2d681d222b6145e3f664dd5c88ebeced3ded96f",
    "slot": 27059485,
    "slotInEpoch": 189085,
    "slotsToEpochEnd": 242915,
    "syncProgress": "84.09"
}

Interestingly, the syncProgress value here differs from the one shown in gLiveView.

Is my node just syncing incredibly slowly, or is it frozen completely?
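
One way to tell a slow sync from a frozen one is to run the same query twice, some time apart, and compare the slot values. The following is a minimal sketch, not a definitive tool. It assumes cardano-cli is on the PATH and can reach the node socket the same way the command above does; the function names and the suggested 60-second wait are illustrative choices.

```python
import json
import subprocess


def query_tip(testnet_magic: int = 1) -> dict:
    """Run `cardano-cli query tip` and return its JSON output as a dict.

    Assumes the CLI can already find the node socket, as in the
    command shown above.
    """
    result = subprocess.run(
        ["cardano-cli", "query", "tip", "--testnet-magic", str(testnet_magic)],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def slots_gained(before: dict, after: dict) -> int:
    """Number of slots the local tip advanced between two snapshots."""
    return after["slot"] - before["slot"]


def is_frozen(before: dict, after: dict) -> bool:
    """True if neither the slot nor the block number moved."""
    return slots_gained(before, after) == 0 and after["block"] == before["block"]
```

Calling query_tip(), waiting a minute with time.sleep(60), calling it again and passing both results to is_frozen() gives a yes/no answer. A node that is still catching up should gain many slots per minute, while a stuck one reports the same slot and block each time.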

the field Tip (ref) on gLiveView does in fact change, increasing by 2-3 every second or two. I’m not sure if this means that the node is indeed syncing, albeit slowly, because I don’t actually understand what that field means.

That’s simply the current time expressed as a slot number (hence ‘reference’ tip). You can read more about the gLiveView fields in its documentation.
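
To make the idea concrete, here is a rough sketch of how a wall-clock time maps to a slot number on the pre-production network. The genesis parameters are assumptions, not something stated in this thread: a system start of 2022-06-01 00:00:00 UTC, 4 Byron epochs of 21,600 slots at 20 seconds each, then 1-second slots. The 86,400 Byron slots are at least consistent with the query output above, where slot minus slotInEpoch is 26870400, which equals 86400 + 62 × 432000 for epoch 66. This is an illustration only, not gLiveView’s actual code.

```python
from datetime import datetime, timezone

# ASSUMED pre-production genesis parameters (not taken from this thread).
SYSTEM_START = datetime(2022, 6, 1, tzinfo=timezone.utc)
BYRON_SLOTS = 86_400          # 4 Byron epochs x 21,600 slots
BYRON_SLOT_SECONDS = 20       # Byron-era slot length
SHELLEY_SLOT_SECONDS = 1      # slot length from the Shelley era onward


def reference_slot(now: datetime) -> int:
    """Return the slot number that corresponds to the wall-clock time `now`."""
    elapsed = int((now - SYSTEM_START).total_seconds())
    if elapsed < 0:
        raise ValueError("time is before the assumed system start")
    byron_seconds = BYRON_SLOTS * BYRON_SLOT_SECONDS
    if elapsed < byron_seconds:
        # Still inside the Byron era: one slot every 20 seconds.
        return elapsed // BYRON_SLOT_SECONDS
    # After the Byron era: one slot per second on top of the Byron slots.
    return BYRON_SLOTS + (elapsed - byron_seconds) // SHELLEY_SLOT_SECONDS
```

Calling reference_slot(datetime.now(timezone.utc)) gives a number that grows by about one per second, which matches the steady increase seen in Tip (ref) regardless of whether the node itself is making progress.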

If your node is stuck, it could be because you’re using an older version of the node (you should use either 1.35.[5-7] or node 8.x.x). You can investigate further by examining the node logs, whose location depends on your config.json. If you followed the developers.cardano.org or IO (docs.cardano.org) guides, the logs should be viewable via journalctl.

Regarding versions, both my node and the CLI are on 8.1.1. As for the logs, journalctl shows the expected ‘Chain extended, new tip’ messages for a while as the node syncs. Once it reaches slot 27059485, however (and it’s always this particular slot, at block 883583 in epoch 66), the ‘Chain extended’ messages stop, and journalctl instead shows a set of Info messages about peers and connections that repeat continually, as in this log snippet:

[2023-07-02 22:17:16.92 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 26, unidirectionalConns = 4, inboundConns = 1, outboundConns = 32})
[2023-07-02 22:17:16.92 UTC] PeerStatusChanged (ColdToWarm (Just 185.15.244.215:3001) 144.24.168.10:3003)
[2023-07-02 22:17:16.92 UTC] TracePromoteColdDone 40 30 144.24.168.10:3003
[2023-07-02 22:17:16.92 UTC] PeerSelectionCounters {coldPeers = 50, warmPeers = 28, hotPeers = 2, localRoots = []}
[2023-07-02 22:17:16.92 UTC] TrInboundGovernorCounters (InboundGovernorCounters {coldPeersRemote = 12,
[2023-07-02 22:17:16.99 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 26, unidirectionalConns = 4, inboundConns = 1, outboundConns = 33})
[2023-07-02 22:17:17.10 UTC] TrConnectionHandler (ConnectionId {localAddress = 185.15.244.215:3001, remoteAddress = 35.185.48.55:3002}) (TrHandshakeSuccess NodeToNodeV_10 (NodeToNodeVersionData {networkMagic = NetworkMagic {unNetworkMagic = 1}, diffusionMode = InitiatorAndResponderDiffusionMode, peerSharing = NoPeerSharing,
[2023-07-02 22:17:17.10 UTC] TrConnectionManagerCounters (ConnectionManagerCounters {fullDuplexConns = 1, duplexConns = 27, unidirectionalConns = 4, inboundConns = 1, outboundConns = 33})
[2023-07-02 22:17:17.10 UTC] PeerStatusChanged (ColdToWarm (Just 185.15.244.215:3001) 35.185.48.55:3002)
[2023-07-02 22:17:17.10 UTC] TracePromoteColdDone 40 31 35.185.48.55:3002
[2023-07-02 22:17:17.10 UTC] TrInboundGovernorCounters (InboundGovernorCounters {coldPeersRemote = 12,
[2023-07-02 22:17:17.10 UTC] PeerSelectionCounters {coldPeers = 49, warmPeers = 29, hotPeers = 2, localRoots = []}
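
To pin down when the chain stopped extending, the log can be scanned for the last line that mentions ‘Chain extended’. This is a small sketch under an assumption: that those lines carry the same bracketed timestamp prefix as the entries above. Their exact format is not shown in this thread, so the function falls back to returning the whole line when no timestamp is found.

```python
import re
from typing import Iterable, Optional

# Matches the bracketed timestamp prefix seen in the snippet above,
# e.g. "[2023-07-02 22:17:16.92 UTC]".
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ UTC)\]")


def last_chain_extension(
    lines: Iterable[str], marker: str = "Chain extended"
) -> Optional[str]:
    """Return the timestamp of the last line containing `marker`.

    Returns the stripped line itself if it has no recognisable timestamp,
    or None if no line contains the marker at all.
    """
    last = None
    for line in lines:
        if marker in line:
            match = TIMESTAMP.search(line)
            last = match.group(1) if match else line.strip()
    return last
```

With the journalctl output piped into a script, last_chain_extension(sys.stdin) reports the last moment the tip moved. Everything logged after that point is peer and connection churn like the lines above.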

My configuration files come straight from the official Cardano docs, and I have not altered them in any way because I’m not familiar enough with them to do so. Still, I suspect the problem lies somewhere in them: I created a second node configuration using the preview testnet configuration files instead of the pre-production ones, and that node synced without problems in just 2 hours.

Fix:

1. Downgrade to 1.35.6 and delete the db directory.
2. Let the node sync.
3. Upgrade to 8.1.1.

It should work after this.

I found this in GitHub issue #5379 on input-output-hk/cardano-node, ‘Node on preproduction network gets stuck at slot 27059485 while syncing’, and have just started trying it.