TESTNET - Sluggish Relay node syncing and no Total Tx

Hi All,

I have set up a relay node on the testnet and a block producer (currently running in relay mode). I have been having issues getting the relay to sync and stay in sync: it syncs up, then falls behind. Also, even though the topology updater returns "glad you're staying with us", I'm not showing up on https://explorer.cardano-testnet.iohkdev.io/relays/topology.json, and there is no Total Tx count.
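
A quick way to check whether a relay made that list is to grep the explorer topology for its public IP (a sketch; the ifconfig.me lookup is just one way to find your IP):

curl -s https://explorer.cardano-testnet.iohkdev.io/relays/topology.json | grep "$(curl -s ifconfig.me)"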

I have checked that I am in fact on the testnet, that all the .json files are the testnet ones, and that the topology updater is pulling the testnet list. The hardware on both machines is the same: 12 GB memory, 4 cores.
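
For reference, one way to confirm the network magic in the genesis file (assuming the ~/cardano-my-node layout used in the startup script later in this thread):

jq .networkMagic ~/cardano-my-node/testnet-shelley-genesis.json
# the legacy public testnet should print: 1097911063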

Any ideas?

Still showing 100% synced.

Also, check inside the configuration file whether you have TraceMempool set to true.
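
For example (path assumed):

grep '"TraceMempool"' ~/cardano-my-node/testnet-config.json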

Cheers,

You’ve got some incoming connections, which is good. But check your config. You have pending transactions, so something isn’t processing correctly.

TraceMempool is set to true in testnet-config.json
What's strange is that it's sitting at 100%, but I don't have the usual green :) , which would be expected.

Total Tx not showing up seems like it's not talking to something correctly, right?

Yeah, I agree, but I can't seem to find anything. I'll look a bit harder.

What does journalctl --unit=cardano-node --follow give you? Any unusual errors?
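
To cut down the noise, something like this can help (a sketch):

journalctl --unit=cardano-node --since "1 hour ago" --no-pager | grep -iE 'error|warn'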

Just watching it now; it doesn't seem to be doing anything unfamiliar. I'm going to restart it and watch.

I have double-checked the .json files… all looks OK to me.
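
One quick syntax check, assuming jq is installed and the files sit in the node directory:

cd ~/cardano-my-node
for f in testnet-config.json testnet-topology.json testnet-shelley-genesis.json testnet-byron-genesis.json; do
  jq empty "$f" && echo "$f: valid JSON"
done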

Still battling this issue. I moved the whole node to a different hard drive, as I thought that might be the problem. Nope, same issue: no "Total Tx", and the node syncs up, goes green :-), then falls behind.

There has to be something wrong with this config; I just can't figure out what.

Can you share the gLive output for all nodes?

Hi @Alexd1985, do you want a screenshot or the gLive.sh output?

Yes, for BP and relays

Also try setting both of these lines to true, then save the file and restart the node:


 "TraceMempool": true,
  "TraceMux": true,

Here is a screenshot of the relay; I just restarted it with this script:

#!/bin/bash
# Relay node startup script
DIRECTORY=/home/user/cardano-my-node
PORT=3000
HOSTADDR=0.0.0.0   # listen on all interfaces
TOPOLOGY=${DIRECTORY}/testnet-topology.json
DB_PATH=${DIRECTORY}/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/testnet-config.json
# Earlier invocation with tuned GHC RTS options (16 MB allocation area, parallel GC tweaks), kept for reference:
#/home/user/.local/bin/cardano-node run +RTS -N -A16m -qg -qb -RTS --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG}
# Current invocation with default RTS settings (-N: use all available cores):
/home/user/.local/bin/cardano-node +RTS -N -RTS run --topology ${TOPOLOGY} --database-path ${DB_PATH} --socket-path ${SOCKET_PATH} --host-addr ${HOSTADDR} --port ${PORT} --config ${CONFIG}

[screenshot: relay]

The relay starts, immediately syncs, then falls behind. It was running with:

  "TraceMempool": true,
  "TraceMux": false,

I have now set:

  "TraceMempool": true,
  "TraceMux": true,

and restarted.

Just moving the block producer to another location; will get the info for you in a few minutes :)

@Alexd1985
OK, here's the BP and relay gLive.
The relay has just sat there for over 45 minutes and the block count has not changed; there is definitely something strange going on. It will sync when started up, but then won't take any more blocks.

The block producer just misses slots and only gets as far as the relay in block count, which I guess is expected if the relay is not growing its copy of the chain.

Can you add only the IOHK testnet servers inside the BP topology file and restart the node? Let's see if there are any differences.
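
Something like this, as a sketch of a minimal testnet-topology.json pointing only at the IOHK testnet relays (address as published in the testnet docs; valency is a guess):

{
  "Producers": [
    {
      "addr": "relays-new.cardano-testnet.iohkdev.io",
      "port": 3001,
      "valency": 2
    }
  ]
}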

One thing I just noticed inside topologyUpdater.sh is the line:

[[ "${NWMAGIC}" = "764824073" ]] && NETWORK_IDENTIFIER="--mainnet" || NETWORK_IDENTIFIER="--testnet-magic ${NWMAGIC}"

I have changed the testnet magic to:

[[ "${NWMAGIC}" = "1097911063" ]] && NETWORK_IDENTIFIER="--mainnet" || NETWORK_IDENTIFIER="--testnet-magic ${NWMAGIC}"

OK, I have removed everything except the IOHK testnet server and my block producer (which I have stopped) from testnet-topology.json.

@Alexd1985
OK, here are some logs of the startup sync just before it stops syncing. It looks like I'm hitting a bug (one that you have also noted; link below):

"Candidate contains blocks from future exceeding clock skew limit"

Feb 08 22:30:40 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Notice:711] [2022-02-08 22:30:40.39 UTC] Chain extended, new tip: 4a80100f6882e8b6363d9a653d9d4a4df64a5f9a86635bd57a0f13e8d3a26098 at slot 49987832
Feb 08 22:30:41 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Notice:711] [2022-02-08 22:30:41.68 UTC] Chain extended, new tip: 95aa591f207bfb57e7200f313b568916888e33722f3e040ac10a21024fbbf8eb at slot 49988260
Feb 08 22:30:43 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Notice:711] [2022-02-08 22:30:43.78 UTC] Chain extended, new tip: a0e59fc4624b926770f3878a6c51dcc6ea470a0d8af38e78dfe8ab1e04ad5036 at slot 49988456
Feb 08 22:30:45 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Notice:711] [2022-02-08 22:30:45.32 UTC] Chain extended, new tip: 40eac66a8c52470b28e8022161f626a29fa61347df037a83034698f752fcc30b at slot 49988918
Feb 08 22:30:46 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Notice:711] [2022-02-08 22:30:46.63 UTC] Chain extended, new tip: b390422e6e3429319b52c1823280df95081b83a28fb18ea353a6f86af0304f79 at slot 49989416
Feb 08 22:30:48 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Notice:711] [2022-02-08 22:30:48.51 UTC] Chain extended, new tip: b55145ce6f4b36362156fc90e1f06cac400effc695c1280f40fc1d5620706ed6 at slot 49989975

...

Feb 08 22:30:49 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Info:711] [2022-02-08 22:30:49.31 UTC] before next, messages elided = 16857770077514
Feb 08 22:30:49 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Info:711] [2022-02-08 22:30:49.31 UTC] Valid candidate 1a10103f4dabcb12cbb015249922dbd1987c5168ce1c0b4f36a39a8172c287f6 at slot 49990259
Feb 08 22:30:49 ctestnet cardano-node[426325]: [ctestnet:cardano.node.ChainDB:Error:711] [2022-02-08 22:30:49.31 UTC] Candidate contains blocks from future exceeding clock skew limit: 1a10103f4dabcb12cbb015249922dbd1987c5168ce1c0b4f36a39a8172c287f6 at slot 49990259, slots 1a10103f4dabcb12cbb015249922dbd1987c5168ce1c0b4f36a39a8172c287f6@49990259

and link to issue:
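
For anyone who wants to check their own logs for the same message, a quick filter (a sketch):

journalctl --unit=cardano-node --no-pager | grep -F 'Candidate contains blocks from future'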

OK, so… it turns out, after all of that, that my RTC was incorrect and was not being updated via NTP. What's weird about this is that I have some other nodes running on the same host… and all of those were OK.

So… I just enabled NTP on these testnet nodes… and it's fine now.
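
For anyone hitting the same symptom: on a systemd machine, checking and enabling time sync looks roughly like this (your distro may run chrony or ntpd instead of systemd-timesyncd):

timedatectl status             # look for "System clock synchronized: yes"
sudo timedatectl set-ntp true  # enable systemd time synchronization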


Wow, basically the nodes were in the future… that's why I saw the delay of -179.

Glad you fixed it.

Yeah, interesting troubleshooting. I guess in future it's wise to notice a negative Last Delay time.
Thanks for the help!
