Testnet Node Error Syncing Ledger

Hi All

I’m new to building Cardano nodes, but I work as an infrastructure engineer, so I’m not completely inexperienced!

I recently built a node on testnet. The build worked fine, and I was able to connect and begin downloading the ledger. However, during the sync I received the following errors:

cardano.node.DnsSubscription:Error:70] [2021-09-02 16:22:39.07 UTC] Domain: “relays-new.cardano-testnet.iohkdev.io” Application Exception: 3.9.80.183:3001 MuxError (MuxIOException writev: resource vanished (Broken pipe)) “(sendAll errored)”

[localhos:cardano.node.DnsSubscription:Error:2571] [2021-09-02 16:22:40.56 UTC] Domain: “relays-new.cardano-testnet.iohkdev.io” Application Exception: 3.129.133.68:3001 MuxError MuxBearerClosed “<socket: 27> closed when reading data, waiting on next header True”

[localhos:cardano.node.ErrorPolicy:Warning:53] [2021-09-02 16:22:40.56 UTC] IP 3.129.133.68:3001 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError MuxBearerClosed “<socket: 27> closed when reading data, waiting on next header True”))) 20s 20s

When I stop the sync and restart it, I get the same errors. I have deleted the db and node.socket and started again; the sync is fine for a little while, and then it errors again.

Any help would be really appreciated.

Version: 1.27.0
Linux: Ubuntu 18.04

I am having exactly the same issue now, with the same setup, using Docker. I am thinking of finding a new testnet relay server. Any ideas?

Are there other testnet relays that can be used in the topology JSON file?

I've tried other relays and get the same errors.

Try using IP addresses instead of the DNS name in the topology file.
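For anyone trying this, here is a minimal sketch of what a topology file with IPs instead of the DNS name could look like. It assumes the "Producers" topology layout used by node versions of this era, and it reuses the two relay IPs that appear in the error log at the top of the thread purely as examples; they may not be live relays any more. The helper name build_topology is made up for illustration.

```python
import json


def build_topology(peers, valency=1):
    """Return a topology dict with one producer entry per (addr, port) pair.

    Assumes the "Producers" layout with addr/port/valency keys.
    """
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": valency}
            for addr, port in peers
        ]
    }


# Example IPs copied from the error log in this thread; illustrative only.
peers = [("3.9.80.183", 3001), ("3.129.133.68", 3001)]
topology = build_topology(peers)

# Print the JSON that would go into the topology file.
print(json.dumps(topology, indent=2))
```

Redirect the printed JSON into the topology file the node is started with, then restart the node so it picks up the change.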

I am not sure the relays are the issue. I upgraded to 1.29 because I thought the version was the problem, but after cleaning the DB and trying again I still got stuck at 99%. If the relays were bad, I would not expect to get even to 1%, so something else seems to be stopping me at 99%. I see this has happened to a few other people, but I do not know the cause yet. Are there any CLI commands I can run to check the health of the node, other than liveview? My guess is that this has something to do with the Alonzo fork.
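On the question of CLI health checks: cardano-cli query tip reports the node's current tip, which can be compared against a block explorer. Below is a rough sketch that wraps it from Python. It assumes cardano-cli is on the PATH, that CARDANO_NODE_SOCKET_PATH points at the running node's socket, that the testnet magic is 1097911063, and that the output is JSON. The exact fields vary by version (syncProgress in particular may be missing on older releases), so the parser treats every field as optional. The function names are made up for illustration.

```python
import json
import subprocess

# Assumption: network magic of the public testnet discussed in this thread.
TESTNET_MAGIC = "1097911063"


def parse_tip(raw):
    """Extract the commonly reported fields from `query tip` JSON output.

    Any field the installed version does not report comes back as None.
    """
    tip = json.loads(raw)
    return {
        "block": tip.get("block"),
        "slot": tip.get("slot"),
        "era": tip.get("era"),
        "sync_progress": tip.get("syncProgress"),
    }


def query_tip(magic=TESTNET_MAGIC):
    """Run cardano-cli against the local node and return the parsed tip.

    Requires CARDANO_NODE_SOCKET_PATH to be set in the environment.
    """
    result = subprocess.run(
        ["cardano-cli", "query", "tip", "--testnet-magic", magic],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_tip(result.stdout)
```

Calling query_tip() a few times a minute apart shows whether the slot and block numbers are still advancing; if they are frozen while the explorer moves on, the node is stuck rather than merely slow.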

Run journalctl -e -f -u cardano-node and check for errors.

There are no errors showing in the logs.

I have rebuilt the node from scratch, this time on CentOS 8, and am currently syncing the ledger. I will update if that works. I am starting to think this is related to the Alonzo update on testnet.

I am working on this too, re-reading the docs and checking my scripts.

Also, I am using Docker on Windows, so I don't have journalctl. I just keep getting this message once I reach 99% synced:

ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s
ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s
ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s
ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s
ErrorPolicySuspendConsumer (Just (ConnectionExceptionTrace (SubscriberError {seType = SubscriberParallelConnectionCancelled, seMessage = "Parallel connection cancelled", seStack = []}))) 1s
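When the same message repeats like this, it can help to tally the log rather than read it line by line. The sketch below counts ErrorPolicy decisions by reason, assuming the log text has the shape shown in this thread (an ErrorPolicy... constructor followed by either a seType = ... field or a MuxError ... name). The text could be captured from docker logs for the node's container; the function name tally_policies is made up for illustration.

```python
import re
from collections import Counter


def tally_policies(log_text):
    """Count (policy, reason) pairs found in cardano-node log text.

    Lines without an ErrorPolicy decision are ignored; a decision with no
    recognisable reason is counted under "unknown".
    """
    counts = Counter()
    for line in log_text.splitlines():
        policy = re.search(r"ErrorPolicy\w+", line)
        if not policy:
            continue
        reason = re.search(r"seType = (\w+)|MuxError (\w+)", line)
        if reason:
            # Exactly one of the two alternatives captured a name.
            detail = reason.group(1) or reason.group(2)
        else:
            detail = "unknown"
        counts[(policy.group(0), detail)] += 1
    return counts
```

A log dominated by a single (policy, reason) pair, as above, suggests one recurring condition rather than several unrelated failures.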

I have updated to 1.29, fetched the latest configs (including Alonzo), cleaned the DB, and started fresh multiple times.

Magically all is good in the world and I am synced again!

My mainnet node works fine - no issue with the sync.
I’m now trying the testnet config again - hopefully it has just started working like yours as well!