Node sync suddenly extremely slow after half a day of syncing (testnet)

Hello,

First time posting in this forum! I have successfully set up and started a block producing node on testnet, mostly following this guide: How to build a Cardano Stake Pool - CoinCashew. The only major difference is that I decided to dockerize everything, so the cardano node process runs inside a docker container on a Compute Engine VM running Ubuntu. (Hardware: 2 cores, 4 GB memory, 60 GB standard disk; obviously I’d go higher in a prod/mainnet environment.)

After starting, I noticed the syncing was going quite fast (300-400 slots processed per second). After about 18 hours (reaching approx slot #30000000+), I’ve noticed a significant slowdown in syncing (now about 1 per second!). I have plenty of disk space left, CPU is only at 10% utilisation. At this rate it will almost never finish syncing!
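As a quick sanity check on those figures, here is a minimal shell sketch that turns a slot count and an elapsed time into an average rate. The function name `sync_rate` is made up for illustration, and the inputs are the approximate numbers quoted above, not measured values.

```shell
# Average sync rate in slots per second, rounded to a whole number.
# Args: slots_processed elapsed_seconds
sync_rate() {
  awk -v s="$1" -v t="$2" 'BEGIN { printf "%.0f\n", s / t }'
}

# Roughly 30,000,000 slots in about 18 hours, as reported above.
sync_rate 30000000 $((18 * 3600))
```

Averaged over the whole run, that is a few hundred slots per second, so a drop to about 1 per second is a change of more than two orders of magnitude.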

Any idea what could be going on? I can provide more info as needed. Advice much appreciated, thank you.

4GB is low; you need at least 8GB, preferably 16GB.

My nodes are using 7.4GB RAM each, and all of them have 16GB.

Thanks. I upgraded to 4 cores + 16gb on both block producer and relay and will update here tomorrow.

Perfect, things will run much better now. Let us know how it goes :slight_smile:

Hi @DevJohn and all, I upgraded both the relay node and the block producing node to 4 cores + 16gb RAM, and I did notice a real speed increase at the beginning of the blockchain sync. However, after reaching Epoch 151-ish, it’s slowing down tremendously again.

I attached a screenshot of gLiveView on the block producing node, and some infra metrics of the VM so you can see that we’re not constrained on resources.

Any other info I can provide? I’m really perplexed by why there would be a sudden slowdown.


I can see something wrong in your gLiveView. This is the image of your relay, and it only shows 1 in and 1 out, which is bad news! If you did everything correctly, you should have many more. For example, one of my relays has “23 Out / 17 In”.

That means there is potentially something wrong in your topologyUpdater. Visit the section “14. Configure your topology files” and make sure you have everything right.

You need to do the above on all of your relays if you have more than one.

Let me know how it goes.

First of all, thank you very much @DevJohn and all. The support in this community is amazing, and I hope to pay this back to others shortly!

So a few clarifications:

  1. The gLiveView I showed is in fact the block producing node’s (even though it says relay at the top). I attached an image to show the nodes side by side (block producing is on the left and relay is on the right). I don’t know why it says “relay” on both of them. Even though I followed CoinCashew’s instructions exactly, is it possible they missed a step and my “env” file for gLiveView is configured incorrectly? I notice that the “#TOPOLOGY” entry in that file is commented out, for example.
  2. I have not gotten to Step 14 yet. I’m still on Step 8, which if I read correctly, is indicating that I should be able to fully sync the blockchain at this point.
  3. I checked and each node is running their expected entrypoint files (startBlockProducingNode.sh and startRelayNode1.sh, respectively).
  4. Here are my topology files:
    Block Producing testnet-topology.json
{
    "Producers": [
      {
        "addr": "35.X.X.X <relay node public IP>",
        "port": 6000,
        "valency": 1
      }
    ]
  }

Relay testnet-topology.json

{
    "Producers": [
      {
        "addr": "104.X.X.X <block producing node public IP>",
        "port": 6000,
        "valency": 1
      },
      {
        "addr": "relays-new.cardano-testnet.iohkdev.io",
        "port": 3001,
        "valency": 2
      }
    ]
  }
  5. When I look at journalctl logs, I get one line like this every 30 secs or so. This is extremely slow, as it was speeding through all of these slots in the first few hours. I should clarify that after your first feedback yesterday to increase the cpu/mem, I started from scratch, so this is the second time I’m observing this behaviour.
Aug 19 12:07:33 cardano-block-producing-node-01-testnet cardano-block-producing-node[19755]: [cardano-:cardano.node.ChainDB:Notice:150] [2021-08-19 12:07:33.29 UTC] Chain extended, new tip: cf6774e197abd36871cd159b06da8a3e491820172b3c6494833de342185c3a11 at slot 35005637
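To put a number on how fast the tip is moving, the slot can be pulled out of those “Chain extended” lines and compared across two entries. This is a rough sketch only: the helper names `slot_from_log_line` and `slots_between` are hypothetical, and it assumes the log line ends in “at slot <number>” as in the sample above.

```shell
# Extract the slot number from a "Chain extended" log line.
# Prints nothing if the line has no "at slot <number>" part.
slot_from_log_line() {
  printf '%s\n' "$1" | sed -n 's/.*at slot \([0-9][0-9]*\).*/\1/p'
}

# Slots gained between two log lines (older line first).
# Returns non-zero if either line has no slot number.
slots_between() {
  first=$(slot_from_log_line "$1")
  last=$(slot_from_log_line "$2")
  [ -n "$first" ] && [ -n "$last" ] || return 1
  echo $((last - first))
}

# Sample taken from the journalctl output above.
line='[2021-08-19 12:07:33.29 UTC] Chain extended, new tip: cf6774e197abd36871cd159b06da8a3e491820172b3c6494833de342185c3a11 at slot 35005637'
slot_from_log_line "$line"
```

If two lines logged about 30 seconds apart differ by about 30 slots, the node is advancing at roughly one slot per second. Assuming one-second slots, that is the same pace as the wall clock rather than a backlog being worked through.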

OHHHH! Epoch 151 is the latest epoch on Testnet! :man_facepalming:
https://explorer.cardano-testnet.iohkdev.io/en.html
I was under the false impression that it should go all the way to 285!
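For anyone hitting the same confusion, comparing the node’s own tip with the latest epoch on the explorer settles it quickly. The sketch below assumes the tip is available as JSON text with numeric `epoch` and `slot` fields; the helper `tip_field` is hypothetical, and the sample JSON is made up from the values in this thread rather than copied from real CLI output.

```shell
# Print the first integer value of a named field from tip JSON text.
# Args: json_text field_name
tip_field() {
  printf '%s\n' "$1" \
    | sed -n "s/.*\"$2\"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p" \
    | head -n 1
}

# Illustrative sample only; values taken from this thread.
sample='{ "epoch": 151, "slot": 35005637 }'

local_epoch=$(tip_field "$sample" epoch)
network_epoch=151   # latest epoch shown by the testnet explorer at the time

if [ "$local_epoch" -ge "$network_epoch" ]; then
  echo "at tip (epoch $local_epoch)"
else
  echo "behind by $((network_epoch - local_epoch)) epochs"
fi
```

On a live node the JSON would come from something like `cardano-cli query tip --testnet-magic 1097911063`. Treat the magic number and the exact set of fields as assumptions to check against your node version.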

Any idea why it says “Relay - Testnet” at the top when this is definitely the block producing node?
Thanks again.

You need to download the mainnet config files and switch over to them on your instances.

https://hydra.iohk.io/build/7191656/download/1/index.html
or step 3 on coincashew guide as you mentioned earlier.
wget -N https://hydra.iohk.io/build/${NODE_BUILD_NUM}/download/1/${NODE_CONFIG}-byron-genesis.json

wget -N https://hydra.iohk.io/build/${NODE_BUILD_NUM}/download/1/${NODE_CONFIG}-topology.json

wget -N https://hydra.iohk.io/build/${NODE_BUILD_NUM}/download/1/${NODE_CONFIG}-shelley-genesis.json

wget -N https://hydra.iohk.io/build/${NODE_BUILD_NUM}/download/1/${NODE_CONFIG}-config.json
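The two variables in those commands have to be set first. As a hedged sketch, the snippet below sets them and prints the four URLs as a dry run. The function name `config_urls` is made up for illustration, and using build number 7191656 (from the link above) for `NODE_BUILD_NUM` is an assumption.

```shell
# Print the download URLs for one network's config files.
# Args: build_number network_name (e.g. mainnet or testnet)
config_urls() {
  for f in byron-genesis topology shelley-genesis config; do
    echo "https://hydra.iohk.io/build/$1/download/1/$2-$f.json"
  done
}

NODE_BUILD_NUM=7191656   # assumed: build number from the link above
NODE_CONFIG=mainnet      # or testnet, to stay on the test network

# Dry run: list what would be fetched, without downloading anything.
config_urls "$NODE_BUILD_NUM" "$NODE_CONFIG"
```

To actually download, the output can be piped to `xargs -n 1 wget -N`, which runs the same `wget -N` call once per URL.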
