Slow Synch - 1.35 - Mainnet - 1 Solution/Suggestion

WaterPecker · 6 July 2022 18:15

Just wanted to share something to look at if you have experienced a slow synch. This may also apply to other versions.

I was running into huge slowdowns past the 90% synch mark. I noticed my RAM was maxing out; I had 16 GB. That was (maybe coincidentally) putting a big load on the SSD. I don’t know if it’s related, but the SSD was seeing heavy usage. I plopped in an additional 16 GB and it started humming along nicely. Memory usage maxed out at 22 GB or so, and SSD activity dropped to almost nothing.

In short, having RAM headroom helped a lot.
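
A quick way to check whether a node is short on headroom, as described above, is to look at `MemAvailable` and swap usage in `/proc/meminfo` on Linux. The following is only a minimal sketch: the function names and the 10% threshold are assumptions made for illustration, not something from cardano-node or from this thread.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def headroom_report(meminfo, min_available_fraction=0.10):
    """Summarise memory headroom; the 10% default threshold is an arbitrary assumption."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    swap_used = meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)
    fraction = available / total
    return {
        "available_fraction": fraction,
        "swap_used_kb": swap_used,
        "low_headroom": fraction < min_available_fraction,
    }


# Only read the real file when it exists (Linux); elsewhere this does nothing.
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(headroom_report(parse_meminfo(f.read())))
```

If `low_headroom` is true and `swap_used_kb` keeps growing while the node syncs, that would be consistent with the RAM pressure described in this post.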

Benagain · 10 July 2022 13:09

It seems like cncli is causing a large spike in RAM usage when it calculates the leaderlog, which led me to do what you suggested and add some RAM to my block producer (BP). I also increased my swap.

This is a fairly new issue, so hopefully it can get sorted in the next cncli update.

SGC2022 · 21 July 2022 20:41

Hmmm
been running with 24GB of memory and trying to sync for close to 2 weeks…
tho did forget to switch cnode.sh to more cores… so can’t completely put it down to memory.

just changed to 32GB less than an hour ago and it seems to have solved the slow sync issue i was having…
oddly enough i didn’t see massive ssd activity nor memory usage above 16GB…
but the change seems to have done something… at the current pace the sync should finish in the near future.

The SSD “activity” / bandwidth graph i got doesn’t show the IO pressure, so it could have been swapping with high IO, which could slow down compute immensely.
that doesn’t explain why it didn’t use all the memory first… but maybe it has something to do with 24GB of memory being a slightly unusual configuration, which could affect the code being run…

i assume it’s possible that the cardano node does some configuration based on RAM and such… and it might only be able to see that in powers of 2… so 2, 4, 8, 16, 32…
and because i was at 24 it reverted to using only 16…

no clue if that is even close to being right… but it’s the only logic i can apply to make sense of it.
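
A bandwidth graph can hide swapping, so one way to test the swapping theory is to watch the kernel's swap counters directly. The sketch below assumes a Linux host and reads the `pswpin` and `pswpout` counters (pages swapped in and out since boot) from `/proc/vmstat`; the function names and the 5 second interval are made up for illustration.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, seconds):
    """Pages swapped in/out per second between two counter snapshots."""
    return {
        "pages_in_per_s": (after["pswpin"] - before["pswpin"]) / seconds,
        "pages_out_per_s": (after["pswpout"] - before["pswpout"]) / seconds,
    }


def sample_swap_rates(path="/proc/vmstat", interval=5.0):
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return swap_rates(before, after, interval)
```

Calling `sample_swap_rates()` a few times while the node is syncing should show rates near zero if the machine is not swapping. Sustained non-zero values would point to memory pressure even when the memory graph looks like it has room.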

Benagain · 25 July 2022 01:11

After rereading this, I think the issue I described in my original comment was probably not the cause of yours.

I find that connecting to only one or two relays during sync speeds things up, especially if they are IOHK relays. Here is a good topology for syncing. You should also disable the topology updater while syncing, as you don’t want your node restarting.

{
  "Producers": [
    {
      "addr": "relays-new.cardano-mainnet.iohk.io",
      "port": 3001,
      "valency": 2
    }
  ]
}
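
If you switch between a sync topology and your normal one often, a small script can generate the file above and sanity-check it first. This is only a sketch: the field names are taken from the topology in this post, while the helper names and the checks are assumptions, and where your node reads its topology file from depends on your own setup.

```python
import json


def build_sync_topology(addr="relays-new.cardano-mainnet.iohk.io", port=3001, valency=2):
    """Build a single-producer topology dict like the one shown in the post."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    if valency < 1:
        raise ValueError("valency must be at least 1")
    return {"Producers": [{"addr": addr, "port": port, "valency": valency}]}


def validate_topology(topology):
    """Check that every producer has the three expected keys; return the producer count."""
    producers = topology.get("Producers")
    if not isinstance(producers, list) or not producers:
        raise ValueError("topology needs a non-empty 'Producers' list")
    for producer in producers:
        missing = {"addr", "port", "valency"} - set(producer)
        if missing:
            raise ValueError(f"producer is missing keys: {sorted(missing)}")
    return len(producers)


topology = build_sync_topology()
validate_topology(topology)
# Print the JSON; redirect it into your own topology file path if it looks right.
print(json.dumps(topology, indent=2))
```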

7.4d4 · 25 July 2022 11:54

@Benagain Do you think that is because the processors are not fast enough?

I am running 1.35.0 on a low-power ARM machine. The node runs inside a virtual machine with only 4 Cortex-A72 CPUs at 2GHz. This is only a bit more powerful than a Raspberry Pi, but with more RAM (16G). This setup was able to sync OK and has been running fine for the couple of weeks since I upgraded it to version 1.35.0. It is running in P2P mode on mainnet.

Benagain · 25 July 2022 14:32

Good questions.

I am hesitant to give you more/firm answers on CPUs as I have not devoted effort to ARM architecture, but maybe someone who has could chime in and offer some insight. My pool is x64 only, which means I am using a different node build altogether.

I have also tested P2P on testnet and a little on mainnet and decided not to use it in pool nodes just yet. I would suggest switching back to a known good configuration (non-P2P) as a troubleshooting step.

Also it looks like 1.35.0 is no longer the latest release, so you could try updating to the latest as well.

Hope this helps.

7.4d4 · 26 July 2022 12:45

Sorry @Benagain I wasn’t very clear with my question:

I was wondering why you are seeing some reduced stability with Ryzen 5 series processors.

Because I would have thought that my ARM setup is less powerful than Ryzen 5 series processors. Since it has only:

But my setup seems to be running fine for several weeks now.

What is the problem with the Ryzen 5 series processors?
