Hi folks, trying to get knowledgeable about how the node software works. Do the running nodes periodically restart themselves? Or do you guys periodically do that yourselves? I ask because I have just built version 1.27 with new(ish) cabal & ghc and have noticed the node seems to operate more sluggishly over time & restarting alleviates it. I don’t think it’s a memory leak from watching metrics, but maybe a CPU queue thing, or related perhaps to logging.
The cardano-node process restarts only if something controls it, e.g. a script, a service, or a cronjob…
Which solution do you have?
Hi lap, I am just getting set up now for the first time. I used the official docs to install (no scripts). I would generally cronjob something like that if it’s advisable to periodically restart the nodes and they don’t cycle themselves. From looking around on here, it seems other folks are noticing that nodes start missing slots, get sluggish, etc. after running a long time. I wonder if it’s a bug?
Because of topology updates relay nodes should restart once a day. But it’s not critical. It is just convenient to reconnect to other relays at some point.
Block producer nodes don’t need restarts. The instability you’ve read about is mostly because of hardware issues. From my own experience, some VPS hosting services don’t configure systems to be responsive enough for BP nodes. Therefore, I went with a dedicated virtual server, i.e. a virtual system with a 1:1 mapping of virtual to real cores (in other words, a partitioned system). Since then there have been no missed slots.
From the experience of other operators here in the forum, it seems that most of them run into issues from not having enough memory configured. With 16 GB you are on the safe side, but 12 GB should be enough for the time being.
However, there’s still an issue for BP nodes during epoch changes. But the developers are aware of it and IMO it will improve over time. Even then a restart is not required.
And one last comment - with one of the next versions a reload of the configuration will be implemented through signaling. With that feature topology changes won’t require restarts as it is now.
Right on. Thanks for the thorough info, jf. It’s definitely an optimization challenge. Do you have any experience with or knowledge of how well using a swap file on an SSD works with cardano-node?
I want to point out that the cardano-node service only consumes ~6 GB of memory (for now). I’ve heard that this is expected to grow, but at the moment 8 GB is more than sufficient. If you run other tools (cncli, for example), note that it used to consume a high amount of memory when running specific tasks. Andrew (the developer of cncli) has worked with the Cardano devs and created a new ‘snapshot’. I’ve noticed the snapshot takes a lot of CPU cycles but doesn’t overconsume memory.
By default, if you’re using Ubuntu, it’s not configured with swap. Swap is virtual memory, and you can always add it to your VM for those additional stake pool tasks. I’ve successfully done it to run cncli. It works great. Where a lot of operators ran into trouble is not enabling swap and having the kernel kill the node service.
In terms of running the Cardano-node proc on swap, I can’t comment on this. I’ve wanted to investigate if this is a possibility as it would save on costs, but never went beyond the thought.
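For anyone who wants to confirm whether swap is actually enabled on their box before the kernel starts killing things, here is a minimal sketch. It assumes a Linux system where /proc/meminfo exposes the `SwapTotal` and `SwapFree` fields; the function names `parse_meminfo` and `swap_summary` are made up for this illustration and are not part of any Cardano tooling.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_summary(meminfo):
    """Report whether swap is configured and how much of it is in use."""
    total = meminfo.get("SwapTotal", 0)
    free = meminfo.get("SwapFree", 0)
    return {"enabled": total > 0, "total_kb": total, "used_kb": total - free}
```

On the node itself you would pass in the contents of /proc/meminfo, e.g. `swap_summary(parse_meminfo(open("/proc/meminfo").read()))`. If `enabled` comes back false, no swap is configured.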
Thanks, luis. Good to know what’s coming! I am testing out running swap VM on an SSD disk. I’m experimenting on a 2 core CPU, 4 GB RAM DigitalOcean droplet I have had a long time, and the SSD storage is way cheaper than jumping up to an 8 or 16 GB RAM plan. I have only set up the relay node so far and it was hammering that droplet for 2 days at near 100% CPU until it finally synced up to the current epoch about two hours ago. Now it’s barely touching the CPU but still using that full 6 GB of memory you mentioned, with 2 GB sitting on that SSD VM. I hope that translates to it not needing the restarts now, though I will probably cronjob in periodic restarts for the topology thing too. I’ll post back in a new topic what I discover about using SSD virtual memory.
Interesting. What I would suggest is running your node on the testnet (there’s one that mints blocks every minute) and seeing how the VM handles block production. I don’t think it would necessarily be a problem, but it needs to be tested. I really hope it works, because having a cardano node/relay sit idle wastes a ton of resources, especially if you’re not producing blocks.
Good idea. I just checked it again and now it has split that 6 GB into 3GB on RAM and 3 GB on swap SSD. I guess it’s smart enough to leave 1 GB real RAM overhead for other things. It seems to be running very smoothly now that it’s synced.
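If you want to track that RAM/swap split over time rather than eyeballing it, something like the sketch below could work. It assumes Linux, where /proc/<pid>/status reports `VmRSS` (resident memory) and `VmSwap` (swapped-out memory) in kB; the helper names are hypothetical, and you would need to look up the node's PID yourself.

```python
def parse_status_kb(text, fields=("VmRSS", "VmSwap")):
    """Extract selected kB fields from /proc/<pid>/status-style text."""
    result = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in fields:
            # Lines look like "VmRSS:   3145728 kB"; take the number.
            result[key] = int(rest.split()[0])
    return result


def swap_fraction(status):
    """Fraction of the process's memory (resident + swapped) that is in swap."""
    rss = status.get("VmRSS", 0)
    swap = status.get("VmSwap", 0)
    total = rss + swap
    return swap / total if total else 0.0
```

A 3 GB / 3 GB split like the one described above would give a swap fraction of 0.5. Logging that number every few minutes would show whether the kernel keeps pushing more of the node out to the SSD.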
@jf3110 may I know which dedicated VM you are using? I’m now planning to move to a dedicated service and am a bit lost about which one to choose.
If you’re doing periodic restarts, make sure they are not happening during a scheduled slot time.
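One way to enforce that in a restart script is a guard like the sketch below. It assumes you already have your scheduled slot times as timezone-aware datetimes from whatever leader-log tool you use; the function name `is_safe_to_restart` and the 15 minute margin are made up for illustration, so pick a margin that covers how long your node takes to come back up.

```python
from datetime import datetime, timedelta, timezone


def is_safe_to_restart(now, scheduled_slots, margin=timedelta(minutes=15)):
    """Return True if no scheduled slot falls within `margin` of `now`.

    The margin is applied on both sides of `now`, which is deliberately
    conservative: it also avoids restarting right after a slot.
    """
    return all(abs(slot - now) > margin for slot in scheduled_slots)
```

A cron-driven restart script could call this first and simply skip the restart (or retry later) when it returns False.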
I’m happy with the vds small from https://contabo.com/de/vds/. They come with a reasonable price and are fast enough for running the bp node on them.
Thanks a lot @jf3110, just to confirm, do you have any missed blocks on it? Also, did you try a Contabo VPS for the relay?
You’re welcome. Since the upgrade to VDS Small, there are no missed slots except the usual ones during epoch changes. Relay is running on VPS M. Both systems are solid and stable. Next step would be a second relay on their US site.
Because of that, I’m no longer restarting bp node. It’s now running for almost 25 days without restart.
@jf3110 Perfect!! Then I will go for it, thanks a lot.
I use a 2G swap on my 8G BP (Ubuntu 18.04, 8 cores) on a VPS at hostinger.com. Performance seems good and I have not missed any slots yet, but it has only been running for 3 weeks.