I heard Charles mention something about SPO optimizations being rolled out right after Alonzo. I am waiting until then to right-size my VMs. In the interim, more than enough should be enough.
Stats of our Azure relay running 1.29.0 after a few hours. 8 GB RAM, 2 CPUs. Bare bones install with no monitoring other than CNTools. 17 out connections.
Are you running on mainnet?
1.29.0 after ~4 days uptime:
Relay: Prometheus: ~2.8GB | top: RES ~8.5GB
BP: Prometheus: ~3.3GB | top: RES ~7.6GB
Btw I see Prometheus metrics show a far lower memory amount for the node process than what top is showing.
(Prometheus: rts_gc_current_bytes_used) I guess that metric is not actual RAM usage?
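For anyone who wants to check that gap on their own node, here is a minimal sketch that puts the two numbers side by side. As far as I understand, rts_gc_current_bytes_used reports the live bytes on the GHC heap as of the last garbage collection, while RES in top is everything the kernel keeps resident for the process, so the second is expected to be larger. The helper names are made up for illustration, the parser assumes simple "name value" metric lines without spaces inside labels, and the port in the usage comment (12798) is only the default I would expect from the stock config.

```python
GIB = 1024 ** 3


def parse_prometheus_metric(metrics_text, name_fragment):
    """Return the first sample whose metric name contains name_fragment.

    Simplified: assumes each sample line is "name value" (optionally with
    labels that contain no spaces).
    """
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and name_fragment in parts[0]:
            return float(parts[1])
    return None


def parse_vmrss_bytes(status_text):
    """Extract VmRSS (what top shows as RES) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line looks like "VmRSS:   8912896 kB"
            return int(line.split()[1]) * 1024
    return None


def compare(metrics_text, status_text,
            name_fragment="rts_gc_current_bytes_used"):
    """Return both figures in GiB so they can be compared directly."""
    heap = parse_prometheus_metric(metrics_text, name_fragment)
    rss = parse_vmrss_bytes(status_text)
    return {
        "heap_gib": None if heap is None else round(heap / GIB, 2),
        "rss_gib": None if rss is None else round(rss / GIB, 2),
    }


# Hypothetical usage on the node itself (PID and port are assumptions):
#   import urllib.request
#   metrics = urllib.request.urlopen("http://127.0.0.1:12798/metrics").read().decode()
#   status = open("/proc/<pid of cardano-node>/status").read()
#   print(compare(metrics, status))
```

If the heap figure stays flat while the RSS figure keeps climbing, that would suggest the growth is in memory the runtime holds on to rather than in live data, which matches what people are reporting here.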
With these stats and zero swap, I’m wondering why oomd does not kill cardano-node. Did you disable it?
Nope. Default standard install, except there is a dedicated premium data drive attached for the db folder. The dedicated drive is standard for all our nodes. Our setups are on our blog. Ticker 001.
Did it stay at 6.5 GB? I am using 1.29.0 but the memory consumption always ends up getting close to 8 GB.
This is annoying. I wish they would at least give a heads up about the issue, or update the minimum requirements.
I thought it would get reduced after my latest upgrade to a new commit. Unfortunately no. After a 24 hour run it all went back to the same figures.
Currently 2 relays run at 10 GB and the core runs at 9.5 GB.
Reducing the number of peers didn’t help either. I reduced the number of peers from 25 to 16 per relay. It didn’t make any difference to memory consumption. So I guess we have to live with it until the Basho era kicks in, as that is meant to be the optimisation era.
Wow, wasn’t expecting this to turn into such a lively discussion.
Thanks all for your feedback.
For me, after 4 days of node running,
Relay: 8GB used
Producer: 7.1GB used
Upgraded our nodes to higher specs and we are ready to take on more if a future release demands it. On a side note, I’m concerned about the future of RPi pools.
Thanks. I am seeing very similar results. Thanks for sharing.
Yeah, I read that thread you mentioned.
I’d been trying different variations of RTS, mostly heap and allocation adjustments, but I still have concerns.
Playing with runtime (RTS) is not really optimisation IMHO.
It is always a trade-off. Yes, you can optimise somewhat, but usually you just sacrifice speed for resources or vice versa.
At the moment, I have enough memory, I might as well use it until they optimise the actual code.
The 8 GB Azure relay mentioned previously is struggling. I had to restart it several times. I’m monitoring it via atop (no Grafana or Prometheus, so load is minimal).
I don’t think even Scotty could make a difference.
I’m thinking one-shot to the head may be in that relay’s future.
In my case, after migrating to 1.29 the relay worked fine for a few hours. After that it started resetting the node once there was no more RAM available. It now restarts every 12 minutes.
The main node is working well so far. I need to address the memory increase on the relay first and observe if there will be any effect on the main node (which I don’t expect).
Even from 1.27.0, I realized quickly that there were plenty of instances of maximum resource utilization, and the result was frequent missed slots. Changing to N4 didn’t really help. So I just bit the bullet and paid to upgrade all my servers to 8 cores and 16 GB RAM for each of my relays and BP. Memory usage, processor usage, and missed slots were all resolved. Basically, it just takes computational resources to be a proper SPO. I also foresee that these requirements will increase over time as the network grows in usage and popularity. With Plutus, smart contracts and an exponentially growing ecosystem, we can’t really run away from growing server requirements. I just hope that the future models, when voted into fruition, will take into consideration that these costs for SPOs will only grow, especially if the SPOs are trying to do the right thing and support the network with appropriate hardware.
Just another confirmation that these RTS modifiers worked well for my 8 GB relays. My Raspi was particularly problematic, but now is no longer crashing.
-Sully at RADAR
I did not see an increase in RAM usage from 1.27 to 1.29. Still running without missed slots with 8GB.