Upgraded to 1.29.0 - High memory usage

Memory is always low after a restart. Then it starts caching and increasing.
It always starts at around 4 GB for both the core and relays, then eventually stabilises at around 8 GB for 1.28.0 and around 9-10 GB for 1.29.0.
That’s what it is for me anyway…
The mem stats right now:
relay 9.84GB
spare 10.66GB
core 8.981GB

0 missed slots during the epoch - correct. Same here.
Increased number of missed slots during the epoch changeover though.

Interesting…
Just upgraded to the latest commit of 1.29.0.
Memory consumption is back to normal, around 6.5 GB.
Go figure

I heard Charles mention something about SPO optimizations being rolled out right after Alonzo. I am waiting until then to right-size my VMs. In the interim, more than enough should be enough. :slight_smile:


Stats of our Azure relay running 1.29.0 after a few hours. 8 GB RAM, 2 CPUs. Bare bones install with no monitoring other than CNTools. 17 out connections.

image

Are you running on mainnet?

Affirmative


1.29.0 after ~4 days uptime:

Relay: Prometheus: ~2.8GB | top: RES ~8.5GB
BP: Prometheus: ~3.3GB | top: RES ~7.6GB

Btw I see Prometheus metrics show a far lower memory amount for the node process than what top is showing.

(Prometheus: rts_gc_current_bytes_used) I guess that metric is not actual ram usage?
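On the Prometheus question: as far as I know, rts_gc_current_bytes_used is the number of live heap bytes the GHC runtime counted at its last garbage collection, while RES in top is the resident memory the kernel has actually handed to the process, so the two are expected to differ. Here is a minimal sketch for putting both numbers side by side, assuming a Linux host and Prometheus text-format output. The helper names and the sample strings are made up for illustration, and the sample values just mirror the relay figures above.

```python
import re

def metric_value(exposition_text, name):
    """Return the first sample of `name` from Prometheus text-format output, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        # A metric may carry labels: name{label="x"} value
        # (this sketch assumes there are no spaces inside label values)
        if len(parts) >= 2 and parts[0].split("{", 1)[0] == name:
            return float(parts[1])
    return None

def rss_bytes(proc_status_text):
    """Return VmRSS in bytes from the contents of /proc/<pid>/status, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", proc_status_text, re.MULTILINE)
    return int(m.group(1)) * 1024 if m else None

# Illustrative sample inputs, not real captures from a node.
SAMPLE_METRICS = """# TYPE rts_gc_current_bytes_used gauge
rts_gc_current_bytes_used 2800000000
"""
SAMPLE_STATUS = "Name:\tcardano-node\nVmRSS:\t 8300000 kB\n"

live = metric_value(SAMPLE_METRICS, "rts_gc_current_bytes_used")
rss = rss_bytes(SAMPLE_STATUS)
print(f"live heap {live / 1e9:.1f} GB, resident {rss / 1e9:.1f} GB, ratio {rss / live:.1f}x")
```

On a real host you would feed it the text from the node's Prometheus endpoint and from /proc/<pid>/status for the cardano-node process.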

With these stats and zero swap, I’m wondering why oomd does not kill cardano-node. Did you disable it?

Nope. Default standard install, except there is a dedicated premium data drive attached for the db folder. The dedicated drive is standard for all our nodes. Our setups are on our blog. Ticker 001. :slight_smile:

Hey Ruslan.

Did it stay at 6.5 GB? I am using 1.29.0 but the memory consumption always ends up getting close to 8 GB.

This is annoying. I wish they would at least give a heads up about the issue, or update the minimum requirements.

Hi,

I thought it would get reduced after my latest upgrade to a new commit. Unfortunately no. After a 24 hour run it all went back to the same figures.

Currently 2 relays run at 10 GB and the core runs at 9.5 GB.
Reducing the number of peers didn’t help either. I reduced the number of peers from 25 to 16 per relay. It didn’t make any difference on memory consumption. So I guess we have to live with it till Basho era kicks in as this would be the optimisation era.


See the thread "Solving the Cardano node huge memory usage - done".

Wow, wasn’t expecting this to turn into such a lively discussion :grinning:

Thanks all for your feedback.

For me, after 4 days of node running,
Relay: 8GB used
Producer: 7.1GB used

Upgraded our nodes to higher specs and we are ready to take on more if future releases demand it. On a side note, I’m concerned about the future of RPi pools.


Thanks. I am seeing very similar results. Thanks for sharing.


Yeah, I read that thread you mentioned.
I’d been trying different variations of RTS, mostly heap and allocation adjustments, but I still have concerns.
Playing with runtime (RTS) is not really optimisation IMHO.
It is always a trade-off. Yes, you can optimise somewhat, but usually you just sacrifice speed for resources or vice versa.
At the moment, I have enough memory, I might as well use it until they optimise the actual code.
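For anyone who wants to try the heap and allocation adjustments anyway, here is a rough sketch of how runtime flags get attached to a command line. -A (allocation area size), -H (suggested heap size), -M (maximum heap size) and -c (compacting collection for the oldest generation) are standard GHC runtime options. The sizes below are placeholders picked for illustration, not recommended values, and the sketch assumes the binary was built to accept RTS options. The function name and the trimmed-down node command are hypothetical.

```python
def rts_args(alloc_area=None, suggested_heap=None, max_heap=None, compact=False):
    """Build a '+RTS ... -RTS' argument list from optional GHC runtime settings."""
    flags = []
    if alloc_area:
        flags.append(f"-A{alloc_area}")      # allocation area size
    if suggested_heap:
        flags.append(f"-H{suggested_heap}")  # suggested heap size
    if max_heap:
        flags.append(f"-M{max_heap}")        # hard cap on the heap
    if compact:
        flags.append("-c")                   # compacting GC for the oldest generation
    # Emit nothing at all when no option was requested.
    return ["+RTS", *flags, "-RTS"] if flags else []

# Hypothetical, shortened node command; a real one needs more arguments.
node_cmd = ["cardano-node", "run", "--config", "config.json"]
cmd = node_cmd + rts_args(alloc_area="16m", max_heap="7g", compact=True)
print(" ".join(cmd))
```

This is the same trade-off again: -c, for example, lowers memory use at the cost of slower collections, so it is worth watching for missed slots after any change.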

The 8 GB Azure relay mentioned previously is struggling. I had to restart it several times. I’m monitoring it via atop (no Grafana or Prometheus, so load is minimal).

I don’t think even Scotty could make a difference. :slight_smile:

I’m thinking one-shot to the head may be in that relay’s future.

In my case, after migrating to 1.29 the relay worked fine for a few hours. After that it started restarting the node whenever there was no more RAM available. It now restarts every 12 minutes.
The main node is working well so far. I need to address the memory increase on the relay first and observe if there will be any effect on the main node (which I don’t expect).
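Before a relay gets into a restart loop like this, it helps to see how much headroom is left. A minimal sketch, assuming a Linux host: it parses /proc/meminfo-style text and flags when MemAvailable drops below a fraction of MemTotal. The 10% threshold, the function names and the sample text are assumptions for illustration only.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info

def low_on_memory(info, min_free_fraction=0.10):
    """True when available memory is below the given fraction of total memory."""
    return info["MemAvailable"] < info["MemTotal"] * min_free_fraction

# Illustrative sample resembling an 8 GB machine with no swap.
SAMPLE = """MemTotal:        8148316 kB
MemFree:          120004 kB
MemAvailable:     310220 kB
SwapTotal:             0 kB
"""
info = parse_meminfo(SAMPLE)
print("low on memory:", low_on_memory(info))
```

On a real host, pass in the contents of /proc/meminfo and run the check from cron or a small loop.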

Even from 1.27.0, I realized quickly that there were plenty of instances of maximum resource utilization, and the result was frequent missed slots. Changing to N4 didn’t really help. So I just bit the bullet and paid to upgrade all my servers to 8 cores and 16 GB RAM for each of my relays and BP. Memory usage, processor usage, and missed slots were all resolved. Basically, it just takes computational resources to be a proper SPO. I also foresee that these requirements will increase over time as the network grows in usage and popularity. With Plutus, smart contracts and an exponentially growing ecosystem, we can’t really run away from growing server requirements. I just hope that the future models, when voted into fruition, will take into consideration that these costs for SPOs will only grow, especially if the SPOs are trying to do the right thing and support the network with appropriate hardware.


Just another confirmation that these RTS modifiers worked well for my 8 GB relays. My Raspi was particularly problematic, but now is no longer crashing.
-Sully at RADAR

I did not see an increase in RAM usage from 1.27 to 1.29. Still running without missed slots with 8GB.
