Solving the Cardano node huge memory usage - done

Thanks for the feedback!

A few remarks on this great Google sheet:

  • You should include the version of GHC used to compile the node as the RTS is a part of GHC.
  • Using -I0 effectively disables the periodic (idle) GC, so combining -Iw600 with -I0 should have no effect (unless -Iw has some undocumented behavior).
  • The options you put between the +RTS and -RTS arguments are added to the defaults the node was compiled with, so leaving out -T, -A16m or --disable-delayed-os-memory-return should make no difference (I’ve added them for consistency).
  • It can take up to 16 hours or even several days to see the effectiveness of these parameters.
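
To illustrate the point about options being added to the compiled-in defaults, here is a rough Python sketch. It is a simplified model, not how the RTS parses its flags: the merge_rts_flags helper is hypothetical, and the defaults string only contains the flags named above, which may not be the full set your build uses.

```python
import re

# Splits a flag into (option name, value), e.g. "-Iw300" -> ("-Iw", "300").
_FLAG = re.compile(r"^(--[a-z-]+|-Iw|-[A-Za-z])(.*)$")

def merge_rts_flags(compiled_in, command_line):
    """Model of the effective flags: command-line flags are applied after
    the compiled-in ones, so a repeated option simply keeps the later value."""
    effective = {}
    for flag in compiled_in.split() + command_line.split():
        name, value = _FLAG.match(flag).groups()
        effective[name] = value  # a later occurrence replaces an earlier one
    return " ".join(name + value for name, value in effective.items())

# Assumed compiled-in defaults, limited to the flags mentioned in this post.
defaults = "-T -A16m --disable-delayed-os-memory-return"

# Repeating the defaults on the command line changes nothing...
print(merge_rts_flags(defaults, "-T -A16m"))
# ...while new flags are added on top of them.
print(merge_rts_flags(defaults, "-I0.3 -Iw300"))
```

In this model, repeating a default is harmless but redundant, which is why I only list them for consistency.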

Also note that version 1.29 uses on average 500M more RAM than 1.27 (check the live data after a major GC to see the base usage, which is around 2500M for 1.29), so you should replace -H2500M with -H3G. Setting this value too low, or using a -F smaller than 1.5, will trigger many more minor GCs and slow down the node, as it will spend most of its time garbage collecting instead of doing useful work. (You can notice this during startup, where you want the smallest possible number of major and minor GCs.)
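
To get a feel for why a small -F hurts, here is a back-of-the-envelope calculation in Python. It assumes the simplified reading of -F from the GHC user guide (the old generation may grow to roughly factor × live data before the next major GC) and ignores the interaction with -H; the function name is made up for this example.

```python
def major_gc_threshold_mb(live_mb, factor):
    """Approximate old-generation size (in MB) at which the next major GC
    is due, under the simplified model 'threshold = live data * -F'."""
    return live_mb * factor

live_mb = 2500  # approximate live data after a major GC on 1.29 (see above)

for factor in (1.2, 1.5, 2.0):
    threshold = major_gc_threshold_mb(live_mb, factor)
    headroom = threshold - live_mb
    print(f"-F{factor}: next major GC near {threshold:.0f}M "
          f"({headroom:.0f}M of headroom)")
```

The smaller the factor, the less room the node has to allocate before the next collection, so collections happen more often.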

One of my relays is a Raspberry Pi (using Raspbian with the 64bit kernel) and I am using the following with 1.29 compiled with GHC 8.10.7:

+RTS -N4 --disable-delayed-os-memory-return -I0.3 -Iw300 -A16m -n4m -F1.5 -H3G -T -S -RTS

The -n4m option divides each -A16m allocation area into 4m chunks and lets cores that exhaust their nursery use other cores’ unused 4m chunks before triggering a minor GC, which is useful when one thread is allocating heavily while others are idle. The documentation is unclear on whether -n4m is already the default, but from my observation it does seem to reduce the time spent in GC a little; more tests would be required to be certain.
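
A toy Python model of that effect, under idealized assumptions (exactly one core allocating, all others completely idle, and the -A area counted per core). It is not a measurement and the minor_gcs function is hypothetical; it only shows the best case the chunking could give.

```python
def minor_gcs(total_alloc_mb, cores, area_mb, chunked):
    """Count minor GCs for a single busy core allocating total_alloc_mb.

    Without chunks the busy core collects each time its own area is full;
    with chunks it can first drain the whole pool (cores * area_mb)."""
    budget_mb = cores * area_mb if chunked else area_mb
    return total_alloc_mb // budget_mb

# One busy thread allocating 1024M on a 4-core machine with -A16m.
print(minor_gcs(1024, cores=4, area_mb=16, chunked=False))  # without -n
print(minor_gcs(1024, cores=4, area_mb=16, chunked=True))   # with -n4m
```

In practice the other cores are rarely fully idle, so the real gain should be smaller than this model suggests.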

I use -Iw300 to limit major GCs caused by heap exhaustion as much as possible (this is when the RTS will allocate more RAM and never release it).

One could also play with -I and use -I0.1 to increase the number of opportunities to run the major GC governed by the -Iw parameter (-I0.3 -Iw300 tells the RTS to do a major GC if the node has been idle for at least 0.3 seconds and the last one ran at least 5 minutes ago).
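
The rule described above can be written down as a small Python predicate. This models my description of the flags, not the actual RTS implementation; the function and parameter names are invented for the example.

```python
def idle_major_gc_due(idle_s, since_last_s, idle_after=0.3, min_interval=300.0):
    """Return True if an idle major GC would run under -I<idle_after> -Iw<min_interval>.

    idle_s:       how long the node has been idle, in seconds
    since_last_s: seconds since the previous major GC"""
    if idle_after == 0:
        return False  # -I0 disables the idle GC, so -Iw is irrelevant
    return idle_s >= idle_after and since_last_s >= min_interval

print(idle_major_gc_due(0.5, 400))                  # idle long enough, interval elapsed
print(idle_major_gc_due(0.5, 120))                  # last major GC too recent
print(idle_major_gc_due(0.2, 400))                  # not idle long enough for -I0.3
print(idle_major_gc_due(0.2, 400, idle_after=0.1))  # -I0.1 catches shorter idle periods
```

Lowering -I does not make major GCs more frequent than -Iw allows; it only makes it more likely that an idle moment is found once the interval has elapsed.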

I see that several people tried the -c parameter. It’s very effective at reducing memory usage, but the CPU cost is huge, and missed blocks and unresponsiveness are inevitable with this parameter (unless you have a very fast CPU).
