The Cardano Mainnet database is constantly growing. A year ago the storage space requirement was 70GB, today we are at 125GB.
Source: Cardano Blockchain Insights
To find out which compression is easily possible, with reasonable efforts (for your servers) and what is maximum possible, I worked out and tested different solutions.
Important: Do NOT try to implement the solution described here just because you want to quickly solve a space problem on your servers. This is a proof-of-concept and not directly integrated into the Cardano-node. I.e. you really need to understand what you are doing to make your node work operationally in the medium and long term.
First I looked at filesystem based compression and deduplication. for example with ZFS. This achieves first improvements, but - especially deduplication - comes with considerable additional memory requirements.
Then I looked at solutions that work as tar archive files above the actual filesystem. Something like
In short - for now as initial post - ratarmount mounts a virtual folder, where it offers the content from a (read-only) tar archive, combined with a fully read&write enabled but uncompressed write-overlay folder.
Ratarmount offers a bunch of different options and tecniques. The best results I achieved by using pixie a parallel and indexing (!) version of xz
The compression is applied on most of the files in the nodes db/immutable files.
for this in db folder I created a subfolder
db/ratar
and therein 3 more subfolders
db/ratar/archive
db/ratar/indexes
db/ratar/writeoverlay
Then as a one-time preparation step I compressed the numbered files in the existing immutable folder
note:
- cd into the existing immutable folder first, in order to create the tar archive without subfolders
- I decided to create tar files in groups for the first two diggits of the immutable file names. Theoretically it’s also possible to create just one large archive.
cd db/immutable
tar -c --use-compress-program='pixz -9 -p 6' --file=db/ratar/archive/00.tar.xz 00*.*
tar -c --use-compress-program='pixz -9 -p 6' --file=db/ratar/archive/01.tar.xz 01*.*
tar -c --use-compress-program='pixz -9 -p 6' --file=db/ratar/archive/02.tar.xz 02*.*
tar -c --use-compress-program='pixz -9 -p 6' --file=db/ratar/archive/03.tar.xz 03*.*
Then delete all the archived files, and move all the remained files (at the current chain state) all 04*.* files to the db/ratar/writeoverlay folder.
The 13290 files from 00000 to 03999.* have a file size of 101GB
The compressed tar archives shrink it down to 23.4GB
Now delete the empty immutable folder. ratarmount will create a virtual directory instead.
Also ratarmount is configured to read from and write to db/ratar/writeoverlay in parallel to the static read-only tar archives.
Now ratarmount needs to be started (ideally as a service, and as a dependency for the node service)
On its first startup it recognises the new archives and will index them, which takes a couple of minutes. Following startups are sub 1sec. The index files allow ratarmount to quickly and directly access the bytes in the large tar files.
Now the node can be started and will see db/immutable with all the required files. The node can also generate new immutable files out of his (untouched) volatile folder. ratarmount will store them in the writeoverlay folder.
This means that over time the amount of new and uncompressed files will grow. The node needs some maintainance window and your personal attention to move a bunch of new uncompressed immutable files (eg all 04*.*) into 04.tar.xz
So what is missing?
I ran this setup for 8 months now on two mainnet relays of CLIO1 pool and had 0 issues with it. No real measurable impact on performance (full NVME storage) just a ~75% and growing ratio of saved disk space. And - that’s an important cost - ratarmount will require an additional ~6GB of memory.
The db/ratar/archive folder content (the tar archives) can even be used by multiple node instances, but at a certain point this would definitively have negative and visible effects on node time budgets and operations. So more or less not recommended, depending on your HW ressources.
Last but not least, again: please don’t try this at home!