Why is my cardano-node constantly writing 100 MB/s to disk?

KingTChoka · 23 August 2021 04:28

Background: I have both my cardano-relay and cardano-block nodes running in separate Docker containers.

Before I touched anything, the disk usage looked like this:

Looks pretty normal.

Then I decided to add flock to my cronjobs on both my relay node and block nodes:

Crontab for relay-node:

* * * * * flock -n /root/grafana.lock -c 'cd /usr/share/grafana && /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini'
* * * * * flock -n /root/prometheus.lock -c 'prometheus'
* * * * * flock -n /root/prometheus-node-exporter -c 'prometheus-node-exporter'
* * * * * flock -n /root/start-relay-node.lock -c '/root/cardano-my-node/startRelayNode1.sh'
* * * * * echo "Hello world" >> /var/log/cron.log 2>&1
0 * * * *  /root/cardano-my-node/topologyUpdater.sh && echo "DONE" >> /var/log/cron.log 2>&1

Crontab for block-node:

* * * * * flock -n /root/pne.lock -c 'prometheus-node-exporter'

* * * * * flock -n /root/start-bp-node.lock -c '/root/cardano-my-node/startBlockProducingNode.sh'
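For anyone unsure what `flock -n` does in the entries above, here is a minimal sketch (not taken from the actual setup; the lock file is a throwaway temp file and `sleep 3` stands in for a long-running job). While one command holds the lock, a second `flock -n` on the same file fails immediately instead of starting a duplicate, so cron firing every minute should only start the job when it is not already running.

```shell
#!/bin/sh
# Demo of `flock -n`: a second start attempt is skipped while the first holds the lock.
# Assumes util-linux `flock` is installed; the lock file is a temp file made just for this demo.
lock=$(mktemp)

# Stands in for a long-running job that cron started a minute ago.
flock -n "$lock" -c 'sleep 3' &
sleep 1    # give the first job time to take the lock

# Stands in for cron firing again while the first job is still running.
if flock -n "$lock" -c 'echo "second copy started"'; then
    result="started"
else
    result="skipped"    # flock -n exits non-zero when the lock is already held
fi
echo "second attempt: $result"

wait            # let the first job finish and release the lock
rm -f "$lock"
```

With the first job still holding the lock, the second attempt should report `skipped`.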

Now, all of a sudden, disk usage has jumped to 100 MB/s! Why? Is this flock-related? Why are the nodes constantly writing so much data?
(The gaps in between are me stopping and starting the nodes)

Edit 1: Using iotop, I see that it's prometheus that keeps writing ~90-100 MB/s to disk. Not sure why, though…

Edit 2: Running `watch -n 2 "df -h"` to constantly watch my overall storage, I see that the storage decreases by 2 GB and then increases by 2 GB, over and over, so I'm guessing the data isn't just being written but also deleted, which is a big relief.
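If iotop isn't available in the container, a rough per-process figure can be taken from the kernel's own counters. This is only a sketch, assuming Linux and permission to read `/proc/<pid>/io`; the helper names `io_write_bytes` and `write_rate` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (Linux only): sample the cumulative write_bytes counter
# from /proc/<pid>/io twice and turn the difference into an average rate.

# io_write_bytes FILE: print the write_bytes value from a /proc/<pid>/io-style file.
io_write_bytes() {
    awk '$1 == "write_bytes:" {print $2}' "$1"
}

# write_rate PID [SECONDS]: print the average bytes per second that PID
# caused to be written to storage over the interval (default 2 seconds).
write_rate() {
    pid=$1
    secs=${2:-2}
    before=$(io_write_bytes "/proc/$pid/io") || return 1
    sleep "$secs"
    after=$(io_write_bytes "/proc/$pid/io") || return 1
    echo $(( (after - before) / secs ))
}
```

For example, `write_rate "$(pgrep -xo prometheus)" 5` would print the average bytes per second written by the oldest process named exactly `prometheus` over five seconds.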

COSDpool · 23 August 2021 05:56

This could be the consequence of you trying to start those heavy jobs once every minute. As for the 100 MB/s figure, it seems that's set as a disk bandwidth limit in your Docker container.

Current best practice is to start the node and background jobs as systemd services instead, so that the OS will detect when they have stopped and restart them automatically under all conditions (the process fails, the system reboots, etc.); e.g. the systemd service scripts here (modify to suit your system):

topologyUpdater.sh is not a background service, just a run-through script, so it’s fine as you’ve listed it… but other Cardano utilities that run continuously have been made to run more efficiently as services, e.g.:

KingTChoka · 23 August 2021 05:59

Unfortunately, systemd is not enabled or used in my Ubuntu Docker container (running `ps aux` shows that /bin/bash is PID 1), so my workaround has been to set the cron jobs to basically just run on repeat.
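For a container without systemd, one alternative to re-firing cron every minute is a small restart loop. The following is a minimal sketch only: `keep_running` is a hypothetical helper, not part of any Cardano tooling, and it assumes the wrapped script stays in the foreground until the node exits.

```shell
#!/bin/sh
# Hypothetical helper: keep_running DELAY MAX_RESTARTS CMD [ARGS...]
# Runs CMD in the foreground and starts it again DELAY seconds after each exit.
# MAX_RESTARTS=0 means "restart forever"; otherwise stop after that many runs
# and return the last exit status.
keep_running() {
    delay=$1
    max=$2
    shift 2
    count=0
    while :; do
        status=0
        "$@" || status=$?
        count=$((count + 1))
        echo "$(date '+%F %T') '$1' exited with status $status (run $count)" >&2
        if [ "$max" -gt 0 ] && [ "$count" -ge "$max" ]; then
            return "$status"
        fi
        sleep "$delay"
    done
}
```

With this, something like `keep_running 5 0 /root/cardano-my-node/startRelayNode1.sh &` would start the relay once and restart it five seconds after any exit, rather than relying on cron and flock to notice that it has stopped.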
