Many of us run the TopologyUpdater service from guild-operators, which first publishes the node’s IP and then fetches an updated topology. Up until now, I always assumed that the node somehow monitors its topology config file and reloads changes when they occur.
This does not seem to be the case: I was told that a node restart is needed to reload the updated topology config. For obvious reasons, I don’t restart the node hourly; in fact, I very rarely run docker restart relay.
How do you update your topology?
PS: Up until now, I never questioned this, because the nodes were always well connected even without an explicit restart. With Alonzo, the need to explicitly update one’s topology will likely go away, but this still puzzles me.
I restart the relays once every 12 hours, but by default topologyUpdater restarts the node once every 24 hours.
Deploying the topology updater with systemd sets up the following services:
DEPLOY THE SCRIPT
systemd service
The script can be deployed as a background service in different ways, but the recommended and easiest way, if prereqs.sh was used, is to utilize the deploy-as-systemd.sh script to set up and schedule the execution. This will deploy both push & fetch service files as well as timers for a scheduled 60 min node alive message and a cnode restart at the interval the user sets when running the deploy script.
cnode-tu-push.service : pushes a node alive message to Topology Updater API
cnode-tu-push.timer : schedules the push service to execute once every hour
cnode-tu-fetch.service : fetches a fresh topology file before cnode.service file is started/restarted
cnode-tu-restart.service : handles the restart of cardano-node (cnode.sh)
cnode-tu-restart.timer : schedules the cardano-node restart service, default every 24h
We are currently wondering whether we should build this script-based functionality into the “official” upstream docker image. Alonzo is supposed to support config reloads triggered by signals - I guess that would be the right time to add this functionality, if it is even needed by then.
When you press Ctrl+C on a foreground process, you send the SIGINT signal to that process. The same can be done with all sorts of signals using the Linux kill command.
Currently SIGINT causes the node to do a graceful shutdown, which SIGTERM does not do. The result of a non-graceful shutdown is that the node has to re-validate the entire block database, which may take 15 minutes. Therefore, never just pull the plug on your node.
With Alonzo the node will support config reload when you send it a specific signal (I don’t yet know which one) - a restart of the node won’t be necessary any more.
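For anyone wondering what "config reload on a signal" looks like in general, here is a minimal Python sketch of the pattern. Everything in it is an assumption for illustration: SIGHUP is used only because many daemons reload on it (the signal the node will use is not known here), the class ReloadableTopology and the function demo are hypothetical, and the JSON is sample data only loosely shaped like a topology file. It is not the node's implementation.

```python
import json
import os
import signal
import tempfile
import time


class ReloadableTopology:
    """Holds a parsed topology file and re-reads it when asked to."""

    def __init__(self, path):
        self.path = path
        self.reload_requested = False
        self.data = self._read()

    def _read(self):
        with open(self.path, encoding="utf-8") as handle:
            return json.load(handle)

    def request_reload(self, signum, frame):
        # Keep the handler tiny: only record that a reload was requested.
        self.reload_requested = True

    def poll(self):
        """Call from the main loop; re-reads the file if a signal arrived."""
        if self.reload_requested:
            self.reload_requested = False
            self.data = self._read()
            return True
        return False


def write_topology(path, addr):
    # Sample data only; the field names are illustrative.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"Producers": [{"addr": addr, "port": 3001}]}, handle)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "topology.json")
        write_topology(path, "relay-a.example")
        topology = ReloadableTopology(path)

        # SIGHUP is an assumed choice, not the signal cardano-node uses.
        previous = signal.signal(signal.SIGHUP, topology.request_reload)
        try:
            # Simulate the updater writing a new file, then signalling us.
            write_topology(path, "relay-b.example")
            os.kill(os.getpid(), signal.SIGHUP)

            # The "main loop": poll until the reload happened or we time out.
            deadline = time.monotonic() + 5
            while not topology.poll() and time.monotonic() < deadline:
                time.sleep(0.01)
        finally:
            restore = previous if previous is not None else signal.SIG_DFL
            signal.signal(signal.SIGHUP, restore)
        return topology.data["Producers"][0]["addr"]


if __name__ == "__main__":
    print(demo())
```

The handler only sets a flag and the file is re-read from the main loop, which keeps the work done inside the signal handler to a minimum. With something like this in place, an updater script could rewrite the file and send the signal instead of restarting the process.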
Yes. I should therefore probably wait for the signal support first (because requiring a restart is out of the question), and then (and only if Alonzo p2p gets delayed for some reason) we can build that script-based topology update into the image.
It is ridiculous that you have to restart the process simply to have it reread the topology file. This means every time the updater alters it, you have to restart the process. This sets off alerts every time (since we have it automated).