Questions on TopologyUpdater push and relay topology pull

I followed the CoinCashew guide to set up the topology updater. I have been comparing the CoinCashew directions with the topologyUpdater readme in the GuildOperator guild-operators repo, and I’d like to make sure I have things set up properly on my relay nodes.

Here is my understanding of how CoinCashew directions work.

  • They have 2 script files: topologyUpdater.sh, which publishes your relay node information to the centralized list of peers, and relay_topology_pull.sh, which fetches the peer node information to update your mainnet-topology.json. Each script calls an endpoint hosted at https://api.clio.one. I assume that IOHK maintains the API at https://api.clio.one that handles the publishing and fetching.

  • You need to run the topologyUpdater.sh script once per hour via a cron job.

  • The relay_topology_pull.sh should only be run after the topologyUpdater.sh script has run at least 4 times, but they don’t mention in the directions needing to re-run the relay_topology_pull script using a cron job.
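The hourly schedule described in the list above could be sketched roughly as follows. The NODE_HOME default and the choice of minute 33 are assumptions for illustration only; any fixed minute works, and the path should point to wherever topologyUpdater.sh actually lives on your relay.

```shell
#!/usr/bin/env bash
# Sketch: build the crontab line that runs topologyUpdater.sh once per hour.
# NODE_HOME is a hypothetical install location; adjust it to your setup.
NODE_HOME="${NODE_HOME:-$HOME/cardano-my-node}"

# $1 = minute of the hour at which the push should run (assumed default: 33).
hourly_push_entry() {
  printf '%s * * * * %s/topologyUpdater.sh\n' "${1:-33}" "$NODE_HOME"
}

# Print the entry so it can be reviewed before installing it.
hourly_push_entry 33

# To install it, append the line to the current user's crontab, for example:
#   (crontab -l 2>/dev/null; hourly_push_entry 33) | crontab -
```

Printing the entry first, rather than installing it directly, makes it easy to check the path before cron starts calling the script.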

With the GuildOperator script, they put both Topology Update push and fetch into the same script, and it looks like the default behavior is to run both push and fetch every time, once per hour. I may be wrong about that, but I’m not a shell scripting expert.

In the GuildOperator instructions they also say:

“it’s expected that you also add a scheduled restart of the relay node to pick up a fresh topology file fetched by topologyUpdater script with relays that are alive and well.”

If I were to relate this instruction to the CoinCashew instructions, I think I would need to set up a cron job to run the relay_topology_pull.sh script each day and then restart the node. However, they don’t show that in the CoinCashew instructions unless I missed it. It appears they are suggesting to just run the script manually, but then you won’t be getting updates to the peers.

Should I add another cron job to call relay_topology_pull.sh once per day and restart the relay node? If so, would I need to create a wrapper script that calls relay_topology_pull.sh and then does the systemctl restart?

Thanks

Hi,

The API at https://api.clio.one is not related to IOHK. It’s a hobby project of a member of this forum.

Correct about the hourly run. topologyUpdater.sh sends a heartbeat to the topology updater server, telling it that your relay is alive so that it can include it in the list for other pools.

relay_topology_pull.sh just populates your topology file with an up-to-date list of nodes. You can also run it manually as you see fit. Your node will still work with the old file as long as the nodes in the list are not dead.

So you have 2 approaches: run it through cron at some predefined interval, or run it manually once you see that the number of outgoing connections has fallen below some reasonable threshold.
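Both approaches can be covered by one small wrapper, sketched here under several assumptions: NODE_HOME, the cardano-node service name, and the MIN_PEERS threshold of 10 are hypothetical values, and the current number of outgoing connections is passed in as an argument because how you measure it is up to you. The sketch defaults to a dry run that only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch: refresh the topology only when outgoing peers drop below a threshold.
# NODE_HOME, SERVICE and MIN_PEERS are assumptions, not values from either guide.
NODE_HOME="${NODE_HOME:-$HOME/cardano-my-node}"
SERVICE="${SERVICE:-cardano-node}"
MIN_PEERS="${MIN_PEERS:-10}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = execute them

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# $1 = current number of outgoing connections, measured however you prefer.
refresh_if_low() {
  local peers="$1"
  if [ "$peers" -ge "$MIN_PEERS" ]; then
    echo "peers=$peers, threshold=$MIN_PEERS: nothing to do"
    return 0
  fi
  # Restart only if the pull succeeded, so a failed fetch never bounces the node.
  run "$NODE_HOME/relay_topology_pull.sh" && run sudo systemctl restart "$SERVICE"
}

# With no argument, the peer count defaults to the threshold, so nothing happens.
refresh_if_low "${1:-$MIN_PEERS}"
```

Calling the script with an argument of 0 from a daily cron job gives the unconditional pull-and-restart variant, while calling it with a measured peer count gives the threshold-based one.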

Sometime between next month and Q1 or Q2 of next year we can expect P2P functionality, which will make the topology updater unnecessary. Until then I’m using the topology updater, and many thanks to the maintainer :beers:

Wow, I’m surprised to learn that the API is managed by a member on the forum. Kudos to that person!

That makes sense about running the pull script on an ad hoc basis. I guess the guild operator instructions are just a bit more “hard-core”. That said, I don’t really like restarting the relays every day, since the restart seems to take 10+ minutes in my case, which is not so pleasant.

Thanks much for the quick answer!

See if this can help you:

Another great tip, because I was encountering the corrupt DB issue. Per CoinCashew docs, that setting is 2 secs. I bumped it up to 30, did a “sudo systemctl daemon-reload” and restarted the node. It still did some DB interrogation, but it started in a few mins. I’m assuming that must be normal.
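The post above does not name the setting, so the following is only a sketch under the assumption that it is systemd's TimeoutStopSec and that the unit is called cardano-node.service. It prints a drop-in override with the 30 second value mentioned above; the drop-in file name timeout.conf is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print a systemd drop-in that raises the stop timeout.
# Assumptions: the setting is TimeoutStopSec, the unit is cardano-node.service.
SERVICE="${SERVICE:-cardano-node}"

# $1 = timeout in seconds (default 30, the value mentioned in the post).
stop_timeout_dropin() {
  printf '[Service]\nTimeoutStopSec=%s\n' "${1:-30}"
}

# Print the override so it can be reviewed first.
stop_timeout_dropin 30

# To apply it (needs root), something along these lines:
#   sudo mkdir -p "/etc/systemd/system/$SERVICE.service.d"
#   stop_timeout_dropin 30 | sudo tee "/etc/systemd/system/$SERVICE.service.d/timeout.conf"
#   sudo systemctl daemon-reload
```

A drop-in keeps the change separate from the unit file itself, so re-running a guide's setup steps would not silently reset the value.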

There are also some restart timeouts that can be defined. You can look it up in the docs.
I came to the conclusion that it works best if I do a sudo systemctl stop ... and then a sudo systemctl start ... I never had the time to investigate whether that is actually better than a restart :)
