Why do we need the topology updater?

Hi all,

I just had this conversation with Alex in another thread and didn’t want to spam it with an off-topic question. I have a few questions regarding the topology updater.

Can someone please explain to me why it is needed? I don’t understand the reason.

I read this forum a lot before I even touched building the node and found a few mentions. From what I gathered, few people question whether they should run it; most run it without knowing what it does, because someone told them, who heard it from someone else, and so on…
I went through the script as well and couldn’t find anything in there that is needed for the node to function. I just want to understand what I am missing…

If the only purpose of the script is to generate the topology file (instead of typing it manually), you don’t need to run a service for that. Especially periodically. A topology file can be generated with a single line of bash, no need for the service.
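
To make the “generate it once” point concrete, here is a minimal sketch (in Python rather than bash, for readability). It assumes the pre-P2P static topology layout of a `Producers` list with `addr`, `port` and `valency` fields; the host names and the `build_topology` helper are made up for illustration.

```python
import json


def build_topology(peers):
    """Return a static (pre-P2P) topology document for the given peers.

    `peers` is an iterable of (addr, port, valency) tuples.
    The "Producers"/"addr"/"port"/"valency" layout is an assumption
    about the legacy topology file format.
    """
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": valency}
            for addr, port, valency in peers
        ]
    }


# Hypothetical peer list; replace with your own relays and block producer.
peers = [
    ("relay1.example.com", 3001, 1),
    ("relay2.example.com", 3001, 1),
]

topology_json = json.dumps(build_topology(peers), indent=2)
print(topology_json)
```

Writing `topology_json` to the node’s topology file once, before the node starts, is all that is needed here; nothing in it has to keep running afterwards.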

Even if you wanted to run a whole service just to generate the file, why would you need to run it periodically? The node reads its config files only once, at startup, so changing them while it is running has no effect. The topology file should be generated once at node startup and that’s it.
Running extra unneeded services just takes precious memory and resources away from your node and adds extra management burden.

Thanks for answers and comments.

I think you don’t understand what the topology updater does

Exactly. That’s why I opened the topic…

Here is the info

Alex, I read all this long ago. It doesn’t explain why it is needed. It just says “you have to do it every minute…”

It also says “since we don’t have P2P, we need a static topology file”
Which is correct. We all have static topology files.
So why do we need to run the topology updater script every 60 mins or so?

I have a static topology file (as does everyone else while there is no P2P).
There is no explanation anywhere of why the periodic script is needed.
In your answer you said “you don’t understand what it does” but you didn’t explain yourself.

  • cnode-tu-push.service : pushes a node alive message to Topology Updater API

Basically, the topology updater script’s role is to send a keep-alive message every hour; this way your node is declared online and added to the main topology table.

Optional steps are:

  • restart the node
  • fetch a new topology file, etc

This will change soon, when the P2P protocol is released.

The global topology.json holds all “active” relays. The topologyupdater is a way to make sure only live relays are in the global topology.json. To achieve this, nodes that behave healthily and send an update each hour are either added to the list or kept in it. In the same way, relays that are offline and/or don’t send this hourly message are removed from the list. This temporary solution will become obsolete when P2P goes live.

P2P will bring self-discovery. What P2P will do is remove the need for a static topology file, meaning no more topology files. The nodes will discover each other by themselves. But this is not relevant to the question.

The script does exactly 2 things and 2 things only:

  1. It pushes your node metrics like block number, valency and hostname to them
  2. Generates your topology file from custom nodes and the nodes pulled from the APIs

Don’t need a service for number 2 since it is useless without restarting the node and should be done once at node start up.
Don’t need a service for number 1 either. Your relay info is on the blockchain. Tools like Pooltool and Adapools read this info from the blockchain and check if relay is alive every now and again. That’s why my relay popped up in the topology right after pool registration without any topology updater.
So we are back to square one

Then don’t use it

Now we are getting somewhere. Thanks. So this is just to check if a relay is alive? That’s where my issue with it is.
Why do relays have to send the info?
This is done by the tools which serve the APIs for the topology.
For instance, if you pull the topology from adapools, they will always generate a file which contains ONLY live relays. It is the API provider’s job to serve a file which doesn’t contain “dead” relays. AFAIK adapools and pooltool do it perfectly. I never had connection problems with the relays I pull from them. They read the relays from the blockchain and ping them periodically to generate a topology with only live nodes.
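
As a rough illustration of that kind of check, the sketch below keeps only relays that accept a TCP connection. This is an assumption about how a provider might test liveness, not a description of what adapools or pooltool actually do, and `is_reachable` and `filter_live` are hypothetical names. A successful connect only shows that the port is open, not that the node is synced.

```python
import socket


def is_reachable(addr, port, timeout=3.0):
    """Return True if a TCP connection to addr:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def filter_live(relays, timeout=3.0):
    """Keep only the (addr, port) pairs that currently accept a TCP connection."""
    return [(addr, port) for addr, port in relays if is_reachable(addr, port, timeout)]
```

A provider (or you, before starting the node) could run `filter_live` over the relay list read from the chain and write only the survivors into the topology file.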

As far as I know, this is a temporary mechanism to keep a useful list of relays that the topologyupdater can use to create static topologies. So when you bring a new relay online, the topologyupdater will use this list to create its initial topology list.

I am guessing here, but I would say we don’t want a system that depends on too many things, like the APIs and other websites you mentioned. We have just one simple JSON file, which is updated through the topologyupdater as explained. And in my experience it works pretty well.

@ADA4Good @Ruslan_Sendecky I think a few important things are missing from the above discussion. The following is my understanding of it, correct me if I’m wrong.

The first important thing: the global topology.json (if you mean https://explorer.mainnet.cardano.org/relays/topology.json) and topology updater are two independent things!

topology.json is extracted purely from the blockchain and updated at regular intervals (and I think it also regularly checks if the listed nodes are reachable and omits them if not, not sure about this though). As a consequence, this file only contains the public relays. Pools can also have private relays that are not registered on the blockchain (this is a good thing, because they can be helpful in case of a direct attack on a specific pool; the attacker wouldn’t know that the pool is still connected to the network after successfully attacking and bringing down its public relays).

Then the need for topology updater. Until peer-to-peer is live, nodes need to have a static topology file configured. How they make it and which relays are added is entirely up to them. They could just (randomly) select some relays from the global topology.json, add the relays of some (reliable) friends or use a service to provide a list of relays. Of course your own other relays and BP should also be added.

Topology updater is one of these services. Then there’s the question of why to use a service and not just select some relays from the global file.

The first and most important reason: blocks are pulled from other nodes (and not pushed). So you need incoming connections or your blocks won’t propagate! With topology updater you list your relays and you get other relays. The service makes sure every listed relay is handed out to enough other relays, so you will get those incoming connections (with just randomly selecting from the main file this would be much less the case).

Secondly, topology updater factors in location so that everyone gets a good mix of nearby and faraway relays, so blocks and transactions will propagate fast around the world. It would be a lot more complicated (and unnecessary) to run such logic on your own.
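
To give an idea of what “a good mix of nearby and faraway relays” could look like, here is a small sketch. It is not the topology updater’s actual selection logic, which I have not studied; the region labels, the `near`/`far` counts and the `pick_mixed_peers` helper are all assumptions made for illustration.

```python
import random


def pick_mixed_peers(candidates, own_region, near=2, far=2, seed=None):
    """Pick up to `near` peers from `own_region` and up to `far` from other regions.

    `candidates` is a list of (addr, region) tuples. The nearby picks come
    first in the result, followed by the faraway picks.
    """
    rng = random.Random(seed)
    same = [c for c in candidates if c[1] == own_region]
    other = [c for c in candidates if c[1] != own_region]
    nearby = rng.sample(same, min(near, len(same)))
    faraway = rng.sample(other, min(far, len(other)))
    return nearby + faraway


# Hypothetical candidate list with made-up hosts and region codes.
candidates = [
    ("relay-a.example.com", "EU"),
    ("relay-b.example.com", "EU"),
    ("relay-c.example.com", "EU"),
    ("relay-d.example.com", "NA"),
    ("relay-e.example.com", "AS"),
    ("relay-f.example.com", "OC"),
]

selection = pick_mixed_peers(candidates, own_region="EU", near=2, far=2, seed=1)
print(selection)
```

Doing this well for the whole network also means balancing how often each relay is handed out, which is the part that is hard to do on your own.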

Also, every hour you have to send an update with your current tip to make sure only up to date relays are listed and propagated to others. After you miss a couple updates, you get delisted until you have a couple successful updates in a row again. This also makes it very easy to add a new relay and get incoming connections to it rather quickly.
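
The listing rule described above can be sketched as a small state machine per relay. The thresholds of two misses and two successes are my assumption for “a couple”, the names are hypothetical, and the tip check is left out to keep it short.

```python
from dataclasses import dataclass

# Assumed thresholds; the real service may use different values.
MISSES_TO_DELIST = 2
SUCCESSES_TO_RELIST = 2


@dataclass
class RelayStatus:
    listed: bool = False
    ok_streak: int = 0
    miss_streak: int = 0


def record_hour(status, update_received):
    """Update one relay's status after an hourly window has closed."""
    if update_received:
        status.ok_streak += 1
        status.miss_streak = 0
        if not status.listed and status.ok_streak >= SUCCESSES_TO_RELIST:
            status.listed = True
    else:
        status.miss_streak += 1
        status.ok_streak = 0
        if status.listed and status.miss_streak >= MISSES_TO_DELIST:
            status.listed = False
    return status


# A new relay: two good hours, two missed hours, then two good hours again.
status = RelayStatus()
history = []
for got_update in [True, True, False, False, True, True]:
    record_hour(status, got_update)
    history.append(status.listed)
print(history)  # [False, True, True, False, False, True]
```

With thresholds like these, a new relay is listed after two hours and a single missed update does not get it delisted.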

You can also list private relays with topology updater, because it’s independent from the main topology file. With only the main topology file, those relays wouldn’t get incoming connections unless some other relays add them manually.

Note that topology updater is a centralised part, but this is a temporary measure until peer-to-peer is live. It’s also not developed by IOG, but by someone from the community. You’re free to use it or not, and I think there’re other services too.

Hi,

Agree mostly.
First part of your post. Yes, it is just a third-party service. People choose to use it or not. I just wanted to convey to people that it is optional and not a must. Things will work perfectly fine without it. On top of that, I am also not happy with some bad practices related to the updater, like restarting nodes every 60 mins (seriously?)
People, especially the ones lacking engineering knowledge, should know and learn what those things do instead of blindly copying what others do.
People copy/paste numerous tutorials without understanding them, then they get hacked, lose their money or simply miss blocks due to misconfiguration.
And newcomers definitely shouldn’t be told: you must run it.
It should be explained what it does, and they should be able to decide for themselves whether to use it or not. That’s the crux of my point 🙂

Now, what it does is very simple. It pushes your info to a third party service and then this third party service provides that info back to everyone via APIs.
Do you have to use it? No. Are there other services that do this better? Yes.

For the second part of your post. I have to argue here. It doesn’t matter who connects to who as long as the connection is established. Whether you pull or push is irrelevant once the connection is established. The blocks will propagate. This is tested on my node.
The relays are listed without the topology updater. All relays are on the blockchain. All a topology provider has to do (any of them) is to read it off the blockchain and make sure the topology file they provide via API is up to date and contains live relays. How they do it is up to them. They can ping the relays periodically or right before injecting them in the file. Up to them.
Again we choose any provider we see fit. If I see that my provider gives me dead relays, I will choose another one. By the way, I deliberately chose not to use the region filter but that’s another story.

In any case, I am happy this is discussed. People will read it, learn and make their choice.

Indeed, I don’t use CNTools, which has one script for the hourly update and retrieval (I have not looked into whether you can run those separately with the right parameters), but use two distinct scripts for it. The one for retrieving new relays I run on average once an epoch per relay or so.

Yes, a lot of guides are written like this. And if something doesn’t work anymore because the guide is outdated and not updated yet, they don’t know what to do…

Not sure if they are better… I use topology updater, haven’t looked into other services because in a few months when peer-to-peer is live, it won’t matter anymore.

Blocks are always fetched, never pushed! Maybe you are speaking about transactions? They’re also fetched, but by outgoing connections, so it’s the other way around. With peer-to-peer however, connections will be bidirectional so an outgoing connection will also be an incoming connection (if I understand this last part right). For now, you NEED incoming connections for blocks to propagate! So it’s not only about which relays YOU put in your topology (for fetching blocks of other pools), but also about in which topologies of OTHER POOLS you are listed. With a service like topology updater, you make sure you are.
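
A toy model of that claim, taking the fetch-only behaviour described here as given: a node only receives a block from peers it has listed in its own topology, so a block spreads along incoming connections. The node names and the `reached_by_block` helper are made up for illustration, and the model ignores everything else about the real protocol.

```python
from collections import deque


def reached_by_block(topologies, producer):
    """Return the set of nodes that end up with a block minted by `producer`.

    `topologies` maps each node to the peers listed in its topology file,
    i.e. the peers it opens outgoing connections to and fetches blocks from.
    """
    # Invert the map: for each node, which nodes connect to it and fetch from it.
    fetchers = {}
    for node, peers in topologies.items():
        for peer in peers:
            fetchers.setdefault(peer, set()).add(node)

    seen = {producer}
    queue = deque([producer])
    while queue:
        holder = queue.popleft()
        for node in fetchers.get(holder, ()):
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return seen


# Nobody lists my_relay, so in this model its block goes nowhere.
topologies = {
    "my_relay": ["big_relay"],
    "big_relay": [],
    "other_relay": ["big_relay"],
}
print(sorted(reached_by_block(topologies, "my_relay")))  # ['my_relay']

# Once another pool lists my_relay, the block reaches that pool.
topologies["other_relay"].append("my_relay")
print(sorted(reached_by_block(topologies, "my_relay")))  # ['my_relay', 'other_relay']
```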

This is a false statement. Didn’t you read the part on private relays in my post? Not all relays necessarily need to be registered on the blockchain!

Yeah, I meant better for me, of course. Others are free to choose things they feel are better 🙂
I didn’t look too much into other services either. I think all of them (pooltool, adapools, etc.) can give you the topology file via their APIs. Just pull and inject. I use adapools. Works for me.

p2p just means automatic discovery. Basically, no topology file.
TBH, I haven’t studied this in great detail yet. So I’ll shut up on this one.
All I know for now is that my first block was minted with only 3 connections, and only with the IOHK round-robin relay. Only later did I add the topology splicing at startup (it is a single line really, no need for big scripts or services).
Also, my relay popped up shortly after registration and everything just worked. That’s all I can say without going too much into details of protocol technicalities as I am still reading/discovering…

But even if you are right (which you probably are), and the connection is uni-directional now, it still doesn’t matter much. I can get my peers by pulling and splicing topology into my local file. Other nodes will do exactly the same and since I am discoverable, they will find me. If they couldn’t, I wouldn’t be able to mint blocks, right?

Yeah, I read about private relays. That’s their choice to be private. The network and my node will function fine without those…

Minted AND adopted by the network? Because if it was, you would have got at least one incoming connection from another node.

No, the nodes are restarted every 24 hours by default and when you run the script you can also enter your own preferred period to restart the node.

What do you mean not now? You said it yourself that it is uni-directional now and when p2p is up, it will be bi-directional. Anyway, I get your point…

Yes, minted and adopted, obviously. Otherwise what would be the point? Judging by the rewards in my account it was well adopted 🙂
I base my conclusions on practical observation

My preference is not to restart a production service at all unless you upgrade/patch the service or fix a problem. I am not imposing this on anyone. This is what I think is right.

Misread it I guess. Deleted the comment.

Just look on a block explorer. No need to derive this from rewards. 😜