What might the failover process look like for cardano-node services?

What if there were two relay nodes on a public network, plus a relay and a block producer on the private network? We will call this private relay the “standby” block producer server. Let’s say the private relay is running simply to keep its DB synced.

Would it be technically possible to copy the block producer keys to the standby server and shut down the other block producer to force pool activities to move to the standby server?
Would the keys still function?
Would it continue to produce blocks for the pool?

Is there a better/more appropriate way to create redundancy with cardano-node services?

Many pools run failover approaches. Find mine here: Block Producer - Failover Approach with a BP Standby

Maybe not the easiest in terms of the switchover mechanism. But the architecture is similar to your description.


Yes, uploading these 3 files (op/node.cert, vrf.skey and kes/hot.skey) to the standby and updating its start script will start the private relay as a producer, and yes, it will continue to produce blocks.
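To make the "updating the start script" step concrete, here is a minimal sketch of the difference between the relay and producer invocations. It assumes the three files sit under a hypothetical key directory using the same relative paths as above; `build_node_command` and all path arguments are illustrative names, not part of any official tooling.

```python
from typing import List, Optional


def build_node_command(
    config: str,
    topology: str,
    db_path: str,
    socket_path: str,
    port: int,
    key_dir: Optional[str] = None,
) -> List[str]:
    """Build the argv for `cardano-node run`.

    With key_dir=None the node starts as a plain relay. With key_dir set,
    the three block-producer flags are appended (paths are assumptions).
    """
    cmd = [
        "cardano-node", "run",
        "--config", config,
        "--topology", topology,
        "--database-path", db_path,
        "--socket-path", socket_path,
        "--host-addr", "0.0.0.0",
        "--port", str(port),
    ]
    if key_dir is not None:
        # Same three files mentioned in the post, under a hypothetical key_dir.
        cmd += [
            "--shelley-kes-key", f"{key_dir}/kes/hot.skey",
            "--shelley-vrf-key", f"{key_dir}/vrf.skey",
            "--shelley-operational-certificate", f"{key_dir}/op/node.cert",
        ]
    return cmd


if __name__ == "__main__":
    # Standby running as a relay: no key flags.
    print(" ".join(build_node_command(
        "config.json", "topology.json", "db", "node.socket", 3001)))
    # Same node promoted to producer: key flags appended.
    print(" ".join(build_node_command(
        "config.json", "topology.json", "db", "node.socket", 3001,
        key_dir="/opt/cardano/keys")))
```

Everything else about the node (DB, port, IP) stays the same; only the three key flags change between the two modes.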

You can do it manually or automatically.
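For the automatic route, one possible trigger is a watchdog on the standby that promotes only after several consecutive failed checks against the primary. The sketch below is an assumption about how that could look, not a tested failover tool: `is_reachable`, `should_promote` and the threshold of 3 are hypothetical, and a TCP check only shows that the port answers, not that the primary is actually forging blocks. Since the idea in this thread is to shut the old producer down, make sure it stays down before the standby takes over.

```python
import socket
from typing import List


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_promote(history: List[bool], threshold: int = 3) -> bool:
    """Decide whether to promote the standby.

    history holds reachability results for the primary, newest last.
    Promote only when the last `threshold` checks all failed, so a single
    network blip does not trigger a switchover.
    """
    recent = history[-threshold:]
    return len(recent) == threshold and not any(recent)
```

A cron job or systemd timer on the standby could append the result of `is_reachable(...)` to the history on each run and restart the node with the producer keys once `should_promote` returns True.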

Cheers,

Can the relay “standby” node continue hosting on the same IP it was as a relay or does the node IP on the standby need to change to the primary block producer node IP?

Nope, the IP will not need to be changed.

The private relay should not be registered and should not run the topologyupdater script

  • connect the producer to the private relay and to the registered relays
  • connect the private relay with the registered relays only
  • if you do not connect the main producer to the registered relays, the private relay becomes a single point of failure
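The connection rules above can be sketched as two topology files. This assumes the legacy (non-P2P) `Producers` topology format with `addr`, `port` and `valency` fields; every address below is a made-up placeholder, so substitute your own.

```python
import json
from typing import Dict, List, Tuple

Peer = Tuple[str, int]  # (address, port)


def topology(peers: List[Peer]) -> Dict:
    """Build a legacy-format topology dict from a list of peers."""
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": 1} for addr, port in peers
        ]
    }


# Hypothetical addresses, for illustration only.
REGISTERED_RELAYS: List[Peer] = [("203.0.113.10", 3001), ("203.0.113.11", 3001)]
PRIVATE_RELAY: Peer = ("10.0.0.20", 3001)

# Main producer: private relay plus the registered relays, so the private
# relay is not its only path to the network.
producer_topology = topology([PRIVATE_RELAY] + REGISTERED_RELAYS)

# Private relay (standby): registered relays only.
standby_topology = topology(REGISTERED_RELAYS)

if __name__ == "__main__":
    print(json.dumps(producer_topology, indent=2))
    print(json.dumps(standby_topology, indent=2))
```

Because the standby already lists only the registered relays, its topology file does not need to change when it is promoted to producer.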

What if, for some reason, the relay is behind on synchronization versus the primary block producer at failover time? Would this be a problem? Or should the relay be 100% synced before reconfiguring it to start producing blocks?

It must be 100% synced, but if you run it as a relay then it should stay synced all the time.
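A sync check before switching over could look like the sketch below. It assumes the JSON printed by `cardano-cli query tip` contains a `syncProgress` field holding a percentage string, as recent CLI versions report; if the field is missing, the function errs on the safe side and reports "not synced". The function name is hypothetical.

```python
import json


def is_fully_synced(tip_json: str) -> bool:
    """Return True only if the tip JSON reports syncProgress of 100%.

    tip_json is the raw JSON text from `cardano-cli query tip`
    (field name assumed; treated as not synced when absent).
    """
    tip = json.loads(tip_json)
    progress = tip.get("syncProgress")
    if progress is None:
        return False
    return float(progress) >= 100.0


if __name__ == "__main__":
    sample = '{"epoch": 300, "slot": 123456, "syncProgress": "99.87"}'
    print(is_fully_synced(sample))
```

A failover script could call this first and refuse to restart the standby with the producer keys until it returns True.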

Forgive the stupid questions…
What do you mean by not registered?
What is the topologyupdater script for? Is it a requirement to run a block producer?

What would happen if the standby wasn’t 100% synced on failover?

It will not start


Do you remember that you registered the relays when you registered the pool certificate?
The topology updater is for relays only. It should run once per hour to announce the nodes to the public network.

I haven’t actually configured a pool yet, I’m still in the planning phase of implementation.

Ah ok, then when you register the pool, add only the public relays.

I think I understand the reasoning.

So I guess registering it would have an effect on the public metadata that represents the pool, right?
In other words, we don’t want the public aware of that private “standby” relay.
