From what I have read about pool operations and in how-tos from other operators, most stake pools seem to follow a topology of one block-producing node and multiple relay nodes (shown here in the official docs). Some tutorials mention having a failover block-producing node to prevent forks, but those all seem to be from the Jormungandr days.
What I’m wondering is, how would someone go about achieving high availability with cardano-node? We mostly want to check whether any node is unhealthy, either because it is on a fork or in some other failed state, and then remove that node if it is a relay, or fail over to another block-producing node if the block-producing node is the one at fault. Can a cluster have multiple block-producing nodes that are load balanced? If that isn’t possible, can we fail over to a new one (and how would that be done with cardano-node)? Can relays be added and removed dynamically?
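To make the question concrete, here is a rough Python sketch of the decision logic I have in mind. Everything in it is hypothetical: the Node record, the MAX_SLOT_LAG threshold, and the idea of treating "tip far behind the best tip seen" as a stand-in for "forked or failed" are my own assumptions, and it does not call any real cardano-node API. It only decides what should happen; how to actually perform the failover is exactly what I'm asking about.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: how many slots a node may lag behind the best
# tip seen before we call it unhealthy. Not a value from any official doc.
MAX_SLOT_LAG = 120


@dataclass
class Node:
    name: str
    role: str                 # "relay" or "producer"
    tip_slot: Optional[int]   # last slot the node reported, None if unreachable


def is_healthy(node: Node, best_slot: int) -> bool:
    """A node is healthy if it answered and its tip is close to the best tip."""
    return node.tip_slot is not None and best_slot - node.tip_slot <= MAX_SLOT_LAG


def plan_actions(nodes: List[Node], active_producer: str) -> List[str]:
    """Return the actions an operator (or automation) would need to take."""
    reported = [n.tip_slot for n in nodes if n.tip_slot is not None]
    if not reported:
        return []  # nothing reachable, so there is nothing to compare against
    best_slot = max(reported)

    actions = []
    for node in nodes:
        if is_healthy(node, best_slot):
            continue
        if node.role == "relay":
            actions.append(f"remove relay {node.name}")
        elif node.name == active_producer:
            # Look for a healthy standby producer to take over.
            standby = next(
                (s for s in nodes
                 if s.role == "producer"
                 and s.name != active_producer
                 and is_healthy(s, best_slot)),
                None,
            )
            if standby is not None:
                actions.append(f"fail over from {node.name} to {standby.name}")
            else:
                actions.append(f"alert: no healthy standby for {node.name}")
    return actions


if __name__ == "__main__":
    cluster = [
        Node("relay1", "relay", 1000),
        Node("relay2", "relay", None),    # unreachable
        Node("bp1", "producer", 500),     # far behind, possibly forked
        Node("bp2", "producer", 1000),    # standby
    ]
    for action in plan_actions(cluster, active_producer="bp1"):
        print(action)
```

Note that the sketch treats only one producer as active at a time and keeps the other as a standby, since I don't know whether running several at once is even allowed.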
I realize that the relays could just be added or removed via DNS, so I’m not really worried there. I’m just wondering how in the heck I can add some redundancy to the block-producing node. Thanks for any help!