Hypothetical (having duplicate Block producers) 1 cloud 1 home

Anti.biz · 6 March 2021 16:33

I am contemplating setting up a block producer server at home.

Is it possible to clone the configs and quickly swap from the home to the cloud if it breaks?

If I created a secondary block producer at home, what conflicts would it cause with the current functioning one if they’re both registered with the same payment/stake keys?

Johann_ADAholycs · 6 March 2021 17:23

Hi,

Yes, it is. I would run the second node not as a bp but as a passive node. If your bp is not reachable, you can restart one of your passive nodes as a bp.

Of course, all the keys and certificates have to be present on your passive node. Your relays should already include your passive node in their topology files; that makes the switch easier.

If you need more guidance, I can explain the idea in more detail.

Best,
Johann

Anti.biz · 6 March 2021 17:27

Yes, please explain the passive node in more detail. Is it just a copy of your block producer? What distinguishes a passive node from a block producer?

Alexd1985 · 6 March 2021 17:41

Start it like a relay… without certificates and keys

Johann_ADAholycs · 6 March 2021 18:11

A passive node is determined by its topology file, and it is started the same way as a relay node.
A topology file suitable for your idea is:

#passive node topology
{ "Producers": [ { "addr": "relays-new.cardano-mainnet.iohk.io", "port": 3001, "valency": 2 } ] }

Now you can run a script which tests whether your bp is online. Once your bp dies, you
simply relaunch your passive node with the start script for a block-producing node.
You should then replace its topology file with:

#block producing topology
{ "Producers": [ { "addr": "pool.adaholycs", "port": 46371, "valency": 1 } ] }

Replace pool.adaholycs and the port with your relay's address and port, and adjust the valency if your DNS name represents multiple nodes.

You may also run your passive node with the #block producing topology.
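
The "script which tests if your bp is online" is not shown in this thread, so here is only a minimal sketch of one way it could look. Everything in it is an assumption: the function names are made up, the check is a plain TCP connect to the bp's node port, and it presumes the passive node is allowed through the bp's firewall. Counting consecutive failures before promoting is meant to reduce the risk of starting a second bp while the first one is only briefly unreachable.

```shell
#!/usr/bin/env bash
# Hypothetical failover helpers; host, port and threshold are placeholders.

# Succeeds if a TCP connection to host:port opens within 5 seconds.
bp_reachable() {
  local host=$1 port=$2
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Prints the new consecutive-failure count: 0 after a successful probe,
# previous + 1 after a failed one.
next_failure_count() {
  local previous=$1 probe_status=$2
  if [[ $probe_status -eq 0 ]]; then
    echo 0
  else
    echo $((previous + 1))
  fi
}

# Succeeds once the failure count has reached the threshold.
should_promote() {
  local failures=$1 threshold=$2
  [[ $failures -ge $threshold ]]
}
```

A cron job could call bp_reachable x.x.x.x 6000 every minute, keep the count in a small state file, and only relaunch the passive node with the bp start script once should_promote succeeds. Making sure the old bp stays down is still up to you.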

if you have more questions, feel free to ask.

Best,
Johann

Anti.biz · 6 March 2021 18:19

What should my valency be set to on my current block producer and relays?

Right now in my topology for my Block producer:

{
  "Producers": [
    {
      "addr": "x.x.x.x",    (Relay 1)
      "port": 6000,
      "valency": 1
    },
    {
      "addr": "x.x.x.x",    (Relay 2)
      "port": 6000,
      "valency": 1
    }
  ]
}

On my Relay 1 topology:

{ "addr": "x.x.x.x", "port": 6000, "valency": 2 },    (Block Producer)
{ "addr": "x.x.x.x", "port": 6000, "valency": 1 },    (Relay 2)

On my Relay 2 topology:

{ "addr": "x.x.x.x", "port": 6000, "valency": 2 },    (Block Producer)
{ "addr": "x.x.x.x", "port": 6000, "valency": 1 },    (Relay 1)

Johann_ADAholycs · 6 March 2021 19:08

Hi,

your valency values look correct.
{
  "Producers": [
    {
      "addr": "x.x.x.x",    (Relay 1)
      "port": 6000,
      "valency": 1
    },
    {
      "addr": "x.x.x.x",    (Relay 2)
      "port": 6000,
      "valency": 1
    }
  ]
}
would also fit your passive node.
Your topology file is perfect. Now you have to make sure you have two different start scripts: one for the bp and one for the passive node (which is the same as for a relay).
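
To illustrate the difference between the two start scripts, here is a rough sketch, not anyone's actual setup. The directory layout under NODE_HOME, the key file names and the node_args helper are hypothetical; the flags are the usual cardano-node run options, and the only difference between the two modes is the three key/certificate flags.

```shell
#!/usr/bin/env bash
# Hypothetical paths and port; adjust to your own layout.
NODE_HOME=/opt/cardano/cnode

# Prints the cardano-node arguments for the given mode ("bp" or "passive"),
# one argument per line.
node_args() {
  local mode=$1
  local args=(
    run
    --topology "$NODE_HOME/files/topology.json"
    --database-path "$NODE_HOME/db"
    --socket-path "$NODE_HOME/sockets/node0.socket"
    --host-addr 0.0.0.0
    --port 6000
    --config "$NODE_HOME/files/config.json"
  )
  if [[ $mode == bp ]]; then
    # Only the block producer gets the KES key, VRF key and operational certificate.
    args+=(
      --shelley-kes-key "$NODE_HOME/priv/kes.skey"
      --shelley-vrf-key "$NODE_HOME/priv/vrf.skey"
      --shelley-operational-certificate "$NODE_HOME/priv/node.cert"
    )
  fi
  printf '%s\n' "${args[@]}"
}
```

A start script could then do: mapfile -t args < <(node_args passive); cardano-node "${args[@]}" and the bp variant would pass bp instead.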

Best,
Johann

Anti.biz · 6 March 2021 19:37

Yes, I think I understand. I would also need an extra cardano-node.service to automate the secondary startBlockProducer1-2.sh.

Anti.biz · 6 March 2021 19:37

So the block producer gets valency “2”?

Johann_ADAholycs · 6 March 2021 20:17

Hi,
yes this is correct.
Best,
Johann

Markus-VITAL · 9 March 2021 07:01

Hi @Anti.biz!
It seems we sometimes ask ourselves the same questions.
This was the outcome for me

Similarly to what was already explained, I also just run it as a relay, but with the config required to run it as a validator available. It is still secured like a validator (not exposed to the public, no topology updater).

Markus-VITAL · 9 March 2021 07:08

Could you give some more details about how this script works? I currently have monitoring/alerting running which verifies that the tip is current. If it gets older than 5 minutes, I trigger an alert. This could be used as a trigger for the secondary node to come up, but there is a risk that the original producer was just temporarily blocked (e.g. not updated by the other relays in the topology). This has happened a few times in the last weeks, mostly at ~10:40-10:50 CET.
Concrete questions:

How do you check whether the original BP died?
If it died, do you make sure that it is also stopped, so that it does not come up again and leave you with 2 BPs running?
Is there a reference script around?

My current setup just sends me the alert. Switching is currently manual but I’d love to automate that as well.

Anti.biz · 10 March 2021 04:57

Can you share your script that monitors the tip? And possibly explain how it would trigger the passive node to become the block producer?

Markus-VITAL · 10 March 2021 07:42

Short explanation of the script: it is executed through crontab every minute.
It sends OK pings to healthchecks.io. If the tip diff is too high, it does not send the ping.
Healthchecks.io will alert if no valid ping comes in for 5 minutes.
This way I will notice that something is wrong in any case (also when the machine is not running, has crashed, or is unable to execute the check) without exposing anything to the outside world (as would be the case when using a cloud monitoring agent).

Remarks:

The script is rather hardcoded currently, so it will require customization for you.
Also, if the Cardano config/parameters change, the calculation may become invalid, because I’m just subtracting the constant 1591566291 from the current time. It could be improved by calculating this static value from the Cardano config parameters.
Please customize the following parts of the script:
Change USERNAME to your user.
Change the “all good sending ping” area to your appropriate handler, or define a non-success area to trigger something in that case.

Script (pingTipCheck.sh):

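#!/usr/bin/env bash
# shellcheck disable=SC2034,SC2086,SC2230,SC2009,SC2206,SC2062,SC2059

# Socket of the local node that cardano-cli should query.
export CARDANO_NODE_SOCKET_PATH=/opt/cardano/cnode/sockets/node0.socket

# Slot number of the tip as reported by the local node.
customCurrentSlotNoString=$(/home/USERNAME/.cabal/bin/cardano-cli shelley query tip --mainnet | grep -Po '\"slotNo\": \K[0-9]+')
customCurrentSlotNo=$(expr $customCurrentSlotNoString + 0)

# Slot number expected right now: current Unix time minus a fixed offset.
customRefSlotNo=$(expr $(printf '%(%s)T\n' -1) - 1591566291)
# Number of slots the node is behind the expected tip.
customDiff=$(expr $customRefSlotNo - $customCurrentSlotNo)

# Ping healthchecks.io only when the node is at most 50 slots behind.
if [[ $customDiff -le 50 ]]
then
  echo "all good sending ping"
  curl -m 10 --retry 5 https://hc-ping.com/YOURPINGENDPOINT
  exit
fi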

Crontab Entry (crontab -e -u USER):

* * * * * /opt/cardano/cnode/custom/pingTipCheck.sh
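
The remark about deriving the constant 1591566291 instead of hardcoding it could be sketched roughly as follows. The four parameter values are assumptions about mainnet (Byron system start, 20-second Byron slots, 21600 slots per Byron epoch, Shelley starting at epoch 208) and are not taken from this thread, so check them against your own genesis files. The function names are made up for the example.

```shell
#!/usr/bin/env bash
# Assumed mainnet parameters; verify against your genesis files.
BYRON_START=1506203091      # Byron system start as Unix time
BYRON_SLOT_LENGTH=20        # seconds per Byron slot
BYRON_EPOCH_LENGTH=21600    # slots per Byron epoch
SHELLEY_START_EPOCH=208     # first Shelley epoch

# Prints (Unix time - slot number) for the Shelley era,
# assuming Shelley slots last 1 second.
slot_offset() {
  local byron_slots=$((SHELLEY_START_EPOCH * BYRON_EPOCH_LENGTH))
  local shelley_start=$((BYRON_START + byron_slots * BYRON_SLOT_LENGTH))
  echo $((shelley_start - byron_slots))
}

# Prints the slot number expected at the given Unix time.
expected_slot() {
  local now=$1
  echo $((now - $(slot_offset)))
}
```

With the values above, slot_offset prints 1591566291, the same constant the script subtracts, and expected_slot "$(date +%s)" could replace the customRefSlotNo line.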

Johann_ADAholycs · 10 March 2021 09:18

Hi,

I am happy to see development in this direction.

I think that if you decide to go to the data center anyway, which makes sense for the bp, then the redundancy of running a second bp is in principle overkill. I would suggest monitoring the bp in the data center as suggested by zwirny, and that’s it.

Best,
Johann

Anti.biz · 10 March 2021 09:19

What does “your appropriate handler or define a non success area” mean?

Alexd1985 · 10 March 2021 09:26

Also a question… if ICMP is filtered, is the script still valid?

Anti.biz · 10 March 2021 09:34

* * * * * /opt/cardano/cnode/custom/pingTipCheck.sh
What time should I set this to?

Markus-VITAL · 10 March 2021 10:35

I’m doing the ping to healthchecks.io. You might want to do something different. In that case you would need to change this area.
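
As one possible reading of a "non-success area", here is a small sketch that picks an explicit failure URL instead of just staying silent. It assumes you keep using healthchecks.io, whose ping URLs also accept a /fail suffix to signal a failure right away; the ping_target helper and the MAX_SLOT_DIFF name are made up for the example, and the 50-slot limit mirrors the script.

```shell
#!/usr/bin/env bash
# Placeholder endpoint and threshold, mirroring pingTipCheck.sh.
PING_URL=https://hc-ping.com/YOURPINGENDPOINT
MAX_SLOT_DIFF=50

# Prints the URL to call: the plain ping URL when the node is in sync,
# the /fail variant (an explicit failure signal) otherwise.
ping_target() {
  local diff=$1
  if [[ $diff -le $MAX_SLOT_DIFF ]]; then
    echo "$PING_URL"
  else
    echo "$PING_URL/fail"
  fi
}

# Example call with the diff computed by the script:
#   curl -m 10 --retry 5 "$(ping_target "$customDiff")"
```

The else branch is also where a different handler, such as a failover trigger, could go.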

Markus-VITAL · 10 March 2021 10:37

Yes, it is. It is just a call from the server to healthchecks.io to let it know that everything is still good. This approach is often used for monitoring completely internal servers which are not allowed to accept any incoming connections from outside networks.

No ICMP involved. Just an HTTP request to the ping URL.
