Node never cleanly shuts down

Sean_D · 26 April 2022 14:44

Hi everyone,

I’ve looked around and I can’t find any definitive information so I hope you can help.

I am running two relay nodes installed via cntools. They restart every 24 hours to update the topology files using the system service files provided.

However, the nodes never shut down cleanly. At first I thought the script wasn’t giving the node enough time, so I increased the wait time to 360 seconds. Even so, the nodes still take over an hour to come back online after each restart because they check over the ledger again.

Does anyone else have this issue? The code being executed is shown below:

ExecStop=/bin/bash -l -c "exec kill -2 $(ps -ef | grep /home/ec2-user/.cabal/bin/cardano-node.*./opt/cardano/cnode/ | tr -s ' ' | cut -d ' ' -f2) &>/dev/null"
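One thing worth checking with an ExecStop line like this is which PIDs the ps | grep pipeline actually resolves to. A plain pattern can also match the grep process itself (and the shell running the command), because their command lines contain the pattern text. Below is a minimal sketch, assuming bash; pids_matching is a hypothetical helper name, and the [c] bracket trick is a common shell idiom, not something confirmed as the fix in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PID column (field 2 of `ps -ef` output)
# for every line on stdin that matches the given grep pattern.
# Reading from stdin makes it easy to try against saved `ps -ef` output.
pids_matching() {
  local pattern="$1"
  grep -- "$pattern" | tr -s ' ' | cut -d ' ' -f2
}

# Example use (prints PIDs only, sends no signal):
#   ps -ef | pids_matching 'cardano-node'      # may also list the grep process
#   ps -ef | pids_matching '[c]ardano-node'    # the bracket keeps grep from matching itself
```

If the first form prints more PIDs than the second, the extra ones are most likely the grep and shell processes, not the node.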

The nodes restart 2 hours apart from each other, so my BP always has at least one relay, but from what I understand the node should only take about 10 minutes to start, not over an hour.

Nodes are version 1.34.1

Thank you.

Alexd1985 · 26 April 2022 16:03

What is the relays’ hardware configuration?

Sean_D · 26 April 2022 16:16

16GB RAM, 4vCPUs - the xlarge AWS EC2 instances.

Alexd1985 · 26 April 2022 16:18

You can try editing the cnode.sh script: uncomment CPU_CORES and set it to 3, then restart and check again.

#CPU_CORES=2 # Number of CPU cores cardano-node process has access to (please don't set higher than physical core count, 2-4 recommended)

Sean_D · 26 April 2022 16:19

I have already uncommented it and set the value to 4, as it has 4 vCPUs. Do you think that is the issue?

Alexd1985 · 26 April 2022 16:28

Try with 3 vCPUs.
If it still doesn’t work (in which case it is probably related to CPU power), set the relays to restart less often; they don’t have to be restarted every day.

Sean_D · 26 April 2022 20:39

I’ve changed it to 3 vCPUs, manually stopped the Cardano service and started it again.

Both nodes were back up within 6 minutes.

I’ll see what happens when they restart automatically tomorrow and report back.

Out of curiosity, why would changing it to 3 vCPUs make a difference?

Alexd1985 · 26 April 2022 20:41

Tbh I don’t know… perhaps it is not recommended to use all available vCPU

Sean_D · 27 April 2022 14:41

The system service restarted the node today at the specified time. Unfortunately, it did not shut down cleanly, and now it is checking over the ledger again.

ParadoxicalSphere · 28 April 2022 13:58

Hi Sean_D,

How about implementing a simpler .service file, such as the example in the CoinCashew guide “Creating Startup Scripts”, and then troubleshooting from there?

[CHG]

Sean_D · 29 April 2022 09:33

Okay, so yesterday BOTH nodes restarted cleanly on their own. I will monitor today as well to make sure it’s not a fluke. They’ve never done this before, but interestingly I did manually stop and restart each of them before the automated script ran, so maybe that cleared some files or something?

No idea, but yesterday they were fine. I will report back on whether they are fine today as well.

Sean_D · 30 April 2022 14:08

UPDATE:

The nodes have restarted cleanly on every scheduled restart since I changed the ExecStop line in my cnode.service to:

ExecStop=/bin/bash -l -c "exec killall -2 cardano-node"

Thanks for all your help guys.
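The systemd documentation notes that ExecStop commands should be synchronous: once they return, systemd treats the service as stopped and terminates any remaining processes. Both kill -2 and killall -2 return as soon as SIGINT has been sent. One possible variation is therefore to wait for the node to exit before the stop command finishes. This is a minimal sketch under stated assumptions, not something tested in this thread: stop_and_wait is a hypothetical helper name, and the 360-second default simply mirrors the timeout mentioned earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send SIGINT to a PID, then wait up to <timeout>
# seconds for it to exit.
# Returns 0 if the process is gone, 1 if it is still running at the timeout.
stop_and_wait() {
  local pid="$1" timeout="${2:-360}" waited=0
  # Nothing to do if the process does not exist (or cannot be signalled by us).
  if ! kill -0 "$pid" 2>/dev/null; then
    return 0
  fi
  kill -INT "$pid" 2>/dev/null
  # Poll once per second until the process disappears or the timeout is hit.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example use, assuming exactly one cardano-node process is running:
#   stop_and_wait "$(pgrep -x cardano-node)" 360
```

A return value of 1 means the node was still running when the timeout expired, which can help distinguish a slow shutdown from a signal that never arrived.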
