Hi everyone,
I’ve looked around and I can’t find any definitive information so I hope you can help.
I am running two relay nodes installed via cntools. They restart every 24 hours to update the topology files using the system service files provided.
However, the nodes never shut down cleanly. At first I thought the script wasn’t giving the node enough time, so I increased the stop timeout to 360 seconds. The nodes still take over an hour to come back online after each restart because they check over the ledger again.
Does anyone else have this issue? The code being executed is shown below:
ExecStop=/bin/bash -l -c "exec kill -2 $(ps -ef | grep /home/ec2-user/.cabal/bin/cardano-node.*./opt/cardano/cnode/ | tr -s ' ' | cut -d ' ' -f2) &>/dev/null"
The nodes restart 2 hours apart from each other, so my BP always has at least one relay, but from what I understand the node should only take about 10 minutes to start, not over an hour.
Nodes are version 1.34.1
Thank you.
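For anyone debugging a similar ExecStop line, it can help to see what the pipeline actually hands to kill. The sketch below runs the same grep / tr / cut steps on made-up `ps -ef` text (the PIDs, times and node arguments are invented for illustration, and `extract_pids` / `extract_pids_no_grep` are hypothetical helper names). It shows that the pattern can also match the grep process itself, so kill may receive more than one PID. Whether that is related to the unclean shutdowns reported here is not established.

```shell
# Made-up `ps -ef` output for illustration only; not taken from a real host.
sample_ps='ec2-user  1201     1 35 09:00 ?  02:10:44 /home/ec2-user/.cabal/bin/cardano-node run --config /opt/cardano/cnode/files/config.json
ec2-user  4410  4409  0 11:00 ?  00:00:00 grep /home/ec2-user/.cabal/bin/cardano-node.*./opt/cardano/cnode/'

# Same extraction steps as the ExecStop line: grep, squeeze spaces, take field 2.
extract_pids() {
  grep '/home/ec2-user/.cabal/bin/cardano-node.*./opt/cardano/cnode/' \
    | tr -s ' ' | cut -d ' ' -f2
}

# Variant that drops lines belonging to a grep process before extracting.
extract_pids_no_grep() {
  grep '/home/ec2-user/.cabal/bin/cardano-node.*./opt/cardano/cnode/' \
    | grep -v ' grep ' | tr -s ' ' | cut -d ' ' -f2
}

echo "PIDs from the original pipeline:"
printf '%s\n' "$sample_ps" | extract_pids

echo "PIDs with the grep line filtered out:"
printf '%s\n' "$sample_ps" | extract_pids_no_grep
```

On a live system, replacing the sample text with real `ps -ef` output shows which PIDs the stop command would target.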
What is the relays’ hardware configuration?
16GB RAM, 4vCPUs - the xlarge AWS EC2 instances.
You can try editing the cnode.sh script: uncomment CPU_CORES and set it to 3, then restart and check again.
#CPU_CORES=2 # Number of CPU cores cardano-node process has access to (please don't set higher than physical core count, 2-4 recommended)
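A minimal sketch of that edit done from the shell, assuming the setting sits on a single line in the form quoted above. It works on a temporary stand-in file rather than a real cnode.sh, and `set_cpu_cores` is a hypothetical helper name, not part of cntools.

```shell
# Stand-in for cnode.sh so the sketch does not touch a real install.
demo_file=$(mktemp)
printf '%s\n' "#CPU_CORES=2 # Number of CPU cores cardano-node process has access to" > "$demo_file"

# Uncomment CPU_CORES (if commented) and set it to the given value.
# A copy of the original file is kept next to it with a .bak suffix.
set_cpu_cores() {
  sed -i.bak -E "s/^#?CPU_CORES=[0-9]+/CPU_CORES=$2/" "$1"
}

set_cpu_cores "$demo_file" 3

# Show the resulting line.
grep '^CPU_CORES=' "$demo_file"
```

Pointing `set_cpu_cores` at the real script path would apply the same change there; the node still needs a restart afterwards.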
I have already uncommented it and set the value to 4, as the instance has 4 vCPUs. Do you think that is the issue, then?
Try with 3 vCPUs.
If it still doesn’t work (which would probably be related to CPU power), set the relays to restart less often (they don’t have to be restarted every day).
I’ve changed it to 3 vcpu, manually stopped the Cardano service and started it again.
Both nodes were back up within 6 minutes.
I’ll see what happens when they restart automatically tomorrow and report back.
Out of curiosity why would changing it to 3 vcpu make a difference?
Tbh I don’t know… perhaps it is not recommended to use all available vCPUs.
The system service restarted the node today at the specified time. Unfortunately, it did not shut down cleanly and it is now checking over the ledger again.
Hi Sean_D,
How about implementing a simpler .service file, such as the example at Creating Startup Scripts - CoinCashew, and then troubleshooting from there?
[CHG]
Okay, so yesterday BOTH nodes restarted cleanly on their own. I will monitor today as well to make sure it’s not a fluke. They’ve never done this before, but interestingly I did manually stop and restart each of them before the automated script ran, so maybe that cleared some files or something?
No idea, but yesterday they were fine. I will report back on whether they are fine today as well.
UPDATE:
The nodes now shut down and restart cleanly on their scheduled restarts since I changed the ExecStop line in my cnode.service to:
ExecStop=/bin/bash -l -c "exec killall -2 cardano-node"
Thanks for all your help guys.