Relay node suddenly went down after 3 months

Hi,
My relay node has been running for 3 months. Today I found it seems to have gone down: 0 blocks, 0 slots, everything is 0. I restarted the node and still see the same thing (picture below).

Could you please share what the cause might be, if there is any clue, and suggest how I should recover from it?

image

journalctl -e -f -u cardano-node

or

journalctl -e -f -u cnode

I ran the command and it seems to have restarted the node, but it still seems stuck at the “starting…” phase.
image

It works! Thanks for your help!

@Alexd1985,
It worked at first after the restart, but it seems to be back in the “starting…” phase again. What could cause it to fail so soon, in less than a couple of hours?
Should I restart again, or is there some other known issue?

Looks like it ran out of disk space? I checked with df -k . and it is at 98%. How should I free up more space?

Delete the log files
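
A minimal sketch of what that could look like, assuming the node writes its logs to a directory such as `$NODE_HOME/logs` (a hypothetical path; check where your own setup actually logs). The helper name `prune_logs` is made up for illustration and is not part of any Cardano tooling. It only removes regular files that have not been modified for more than the given number of days, so the most recent logs stay available for debugging.

```shell
#!/usr/bin/env bash
# prune_logs DIR [DAYS]: delete regular files under DIR (recursively) that were
# last modified more than DAYS days ago. DAYS defaults to 7.
prune_logs() {
    local dir="$1"
    local days="${2:-7}"
    # Refuse to run if the directory does not exist.
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # Print each file before deleting it, so there is a record of what went.
    find "$dir" -type f -mtime +"$days" -print -delete
}

# Example (the path is an assumption, adjust it to your setup):
# prune_logs "$NODE_HOME/logs" 7
```

Run `df -k .` again afterwards to confirm that the usage actually dropped.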

When I run the topologyUpdater.sh script, the message shows my node is out of sync:
", “clientIp”: “x.x.76.39”, “msg”: “blockNo 2606181 seems out of sync. please retry” }

I retried a couple of times and still get the same result. What should I do?
This is my 2nd relay node. The first relay node seems okay.
This is the 2nd node's gLiveView:
image

Delete the db, restart, and let it resync. It once happened to one of my relays as well.
You might want to change the topology file and remove the extra outgoing connections, because they slow down the sync process.

Do you mean, on the relay node, delete the DB and change mainnet.topology.json?
BTW, this is my 2nd relay node.

Yep, exactly

There are 3 directories inside the DB. Will it break if I delete the DB?
Do you mean cardano-my-node/db?

Yes, this is the directory. If you delete the files in there and let the node resync, it will recreate them.
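
As a rough sketch of that sequence, assuming the node runs as a systemd unit called `cardano-node` (it may be `cnode` on your machine, as in the journalctl commands earlier in the thread) and that the database lives in `cardano-my-node/db` under your home directory. The helper `clear_db_dir` is a hypothetical name used only for this example. Stop the node before clearing the directory, and expect the resync to take a while.

```shell
#!/usr/bin/env bash
# clear_db_dir DIR: remove everything inside DIR but keep DIR itself.
clear_db_dir() {
    local dir="$1"
    # Refuse to run on an empty argument or a missing directory.
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "refusing to clear: '$dir' is not a directory" >&2
        return 1
    fi
    # -mindepth 1 skips DIR itself; -delete removes contents depth-first.
    find "$dir" -mindepth 1 -delete
}

# Example sequence (unit name and path are assumptions, adjust to your setup):
# sudo systemctl stop cardano-node
# clear_db_dir "$HOME/cardano-my-node/db"
# sudo systemctl start cardano-node
```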

Use df -h to check disk space. There are some easy bash options to check which directories are using the space. As Alex mentions above, it is almost certainly the log files. You can delete everything in the log directory, restart the node, and you will be fine.
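
One such option, as a small sketch: `du` reports how much space each directory uses, and sorting its output shows the biggest consumers first. The function name `biggest_dirs` is made up for this example, and the path in the usage line is an assumption about where the node lives.

```shell
#!/usr/bin/env bash
# biggest_dirs DIR [N]: print the N largest entries from a one-level-deep
# disk usage scan of DIR (default: current directory, 10 lines), largest
# first, with sizes in kilobytes. The total for DIR itself is included,
# so it normally appears as the first line.
biggest_dirs() {
    local dir="${1:-.}"
    local n="${2:-10}"
    # 2>/dev/null hides "permission denied" noise from unreadable directories.
    du -k -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$n"
}

# Example (the path is an assumption, adjust it to your setup):
# biggest_dirs "$HOME/cardano-my-node" 5
```

Running it on the node directory and then on whichever subdirectory comes out on top narrows things down quickly.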

When you ran out of disk space you probably broke the db as well, given it is now re-syncing.
