On the old version the node was syncing fine. Only after I updated to 1.34.1 did it get stuck on starting… it’s been doing that for about 24 hours…
Share the glive output.
cntools or coincashew?
type sudo systemctl status cnode
or sudo systemctl status cardano-node
and journalctl -e -f -u cnode
or journalctl -e -f -u cardano-node
Do you see any errors?
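A minimal sketch of the two checks above, so the right unit name gets picked for each setup. The mapping is an assumption read off this thread (cntools uses `cnode`, coincashew uses `cardano-node`); the function names `unit_for_setup` and `diagnostic_commands` are made up for illustration, and the script only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Map the setup guide to the systemd unit name mentioned in this thread.
# Assumption: cntools -> "cnode", coincashew -> "cardano-node".
unit_for_setup() {
  case "$1" in
    cntools)    echo "cnode" ;;
    coincashew) echo "cardano-node" ;;
    *)          echo "unknown setup: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the status and log commands for that unit.
diagnostic_commands() {
  unit="$(unit_for_setup "$1")" || return 1
  echo "sudo systemctl status ${unit}"
  echo "journalctl -e -f -u ${unit}"
}
```

For example, `diagnostic_commands cntools` prints the two `cnode` commands to copy and paste.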
Cheers,
cntools
here is the glive output:
systemctl status cnode:
journalctl -e -f -u cnode
I don’t think i see any errors
thanks, @Alexd1985
OK, now go to the db folder and rename ledger, immutable and volatile:
mv immutable immutable_bkp
mv ledger ledger_db
mv volatile volatile_db
restart the node and check glive, it should start to sync again
then stop the node and go back to db folder
you should now see the newly created folders next to the renamed backups
delete the new folders (NOT the backups): ledger, immutable and volatile
rm -R ledger
rm -R immutable
rm -R volatile
now rename the bkp files
mv ledger_db ledger
mv immutable_bkp immutable
mv volatile_db volatile
restart the node
Check glive again
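The rename, resync, and restore steps above could be wrapped in two small functions, roughly like this. This is only a sketch: `set_db_aside` and `restore_db` are hypothetical names, the db folder path is passed in by the caller, and the backup names follow the rename-back commands in the post (`immutable_bkp`, `ledger_db`, `volatile_db`). Stopping and restarting the node between the two calls is still a manual step.

```shell
#!/usr/bin/env bash
# Step 1: move the existing chain data aside so the node starts a fresh sync.
set_db_aside() {
  db="$1"
  mv "$db/immutable" "$db/immutable_bkp" &&
  mv "$db/ledger"    "$db/ledger_db" &&
  mv "$db/volatile"  "$db/volatile_db"
}

# Step 2 (node stopped): drop the freshly created folders, restore the originals.
restore_db() {
  db="$1"
  # Refuse to delete anything unless all three backups are present.
  for d in immutable_bkp ledger_db volatile_db; do
    [ -d "$db/$d" ] || { echo "missing backup: $db/$d" >&2; return 1; }
  done
  rm -rf "$db/ledger" "$db/immutable" "$db/volatile"
  mv "$db/ledger_db"     "$db/ledger" &&
  mv "$db/immutable_bkp" "$db/immutable" &&
  mv "$db/volatile_db"   "$db/volatile"
}
```

The check at the top of `restore_db` is there so that a typo in a backup name cannot end with both the new and the old data gone.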
So I followed those steps, and it was syncing once I renamed the folders inside the /db/ folder,
but once I renamed the old folders back it got stuck on starting again. It’s been 3 hours already.
Could I copy the /db/ folder from a node that is working on the same version - would that help?
Then perhaps it needs more time to start. From which version did you upgrade to 1.34?
I had a similar issue, and after hours of debugging without finding any reasonable cause I also opted to copy the /db/ from another node running 1.34.1 and in sync.
- First, on the healthy, synced node that I wanted to take the db from, I stopped the node so that the db would not get corrupted while copying.
- Then, on the problematic node, I ran:
rm -rf db/
mkdir db && cd db
rsync -chavzP -e "ssh -p 22" <your-ssh-user>@<node-ip-or-host>:/full/path/to/db/ .
- Then I started both nodes, and within a few minutes both were syncing again.
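To avoid mistyping the rsync line, a small helper can assemble it from its parts. This is a sketch only: `db_rsync_command` is a hypothetical name, all four arguments (user, host, SSH port, remote db path) are placeholders supplied by whoever runs it, and the function prints the command instead of executing it.

```shell
#!/usr/bin/env bash
# Print the rsync command from the steps above.
# Arguments: ssh user, host or IP, ssh port, full remote path to the db folder.
db_rsync_command() {
  # Strip any trailing slash, then add exactly one: with a trailing slash
  # rsync copies the folder's contents rather than the folder itself.
  printf 'rsync -chavzP -e "ssh -p %s" %s@%s:%s/ .\n' "$3" "$1" "$2" "${4%/}"
}
```

The printed command is meant to be run from inside the freshly created, empty db folder on the problematic node, with the source node stopped, as in the steps above.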
I was upgrading from 1.33. I’ve given it 20+ hours before; I’ll leave it overnight again
thank you
i am going to give this a try and keep you posted - thank you
If you type top, do you see the CPU at ~100%?
type free -m
or df -h
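For the disk part of these checks, something like the following gives a quick yes/no instead of reading the `df -h` table by eye. It is a rough sketch: `disk_use_percent` and `check_disk` are hypothetical names, and the default threshold of 90% is an arbitrary choice, not a documented limit.

```shell
#!/usr/bin/env bash
# Percentage of space used on the filesystem that holds the given path.
# df -P prints one line per filesystem; field 5 is the "Capacity" column.
disk_use_percent() {
  df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}

# Warn when usage is at or above a threshold (default 90, chosen arbitrarily).
check_disk() {
  used="$(disk_use_percent "$1")"
  limit="${2:-90}"
  if [ "$used" -ge "$limit" ]; then
    echo "WARN: ${used}% used on the filesystem of $1"
    return 1
  fi
  echo "OK: ${used}% used on the filesystem of $1"
}
```

Pointing `check_disk` at the node's db folder shows whether the resync is about to run out of space.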
Hey,
thank you, this resolved my issue. I copied the db files from my working node and it’s working normally now
thank you for this!
Hi,
I am having a similar issue.
The node syncs fine until epoch 200ish, but then the CPU starts running at 200% and the synchronisation slows down massively.
I let it run for more than 72 hours the first time, then I stopped the node, deleted the db and restarted the synchronisation from the beginning, but here I am 36 hours later with the same issue.
Unfortunately, I don’t have a full db that I can copy over from another source.
I checked the log and I can see a few of the following errors:
[vmi60218:cardano.node.ChainDB:Notice:36] [2022-07-14 08:27:10.28 UTC] Chain extended, new tip: 7f0d5fe98c7d1857793fac2e6c33802c8960c99132b9dfc4525f5e5ee9c9b3eb at slot 8765028
[vmi60218:cardano.node.ErrorPolicy:Warning:79] [2022-07-14 08:27:10.56 UTC] IP 99.18.45.153:23812 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError (MuxIOException writev: resource vanished (Connection reset by peer)) "(sendAll errored)"))) 20s 20s
[vmi60218:cardano.node.ErrorPolicy:Warning:79] [2022-07-14 08:27:11.10 UTC] IP 45.32.187.141:33251 ErrorPolicySuspendPeer (Just (ApplicationExceptionTrace (MuxError (MuxIOException writev: resource vanished (Connection reset by peer)) "(sendAll errored)"))) 20s 20s
[vmi60218:cardano.node.IpSubscription:Error:982035] [2022-07-14 08:27:12.07 UTC] IPs: 0.0.0.0:0 [199.247.26.225:4002,94.104.125.117:3001,65.109.3.248:6000,161.97.84.25:6000,89.47.161.139:6006,15.222.244.219:3001,178.128.232.212:3001,3.20.137.144:3001,144.126.158.237:22378,164.90.189.93:6000,66.94.103.66:6000,118.140.168.102:3001,186.32.202.127:3003] Application Exception: 65.109.3.248:6000 ExceededTimeLimit (KeepAlive) (ServerAgency TokServer)
[vmi60218:cardano.node.IpSubscription:Info:982035] [2022-07-14 08:27:12.07 UTC] IPs: 0.0.0.0:0 [199.247.26.225:4002,94.104.125.117:3001,65.109.3.248:6000,161.97.84.25:6000,89.47.161.139:6006,15.222.244.219:3001,178.128.232.212:3001,3.20.137.144:3001,144.126.158.237:22378,164.90.189.93:6000,66.94.103.66:6000,118.140.168.102:3001,186.32.202.127:3003] Closed socket to 65.109.3.248:6000
[vmi60218:cardano.node.ErrorPolicy:Notice:52] [2022-07-14 08:27:12.07 UTC] IP 65.109.3.248:6000 ErrorPolicySuspendConsumer (Just (ApplicationExceptionTrace ExceededTimeLimit (KeepAlive) (ServerAgency TokServer))) 20s
The server specs are fine (6 cores, 16 GB RAM) and the nodes have been running for more than 18 months.
I managed to mint my first block about 1 month ago but after upgrading from 1.33 to 1.34.1 the node stopped working.
gLiveView stats seem to be as usual and the Grafana dashboard doesn’t flag anything anomalous.
At this point I am running out of options.
Does anyone have an idea of what is happening?
Thanks
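When scanning a long log for lines like the ones quoted in this post, a per-severity count can help show whether the errors are a handful or a flood. A minimal sketch, assuming every log line starts with `[host:logger.name:Severity:thread]` as in the excerpt; `count_severities` is a hypothetical helper name, and it only counts whatever severity labels appear without interpreting them.

```shell
#!/usr/bin/env bash
# Count log lines per severity label.
# With ":" as the field separator, a line such as
#   [host:cardano.node.ChainDB:Notice:36] [2022-07-14 ...] ...
# has the severity in field 3. Lines not starting with "[" are skipped.
count_severities() {
  awk -F: '/^\[/ { n[$3]++ } END { for (s in n) print s, n[s] }' "$@" | sort
}
```

It reads from the files given as arguments, or from standard input when none are given, so a saved log in this text format can be piped straight in.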
Probably the db needs to resync and it will take a few more days… if you have another node already synced, you can download the db from it
Hi @Alexd1985 Thanks for the quick reply. I guess the only thing left to do is to wait a few days then.
Unfortunately, I don’t have a full db anywhere after I deleted all the copies a few days back.
doh!
Then … if you deleted the db… it will take a few more days to download it again
Hi,
after literally weeks of waiting for my node to synchronise, I stumbled across this comment, which explained the cause of my issue
I hope this will help