Greetings all,
I hope this finds you well. Exciting times with the release of 1.23.0, and I wish everyone the best! This topic is for those who, like me, have run (or may run) into issues with their nodes following a system reboot.
I run cardano-node as a systemd service, and have never had issues starting/stopping or restarting the software. This ease-of-use also applied to the occasional system reboot. Until Wednesday.
On reboot, I got the relay up and running in no time, as usual. Seconds later it dropped all of its outgoing connections and was booted by all of its incoming peers. At first glance, the messages in the log pointed to a time sync issue, as in the entry below (note the InFutureExceedsClockSkew reason):
{"at":"2020-11-27T01:23:55.33Z","env":"1.21.1:9577e","ns":["cardano.node.ChainDB"],"data":{"kind":"TraceAddBlockEvent.IgnoreInvalidBlock","block":{"hash":"5956c15","kind":"Point","slot":14873950},"reason":"InFutureExceedsClockSkew (RealPoint (SlotNo 14873950) 5956c15ab55ef0f89d6d43d4e6e03a328ee158f28168471f15320dcceb6ae5c0)"},"app":[],"msg":"","pid":"696","loc":null,"host":"ip-172-3","sev":"Info","thread":"39"}
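For anyone wanting to gauge how far off the clock was, here is a rough Python sketch that compares the slot in such an entry with the timestamp the node logged. It assumes mainnet, and the two Shelley constants are my own assumption rather than anything taken from the log; the function name is hypothetical. The "at" field is only when the event was logged, so treat the result as an estimate.

```python
import json
from datetime import datetime, timezone

# Assumed mainnet constants (not from the post): the Shelley era began at
# slot 4492800, Unix time 1596059091, with one-second slots from then on.
SHELLEY_START_SLOT = 4_492_800
SHELLEY_START_UNIX = 1_596_059_091


def slot_lead_seconds(log_line: str) -> float:
    """Seconds by which the block's slot starts after the logged local time."""
    event = json.loads(log_line)
    # The "at" field looks like 2020-11-27T01:23:55.33Z (UTC).
    logged_at = datetime.strptime(
        event["at"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    slot = event["data"]["block"]["slot"]
    slot_time = SHELLEY_START_UNIX + (slot - SHELLEY_START_SLOT)
    return slot_time - logged_at.timestamp()


# Trimmed-down copy of the entry above, keeping only the fields used here.
line = '{"at":"2020-11-27T01:23:55.33Z","data":{"block":{"slot":14873950}}}'
print(round(slot_lead_seconds(line), 2))  # about 5.67
```

Under those assumptions the block's slot starts roughly 5.7 seconds after the time my node thought it was, which is consistent with the local clock running behind after the reboot.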
I have since sent a ticket to IOHK, and am waiting for an answer. In the meantime, I have found two things that may be of use to anyone experiencing similar issues:
- the problem can be solved by resetting the database (not ideal, since resyncing costs hours), but it will happen again on the next system reboot;
- the issue does not occur if your machine (or VM) is stopped and started (rather than rebooted).
After scratching my head for a while and fearing the worst, I resolved to install chrony. This solved the problem, and I have since been able to reboot the system repeatedly without it recurring.
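If you want to confirm that chrony is actually holding the clock in line after a reboot, a small Python sketch like the one below may help. It reads the "System time" line that chronyc tracking prints; the helper names are hypothetical and the sample line is illustrative, not output from my machine.

```python
import re
import subprocess


def parse_system_offset(tracking_output: str) -> float:
    """Return the clock offset in seconds: negative if slow, positive if fast."""
    match = re.search(
        r"System time\s*:\s*([\d.]+) seconds (slow|fast) of NTP time",
        tracking_output,
    )
    if match is None:
        raise ValueError("no 'System time' line found")
    magnitude = float(match.group(1))
    return -magnitude if match.group(2) == "slow" else magnitude


def current_offset() -> float:
    # Calls the chronyc client; chrony must be installed and running.
    result = subprocess.run(
        ["chronyc", "tracking"], capture_output=True, text=True, check=True
    )
    return parse_system_offset(result.stdout)


# Illustrative sample line (an assumption, not taken from the post).
sample = "System time     : 0.000020390 seconds slow of NTP time"
print(parse_system_offset(sample))
```

Calling current_offset() shortly after a reboot, before starting the node, should show whether the clock has settled to within a fraction of a second.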
One last comment: if this is happening to you on a cloud service, please follow your provider's directions for time sync.
I welcome any comments or experiences. Once I have information from IOHK, I will post it here.
Cheers,
Adrem [RABIT]