1.18.1 - Nodes don't sync & no peers; 1.18.0 - all works fine

Hi,

At the boundary between epochs 210 and 211, my nodes running 1.18.1 stopped connecting to peers. They were working perfectly fine before that.

I’ve checked and rechecked topology, firewall settings, etc.

All my nodes running on 1.18.0 are running just fine.

Has anyone experienced something similar? Any thoughts on where to dig further?

Cheers,
Frank.

I think it has to do with the announcement to downgrade to 1.18.0 due to a bug found in 1.18.1. All operators had to downgrade/roll back.

:man_facepalming: :man_facepalming: :man_facepalming: :man_facepalming: :man_facepalming: :man_facepalming: :man_facepalming: I see. I should have kept my ear to the ground. Where was it announced?

This was announced in all the official (and many unofficial) real-time support groups.

I strongly recommend engaging with these groups on Telegram if you are running a stake pool.


I have to say they announced it quite late: only 4 hours or so before the epoch ended and the new one started. Not really nice (or professional), to be honest. Hopefully they don’t take this lightly and will get better at communication, since we all work in various parts of the world across different timezones. This is the first stage, and I really hope they improve in stability and communication. It’s not testnet anymore :wink:


We were running core & relays on 1.18.1 until 5 minutes ago (when we got the back-rev request through the last IOHK marketing announcement), and all of them ran straight through the epoch boundary with no problems and no abnormalities that I could tell. On general principles, I think anyone who upgraded should downgrade ASAP.

I find it appalling that the software release channel may be on Github while we still remain obligated to chat on Telegram in order to hear these release announcements through some kind of informal grapevine. :stuck_out_tongue_winking_eye:

This wasn’t a matter of communication. It was a newly discovered edge case (:poop: happens) and we needed to adapt quickly.

Running a pool operation can require immediate action in some cases, but a good strategy to implement in the future would be to let others upgrade first and let those nodes run for about a week before following suit, so there is enough time for feedback. This was an unusual circumstance, given that operators were eager to get away from the rough epoch boundaries we’re currently seeing.
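For anyone who wants to automate the "wait about a week" rule, here is a minimal sketch. It assumes you already know when a release was published (for example, from the release page); the function name, the 7-day constant, and the example date are all made up for illustration and are not part of any Cardano tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "about a week" of soak time, as suggested above.
SOAK_PERIOD = timedelta(days=7)


def is_safe_to_upgrade(published_at, now, soak=SOAK_PERIOD):
    """Return True once a release has been public for at least `soak`."""
    return now - published_at >= soak


# Hypothetical publish time, purely for illustration.
released = datetime(2020, 8, 1, 12, 0, tzinfo=timezone.utc)

print(is_safe_to_upgrade(released, released + timedelta(days=3)))  # False
print(is_safe_to_upgrade(released, released + timedelta(days=8)))  # True
```

This only encodes the waiting period. It does not replace watching the announcement channels, since an emergency rollback like this one can still arrive at any time.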

Thanks @ADAfrog. I normally would have preferred to let others take point, but there was a release note with 1.18.1 about network integrity, probably the end-of-epoch performance issue (since removed, along with the tag), so I upgraded us in a hurry because I didn’t want us to be blamed for any :poop: on the network :roll_eyes:

The consensus between IOHK & the Cardano community is that we will have to check Twitter and Telegram regularly (daily at least? for pinned postings if nothing else) for software releases, and I am reluctantly bowing my head to this. The whole idea that GitHub packages can be included in a set of well-defined dependencies is of course out the window… but as many would argue, it’s “still early days” :sunglasses:

My pleasure, @COSDpool. Yeah, it’s not ideal, and I was in the same position.

That being said, operators are pushing to establish a more formal communication channel so that ops are notified of such emergency changes. We understand the need and will be sure to continue addressing this with CF and IOG.
