Relays dropping peers on epoch transition

I had the opportunity to watch this transition in real time and am curious if anyone noticed the same thing happening.

On my relays I have 23-24 peers each at any given time. During the transition from epoch 210 to 211 I saw the peer connections drop to 15-16 each.

Are other relays crashing? Could it be connection issues that occur during this event? Is anyone else noticing this, or does anyone have an idea why it happens? I didn’t notice any RAM change during the transition, but CPU load almost maxed out on all the 2-core servers.

Yes - this is a known issue with 1.18.0. The next rev of cardano-node will address this.

Recycle your relays and block producers after epoch changes to keep them fresh.
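If you want to time those restarts, here is a rough Python sketch for working out when the next epoch boundary falls. It assumes mainnet Shelley parameters (epoch 208 starting at 2020-07-29 21:44:51 UTC, 432,000 one-second slots per epoch); the function names are made up for illustration and are not part of cardano-node or cardano-cli.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumptions (mainnet, Shelley era): epoch 208 began at
# 2020-07-29 21:44:51 UTC and each epoch lasts 432,000 one-second slots.
SHELLEY_START_EPOCH = 208
SHELLEY_START_TIME = datetime(2020, 7, 29, 21, 44, 51, tzinfo=timezone.utc)
EPOCH_LENGTH = timedelta(seconds=432_000)  # 5 days


def epoch_start(epoch: int) -> datetime:
    """Return the UTC start time of a Shelley-era epoch (hypothetical helper)."""
    if epoch < SHELLEY_START_EPOCH:
        raise ValueError("only Shelley-era epochs (>= 208) are handled here")
    return SHELLEY_START_TIME + (epoch - SHELLEY_START_EPOCH) * EPOCH_LENGTH


def next_epoch_boundary(now: datetime) -> Tuple[int, datetime]:
    """Return (epoch number, start time) of the first epoch starting after `now`."""
    completed = (now - SHELLEY_START_TIME) // EPOCH_LENGTH  # whole epochs elapsed
    upcoming = SHELLEY_START_EPOCH + completed + 1
    return upcoming, epoch_start(upcoming)


if __name__ == "__main__":
    epoch, when = next_epoch_boundary(datetime.now(timezone.utc))
    print(f"epoch {epoch} starts at {when.isoformat()}")
```

Under these assumptions the 210 to 211 transition lands at 2020-08-13 21:44:51 UTC, so a restart would be scheduled some minutes after that time rather than during it.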

Actually in the process of doing that now; it seems to clear out some RAM as well. Also lowering my topos on the relays to 18 each.

Yeah, that’s a good idea - none of my relay topos have a total valency of more than 18 across all entries.
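For anyone wanting to check their own number, a minimal Python sketch that sums the valency over all entries follows. It assumes the legacy topology file layout with a top-level "Producers" list whose entries carry "addr", "port" and "valency"; the host names and the total_valency helper are hypothetical.

```python
import json


def total_valency(topology: dict) -> int:
    """Sum the valency of every entry under "Producers" (hypothetical helper)."""
    return sum(int(entry["valency"]) for entry in topology.get("Producers", []))


# Hypothetical topology contents; in practice you would json.load() your own file.
example = json.loads("""
{
  "Producers": [
    {"addr": "relay-a.example.com", "port": 3001, "valency": 2},
    {"addr": "relay-b.example.com", "port": 3001, "valency": 1},
    {"addr": "203.0.113.10",        "port": 6000, "valency": 1}
  ]
}
""")

if __name__ == "__main__":
    total = total_valency(example)
    print(f"total valency: {total}")
    if total > 18:  # 18 is the ceiling mentioned in this thread, not an official limit
        print("consider trimming entries")
```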

Stupid question: everyone I see is uber concerned about 24/7/365 uptime. Any worries about missing a block when refreshing the block producer after each epoch? Or would the downtime be too short to be of consequence?

Yes this is a valid concern. I don’t do it currently, but will when I update cardano-node. The block producers typically handle epoch changes better than the relays. I suggest an idle node running in passive mode (like a relay) you can swap to a producer for upgrade purposes - but don’t run two producers at once or you risk forking the chain, which will be punished in the future.

Much appreciate the quick responses, as well as the floating bubbles on your home page :). Thank you.

Haha my pleasure, Mark.

You can pop the bubbles to reveal mnemonics to a stacked wallet!

Your friend, FROG

I noticed the same thing around midnight: only 3-4 of 25 relays were connecting, so I restarted my node and let it roll. By morning this had gone down to zero peers (occasionally they reconnect to the node and then disconnect), so something was already very wrong. Checking their topology updater:

"msg": "blockNo 4576507 seems out of sync. please retry"
"msg": "invalid blockNo

I don’t know what this dummy means by recycle (probably restart), but this should never happen with production-quality software. This code is god damn garbage; it might be OK for testnet but no way for mainnet, and I don’t know why he is talking about 1.18.0 when we were on 1.18.1 even before the last epoch started.

Charles said they will have some big update coming up tomorrow, so let’s keep crashing and losing blocks till then, gj!

We’re all running 1.18.0 because there are serious issues with 1.18.1 (far worse than the rough epoch changes experienced with 1.18.0), which is exactly why your nodes are trashed.

You will receive no further support from me in the future.

Your dummy friend whose nodes are up and knocking blocks, FROG

Cross referencing this thread, which may be related:

@Aione dude, not cool.

This is absolutely related - the chain is now corrupted and needs to be purged (start a fresh database sync).

Thanks @ADAfrog, we’ll get right on that and post any observations in the other thread.
