Missed two blocks - no clue why

Hi All,

In the current epoch (259) leaderlog showed that 2 leader slots were assigned to our pool, both were missed :(.

We monitor our systems extensively using Grafana, at the moment of producing blocks:

  • Time sync offset was within 1ms
  • Tip diff was hovering around 20
  • CPU usage was very low, ~5%
  • Memory footprint also on avarage
  • No network spikes/hiccups detected
  • Both relays and BP are running on 1.26.1

When I search for the slot numbers in our BP’s log file I see that our BP does not even think it is a leader for this slot, the same applies to the other slot mentioned.

node0-20210415142548.json:{"thread":"81","loc":null,"data":{"kind":"TraceStartLeadershipCheck","chainDensity":4.7277026e-2,"slot":26933491,"delegMapSize":437974,"utxoSize":1870953,"credentials":"Cardano"},"sev":"Info","env":"1.26.1:62f38","msg":"","app":[],"host":"149-210-","pid":"1928910","ns":["cardano.node.LeadershipCheck"],"at":"2021-04-15T15:16:22.00Z"}
node0-20210415142548.json:{"thread":"81","loc":null,"data":{"val":{"kind":"**TraceNodeNotLeader**","slot":26933491},"credentials":"Cardano"},"sev":"Info","env":"1.26.1:62f38","msg":"","app":[],"host":"149-210-","pid":"1928910","ns":["cardano.node.Forge"],"at":"2021-04-15T15:16:22.00Z"}

The slots:
“no”: 1,
“slot”: 26888631,
“slotInEpoch”: 363831,
“at”: “2021-04-15T04:48:42+02:00”

“no”: 2,
“slot”: 26933491,
“slotInEpoch”: 408691,
“at”: “2021-04-15T17:16:22+02:00”

Noteworthy to say is that we have a 2nd pool (BP) which runs behind the exact same relays and did actually succesfully mint a block on the very same day!

We would really like to troubleshoot this issue further but at this moment we don’t have a clue what else we can look at… all help is appreciated :).

Cheers,

Cardano.Amsterdam

1 Like

might related…

Check your CNCLI version.

Use CNCLI 2.0.2 with node v 1.26.1.

Thanks for your fast responses!

We are running the latest verison of cncli, the output of cncli --version is “cncli 2.0.2”

I had the same experience on 1.25.1, but only occurred once.

Only one epoch or just a single block?
It’s odd that the BP doesnt even think it’s a leader for the particular slot… it simply ignores it.

only 1 block
I think the official schedule is when you have isLeader event in your logs…
so guess, cncli could not determine schedule correctly…

It happened again :frowning:

We have updated our KES keys prior to this block to make sure this could not be the culprit, also upgraded the hardware a bit… 8cores and 16gb memory should be sufficient.

  • No missed slots seen in grafana

  • Tip diff is fine

  • Time sync offset is within 7ms

  • No cpu/mem/network/disk load seen during the slot’s window

    node0-20210417232029.json:{“thread”:“143”,“loc”:null,“data”:{“kind”:“TraceStartLeadershipCheck”,“chainDensity”:4.8511405e-2,“slot”:27150672,“delegMapSize”:444921,“utxoSize”:1896684,“credentials”:“Cardano”},“sev”:“Info”,“env”:“1.26.1:62f38”,“msg”:"",“app”:,“host”:“h2913679”,“pid”:“9008”,“ns”:[“cardano.node.LeadershipCheck”],“at”:“2021-04-18T03:36:03.00Z”}
    node0-20210417232029.json:{“thread”:“143”,“loc”:null,“data”:{“val”:{“kind”:“TraceNodeNotLeader”,“slot”:27150672},“credentials”:“Cardano”},“sev”:“Info”,“env”:“1.26.1:62f38”,“msg”:"",“app”:,“host”:“h2913679”,“pid”:“9008”,“ns”:[“cardano.node.Forge”],“at”:“2021-04-18T03:36:03.00Z”}

Still I dont think you missed a block creation since in the logs should be isLeader event about that when the producer was scheduled to create a block:

cardano.node.Forge:Info:238]ESC[0m [2021-01-08 07:37:07.00 UTC] fromList [("credentials",String "Cardano"),("val",Object (fromList [("kind",String "TraceNodeIsLeader"),("slot",N
umber 1.8525136e7)]))]

I guess time will tell, hopefully we’ll mint a block soon.
Strange enough, I ran leaderlog today and it showed NO blocks this epoch… while the leaderlog showed a single block for the current epoch while it was still running in the previous epoch (e.g. using “next” instead of “current” in cncli leaderlog).

Something else I find very strange is is that we never had issue with missed slots… and just after upgrading to 1.26.1 they started popping up (verified by grafana and prometheus).