My heart is bleeding... InvalidKesSignatureOCERT

I started my small pool in 2021/03, first with 20k ADA, later increased to 40k ADA.

I waited nearly a year for my first block and created leaderlogs every 5 days.

But cncli never showed a single block.

This morning I looked at my Grafana and saw “Pool Performance: 100”. I thought: wow, why didn’t cncli show that?

But none of the tools like pooltool showed a block, and even my db-sync instance had nothing.

So I started searching my logs in the timeframe Grafana showed and found this:

pool-core-1  | [core:cardano.node.Forge:Info:424] [2022-01-27 05:51:42.00 UTC] fromList [("val",Object (fromList [("kind",String "TraceNodeIsLeader"),("slot",Number 5.1696411e7)])),("credentials",String "Cardano")]
pool-core-1  | [core:cardano.node.Forge:Info:424] [2022-01-27 05:51:42.17 UTC] fromList [("val",Object (fromList [("block",String "b8938e217149a81c054ac4edc48e58ebd0ad5533fdfa7f6cd993a4909147477f"),("blockNo",Number 6806779.0),("blockPrev",String "49c96fef6d3b2fb4f472f7bf5733d666e246c8a2c9e87939b3181d23416205ef"),("kind",String "TraceForgedBlock"),("slot",Number 5.1696411e7)])),("credentials",String "Cardano")]
pool-core-1  | [core:cardano.node.ChainDB:Error:414] [2022-01-27 05:51:42.27 UTC] Invalid block b8938e217149a81c054ac4edc48e58ebd0ad5533fdfa7f6cd993a4909147477f at slot 51696411: ExtValidationErrorHeader (HeaderProtocolError (HardForkValidationErrFromEra S (S (S (S (Z (WrapValidationErr {unwrapValidationErr = ChainTransitionError [OverlayFailure (OcertFailure (InvalidKesSignatureOCERT 398 357 41 "Reject"))]})))))))
pool-core-1  | [core:cardano.node.ChainDB:Info:414] [2022-01-27 05:51:42.27 UTC] Valid candidate 49c96fef6d3b2fb4f472f7bf5733d666e246c8a2c9e87939b3181d23416205ef at slot 51696406
pool-core-1  | [core:cardano.node.Forge:Error:424] [2022-01-27 05:51:42.27 UTC] fromList [("val",Object (fromList [("kind",String "TraceForgedInvalidBlock"),("reason",Object (fromList [("error",Object (fromList [("error",Object (fromList [("failures",Array [Object (fromList [("error",String "Reject"),("kind",String "InvalidKesSignatureOCERT"),("opCertExpectedKESEvolutions",String "41"),("opCertKESCurrentPeriod",String "398"),("opCertKESStartPeriod",String "357")])]),("kind",String "ChainTransitionError")])),("kind",String "HeaderProtocolError")])),("kind",String "ValidationError")])),("slot",Number 5.1696411e7)])),("credentials",String "Cardano")]

I started googling this error. Most answers said that cold.counter had not been updated, but I doubt that. I suspect I did not update kes.skey after rotating: I had learned that it’s not necessary to generate new KES key pairs every time, so I started generating only a new node.cert and forgot to upload the last kes.skey.
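For anyone reading the error later: going by the field names in the TraceForgedInvalidBlock line, the three numbers in InvalidKesSignatureOCERT 398 357 41 are the current KES period, the start period written into node.cert, and the expected number of KES evolutions. A rough sketch of that arithmetic, assuming the mainnet value slotsPerKESPeriod = 129600 (check your own shelley-genesis.json); the function names are made up for illustration:

```shell
#!/usr/bin/env bash
# Assumption: mainnet value from shelley-genesis.json; other networks differ.
SLOTS_PER_KES_PERIOD=129600

# KES period that a given slot falls into (integer division).
kes_period_of_slot() {
  echo $(( $1 / SLOTS_PER_KES_PERIOD ))
}

# Number of evolutions the KES key must have gone through at a slot,
# given the start period written into node.cert.
expected_evolutions() {
  local slot=$1 start_period=$2
  echo $(( $(kes_period_of_slot "$slot") - start_period ))
}

kes_period_of_slot 51696411        # slot from the log
expected_evolutions 51696411 357   # start period from the error
```

This prints 398 and 41, the same values as in the log. So the periods themselves were in range; what failed was the signature check at evolution 41, which would fit a kes.skey that does not belong to the node.cert in use.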

I have now rotated the keys twice and re-uploaded cold.vkey, kes.skey and node.cert.

While still using the same vrf.skey, cncli now shows my block when using --ledger-set current. Maybe I was just operationally blind.

So now I have questions:

I’m looking forward to your advice. It would kill me if it happened again. It’s not about the money, it just hurts to wait another year to see if I did better.

curl localhost:12798/metrics | grep KES (then check here if the LIVE KES is the same)

With this command you can check whether the KES periods are valid.
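If you want more than eyeballing the grep output, here is a small sketch that reads the metrics text and checks that the numbers agree with each other. The metric names are the ones cardano-node prints on this endpoint; the function name kes_summary is just made up for the example.

```shell
#!/usr/bin/env bash
# Reads Prometheus text from stdin, prints the KES-related values and
# checks that they are consistent with each other.
kes_summary() {
  awk '
    /^cardano_node_metrics_operationalCertificateStartKESPeriod_int /  { start     = $2 }
    /^cardano_node_metrics_operationalCertificateExpiryKESPeriod_int / { expiry    = $2 }
    /^cardano_node_metrics_currentKESPeriod_int /                      { cur       = $2 }
    /^cardano_node_metrics_remainingKESPeriods_int /                   { remaining = $2 }
    END {
      if (start == "" || expiry == "" || cur == "" || remaining == "") {
        print "missing KES metrics"; exit 1
      }
      printf "start=%d current=%d expiry=%d remaining=%d\n", start, cur, expiry, remaining
      if (cur < start || cur >= expiry || remaining != expiry - cur) {
        print "KES periods look inconsistent"; exit 1
      }
      print "KES periods look consistent"
    }'
}

# Usage on the node host:
#   curl -s localhost:12798/metrics | kes_summary
```

Keep in mind that this only tells you node.cert is inside its validity window. As far as I can tell it says nothing about whether the kes.skey on the node is the one that node.cert was issued for.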

And if cncli didn’t show you the block, perhaps you are using the wrong vrf file?
Check your vrf file hash and compare it with the hash from cardanoscan.io; both should match.
To calculate the vrf file hash, run the command
cardano-cli node key-hash-VRF --verification-key-file vrf.vkey
then look up your pool ID on cardanoscan.io and compare the hash.

cheers,

PS: when I rotated the KES I saw that 4 files had been updated (looking at the dates):

  • hot.skey (in your case kes.skey)
  • hot.vkey
  • kes.start
  • op.cert (in your case node.cert)

But only 3 files are used to start the node as a Producer:

  • hot.skey (in your case kes.skey)
  • vrf.skey
  • op.cert (in your case node.cert)
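Since the hot key and the op cert have to belong together, one thing that might be worth checking before uploading is whether node.cert was issued for the KES key pair you are about to use. The sketch below is only that, a sketch: it assumes the cborHex in kes.vkey is "5820" followed by the 32-byte key and that node.cert embeds the same key bytes in its own cborHex, so verify it on your own files first. The function names and paths are hypothetical.

```shell
#!/usr/bin/env bash
# Pull the cborHex field out of a cardano-cli text envelope (JSON file).
cbor_hex() {
  sed -n 's/.*"cborHex"[[:space:]]*:[[:space:]]*"\([0-9a-fA-F]*\)".*/\1/p' "$1"
}

# Succeeds if the KES verification key in kes.vkey appears inside node.cert.
# Assumption: kes.vkey's cborHex is "5820" + 64 hex chars, and node.cert's
# cborHex contains that same key; this is not an official cardano-cli check.
kes_vkey_in_opcert() {
  local vkey_file=$1 cert_file=$2 key cert
  key=$(cbor_hex "$vkey_file" | tr 'A-F' 'a-f')
  key=${key#5820}
  cert=$(cbor_hex "$cert_file" | tr 'A-F' 'a-f')
  [ ${#key} -eq 64 ] || { echo "unexpected kes.vkey format" >&2; return 2; }
  case $cert in
    *"$key"*) echo "node.cert was issued for this kes.vkey" ;;
    *)        echo "node.cert does NOT contain this kes.vkey"; return 1 ;;
  esac
}

# Usage (hypothetical paths):
#   kes_vkey_in_opcert /keys/kes.vkey /keys/node.cert
```

It compares the verification key, so it only helps if the kes.vkey you test and the kes.skey you upload come from the same key generation run.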
peter@Ubuntu-2004-focal-64-minimal:~/pool$ sudo nsenter -t $(sudo docker inspect -f '{{.State.Pid}}' pool-core-1) -n curl localhost:12798/metrics | grep KES
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2593    0  2593    0     0  2532k      0 --:--:-- --:--:-- --:--:-- 2532k
cardano_node_metrics_operationalCertificateExpiryKESPeriod_int 460
cardano_node_metrics_currentKESPeriod_int 399
cardano_node_metrics_remainingKESPeriods_int 61
cardano_node_metrics_operationalCertificateStartKESPeriod_int 398
peter@Ubuntu-2004-focal-64-minimal:~/pool$ sudo docker exec pool-core-1 cardano-cli node key-hash-VRF --verification-key-file /keys/vrf.vkey
928bd4102ab319fdf5540bcb87cbb863601658c1ce74841137bffd7996a4cd53

https://cardanoscan.io/pool/3bd3996595321d951291b11e1331061c5d8659d9e69390536dfc922c
Vrf Hash
928bd4102ab319fdf5540bcb87cbb863601658c1ce74841137bffd7996a4cd53

Looks OK… I hope I’ll be fine now, but last time my node.cert also had a valid KES period (10 days were left), so I’m still not confident that these checks are sufficient.

Can you run cncli again to check whether it shows the block this time?

read this article

I already did, as I mentioned. Now it showed up even though I didn’t change the vrf.skey. I think I just didn’t look attentively enough =( Maybe if I had seen it, I would have double-checked everything beforehand.

Seems that CRAB had done exactly the same and used an old kes.skey… OK peter, just be patient ^^
