Pool is assigned a slot in the leader logs but is not the leader when the slot arrives

I checked the leader logs with cardano-cli query leadership-schedule.
According to the result, the pool was assigned a slot, but the pool was not the leader when the slot came.

The leader log result:

     SlotNo                          UTC Time              
-------------------------------------------------------------
     75451711                   2022-10-29 04:33:22 UTC

The node logs when the slot came:

Oct 29 04:33:22 cardano-node[6926]: [cardano.node.LeadershipCheck:Info:3453] [2022-10-29 04:33:22.00 UTC] {"chainDensity":4.892696e-2,"credentials":"Cardano","delegMapSize":1224583,"kind":"TraceStartLeadershipCheck","slot":75451711,"utxoSize":8834177}

Oct 29 04:33:22  cardano-node[6926]: [cardano.node.Forge:Info:3453] [2022-10-29 04:33:22.03 UTC] fromList [("credentials",String "Cardano"),("val",Object (fromList [("kind",String "TraceNodeNotLeader"),("slot",Number 7.5451711e7)]))]

Oct 29 04:33:22 cardano-node[6926]: [cardano.node.Mempool:Info:419753] [2022-10-29 04:33:22.23 UTC] fromList [("kind",String "TraceMempoolAddedTx"),("mempoolSize",Object (fromList [("bytes",Number 49418.0),("numTxs",Number 20.0)])),("tx",Object (fromList [("txid",String "582171ac")]))]

Oct 29 04:33:22  cardano-node[6926]: [cardano.node.Mempool:Info:419753] [2022-10-29 04:33:22.27 UTC] fromList [("kind",String "TraceMempoolAddedTx"),("mempoolSize",Object (fromList [("bytes",Number 51636.0),("numTxs",Number 21.0)])),("tx",Object (fromList [("txid",String "88754c4d")]))]

Oct 29 04:33:22 cardano-node[6926]: [cardano.node.Mempool:Info:419753] [2022-10-29 04:33:22.31 UTC] fromList [("kind",String "TraceMempoolAddedTx"),("mempoolSize",Object (fromList [("bytes",Number 53198.0),("numTxs",Number 22.0)])),("tx",Object (fromList [("txid",String "d6f9d684")]))]
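
To pick the leadership decision for one slot out of a busy log, a small filter can help. This is a minimal sketch that assumes log lines in the two shapes shown in the excerpt above (a JSON object, or a Haskell-style `fromList` dump where the slot is written as `Number 7.5451711e7`). The function name `slot_events` is made up for illustration.

```python
import re

# Trace names taken from the log excerpt; TraceNodeIsLeader is the trace the
# node logs when it does lead a slot.
LEADER_KINDS = ("TraceStartLeadershipCheck", "TraceNodeNotLeader", "TraceNodeIsLeader")

# Matches both `"slot":75451711` and `"slot",Number 7.5451711e7`.
SLOT_RE = re.compile(r'"slot"[:,]\s*(?:Number\s+)?([0-9.]+(?:e[0-9]+)?)')


def slot_events(lines, slot):
    """Return the leadership trace kinds logged for `slot`, in log order."""
    events = []
    for line in lines:
        kind = next((k for k in LEADER_KINDS if k in line), None)
        if kind is None:
            continue  # not a leadership-related line (e.g. mempool traces)
        match = SLOT_RE.search(line)
        if match and int(round(float(match.group(1)))) == slot:
            events.append(kind)
    return events
```

Feeding it the journal output for the minute around the slot (for example, lines read from a saved log file) shows at a glance whether the node logged TraceNodeNotLeader or TraceNodeIsLeader for that slot.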

("kind",String "TraceNodeNotLeader"),("slot",Number 7.5451711e7)

I would say that you have a key problem. Check that you are using the same key to calculate the leadership-schedule as the block producer is running with.

I checked, and the VRF keys are the same.

Here is how I would troubleshoot your problem:

  1. Check the key files used by your node (block producer) are exactly the same files as you are using to generate your leader logs.
  2. Generate your leader log for the current epoch using cardano-cli (as you have already done).
  3. Generate your leader log using the cncli tool (after it has fully synced)
  4. Check the software versions of each program you are running are the latest: cardano-node, cardano-cli, cncli
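
For step 1, comparing checksums is less error-prone than eyeballing `cat` output. A minimal sketch follows; the helper names are my own, and the file paths you pass in are whatever your setup uses.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_key_file(path_a, path_b):
    """True only if both files are byte-for-byte identical."""
    return file_digest(path_a) == file_digest(path_b)
```

If the two copies live on different machines, print `file_digest(...)` on each and compare the hex strings. A trailing newline or a different key with the same file name will show up as a different digest.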

It would be good if there was a check function (program) where you could input:

  • slot number
  • your pool key
  • epoch nonce

and then see if the output is below your pool stake as a percentage of total staked Ada.
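
The final comparison of such a check can be sketched in a few lines. As I understand the Praos leader check, the threshold is not the stake percentage itself but 1 - (1 - f)^σ, where σ is the pool's relative active stake and f is the active slot coefficient (0.05 on mainnet). The sketch below assumes you already have the VRF leader value normalised to the range [0, 1); deriving that value from the VRF key, slot number, and epoch nonce is the hard part and is not shown. It also uses floating point, whereas the node uses exact arithmetic, so treat borderline results with suspicion. The function names are hypothetical.

```python
def leader_threshold(relative_stake, active_slot_coeff=0.05):
    """Probability threshold 1 - (1 - f)^sigma for one slot."""
    return 1.0 - (1.0 - active_slot_coeff) ** relative_stake


def is_slot_leader(vrf_value, relative_stake, active_slot_coeff=0.05):
    """True if the normalised VRF value (0 <= v < 1) falls below the threshold.

    `vrf_value` is assumed to be precomputed elsewhere; this function only
    performs the final comparison.
    """
    return vrf_value < leader_threshold(relative_stake, active_slot_coeff)
```

For a pool holding 0.1% of the active stake, `leader_threshold(0.001)` is roughly 0.00005, which is why small pools are assigned only a handful of slots per epoch.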

I tried to look into how to build such a standalone tool but it is beyond my Haskell coding ability. I think it would be pretty easy for someone that knows their way around the IOG codebase to copy the correct bits. IOG could even provide a separate switch within the cardano-cli for this functionality. It would be a useful testing and reassurance tool for the community.

  1. The key files are identical.
  2. The cardano-cli and cncli outputs differ: cncli shows no assigned slot, but the cardano-cli leader log shows one.
  3. All nodes are on 1.35.1.

They need to be updated. 1.35.1 should not have been running on mainnet for weeks. It was buggy and got replaced weeks ago, long before the hard fork.


SPOs really should follow IOG’s announcements! Closely!

On Telegram, they are on: https://t.me/SPOannouncements
The relevant announcement – “In addition, any SPOs running versions 1.35.0, 1.35.1, or 1.35.2 on mainnet should immediately upgrade to 1.35.3.” – was posted on 12th August: https://t.me/SPOannouncements/181

Or, you could join IOG’s Discord – https://discord.gg/DkZf7exZ – and monitor the groups in “Stake Pool Operators”.

At the very, very least, you should monitor https://github.com/input-output-hk/cardano-node/releases. That 1.35.0, 1.35.1, and 1.35.2 were pulled from there and are not even available anymore, is a strong indication that something’s wrong with them and they should not be used under any circumstances.

Sorry, that was a typo; both nodes are on 1.35.3.
Yes, I joined the Discord and Telegram. I saw the announcement and updated to 1.35.3 immediately.


Did you change the tpraos to praos in cncli? That changed after the fork.

I ran cncli again with praos, and the result is the same as cardano-cli query leadership-schedule!

So it might be an issue with my BP node?


Yes. Still think @Terminada’s guess with the VRF key mismatch is the best. Have you checked in your systemd unit (and perhaps scripts called from it) if the node really uses the VRF key file you think it uses (and that you give to the leaderlog checks)?

Yes, I ran cat on the VRF file on both servers and the contents are identical.
Could it possibly be a configuration problem with the BP node?
As if the BP node did not know it was assigned a slot?

What do you mean by both? There should be only one block producer.

And it’s not so much about the file being the same on different machines (as there should be only one BP machine), but about whether the one you give to cardano-cli query leadership-schedule with the --cold-verification-key-file and --vrf-signing-key-file options is actually the same one that you give to cardano-node with the --shelley-vrf-key option.
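
One way to take the guesswork out of this is to read the path straight from the command line the node was started with, rather than from memory. A minimal sketch, assuming you paste in the `ExecStart=` line from the systemd unit (or the command from your start script); `option_value` is a made-up helper name and the paths in the usage note are placeholders.

```python
import shlex


def option_value(command_line, option):
    """Return the value given to `option` in a shell command line, or None.

    Handles both `--opt value` and `--opt=value` forms.
    """
    args = shlex.split(command_line)
    for i, arg in enumerate(args):
        if arg == option and i + 1 < len(args):
            return args[i + 1]
        if arg.startswith(option + "="):
            return arg.split("=", 1)[1]
    return None
```

Calling `option_value(cmd, "--shelley-vrf-key")` on the node's command line gives the file the block producer really loads; that is the file whose contents should match the one passed to `--vrf-signing-key-file` in the leadership-schedule query.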

Yes, there’s only one BP node.
Since my BP node has only 16 GB of RAM, I copied the VRF file to another local server to run the leader log command.
The VRF files on the BP node and on the other server are identical.

I remember having a similar problem with not minting a block because the cold counter was wrong - error logs should confirm this - I am no technical expert so maybe someone else wants to take this up.

Where do you see error logs? When the slot comes, the log only shows "TraceNodeNotLeader", which means I’m not the leader in this slot.

It should show "TraceNodeIsLeader" even if there is a key error.

When I failed to mint the block, I found this error in the log file by searching around the time of the allocated slot:

50544ac10478c0a630f1a00f9","slot":50659312},"error":"ExtValidationErrorHeader (HeaderProtocolError (HardForkValidationErrFromEra S (S (S (S (Z (WrapValidationErr {unwrapValidationErr = ChainTransitionError [OverlayFailure (OcertFailure (InvalidKesSignatureOCERT 390 383 7

I found that rotating the keys afterwards fixed this, along with making sure I was using up-to-date cold.counter files. My problem stemmed from using old op certs and cold.counters. I am not technical enough to know whether this relates to your problem, but Alex from CHRTY pool is an expert in this area, so maybe have a look at some of his threads.
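
For anyone trying to read the numbers in that error, here is a minimal sketch of the KES period arithmetic. It assumes the mainnet Shelley genesis values (129600 slots per KES period, 62 maximum evolutions); other networks may differ. With those values, slot 50659312 falls in KES period 390, which matches the first number in the excerpt. Reading the second number (383) as the certificate's start period is my assumption. The check below only tells you whether a certificate's validity window covers a slot; it says nothing about whether the KES signature or the counter is valid.

```python
SLOTS_PER_KES_PERIOD = 129600  # mainnet Shelley genesis value (assumed here)
MAX_KES_EVOLUTIONS = 62        # mainnet Shelley genesis value (assumed here)


def kes_period(slot):
    """KES period that contains the given absolute slot number."""
    return slot // SLOTS_PER_KES_PERIOD


def opcert_covers_slot(slot, opcert_start_period):
    """True if an op cert starting at `opcert_start_period` is in its
    validity window at `slot` (not too early, not yet expired)."""
    evolutions = kes_period(slot) - opcert_start_period
    return 0 <= evolutions < MAX_KES_EVOLUTIONS
```

Under these assumptions, a certificate starting at period 383 is still inside its window at slot 50659312, which fits the explanation that the failure came from a stale op cert and counter rather than from plain expiry.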

If using cntools the logs are at: /opt/cardano/cnode/logs

Have you stopped the node since the issue? Try running:

curl -s localhost:12798/metrics | grep cardano_node_metrics_Forge

This should show whether you were supposed to be the leader of the slot or not.
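
If you want to track these counters over time instead of reading them by eye, the metrics text is easy to parse. A minimal sketch, assuming the plain `name value` line format that the endpoint returns; the function name is hypothetical, and the only metric name it relies on is the `cardano_node_metrics_Forge` prefix from the command above.

```python
def forge_metrics(metrics_text, prefix="cardano_node_metrics_Forge"):
    """Parse Prometheus-style text and return {metric_name: value} for
    every metric whose name starts with `prefix`."""
    result = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or not line.startswith(prefix):
            continue  # skip blanks, comments, and unrelated metrics
        name, _, value = line.rpartition(" ")
        if not name:
            continue  # no space-separated value on this line
        try:
            result[name] = float(value)
        except ValueError:
            pass  # ignore lines whose value is not numeric
    return result
```

Save the output of the curl command just before and just after the assigned slot, run both through `forge_metrics`, and compare the dictionaries: whichever counter increased tells you which path the node took for that slot. These counters are held in memory, so they reset when the node restarts.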

Also, if you’ve minted a block before, make sure you’re using the next operational certificate counter in line (for example, if the previous one was opc#3, the next must be opc#4). Vasil made that a breaking change. Ticker?