Troubleshooting not validated blocks

The following block was not validated:

{
    "created_at_time": "2020-02-29T21:41:06.352339908+00:00",
    "scheduled_at_time": "2020-03-01T10:13:45+00:00",
    "scheduled_at_date": "78.27004",
    "wake_at_time": "2020-03-01T10:13:45.000848006+00:00",
    "finished_at_time": "2020-03-01T10:13:45.002801924+00:00",
    "status": {
      "Block": {
        "block": "372bd8cccb2ec0203533ea72cd1dc67c6f2c31ca0359cc6d215788d581bbd104",
        "chain_length": 257764
      }
    },
    "enclave_leader_id": 1
  },

and the block which was validated before my block is:
https://shelleyexplorer.cardano.org/en/block/f945d6ed7352655650d7d4081dc1608ba934c5c405c602680b47d60aa577c85a/

the block which was validated after my block is:
https://shelleyexplorer.cardano.org/en/block/464624133b6e7def44c23617f64db39224fe1ef3bf05cfa01507dbf2df1b5151/

so the slots in a table would be:

EPOCH SLOT TIME STAKE POOL
78 27022 11:14:21, March 1, 2020 842c…c54
78 27004 10:13:45, March 1, 2020 5959…eb5
78 27003 11:13:43, March 1, 2020 107d…0c5
78 26967 11:12:31, March 1, 2020 3033…393

It seems that the previous block was created just 2 sec before my slot, and the next one almost 40 sec after it. So my node probably created its block at the wrong block height, since it had only 2 sec to sync the block that was created in the previous slot.
3 questions:

  • Why was the slot time so close to the previous one? I think the average should be 20 sec.
  • Which config parameter should be tuned to be able to sync the block height in such a short period of time?
  • Is this situation an example of a height battle?
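
The gaps quoted above follow directly from the slot numbers in the table, assuming 2-second slots (which the timestamps in the table are consistent with). A minimal Python sketch; the function name `gap_seconds` is only for illustration:

```python
SLOT_DURATION_S = 2  # assumption: 2-second slots, consistent with the table

def gap_seconds(earlier_slot: int, later_slot: int) -> int:
    """Wall-clock gap between two slots of the same epoch, in seconds."""
    return (later_slot - earlier_slot) * SLOT_DURATION_S

# Time available to receive the block from the previous slot.
print(gap_seconds(27003, 27004))
# Time until the next block that ended up on the chain.
print(gap_seconds(27004, 27022))
```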

https://hydra.iohk.io/build/1505847/download/1/itn_rewards_v1-genesis.yaml
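This is the genesis configuration file.
The three parameters we care about:
"slot_duration": 2,
"slots_per_epoch": 43200,
"consensus_genesis_praos_active_slot_coeff": 0.1
An epoch has 43200 slots of 2 seconds each:
24 * 60 * 60 / 2 = 43200
consensus_genesis_praos_active_slot_coeff says that only a 0.1 fraction of the slots can produce a block. So the average time available to sync, i.e. the interval between blocks, is 20 seconds, and at full efficiency 4320 blocks are produced per day.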
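
The arithmetic behind those figures, as a short Python sketch using only the three values quoted from the genesis file (the variable names are illustrative):

```python
# Values quoted from the genesis file.
slot_duration = 2          # seconds per slot
slots_per_epoch = 43200
active_slot_coeff = 0.1    # fraction of slots expected to carry a block

epoch_seconds = slot_duration * slots_per_epoch
avg_block_interval = slot_duration / active_slot_coeff
expected_blocks_per_epoch = slots_per_epoch * active_slot_coeff

print(epoch_seconds)                      # 86400 seconds, i.e. one day
print(round(avg_block_interval, 6))       # about 20 seconds between blocks
print(round(expected_blocks_per_epoch))   # about 4320 blocks per epoch
```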
Whether a pool produces a block in a given slot is decided by probability.


f is the active slot coefficient
αi is the relative stake
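
These two definitions match the leader-election probability from the Ouroboros Praos paper, which is presumably the formula being referred to. For a pool with relative stake α_i, the probability of being elected leader of a given slot is:

```latex
p_i = \phi_f(\alpha_i) = 1 - (1 - f)^{\alpha_i}
```

As a made-up example, with f = 0.1 a pool holding 1% of the stake (α_i = 0.01) would lead a given slot with probability of roughly 0.001, which works out to about 45 slots in a 43200-slot epoch.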
All of this is governed by probability, so blocks can end up clustered closely together within a short period of time.
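
To put a rough number on how often such clustering happens, here is a sketch that treats every slot as independently active with probability f = 0.1. That independence is a simplification assumed only for illustration, not a description of the real leader election. Under it, the gap between consecutive active slots is geometrically distributed:

```python
F = 0.1              # active slot coefficient from the genesis file
SLOT_DURATION_S = 2  # seconds per slot

def prob_gap_at_most(k_slots: int, f: float = F) -> float:
    """P(next active slot is within k slots), assuming independent slots."""
    return 1.0 - (1.0 - f) ** k_slots

mean_gap_seconds = SLOT_DURATION_S / F

print(f"next block in the very next slot: {prob_gap_at_most(1):.2f}")
print(f"next block within 10 s (5 slots): {prob_gap_at_most(5):.2f}")
print(f"mean gap: {mean_gap_seconds:.0f} s")
```

So under this simplification roughly one block in ten is followed by an active slot only 2 seconds later, even though the mean gap is 20 seconds.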
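Answers:
1. The average interval is about 20 seconds, but in practice blocks can be unevenly spaced over short periods. This is inherent in the protocol.
2. The three parameters I listed above determine these properties:
"slot_duration": 2,
"slots_per_epoch": 43200,
"consensus_genesis_praos_active_slot_coeff": 0.1
But you cannot change them; they were fixed when the testnet was launched. The Shelley specification written in Haskell implements proposals for changing protocol parameters: a node submits a proposal to change the parameters, and stakeholders vote on whether to run the protocol with the new ones. The testnet probably does not implement this feature, though.
3. This is not an example of a height battle. If you had also produced a block in slot 27003, then you and the other pool would be in a height battle.
If your network is good enough, you should receive block 27003 within 2 seconds and then produce block 27004 on top of it, which is the most efficient outcome.
If your network is not good enough and you did not sync block 27003 in time, then 27003 and 27004 will both point to the same parent block and the network forks. The next block decides which way the network goes. Both 27003 and 27004 are valid, but one of them will be discarded.
If someone with better English could translate this answer into English, please do.
My English is not great, so I just answered in Chinese.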

could you please write your answer in English?

https://translate.google.com
Paste the post in here.
Cheers,
D

I think, if possible, we should not rely on Google to try to translate a technical question. :)

Fair enough.
The translation looked reasonable enough, though the maths is all greek to me!
Have a good one.
D

just an update - now I found another block which was validated and has the same conditions:

"created_at_time": "2020-03-02T12:01:22.192603859+00:00",
"scheduled_at_time": "2020-03-02T15:50:11+00:00",
"scheduled_at_date": "79.37097",
"wake_at_time": "2020-03-02T15:50:11.002031353+00:00",
"finished_at_time": "2020-03-02T15:50:11.002331797+00:00",
"status": {
  "Block": {
    "block": "c5a0292a5fad76abfc2135ccf9f71766f6bc325371891ad99c4ea65b4bf1a1d8",
    "chain_length": 262178
  }
},
"enclave_leader_id": 1

https://shelleyexplorer.cardano.org/en/block/c5a0292a5fad76abfc2135ccf9f71766f6bc325371891ad99c4ea65b4bf1a1d8/

EPOCH SLOT TIME STAKE POOL
79 37146 16:51:49, March 2, 2020 1f53…5d2
79 37127 16:51:11, March 2, 2020 9d51…b8b
79 37121 16:50:59, March 2, 2020 f1c9…3a2
79 37097 16:50:11, March 2, 2020 5959…eb5
79 37096 16:50:09, March 2, 2020 01bd…d2a
79 37091 16:49:59, March 2, 2020 9b00…187
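
The `scheduled_at_date` value in both leader logs reads as `epoch.slot`, and it lines up with `scheduled_at_time` if epochs are 43200 slots of 2 seconds. A small Python sketch that checks this; it infers the chain start time from the first log entry rather than taking it from any official source, and the helper names are illustrative:

```python
from datetime import datetime, timedelta, timezone

SLOT_DURATION_S = 2
SLOTS_PER_EPOCH = 43200

def slot_offset(date: str) -> timedelta:
    """Time elapsed since epoch 0, slot 0 for an 'epoch.slot' string."""
    epoch, slot = (int(part) for part in date.split("."))
    return timedelta(seconds=(epoch * SLOTS_PER_EPOCH + slot) * SLOT_DURATION_S)

# First leader log: slot 78.27004 was scheduled at 2020-03-01T10:13:45 UTC.
first_time = datetime(2020, 3, 1, 10, 13, 45, tzinfo=timezone.utc)
chain_start = first_time - slot_offset("78.27004")  # inferred, not authoritative

def scheduled_time(date: str) -> datetime:
    """Predicted scheduled_at_time for an 'epoch.slot' string."""
    return chain_start + slot_offset(date)

# Second leader log: this should reproduce its scheduled_at_time.
print(scheduled_time("79.37097").isoformat())  # 2020-03-02T15:50:11+00:00
```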

So at least it means that my node is capable of fetching the latest validated block even when it was created in the previous slot.
So the question is: why was the block creation successful this time?
Does anybody else have blocks that were not validated? If so, what were the reasons?
Thanks,

How is the second case related to the first? In the second you reference a block which was validated, in the first one which was not.

Slot time is 2 seconds, not 20. If you lose battles other than competitive slots, your pool is simply not at the tip at the moment it matters.

Use tools like Prometheus with time-series monitoring to tune your pool. Which settings work for my pool might not work for yours. Every pool environment, latency etc. is slightly different.

Do check the parent block hash in the Jormungandr log and compare with the hash of the winning block, they will be different if it is not a competitive slot.
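
That comparison can be sketched in a few lines of Python. The inputs are plain values you would copy by hand from your own logs and the explorer. The function name, the labels, and the example hashes are all made up for illustration, and the three-way split is one reading of the advice in this thread rather than an official definition:

```python
def classify_lost_block(my_slot: int, my_parent: str,
                        winner_slot: int, winner_parent: str) -> str:
    """Rough guess at why a locally created block did not make it on chain."""
    if my_slot == winner_slot:
        # Two pools were leaders of the same slot.
        return "competitive slot"
    if my_parent == winner_parent:
        # Both blocks extend the same parent from different slots: a fork
        # that the following block resolves.
        return "fork at the same height"
    # The node built on a parent that the winning chain did not use,
    # so it was likely not at the tip when its slot came up.
    return "built on a different parent"

# Hypothetical hashes, shortened for readability.
print(classify_lost_block(27004, "aaaa", 27003, "aaaa"))
```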

Your blockheight time tracking should look like this:
[image: block height time tracking]

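If the network conditions have not changed, a possible explanation is that the node that produced block 37096 is very close to you, so you were able to sync that block quickly.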

Hi!

So the second case shows that the node can get a block validated in a slot that is very close to the previous one. That makes it relevant when troubleshooting the first case.
Can you share how to get the parent hash of a block which was not validated? Thanks.

If that is the case, I will try to increase the max connections of my node.
Right now it is about 200.

To get the parent hash of the block which was not validated just grep your jormungandr log for the “leadership” event, all the details are found there.
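
A minimal Python equivalent of that grep, assuming the node's output has been saved to a file. The path `jormungandr.log` is a placeholder, and the log lines are treated as opaque text since their exact format depends on the node's logging configuration:

```python
import os
from typing import Iterable, Iterator

def leadership_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the log lines that mention 'leadership' (case-insensitive)."""
    for line in lines:
        if "leadership" in line.lower():
            yield line.rstrip("\n")

if __name__ == "__main__":
    log_path = "jormungandr.log"  # placeholder path, adjust to your setup
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8", errors="replace") as log_file:
            for entry in leadership_lines(log_file):
                print(entry)
```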


That means that the node log level should be set to info… any other way to get it? but thanks for this info - I did not know about that…

No other way that I know of. INFO is the minimum debug level you should use in a testnet IMHO.

Perhaps the problem is not with the slot schedule, since there was a case where the schedule was ideal (about 40 sec after the previous block and about 20 sec before the next one) and the block was still not validated…

"created_at_time": "2020-03-03T20:16:47.470018238+00:00",
"scheduled_at_time": "2020-03-04T06:14:37+00:00",
"scheduled_at_date": "81.19830",
"wake_at_time": "2020-03-04T06:14:37.000967089+00:00",
"finished_at_time": "2020-03-04T06:14:37.002122292+00:00",
"status": {
  "Block": {
    "block": "b80ec5b8d6a752f96e6b65dca868630c74c3aeea6dd6f1280baaefbf095a67f8",
    "chain_length": 268006
  }
},
"enclave_leader_id": 1

Here is the link for the next block which was validated:
https://shelleyexplorer.cardano.org/en/block/e009a187d30d9907b9f04651bae9b6ca53f4c0ab82e179f7beaa0a00551375c3/
And the sequence of blocks would be as follows:

EPOCH SLOT TIME STAKE POOL
81 19846 06:15:09, March 4, 2020 9277…8d7
81 19843 06:15:03, March 4, 2020 365d…434
81 19830 06:14:37, March 4, 2020 365d…434
81 19811 06:13:59, March 4, 2020 7e03…04f
81 19800 06:13:37, March 4, 2020 4437…8bd

So I think I have to increase the peer max connections to be able to sync with the latest hash. I will also set the log level of the node to info to be able to determine the parent hash of a block which was not validated.


You can get the parent hash with the following jcli request:

jcli rest v0 block <blockhash> get -h <url> | cut -c 105-168
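
For reference, `cut -c 105-168` keeps characters 105 through 168 of each line, counting from 1, which is a 64-character window (the length of a hex-encoded 32-byte hash). The same extraction in Python, shown on a dummy string rather than real jcli output:

```python
def cut_chars(line: str, first: int, last: int) -> str:
    """Mimic `cut -c first-last`: 1-indexed, inclusive on both ends."""
    return line[first - 1:last]

# Dummy stand-in for the hex string printed by jcli (not real block data).
dummy_hex = "0" * 104 + "ab" * 32 + "f" * 20

parent_hash = cut_chars(dummy_hex, 105, 168)
print(len(parent_hash))  # 64
print(parent_hash)
```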
