Debugging node question cardano_node_metrics_slotsMissedNum_int

ToTheMoonADA · 28 April 2021 18:25

Hi,

I was looking at my node and wanted to see if I missed a block. Can someone help explain:

cardano_node_metrics_slotsMissedNum_int 195
rts_gc_max_bytes_slop 58062976
cardano_node_metrics_Stat_threads_int 15
cardano_node_metrics_density_real 4.901070974768561e-2
cardano_node_metrics_epoch_int 262
cardano_node_metrics_Forge_node_not_leader_int 435714

laplasz · 28 April 2021 19:31

similar topic:

and some workaround to solve the missing slot (not perfect)

ToTheMoonADA · 28 April 2021 19:37

hi,

Is there a method I can use to see how many slots I’ve missed and where?
journalctl --since='2021-04-20' | grep -c Missed

laplasz · 28 April 2021 19:40

These metrics are exposed in Prometheus format; you can query them on the node machine:

$ curl localhost:12798/metrics
cardano_node_metrics_nodeIsLeaderNum_int 3
rts_gc_par_tot_bytes_copied 543077170096
rts_gc_num_gcs 138068
cardano_node_metrics_slotsMissedNum_int 152
rts_gc_max_bytes_slop 63980872
cardano_node_metrics_served_block_count_int 126996
...

where 12798 is the port the node's Prometheus endpoint listens on
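
To pull a single counter out of that output rather than scanning it by eye, something like the following sketch may help. The `metric_value` helper is a made-up name, not part of cardano-node, and the sample text is copied from the output quoted above; on a live node you would pipe in `curl -s localhost:12798/metrics` instead, assuming the same port.

```shell
# Print the value of one metric from Prometheus text-format input on stdin.
# Usage: metric_value NAME < metrics.txt
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Sample lines copied from the output quoted in this thread.
sample='cardano_node_metrics_nodeIsLeaderNum_int 3
rts_gc_num_gcs 138068
cardano_node_metrics_slotsMissedNum_int 152
cardano_node_metrics_served_block_count_int 126996'

printf '%s\n' "$sample" | metric_value cardano_node_metrics_slotsMissedNum_int   # prints 152
```

Matching on the whole first field avoids accidentally picking up other metrics whose names merely contain the same text.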

ToTheMoonADA · 28 April 2021 19:44

Is it possible to find out what caused the missed slots shown below? I’m trying to investigate what happened.

I’m running the BP on an AWS t2.large: 2 vCPU, x86_64, 8 GB memory, network performance low to moderate.

cardano_node_metrics_slotsMissedNum_int 195

laplasz · 28 April 2021 20:23

Do you have logs with cardano.node.BlockFetchClient trace enabled?

ToTheMoonADA · 28 April 2021 20:34

It is currently set to false. Will turn it on and restart the bp node.

"TraceBlockFetchClient": false,
"TraceBlockFetchDecisions": true,
"TraceBlockFetchProtocol": false,
"TraceBlockFetchProtocolSerialised": false,
"TraceBlockFetchServer": false,
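
A rough sketch of flipping one of these flags from the shell, in case editing by hand is awkward. `enable_trace` is a hypothetical helper, and the demo runs on a throwaway temp file rather than a real node config; it assumes the flag appears as `"Name": false` with straight quotes. The node still has to be restarted afterwards.

```shell
# Flip one boolean trace flag from false to true in a JSON config file.
# Usage: enable_trace FLAG FILE   (keeps the original as FILE.bak)
enable_trace() {
  cp "$2" "$2.bak" &&
  sed "s/\"$1\": *false/\"$1\": true/" "$2.bak" > "$2"
}

# Demo on a temp file, not a real node config.
cfg=$(mktemp)
printf '{\n  "TraceBlockFetchClient": false,\n  "TraceBlockFetchDecisions": true\n}\n' > "$cfg"

enable_trace TraceBlockFetchClient "$cfg"
cat "$cfg"
```

The closing quote in the pattern keeps the substitution from touching longer flag names that start with the same text, such as `TraceBlockFetchClientX`.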

laplasz · 28 April 2021 20:56

So the logs need some post-processing. Each new tip comes with a blockNo and a slot number. If you collect these events, the blockNo should increase one by one. If there is a gap, then a slot was missed.

mahdi_gha · 28 April 2021 21:48

No.

"Slots missed" happens when your BP cannot determine, at a specific time, whether it was the leader or not. Many factors can cause this:
node version, CPU, RAM, peer connections, storage drive, other services running on the server, running the relay and BP on one system, and so on.

laplasz · 28 April 2021 22:05

Hmm, is there a specific log for this event?
And what are the consequences if a slot is missed? Might the node fail to create a block even though it was scheduled to?

ToTheMoonADA · 28 April 2021 22:18

Can you please elaborate? I have the BP on an AWS t2.large, which I can convert to a c5.large, and one relay on an AWS t2.large, which I can also convert to a c5.large. I have another relay on bare metal with an i7, 16 GB RAM and a 512 GB SSD. My guess is that it was due to the high tx count and the instance couldn’t keep up.


ToTheMoonADA · 28 April 2021 22:25

Hi Laplasz,

does this information help? I found it just after enabling blockFetchDecision and restarting
[2021-04-28 22:21:31.63 UTC] fromList [("tx",Object (fromList [("txid",String "txid: TxId {_unTxId = SafeHash "302be2aabc37ef21ec99c7469225d6bdbe1937cb1933fef74e6f2916fec51cbb"}")])),("mempoolSize",Object (fromList [("bytes",Number 1759.0),("numTxs",Number 5.0)])),("kind",String "TraceMempoolRejectedTx"),("err",Object (fromList [("badInputs",Array [String "8dd64896f8810b7336f7a3ed8d2273f6283253a0afb76358433a30b5c60c4ab2#9"]),("consumed",Object (fromList [("lovelace",Number 1.4475606e7),("policies",Object (fromList ))])),("error",String "The transaction contains inputs that do not exist in the UTxO set."),("incorrectWithdrawals",Array [Array [Object (fromList [("credential",Object (fromList [("key hash",String "1bb3e0a32ea31e8a1d80e63d9d669f74c3d624cee0af47f12ef2db18")])),("network",String "Mainnet")]),Number 1.4475606e7]]),("kind",String "WithdrawalsNotInRewardsDELEGS"),("produced",Object (fromList [("lovelace",Number 6.014531525e9),("policies",Object (fromList ))]))]))]

laplasz · 28 April 2021 22:28

No… that is transaction-related.

ToTheMoonADA · 29 April 2021 05:56

Upgraded one of the relays to a c5.large instance. After 6 hours of monitoring, no more missed slots so far. Will update the BP soon.

mahdi_gha · 29 April 2021 09:00

I don’t see it.
You can see it in the metrics data from: curl -s http://localhost:12798/metrics
Yes, that’s right. You have 20 sec to submit the block or you lose it.

mahdi_gha · 29 April 2021 09:07

What is your BP hardware config? Which version are you using (cardano-node --version)?
Your relay hardware is fine.
No, your tx count is fine. I have 250-300 tx in the mempool but didn’t lose any blocks.

jf3110 · 29 April 2021 22:10

Unfortunately, slots start going missing again after about 9-11 hours, at least for me. At the same time, memory consumption rises from 2.5 GB to almost 5 GB. Looks like a memory leak to me. IMO that needs to be addressed by the development team.

This is happening on a 16GB 6 core configuration - i.e. no memory or processing power problem.
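
One way to narrow down when the misses happen is to save the metrics output at intervals and compare the counter between snapshots. Below is a minimal sketch; `metric_delta` is a hypothetical helper, the snapshot files are temp files with made-up values, and it assumes the counter only grows while the node keeps running.

```shell
# Print how much a counter grew between two saved metrics snapshots.
# Usage: metric_delta NAME OLD_FILE NEW_FILE
metric_delta() {
  old=$(awk -v n="$1" '$1 == n { print $2 }' "$2")
  new=$(awk -v n="$1" '$1 == n { print $2 }' "$3")
  # Treat a metric that is absent from a snapshot as 0.
  echo $(( ${new:-0} - ${old:-0} ))
}

# Made-up snapshots for illustration; on a real node each file would be
# the saved output of: curl -s localhost:12798/metrics
before=$(mktemp); after=$(mktemp)
echo 'cardano_node_metrics_slotsMissedNum_int 195' > "$before"
echo 'cardano_node_metrics_slotsMissedNum_int 203' > "$after"

metric_delta cardano_node_metrics_slotsMissedNum_int "$before" "$after"   # prints 8
```

Taking snapshots alongside memory readings would show whether the jump in missed slots lines up with the rise in memory use.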

ToTheMoonADA · 29 April 2021 22:25

I’ve now been up for 24 hours with no more missed slots. I also increased the network bandwidth. Could internet speed be a factor in how quickly the node responds when checking a slot for leadership?

laplasz · 29 April 2021 22:56

Could you open an issue on the GitHub site?

As far as I can see, it has not been reported by others yet…

ToTheMoonADA · 29 April 2021 23:15

Hi @laplasz, is it normal to often have 0 tx in the mempool? I’m seeing it pretty often: after a small spike it goes to 0 and back again. I would think it should stay above 0 tx…
