Hello community, I was producing blocks with version 1.35.3, but after upgrading to 1.35.5 I missed a block yesterday. While trying to figure out what happened, I noticed some oddities (Stakepool CTL). Firstly, my log is reporting the wrong version. See below: LiveView shows 1.35.5, but my logs show 1.35.3.
Secondly, I cannot view the operational certificate via cardano-cli:
cardano-cli query kes-period-info --op-cert-file op.cert --mainnet
Command failed: query kes-period-info Error: op.cert: op.cert: openFile: does not exist (No such file or directory)
which cardano-cli
/home/core/.cabal/bin/cardano-cli
I edited the env file and changed:
#CCLI="${HOME}/.cabal/bin/cardano-cli" # Override automatic detection of path to cardano-cli exec>
#CNCLI="${HOME}/.cargo/bin/cncli"
to
CCLI="${HOME}/.cabal/bin/cardano-cli" # Override automatic detection of path to cardano-cli exec>
CNCLI="${HOME}/.cargo/bin/cncli"
I then restarted the node:
sudo systemctl restart cnode
but it still returns the same error:
cardano-cli query kes-period-info --op-cert-file op.cert --mainnet
Command failed: query kes-period-info Error: op.cert: op.cert: openFile: does not exist (No such file or directory)
This suggests it is still pointing to the old binary. I assume that the cnode.sh file contains a path variable pointing to the wrong binary, but shouldn't those paths be in the env file, or are they set in the bashrc file?
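One way to see which binary an interactive shell would pick up is to list every cardano-cli on PATH in lookup order. The sketch below is only illustrative: find_on_path is a hypothetical helper, not part of the guild scripts, and it assumes that the CCLI setting in the env file is read by the guild scripts only, so that typing cardano-cli at the prompt is resolved through PATH alone.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every executable called "$1" found on PATH,
# in the order the shell searches the directories (the first hit is what runs).
find_on_path() {
  local name="$1" dir
  local IFS=':'   # split PATH on colons for the loop below
  for dir in $PATH; do
    if [ -n "${dir}" ] && [ -f "${dir}/${name}" ] && [ -x "${dir}/${name}" ]; then
      echo "${dir}/${name}"
    fi
  done
}

find_on_path cardano-cli
```

If more than one line is printed, the first one is the binary that runs, and calling each listed path with --version shows which release it is.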
Here is part of the env file (no further un-commented parts in the config):
CCLI="${HOME}/.cabal/bin/cardano-cli" # Override automatic detection of path to cardano-cli executable
CNCLI="${HOME}/.cargo/bin/cncli" # Override automatic detection of path to cncli executable (https://github.com/AndrewWestberg/cncli)
#CNODE_HOME="/opt/cardano/cnode" # Override default CNODE_HOME path (defaults to /opt/cardano/cnode)
CNODE_PORT=3001 # Set node port
#CONFIG="${CNODE_HOME}/files/config.json" # Override automatic detection of node config path
#SOCKET="${CNODE_HOME}/sockets/node0.socket" # Override automatic detection of path to socket
#TOPOLOGY="${CNODE_HOME}/files/topology.json" # Override default topology.json path
#LOG_DIR="${CNODE_HOME}/logs" # Folder where your logs will be sent to (must pre-exist)
#DB_DIR="${CNODE_HOME}/db" # Folder to store the cardano-node blockchain db
#UPDATE_CHECK="Y" # Check for updates to scripts, it will still be prompted before proceeding (Y|N).
#TMP_DIR="/tmp/cnode" # Folder to hold temporary files in the various scripts, each script might create additional subfolders
#USE_EKG="Y" # Use EKG metrics from the node instead of Prometheus. Prometheus metrics yield slightly better performance but>
#EKG_HOST=127.0.0.1 # Set node EKG host IP
#EKG_PORT=12788 # Override automatic detection of node EKG port
#PROM_HOST=127.0.0.1 # Set node Prometheus host IP
#PROM_PORT=12798 # Override automatic detection of node Prometheus port
#EKG_TIMEOUT=3 # Maximum time in seconds that you allow EKG request to take before aborting (node metrics)
#CURL_TIMEOUT=10 # Maximum time in seconds that you allow curl file download to take before aborting (GitHub update process)
#BLOCKLOG_DIR="${CNODE_HOME}/guild-db/blocklog" # Override default directory used to store block data for core node
#BLOCKLOG_TZ="UTC" # TimeZone to use when displaying blocklog - https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
#SHELLEY_TRANS_EPOCH=208 # Override automatic detection of shelley epoch start, e.g 208 for mainnet
#TG_BOT_TOKEN="" # Uncomment and set to enable telegramSend function. To create your own BOT-token and Chat-Id follow guide at:
#TG_CHAT_ID="" # https://cardano-community.github.io/guild-operators/Scripts/sendalerts
TIMEOUT_LEDGER_STATE=3600 # Timeout in seconds for querying and dumping ledger-state
#IP_VERSION=4 # The IP version to use for push and fetch, valid options: 4 | 6 | mix (Default: 4)
#DBSYNC_QUERY_FOLDER="${CNODE_HOME}/files/dbsync/queries" # [advanced feature] Folder containing DB-Sync chain analysis queries
#WALLET_FOLDER="${CNODE_HOME}/priv/wallet" # Root folder for Wallets
#POOL_FOLDER="${CNODE_HOME}/priv/pool" # Root folder for Pools
# Each wallet and pool has a friendly name and subfolder containing all related keys, certificates, ...
POOL_NAME="stake_pool" # Set the pool's name to run node as a core node (the name, NOT the ticker, ie folder name)
I have edited the env file and commented out the two lines pointing to the .cabal folder.
I have copied all the contents of the .cabal bin folder to the .local bin folder:
cd ${HOME}/.cabal/bin/
cp ./* ${HOME}/.local/bin/
Then removed the .cabal folder:
rm -r -f ${HOME}/.cabal/
I then rebooted the machine and viewed the log file. It now shows 1.35.5 in the env. cardano-cli is now correctly resolved from the .local folder:
which cardano-cli
/home/core/.local/bin/cardano-cli
I thought I was in the clear but running the command still gives an error:
cardano-cli query kes-period-info --op-cert-file op.cert --mainnet
Command failed: query kes-period-info Error: op.cert: op.cert: openFile: does not exist (No such file or directory)
OK, but from which directory are you running this command?
You need to be inside /priv/pool/pool_folder/ if you use it without a path…
If you are running it from another location, you will need to add the full path, like…
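A relative op.cert is looked up in the current working directory, which is why the command fails outside the pool folder. As a rough sketch, the full path can be built from the env file values quoted earlier. This assumes the default CNODE_HOME of /opt/cardano/cnode, the default pool folder under priv/pool, and POOL_NAME="stake_pool"; op_cert_path is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the op.cert path from the env file defaults.
# Assumes POOL_FOLDER is left at its default of ${CNODE_HOME}/priv/pool.
op_cert_path() {
  local cnode_home="${CNODE_HOME:-/opt/cardano/cnode}"
  local pool_name="${POOL_NAME:-stake_pool}"
  echo "${cnode_home}/priv/pool/${pool_name}/op.cert"
}

OP_CERT="$(op_cert_path)"
if [ -f "${OP_CERT}" ] && command -v cardano-cli >/dev/null 2>&1; then
  # Same query as before, but independent of the current directory.
  cardano-cli query kes-period-info --op-cert-file "${OP_CERT}" --mainnet
else
  echo "op.cert or cardano-cli not found (looked for ${OP_CERT})" >&2
fi
```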
I have followed the instructions for migrating to guild-deploy.sh and have reset the config files after the update. I did this on both the relay and core machines, then rebooted both.
The nodes seem to be running OK, with 14 incoming connections and 14 outgoing connections on the relay.
The problem is with the core machine: there are 2 outgoing connections (expected) but 0 incoming connections. I double-checked that the env file has the correct port set, as per the previous env file.
You must also edit the topology file on the BP to add your relays, and on the relays edit topologyUpdater.sh to add your BP (and any other custom peers) to the custom peers line, uncommenting that line.
You can find the information in the old files, which were backed up.
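For the relay side, here is a sketch of what the custom peers value might look like. The addr:port[:valency] entries separated by | follow the format described in the comments of topologyUpdater.sh as far as I know, so treat that as an assumption and check it against your own copy. 10.0.0.1 and relays.example.com are placeholders, 3001 is the CNODE_PORT from the env file above, and join_peers is a hypothetical helper used only to show the format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join addr:port[:valency] entries with '|'.
join_peers() {
  local IFS='|'   # "$*" joins arguments with the first character of IFS
  echo "$*"
}

# Placeholder peers; replace with your BP and any other custom peers.
CUSTOM_PEERS="$(join_peers "10.0.0.1:3001" "relays.example.com:3001:2")"
echo "CUSTOM_PEERS=\"${CUSTOM_PEERS}\""
```

The printed line is the form that would go into topologyUpdater.sh once the custom peers line is uncommented.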
It was the topologyUpdater.sh file on the relay node and once I amended custom peers as you suggested, the BP node now has incoming connections.
Back to the original problem of why I missed a block: does the operational certificate below look OK?
cardano-cli query kes-period-info --op-cert-file /opt/cardano/cnode/priv/pool/stake_pool/op.cert --mainnet
✓ Operational certificate's KES period is within the correct KES period interval
✓ The operational certificate counter agrees with the node protocol state counter
{
"qKesCurrentKesPeriod": 647,
"qKesEndKesInterval": 678,
"qKesKesKeyExpiry": null,
"qKesMaxKESEvolutions": 62,
"qKesNodeStateOperationalCertificateNumber": 6,
"qKesOnDiskOperationalCertificateNumber": 6,
"qKesRemainingSlotsInKesPeriod": 4002447,
"qKesSlotsPerKesPeriod": 129600,
"qKesStartKesInterval": 616
}
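The figures in this output can be sanity-checked with a little arithmetic. The sketch below assumes one slot per second on mainnet and that qKesRemainingSlotsInKesPeriod counts the slots left before the current KES key expires (which is what the numbers suggest); kes_summary is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive expiry figures from kes-period-info values.
# Args: start_period current_period max_evolutions remaining_slots
kes_summary() {
  local start="$1" current="$2" max_evolutions="$3" remaining_slots="$4"
  local end=$(( start + max_evolutions ))          # should match qKesEndKesInterval
  local periods_left=$(( end - current ))          # whole KES periods until expiry
  local days_left=$(( remaining_slots / 86400 ))   # assumes 1 slot = 1 second
  echo "end_period=${end} periods_left=${periods_left} days_left=${days_left}"
}

# Values taken from the output above.
kes_summary 616 647 62 4002447
# → end_period=678 periods_left=31 days_left=46
```

On these numbers the certificate has about 46 days left and both checks pass, so an expired KES key does not look like the cause of the missed block.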
Or do you think it was perhaps the server's configuration that prevented the node from minting the block?
I searched the log for a couple of keywords, but nothing came up. To be honest, I really don't know where to start when looking for clues in the log as to why a block was missed. Is there a guide somewhere with steps to determine why a block was missed?