Strange gLiveView behavior after 1.26.1

Hello -

So far I have updated six servers to 1.26.1, and I am noticing something strange on all six after the update. I like to leave gLiveView open on the servers, and now when I go back and check on them after a period of time, I see that gLiveView has closed with the following error:

COULD NOT CONNECT TO A RUNNING INSTANCE, 3 FAILED ATTEMPTS IN A ROW!

When I re-open gLiveView, the “Uptime” does not show that the node went down, and there are no strange errors in the journal log.
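In case anyone wants to verify the same on their own nodes, a minimal sketch of what I mean by checking, assuming the service is named cnode as below:

journalctl -u cnode --since "24 hours ago" | grep -Ei "started|stopped|exited"   # service start/stop events
systemctl show cnode -p ActiveEnterTimestamp   # when the service last entered the active state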

Anyone experiencing the same or have any ideas why this is happening?

What messages do you see? Post the output of:
sudo systemctl status cnode

Are you using CNTools?

No messages at all. gLiveView just stops with the error message above.

Yes, I am using CNTools, but I have experienced this on all nodes, both the block producer and the relays.

OK, run sudo systemctl status cnode and post the output.

cnode.service - Cardano Node
Loaded: loaded (/etc/systemd/system/cnode.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2021-04-07 23:53:34 UTC; 18h ago
Main PID: 3295573 (cnode.sh)
Tasks: 16 (limit: 4682)
Memory: 3.5G
CGroup: /system.slice/cnode.service
├─3295573 /bin/bash /opt/cardano/cnode/scripts/cnode.sh
└─3295654 cardano-node run --topology /opt/cardano/cnode/files/topology.json --config /opt/cardano/cnode/files/config.json --database-path /opt/cardano/cnode/db →

Apr 07 23:53:34 *****-relay-2 systemd[1]: Started Cardano Node.
Apr 07 23:53:35 *****-relay-2 cnode[3295573]: Failed to query protocol-parameters from node, not yet fully started?
Apr 07 23:53:35 *****-relay-2 cnode[3295573]: WARN: A prior running Cardano node was not cleanly shutdown, socket file still exists. Cleaning up.
Apr 07 23:53:36 *****-relay-2 cnode[3295654]: Listening on http://0.0.0.0:12798

And what is the output from gLiveView? It says could not connect…?

Yes. It will run fine and stay open for hours, and then all of a sudden it closes to the command prompt with this message present:

COULD NOT CONNECT TO A RUNNING INSTANCE, 3 FAILED ATTEMPTS IN A ROW!

But when I reload gLiveView, all is fine and there was no reset of the “Uptime” counter or anything…
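One way to tell whether the node side is still answering when this happens is to poll the metrics endpoint by hand; a rough sketch, assuming the Prometheus listener on port 12798 shown in the status output above:

curl -s http://127.0.0.1:12798/metrics | head -n 5   # if this returns metrics, the node itself is up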

cardano-node run --topology /opt/cardano/cnode/files/topology.json --config /opt/cardano/cnode/files/config.json --database-path /opt/cardano/cnode/db →

On this command line, do you see the actual full path for the socket?

Try to run the node manually from the command line, and let us know.
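Something like this, just a sketch assuming the CNTools layout from your status output:

sudo systemctl stop cnode          # stop the service first so the port and socket are free
cd /opt/cardano/cnode/scripts
./cnode.sh                         # run the same wrapper in the foreground and watch its output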

Ah, understood… I don’t know what to say if it works for hours, then stops, and then runs fine again when you restart it…

@tsipou I also believe it to be related to my socket, but I do not know why… How do I see the full run command that is executed when I use sudo systemctl start cnode.service?

Press the right arrow key on the keyboard; the systemctl status pager truncates long lines.

Hahahaha, duh!! Thank you… here’s the remainder of the parameters:

--config /opt/cardano/cnode/files/config.json --database-path /opt/cardano/cnode/db --socket-path /opt/cardano/cnode/sockets/node0.socket --port 3001 --host-addr 0.0.0.0
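For future reference, either of these also shows the untruncated command, assuming the unit is named cnode:

systemctl status cnode -l --no-pager    # -l (--full) disables the line truncation
ps -eo args | grep '[c]ardano-node'     # or read the full running command straight from ps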

I think you are fine…

I have seen an increase of this on my 1.26.1 upgraded nodes as well, more so than with any past node version. I wonder if the node is disconnecting from the network ever so briefly but then reconnecting almost immediately. I did lose one block today at almost the exact same time that the “could not connect” message occurred: I was scheduled to mint a block, but the block was not minted. Strange. Maybe others will notice this as well, and then we’ll see if it is a common problem.

Yes, this is exactly my concern… Is there a way to raise this issue with the developers?

Quick update: a few hours ago I set “TraceMempool” to false in the node config on one of my servers where this was happening frequently, and it has not happened since. I will let it run overnight and report back…
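For anyone who wants to try the same change, a minimal sketch, assuming the default CNTools config path and that jq is installed:

cp /opt/cardano/cnode/files/config.json /opt/cardano/cnode/files/config.json.bak   # back up first
jq '.TraceMempool = false' /opt/cardano/cnode/files/config.json.bak > /opt/cardano/cnode/files/config.json
sudo systemctl restart cnode    # restart so the node picks up the change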

Confirmed. I’ve observed that a couple of times.

Guys, just for your info:

When you use the new topologyUpdater script, please go and edit CUSTOM_PEERS= and use a comma ( , ) instead of a colon ( : ) between the IP and the port.

Read the description of the parameter.
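For example, a sketch with placeholder addresses, based on my reading of the parameter description (comma between IP and port, pipe between peers):

# In topologyUpdater.sh
CUSTOM_PEERS="relay1.example.com,3001|10.0.0.5,3002"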


Another update: the node where I changed the TraceMempool setting has run all night with zero interruptions. Looks like this might be the solution; I’ll be making the update on all my nodes this morning.

@tsipou I have not made this change, but my custom peers appear to be populating as expected?
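In case it helps anyone else check theirs, the peers the updater actually wrote can be inspected in the generated topology file; a sketch, assuming the default CNTools path:

jq '.Producers' /opt/cardano/cnode/files/topology.json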
