I work on an app that uses the cardanocli-js library. Our app has its own relay node, which the JS library interacts with. This all worked fine until the network congestion hit the other week. Since then, we’ve been completely sunk. We can read addresses and chain state fine, we just can’t get any transactions out.
We also have access to a few other relays as well as a block producer. So I’ve been working with that team to try to get our relay to peer off the other 2 relays and the BP. Everything’s connected and we can see that the connections are good (all the nodes are using cntools).
We’ve upgraded all of them to 1.33.0 thinking maybe that would help. Nope. We can generate the transactions fine, and we get a transaction ID. But plugging that ID into Cardanoscan never pulls up anything. I’ve done probably half a dozen of these transactions today and none of them landed on chain.
Our application relay seems happy - the tip diff pretty much stays in the green. All the machines are 16GB, 6-8 core VMs. One thing I’ve noticed is that our app relay does not show any values for Total Tx and Pending Tx. The other relays do have values there. Those values on the BP are also empty, but it’s producing blocks - the other team says it’s working.
We tuned our app relay topology to not accept any incoming connections except from the other nodes we have access to. The main dev cites security and other issues I can’t go into here. So, I can’t change the topology entries unless he green-lights them. At this point, I’m just trying to figure out what else I can do to get our transactions out on the network. I know the congestion is going on and that things are slow, but the other team said their BP could withdraw rewards and send them (albeit a little slower than usual). With our app node tied in to these other nodes, I’m just trying to figure out why none of our transactions make it out.
Are the nodes connected to the blockchain (OUT connections)? How are the nodes connected?
What error do you receive? Have you ever been able to send transactions in the past?
Yeah, we were able to send transactions out no problem prior to the congestion.
Our app node has 15 connections out, 2 in. The two inbound connections we allow are for the other team’s relay and their BP. The other team’s relays both have about 17 in and out. Both of those relays have Total Tx and Pending Tx counts. Quite high too. Ours has no Tx counts, even when we send out a transaction.
I spoke with another friend last night who has an NFT minting service. He said his node has no inbound connections, 2-5 outbound, and he’s able to mint NFTs. His node is 1.31.0 and he hasn’t upgraded yet.
So, I’m scratching my head trying to figure out if it’s something else. Things I’ve tried to check, re-configure:
Upgraded to 1.33.0
Have plenty of disk space, RAM, swap, and CPUs
Latest updates applied, Ubuntu 21.x
Direct connections to a BP (currently minting blocks) and other relays that do process transactions
Our transactions do not mint NFTs. They mostly process payments and can include NFT transfers
Our relay is totally firewalled, ports/IPs only opened for topology servers (no outbound blocking)
My team and I are doing a review too to make sure something wasn’t introduced somewhere (I suspect not). I’ll keep digging, but it’s just weird that as soon as the congestion started, we literally cannot get any transactions on chain no matter how long we wait.
One thing I did notice when querying via the CLI on a 1.30.1 and a 1.31.0 node: they were both significantly faster than the two 1.33.0 nodes. I ran this query for an address’s UTXOs:
For the 1.33.0 nodes, the query returned in about 7 sec. For the other two nodes with lower versions, the query returned in about 1 sec. All of the nodes are 16GB VMs with 6-8 CPUs.
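For anyone who wants to repeat this kind of comparison, here is a rough sketch of a timing helper in Python. It just runs a command a few times and records the wall-clock time of each run. The function names `time_command` and `summarize` are made up for illustration, and the commented-out `cardano-cli` call at the bottom is only an assumption about what such a query could look like, with a placeholder address; substitute whatever query you are actually comparing.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run `argv` `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero,
        # so a failing query is not silently counted as a "fast" one.
        subprocess.run(
            argv,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return min / median / max of the measured durations."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Harmless stand-in command so the sketch runs anywhere:
print(summarize(time_command([sys.executable, "-c", "pass"], runs=3)))

# Hypothetical usage against a node (placeholder address, not from this thread):
# time_command(["cardano-cli", "query", "utxo", "--address", "<address>", "--mainnet"])
```

Running the same helper on each node makes the comparison a bit fairer than a single one-off run, since the first query after a restart can be slower than later ones.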
From what I read, my assumption was that 1.33.0 is a substantial update. But all my tests are showing it’s much slower. I’m trying to figure out next steps. At this point, it’s going to be very difficult to roll everything back, and a re-sync is going to take way too long. I guess I’ll see if I can use the db with 1.32.1 and check the query times with that.
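I fixed the transaction issue. The CLI flag --invalid-hereafter was set so the transaction was only valid for around 7 mins. I bumped it up to 15 mins and, after waiting a while, I was able to verify a transaction on Cardanoscan. Presumably, with the congestion, our transactions were expiring before they could make it into a block.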
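To make the arithmetic concrete, here is a minimal sketch of how the --invalid-hereafter slot could be derived from the current tip. It assumes a slot length of 1 second (true on mainnet since Shelley, but check your network) and that the tip JSON has a `slot` field, as `cardano-cli query tip` prints. The helper names and the sample slot number are made up for illustration.

```python
import json

# Assumption: 1 slot = 1 second (mainnet, Shelley era onward).
SLOT_LENGTH_SECONDS = 1


def tip_slot_from_json(tip_json: str) -> int:
    """Extract the current slot from the JSON printed by a tip query."""
    return int(json.loads(tip_json)["slot"])


def invalid_hereafter(tip_slot: int, ttl_minutes: float) -> int:
    """Return the last slot at which the transaction is still valid."""
    if ttl_minutes <= 0:
        raise ValueError("ttl_minutes must be positive")
    return tip_slot + int(ttl_minutes * 60 / SLOT_LENGTH_SECONDS)


# Sample tip output with an invented slot number, for illustration only.
sample_tip = '{"slot": 50000000}'
slot = tip_slot_from_json(sample_tip)

print(invalid_hereafter(slot, 7))   # 50000420 (the old ~7 minute window)
print(invalid_hereafter(slot, 15))  # 50000900 (the 15 minute window)
```

A longer window gives the transaction more time to sit in mempools and still be accepted when blocks are full, at the cost of waiting longer before you can be sure a missing transaction has really expired.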
I’ll have to see how 1.32.1 performs and possibly roll back to 1.31.0 if it’s too long to even fetch utxos.
1.32.1 is it for me. Works perfectly. Query times for the address example above were about 1 sec. Thanks for checking in on this, I know you do it a lot all over the forum. Cheers!