Since 1.25.1, my relays use a lot of CPU constantly.
I had this issue on 1.24.x because of the well-known “HardForkEncoderDisabledEra” issue.
But that is not the case here, as the logs don’t contain this error. In fact, there is no error at all.
The only thing I can see that is bugging me is the spam of “BadInputsUTxO” log entries like this one:
{"at":"2021-02-06T18:36:10.08Z","env":"1.25.1:9a733","ns":["cardano.node.Mempool"],
"data":{"tx":{"txid":"txid: TxId {_unTxId ="11b27c665310dc29765054f1f4aa3cf8087d2c4aa5d618be98ac84824effb82d\"}"},
"kind":"TraceMempoolRejectedTx","mempoolSize":{"numTxs":6,"bytes":3075},"err":
{"kind":"BadInputsUTxO","error":"The transaction contains inputs that do not exist in the UTxO set.","consumed":0,"badInputs":
[["c31b3d8a75d542466c8f9e87b2121868a17358d18e4213f0a48f267848d0fcd9",0]],
"produced":77411931261}},"app":],"msg":"","pid":"3290056","loc":null,"host":"ubuntu","sev":"Info","thread":"346"}
Doing a tail -f on each log displays this line every second. It’s not a different “BadInputsUTxO” entry on each line, but the very same one.
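A rough way to confirm it really is the same rejection repeating (and to gauge the rate) is to count the hashes appearing on those lines. This is just a sketch; the log path is an assumption, so point it at wherever your node actually writes its JSON log:

# Extract every 64-character hex hash on the BadInputsUTxO lines (the rejected
# txid plus the missing inputs) and count how often each one repeats.
grep '"BadInputsUTxO"' /var/log/cardano/node.log \
  | grep -oE '[0-9a-f]{64}' \
  | sort | uniq -c | sort -rn | head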
My main question: if you want to keep your relays running this way, how would you generally know the relay was accepting transactions at all?
I considered this a somewhat drastic measure, and applied it to only one of our two relays. Unlike the other, it has peers that like to submit about 12 copies of each transaction. The first in each of those groups succeeds, and of course all the following copies fail with invalid UTxOs.
So turning off TraceMempool certainly keeps the load average down on the affected relay… presumably by saving the CPU load of having to mathematically invalidate the remaining ~11 copies… but then unfortunately we have no confirmation that the original, valid transaction is going through either.
You don’t “have to” change this parameter. I did it because my relays were consuming a lot of CPU because of these bad transactions.
Like any other trace, it’s there to log more data, but I assume it doesn’t change the core behavior of the relays, as my tip is always the highest and I still see the transaction count growing on the block producer.
If I understand this correctly, is someone sending redundant transactions to the blockchain? Is there any chance this is malicious? Almost like a DDoS?
It seems so. Saying it’s like a DDoS is a step I won’t take, but it can be quite expensive in terms of resources.
It’s rough for us to lose this visibility by turning the Mempool trace to false.
As I said (and did), you can keep the Mempool trace set to “true” on the block producer only. This way, you can still keep track of your transaction count growing, without needing to upgrade your infra because of this spam.
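For example (a sketch only, with assumed file names and paths; adjust them to your own setup), the relay config can be flipped with jq while the block producer keeps the trace on:

# Relay: turn the mempool trace off in the node's config file (path is an assumption).
jq '.TraceMempool = false' relay-config.json > relay-config.json.new \
  && mv relay-config.json.new relay-config.json
# Block producer: leave "TraceMempool": true in its config so the mempool
# transaction count stays visible in the logs.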
Thanks for filing this, guys. I have had the same issue since the hard fork.
For now I have mitigated it by assigning more resources to the relay nodes.
It’s important to start the node with the flags +RTS -N4 -RTS for 4 cores, or +RTS -N8 -RTS for 8, otherwise it will stick to utilizing no more than 2 cores in parallel.
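For reference, a sketch of such an invocation; the topology/config/database/socket paths and the port are placeholders for your own setup:

cardano-node run \
  --topology topology.json \
  --config config.json \
  --database-path db \
  --socket-path node.socket \
  --host-addr 0.0.0.0 \
  --port 3001 \
  +RTS -N4 -RTS   # or -N8 to use 8 cores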
It would be great to get some insight into what the “TraceMempool” flag in the config actually does, and what the effect of turning it off is. The difference between on and off in CPU usage is about a factor of 8 in my case (~5% when off vs. ~40% when on), and it results in ~10k log messages an hour.
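To sanity-check that figure, the mempool trace lines can be bucketed per hour straight from the log (again just a sketch, with an assumed log path):

# Count cardano.node.Mempool trace lines per hour, using the timestamp prefix
# up to the hour as the bucket key.
grep '"cardano.node.Mempool"' /var/log/cardano/node.log \
  | grep -oE '"at":"[0-9-]+T[0-9]{2}' \
  | sort | uniq -c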
I cannot believe that the mere logging results in this much CPU usage.
If the point of the setting is not the logging of transactions, but the actual settling/validating of them, there are some other questions:
Why does the node attempt to validate the same transaction multiple times (logically resulting in everything after the first attempt failing)?
What is the added value of a node with “TraceMempool” off, if it results in it no longer validating the transactions? Is it enough to do this only on BP nodes? What is the effect on the network if we were to turn this trace off on all relays?
I think the link from my previous reply gives the answer to your first point.
For the second point, all the traces are used to log what’s happening on your relays/block producer. Turning them on/off only affects the information you want to display or save to a text file.
This doesn’t affect the core behaviour of the node.
I only turned it ON on the Block-producer and was able to create blocks.
I see what you are saying, but as I mentioned in my previous message, it is hard for me to believe that just the logging of ~10k lines will cost ~35% of my CPU capacity (that would be some crappy coding if it did)… so there must be more going on! I simply want to understand what that is, and what the effect of turning it off is, not just for my BP node, but for the network as a whole.
It’s just that enabling tracing of mempool data implies creating a log entry detailing the problem and listing the UTxOs involved in a very inefficient way. This was fixed for Shelley in this PR, but the fix was dropped when they switched to the Mary era…
These are very good replies for me. But I think that if I can’t check the transactions, it means I don’t know my pool’s status. So I changed the config on only one relay; we still need to trace that issue.