Sudden 1GB RAM usage increases on both relay+bp

Around an hour ago, the RAM usage of both relay + producer increased by about 1GB. There was a small amount of CPU activity when it occurred:

[Screenshot 2021-04-22 at 19.31.42]

Zoomed in:

[Screenshot 2021-04-22 at 19.42.42]

I did some searching and found this issue, which seems to indicate that reward calculations happen 48 hours after the epoch start, though it doesn’t seem like we’re at 48 hours yet.
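For reference, this is the rough arithmetic I used to check how far into the current epoch we are. It’s only a sketch: the Shelley start time, starting epoch number and epoch length are assumed mainnet constants, not values read from the node.

    # Rough check of how many hours we are into the current mainnet epoch.
    # Assumes Shelley-era mainnet parameters (1-second slots, 432000 slots
    # per epoch) and the commonly quoted Shelley start point; these are
    # assumptions, not values read from the node itself.
    from datetime import datetime, timezone

    SHELLEY_START = datetime(2020, 7, 29, 21, 44, 51, tzinfo=timezone.utc)  # assumed
    SLOTS_PER_EPOCH = 432_000   # 5 days of 1-second slots
    SHELLEY_START_EPOCH = 208   # first Shelley epoch on mainnet

    now = datetime.now(timezone.utc)
    slots_since_shelley = int((now - SHELLEY_START).total_seconds())
    epoch = SHELLEY_START_EPOCH + slots_since_shelley // SLOTS_PER_EPOCH
    hours_into_epoch = (slots_since_shelley % SLOTS_PER_EPOCH) / 3600

    print(f"epoch {epoch}, ~{hours_into_epoch:.1f} hours since epoch start")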

Seems a bit strange for such a large increase in such a short space of time on both. Any ideas what may have happened? (there’s plenty of RAM, I’m just curious). There was not a corresponding increase in mempool size/txs when this occurred.


Can you correlate this with some log events? Are you even sure it’s from the cardano processes?

I’m seeing the same thing, and it appears to correlate with live data based on rts_gc_max_bytes_used:

[Screen Shot 2021-04-22 at 6.57.10 PM]
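If anyone wants to look at the same metric on their own node, this is a minimal sketch that just scrapes the node’s Prometheus endpoint and prints the RTS/GC lines. It assumes the default metrics port 12798 and that Prometheus metrics are enabled in the node config; adjust both if your setup differs.

    # Minimal sketch: scrape the node's Prometheus metrics endpoint and print
    # the RTS/GC-related lines, so a jump in rts_gc_max_bytes_used can be
    # compared against the dashboard graph. Assumes the default metrics port
    # 12798 and Prometheus metrics enabled in the node config.
    import urllib.request

    METRICS_URL = "http://127.0.0.1:12798/metrics"  # adjust host/port as needed

    with urllib.request.urlopen(METRICS_URL, timeout=5) as resp:
        body = resp.read().decode("utf-8", errors="replace")

    for line in body.splitlines():
        if line.startswith("rts_gc"):
            print(line)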

How much memory do you have to go this low? 4GB?

Also, be careful mate, you posted your IPs, your ports, etc. …
As good practice and for your own benefit, try hiding that information when you share things.

Hey, thanks for noticing. Those ports and IPs are not external.

Yes, I suspected as much, but as I said, it’s good practice :slight_smile:

Sorry, I didn’t include much because I thought it might’ve happened to everyone or there would be an obvious answer.

This is the memory usage from Kubernetes pods running the IOHK cardano-node image.

I don’t think this was aimed at me, but my host machine has 64GB and is running nothing but two cardano-node containers (one relay, one producer).

               total        used        free      shared  buff/cache   available
Mem:           62Gi       9.3Gi        28Gi       3.0Mi        24Gi        52Gi
Swap:         8.0Gi          0B       8.0Gi
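For completeness, this is a small sketch of how I’d cross-check the pod figure against the container’s own cgroup accounting from inside the pod. The file paths are the common cgroup v1/v2 defaults and may differ on other setups.

    # Sketch: read the container's own cgroup memory accounting to cross-check
    # the figure the Kubernetes dashboard shows for the pod. Tries the usual
    # cgroup v1 file first and falls back to cgroup v2; paths are common
    # defaults and may differ on some systems.
    from pathlib import Path

    CANDIDATES = [
        Path("/sys/fs/cgroup/memory/memory.usage_in_bytes"),  # cgroup v1
        Path("/sys/fs/cgroup/memory.current"),                # cgroup v2
    ]

    for path in CANDIDATES:
        if path.exists():
            usage = int(path.read_text().strip())
            print(f"{path}: {usage / 2**30:.2f} GiB")
            break
    else:
        print("no known cgroup memory file found")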

Also - I restarted the relay after this happened, and its memory usage did something similar around 7 hours later, this morning at about 4am:

[Screenshot 2021-04-23 at 07.07.17]

Looking in the logs around the time this happened (around 3:37am), I can’t see anything that looks unusual to me (though I don’t understand a lot of the output). There are a lot of FetchDeclineConcurrencyLimit and FetchDeclineChainNotPlausible entries (I have fetch decision tracing enabled because that seemed to be required for some of the metrics I wanted, though I don’t recall which ones), but nothing that looks like a warning or error.
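In case it’s useful to anyone doing the same correlation, this is roughly the script I’d use to count trace kinds in a window around the jump. It assumes JSON log lines with an "at" timestamp and a "data" object carrying a "kind" field, which may not match every logging config, and the log path is just a placeholder.

    # Sketch: count trace kinds in a window around the memory jump, to see
    # whether anything unusual shows up alongside the FetchDecline* noise.
    # Assumes JSON log lines with an "at" timestamp and a "data" object with
    # a "kind" field; adjust field names and the path to your logging config.
    import json
    from collections import Counter

    LOG_FILE = "node.json"          # placeholder path
    WINDOW = ("2021-04-23T03:30", "2021-04-23T03:45")

    counts = Counter()
    with open(LOG_FILE) as fh:
        for line in fh:
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue
            at = event.get("at", "")
            if WINDOW[0] <= at <= WINDOW[1]:
                counts[event.get("data", {}).get("kind", "unknown")] += 1

    for kind, n in counts.most_common(20):
        print(f"{n:6d}  {kind}")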

Yeah, OK, it’s containers on k8s.
It seems to be a common problem all over the place; there are other people with the same issue.

Most of that thread seems to relate to a slow increase, but that seems different to what happened here (which was a sudden increase of around 1GB).

Dumping the ledger state (as also mentioned in that thread) does seem intensive and increases usage, but that was true for me on previous versions too. I haven’t done that on this new machine (and certainly not at 3am), so I don’t think that’s the same either.

I’m using Docker with a slimmed-down Debian image.

Yes, you are probably right, but I feel there might be a correlation somewhere (maybe not).
It’s just that since 1.26.2 both problems are memory-linked, appeared around the same time, and so look release-linked.
Also, maybe they haven’t noticed the step behaviour you have but still get it.

It might also be different issues, you’re right though …