RL9 dnf OOMs with EPEL but only on one machine!

I currently have 7 RL9 VMs running.

On one machine, and only one machine, dnf will OOM the machine. And only when EPEL is enabled.

(For testing I have disabled swap 'cos otherwise the machine goes into swap-of-death and loadav went to over 50!)

eg

dnf makecache
...
[652504.185326] systemd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
...
[652504.185569] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
...
[652504.185735] [3661998]     0 3661998   212112   178248  1593344        0             0 dnf
...
[652504.185824] Out of memory: Killed process 3661998 (dnf) total-vm:848448kB, anon-rss:712980kB, file-rss:12kB, shmem-rss:0kB, UID:0 pgtables:1556kB oom_score_adj:0
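
For anyone reading the numbers above: the per-process table counts total_vm and rss in pages, while the final "Killed process" line reports kB. A minimal sketch of the conversion, assuming 4096-byte pages (typical on x86_64; getconf PAGESIZE reports the real value). The helper name pages_to_kb is just for illustration.

```shell
# Convert a page count from the oom-killer process table to kB.
# Assumes 4096-byte pages unless a page size in bytes is passed as $2.
pages_to_kb() {
    pages=$1
    page_size=${2:-4096}
    echo $(( pages * page_size / 1024 ))
}

pages_to_kb 212112   # total_vm column for dnf
pages_to_kb 178248   # rss column for dnf
```

This prints 848448 and 712992. The first matches total-vm:848448kB, and the second equals anon-rss plus file-rss (712980kB + 12kB), so dnf really was holding roughly 700MB resident when it was killed.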

But if I disable epel then it works fine

% dnf --disablerepo=epel makecache
Docker CE Stable - x86_64                        25 kB/s | 3.5 kB     00:00    
Extra Packages for Enterprise Linux 9 openh264  6.6 kB/s | 993  B     00:00    
MariaDB                                         3.9 kB/s | 3.4 kB     00:00    
Rocky Linux 9 - BaseOS                          8.1 kB/s | 4.1 kB     00:00    
Rocky Linux 9 - AppStream                        15 kB/s | 4.5 kB     00:00    
Rocky Linux 9 - Extras                           26 kB/s | 2.9 kB     00:00    
Metadata cache created.

Indeed a --disablerepo=* --enablerepo=epel also explodes.

An rpm --verify epel-release shows the repo file is unchanged.

Even manually cleaning out /var/cache/dnf didn’t help.

On another machine on the same physical host

 /bin/time -v dnf makecache
Extra Packages for Enterprise Linux 9 - x86_64   24 kB/s |  33 kB     00:01
Extra Packages for Enterprise Linux 9 openh264  8.8 kB/s | 993  B     00:00
Rocky Linux 9 - BaseOS                           29 kB/s | 4.1 kB     00:00
Rocky Linux 9 - AppStream                        14 kB/s | 4.5 kB     00:00
Rocky Linux 9 - Extras                           19 kB/s | 2.9 kB     00:00
Metadata cache created.
...
        Maximum resident set size (kbytes): 104388
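
To compare machines it helps to pull out just that one figure. A small sketch, assuming GNU time (its -v report goes to stderr); peak_rss_kb is a made-up helper name.

```shell
# Print the peak RSS in kB from GNU "time -v" output read on stdin.
peak_rss_kb() {
    awk -F': ' '/Maximum resident set size/ { print $2 }'
}

# Example use: send time's report (stderr) into the pipe and
# discard dnf's own progress output:
#   /bin/time -v dnf makecache 2>&1 >/dev/null | peak_rss_kb
```

On the machine above this would report 104388, i.e. about 100MB.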

So I have no idea why this one machine is blowing up!

do all the machines have the same specs? are they hardware or VMs?

They’re all VMs. The machine with the problem has 2vCPU and 2Gb RAM, of which 1.1Gb is in use; the comparison machine which is on the same physical host and worked with no problem only has 1vCPU and 512Mb RAM (330Mb free).

(I know; in theory the test machine is underspec’d but it works; you just can’t do an initial install with that little memory. The problem machine exceeds the minimum requirements).

Ok, but if you re-enable it, with the correct value, are you then able to run makecache epel all the way to the end without it crashing?

This machine has a 1Gb swap partition (so half RAM).

I run collectd on this, reporting into a grafana instance. Fortunately, first time this machine went mad, this kept running (while swap was enabled). So I got a graph…

Console logins would timeout, ssh logins just didn’t proceed. There was no way to get into the machine.

That looks like a swap of death to me (load average increasing because of too much time wasted swapping). So I forcibly rebooted to get the machine back under control.

That’s why I disabled swap, to give the OOM-killer a chance!

So I’m not re-running that test with swap enabled, but I think the graph shows it wouldn’t help.

Well, I think the problem with the swap in this instance would have been due to there not being enough of it. You said the VM has 2GB RAM, with 1.1GB of it in use. Assuming that was a single process using 1.1GB, when dnf was running and needed RAM, the system would have wanted to move processes that were unused at that time to swap. If swap is only 1GB, it won’t be able to put the 1.1GB process into it, and so that would fail. In this case, the system should OOM kill the largest process. Theoretically that should have been the 1.1GB process, since dnf at that point only had 900MB of RAM available (2GB - 1.1GB).

Ideally swap should be double ram for this instance, so 4GB. There is a point when you get over a certain amount of ram, that you start decreasing the size of the swap. But, that all depends on what you are doing with your system. For example, with my laptop, I still need double the ram size for it to suspend/hibernate properly.

This link by RH: Chapter 15. Swap Space | Red Hat Product Documentation

shows that systems with 2-8GB of RAM should have swap equal to the amount of RAM. Even at this level, however, I would still use double the RAM. Only once I get to 16GB of RAM would I think about making swap the same size as RAM.

You can also set vm.swappiness and change it from the default of 60 to a lower value to prefer using RAM over swap. Eg:

echo 0 > /proc/sys/vm/swappiness

or, to make it persistent, by putting this in /etc/sysctl.d/99-local.conf or whatever filename you want:

vm.swappiness = 0

All that says is: prefer using RAM, and only use swap if necessary. Most databases will say to set this to zero. I’ve seen this for Java apps as well. I used to set it to 10 for the majority of my systems, but now I just set it to zero. Setting it to zero doesn’t mean it won’t use swap; it will still do so, but only if necessary.


swap=“twice ram” has never been true on Linux. It’s an old generalisation from early BSD days (eg SunOS 4) and how that swapped. On SunOS if you had swap=“twice ram” then your whole virtual memory was only “twice ram”. (IIRC the first “ram size” of swap was used as backing store, and so only the second “ram size” added to the VM size).

On Linux (and Solaris 2) you only need swap=ram to meet the same “twice ram” virtual memory. The RedHat guidance is based around “expected workload” and is purely for sizing purposes, not for functional reasons.

Further, Linux pages rather than swaps, so even if there was a single 1.1Gb process it would only page out subsets of the least recently used memory rather than the whole application. (Of course there isn’t a single 1.1Gb process; the largest two are mariadb with a 400Mb RSS and HomeAssistant with a 370Mb RSS).

But all this is a diversion from the underlying question; why is dnf exploding with EPEL? I showed, on another machine, that it can complete in under 100Mb.

Now this is interesting. I built a new machine with 1 CPU and 512Mb RAM; makecache initially took 300Mb. Repeatable (cd /var/cache/dnf ; rm -rf * then redo it). I upped the spec to 2Gb RAM, and now makecache takes 800Mb RAM. I upped the VM to 3G RAM but memory requirement stayed at 800M.

So the memory allocation requirement appears to change with server memory, to a maximum?

That would explain why it works on all my other machines (no matter what the size) but fails on this one; it’s just this one where it exceeded free memory! (which was less than 900M 'cos the kernel takes about 300M for itself).
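
If that holds, a quick pre-flight check before refreshing metadata would catch the bad case. A rough sketch, assuming the ~800M peak seen above is the ceiling (that figure is only my observation on these VMs, not a documented dnf limit) and using the kernel's MemAvailable estimate from /proc/meminfo. The function names are made up for illustration.

```shell
# Print MemAvailable (kB) from a meminfo-style file; defaults to /proc/meminfo.
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed if at least $1 kB is available.
# Optional $2: alternative meminfo-style file to read.
enough_memory_for() {
    need_kb=$1
    have_kb=$(mem_available_kb "$2")
    [ "${have_kb:-0}" -ge "$need_kb" ]
}

# Example: warn if less than about 800MB (819200 kB) is available.
#   enough_memory_for 819200 || echo "dnf makecache may get OOM-killed" >&2
```

On the problem machine, with under 900M left after the kernel and 1.1Gb in use, this check would fail, which matches what the OOM-killer reported.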