He deliberately exhausted the memory, hoping the system would kill the process, because the following situation might occur: a Kubernetes container’s memory exceeds its limit, but the process isn’t killed as expected.
I recall the oom_killer of CentOS 4 (or 5) targeting system services rather than rogue user processes, but I have not seen that since.
When memory usage goes up, swap comes into use. Swapping slows the system to a “stuck”-like state until all of RAM+swap is in use and oom_killer finally acts. Was the system stuck “forever” (oom_killer does nothing or kills vital services) or just for “a long time”?
Can one “contain” the containers with cgroups? That is how, for example, SLURM limits the memory usage of the jobs that it runs. Then the cgroup limit, rather than system-wide memory exhaustion, would get the contained processes killed, and the system overall would not run out of memory (as long as the cgroups are not given too much or all of it).
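As a rough illustration of the cgroup idea, here is a minimal sketch that assumes cgroup v2 and systemd, which is what Rocky 9 uses by default. The function name run_capped and the DRY_RUN variable are made up for this example. MemoryMax and MemorySwapMax are systemd resource-control properties. Creating the scope normally needs root, or --user if the memory controller is delegated to the user session.

```shell
#!/bin/bash
# Sketch: run a command in its own transient cgroup with a hard memory cap.
# run_capped and DRY_RUN are hypothetical names used only for illustration.
run_capped() {
    local limit=$1; shift
    # MemoryMax is the hard limit: above it, only processes inside this
    # scope are OOM-killed. MemorySwapMax=0 keeps the scope out of swap.
    local cmd=(systemd-run --scope --quiet
               -p "MemoryMax=$limit"
               -p MemorySwapMax=0
               "$@")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command that would be run.
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

For example, `run_capped 500M ./your_program` (the program name is a placeholder) should get the program killed once it grows past roughly 500 MB, without pushing the rest of the system into swap.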
oom_killer doesn’t seem to do anything: the kswapd process occupies 100% of the CPU, which causes the entire system to get stuck. Is there a solution?
Setting vm.overcommit_memory=2 can indeed solve the problem, but it may waste part of the memory. I want vm.overcommit_memory=0, so my question is: with vm.overcommit_memory=0, how can this problem be solved on Rocky Linux 9?
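For reference, a small sketch for checking which overcommit mode is active and how much is currently committed. The function names are made up for this example; the mode meanings follow the kernel's overcommit accounting documentation. In mode 2 the commit limit is swap plus vm.overcommit_ratio percent of RAM (50 by default), which is probably where the "wasted" memory comes from.

```shell
#!/bin/bash
# Sketch: describe the vm.overcommit_memory modes (hypothetical helper names).
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (kernel default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting (allocations fail past CommitLimit)" ;;
        *) echo "unknown mode: $1" ;;
    esac
}

# Print the active mode plus the kernel's commit counters (Linux only).
show_overcommit() {
    describe_overcommit "$(cat /proc/sys/vm/overcommit_memory)"
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
}
```

Calling `show_overcommit` before and during the test shows how close Committed_AS gets to CommitLimit.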
Can you try launching six instances at once, e.g. using bash background ‘&’ and ‘wait’ or some other async launcher? Your stdout will be a bit muddled unless you add the pid as a prefix.
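A minimal sketch of that kind of launcher, using only ‘&’ and ‘wait’. The function name launch_parallel is made up, and the prefix is the PID of the background subshell wrapping each instance, not of the program itself.

```shell
#!/bin/bash
# launch_parallel N CMD [ARGS...]: start N copies of CMD at once, wait for all.
# (Hypothetical helper name, for illustration only.)
launch_parallel() {
    local n=$1; shift
    local i
    for ((i = 1; i <= n; i++)); do
        (
            # $BASHPID is the PID of this background subshell.
            tag=$BASHPID
            # Merge stderr into stdout and prefix every line with the tag.
            "$@" 2>&1 | sed "s/^/[$tag] /"
        ) &
    done
    wait   # block until every background instance has exited
}
```

Usage would be something like `launch_parallel 6 ./your_program`, where the program name is a placeholder for whatever is being tested.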
I’ve just tested the oom killer in a Rocky 9.5 guest.
The VM guest only has 2 GB of RAM, but you can do the same test on bigger machines by changing 800MiB to something like 2GiB or more.
When I run this script, I see free memory going down, down, down and intense swapping, then I see multiple oom killer messages in dmesg. The OS did not crash or freeze, it just killed the bad processes.
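#!/bin/bash
set -u
# Start 10 memory hogs, one per second.
for i in {1..10}; do
    # The zeros contain no newlines, so tail has to buffer all 800 MiB;
    # each pipeline holds that memory for about 10 seconds.
    head -c 800MiB /dev/zero | tail | sleep 10 &
    sleep 1
done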
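To check afterwards which processes the oom killer picked, the kernel log can be filtered. This is a sketch only: the function name oom_victims is made up, and the pattern assumes the "Killed process PID (name)" wording that recent kernels print. Reading dmesg may need root.

```shell
#!/bin/bash
# oom_victims: read kernel log text on stdin and print one line per
# OOM kill. (Hypothetical helper name, for illustration only.)
oom_victims() {
    grep -Eo 'Killed process [0-9]+ \([^)]*\)'
}
```

Typical use would be `dmesg | oom_victims` or `journalctl -k | oom_victims`.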
No, you first need to run the bash example above on your system, because we don’t know whether the issue is your system or your code. We know from the test above that the oom killer works on a clean install of Rocky 9.5.