From Fedora Project Wiki

Description

This test case verifies that systemd-oomd kills the cgroup with the most pgscans when memory pressure on user@$UID.service exceeds 10% (or whatever limit is defined in systemd-oomd-defaults).

Setup

  • This test case should be performed on either bare-metal or virtual machines.
  • Check that you are running systemd 248~rc4 or higher with systemctl --version.
  • Ensure the systemd-oomd-defaults package is installed (included with Fedora 34).
  • You will also need to install stress-ng.
  • Boot the system and log in as a regular user.
  • Ensure no other conflicting userspace OOM killers are running. For example, you may have to stop earlyoom:
sudo systemctl stop earlyoom
  • To avoid triggering the systemd-oomd swap policy, create an override with the following commands (remember to remove this file and run systemctl daemon-reload afterwards to restore the settings):
sudo mkdir -p /etc/systemd/system/-.slice.d/
printf "[Slice]\nManagedOOMSwap=auto\n" | sudo tee /etc/systemd/system/-.slice.d/99-test.conf
sudo systemctl daemon-reload
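
The version check from the setup list can be scripted. The following is a minimal sketch and not part of the official test case: the helper names systemd_version_number and version_at_least are hypothetical, and it assumes the first line of the systemctl --version output has the form "systemd <version> (...)". Only the leading number is compared, so a pre-release suffix such as ~rc4 is ignored.

```shell
#!/bin/sh
# Hypothetical helpers; not part of systemd or of this test case.

# Read `systemctl --version` output on stdin and print the leading
# numeric part of the version (e.g. "248" from "systemd 248 (...)").
systemd_version_number() {
    awk 'NR == 1 && $1 == "systemd" { sub(/[^0-9].*$/, "", $2); print $2 }'
}

# Succeed if the version number read from stdin is >= the minimum in $1.
# Pre-release suffixes (such as ~rc4) are not taken into account.
version_at_least() {
    min=$1
    ver=$(systemd_version_number)
    [ -n "$ver" ] && [ "$ver" -ge "$min" ]
}
```

With these functions defined in the current shell, a possible invocation is: systemctl --version | version_at_least 248 && echo "systemd is new enough"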

How to test

  • Check that systemd-oomd is running:
systemctl status systemd-oomd
  • Check that the systemd-oomd-defaults policy was applied by running oomctl and verifying that "/user.slice/user-$UID.slice/user@$UID.service/" is listed as a path under "Memory Pressure Monitored CGroups" along with some stats. "Swap Monitored CGroups" should show no paths since we put in an override.
  • Now run the test:
systemd-run --user --scope /usr/bin/stress-ng --brk 2 --stack 2 --bigheap 2 --timeout 90s
  • Make sure to clean up the override and reset the test unit when you're done:
sudo rm /etc/systemd/system/-.slice.d/99-test.conf
sudo systemctl daemon-reload
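
The oomctl check described above can be automated roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and the helper only searches for the path string anywhere in the oomctl output. It does not confirm that the path appears under "Memory Pressure Monitored CGroups", so a manual look at the output is still worthwhile.

```shell
#!/bin/sh
# Hypothetical helpers; they only do plain text matching on oomctl output.

# Print the cgroup path that systemd-oomd-defaults is expected to monitor
# for the numeric user ID in $1. The trailing slash is left off so that
# output with or without it will match.
user_service_cgroup() {
    printf '/user.slice/user-%s.slice/user@%s.service\n' "$1" "$1"
}

# Succeed if the path for user ID $1 appears anywhere in the text on stdin.
# This does not check which section of the output contains the path.
oomctl_lists_user_service() {
    grep -qF -- "$(user_service_cgroup "$1")"
}
```

A possible invocation, once the functions are defined: oomctl | oomctl_lists_user_service "$(id -u)" && echo "pressure policy applied"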

Expected Results

  • The system becomes unresponsive during the test but should respond again once stress-ng is killed.
  • If the main stress-ng process was killed, the command will print "Killed" and exit with a non-zero code (expected result). If it runs to completion, it will print something about a "successful run" and exit 0 (unexpected result).
  • This test will invoke the kernel OOM killer in combination with systemd-oomd. The kernel OOM killer will kill worker processes from stress-ng, but not the main process. stress-ng continually spawns new worker processes until the main process is killed by systemd-oomd.
  • systemd-oomd should have killed all the processes before the timeout. stress-ng may take some time to build up pressure. If the command runs until the timeout, it means systemd-oomd did not kill it.
  • You can verify by checking for some of the relevant log lines with journalctl: "Memory pressure for <...> and there was reclaim activity" or "systemd-oomd killed <...> process(es)"
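
One way to look for those log lines is to filter the journal for the two message fragments quoted above. This is a minimal sketch: the function name oomd_kill_lines is hypothetical, and the exact message wording may differ between systemd versions, so an empty result does not by itself prove that nothing was killed.

```shell
#!/bin/sh
# Hypothetical helper; reads journal text on stdin and prints only the lines
# that contain one of the two message fragments quoted in this test case.
# Exits non-zero (like grep) when no line matches.
oomd_kill_lines() {
    grep -E 'Memory pressure for |systemd-oomd killed '
}
```

A possible invocation for the current boot is: journalctl -b --no-pager | oomd_kill_lines (reading the full system journal may require sudo or membership in a suitable group).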

Optional

  • You can also try a variant of this test that is less likely to invoke the kernel OOM killer; the idea is to use up all the free memory and swap, leaving ~0.5GB. This should generate enough pressure on the system without invoking an actual out of memory event:
swapfree=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
memfree=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
target=$((memfree + swapfree - 500000))
systemd-run --user --scope stress-ng -m 1 --vm-bytes "${target}K" --vm-keep
  • Make sure the swap policy is disabled as described in the setup or it will kick in first.
  • You will have to stop the command with Ctrl-C if systemd-oomd does not kill it. It can take some time to build up pressure and meet the kill condition. However, if the oomctl output shows that the value in "Pressure: Avg10: <value>" does not go above 10.0 after a couple of minutes or so, there might have been too much memory available to generate pressure.
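
As an alternative to reading oomctl by eye, the 10-second pressure average can be read from the kernel's pressure-stall (PSI) files. The sketch below makes several assumptions: the helper names are hypothetical, cgroup v2 is mounted at /sys/fs/cgroup, and the "full" line of memory.pressure is the figure that corresponds to the value oomctl reports.

```shell
#!/bin/sh
# Hypothetical helpers for reading kernel pressure-stall (PSI) data.

# Read PSI text on stdin (the format of /proc/pressure/memory or of a
# cgroup's memory.pressure file) and print avg10 from the "full" line.
psi_full_avg10() {
    awk '$1 == "full" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^avg10=/) { sub(/^avg10=/, "", $i); print $i }
    }'
}

# Succeed if the decimal value $1 is strictly greater than the limit $2.
exceeds() {
    awk -v v="$1" -v limit="$2" 'BEGIN { exit !(v + 0 > limit + 0) }'
}
```

With the functions defined, one possible check while the stress test is running is: v=$(psi_full_avg10 < "/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/memory.pressure"); exceeds "$v" 10.0 && echo "pressure above 10%: $v"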