Revision as of 09:38, 17 March 2021

Description

This test case verifies that systemd-oomd kills the cgroup with the most pgscans when memory pressure on user@$UID.service exceeds 10% (or whatever limit is defined in systemd-oomd-defaults).

Setup

This test case should be performed on either bare-metal or virtual machines.
Check that you are running systemd 248~rc1 or higher with systemctl --version.
Ensure the systemd-oomd-defaults package is installed (included with Fedora 34).
You will also need to install stress-ng.
Boot the system and log in as a regular user.
So as not to trigger the swap policy for systemd-oomd, create an override with the following commands (don't forget to remove this file and run systemctl daemon-reload afterwards to restore the settings):

sudo mkdir -p /etc/systemd/system/-.slice.d/
printf "[Slice]\nManagedOOMSwap=auto" | sudo tee /etc/systemd/system/-.slice.d/99-test.conf
sudo systemctl daemon-reload
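
The version requirement in the setup steps can be checked mechanically. The following is a minimal sketch, not part of the original test case: `systemd_version_ok` is a hypothetical helper name, and it assumes the first line of `systemctl --version` has the form `systemd <version> (...)`.

```shell
# Hypothetical helper: read `systemctl --version` output on stdin and
# succeed if the reported major version is at least 248.
systemd_version_ok() {
    # Assumption: the first line looks like "systemd 248 (v248-2.fc34)",
    # so the second field holds the version.
    ver=$(awk 'NR==1 {print $2}')
    # Strip everything from the first non-digit on (e.g. "~rc1" or ".3").
    major=${ver%%[!0-9]*}
    [ -n "$major" ] && [ "$major" -ge 248 ]
}

# Usage on a live system:
#   systemctl --version | systemd_version_ok && echo "systemd is new enough"
```

Reading from stdin rather than calling `systemctl` inside the function keeps the check usable on saved output as well.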

How to test

Check that systemd-oomd is running:

systemctl status systemd-oomd

Check that the systemd-oomd-defaults policy was applied by running oomctl and verifying that "/user.slice/user-$UID.slice/user@$UID.service/" is listed as a path under "Memory Pressure Monitored CGroups" along with some stats. "Swap Monitored CGroups" should show no paths since we put in an override.
Now run the test:

systemd-run --user --scope /usr/bin/stress-ng --brk 0 --stack 0 --bigheap 0 --timeout 120s

Make sure to clean up the override when you're done:

sudo rm /etc/systemd/system/-.slice.d/99-test.conf
sudo systemctl daemon-reload
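
The pre-flight check on the `oomctl` output described above can also be scripted before starting stress-ng. This is a rough sketch under stated assumptions: `oomctl_policy_ok` is a hypothetical helper, and it assumes that `oomctl` prints each monitored cgroup on a line containing `Path:` beneath the "Swap Monitored CGroups" and "Memory Pressure Monitored CGroups" headings.

```shell
# Hypothetical helper: read `oomctl` output on stdin and succeed only if
# user@<uid>.service is listed under "Memory Pressure Monitored CGroups"
# and no path is listed under "Swap Monitored CGroups".
oomctl_policy_ok() {
    uid=$1
    awk -v want="user@${uid}.service" '
        /Swap Monitored CGroups/            { section = "swap"; next }
        /Memory Pressure Monitored CGroups/ { section = "pressure"; next }
        # Assumption: every monitored cgroup appears on a "Path:" line.
        /Path:/ && section == "swap"        { swap_paths++ }
        /Path:/ && section == "pressure" && index($0, want) { found = 1 }
        END { exit !(found && swap_paths == 0) }
    '
}

# Usage on a live system:
#   systemctl is-active --quiet systemd-oomd && echo "systemd-oomd is running"
#   oomctl | oomctl_policy_ok "$(id -u)" && echo "policy applied, swap override active"
```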

Expected Results

The system becomes unresponsive during the test but should respond again once stress-ng is killed.
systemd-oomd will have killed all the processes before the 120 second timeout. stress-ng may take some time to build up pressure, but should be killed before the timeout. If the command runs to timeout, it means systemd-oomd did not kill it.
You can verify by checking for some of the relevant log lines with journalctl: "Memory pressure for <...> and there was reclaim activity" or "systemd-oomd killed <...> process(es)".
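
The journal check can be sketched as a small filter. `oomd_kill_logged` is a hypothetical helper name; the patterns are taken from the two log messages quoted above, with `.*` standing in for the elided `<...>` parts.

```shell
# Hypothetical helper: read journal text on stdin and succeed if either of
# the log messages quoted in the expected results appears.
oomd_kill_logged() {
    grep -E -q 'Memory pressure for .* and there was reclaim activity|systemd-oomd killed .* process\(es\)'
}

# Usage on a live system (reading the full journal for the current boot
# may require elevated privileges, depending on group membership):
#   journalctl -b --no-pager | oomd_kill_logged && echo "systemd-oomd kill was logged"
```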

@@ Line 23: / Line 23: @@
 * Now run the test:
 <pre>
-systemd-run --user -r --unit systoomd_mempressure_test /usr/bin/stress-ng --brk 1 --stack 1 --bigheap 1 -t 90s
+systemd-run --user --scope /usr/bin/stress-ng --brk 0 --stack 0 --bigheap 0 --timeout 120s
 </pre>
 * Make sure to clean up the override and reset the test unit when you're done:
@@ Line 29: / Line 29: @@
 sudo rm /etc/systemd/system/-.slice.d/99-test.conf
 sudo systemctl daemon-reload
-systemctl --user reset-failed systoomd_mempressure_test.service
 </pre>
 |results=
 * The system becomes unresponsive during the test but should respond again once `stress-ng` is killed.
-* systemd-oomd will have killed systoomd_mempressure_test.service after about 10 seconds. `stress-ng` will timeout after 90 seconds, so if the the command runs to timeout, it means systemd-oomd did not kill it. You can verify by checking for log lines that say something about "Memory pressure for <...> and there was reclaim activity" and "systemd-oomd killed <...> process(es)" with `journalctl`.
+* systemd-oomd will have killed all the processes before the 120 second timeout. `stress-ng` may take some time to build up pressure, but should be killed before the timeout. If the the command runs to timeout, it means systemd-oomd did not kill it. * * You can verify by checking for some of the relevant log lines with `journalctl`: "Memory pressure for <...> and there was reclaim activity" or "systemd-oomd killed <...> process(es)"
 }}


QA:Testcase Memory Pressure Based Killing: Difference between revisions

