Enable EarlyOOM on Fedora KDE
Summary
As Fedora Workstation did in Fedora 32, install the earlyoom package and enable it by default in Fedora KDE. If free RAM drops below 4% and free swap drops below 10%, earlyoom sends SIGTERM to the process with the largest oom_score. If free RAM drops below 2% and free swap drops below 5%, earlyoom sends SIGKILL to the process with the largest oom_score. The idea is to recover from out-of-memory situations sooner, instead of the typical complete system hang that leaves the user no choice but to force a power-off.
Owner
- Name: Ben Cotton
- Email: bcotton@redhat.com
Current status
- Targeted release: Fedora 33
- Last updated: 2020-07-16
Detailed Description
Shamelessly copied from Workstation, which did it in the last release:
Certain workloads have heavy memory demands, quickly consume all of RAM, and start to page out heavily to swap. (Heavy paging is often called "swap thrashing" for added descriptive effect, probably because it's noticeable and annoying.) Incidental swap usage is a good thing: it frees up memory for active pages used by a process. Heavy swap usage quickly leads to a very negative UX, because it's slow, even on modern SSDs. By default, the installer makes the swap partition the same size as available memory (at install time), which can be huge. This just extends swap thrashing time.
On the one hand, we want this resource-hungry job to complete. On the other hand, we want our system to stay responsive while that work is going on. But once the GUI stutters or even comes to an apparent standstill (hang), we're really wishing the kernel oom-killer would kick in and free up memory, so we can start over, maybe using memory or thread limiting options. (Arguably those options should be figured out more intelligently; that too is a work in progress, but beyond the scope of this feature.)
However, once in a heavy swap scenario, it's relatively common for the system to get stuck there: GUI interactivity is terrible to non-existent, and the kernel oom-killer still doesn't trigger. From a certain point of view, this is working as intended. The kernel oom-killer is concerned with keeping the kernel running. It's not at all concerned with user space responsiveness.
Instead of the system becoming completely unresponsive for tens of minutes, hours or days, this feature expects that an offending process (determined by oom_score, same as the kernel oom-killer) will be killed off within seconds or a few minutes.
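The selection rule above (pick the process with the largest oom_score) can be illustrated with a minimal sketch. This is not earlyoom's actual code. It assumes a Linux /proc layout where each process exposes its score in /proc/PID/oom_score, and the function names read_oom_scores and pick_victim are hypothetical.

```python
import os


def read_oom_scores(proc_root="/proc"):
    """Return {pid: oom_score} for every process whose score can be read.

    Assumption: each process directory under proc_root is named by its
    numeric PID and contains an 'oom_score' file, as on Linux.
    """
    scores = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "oom_score")) as f:
                scores[int(entry)] = int(f.read().strip())
        except (OSError, ValueError):
            # The process may have exited between listing and reading,
            # or the file may be unreadable; ignore it in either case.
            continue
    return scores


def pick_victim(scores):
    """Return the PID with the largest oom_score, or None if there is none."""
    if not scores:
        return None
    return max(scores, key=scores.get)
```

Calling pick_victim(read_oom_scores()) on a running system returns the PID that this rule would choose at that moment.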
Feedback
Why not all desktops?
They're welcome to join in.
This will kill my applications
The service is easy enough for administrators to tune or disable, so that should not prevent making this the default. Workstation has used it for a release without any apparent trouble.
Benefit to Fedora
KDE users will be able to take advantage of the benefits Workstation users got from enabling earlyoom in Fedora 32:
- improved user experience by more quickly regaining control over one's system, rather than having to force power off in low-memory situations where there's aggressive swapping. Once a system becomes unresponsive, it's completely reasonable for the user to assume the system is lost, but forcing a power-off carries a high potential for data loss.
- reducing forced poweroff as the main workaround will increase data collection, improving understanding of low-memory situations and how to handle them better
- earlyoom first sends SIGTERM to the chosen process, so it has a chance of a proper shutdown, unlike the kernel's oom-killer
Scope
- Proposal owners:
- Modify https://pagure.io/fedora-comps/blob/master/f/comps-f33.xml.in to include the earlyoom package in the kde-desktop section.
- Add https://src.fedoraproject.org/rpms/fedora-release/blob/master/f/80-kde.preset to include:
  # enable earlyoom by default on KDE
  enable earlyoom.service
- Other developers: None, unless KDE-based Spins/Labs want to opt out
- Release engineering: N/A
- Policies and guidelines: N/A
- Trademark approval: N/A
Upgrade/compatibility impact
earlyoom.service will be enabled on upgrade. An upgraded system should exhibit the same behaviors as a newly-installed system.
How To Test
- Fedora 31/32 KDE users can test today:
- sudo dnf install earlyoom
- sudo systemctl enable --now earlyoom
And then attempt to cause an out of memory situation. Examples:
- tail /dev/zero
- https://lkml.org/lkml/2019/8/4/15
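While one of the tests above is running, it can help to watch how close the system is to the thresholds. The following is a minimal sketch, not part of earlyoom. It assumes that available memory is taken from the MemAvailable field of /proc/meminfo and that a system without swap counts as 0% swap free. The function names are hypothetical.

```python
import time


def parse_meminfo(text):
    """Parse the text of /proc/meminfo into {field_name: integer_value}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def free_percentages(meminfo):
    """Return (available RAM %, free swap %) from parsed meminfo values.

    Assumption: with no swap configured, swap is reported as 0% free.
    """
    mem_pct = 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]
    swap_total = meminfo.get("SwapTotal", 0)
    if swap_total:
        swap_pct = 100.0 * meminfo.get("SwapFree", 0) / swap_total
    else:
        swap_pct = 0.0
    return mem_pct, swap_pct


def watch(path="/proc/meminfo", interval=1.0):
    """Print both percentages every `interval` seconds until interrupted."""
    while True:
        with open(path) as f:
            mem_pct, swap_pct = free_percentages(parse_meminfo(f.read()))
        print(f"RAM available: {mem_pct:5.1f}%  swap free: {swap_pct:5.1f}%")
        time.sleep(interval)
```

Running watch() in a second terminal before starting the memory hog shows the percentages falling toward the limits at which earlyoom is expected to act.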
User Experience
earlyoom sends SIGTERM to the process with the largest oom_score when free RAM is below 4% and free swap is below 10%, and SIGKILL when free RAM is below 2% and free swap is below 5%.
Dependencies
None
Contingency Plan
- Contingency mechanism: Owner reverts changes
- Contingency deadline: Final freeze
- Blocks release? No
Documentation
- man earlyoom
- https://github.com/rfjakob/earlyoom
- https://www.kernel.org/doc/gorman/html/understand/understand016.html
Release Notes
The earlyoom service is now enabled by default in Fedora KDE.
The earlyoom service monitors system memory usage. If free memory falls below a set limit, earlyoom terminates an appropriate process to free up memory. As a result, the system does not become unresponsive for long periods of time in low-memory situations.
The following is the default earlyoom configuration:
- If RAM goes below 4% free and swap goes below 10% free, earlyoom sends the SIGTERM signal to the process with the largest oom_score.
- If RAM goes below 2% free and swap goes below 5% free, earlyoom sends the SIGKILL signal to the process with the largest oom_score.
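The two rules above can be read as a small decision function. The sketch below only illustrates the documented defaults and is not earlyoom's implementation. The constant and function names are hypothetical, and "below" is taken to mean a strict comparison.

```python
import signal

# Default limits as described above, in percent free.
SIGTERM_MEM_PCT, SIGTERM_SWAP_PCT = 4.0, 10.0
SIGKILL_MEM_PCT, SIGKILL_SWAP_PCT = 2.0, 5.0


def choose_signal(mem_free_pct, swap_free_pct):
    """Return the signal to send, or None if no action is needed.

    Both RAM and swap must be below their limits for a rule to apply.
    The stricter SIGKILL rule is checked first because any state that
    satisfies it also satisfies the SIGTERM rule.
    """
    if mem_free_pct < SIGKILL_MEM_PCT and swap_free_pct < SIGKILL_SWAP_PCT:
        return signal.SIGKILL
    if mem_free_pct < SIGTERM_MEM_PCT and swap_free_pct < SIGTERM_SWAP_PCT:
        return signal.SIGTERM
    return None
```

For example, 3% free RAM with 50% free swap triggers nothing, because the swap condition is not met. The same 3% free RAM with 8% free swap results in SIGTERM.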
For more information, see the earlyoom man page.