facebookincubator / oomd
- Tuesday, July 24, 2018, 00:15:57
C++
A userspace out-of-memory killer
oomd is a userspace Out-Of-Memory (OOM) killer for Linux systems.
Out-of-memory killing has historically happened inside kernel space. On a memory-overcommitted Linux system, malloc(3) and friends will never fail. However, if an application dereferences the returned pointer and the system has run out of physical memory, the Linux kernel is forced to take extreme measures, up to and including killing processes. This is typically a slow and painful process because the kernel spends an unbounded amount of time swapping pages in and out and evicting the page cache. Furthermore, kernel OOM policy is both inflexible and somewhat complicated to configure.
oomd aims to solve this problem in userspace. oomd leverages PSI and cgroups v2 to monitor a system holistically, then takes corrective action in userspace before an OOM occurs in kernel space. Corrective action is configured via a flexible plugin system, in which custom code can be written. By default, this involves killing offending processes. This enables an unparalleled level of flexibility where each workload can have custom protection rules. Furthermore, time spent churning pages in kernel space is minimized. In practice at Facebook, we've regularly seen 30-minute host lockups go away entirely.
Note that oomd requires PSI to function. This kernel feature has not yet been upstreamed (as of 7/18/18).
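To make the PSI dependency more concrete, here is a minimal sketch of reading one pressure line and comparing it to a threshold. It assumes the text format used by kernels that ship PSI upstream (for example /proc/pressure/memory, with lines such as "some avg10=0.12 avg60=0.05 avg300=0.01 total=12345"); the pre-upstream patches this README refers to may expose a different format. The names PsiLine, parsePsiLine, and underPressure are hypothetical and are not taken from oomd's source.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <string>

// One parsed PSI line. Field names mirror the assumed upstream format.
struct PsiLine {
  std::string kind;     // "some" or "full"
  double avg10 = 0.0;   // % of time stalled over the last 10 seconds
  double avg60 = 0.0;   // same, over 60 seconds
  double avg300 = 0.0;  // same, over 300 seconds
};

// Parses a line such as "some avg10=0.12 avg60=0.05 avg300=0.01 total=12345".
// Returns false if a token is not key=value or an average is missing.
bool parsePsiLine(const std::string& line, PsiLine& out) {
  std::istringstream in(line);
  if (!(in >> out.kind)) {
    return false;
  }
  bool seen10 = false, seen60 = false, seen300 = false;
  std::string token;
  while (in >> token) {
    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos) {
      return false;
    }
    const std::string key = token.substr(0, eq);
    const std::string value = token.substr(eq + 1);
    try {
      if (key == "avg10") {
        out.avg10 = std::stod(value);
        seen10 = true;
      } else if (key == "avg60") {
        out.avg60 = std::stod(value);
        seen60 = true;
      } else if (key == "avg300") {
        out.avg300 = std::stod(value);
        seen300 = true;
      }
      // Other keys (such as total) are ignored in this sketch.
    } catch (const std::exception&) {
      return false;  // value was not a number
    }
  }
  return seen10 && seen60 && seen300;
}

// Hypothetical policy check: is the 10-second average above a threshold?
bool underPressure(const PsiLine& psi, double thresholdPct) {
  assert(thresholdPct >= 0.0);  // averages are percentages, never negative
  return psi.avg10 > thresholdPct;
}
```

A caller would read each line of the pressure file with std::getline, pass it to parsePsiLine, and act only when underPressure returns true. Which average oomd actually consults is not stated here, so the choice of avg10 is an assumption.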
oomd currently depends on meson, libfolly, and jsoncpp. The dependency on folly will soon be removed (as of 7/18/18).
$ git clone https://github.com/facebook/oomd
$ cd oomd
$ meson build && ninja -C build
$ cd build && sudo ninja install
oomd receives runtime configuration from two sources: environment variables and a config file. You typically do not need to change the default environment values, but you can do so to fine-tune oomd.
Default location: /etc/oomd.json
Example config:
{
"cgroups": [
{
"target": "system.slice",
"kill_list": [
{"chef.service": { "kill_pressure": "60", "max_usage": "100" } },
{"sshd.service": { "max_usage": "inf" } }
],
"oomdetector": "default",
"oomkiller": "noop"
},
{
"target": "workload.slice",
"kill_list": [],
"oomdetector": "default",
"oomkiller": "default"
}
],
"version": "0.2.0"
}
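The kill_list entries in the example config above carry string thresholds such as "60", "100", and "inf". The sketch below shows one possible reading of such an entry. It is an assumption, not oomd's documented behavior: a missing threshold is treated as "never triggers", "inf" is treated as no limit, and an entry matches when either threshold is exceeded. The units of kill_pressure and max_usage are not stated in this README, so the sketch treats them as plain numbers. KillRule, parseThreshold, and shouldKill are hypothetical names.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical in-memory form of one kill_list entry.
// Both thresholds default to infinity, meaning "never triggers".
struct KillRule {
  double killPressure = std::numeric_limits<double>::infinity();
  double maxUsage = std::numeric_limits<double>::infinity();
};

// Converts a config string such as "60" or "inf" to a number.
// Throws std::invalid_argument (from std::stod) on non-numeric text.
double parseThreshold(const std::string& text) {
  assert(!text.empty());
  if (text == "inf") {
    return std::numeric_limits<double>::infinity();
  }
  return std::stod(text);
}

// Assumed semantics: the rule matches when either observed value
// exceeds its threshold. An infinite threshold can never be exceeded.
bool shouldKill(const KillRule& rule, double pressure, double usage) {
  return pressure > rule.killPressure || usage > rule.maxUsage;
}
```

Under this reading, the chef.service entry would match once pressure passes 60 or usage passes 100, while the sshd.service entry would never match. Check the real semantics against oomd's own documentation before relying on them.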
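This example config describes two monitored cgroups, system.slice and workload.slice. Each has its own kill_list, oomdetector, and oomkiller settings.
Environment variables: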
OOMD_INTERVAL
OOMD_VERBOSE_INTERVAL
OOMD_POST_KILL_DELAY
OOMD_THRESHOLD
OOMD_HIGH_THRESHOLD
OOMD_HIGH_THRESHOLD_DURATION
OOMD_LARGER_THAN
OOMD_GROWTH_ABOVE
OOMD_MIN_SWAP_PCT
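The variables above are read from the process environment. As a rough illustration of that pattern, the sketch below reads one variable and falls back to a caller-supplied default when it is unset or malformed. It assumes the values are plain integers, which this README does not state. The helper name envOrDefault is hypothetical, and the fallback values a caller passes are placeholders, not oomd's real defaults.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Returns the integer value of environment variable `name`.
// Falls back to `fallback` if the variable is unset, empty,
// out of range, or contains anything other than a base-10 integer.
long envOrDefault(const char* name, long fallback) {
  assert(name != nullptr);
  const char* raw = std::getenv(name);
  if (raw == nullptr || *raw == '\0') {
    return fallback;
  }
  char* end = nullptr;
  errno = 0;
  const long value = std::strtol(raw, &end, 10);
  if (errno != 0 || *end != '\0') {
    return fallback;  // overflow or trailing garbage
  }
  return value;
}
```

For example, envOrDefault("OOMD_INTERVAL", 5) returns the parsed value when the variable is set to a valid integer and 5 otherwise; the 5 here is only a placeholder.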
oomd depends on gtest/gmock to run tests. Installing gtest/gmock from master is preferred.
If meson detects gtest/gmock is installed, meson will generate build rules for tests.
$ cd oomd
$ rm -rf build
$ meson build && ninja test -C build
oomd is GPL 2 licensed, as found in the LICENSE file.