There are two different out-of-memory conditions you can encounter in Linux. Which one you encounter depends on the value of sysctl vm.overcommit_memory.
The kernel can perform what is called ‘memory overcommit’. This is when the kernel allocates programs more memory than is really present in the system. This is done in the hopes that the programs won’t actually use all the memory they allocated, as this is a quite common occurrence.
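To see which mode a machine is in, you can read the setting and translate it. This is only a small sketch: describe_overcommit_mode is a made-up helper name, and the descriptions are my own paraphrase of the three modes discussed in this answer.

```shell
# describe_overcommit_mode is a hypothetical helper, not a standard tool.
# It maps a vm.overcommit_memory value to a short description.
describe_overcommit_mode() {
    case "$1" in
        0) echo "heuristic overcommit (kernel guesses when to refuse)" ;;
        1) echo "always overcommit" ;;
        2) echo "no overcommit (strict accounting)" ;;
        *) echo "unknown mode: $1" ;;
    esac
}
```

On a live system it could be called as describe_overcommit_mode "$(cat /proc/sys/vm/overcommit_memory)".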
overcommit_memory = 2
When overcommit_memory is set to 2, the kernel does not perform any overcommit at all. Instead, when a program is allocated memory, it is guaranteed to have access to that memory. If the system does not have enough free memory to satisfy an allocation request, the kernel will just return a failure for the request. It is up to the program to handle the situation gracefully. If the program does not check whether the allocation succeeded, and it actually failed, the application will often encounter a segfault.
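In mode 2 you can get a rough idea of how close the system is to refusing allocations by comparing the CommitLimit and Committed_AS fields in /proc/meminfo. The following is a sketch under that assumption. commit_headroom_kb is a hypothetical helper, and the optional file argument exists only so it can be tried against a saved copy of meminfo.

```shell
# commit_headroom_kb is a hypothetical helper. It prints CommitLimit minus
# Committed_AS (both in kB) from a meminfo-format file.
# Defaults to /proc/meminfo when no file is given.
commit_headroom_kb() {
    awk '
        $1 == "CommitLimit:"  { limit = $2 }
        $1 == "Committed_AS:" { used = $2 }
        END { printf "%d\n", limit - used }
    ' "${1:-/proc/meminfo}"
}
```

The kernel only enforces CommitLimit in mode 2, so in modes 0 and 1 the result can be negative without anything being wrong.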
In the case of a segfault, you should find a line such as this in the output of dmesg:
[1962.987529] myapp: segfault at 0 ip 00400559 sp 5bc7b1b0 error 6 in myapp[400000+1000]
The "at 0" means that the application tried to access address 0, that is, it dereferenced a null pointer. This can be the result of a failed memory allocation call whose return value was never checked (but it is not the only cause).
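If the kernel log is noisy, a filter along these lines can pull out just the program name and faulting address from segfault entries. It is a sketch that assumes the line format shown above. segfault_address is a hypothetical helper name, and it reads log lines on stdin.

```shell
# segfault_address is a hypothetical helper. It reads kernel log lines on
# stdin and prints "program address" for every segfault entry it finds.
segfault_address() {
    sed -n 's/.*\] \([^:]*\): segfault at \([0-9a-f]*\) .*/\1 \2/p'
}
```

Typical use would be dmesg | segfault_address, where an address of 0 points at a null pointer dereference.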
overcommit_memory = 0 and 1
When overcommit_memory is set to 0 or 1, overcommit is enabled, and programs are allowed to allocate more memory than is really available.
However, when a program wants to use the memory it was allocated, but the kernel finds that it doesn’t actually have enough memory to satisfy it, it needs to get some memory back.
It first tries to perform various memory cleanup tasks, such as flushing caches, but if this is not enough it will then terminate a process. This termination is performed by the OOM-Killer. The OOM-Killer looks at the system to see what programs are using what memory, how long they’ve been running, who’s running them, and a number of other factors to determine which one gets killed.
After the process has been killed, the memory it was using is freed up, and the program which just caused the out-of-memory condition now has the memory it needs.
However, even in this mode, programs can still be denied allocation requests.
When overcommit_memory is set to 0, the kernel tries to take a best guess at when it should start denying allocation requests.
When it is set to 1, the kernel always overcommits and does not refuse requests on accounting grounds, though a very large request can still fail for other reasons, such as exceeding the address space available to the process.
You can see if the OOM-Killer is involved by looking at the output of dmesg, and finding messages such as:
[11686.043641] Out of memory: Kill process 2603 (flasherav) score 761 or sacrifice child
[11686.043647] Killed process 2603 (flasherav) total-vm:1498536kB, anon-rss:721784kB, file-rss:4228kB
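To list only the victims, something like the following sketch can be piped after dmesg. It assumes the "Killed process PID (name)" wording shown above. oom_kills is a hypothetical helper name.

```shell
# oom_kills is a hypothetical helper. It reads kernel log lines on stdin
# and prints "pid name" for every process the OOM-Killer reports killing.
oom_kills() {
    sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}
```

For example, dmesg | oom_kills would print one line per killed process and nothing at all if the OOM-Killer was never involved.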
The truth is that regardless of which way you look at it – whether your process choked up due to the system’s memory manager or due to something else – it is still a bug. What happened to all of that data you were just processing in memory? It should have been saved.
Setting overcommit_memory= is the most general way of configuring Linux OOM management, but OOM behavior is also adjustable per process like:
echo [-+][n] >/proc/$pid/oom_adj
Using -17 in the above will exclude a process from out-of-memory management. Probably not a great idea generally, but if you’re bug-hunting, doing so could be worthwhile, especially if you wish to know whether it was OOM or your code. Positively incrementing the number will make the process more likely to be killed in an OOM event, which could enable you to better shore up your code’s resilience in low-memory situations and to ensure you exit gracefully when necessary. Note that newer kernels deprecate oom_adj in favor of /proc/$pid/oom_score_adj, which takes values from -1000 to 1000.
You can check the OOM handler’s current settings per process like:
cat /proc/$pid/oom_adj
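For a fuller picture of one process, a sketch like this prints whichever OOM-related files exist for it. The oom_score and oom_score_adj files are present on reasonably recent kernels. oom_info is a hypothetical helper name, and its optional second argument (an alternate proc root) is only there to make it easy to try out.

```shell
# oom_info is a hypothetical helper. It prints the OOM-related values the
# kernel exposes for one process, skipping files that do not exist.
# Usage: oom_info PID [PROC_ROOT]   (PROC_ROOT defaults to /proc)
oom_info() {
    dir="${2:-/proc}/$1"
    for f in oom_score oom_score_adj oom_adj; do
        if [ -r "$dir/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
    return 0
}
```

Calling oom_info $$ shows the values for the current shell. A higher oom_score means the process is a more likely victim.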
Else you could go suicidal:
sysctl vm.panic_on_oom=1
sysctl kernel.panic=X
That will set the computer to reboot in the event of an out-of-memory condition. Set the X above to the number of seconds you wish the computer to halt after a kernel panic before rebooting. Go wild.
And if, for some reason, you decide you like it, make it persistent:
echo "vm.panic_on_oom=1" >> /etc/sysctl.conf
echo "kernel.panic=X" >> /etc/sysctl.conf
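One caveat with appending via >> is that running it twice leaves duplicate lines in the file. A possible way around that is sketched below. persist_sysctl is a hypothetical helper, not part of any sysctl tooling, and it assumes plain key=value lines.

```shell
# persist_sysctl is a hypothetical helper. It sets KEY=VALUE in a
# sysctl-style config file, replacing any existing assignment of KEY
# instead of appending a duplicate.
# Usage: persist_sysctl KEY VALUE [FILE]   (FILE defaults to /etc/sysctl.conf)
persist_sysctl() {
    key=$1 value=$2 file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line whose name (text before "=", minus blanks) differs.
        awk -F= -v k="$key" '{
            name = $1
            gsub(/[ \t]/, "", name)
            if (name != k) print
        }' "$file" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through cat so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

With that in place, persist_sysctl vm.panic_on_oom 1 can be run as root any number of times and the file ends up with a single entry for the key.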