Simulating an OOM condition with a shell script (no extra tools required)
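#!/bin/bash
# Keep spawning background pipelines until memory runs out.
while true; do
    # head passes 100 MiB of zero bytes from /dev/zero to tail. The stream has no
    # newlines, so tail must buffer all of it in memory: tail, not cat, is the
    # process that grows. No temporary file is created.
    cat /dev/zero | head -c 100M | tail &
done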
Run the script:
chmod +x memory_oom.sh && ./memory_oom.sh
To stop it, interrupt the script with Ctrl+C, then kill the remaining tail processes:
killall tail
Check the OS log:
Dec 15 15:18:49 test kernel: [ 1308.010846] tail invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
Dec 15 15:18:51 test kernel: [ 1308.010847] tail cpuset=/ mems_allowed=0
Dec 15 15:18:51 test kernel: [ 1308.010850] CPU: 1 PID: 12786 Comm: tail Kdump: loaded Not tainted 4.19.90-23.8.v2101.ky10.x86_64 #1
Dec 15 15:18:51 test kernel: [ 1308.010851] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Dec 15 15:18:51 test kernel: [ 1308.010851] Call Trace:
Dec 15 15:18:51 test kernel: [ 1308.010856] dump_stack+0x66/0x8b
Dec 15 15:18:51 test kernel: [ 1308.010859] dump_header+0x6e/0x299
Dec 15 15:18:51 test kernel: [ 1308.010860] oom_kill_process+0x259/0x280
Dec 15 15:18:51 test kernel: [ 1308.010861] ? oom_badness+0xe1/0x130
Dec 15 15:18:51 test kernel: [ 1308.010862] out_of_memory+0x110/0x4f0
Dec 15 15:18:51 test kernel: [ 1308.010864] __alloc_pages_slowpath+0x9c4/0xd10
Dec 15 15:18:51 test kernel: [ 1308.010866] __alloc_pages_nodemask+0x245/0x280
Dec 15 15:18:51 test kernel: [ 1308.010868] alloc_pages_vma+0x7c/0x1f0
Dec 15 15:18:51 test kernel: [ 1308.010870] do_anonymous_page+0x10c/0x400
Dec 15 15:18:51 test kernel: [ 1308.010871] __handle_mm_fault+0x672/0x6b0
Dec 15 15:18:51 test kernel: [ 1308.010873] handle_mm_fault+0xdc/0x230
Dec 15 15:18:51 test kernel: [ 1308.010875] __do_page_fault+0x2b5/0x4e0
Dec 15 15:18:51 test kernel: [ 1308.010877] do_page_fault+0x31/0x130
Dec 15 15:18:51 test kernel: [ 1308.010879] page_fault+0x1e/0x30
......
Dec 15 15:18:51 test kernel: [ 1308.010945] Tasks state (memory values in pages):
Dec 15 15:18:51 test kernel: [ 1308.010945] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Dec 15 15:18:51 test kernel: [ 1308.010950] [ 606] 0 606 7653 1054 90112 0 -250 systemd-journal
Dec 15 15:18:51 test kernel: [ 1308.010951] [ 629] 0 629 38407 74 61440 0 0 lvmetad
Dec 15 15:18:51 test kernel: [ 1308.010953] [ 636] 0 636 11499 646 86016 0 -1000 systemd-udevd
Dec 15 15:18:51 test kernel: [ 1308.010954] [ 638] 192 638 6796 243 86016 0 0 systemd-network
Dec 15 15:18:51 test kernel: [ 1308.010956] [ 768] 32 768 2792 170 65536 0 0 rpcbind
Dec 15 15:18:51 test kernel: [ 1308.010958] [ 769] 975 769 23033 239 86016 0 0 systemd-timesyn
Dec 15 15:18:51 test kernel: [ 1308.010959] [ 774] 0 774 740 28 49152 0 0 mdadm
Dec 15 15:18:51 test kernel: [ 1308.010960] [ 775] 0 775 27063 612 81920 0 -1000 auditd
Dec 15 15:18:51 test kernel: [ 1308.010961] [ 781] 0 781 1607 65 61440 0 -1000 sedispatch
Dec 15 15:18:51 test kernel: [ 1308.010963] [ 807] 977 807 3542 258 65536 0 -900 dbus-daemon
......
Dec 15 15:18:51 test kernel: [ 1308.011398] [ 13054] 2001 13054 53140 16 61440 0 0 cat
Dec 15 15:18:51 test kernel: [ 1308.011399] [ 13055] 2001 13055 53188 16 65536 0 0 head
Dec 15 15:18:51 test kernel: [ 1308.011401] [ 13056] 2001 13056 54311 1138 69632 0 0 tail
Dec 15 15:18:51 test kernel: [ 1308.011402] [ 13057] 2001 13057 53140 16 57344 0 0 cat
Dec 15 15:18:51 test kernel: [ 1308.011403] [ 13058] 2001 13058 53188 16 53248 0 0 head
Dec 15 15:18:51 test kernel: [ 1308.011404] [ 13059] 2001 13059 54040 848 69632 0 0 tail
Dec 15 15:18:51 test kernel: [ 1308.011405] [ 13060] 2001 13060 53140 16 57344 0 0 cat
Dec 15 15:18:51 test kernel: [ 1308.011406] [ 13061] 2001 13061 53188 16 61440 0 0 head
Dec 15 15:18:51 test kernel: [ 1308.011407] [ 13062] 2001 13062 53600 479 61440 0 0 tail
Dec 15 15:18:51 test kernel: [ 1308.011409] [ 13063] 2001 13063 53140 15 65536 0 0 cat
Dec 15 15:18:51 test kernel: [ 1308.011410] [ 13064] 2001 13064 571 10 36864 0 0 head
Dec 15 15:18:51 test kernel: [ 1308.011411] [ 13065] 2001 13065 53113 16 65536 0 0 tail
Dec 15 15:18:51 test kernel: [ 1308.011412] [ 13066] 2001 13066 92 1 36864 0 0 cat
Dec 15 15:18:51 test kernel: [ 1308.011413] [ 13067] 2001 13067 94 1 32768 0 0 head
Dec 15 15:18:51 test kernel: [ 1308.011414] [ 13068] 2001 13068 53471 75 53248 0 0 memory_oom.sh
Dec 15 15:18:51 test kernel: [ 1308.011416] [ 13069] 2001 13069 53471 74 49152 0 0 memory_oom.sh
The log above shows that tail triggered the OOM killer: "tail invoked oom-killer".
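The same facts can be pulled out of a saved copy of the log. The sketch below is only an illustration: the function names `oom_invokers` and `top_rss_tasks` are hypothetical, the log is assumed to be in a plain text file in the format shown above, and the page size is assumed to be 4 kB (the usual value on x86_64), since the task dump reports memory in pages.

```shell
#!/bin/bash
# Hypothetical helpers for a saved kernel log in the format shown above.

# Print the command name that appears before "invoked oom-killer".
oom_invokers() {
    sed -n 's/.*\] \([^ ]*\) invoked oom-killer.*/\1/p' "$1"
}

# Print "rss_kB tgid name" for the N tasks with the largest rss.
# Arguments: log file, N (default 5), page size in kB (assumed 4).
top_rss_tasks() {
    local log=$1 n=${2:-5} page_kb=${3:-4}
    awk -v page_kb="$page_kb" '
        # A task row ends with:
        #   uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
        # Fields are counted from the end because the bracket spacing varies.
        NF >= 9 && $(NF-8) ~ /\]$/ {
            ok = 1
            for (i = NF - 7; i <= NF - 2; i++) if ($i !~ /^[0-9]+$/) ok = 0
            if ($(NF-1) !~ /^-?[0-9]+$/) ok = 0
            if (ok) printf "%d %s %s\n", $(NF-4) * page_kb, $(NF-6), $NF
        }
    ' "$log" | sort -k1,1nr | head -n "$n"
}
```

For example, `top_rss_tasks /var/log/messages 10` would list the ten largest tasks if the system writes kernel messages to that file. The path is an assumption and differs between distributions.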
Dec 15 15:18:51 test kernel: [ 1308.011528] Killed process 1402 (dmserver) total-vm:4938804kB, anon-rss:633824kB, file-rss:0kB, shmem-rss:0kB
Dec 15 15:18:51 test kernel: [ 1308.024444] oom_reaper: reaped process 1402 (dmserver), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
The OOM killer killed dmserver, a process with a large memory footprint, to free memory.
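On a live system, the kernel exposes each process's current badness score in /proc/<pid>/oom_score and its adjustment in /proc/<pid>/oom_score_adj (the same oom_score_adj value that appears in the task dump). As a rough sketch, the following lists the processes most likely to be chosen as the next victim. The function name `top_oom_candidates` is hypothetical, and the output depends on what is running at that moment.

```shell
#!/bin/bash
# Hypothetical helper: print "oom_score oom_score_adj pid comm" for the
# N processes with the highest oom_score (default N = 5).
top_oom_candidates() {
    local n=${1:-5} dir score adj comm
    for dir in /proc/[0-9]*; do
        # A process may exit while we scan; skip it if a file is gone.
        { read -r score < "$dir/oom_score" &&
          read -r adj   < "$dir/oom_score_adj" &&
          read -r comm  < "$dir/comm"; } 2>/dev/null || continue
        printf '%s %s %s %s\n' "$score" "$adj" "${dir#/proc/}" "$comm"
    done | sort -k1,1nr | head -n "$n"
}
```

Running `top_oom_candidates 10` while the test script is active would be one way to see whether a large process such as dmserver ranks above the tail workers.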
Dec 15 15:18:53 test kernel: [ 1309.328984] Out of memory: Kill process 12786 (tail) score 20 or sacrifice child
Dec 15 15:18:53 test kernel: [ 1309.328989] Killed process 12786 (tail) total-vm:272548kB, anon-rss:59868kB, file-rss:0kB, shmem-rss:0kB
Dec 15 15:18:53 test kernel: [ 1309.360638] oom_reaper: reaped process 12786 (tail), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
The OOM killer then went on to kill tail processes that were using a lot of memory.
Dec 15 15:18:53 test kernel: [ 1309.852805] Out of memory: Kill process 12819 (tail) score 16 or sacrifice child
Dec 15 15:18:53 test kernel: [ 1309.852809] Killed process 12819 (tail) total-vm:262244kB, anon-rss:49476kB, file-rss:0kB, shmem-rss:0kB
The OOM killer keeps repeating this process, killing memory-heavy processes again and again to free memory.
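To see the whole sequence of kills at a glance, the "Killed process" lines can be reduced to one short row per victim. This is a minimal sketch that assumes the log format shown above. The function name `oom_kills` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print "pid name anon_rss_kB" for every
# "Killed process" line in the given log file, in log order.
oom_kills() {
    sed -n 's/.*Killed process \([0-9]*\) (\([^)]*\)).*anon-rss:\([0-9]*\)kB.*/\1 \2 \3/p' "$1"
}
```

Only lines containing "Killed process" match, so the "Out of memory: Kill process" and "oom_reaper" lines for the same victim are not counted twice.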
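Conclusion:
In this case, the tail processes were created on purpose to exhaust memory, while dmserver was the victim: it was killed only because it held a large amount of memory.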