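Contents
- [1. Preface](#1. Preface)
- [2. The Scene](#2. The Scene)
- [3. Analysis](#3. Analysis)
1. Preface
Given the limits of the author's knowledge, this article may contain errors. The author accepts no responsibility for any loss they cause the reader.
2. The Scene
While analyzing the KASAN implementation in the Linux 4.14.113 kernel, I enabled KASAN and booted the kernel under QEMU. KASAN then reported an OOB (out-of-bounds) access: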
bash
[ 2.875585] ==================================================================
[ 2.876318] BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x324/0x4d0
[ 2.876855] Read of size 8 at addr ffff200009e7db80 by task systemd/1
[ 2.877336] CPU: 0 PID: 1 Comm: systemd Not tainted 4.14.113 #6
[ 2.877427] Hardware name: linux,dummy-virt (DT)
[ 2.877659] Call trace:
[ 2.877741] [<ffff20000808c9e0>] dump_backtrace+0x0/0x480
[ 2.877844] [<ffff20000808ce74>] show_stack+0x14/0x20
[ 2.877928] [<ffff200008e6b85c>] dump_stack+0xa0/0xc4
[ 2.878010] [<ffff2000082a1618>] print_address_description+0x60/0x240
[ 2.878097] [<ffff2000082a0d1c>] kasan_report_error+0x14c/0x240
[ 2.878247] [<ffff2000082a152c>] kasan_report+0x58/0x8c
[ 2.878344] [<ffff20000829f87c>] __asan_load8+0x88/0xac
[ 2.878427] [<ffff2000080f8d34>] __do_proc_doulongvec_minmax+0x324/0x4d0
[ 2.878521] [<ffff2000080f8f34>] proc_doulongvec_minmax+0x54/0x70
[ 2.878609] [<ffff20000837da84>] proc_sys_call_handler+0x110/0x19c
[ 2.878695] [<ffff20000837db20>] proc_sys_write+0x10/0x20
[ 2.878771] [<ffff2000082cbbd0>] __vfs_write+0xbc/0x250
[ 2.878844] [<ffff2000082cbfe0>] vfs_write+0xd0/0x230
[ 2.878916] [<ffff2000082cc3c0>] SyS_write+0x9c/0x100
[ 2.879026] Exception stack(0xffff800035277ec0 to 0xffff800035278000)
[ 2.879223] 7ec0: 0000000000000004 0000aaaaff490ce0 0000000000000014 0000ffff9ee64820
[ 2.879367] 7ee0: 0000000000000020 8080808080808000 0000000000000010 7f7f7f7f7f7f7f7f
[ 2.879496] 7f00: 0000000000000040 0000fffff5ae89f5 000055549f911d6d 000000000000000a
[ 2.879625] 7f20: 3538363330323733 0a37303835373734 0000000000000001 0000000000000000
[ 2.879771] 7f40: 0000ffff9ee5b4d0 0000ffff9e927de0 0000000000000000 0000000000000004
[ 2.879914] 7f60: 0000000000000000 0000000000000004 0000ffff9ee64fe0 0000fffff5ae8dd0
[ 2.880489] 7f80: 0000aaaacc76ae50 0000000000000000 0000000000000000 0000fffff5ae8e00
[ 2.880963] 7fa0: 0000000000000000 0000fffff5ae8b00 0000ffff9ece83f0 0000fffff5ae8b00
[ 2.881701] 7fc0: 0000ffff9e927e10 0000000000000000 0000000000000004 0000000000000040
[ 2.882490] 7fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.882663] [<ffff200008083a00>] el0_svc_naked+0x34/0x38
[ 2.882846] The buggy address belongs to the variable:
[ 2.882933] zero+0x0/0x40
[ 2.883042] Memory state around the buggy address:
[ 2.883404] ffff200009e7da80: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
[ 2.883546] ffff200009e7db00: 00 00 00 00 fa fa fa fa 04 fa fa fa fa fa fa fa
[ 2.883641] >ffff200009e7db80: 04 fa fa fa fa fa fa fa 00 00 00 00 fa fa fa fa
[ 2.883765] ^
[ 2.883823] ffff200009e7dc00: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
[ 2.883914] ffff200009e7dc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2.884014] ==================================================================
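3. Analysis
The zero+0x0/0x40 line in the log shows that the fault is an out-of-bounds read of the global variable zero (the report header says "Read of size 8"). The offending access is inside __do_proc_doulongvec_minmax(), and addr2line pins it to the following source line: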
bash
$ aarch64-linux-gnu-addr2line -e output/vmlinux -a 0xffff2000080f8d34
0xffff2000080f8d34
/home/lxj/Work/qemu-lab/arm64-linux/linux-4.14.113/output/../kernel/sysctl.c:2734 (discriminator 1)
c
static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int write,
                                       void __user *buffer,
                                       size_t *lenp, loff_t *ppos,
                                       unsigned long convmul,
                                       unsigned long convdiv)
{
    unsigned long *i, *min, *max;
    int vleft, first = 1, err = 0;
    size_t left;
    char *kbuf = NULL, *p;
    ...
    ...
    min = (unsigned long *) table->extra1;
    max = (unsigned long *) table->extra2;
    ...
    for (; left && vleft--; i++, first = 0) {
        unsigned long val;
        if (write) {
            ...
            if ((min && val < *min) || (max && val > *max)) // the line that triggers the KASAN report
                continue;
            ...
        } else {
            ...
        }
    }
    ...
}
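So the faulting line is if ((min && val < *min) || (max && val > *max)), and the access is clearly an out-of-bounds read. But which pointer is at fault, min or max? Going by the KASAN report, the question is which of the two points at the global variable zero. The call trace also passes through proc_sys_write() and proc_sys_call_handler(), so the access was triggered by a write to some sysctl entry. Searching the kernel source turns up several sysctl-related variables named zero, so they have to be narrowed down. To do that, I added a log line to __do_proc_doulongvec_minmax():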
c
static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int write,
                                       void __user *buffer,
                                       size_t *lenp, loff_t *ppos,
                                       unsigned long convmul,
                                       unsigned long convdiv)
{
    ...
    for (; left && vleft--; i++, first = 0) {
        unsigned long val;
        if (write) {
            ...
            pr_info("%s(): table->procname=%s\n", __func__, table->procname);
            if ((min && val < *min) || (max && val > *max)) // the line that triggers the KASAN report
                continue;
            ...
        } else {
            ...
        }
    }
    ...
}
Rebuild and run again:
bash
[ 2.688833] __do_proc_doulongvec_minmax(): table->procname=file-max
[ 2.689084] ==================================================================
[ 2.689251] BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x37c/0x510
[ 2.689369] Read of size 8 at addr ffff200009e7db80 by task systemd/1
[ 2.689455]
[ 2.689611] CPU: 1 PID: 1 Comm: systemd Not tainted 4.14.113 #7
[ 2.689696] Hardware name: linux,dummy-virt (DT)
[ 2.689938] Call trace:
[ 2.690084] [<ffff20000808c9e0>] dump_backtrace+0x0/0x480
[ 2.690250] [<ffff20000808ce74>] show_stack+0x14/0x20
[ 2.690580] [<ffff200008e6a85c>] dump_stack+0xa0/0xc4
[ 2.690665] [<ffff2000082a1658>] print_address_description+0x60/0x240
[ 2.690756] [<ffff2000082a0d5c>] kasan_report_error+0x14c/0x240
[ 2.690838] [<ffff2000082a156c>] kasan_report+0x58/0x8c
[ 2.690928] [<ffff20000829f8bc>] __asan_load8+0x88/0xac
[ 2.691009] [<ffff2000080f8d8c>] __do_proc_doulongvec_minmax+0x37c/0x510
[ 2.691100] [<ffff2000080f8f74>] proc_doulongvec_minmax+0x54/0x70
[ 2.691183] [<ffff20000837cac4>] proc_sys_call_handler+0x110/0x19c
[ 2.691267] [<ffff20000837cb60>] proc_sys_write+0x10/0x20
[ 2.691342] [<ffff2000082cbc10>] __vfs_write+0xbc/0x250
[ 2.691414] [<ffff2000082cc020>] vfs_write+0xd0/0x230
[ 2.691484] [<ffff2000082cc400>] SyS_write+0x9c/0x100
[ 2.691589] Exception stack(0xffff800035277ec0 to 0xffff800035278000)
[ 2.691759] 7ec0: 0000000000000004 0000aaaaf35ffce0 0000000000000014 0000ffffac37f820
[ 2.691872] 7ee0: 0000000000000020 8080808080808000 0000000000000010 7f7f7f7f7f7f7f7f
[ 2.691973] 7f00: 0000000000000040 0000ffffc53ac965 00005554b8cb2d6d 000000000000000a
[ 2.692072] 7f20: 3538363330323733 0a37303835373734 0000000000000001 0000000000000000
[ 2.692170] 7f40: 0000ffffac36b4d0 0000ffffabe37de0 0000000000000000 0000000000000004
[ 2.692267] 7f60: 0000000000000000 0000000000000004 0000ffffac37ffe0 0000ffffc53acd40
[ 2.692364] 7f80: 0000aaaabc264e50 0000000000000000 0000000000000000 0000ffffc53acd70
[ 2.692461] 7fa0: 0000000000000000 0000ffffc53aca70 0000ffffac1f83f0 0000ffffc53aca70
[ 2.692556] 7fc0: 0000ffffabe37e10 0000000000000000 0000000000000004 0000000000000040
[ 2.692650] 7fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.692769] [<ffff200008083a00>] el0_svc_naked+0x34/0x38
[ 2.692881]
[ 2.692936] The buggy address belongs to the variable:
[ 2.693018] zero+0x0/0x40
[ 2.693073]
[ 2.693123] Memory state around the buggy address:
[ 2.693434] ffff200009e7da80: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
[ 2.693570] ffff200009e7db00: 00 00 00 00 fa fa fa fa 04 fa fa fa fa fa fa fa
[ 2.693668] >ffff200009e7db80: 04 fa fa fa fa fa fa fa 00 00 00 00 fa fa fa fa
[ 2.693827] ^
[ 2.694635] ffff200009e7dc00: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
[ 2.695053] ffff200009e7dc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2.695517] ==================================================================
So the problem is triggered by a write to /proc/sys/fs/file-max (the path goes through proc_sys_write()). Here is the relevant code:
c
/* kernel/sysctl.c */
static int zero;
...
static struct ctl_table fs_table[] = {
    ...
    {
        .procname = "file-max",
        .data = &files_stat.max_files,
        .maxlen = sizeof(files_stat.max_files),
        .mode = 0644,
        .proc_handler = proc_doulongvec_minmax,
        .extra1 = &zero,
        .extra2 = &long_max,
    },
    ...
};
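proc_doulongvec_minmax() treats the objects that .extra1 and .extra2 point to as unsigned long, but zero, which .extra1 points to, is defined as int. On the arm64 platform I tested, unsigned long is 8 bytes and int is 4 bytes, so dereferencing min reads 4 bytes past the end of zero. That is the out-of-bounds read KASAN reported.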
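The fix is simple: define zero as unsigned long. Note that zero is also referenced by other sysctl entries that go through int handlers; because its value is 0 they still read 0, but giving file-max a dedicated unsigned long variable would keep the types strictly matched.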
c
- static int zero;
+ static unsigned long zero;
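Of course, 32-bit architectures do not hit this problem, because int and unsigned long are both 4 bytes there.
Searching the kernel mailing list, I found that the kernel has already fixed this issue: Re: [PATCH] kernel/sysctl.c: fix out of bounds access in fs.file-max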