前言
内核 5.10 版本
- openEuler 使用 yum install 下载了源码,并且通过两个 VMware 虚拟机进行调试
- ubuntu 直接使用 git 拉取了https://kernel.org/下 5.10.235 分支的代码,物理主机作为开发机,通过 virtualbox 建立虚拟机作为调试机
openEuler2204-SP4
使用两台虚拟机:
- 开发机:使用 gdb 连接调试机进行调试
- 调试机:编译内核,开启 KGDB ,被调试的机器
配置虚拟机
开发机

调试机

测试: 在调试机执行 cat /dev/ttyS0
阻塞,在开发机执行 echo "hello" > /dev/ttyS0
,可以看到调试机测输出 hello
,表示串口连通。
调试机配置
编译内核
在调试机侧下载内核源码:yum install kernel-source.x86_64
,在目录 /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64
下:
bash
#安装yum install
pkg-config
ncurses-devel
openssl-libs
elfutils-libelf-devel
dwarves
openssl-libs
bash
cd /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64
make menuconfig
按照以下截图配置内核:
问题
- openeuler 安装时会出现:dracut-install: Failed to find module 'virtio_gpu' dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.MlDs2I/initramfs --kerneldir /lib/modules/5.10.235-yielde-v1-+/ -m virtio_gpu,需要打开 Virtio GPU driver 支持,如下:
配置完成并保存后开始编译内核:
bash
make -j8
- -j8:表示使用 8 个 cpu 核共同编译
编译完成后检查 vmlinux 是否包含 debug 信息:
bash
[root@yielde-debugging linux-5.10.0-257.0.0.160.oe2203sp4.x86_64]# readelf -e vmlinux|grep debug
[36] .debug_aranges PROGBITS 0000000000000000 02e00000
[37] .debug_info PROGBITS 0000000000000000 02e2c310
[38] .debug_abbrev PROGBITS 0000000000000000 0d7f2b9b
[39] .debug_line PROGBITS 0000000000000000 0dcc7ee8
[40] .debug_frame PROGBITS 0000000000000000 0f37bcf8
[41] .debug_str PROGBITS 0000000000000000 0f64cf48
[42] .debug_loc PROGBITS 0000000000000000 0f9adda9
[43] .debug_ranges PROGBITS 0000000000000000 12c16be0
调试机安装内核模块和系统:
bash
make modules_install
make install
配置 grub
设置 grub 打开 kgdb,vim /etc/default/grub
在 GRUB_CMDLINE_LINUX
的末尾加入 kgdboc=ttyS0,115200 nokaslr
bash
GRUB_CMDLINE_LINUX="rhgb quiet crashkernel=auto rd.lvm.lv=VolGroup/lv_root cgroup_disable=files apparmor=0 crashkernel=512M selinux=0 kgdboc=ttyS0,115200 nokaslr"
更新 grub
bash
grub2-mkconfig -o /boot/grub2/grub.cfg
复制代码到 开发机
bash
rsync -avh /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64 [email protected]:/usr/src
调试机 reboot,选择我们编译好的内核启动

kgdb Debugger
发起 kgdb 中断,调试机的屏幕卡住,后面通过开发机的 gdb 通过串口连接后接管:
bash
echo g > /proc/sysrq-trigger
调试
内核
在开发机侧:
bash
cd /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64
连接调试机,给 vfs_write 函数打断点并执行
bash
gdb vmlinux
(gdb) target remote /dev/ttyS0
(gdb) bt
#0 kgdb_breakpoint () at kernel/debug/debug_core.c:1268
#1 0xffffffff811f5c2e in sysrq_handle_dbg (key=<optimized out>) at kernel/debug/debug_core.c:1008
#2 0xffffffff81b31761 in __handle_sysrq (key=key@entry=103, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:604
#3 0xffffffff8174776a in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1168
#4 0xffffffff81455ce3 in pde_write (ppos=<optimized out>, count=0, buf=<optimized out>, file=0x67, pde=0xffff888448b14540) at fs/proc/inode.c:345
#5 proc_reg_write (file=0x67, buf=0xffff88882fba0710 "", count=0, ppos=0x0 <fixed_percpu_data>) at fs/proc/inode.c:357
#6 0xffffffff813bd5df in vfs_write (file=file@entry=0xffff88844b4b3540, buf=buf@entry=0x564f86210f00 <error: Cannot access memory at address 0x564f86210f00>, count=count@entry=2, pos=pos@entry=0xffffc900039fbef0) at fs/read_write.c:600
#7 0xffffffff813bda67 in ksys_write (fd=<optimized out>, buf=0x564f86210f00 <error: Cannot access memory at address 0x564f86210f00>, count=2) at fs/read_write.c:655
#8 0xffffffff813bdb0a in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:667
#9 __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:664
#10 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:664
#11 0xffffffff81b510bd in do_syscall_64 (nr=<optimized out>, regs=0xffffc900039fbf58) at arch/x86/entry/common.c:47
#12 0xffffffff81c000df in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:125
#13 0x00007fa5a8d777a0 in ?? ()
#14 0x0000000000000002 in fixed_percpu_data ()
#15 0x00007fa5a8d775a0 in ?? ()
(gdb) b vfs_write
Breakpoint 1 at 0xffffffff813bd500: file fs/read_write.c, line 583.
(gdb) c
Continuing.
[Switching to Thread 7142]
Thread 409 hit Breakpoint 1, vfs_write (file=file@entry=0xffff888449e45680, buf=buf@entry=0xc0002f7a93 <error: Cannot access memory at address 0xc0002f7a93>, count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at fs/read_write.c:583
583 {
(gdb) l
578 return ret;
579 }
580 EXPORT_SYMBOL(kernel_write);
581
582 ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
583 {
584 ssize_t ret;
585
586 if (!(file->f_mode & FMODE_WRITE))
587 return -EBADF;
内核模块
以 bcache 举例,想要调试 bcache 需要将 bcache.ko 导入进来,否则无法获取符号表:
- 在调试机侧
bash
modprobe bcache
获取内存布局:
bash
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.text
0xffffffffa0672000
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.bss
0xffffffffa06a6140
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.data
0xffffffffa06a05a0
执行 echo g > /proc/sysrq-trigger
再将控制权交给开发机的 gdb
- 在开发机侧加载 bcache 的符号表
bash
(gdb) add-symbol-file /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko -s .text 0xffffffffa0672000 -s .bss 0xffffffffa06a6140 -s .data 0xffffffffa06a05a0
add symbol table from file "/usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko" at
.text_addr = 0xffffffffa0672000
.bss_addr = 0xffffffffa06a6140
.data_addr = 0xffffffffa06a05a0
(y or n) y
Reading symbols from /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko...
给 bcache 的函数打断点,之后继续运行
bash
(gdb) b bcache_write_super
Breakpoint 2 at 0xffffffffa0689390: file drivers/md/bcache/super.c, line 375.
(gdb) c
Continuing.
在调试机创建 bcache,执行
bash
make-bcache -B /dev/sdc -C /dev/sdb --writeback
触发我们的断点如下:
bash
[New Thread 10207]
[New Thread 10198]
[New Thread 10201]
[New Thread 10204]
[New Thread 10208]
[New Thread 10209]
[New Thread 10210]
[New Thread 10211]
[New Thread 10212]
[Switching to Thread 10207]
Thread 444 hit Breakpoint 2, bcache_write_super (c=c@entry=0xffff888442d40000) at drivers/md/bcache/super.c:375
375 {
查看创建 bcache 的调用栈:
bash
(gdb) bt
#0 bcache_write_super (c=c@entry=0xffff888442d40000) at drivers/md/bcache/super.c:375
#1 0xffffffffa068af4a in run_cache_set (c=0xffff888442d40000) at drivers/md/bcache/super.c:2137
#2 0xffffffffa068b927 in register_cache_set (ca=ca@entry=0xffff888105552000) at drivers/md/bcache/super.c:2204
#3 0xffffffffa068ba47 in register_cache (sb=<optimized out>, sb_disk=<optimized out>, bdev=0xffff888441756c80, ca=0xffff888105552000) at drivers/md/bcache/super.c:2401
#4 0xffffffffa068bc72 in register_bcache (k=<optimized out>, attr=0xffffffffa06a07a0 <ksysfs_register>, buffer=<optimized out>, size=9) at drivers/md/bcache/super.c:2656
#5 0xffffffff81609d4f in kobj_attr_store (kobj=0xffff888442d40000, attr=0xffff88844566a080, buf=0xffff888105552000 "", count=0) at lib/kobject.c:864
#6 0xffffffff8146ec3b in sysfs_kf_write (of=<optimized out>, buf=0xffff888105552000 "", count=0, pos=<optimized out>) at fs/sysfs/file.c:139
#7 0xffffffff8146e27c in kernfs_fop_write_iter (iocb=0xffffc900060f7e60, iter=<optimized out>) at fs/kernfs/file.c:296
#8 0xffffffff813baa99 in call_write_iter (iter=0xffff88844566a080, kio=0xffff888442d40000, file=0xffff888440f672c0) at ./include/linux/fs.h:2064
#9 new_sync_write (filp=filp@entry=0xffff888440f672c0, buf=buf@entry=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, len=len@entry=9, ppos=ppos@entry=0xffffc900060f7ef0) at fs/read_write.c:515
#10 0xffffffff813bd6c0 in vfs_write (file=file@entry=0xffff888440f672c0, buf=buf@entry=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, count=count@entry=9, pos=pos@entry=0xffffc900060f7ef0) at fs/read_write.c:602
#11 0xffffffff813bda67 in ksys_write (fd=<optimized out>, buf=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, count=9) at fs/read_write.c:655
#12 0xffffffff813bdb0a in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:667
#13 __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:664
#14 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:664
#15 0xffffffff81b510bd in do_syscall_64 (nr=<optimized out>, regs=0xffffc900060f7f58) at arch/x86/entry/common.c:47
#16 0xffffffff81c000df in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:125
#17 0x00007f1ea5a5a7a0 in ?? ()
#18 0x0000000000000009 in fixed_percpu_data ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
单步执行如下:
Ubuntu 22.04.5 LTS
- 开发机:使用 gdb 连接调试机进行调试,安装 ubuntu 系统的物理机
- 调试机:编译内核,开启 KGDB ,被调试的机器,在此物理机上通过 virtualbox 安装的虚拟机
配置虚拟机
配置串口并设置主机 pip 通信路径为 /tmp/debuglinux
物理机编译内核
这次通过开发机来编译内核拷贝到调试机上面去
内核源码地址在 /root/workspace/linux-learn
,安装 deb 包如下:
bash
# apt install
gcc
make
perl
flex
bison
pkg-config
libncurses-dev
libelf-devel
build-essential
开始编译
bash
cd /root/workspace/linux-learn
mkdir build
make mrproper
make O=build defconfig
make O=build menuconfig
- 配置项与 openEuler 相似,主要是开启 KGDB,关闭内核的的随机地址空间布局(KASLR),开启 debuginfo
bash
make O=build -j8
将整个内核目录复制到 virtulbox 调试机的相同目录下:
bash
cd /root/workspace
rsync -avzW linux-learn [email protected]:/root/workspace
配置调试机
安装内核
bash
cd /root/workspace/linux-learn
make O=build modules_install
make O=build install
更新 grub
当前版本的 ubuntu 使用的 grub,上面的 openEuler 使用的 grub2
bash
# 添加
vim /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="kgdboc=ttyS0,115200 nokaslr"
grub-update
重启虚拟机,进入编译好的系统如下图


kgdb Debugger
- 调试机执行
echo g > /proc/sysrq-trigger
将控制交给 gdb - 开发机即主机执行:
bash
cd /root/workspace/linux-learn/build
target remote /tmp/debuglinux
尝试断点:
bash
(gdb) b vfs_write
Breakpoint 1 at 0xffffffff811c34cf: file ../fs/read_write.c, line 586.
(gdb) c
Continuing.
[Switching to Thread 377]
Thread 78 hit Breakpoint 1, vfs_write (file=file@entry=0xffff888101ddf800, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at ../fs/read_write.c:586
586 {
(gdb) bt
#0 vfs_write (file=file@entry=0xffff888101ddf800, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>,
count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at ../fs/read_write.c:586
#1 0xffffffff811c3786 in ksys_write (fd=<optimized out>, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>,
count=1) at ../fs/read_write.c:658
#2 0xffffffff811c37eb in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at ../fs/read_write.c:670
#3 __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at ../fs/read_write.c:667
#4 __x64_sys_write (regs=<optimized out>) at ../fs/read_write.c:667
#5 0xffffffff81a1a921 in do_syscall_64 (nr=<optimized out>, regs=0xffffc90000407f58) at ../arch/x86/entry/common.c:46
#6 0xffffffff81c0011f in entry_SYSCALL_64 () at ../arch/x86/entry/entry_64.S:117
#7 0x00005610590964e5 in ?? ()
#8 0x0000561091a5ac50 in ?? ()
#9 0x00007f6214955a60 in ?? ()
总结
本文主要汇总了下近期调试内核与内核模块的配置方式。
- 使用两台虚拟机通信,一台启动 gdb,另一台启动 kgdb
- 物理机本身为 linux 系统启动 gdb,使用虚拟机启动 kgdb
- 建议每次修改代码重新编译内核后同步到两个系统的相同的目录下,可以省去很多麻烦