KGDB调试Linux内核与模块

前言

内核 5.10 版本

  1. openEuler 使用 yum install 下载了源码,并且通过两个 VMware 虚拟机进行调试
  2. ubuntu 直接使用 git 拉取了https://kernel.org/下 5.10.235 分支的代码,物理主机作为开发机,通过 virtualbox 建立虚拟机作为调试机

openEuler2204-SP4

使用两台虚拟机:

  1. 开发机:使用 gdb 连接调试机进行调试
  2. 调试机:编译内核,开启 KGDB ,被调试的机器

配置虚拟机

开发机

调试机

测试: 在调试机执行 cat /dev/ttyS0阻塞,在开发机执行 echo "hello" > /dev/ttyS0,可以看到调试机测输出 hello,表示串口连通。

调试机配置

编译内核

在调试机侧下载内核源码:yum install kernel-source.x86_64,在目录 /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64下:

bash 复制代码
#安装yum install
pkg-config
ncurses-devel
openssl-libs
elfutils-libelf-devel
dwarves
openssl-libs
bash 复制代码
cd /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64
make menuconfig

按照以下截图配置内核:







问题

  1. openeuler 安装时会出现:dracut-install: Failed to find module 'virtio_gpu' dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.MlDs2I/initramfs --kerneldir /lib/modules/5.10.235-yielde-v1-+/ -m virtio_gpu,需要打开 Virtio GPU driver 支持,如下:

配置完成并保存后开始编译内核:

bash 复制代码
make -j8
  • -j8:表示使用 8 个 cpu 核共同编译

编译完成后检查 vmlinux 是否包含 debug 信息:

bash 复制代码
[root@yielde-debugging linux-5.10.0-257.0.0.160.oe2203sp4.x86_64]# readelf -e vmlinux|grep debug
  [36] .debug_aranges    PROGBITS         0000000000000000  02e00000
  [37] .debug_info       PROGBITS         0000000000000000  02e2c310
  [38] .debug_abbrev     PROGBITS         0000000000000000  0d7f2b9b
  [39] .debug_line       PROGBITS         0000000000000000  0dcc7ee8
  [40] .debug_frame      PROGBITS         0000000000000000  0f37bcf8
  [41] .debug_str        PROGBITS         0000000000000000  0f64cf48
  [42] .debug_loc        PROGBITS         0000000000000000  0f9adda9
  [43] .debug_ranges     PROGBITS         0000000000000000  12c16be0

调试机安装内核模块和系统:

bash 复制代码
make modules_install
make install

配置 grub

设置 grub 打开 kgdb,vim /etc/default/grubGRUB_CMDLINE_LINUX的末尾加入 kgdboc=ttyS0,115200 nokaslr

bash 复制代码
GRUB_CMDLINE_LINUX="rhgb quiet crashkernel=auto rd.lvm.lv=VolGroup/lv_root cgroup_disable=files apparmor=0 crashkernel=512M selinux=0 kgdboc=ttyS0,115200 nokaslr"

更新 grub

bash 复制代码
grub2-mkconfig -o /boot/grub2/grub.cfg

复制代码到 开发机

bash 复制代码
rsync -avh /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64 [email protected]:/usr/src

调试机 reboot,选择我们编译好的内核启动

kgdb Debugger

发起 kgdb 中断,调试机的屏幕卡住,后面通过开发机的 gdb 通过串口连接后接管:

bash 复制代码
echo g > /proc/sysrq-trigger

调试

内核

在开发机侧:

bash 复制代码
cd /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64

连接调试机,给 vfs_write 函数打断点并执行

bash 复制代码
gdb vmlinux
(gdb) target remote /dev/ttyS0
(gdb) bt
#0  kgdb_breakpoint () at kernel/debug/debug_core.c:1268
#1  0xffffffff811f5c2e in sysrq_handle_dbg (key=<optimized out>) at kernel/debug/debug_core.c:1008
#2  0xffffffff81b31761 in __handle_sysrq (key=key@entry=103, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:604
#3  0xffffffff8174776a in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1168
#4  0xffffffff81455ce3 in pde_write (ppos=<optimized out>, count=0, buf=<optimized out>, file=0x67, pde=0xffff888448b14540) at fs/proc/inode.c:345
#5  proc_reg_write (file=0x67, buf=0xffff88882fba0710 "", count=0, ppos=0x0 <fixed_percpu_data>) at fs/proc/inode.c:357
#6  0xffffffff813bd5df in vfs_write (file=file@entry=0xffff88844b4b3540, buf=buf@entry=0x564f86210f00 <error: Cannot access memory at address 0x564f86210f00>, count=count@entry=2, pos=pos@entry=0xffffc900039fbef0) at fs/read_write.c:600
#7  0xffffffff813bda67 in ksys_write (fd=<optimized out>, buf=0x564f86210f00 <error: Cannot access memory at address 0x564f86210f00>, count=2) at fs/read_write.c:655
#8  0xffffffff813bdb0a in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:667
#9  __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:664
#10 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:664
#11 0xffffffff81b510bd in do_syscall_64 (nr=<optimized out>, regs=0xffffc900039fbf58) at arch/x86/entry/common.c:47
#12 0xffffffff81c000df in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:125
#13 0x00007fa5a8d777a0 in ?? ()
#14 0x0000000000000002 in fixed_percpu_data ()
#15 0x00007fa5a8d775a0 in ?? ()
(gdb) b vfs_write
Breakpoint 1 at 0xffffffff813bd500: file fs/read_write.c, line 583.
(gdb) c
Continuing.
[Switching to Thread 7142]

Thread 409 hit Breakpoint 1, vfs_write (file=file@entry=0xffff888449e45680, buf=buf@entry=0xc0002f7a93 <error: Cannot access memory at address 0xc0002f7a93>, count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at fs/read_write.c:583
583     {
(gdb) l
578             return ret;
579     }
580     EXPORT_SYMBOL(kernel_write);
581
582     ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
583     {
584             ssize_t ret;
585
586             if (!(file->f_mode & FMODE_WRITE))
587                     return -EBADF;

内核模块

以 bcache 举例,想要调试 bcache 需要将 bcache.ko 导入进来,否则无法获取符号表:

  1. 在调试机侧
bash 复制代码
modprobe bcache

获取内存布局:

bash 复制代码
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.text
0xffffffffa0672000
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.bss
0xffffffffa06a6140
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.data
0xffffffffa06a05a0

执行 echo g > /proc/sysrq-trigger再将控制权交给开发机的 gdb

  1. 在开发机侧加载 bcache 的符号表
bash 复制代码
(gdb) add-symbol-file /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko -s .text 0xffffffffa0672000 -s .bss 0xffffffffa06a6140 -s .data 0xffffffffa06a05a0
add symbol table from file "/usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko" at
        .text_addr = 0xffffffffa0672000
        .bss_addr = 0xffffffffa06a6140
        .data_addr = 0xffffffffa06a05a0
(y or n) y
Reading symbols from /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko...

给 bcache 的函数打断点,之后继续运行

bash 复制代码
(gdb) b bcache_write_super
Breakpoint 2 at 0xffffffffa0689390: file drivers/md/bcache/super.c, line 375.
(gdb) c
Continuing.

在调试机创建 bcache,执行

bash 复制代码
make-bcache -B /dev/sdc -C /dev/sdb --writeback

触发我们的断点如下:

bash 复制代码
[New Thread 10207]
[New Thread 10198]
[New Thread 10201]
[New Thread 10204]
[New Thread 10208]
[New Thread 10209]
[New Thread 10210]
[New Thread 10211]
[New Thread 10212]
[Switching to Thread 10207]

Thread 444 hit Breakpoint 2, bcache_write_super (c=c@entry=0xffff888442d40000) at drivers/md/bcache/super.c:375
375     {

查看创建 bcache 的调用栈:

bash 复制代码
(gdb) bt
#0  bcache_write_super (c=c@entry=0xffff888442d40000) at drivers/md/bcache/super.c:375
#1  0xffffffffa068af4a in run_cache_set (c=0xffff888442d40000) at drivers/md/bcache/super.c:2137
#2  0xffffffffa068b927 in register_cache_set (ca=ca@entry=0xffff888105552000) at drivers/md/bcache/super.c:2204
#3  0xffffffffa068ba47 in register_cache (sb=<optimized out>, sb_disk=<optimized out>, bdev=0xffff888441756c80, ca=0xffff888105552000) at drivers/md/bcache/super.c:2401
#4  0xffffffffa068bc72 in register_bcache (k=<optimized out>, attr=0xffffffffa06a07a0 <ksysfs_register>, buffer=<optimized out>, size=9) at drivers/md/bcache/super.c:2656
#5  0xffffffff81609d4f in kobj_attr_store (kobj=0xffff888442d40000, attr=0xffff88844566a080, buf=0xffff888105552000 "", count=0) at lib/kobject.c:864
#6  0xffffffff8146ec3b in sysfs_kf_write (of=<optimized out>, buf=0xffff888105552000 "", count=0, pos=<optimized out>) at fs/sysfs/file.c:139
#7  0xffffffff8146e27c in kernfs_fop_write_iter (iocb=0xffffc900060f7e60, iter=<optimized out>) at fs/kernfs/file.c:296
#8  0xffffffff813baa99 in call_write_iter (iter=0xffff88844566a080, kio=0xffff888442d40000, file=0xffff888440f672c0) at ./include/linux/fs.h:2064
#9  new_sync_write (filp=filp@entry=0xffff888440f672c0, buf=buf@entry=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, len=len@entry=9, ppos=ppos@entry=0xffffc900060f7ef0) at fs/read_write.c:515
#10 0xffffffff813bd6c0 in vfs_write (file=file@entry=0xffff888440f672c0, buf=buf@entry=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, count=count@entry=9, pos=pos@entry=0xffffc900060f7ef0) at fs/read_write.c:602
#11 0xffffffff813bda67 in ksys_write (fd=<optimized out>, buf=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, count=9) at fs/read_write.c:655
#12 0xffffffff813bdb0a in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:667
#13 __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:664
#14 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:664
#15 0xffffffff81b510bd in do_syscall_64 (nr=<optimized out>, regs=0xffffc900060f7f58) at arch/x86/entry/common.c:47
#16 0xffffffff81c000df in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:125
#17 0x00007f1ea5a5a7a0 in ?? ()
#18 0x0000000000000009 in fixed_percpu_data ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

单步执行如下:

Ubuntu 22.04.5 LTS

  1. 开发机:使用 gdb 连接调试机进行调试,安装 ubuntu 系统的物理机
  2. 调试机:编译内核,开启 KGDB ,被调试的机器,在此物理机上通过 virtualbox 安装的虚拟机

配置虚拟机

配置串口并设置主机 pip 通信路径为 /tmp/debuglinux

物理机编译内核

这次通过开发机来编译内核拷贝到调试机上面去

内核源码地址在 /root/workspace/linux-learn,安装 deb 包如下:

bash 复制代码
# apt install
gcc
make
perl
flex
bison
pkg-config
libncurses-dev
libelf-devel
build-essential

开始编译

bash 复制代码
cd /root/workspace/linux-learn
mkdir build
make mrproper
make O=build defconfig
make O=build menuconfig
  • 配置项与 openEuler 相似,主要是开启 KGDB,关闭内核的的随机地址空间布局(KASLR),开启 debuginfo
bash 复制代码
make O=build -j8

将整个内核目录复制到 virtulbox 调试机的相同目录下:

bash 复制代码
cd /root/workspace
rsync -avzW linux-learn [email protected]:/root/workspace

配置调试机

安装内核

bash 复制代码
cd /root/workspace/linux-learn
make O=build modules_install
make O=build install

更新 grub

当前版本的 ubuntu 使用的 grub,上面的 openEuler 使用的 grub2

bash 复制代码
# 添加
vim /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="kgdboc=ttyS0,115200 nokaslr"

grub-update

重启虚拟机,进入编译好的系统如下图

kgdb Debugger

  1. 调试机执行echo g > /proc/sysrq-trigger将控制交给 gdb
  2. 开发机即主机执行:
bash 复制代码
cd /root/workspace/linux-learn/build
target remote /tmp/debuglinux

尝试断点:

bash 复制代码
(gdb) b vfs_write
Breakpoint 1 at 0xffffffff811c34cf: file ../fs/read_write.c, line 586.
(gdb) c
Continuing.
[Switching to Thread 377]

Thread 78 hit Breakpoint 1, vfs_write (file=file@entry=0xffff888101ddf800, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at ../fs/read_write.c:586
586	{

(gdb) bt
#0  vfs_write (file=file@entry=0xffff888101ddf800, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, 
    count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at ../fs/read_write.c:586
#1  0xffffffff811c3786 in ksys_write (fd=<optimized out>, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, 
    count=1) at ../fs/read_write.c:658
#2  0xffffffff811c37eb in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at ../fs/read_write.c:670
#3  __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at ../fs/read_write.c:667
#4  __x64_sys_write (regs=<optimized out>) at ../fs/read_write.c:667
#5  0xffffffff81a1a921 in do_syscall_64 (nr=<optimized out>, regs=0xffffc90000407f58) at ../arch/x86/entry/common.c:46
#6  0xffffffff81c0011f in entry_SYSCALL_64 () at ../arch/x86/entry/entry_64.S:117
#7  0x00005610590964e5 in ?? ()
#8  0x0000561091a5ac50 in ?? ()
#9  0x00007f6214955a60 in ?? ()

总结

本文主要汇总了下近期调试内核与内核模块的配置方式。

  1. 使用两台虚拟机通信,一台启动 gdb,另一台启动 kgdb
  2. 物理机本身为 linux 系统启动 gdb,使用虚拟机启动 kgdb
  3. 建议每次修改代码重新编译内核后同步到两个系统的相同的目录下,可以省去很多麻烦
相关推荐
xq51486312 分钟前
Linux系统下安装mongodb
linux·mongodb
柒七爱吃麻辣烫12 分钟前
在Linux中安装JDK并且搭建Java环境
java·linux·开发语言
孤寂大仙v1 小时前
【Linux笔记】——进程信号的产生
linux·服务器·笔记
深海蜗牛1 小时前
Jenkins linux安装
linux·jenkins
愚戏师1 小时前
Linux复习笔记(三) 网络服务配置(web)
linux·运维·笔记
JANYI20182 小时前
嵌入式MCU和Linux开发哪个好?
linux·单片机·嵌入式硬件
熊大如如2 小时前
Java NIO 文件处理接口
java·linux·nio
晚秋大魔王2 小时前
OpenHarmony 开源鸿蒙南向开发——linux下使用make交叉编译第三方库——nettle库
linux·开源·harmonyos
农民小飞侠2 小时前
ubuntu 24.04 error: cannot uninstall blinker 1.7.0, record file not found. hint
linux·运维·ubuntu
某不知名網友2 小时前
Linux 软硬连接详解
linux·运维·服务器