KGDB调试Linux内核与模块

前言

内核 5.10 版本

  1. openEuler 使用 yum install 下载了源码,并且通过两个 VMware 虚拟机进行调试
  2. ubuntu 直接使用 git 拉取了https://kernel.org/下 5.10.235 分支的代码,物理主机作为开发机,通过 virtualbox 建立虚拟机作为调试机

openEuler2204-SP4

使用两台虚拟机:

  1. 开发机:使用 gdb 连接调试机进行调试
  2. 调试机:编译内核,开启 KGDB ,被调试的机器

配置虚拟机

开发机

调试机

测试: 在调试机执行 cat /dev/ttyS0阻塞,在开发机执行 echo "hello" > /dev/ttyS0,可以看到调试机测输出 hello,表示串口连通。

调试机配置

编译内核

在调试机侧下载内核源码:yum install kernel-source.x86_64,在目录 /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64下:

bash 复制代码
#安装yum install
pkg-config
ncurses-devel
openssl-libs
elfutils-libelf-devel
dwarves
openssl-libs
bash 复制代码
cd /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64
make menuconfig

按照以下截图配置内核:







问题

  1. openeuler 安装时会出现:dracut-install: Failed to find module 'virtio_gpu' dracut: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.MlDs2I/initramfs --kerneldir /lib/modules/5.10.235-yielde-v1-+/ -m virtio_gpu,需要打开 Virtio GPU driver 支持,如下:

配置完成并保存后开始编译内核:

bash 复制代码
make -j8
  • -j8:表示使用 8 个 cpu 核共同编译

编译完成后检查 vmlinux 是否包含 debug 信息:

bash 复制代码
[root@yielde-debugging linux-5.10.0-257.0.0.160.oe2203sp4.x86_64]# readelf -e vmlinux|grep debug
  [36] .debug_aranges    PROGBITS         0000000000000000  02e00000
  [37] .debug_info       PROGBITS         0000000000000000  02e2c310
  [38] .debug_abbrev     PROGBITS         0000000000000000  0d7f2b9b
  [39] .debug_line       PROGBITS         0000000000000000  0dcc7ee8
  [40] .debug_frame      PROGBITS         0000000000000000  0f37bcf8
  [41] .debug_str        PROGBITS         0000000000000000  0f64cf48
  [42] .debug_loc        PROGBITS         0000000000000000  0f9adda9
  [43] .debug_ranges     PROGBITS         0000000000000000  12c16be0

调试机安装内核模块和系统:

bash 复制代码
make modules_install
make install

配置 grub

设置 grub 打开 kgdb,vim /etc/default/grubGRUB_CMDLINE_LINUX的末尾加入 kgdboc=ttyS0,115200 nokaslr

bash 复制代码
GRUB_CMDLINE_LINUX="rhgb quiet crashkernel=auto rd.lvm.lv=VolGroup/lv_root cgroup_disable=files apparmor=0 crashkernel=512M selinux=0 kgdboc=ttyS0,115200 nokaslr"

更新 grub

bash 复制代码
grub2-mkconfig -o /boot/grub2/grub.cfg

复制代码到 开发机

bash 复制代码
rsync -avh /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64 [email protected]:/usr/src

调试机 reboot,选择我们编译好的内核启动

kgdb Debugger

发起 kgdb 中断,调试机的屏幕卡住,后面通过开发机的 gdb 通过串口连接后接管:

bash 复制代码
echo g > /proc/sysrq-trigger

调试

内核

在开发机侧:

bash 复制代码
cd /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64

连接调试机,给 vfs_write 函数打断点并执行

bash 复制代码
gdb vmlinux
(gdb) target remote /dev/ttyS0
(gdb) bt
#0  kgdb_breakpoint () at kernel/debug/debug_core.c:1268
#1  0xffffffff811f5c2e in sysrq_handle_dbg (key=<optimized out>) at kernel/debug/debug_core.c:1008
#2  0xffffffff81b31761 in __handle_sysrq (key=key@entry=103, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:604
#3  0xffffffff8174776a in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1168
#4  0xffffffff81455ce3 in pde_write (ppos=<optimized out>, count=0, buf=<optimized out>, file=0x67, pde=0xffff888448b14540) at fs/proc/inode.c:345
#5  proc_reg_write (file=0x67, buf=0xffff88882fba0710 "", count=0, ppos=0x0 <fixed_percpu_data>) at fs/proc/inode.c:357
#6  0xffffffff813bd5df in vfs_write (file=file@entry=0xffff88844b4b3540, buf=buf@entry=0x564f86210f00 <error: Cannot access memory at address 0x564f86210f00>, count=count@entry=2, pos=pos@entry=0xffffc900039fbef0) at fs/read_write.c:600
#7  0xffffffff813bda67 in ksys_write (fd=<optimized out>, buf=0x564f86210f00 <error: Cannot access memory at address 0x564f86210f00>, count=2) at fs/read_write.c:655
#8  0xffffffff813bdb0a in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:667
#9  __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:664
#10 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:664
#11 0xffffffff81b510bd in do_syscall_64 (nr=<optimized out>, regs=0xffffc900039fbf58) at arch/x86/entry/common.c:47
#12 0xffffffff81c000df in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:125
#13 0x00007fa5a8d777a0 in ?? ()
#14 0x0000000000000002 in fixed_percpu_data ()
#15 0x00007fa5a8d775a0 in ?? ()
(gdb) b vfs_write
Breakpoint 1 at 0xffffffff813bd500: file fs/read_write.c, line 583.
(gdb) c
Continuing.
[Switching to Thread 7142]

Thread 409 hit Breakpoint 1, vfs_write (file=file@entry=0xffff888449e45680, buf=buf@entry=0xc0002f7a93 <error: Cannot access memory at address 0xc0002f7a93>, count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at fs/read_write.c:583
583     {
(gdb) l
578             return ret;
579     }
580     EXPORT_SYMBOL(kernel_write);
581
582     ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
583     {
584             ssize_t ret;
585
586             if (!(file->f_mode & FMODE_WRITE))
587                     return -EBADF;

内核模块

以 bcache 举例,想要调试 bcache 需要将 bcache.ko 导入进来,否则无法获取符号表:

  1. 在调试机侧
bash 复制代码
modprobe bcache

获取内存布局:

bash 复制代码
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.text
0xffffffffa0672000
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.bss
0xffffffffa06a6140
[root@yielde-debugging ~]# cat /sys/module/bcache/sections/.data
0xffffffffa06a05a0

执行 echo g > /proc/sysrq-trigger再将控制权交给开发机的 gdb

  1. 在开发机侧加载 bcache 的符号表
bash 复制代码
(gdb) add-symbol-file /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko -s .text 0xffffffffa0672000 -s .bss 0xffffffffa06a6140 -s .data 0xffffffffa06a05a0
add symbol table from file "/usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko" at
        .text_addr = 0xffffffffa0672000
        .bss_addr = 0xffffffffa06a6140
        .data_addr = 0xffffffffa06a05a0
(y or n) y
Reading symbols from /usr/src/linux-5.10.0-257.0.0.160.oe2203sp4.x86_64/drivers/md/bcache/bcache.ko...

给 bcache 的函数打断点,之后继续运行

bash 复制代码
(gdb) b bcache_write_super
Breakpoint 2 at 0xffffffffa0689390: file drivers/md/bcache/super.c, line 375.
(gdb) c
Continuing.

在调试机创建 bcache,执行

bash 复制代码
make-bcache -B /dev/sdc -C /dev/sdb --writeback

触发我们的断点如下:

bash 复制代码
[New Thread 10207]
[New Thread 10198]
[New Thread 10201]
[New Thread 10204]
[New Thread 10208]
[New Thread 10209]
[New Thread 10210]
[New Thread 10211]
[New Thread 10212]
[Switching to Thread 10207]

Thread 444 hit Breakpoint 2, bcache_write_super (c=c@entry=0xffff888442d40000) at drivers/md/bcache/super.c:375
375     {

查看创建 bcache 的调用栈:

bash 复制代码
(gdb) bt
#0  bcache_write_super (c=c@entry=0xffff888442d40000) at drivers/md/bcache/super.c:375
#1  0xffffffffa068af4a in run_cache_set (c=0xffff888442d40000) at drivers/md/bcache/super.c:2137
#2  0xffffffffa068b927 in register_cache_set (ca=ca@entry=0xffff888105552000) at drivers/md/bcache/super.c:2204
#3  0xffffffffa068ba47 in register_cache (sb=<optimized out>, sb_disk=<optimized out>, bdev=0xffff888441756c80, ca=0xffff888105552000) at drivers/md/bcache/super.c:2401
#4  0xffffffffa068bc72 in register_bcache (k=<optimized out>, attr=0xffffffffa06a07a0 <ksysfs_register>, buffer=<optimized out>, size=9) at drivers/md/bcache/super.c:2656
#5  0xffffffff81609d4f in kobj_attr_store (kobj=0xffff888442d40000, attr=0xffff88844566a080, buf=0xffff888105552000 "", count=0) at lib/kobject.c:864
#6  0xffffffff8146ec3b in sysfs_kf_write (of=<optimized out>, buf=0xffff888105552000 "", count=0, pos=<optimized out>) at fs/sysfs/file.c:139
#7  0xffffffff8146e27c in kernfs_fop_write_iter (iocb=0xffffc900060f7e60, iter=<optimized out>) at fs/kernfs/file.c:296
#8  0xffffffff813baa99 in call_write_iter (iter=0xffff88844566a080, kio=0xffff888442d40000, file=0xffff888440f672c0) at ./include/linux/fs.h:2064
#9  new_sync_write (filp=filp@entry=0xffff888440f672c0, buf=buf@entry=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, len=len@entry=9, ppos=ppos@entry=0xffffc900060f7ef0) at fs/read_write.c:515
#10 0xffffffff813bd6c0 in vfs_write (file=file@entry=0xffff888440f672c0, buf=buf@entry=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, count=count@entry=9, pos=pos@entry=0xffffc900060f7ef0) at fs/read_write.c:602
#11 0xffffffff813bda67 in ksys_write (fd=<optimized out>, buf=0x5627601532a0 <error: Cannot access memory at address 0x5627601532a0>, count=9) at fs/read_write.c:655
#12 0xffffffff813bdb0a in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:667
#13 __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at fs/read_write.c:664
#14 __x64_sys_write (regs=<optimized out>) at fs/read_write.c:664
#15 0xffffffff81b510bd in do_syscall_64 (nr=<optimized out>, regs=0xffffc900060f7f58) at arch/x86/entry/common.c:47
#16 0xffffffff81c000df in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:125
#17 0x00007f1ea5a5a7a0 in ?? ()
#18 0x0000000000000009 in fixed_percpu_data ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

单步执行如下:

Ubuntu 22.04.5 LTS

  1. 开发机:使用 gdb 连接调试机进行调试,安装 ubuntu 系统的物理机
  2. 调试机:编译内核,开启 KGDB ,被调试的机器,在此物理机上通过 virtualbox 安装的虚拟机

配置虚拟机

配置串口并设置主机 pip 通信路径为 /tmp/debuglinux

物理机编译内核

这次通过开发机来编译内核拷贝到调试机上面去

内核源码地址在 /root/workspace/linux-learn,安装 deb 包如下:

bash 复制代码
# apt install
gcc
make
perl
flex
bison
pkg-config
libncurses-dev
libelf-devel
build-essential

开始编译

bash 复制代码
cd /root/workspace/linux-learn
mkdir build
make mrproper
make O=build defconfig
make O=build menuconfig
  • 配置项与 openEuler 相似,主要是开启 KGDB,关闭内核的的随机地址空间布局(KASLR),开启 debuginfo
bash 复制代码
make O=build -j8

将整个内核目录复制到 virtulbox 调试机的相同目录下:

bash 复制代码
cd /root/workspace
rsync -avzW linux-learn [email protected]:/root/workspace

配置调试机

安装内核

bash 复制代码
cd /root/workspace/linux-learn
make O=build modules_install
make O=build install

更新 grub

当前版本的 ubuntu 使用的 grub,上面的 openEuler 使用的 grub2

bash 复制代码
# 添加
vim /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="kgdboc=ttyS0,115200 nokaslr"

grub-update

重启虚拟机,进入编译好的系统如下图

kgdb Debugger

  1. 调试机执行echo g > /proc/sysrq-trigger将控制交给 gdb
  2. 开发机即主机执行:
bash 复制代码
cd /root/workspace/linux-learn/build
target remote /tmp/debuglinux

尝试断点:

bash 复制代码
(gdb) b vfs_write
Breakpoint 1 at 0xffffffff811c34cf: file ../fs/read_write.c, line 586.
(gdb) c
Continuing.
[Switching to Thread 377]

Thread 78 hit Breakpoint 1, vfs_write (file=file@entry=0xffff888101ddf800, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at ../fs/read_write.c:586
586	{

(gdb) bt
#0  vfs_write (file=file@entry=0xffff888101ddf800, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, 
    count=count@entry=1, pos=pos@entry=0x0 <fixed_percpu_data>) at ../fs/read_write.c:586
#1  0xffffffff811c3786 in ksys_write (fd=<optimized out>, buf=0x7ffe9e60766f <error: Cannot access memory at address 0x7ffe9e60766f>, 
    count=1) at ../fs/read_write.c:658
#2  0xffffffff811c37eb in __do_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at ../fs/read_write.c:670
#3  __se_sys_write (count=<optimized out>, buf=<optimized out>, fd=<optimized out>) at ../fs/read_write.c:667
#4  __x64_sys_write (regs=<optimized out>) at ../fs/read_write.c:667
#5  0xffffffff81a1a921 in do_syscall_64 (nr=<optimized out>, regs=0xffffc90000407f58) at ../arch/x86/entry/common.c:46
#6  0xffffffff81c0011f in entry_SYSCALL_64 () at ../arch/x86/entry/entry_64.S:117
#7  0x00005610590964e5 in ?? ()
#8  0x0000561091a5ac50 in ?? ()
#9  0x00007f6214955a60 in ?? ()

总结

本文主要汇总了下近期调试内核与内核模块的配置方式。

  1. 使用两台虚拟机通信,一台启动 gdb,另一台启动 kgdb
  2. 物理机本身为 linux 系统启动 gdb,使用虚拟机启动 kgdb
  3. 建议每次修改代码重新编译内核后同步到两个系统的相同的目录下,可以省去很多麻烦
相关推荐
大数据魔法师10 分钟前
Redis(一) - Redis安装教程(Windows + Linux)
linux·windows·redis
Y1anoohh21 分钟前
驱动学习专栏--字符设备驱动篇--2_字符设备注册与注销
linux·c语言·驱动开发·学习
.R^O^1 小时前
计算机知识
linux·服务器·网络·安全
卡戎-caryon1 小时前
【Linux网络与网络编程】11.数据链路层mac帧协议&&ARP协议
linux·服务器·网络·笔记·tcp/ip·数据链路层
夜月yeyue2 小时前
STM32启动流程详解
linux·c++·stm32·单片机·嵌入式硬件·c#
烛.照1032 小时前
RabbitMQ消息的可靠性
linux·docker·rabbitmq
RLG_星辰2 小时前
prime-2 靶场笔记(vuInhub靶场)
linux·笔记·网络安全·渗透测试·权限维持·smb协议·lxd提权
CoolScript2 小时前
WSL2 配置和离线安装linux系统。
linux·运维·服务器
稀饭过霍2 小时前
【linux】命令收集
linux·服务器·网络
珹洺2 小时前
Linux红帽:RHCSA认证知识讲解(十 三)在serverb上破解root密码
linux·运维·服务器·网络·后端