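Debugging Multithreaded Programs with GDB

Preface

This post summarizes techniques for debugging multithreaded programs with GDB.

Multithreaded debugging with GDB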

The test program creates two threads. Each thread runs a loop that increments a value, and each calls sleep inside its loop.

Viewing thread information

info threads

(gdb) info threads
  Id   Target Id                              Frame
  1    Thread 0x7ffff7faa740 (LWP 2778) "sem" clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
* 2    Thread 0x7ffff7bff640 (LWP 2781) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
  3    Thread 0x7ffff73fe640 (LWP 2782) "sem" __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
(gdb) bt
#0  enqueue (arg=0x7fffffffe300) at test_sem.c:48
#1  0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#2  0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

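Each thread has three IDs:

1. The pthread ID assigned by the pthread library, as returned by pthread_self(): Thread 0x7ffff7faa740
2. The thread ID assigned by the Linux kernel, as returned by gettid(): LWP 2778
3. The ID assigned by GDB: the leading 1, 2, 3. Unless stated otherwise, this is the ID meant whenever a GDB command takes a thread ID.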

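In the listing above, the * marks the current thread, which is thread 2. To see the same threads from the operating system's side, run ps -eT | grep sem in another terminal.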

root@keep-VirtualBox:~# ps -eT | grep sem
   2778    2778 pts/1    00:00:00 sem
   2778    2781 pts/1    00:00:00 sem
   2778    2782 pts/1    00:00:00 sem
root@keep-VirtualBox:~#

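By default, GDB commands apply to the current thread. Here, for example, bt (backtrace) prints the call stack of thread 2.

Switching the current thread

The thread command switches the current thread. For example, thread 1 makes thread 1 the current thread, after which bt prints the call stack of thread 1: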

(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7faa740 (LWP 2778))]
#0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
62      ../sysdeps/unix/sysv/linux/x86_64/clone3.S: No such file or directory.
(gdb) info threads
  Id   Target Id                              Frame
* 1    Thread 0x7ffff7faa740 (LWP 2778) "sem" clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
  2    Thread 0x7ffff7bff640 (LWP 2781) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
  3    Thread 0x7ffff73fe640 (LWP 2782) "sem" __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
(gdb) bt
#0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1  0x00007ffff7d26a91 in __GI___clone_internal (cl_args=cl_args@entry=0x7fffffffe0b0,
    func=func@entry=0x7ffff7c947d0 <start_thread>, arg=arg@entry=0x7ffff73fe640)
    at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2  0x00007ffff7c946d9 in create_thread (pd=pd@entry=0x7ffff73fe640, attr=attr@entry=0x7fffffffe1d0,
    stopped_start=stopped_start@entry=0x7fffffffe1ce, stackaddr=stackaddr@entry=0x7ffff6bfe000, stacksize=8388352,
    thread_ran=thread_ran@entry=0x7fffffffe1cf) at ./nptl/pthread_create.c:295
#3  0x00007ffff7c95200 in __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>,
    start_routine=<optimized out>, arg=<optimized out>) at ./nptl/pthread_create.c:828
#4  0x00005555555554ad in main () at test_sem.c:77
(gdb)

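Running a command on specific threads

thread apply [thread-id-list | all] command runs a command on the specified threads. For example:

thread apply all bt: print the call stacks of all threads
thread apply 3 bt: print the call stack of thread 3
thread apply 2-3 bt: print the call stacks of threads 2 and 3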

(gdb) thread apply all bt

Thread 3 (Thread 0x7ffff73fe640 (LWP 2782) "sem"):
#0  __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
#1  0x00005555555552d9 in dequeue (arg=0x7fffffffe300) at test_sem.c:31
#2  0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3  0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 2 (Thread 0x7ffff7bff640 (LWP 2781) "sem"):
#0  enqueue (arg=0x7fffffffe300) at test_sem.c:48
#1  0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#2  0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x7ffff7faa740 (LWP 2778) "sem"):
#0  clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1  0x00007ffff7d26a91 in __GI___clone_internal (cl_args=cl_args@entry=0x7fffffffe0b0, func=func@entry=0x7ffff7c947d0 <start_thread>, arg=arg@entry=0x7ffff73fe640) at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2  0x00007ffff7c946d9 in create_thread (pd=pd@entry=0x7ffff73fe640, attr=attr@entry=0x7fffffffe1d0, stopped_start=stopped_start@entry=0x7fffffffe1ce, stackaddr=stackaddr@entry=0x7ffff6bfe000, stacksize=8388352, thread_ran=thread_ran@entry=0x7fffffffe1cf) at ./nptl/pthread_create.c:295
#3  0x00007ffff7c95200 in __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at ./nptl/pthread_create.c:828
#4  0x00005555555554ad in main () at test_sem.c:77
(gdb)

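Setting breakpoints for a specific thread

break LOCATION # set a breakpoint at LOCATION that applies to all threads
break LOCATION thread THREAD-ID # set a breakpoint at LOCATION that applies only to the given thread
break LOCATION thread THREAD-ID if CONDITION # set a conditional breakpoint at LOCATION that applies only to the given thread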

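Controlling thread creation and exit messages

By default, GDB prints a message whenever a thread is created or exits. These messages can be turned off with the following command: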

set print thread-events on/off

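Command aliases

taas command is equivalent to thread apply all -s command
tfaas command is equivalent to thread apply all -s -- frame apply all -s command

tfaas is especially useful. Sometimes we remember the name of a variable or parameter but have forgotten, or never knew, which function it belongs to. In that case, run tfaas p var_name. It walks every frame of every thread's call stack, silently skips the frames where var_name does not resolve (that is what -s does), and prints the value wherever it is found. For example: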

(gdb) tfaas p enq

Thread 1 (Thread 0x7ffff7faa740 (LWP 2956) "sem"):
#4  0x0000555555555499 in main () at test_sem.c:93
$1 = 140737349940800
(gdb)

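Two modes for controlling program execution

To make multithreaded programs easier to debug, GDB provides two modes for controlling execution:

All-Stop Mode: whenever one thread stops, for whatever reason, all other threads stop at the same time.
Non-Stop Mode: when one thread stops, the other threads keep running normally.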

All-Stop Mode

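GDB runs in All-Stop Mode by default. This makes debugging harder in some ways; for instance, single-stepping is not 100% precise. Because every thread resumes while you step, another thread may hit a breakpoint first, so after a step command you sometimes find the program stopped in a different thread.

The set scheduler-locking mode command controls which threads are allowed to run when execution resumes.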

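# mode
# off    -- no locking; all threads run when execution resumes
# on     -- lock to the current thread; after continue, step, next, finish, etc., only the current thread runs and the other threads stay stopped
# step   -- while single-stepping with step, only the current thread runs and the other threads stay stopped; other commands resume all threads
# replay -- when replaying a recorded execution (reverse debugging), only the current thread runs and the other threads stay stopped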


(gdb) show scheduler-locking
Mode for locking scheduler during execution is "replay".
(gdb) set scheduler-locking on
(gdb) show scheduler-locking
Mode for locking scheduler during execution is "on".
(gdb)

Non-Stop Mode

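In Non-Stop Mode, stopping one thread does not affect the others. For example, when a thread hits a breakpoint, only that thread stops and the rest keep running. Likewise, pressing Ctrl+C while the program is running interrupts only one thread. Turn the mode on before starting or attaching to the program, as the session below does, because GDB consults the setting only at that point.

Enabling Non-Stop Mode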

set pagination off
set non-stop on

set non-stop off # switch back to All-Stop Mode

show non-stop

(gdb) set non-stop on
(gdb) show non-stop
Controlling the inferior in non-stop mode is on.
(gdb) b enqueue
Breakpoint 1 at 0x138d: file test_sem.c, line 48.
(gdb) r
Starting program: /media/VM_SHARE/code/blog_code/condition/sem
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7bff640 (LWP 2930)]
[New Thread 0x7ffff73fe640 (LWP 2931)]

Thread 2 "sem" hit Breakpoint 1, enqueue (arg=0x7fffffffe300) at test_sem.c:48
48              thrd_args_t *thrd_args = (thrd_args_t *)arg;
(gdb) set print thread-events off
(gdb) info threads
  Id   Target Id                              Frame
* 1    Thread 0x7ffff7faa740 (LWP 2927) "sem" (running)
  2    Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
  3    Thread 0x7ffff73fe640 (LWP 2931) "sem" (running)
(gdb)
(gdb) interrupt -a
(gdb)
Thread 3 "sem" stopped.
0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78      ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.

Thread 1 "sem" stopped.
__futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=2930, futex_word=0x7ffff7bff910) at ./nptl/futex-internal.c:57
57      ./nptl/futex-internal.c: No such file or directory.
info threads
  Id   Target Id                              Frame
* 1    Thread 0x7ffff7faa740 (LWP 2927) "sem" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265,
    expected=2930, futex_word=0x7ffff7bff910) at ./nptl/futex-internal.c:57
  2    Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
  3    Thread 0x7ffff73fe640 (LWP 2931) "sem" 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0,
    flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0)
    at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
(gdb)

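As the session shows, when the breakpoint at enqueue is hit, only thread 2 stops; threads 1 and 3 keep running.

The interrupt -a command stops all threads.

Running commands in the background

Just as appending & to a shell command runs the program in the background, appending & to a GDB command runs that command in the background. GDB can then keep accepting commands. For example, we can issue run &, then use interrupt to stop a thread and go on inspecting thread state.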

(gdb) c&
Continuing.
(gdb) info threads
  Id   Target Id                              Frame
* 1    Thread 0x7ffff7faa740 (LWP 2927) "sem" (running)
  2    Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
  3    Thread 0x7ffff73fe640 (LWP 2931) "sem" 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0,
    flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0)
    at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
(gdb) thread apply all bt

Thread 3 (Thread 0x7ffff73fe640 (LWP 2931) "sem"):
#0  0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#1  0x00007ffff7cea677 in __GI___nanosleep (req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x00007ffff7cea5ae in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#3  0x00005555555552d9 in dequeue (arg=0x7fffffffe300) at test_sem.c:31
#4  0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#5  0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 2 (Thread 0x7ffff7bff640 (LWP 2930) "sem"):
#0  __sleep (seconds=1) at ../sysdeps/posix/sleep.c:34
#1  0x00005555555553b5 in enqueue (arg=0x7fffffffe300) at test_sem.c:52
#2  0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3  0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x7ffff7faa740 (LWP 2927) "sem"):
Selected thread is running.

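More

This post is mainly a record of how to use these commands, kept for easy lookup later. We often use or read about a command, remember that it exists, and still cannot recall how to use it. Writing it down in a blog post helps with exactly that.

Test program source: gitee.com/fishmwei/bl...


Take action, and you won't be caught off guard!

Feel free to follow my WeChat official account (WeChat -> Search -> fishmwei) to get in touch.

Blog: fishmwei.github.io

Juejin profile: juejin.cn/user/208432...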

