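Preface
A summary of techniques for debugging multithreaded programs with gdb.
Debugging multithreaded programs with gdb
The test program creates two threads. Each thread increments a value in a loop and calls sleep on every iteration.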
Viewing thread information
shell
info threads
(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff7faa740 (LWP 2778) "sem" clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
* 2 Thread 0x7ffff7bff640 (LWP 2781) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2782) "sem" __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
(gdb) bt
#0 enqueue (arg=0x7fffffffe300) at test_sem.c:48
#1 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#2 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
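Each thread has three IDs:
The pthread ID assigned by the pthread library, which is the value returned by pthread_self(), e.g. Thread 0x7ffff7faa740.
The thread ID assigned by the Linux kernel, which is the value returned by gettid(), e.g. LWP 2778.
The ID assigned by GDB, i.e. the 1, 2, 3 in the first column. Unless stated otherwise, this is the ID that GDB commands expect when they take a thread ID.
In the output above, the * marks the current thread, which is thread 2. You can also run ps -eT | grep sem in another terminal to view the threads.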
shell
root@keep-VirtualBox:~# ps -eT | grep sem
2778 2778 pts/1 00:00:00 sem
2778 2781 pts/1 00:00:00 sem
2778 2782 pts/1 00:00:00 sem
root@keep-VirtualBox:~#
By default, GDB commands act on the current thread. For example, running bt (backtrace) at this point prints the call stack of thread 2.
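Switching the current thread
The thread command switches the current thread. For example, thread 1 makes thread 1 the current thread.
Running commands against the selected thread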
shell
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7faa740 (LWP 2778))]
#0 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
62 ../sysdeps/unix/sysv/linux/x86_64/clone3.S: No such file or directory.
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2778) "sem" clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
2 Thread 0x7ffff7bff640 (LWP 2781) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2782) "sem" __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
(gdb) bt
#0 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1 0x00007ffff7d26a91 in __GI___clone_internal (cl_args=cl_args@entry=0x7fffffffe0b0,
func=func@entry=0x7ffff7c947d0 <start_thread>, arg=arg@entry=0x7ffff73fe640)
at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2 0x00007ffff7c946d9 in create_thread (pd=pd@entry=0x7ffff73fe640, attr=attr@entry=0x7fffffffe1d0,
stopped_start=stopped_start@entry=0x7fffffffe1ce, stackaddr=stackaddr@entry=0x7ffff6bfe000, stacksize=8388352,
thread_ran=thread_ran@entry=0x7fffffffe1cf) at ./nptl/pthread_create.c:295
#3 0x00007ffff7c95200 in __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>,
start_routine=<optimized out>, arg=<optimized out>) at ./nptl/pthread_create.c:828
#4 0x00005555555554ad in main () at test_sem.c:77
(gdb)
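Running a command on specified threads
thread apply [thread-id-list | all] command runs a command on the specified threads. For example:
thread apply all bt: print the call stack of every thread
thread apply 3 bt: print the call stack of thread 3
thread apply 2-3 bt: print the call stacks of threads 2 and 3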
shell
(gdb) thread apply all bt
Thread 3 (Thread 0x7ffff73fe640 (LWP 2782) "sem"):
#0 __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
#1 0x00005555555552d9 in dequeue (arg=0x7fffffffe300) at test_sem.c:31
#2 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 2 (Thread 0x7ffff7bff640 (LWP 2781) "sem"):
#0 enqueue (arg=0x7fffffffe300) at test_sem.c:48
#1 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#2 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 1 (Thread 0x7ffff7faa740 (LWP 2778) "sem"):
#0 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1 0x00007ffff7d26a91 in __GI___clone_internal (cl_args=cl_args@entry=0x7fffffffe0b0, func=func@entry=0x7ffff7c947d0 <start_thread>, arg=arg@entry=0x7ffff73fe640) at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2 0x00007ffff7c946d9 in create_thread (pd=pd@entry=0x7ffff73fe640, attr=attr@entry=0x7fffffffe1d0, stopped_start=stopped_start@entry=0x7fffffffe1ce, stackaddr=stackaddr@entry=0x7ffff6bfe000, stacksize=8388352, thread_ran=thread_ran@entry=0x7fffffffe1cf) at ./nptl/pthread_create.c:295
#3 0x00007ffff7c95200 in __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at ./nptl/pthread_create.c:828
#4 0x00005555555554ad in main () at test_sem.c:77
(gdb)
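Setting a breakpoint for a specific thread
break LOCATION # set a breakpoint at LOCATION that applies to all threads
break LOCATION thread THREAD-ID # set a breakpoint at LOCATION that applies only to the given thread
break LOCATION thread THREAD-ID if CONDITION # set a conditional breakpoint at LOCATION that applies only to the given thread
Controlling thread creation and exit messages
By default, gdb prints a message whenever a thread is created or exits. These messages can be turned off with the following command.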
set print thread-events on/off
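Command abbreviations
taas command is equivalent to thread apply all -s command
tfaas command is equivalent to thread apply all -s -- frame apply all -s command
tfaas is particularly useful. Sometimes you remember the name of a variable or parameter but have forgotten, or never knew, which function it lives in. In that case, run tfaas p var_name: it evaluates the expression in every frame of every thread and, because of -s, silently skips the frames where var_name is not in scope, so only the matches are printed. For example: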
shell
(gdb) tfaas p enq
Thread 1 (Thread 0x7ffff7faa740 (LWP 2956) "sem"):
#4 0x0000555555555499 in main () at test_sem.c:93
$1 = 140737349940800
(gdb)
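Two modes for controlling program execution
To make multithreaded programs easier to debug, GDB provides two modes for controlling execution:
All-Stop Mode: whenever one thread stops, for whatever reason, all other threads are stopped as well.
Non-Stop Mode: when one thread stops, the other threads keep running unaffected.
All-Stop Mode
GDB uses All-Stop Mode by default. This brings some difficulties of its own; for example, single-stepping cannot be made 100% precise. Sometimes, after a step command, you will find that the program has stopped in a different thread.
The set scheduler-locking mode command controls which threads are allowed to run when execution resumes.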
shell
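# mode is one of:
# off -- no locking; all threads run when execution resumes
# on -- lock to the current thread; after continue, step, next, finish, etc., only the current thread runs and the other threads stay stopped
# step -- during single-stepping only the current thread runs; commands such as continue, until, and finish let all threads run
# replay -- behaves like on when replaying a recorded execution (reverse debugging) and like off otherwise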
(gdb) show scheduler-locking
Mode for locking scheduler during execution is "replay".
(gdb) set scheduler-locking on
(gdb) show scheduler-locking
Mode for locking scheduler during execution is "on".
(gdb)
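Non-Stop Mode
In Non-Stop Mode, stopping one thread does not affect the others. For example, when a thread hits a breakpoint, only that thread stops and the rest keep running. Likewise, pressing Ctrl+C while the program is running interrupts only one thread.
Enabling Non-Stop Mode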
shell
set pagination off
set non-stop on
set non-stop off # switch back to All-Stop Mode
show non-stop
(gdb) set non-stop on
(gdb) show non-stop
Controlling the inferior in non-stop mode is on.
(gdb) b enqueue
Breakpoint 1 at 0x138d: file test_sem.c, line 48.
(gdb) r
Starting program: /media/VM_SHARE/code/blog_code/condition/sem
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7bff640 (LWP 2930)]
[New Thread 0x7ffff73fe640 (LWP 2931)]
Thread 2 "sem" hit Breakpoint 1, enqueue (arg=0x7fffffffe300) at test_sem.c:48
48 thrd_args_t *thrd_args = (thrd_args_t *)arg;
(gdb) set print thread-events off
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2927) "sem" (running)
2 Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2931) "sem" (running)
(gdb)
(gdb) interrupt -a
(gdb)
Thread 3 "sem" stopped.
0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
Thread 1 "sem" stopped.
__futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=2930, futex_word=0x7ffff7bff910) at ./nptl/futex-internal.c:57
57 ./nptl/futex-internal.c: No such file or directory.
info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2927) "sem" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265,
expected=2930, futex_word=0x7ffff7bff910) at ./nptl/futex-internal.c:57
2 Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2931) "sem" 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0,
flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
(gdb)
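The first info threads output shows that when the breakpoint in enqueue is hit, only thread 2 stops; threads 1 and 3 keep running.
The interrupt -a command stops all threads.
Running commands in the background
Just as appending & to a shell command runs it in the background, appending & to a GDB command runs that command in the background, so gdb can keep accepting commands. For example, you can use run &, then interrupt to stop a thread and inspect the thread states.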
shell
(gdb) c&
Continuing.
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2927) "sem" (running)
2 Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2931) "sem" 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0,
flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
(gdb) thread apply all bt
Thread 3 (Thread 0x7ffff73fe640 (LWP 2931) "sem"):
#0 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#1 0x00007ffff7cea677 in __GI___nanosleep (req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2 0x00007ffff7cea5ae in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#3 0x00005555555552d9 in dequeue (arg=0x7fffffffe300) at test_sem.c:31
#4 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#5 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 2 (Thread 0x7ffff7bff640 (LWP 2930) "sem"):
#0 __sleep (seconds=1) at ../sysdeps/posix/sleep.c:34
#1 0x00005555555553b5 in enqueue (arg=0x7fffffffe300) at test_sem.c:52
#2 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Thread 1 (Thread 0x7ffff7faa740 (LWP 2927) "sem"):
Selected thread is running.
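More
This post is mainly a record of how to use these commands, for easy lookup later. We often remember having used or seen a command but cannot recall exactly how it works; writing a blog post helps with that.
Test program source: gitee.com/fishmwei/bl...
Take action, and you won't be caught off guard!
Follow my WeChat official account to get in touch: WeChat -> Search -> fishmwei.
Blog: fishmwei.github.io
Juejin profile: juejin.cn/user/208432...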