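Traveling Back Six Years Just to Revisit One Problem

Introduction

By chance I found a six-year-old Ramdump together with its matching vmlinux file. That prompted a bold idea: adapt the tooling to older Android versions (still unfinished) and use the Ramdump as a side channel to look back at a very old problem, even though that problem has long since been solved. It was also one of the reasons I designed the virtual machine parser in the first place. To solve it back then, I did some light research into the ART virtual machine and wrote a basic version of the parser on top of gdb.

The problem, however, behaved as if it had wave-particle duality. With a debug build that triggered a Coredump, it vanished without a trace; with the debug build removed, it came back every few days and would not go away. It dragged on intermittently for nearly half a year. In the end I found the clues by comparing several occurrences of the same kind of problem with different symptoms, and worked the cause out from the code.

The Original Problem

After digging through a large amount of material, I finally found a tombstone file for this problem. A quick look at the stack shows that the program jumped into a memory segment without execute permission, which raised the fault. In this situation the bad jump is usually made directly from the code that lr points at.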

pid: 598, tid: 1371, name: Binder_B  >>> system_server <<<
signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x6ffc64a8
    r0 00000000  r1 12d3c2e0  r2 00000000  r3 150e3400
    r4 00000000  r5 12d32b80  r6 702492d0  r7 00000000
    r8 00000001  r9 87e6f500  sl 000053b8  fp 87b3fd5c
    ip 00000000  sp 87b3fbb0  lr 7270c021  pc 6ffc64a8  cpsr 400f0010
... ...

backtrace:
    #00 pc 0055c4a8  /data/dalvik-cache/arm/system@[email protected]
    #01 pc 7270c01f  /data/dalvik-cache/arm/system@[email protected] (offset 0x1fbc000)

FakeCore Analysis

core-parser -t tombstone_00
core-parser> rd 0x7270c000 -e 0x7270c03c -i
0x7270c000: c000f8dc | ldr.w r12, [r12]
0x7270c004: 41e0e92d | push.w {r5, r6, r7, r8, lr}
0x7270c008:     b087 | sub sp, #0x1c
0x7270c00a:     4680 | mov r8, r0
0x7270c00c:     9000 | str r0, [sp]
0x7270c00e:     1c0f | adds r7, r1, #0
0x7270c010:     6abd | ldr r5, [r7, #0x28]
0x7270c012:     1c29 | adds r1, r5, #0
0x7270c014:     6808 | ldr r0, [r1]
0x7270c016: 01c8f8d0 | ldr.w r0, [r0, #0x1c8]
0x7270c01a: e024f8d0 | ldr.w lr, [r0, #0x24]
0x7270c01e:     47f0 | blx lr
0x7270c020:     1c05 | adds r5, r0, #0
0x7270c022:     b1b5 | cbz r5, 0x7270c052
0x7270c024: 0044f8df | ldr.w r0, [pc, #0x44]
0x7270c028: e11cf8d9 | ldr.w lr, [r9, #0x11c]
0x7270c02c:     4641 | mov r1, r8
0x7270c02e:     47f0 | blx lr
0x7270c030:     1c05 | adds r5, r0, #0
0x7270c032: 36d6f645 | movw r6, #0x5bd6
0x7270c036: 5639f6cf | movt r6, #0xfd39
0x7270c03a:     447e | add r6, pc
0x7270c03c: 0028f8df | ldr.w r0, [pc, #0x28]

Judging from the final registers, we would normally assume the fault happened with pc = 0x7270c01e and lr = 0x6ffc64a8, because a blx lr that crashes natively under exactly that condition leaves lr = 0x7270c021 and pc = 0x6ffc64a8.
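As a quick check of that reading, the lr value can be derived from the address of the blx instruction. The sketch below is only an illustration: it assumes the usual ARM behaviour that a 16-bit Thumb blx stores the address of the next instruction in lr with bit 0 set, and the helper name lr_after_thumb_blx is made up for this example.

```python
def lr_after_thumb_blx(insn_addr, insn_size=2):
    """Return the lr value left behind by a Thumb `blx <reg>` at insn_addr.

    Assumption: lr receives the address of the following instruction with
    bit 0 set, which marks the return target as Thumb code.
    """
    return (insn_addr + insn_size) | 1


# `blx lr` is the 2-byte instruction at 0x7270c01e in the tombstone listing.
print(hex(lr_after_thumb_blx(0x7270C01E)))  # prints 0x7270c021
```

The result matches the lr recorded in the tombstone, which is why the blx lr at 0x7270c01e is the natural first suspect.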

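The Anomaly

From the assembly and the segment it belongs to, it is easy to see that the code near lr is the call sequence for an art::ArtMethod. The final value r0 = 0x0, however, clearly cannot have been produced by this sequence: lr is loaded from [r0, #0x24] immediately before the jump, so r0 must hold a non-null art::ArtMethod pointer at that moment.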

// r1 is the this object
0x7270c00e:     1c0f | adds r7, r1, #0

// Load the Object address stored at offset 0x28 of the this object
// r5 is a member object of this
0x7270c010:     6abd | ldr r5, [r7, #0x28]
0x7270c012:     1c29 | adds r1, r5, #0

// r0 is now the klass_ of the r5 object
0x7270c014:     6808 | ldr r0, [r1]

// Load the art::ArtMethod address at offset 0x1c8 from the klass_ memory
0x7270c016: 01c8f8d0 | ldr.w r0, [r0, #0x1c8]

// Set lr = entry_point_from_quick_compiled_code_
0x7270c01a: e024f8d0 | ldr.w lr, [r0, #0x24]
0x7270c01e:     47f0 | blx lr

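Yet the registers show that this sequence must have been executed. The likely explanation is that blx lr did not jump to 0x6ffc64a8 but to some other location, that the function there clobbered r0 and other registers, and that execution only reached 0x6ffc64a8 afterwards.

Extracting the Core File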

crash-android-arm vmlinux ddr.bin@0x10000000
crash-android-arm> ps | grep system_server
      708     253   2  d57e3000  UN   4.5  1732540   187832  system_server
      731     253   3  d5c9ce00  IN   4.5  1732540   187832  system_server
      763     253   0  d5f58c00  RU   4.5  1732540   187832  system_server
     1307     253   1  d167d400  UN   4.5  1732540   187832  system_server
crash-android-arm> extend output/arm/linux-parser.so
output/arm/linux-parser.so: shared object loaded
crash-android-arm> set 708
    PID: 708
COMMAND: "system_server"
   TASK: d57e3000  [THREAD_INFO: d5d18000]
    CPU: 2
  STATE: TASK_UNINTERRUPTIBLE 
crash-android-arm> lp core
Saved [708.core].
crash-android-arm>
Core load (0x5de0b9c54c80) 708.core
Core env:
  * Path: 708.core
  * Machine: arm
  * Bits: 32
  * PointSize: 4
  * PointMask: 0xffffffff
  * VabitsMask: 0xffffffff
  * PageSize: 0x1000
  * Remote: false
  * Thread: 708
Switch android(23) env.
Android env:
  * ABIS: armeabi-v7a,armeabi
  * Release: 6.0.1
  * Type: user
  * Time: 1568044152
  * Debuggable: 0
  * Sdk: 23
core-parser> 

Judging from Time = 1568044152, this Ramdump was produced some time after Monday, 2019-09-09 15:49:12 (UTC).
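The conversion can be reproduced with the standard library. This is a small sketch: epoch_to_utc is an illustrative name, and it simply treats the Time value as seconds since the Unix epoch.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Format a Unix timestamp (in seconds) as a UTC date and time."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


print(epoch_to_utc(1568044152))  # prints 2019-09-09 15:49:12 UTC
```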

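Parsing the Core

Because I had analyzed this problem before, I could identify the function containing lr directly as checkNotFreed of DirectByteBuffer, so here we simply look that function up in the Core.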

core-parser> class java.nio.DirectByteBuffer 
[0x710a2008]
class java.nio.DirectByteBuffer extends java.nio.MappedByteBuffer {
  // Implements:
    java.lang.Comparable

  // Object instance fields:
    [0x034] private final boolean isReadOnly
    [0x030] protected final int offset

  // extends java.nio.MappedByteBuffer
    [0x02c] final java.nio.channels.FileChannel$MapMode mapMode
    [0x028] final java.nio.MemoryBlock block

  // extends java.nio.ByteBuffer
    [0x024] java.nio.ByteOrder order

  // extends java.nio.Buffer
    [0x020] int position
    [0x01c] int mark
    [0x018] int limit
    [0x014] final int capacity
    [0x010] final int _elementSizeShift
    [0x008] final long effectiveDirectAddress

  // extends java.lang.Object
    [0x004] private transient int shadow$_monitor_
    [0x000] private transient java.lang.Class shadow$_klass_

  // Methods:
    [0x712d84a8] void java.nio.DirectByteBuffer.<init>(long, int)
    [0x712d84d0] protected void java.nio.DirectByteBuffer.<init>(java.nio.MemoryBlock, int, int, boolean, java.nio.channels.FileChannel$MapMode)
    [0x712d84f8] private void java.nio.DirectByteBuffer.checkIsAccessible()
    [0x712d8520] private void java.nio.DirectByteBuffer.checkNotFreed()
... ...
core-parser> method 0x712d8520 --dex-dump --oat-dump
private void java.nio.DirectByteBuffer.checkNotFreed() [dex_method_idx=15945]
DEX CODE:
  0x71747e30: 2054 2938                | iget-object v0, v2, Ljava/nio/DirectByteBuffer;.block:Ljava/nio/MemoryBlock; // field@10552
  0x71747e34: 106e 3fac 0000           | invoke-virtual {v0}, boolean java.nio.MemoryBlock.isFreed() // method@16300
  0x71747e3a: 000a                     | move-result v0
  0x71747e3c: 0038 000b                | if-eqz v0, 0x71747e52 //+11
  0x71747e40: 0022 051f                | new-instance v0, java.lang.IllegalStateException // type@1311
  0x71747e44: 011b 50ba 0000           | const-string/jumbo v1, "buffer was freed" // string@20666
  0x71747e4a: 2070 31e6 0010           | invoke-direct {v0, v1}, void java.lang.IllegalStateException.<init>(java.lang.String) // method@12774
  0x71747e50: 0027                     | throw v0
  0x71747e52: 000e                     | return-void
OAT CODE:
  [0x7379e04c, 0x7379e0c0]
  0x7379e04c: 5c00f5bd | subs.w r12, sp, #0x2000
  0x7379e050: c000f8dc | ldr.w r12, [r12]
  0x7379e054: 41e0e92d | push.w {r5, r6, r7, r8, lr}
  0x7379e058:     b087 | sub sp, #0x1c
  0x7379e05a:     4680 | mov r8, r0
  0x7379e05c:     9000 | str r0, [sp]
  0x7379e05e:     1c0f | adds r7, r1, #0
  0x7379e060:     6abd | ldr r5, [r7, #0x28]
  0x7379e062:     1c29 | adds r1, r5, #0
  0x7379e064:     6808 | ldr r0, [r1]
  0x7379e066: 01c8f8d0 | ldr.w r0, [r0, #0x1c8]
  0x7379e06a: e024f8d0 | ldr.w lr, [r0, #0x24]
  0x7379e06e:     47f0 | blx lr
  0x7379e070:     1c05 | adds r5, r0, #0
  0x7379e072:     b1b5 | cbz r5, 0x7379e0a2
  0x7379e074: 0044f8df | ldr.w r0, [pc, #0x44]
  0x7379e078: e11cf8d9 | ldr.w lr, [r9, #0x11c]
  0x7379e07c:     4641 | mov r1, r8
  0x7379e07e:     47f0 | blx lr
  0x7379e080:     1c05 | adds r5, r0, #0
  0x7379e082: 3696f641 | movw r6, #0x1b96
  0x7379e086: 5639f6cf | movt r6, #0xfd39
  0x7379e08a:     447e | add r6, pc
  0x7379e08c: 0028f8df | ldr.w r0, [pc, #0x28]
  0x7379e090:     6836 | ldr r6, [r6]
  0x7379e092:     1c29 | adds r1, r5, #0
  0x7379e094:     1c32 | adds r2, r6, #0
  0x7379e096: fb65f559 | bl 0x734f7764
  0x7379e09a: e260f8d9 | ldr.w lr, [r9, #0x260]
  0x7379e09e:     1c28 | adds r0, r5, #0
  0x7379e0a0:     47f0 | blx lr
  0x7379e0a2: 2000f8b9 | ldrh.w r2, [r9]
  0x7379e0a6:     b912 | cbnz r2, 0x7379e0ae
  0x7379e0a8:     b007 | add sp, #0x1c
  0x7379e0aa: 81e0e8bd | pop.w {r5, r6, r7, r8, pc}
  0x7379e0ae: e25cf8d9 | ldr.w lr, [r9, #0x25c]
  0x7379e0b2:     47f0 | blx lr
  0x7379e0b4:     e7f8 | b 0x7379e0a8
  0x7379e0b6:     0000 | movs r0, r0
  0x7379e0b8:     83f0 | strh r0, [r6, #0x1e]
  0x7379e0ba:     712c | strb r4, [r5, #4]
  0x7379e0bc:     4de8 | ldr r5, [pc, #0x3a0]
  0x7379e0be:     7109 | strb r1, [r1, #4]
core-parser> class java.nio.MemoryBlock
[0x710a32f0]
class java.nio.MemoryBlock extends java.lang.Object {
  // Object instance fields:
    [0x019] private boolean freed
    [0x018] private boolean accessible
    [0x010] protected final long size
    [0x008] protected long address

  // extends java.lang.Object
    [0x004] private transient int shadow$_monitor_
    [0x000] private transient java.lang.Class shadow$_klass_
art::ArtMethod * = 0x710a32f0+0x1c8 = 0x710a34b8
core-parser> rd 0x710a34b8
710a34b8: 712d9dd0712d9da8  ..-q..-q
core-parser> method 712d9da8 --oat-dump --dex-dump
public boolean java.nio.MemoryBlock.isFreed() [dex_method_idx=16300]
DEX CODE:
  0x7174b670: 1055 297d                | iget-boolean v0, v1, Ljava/nio/MemoryBlock;.freed:Z // field@10621
  0x7174b674: 000f                     | return v0
OAT CODE:
  [0x73612e04, 0x73612e08]
  0x73612e04:     7e48 | ldrb r0, [r1, #0x19]
  0x73612e06:     4770 | bx lr
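The slot lookup above can be written out as plain arithmetic. The following is a sketch under one assumption read off the dumps in this article: rd prints each 8-byte group as a single 64-bit little-endian value, so on this 32-bit core the low half is the word at the lower address. The helper name split_qword is illustrative only.

```python
def split_qword(addr, qword):
    """Split one 64-bit value printed by `rd` into its two 32-bit words."""
    return {addr: qword & 0xFFFFFFFF, addr + 4: qword >> 32}


klass = 0x710A32F0    # java.nio.MemoryBlock class object in the Core
slot = klass + 0x1C8  # slot read by `ldr.w r0, [r0, #0x1c8]`
words = split_qword(slot, 0x712D9DD0712D9DA8)

print(hex(slot))         # prints 0x710a34b8
print(hex(words[slot]))  # prints 0x712d9da8
```

The word at the lower address, 0x712d9da8, is the value that resolves to java.nio.MemoryBlock.isFreed().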

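The logic of the code near lr at the anomaly is therefore: checkNotFreed loads this.block, fetches the art::ArtMethod of MemoryBlock.isFreed() from offset 0x1c8 of the block's klass_, and calls its quick-code entry point with blx lr.

Evidence 1

Take any java.nio.MemoryBlock object from memory, look at its memory layout, and compare it with the location that r1 points to.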

core-parser> p 0x74d5d270 -b -x
Size: 0x20
Padding: 0x6
Object Name: java.nio.MemoryBlock$UnmanagedBlock
  // extends java.nio.MemoryBlock
    [0x19] private boolean freed = false
    [0x18] private boolean accessible = true
    [0x10] protected final long size = 0x9b6430
    [0x08] protected long address = 0x71cc17c8
  // extends java.lang.Object
    [0x04] private transient int shadow$_monitor_ = 0x0
    [0x00] private transient java.lang.Class shadow$_klass_ = 0x710a3750
Binary:
74d5d270: 00000000710a3750  0000000071cc17c8  P7.q.......q....
74d5d280: 00000000009b6430  0000000000000001  0d..............

core-parser> file 0x71cc17c8
[71500000, 734bf000)  0000000000000000  /data/dalvik-cache/arm/system@[email protected]

The memory around r1 in the tombstone closely matches the layout of java.nio.MemoryBlock$UnmanagedBlock, so at this point we can be fairly sure that r1 is a MemoryBlock instance.

core-parser> rd 0x12d3c2e0 -e 0x12d3c320
12d3c2e0: 0000000070014550  00000000715e7f9c  PE.p......^q....
12d3c2f0: 00000000004dc6c8  0000000000000001  ..M.............
12d3c300: 000000006fffa1a8  0000000000000070  ...o....p.......
12d3c310: 0000000000000001  0000000000000000  ................

core-parser> file 715e7f9c
[70471000, 7242d000)  0000000000000000  /data/dalvik-cache/arm/system@[email protected]

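Although r1 is a MemoryBlock object, and its freed field being false would indeed leave r0 at 0x0, that does not explain r1 != r5: adds r1, r5, #0 makes the two equal right before the call, and isFreed touches neither register. So the function that blx lr jumped to was not java.nio.MemoryBlock.isFreed.

Evidence 2

Look at the memory around r5 = 12d32b80.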
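To make the comparison concrete, the tombstone words at r1 can be decoded with the field offsets that core-parser reports for java.nio.MemoryBlock. This is a hedged sketch: decode_memory_block is a hypothetical helper, and it assumes the little-endian packing seen in the rd output.

```python
import struct

# 64-bit values printed by `rd` for 0x12d3c2e0 .. 0x12d3c2ff in the tombstone.
R1_QWORDS = [0x0000000070014550, 0x00000000715E7F9C,
             0x00000000004DC6C8, 0x0000000000000001]


def decode_memory_block(qwords):
    """Decode the fields of a java.nio.MemoryBlock from raw dump values.

    Layout taken from the class dump: klass_ @0x00, monitor_ @0x04,
    address @0x08, size @0x10, accessible @0x18, freed @0x19.
    """
    raw = b"".join(struct.pack("<Q", q) for q in qwords)
    klass, monitor, address, size, accessible, freed = struct.unpack_from(
        "<IIqq??", raw, 0)
    return {"klass": klass, "monitor": monitor, "address": address,
            "size": size, "accessible": accessible, "freed": freed}


block = decode_memory_block(R1_QWORDS)
print(hex(block["klass"]), hex(block["address"]), hex(block["size"]))
print(block["accessible"], block["freed"])  # prints True False
```

A freed value of False is what isFreed() would hand back as 0 in r0.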

core-parser> rd 12d32b80 -e 12d32c80
12d32b80: 0000000070012e08  00000000715e7f9c  ...p......^q....
12d32b90: 004dc6c800000000  ffffffff004dc6c8  ......M...M.....
12d32ba0: 6ffc64a800000000  0000000012d3c2e0  .....d.o........
12d32bb0: 0000000000000000  0000000000000000  ................
12d32bc0: 000000006ffe3540  12d1b5c00000000d  @5.o............
12d32bd0: 0000000000000000  0000000000000000  ................
12d32be0: 0000000000000000  0000000000000000  ................
12d32bf0: 0000000000000000  0000000000000000  ................
12d32c00: 000000006ffe4228  ffffffff0000000d  (B.o............
12d32c10: 0000000000000000  0000000000000000  ................
12d32c20: 0000000000000000  0000000000000000  ................
12d32c30: 0000000000000000  0000000000000000  ................
12d32c40: 000000006ffe3540  12d1b6000000000d  @5.o............
12d32c50: 0000000000000000  0000000000000000  ................
12d32c60: 0000000000000000  0000000000000000  ................
12d32c70: 0000000000000000  0000000000000000  ................

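Readers familiar with this kind of layout will quickly notice two special values, at 12d32ba4 and 12d32ba8. The first, at offset 0x24, is the final fault address 6ffc64a8; the second, at offset 0x28, is the address held in r1, 12d3c2e0. Aren't these offsets exactly the memory layout of a java.nio.DirectByteBuffer object? Let's take one such object from the Core for reference.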

core-parser> p 0x74d5d290 -x -b
Size: 0x38
Padding: 0x3
Object Name: java.nio.DirectByteBuffer
    [0x34] private final boolean isReadOnly = false
    [0x30] protected final int offset = 0x0
  // extends java.nio.MappedByteBuffer
    [0x2c] final java.nio.channels.FileChannel$MapMode mapMode = 0x0
    [0x28] final java.nio.MemoryBlock block = 0x74d5d5c0
  // extends java.nio.ByteBuffer
    [0x24] java.nio.ByteOrder order = 0x710577e8
  // extends java.nio.Buffer
    [0x20] int position = 0x0
    [0x1c] int mark = 0xffffffff
    [0x18] int limit = 0x4df7a0
    [0x14] final int capacity = 0x4df7a0
    [0x10] final int _elementSizeShift = 0x0
    [0x08] final long effectiveDirectAddress = 0x7151c8d8
  // extends java.lang.Object
    [0x04] private transient int shadow$_monitor_ = 0x0
    [0x00] private transient java.lang.Class shadow$_klass_ = 0x710a2008
Binary:
74d5d290: 00000000710a2008  000000007151c8d8  ...q......Qq....
74d5d2a0: 004df7a000000000  ffffffff004df7a0  ......M...M.....
74d5d2b0: 710577e800000000  0000000074d5d5c0  .....w.q...t....
74d5d2c0: 0000000000000000  0000000071086bd0  .........k.q....

core-parser> file 710577e8
[70af8000, 71500000)  0000000000000000  /data/dalvik-cache/arm/system@[email protected]
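The same check can be scripted. The sketch below parses the rd lines for r5 and reads the two fields at the DirectByteBuffer offsets. parse_rd is a hypothetical helper written for this example, not part of core-parser, and it assumes the two-column 64-bit little-endian format shown in the dumps.

```python
def parse_rd(lines):
    """Map every 32-bit word address in `rd` output to its value."""
    words = {}
    for line in lines:
        addr_text, rest = line.split(":", 1)
        addr = int(addr_text, 16)
        # The first two columns are 64-bit values; the rest is the ASCII view.
        for i, column in enumerate(rest.split()[:2]):
            qword = int(column, 16)
            words[addr + 8 * i] = qword & 0xFFFFFFFF
            words[addr + 8 * i + 4] = qword >> 32
    return words


R5_DUMP = [
    "12d32b80: 0000000070012e08  00000000715e7f9c  ...p......^q....",
    "12d32b90: 004dc6c800000000  ffffffff004dc6c8  ......M...M.....",
    "12d32ba0: 6ffc64a800000000  0000000012d3c2e0  .....d.o........",
]
R5 = 0x12D32B80

words = parse_rd(R5_DUMP)
order = words[R5 + 0x24]  # java.nio.ByteOrder order
block = words[R5 + 0x28]  # java.nio.MemoryBlock block

print(hex(order))  # prints 0x6ffc64a8
print(hex(block))  # prints 0x12d3c2e0
```

Read this way, the order field of the r5 object holds the same value as the faulting pc, and its block field holds the value in r1.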

Evidence 3

Look at the memory around r6 = 702492d0.

core-parser> rd 702492d0 -e 702493b0
702492d0: 6fa6e41870012e08  000800116fa6aec8  ...p...o...o....
702492e0: 00003e5f0022bc10  7242d0090000003b  .."._>..;.....Br
702492f0: 7270d0a500000000  6fa6e41870012e08  ......pr...p...o
70249300: 000800116fa6aec8  00003e600022bc48  ...o....H.".`>..
70249310: 7242d0090000003c  7270d13500000000  <.....Br....5.pr
70249320: 6fa6e41870012e08  000800116fa6aec8  ...p...o...o....
70249330: 00003e610022bc9c  7242d0090000003d  ..".a>..=.....Br
70249340: 7270d1ed00000000  6fa6e41870012e08  ......pr...p...o
70249350: 000800116fa6aec8  00003e620022bcd4  ...o......".b>..
70249360: 7242d0090000003e  7270d28500000000  >.....Br......pr
70249370: 6fa6e41870012e08  000800116fa6aec8  ...p...o...o....
70249380: 00003e630022bd28  7242d0090000003f  (.".c>..?.....Br
70249390: 7270d33d00000000  6fa6e41870012e08  ....=.pr...p...o
702493a0: 000800116fa6aec8  00003e640022bd60  ...o....`.".d>..

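Note that the word stored at 702492d0 is exactly the klass_ address of the r5 object (70012e08). The layout is an array of art::ArtMethod entries, so r6 is evidently an art::ArtMethod pointer, specifically the method at index 0x3b of java.nio.DirectByteBuffer. We can therefore find this function in the Core.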

core-parser> method -b 0x712d8980 --dex-dump --oat-dump
public final int java.nio.DirectByteBuffer.getInt(int) [dex_method_idx=15967]
DEX CODE:
  0x717484f8: 1070 3e48 0003           | invoke-direct {v3}, void java.nio.DirectByteBuffer.checkIsAccessible() // method@15944
  0x717484fe: 4012                     | const/4 v0, #+4
  0x71748500: 306e 3e47 0043           | invoke-virtual {v3, v4, v0}, void java.nio.DirectByteBuffer.checkIndex(int, int) // method@15943
  0x71748506: 3054 2938                | iget-object v0, v3, Ljava/nio/DirectByteBuffer;.block:Ljava/nio/MemoryBlock; // field@10552
  0x7174850a: 3152 293e                | iget v1, v3, Ljava/nio/DirectByteBuffer;.offset:I // field@10558
  0x7174850e: 41b0                     | add-int/2addr v1, v4
  0x71748510: 3254 293f                | iget-object v2, v3, Ljava/nio/DirectByteBuffer;.order:Ljava/nio/ByteOrder; // field@10559
  0x71748514: 306e 3fb3 0210           | invoke-virtual {v0, v1, v2}, int java.nio.MemoryBlock.peekInt(int, java.nio.ByteOrder) // method@16307
  0x7174851a: 000a                     | move-result v0
  0x7174851c: 000f                     | return v0
OAT CODE:
  [0x7379f0f4, 0x7379f168]
  0x7379f0f4: 5c00f5bd | subs.w r12, sp, #0x2000
  0x7379f0f8: c000f8dc | ldr.w r12, [r12]
  0x7379f0fc: 45e0e92d | push.w {r5, r6, r7, r8, r10, lr}
  0x7379f100:     b08a | sub sp, #0x28
  0x7379f102:     9000 | str r0, [sp]
  0x7379f104:     1c0d | adds r5, r1, #0
  0x7379f106:     4690 | mov r8, r2
  0x7379f108: 0058f8df | ldr.w r0, [pc, #0x58]
  0x7379f10c:     1c29 | adds r1, r5, #0
  0x7379f10e: ff4df7fe | bl 0x7379dfac
  0x7379f112:     2604 | movs r6, #4
  0x7379f114:     4642 | mov r2, r8
  0x7379f116:     1c29 | adds r1, r5, #0
  0x7379f118:     6808 | ldr r0, [r1]
  0x7379f11a:     1c33 | adds r3, r6, #0
  0x7379f11c: 01ccf8d0 | ldr.w r0, [r0, #0x1cc]
  0x7379f120: e024f8d0 | ldr.w lr, [r0, #0x24]
  0x7379f124:     47f0 | blx lr
  0x7379f126:     6aae | ldr r6, [r5, #0x28]
  0x7379f128: a024f8d5 | ldr.w r10, [r5, #0x24]
  0x7379f12c:     6b2f | ldr r7, [r5, #0x30]
  0x7379f12e: 0030f8df | ldr.w r0, [pc, #0x30]
  0x7379f132: 0708eb17 | adds.w r7, r7, r8
  0x7379f136:     1c31 | adds r1, r6, #0
  0x7379f138:     1c3a | adds r2, r7, #0
  0x7379f13a:     4653 | mov r3, r10
  0x7379f13c:     680c | ldr r4, [r1]
  0x7379f13e: fa01f007 | bl 0x737a6544
  0x7379f142: c000f8b9 | ldrh.w r12, [r9]
  0x7379f146:     1c06 | adds r6, r0, #0
  0x7379f148: 0f00f1bc | cmp.w r12, #0
  0x7379f14c:     d103 | bne 0x7379f156
  0x7379f14e:     1c30 | adds r0, r6, #0
  0x7379f150:     b00a | add sp, #0x28
  0x7379f152: 85e0e8bd | pop.w {r5, r6, r7, r8, r10, pc}
  0x7379f156: e25cf8d9 | ldr.w lr, [r9, #0x25c]
  0x7379f15a:     47f0 | blx lr
  0x7379f15c:     e7f7 | b 0x7379f14e
  0x7379f15e:     0000 | movs r0, r0
  0x7379f160:     9e98 | ldr r6, [sp, #0x260]
  0x7379f162:     712d | strb r5, [r5, #4]
  0x7379f164:     84f8 | strh r0, [r7, #0x26]
  0x7379f166:     712d | strb r5, [r5, #4]
Binary:
712d8980: 70afc428710a2008  0008001170af8ed8  ...q(..p...p....
712d8990: 00003e5f0022bc10  734bf0090000003b  .."._>..;.....Ks
712d89a0: 7379f0f500000000  70afc428710a2008  ......ys...q(..p

Because the methods are laid out as an array, we can take the difference between the addresses in the Core to obtain the address of checkNotFreed in this tombstone.

[0x712d8520] private void java.nio.DirectByteBuffer.checkNotFreed()
[0x712d8980] public final int java.nio.DirectByteBuffer.getInt(int)
(gdb) p /x 0x702492d0-(0x712d8980-0x712d8520)
$5 = 0x70248e70

So the art::ArtMethod pointer for checkNotFreed comes out as 0x70248e70.

Evidence 4

We search the stack memory for the two addresses 0x70248e70 and 0x702492d0.
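The subtraction generalizes to any method of the class. The sketch below repeats it in Python and also checks that the distances are whole multiples of the entry size. The 0x28 stride is read off the dumps in this article (for example 0x712d8520 - 0x712d84f8), and tombstone_method is an illustrative name.

```python
ART_METHOD_SIZE = 0x28  # stride between entries, e.g. 0x712d8520 - 0x712d84f8

# art::ArtMethod addresses listed by core-parser for the Core.
CORE_METHODS = {
    "checkIsAccessible": 0x712D84F8,
    "checkNotFreed": 0x712D8520,
    "getInt": 0x712D8980,
}
TOMBSTONE_GET_INT = 0x702492D0  # r6 in the tombstone


def tombstone_method(name):
    """Translate a Core ArtMethod address into the tombstone's array."""
    delta = CORE_METHODS["getInt"] - CORE_METHODS[name]
    if delta % ART_METHOD_SIZE:
        raise ValueError("methods are not a whole number of entries apart")
    return TOMBSTONE_GET_INT - delta


print(hex(tombstone_method("checkNotFreed")))      # prints 0x70248e70
print(hex(tombstone_method("checkIsAccessible")))  # prints 0x70248e48
```

The second value, 0x70248e48, is the art::ArtMethod of checkIsAccessible in the tombstone, one entry below checkNotFreed.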

core-parser> rd 0x87b3fbb0 -e 0x87b3fcb0
87b3fbb0: 0048000400000000  b6d5c27b8b23f000  ......H...#.{...
87b3fbc0: 0007d8a670248e70  12d32b8000034ccc  p.$p.....L...+..
87b3fbd0: 12d309400007d8a8  7270d0c30007d8a8  ....@.........pr
87b3fbe0: 00000000702492d0  0000000000000000  ..$p............
87b3fbf0: 0000000000000000  0000000000000000  ................
87b3fc00: 000088f66ff6aef8  0007d8a812d32b80  ...o.....+......
87b3fc10: 00000a7712d30940  72640287000053b8  @...w....S....dr
87b3fc20: 12d309407022c950  726bb5df00000000  P."p@.........kr
87b3fc30: 1302ee806ff6aef8  7268bb3300000000  ...o........3.hr
87b3fc40: 0000005b703e3e38  150e340000000a77  8>>p[...w....4..
87b3fc50: 150e340012d30940  726bfe5100000000  @....4......Q.kr
87b3fc60: 00000001703d8178  726be38314d56a60  x.=p....`j....kr
87b3fc70: 00000000703d7fc0  0000000100000000  ..=p............
87b3fc80: 00000000703d81c8  0000000000000001  ..=p............
87b3fc90: 0000000000000000  0000000000000000  ................
87b3fca0: 0000000000000000  0000000000000000  ................
The rows around the two addresses are:
87b3fbc0: 0007d8a670248e70  12d32b8000034ccc  p.$p.....L...+..
87b3fbd0: 12d309400007d8a8  7270d0c30007d8a8  ....@.........pr
87b3fbe0: 00000000702492d0  0000000000000000  ..$p............

This clearly does not fit the stack layout of java.nio.DirectByteBuffer.checkNotFreed.

87b3fbe0: 00000000702492d0  0000000000000000  ..$p............
87b3fbf0: 0000000000000000  0000000000000000  ................
87b3fc00: 000088f66ff6aef8  0007d8a812d32b80  ...o.....+......
87b3fc10: 00000a7712d30940  72640287000053b8  @...w....S....dr

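However, the memory at 87b3fbe0 largely matches the stack layout of java.nio.DirectByteBuffer.getInt(int). If this stack address is right, then 87b3fbdc probably holds the lr saved for the call made from this function. Let's compare the differences.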

| | getInt(int) | checkNotFreed | Difference |
| --- | --- | --- | --- |
| Tombstone | 0x7270d0c3 | 0x7270c021 | 0x10a2 |
| Coredump | 0x7379f112 | 0x7379e070 | 0x10a2 |

The distance between the two functions at the corresponding positions is exactly the same.
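The comparison can be checked mechanically. In this sketch, bit 0 of each address is cleared before subtracting because it only marks Thumb state. The helper names are illustrative, and load_bias is an extra consistency check that is not part of the original analysis.

```python
def strip_thumb(addr):
    """Clear bit 0, which only marks a return address as Thumb code."""
    return addr & ~1


# Return addresses: after the call in getInt(int) and after `blx lr`
# in checkNotFreed, for each image.
TOMBSTONE = {"getInt": 0x7270D0C3, "checkNotFreed": 0x7270C021}
CORE = {"getInt": 0x7379F112, "checkNotFreed": 0x7379E070}


def distance(addrs):
    """Distance between the two return addresses inside one image."""
    return strip_thumb(addrs["getInt"]) - strip_thumb(addrs["checkNotFreed"])


def load_bias(name):
    """Offset between where the same code sits in the Core and the tombstone."""
    return strip_thumb(CORE[name]) - strip_thumb(TOMBSTONE[name])


print(hex(distance(TOMBSTONE)), hex(distance(CORE)))  # prints 0x10a2 0x10a2
print(hex(load_bias("getInt")), hex(load_bias("checkNotFreed")))
```

Both functions are shifted by the same amount between the two images, which supports using the Core's disassembly as a stand-in for the tombstone's code.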

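| Function | SP address | Size | Pushed registers |
| --- | --- | --- | --- |
| Possibly some ArtMethod function | 87b3fbb0 | 0x30 | |
| DirectByteBuffer.getInt(int) | 87b3fbe0 | 0x40 | {r5, r6, r7, r8, r10, lr} |

getInt(int) invokes the following functions: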
invoke-direct {v3}, void java.nio.DirectByteBuffer.checkIsAccessible() // method@15944
invoke-virtual {v3, v4, v0}, void java.nio.DirectByteBuffer.checkIndex(int, int) // method@15943
invoke-virtual {v0, v1, v2}, int java.nio.MemoryBlock.peekInt(int, java.nio.ByteOrder) // method@16307

Of these invoked functions, only java.nio.DirectByteBuffer.checkIsAccessible fits this characteristic.

core-parser> method --oat-dump --dex-dump -b 0x712d84f8
private void java.nio.DirectByteBuffer.checkIsAccessible() [dex_method_idx=15944]
DEX CODE:
  0x71747df4: 1070 3e49 0002           | invoke-direct {v2}, void java.nio.DirectByteBuffer.checkNotFreed() // method@15945
  0x71747dfa: 2054 2938                | iget-object v0, v2, Ljava/nio/DirectByteBuffer;.block:Ljava/nio/MemoryBlock; // field@10552
  0x71747dfe: 106e 3fab 0000           | invoke-virtual {v0}, boolean java.nio.MemoryBlock.isAccessible() // method@16299
  0x71747e04: 000a                     | move-result v0
  0x71747e06: 0039 000b                | if-nez v0, 0x71747e1c //+11
  0x71747e0a: 0022 051f                | new-instance v0, java.lang.IllegalStateException // type@1311
  0x71747e0e: 011b 50b8 0000           | const-string/jumbo v1, "buffer is inaccessible" // string@20664
  0x71747e14: 2070 31e6 0010           | invoke-direct {v0, v1}, void java.lang.IllegalStateException.<init>(java.lang.String) // method@12774
  0x71747e1a: 0027                     | throw v0
  0x71747e1c: 000e                     | return-void
OAT CODE:
  [0x7379dfac, 0x7379e02c]
  0x7379dfac: 5c00f5bd | subs.w r12, sp, #0x2000
  0x7379dfb0: c000f8dc | ldr.w r12, [r12]
  0x7379dfb4: 41e0e92d | push.w {r5, r6, r7, r8, lr}
  0x7379dfb8:     b087 | sub sp, #0x1c
  0x7379dfba:     4680 | mov r8, r0
  0x7379dfbc:     9000 | str r0, [sp]
  0x7379dfbe:     1c0e | adds r6, r1, #0
  0x7379dfc0: 0060f8df | ldr.w r0, [pc, #0x60]
  0x7379dfc4:     1c31 | adds r1, r6, #0
  0x7379dfc6: f841f000 | bl 0x7379e04c
  0x7379dfca:     6ab5 | ldr r5, [r6, #0x28]
  0x7379dfcc:     1c29 | adds r1, r5, #0
  0x7379dfce:     6808 | ldr r0, [r1]
  0x7379dfd0: 01c4f8d0 | ldr.w r0, [r0, #0x1c4]
  0x7379dfd4: e024f8d0 | ldr.w lr, [r0, #0x24]
  0x7379dfd8:     47f0 | blx lr
  0x7379dfda:     1c05 | adds r5, r0, #0
  0x7379dfdc:     b9b5 | cbnz r5, 0x7379e00c
  0x7379dfde: 0048f8df | ldr.w r0, [pc, #0x48]
  0x7379dfe2: e11cf8d9 | ldr.w lr, [r9, #0x11c]
  0x7379dfe6:     4641 | mov r1, r8
  0x7379dfe8:     47f0 | blx lr
  0x7379dfea:     1c05 | adds r5, r0, #0
  0x7379dfec: 4724f641 | movw r7, #0x1c24
  0x7379dff0: 5739f6cf | movt r7, #0xfd39
  0x7379dff4:     447f | add r7, pc
  0x7379dff6: 0028f8df | ldr.w r0, [pc, #0x28]
  0x7379dffa:     683f | ldr r7, [r7]
  0x7379dffc:     1c29 | adds r1, r5, #0
  0x7379dffe:     1c3a | adds r2, r7, #0
  0x7379e000: fbb0f559 | bl 0x734f7764
  0x7379e004: e260f8d9 | ldr.w lr, [r9, #0x260]
  0x7379e008:     1c28 | adds r0, r5, #0
  0x7379e00a:     47f0 | blx lr
  0x7379e00c: 2000f8b9 | ldrh.w r2, [r9]
  0x7379e010:     b912 | cbnz r2, 0x7379e018
  0x7379e012:     b007 | add sp, #0x1c
  0x7379e014: 81e0e8bd | pop.w {r5, r6, r7, r8, pc}
  0x7379e018: e25cf8d9 | ldr.w lr, [r9, #0x25c]
  0x7379e01c:     47f0 | blx lr
  0x7379e01e:     e7f8 | b 0x7379e012
  0x7379e020:     83f0 | strh r0, [r6, #0x1e]
  0x7379e022:     712c | strb r4, [r5, #4]
  0x7379e024:     8520 | strh r0, [r4, #0x28]
  0x7379e026:     712d | strb r5, [r5, #4]
  0x7379e028:     4de8 | ldr r5, [pc, #0x3a0]
  0x7379e02a:     7109 | strb r1, [r1, #4]
Binary:
712d84f8: 70afc428710a2008  0008000270af8ed8  ...q(..p...p....
712d8508: 00003e480022b50c  734bf00900000002  ..".H>........Ks
712d8518: 7379dfad00000000  70afc428710a2008  ......ys...q(..p
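The three frames then line up as follows:

| Function | art::ArtMethod | SP address | Size | Pushed registers |
| --- | --- | --- | --- | --- |
| DirectByteBuffer.checkNotFreed() | 70248e70 | 87b3fb80 | 0x30 | {r5, r6, r7, r8, lr} |
| DirectByteBuffer.checkIsAccessible() | 70248e48 | 87b3fbb0 | 0x30 | {r5, r6, r7, r8, lr} |
| DirectByteBuffer.getInt(int) | 702492d0 | 87b3fbe0 | 0x40 | {r5, r6, r7, r8, r10, lr} |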
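The frame sizes and SP values follow directly from each prologue. The sketch below derives them from the push.w and sub sp instructions in the OAT dumps. frame_size and unwind are illustrative names, and the walk assumes the three frames are stacked contiguously with nothing in between.

```python
def frame_size(local_bytes, pushed_regs):
    """Bytes a prologue reserves: `push {regs}` plus `sub sp, #local_bytes`."""
    return 4 * len(pushed_regs) + local_bytes


# (function, bytes from `sub sp`, registers from `push.w`), innermost first.
PROLOGUES = [
    ("checkNotFreed", 0x1C, ["r5", "r6", "r7", "r8", "lr"]),
    ("checkIsAccessible", 0x1C, ["r5", "r6", "r7", "r8", "lr"]),
    ("getInt", 0x28, ["r5", "r6", "r7", "r8", "r10", "lr"]),
]


def unwind(sp, prologues):
    """Return each function's SP, starting from the innermost frame."""
    frames = {}
    for name, local_bytes, regs in prologues:
        frames[name] = sp
        sp += frame_size(local_bytes, regs)
    return frames


frames = unwind(0x87B3FB80, PROLOGUES)
for name, sp in frames.items():
    print(name, hex(sp))
```

Starting from 87b3fb80, the walk reaches 87b3fbb0 and then 87b3fbe0, the same SP values reasoned out by hand.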

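Evidence 5

From the previous pieces of evidence we know that the function at the current stack position is most likely checkIsAccessible, which then calls the next function, DirectByteBuffer.checkNotFreed. That is why information related to it shows up in the final registers. Since execution was no longer inside DirectByteBuffer.checkNotFreed at the end, we next examine how its frame was popped.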

Thread("1371") 
  r0  0x00000000  r1  0x12d3c2e0  r2  0x00000000  r3  0x150e3400  
  r4  0x00000000  r5  0x12d32b80  r6  0x702492d0  r7  0x00000000  
  r8  0x00000001  r9  0x87e6f500  r10 0x000053b8  r11 0x87b3fd5c  
  r12 0x00000000  sp  0x87b3fbb0  lr  0x7270c021  pc  0x6ffc64a8  cpsr 0x400f0010
core-parser> rd 87b3fb80 -e 0x87b3fbb0
87b3fb80: 12d32b8070248e70  00034ccc12d3c2e0  p.$p.+.......L..
87b3fb90: 6ffc64a800034ccc  12d32b8087b3fd5c  .L...d.o\....+..
87b3fba0: 00000000702492d0  6ffc64a800000001  ..$p.........d.o

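At this point the stack has been reconstructed, and the cause of the fault comes into view as well: the jump most likely went wrong while DirectByteBuffer.checkNotFreed was popping its frame. The final code flow is as follows:

Excerpt of java.nio.DirectByteBuffer.getInt(int):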

  0x7379f0f4: 5c00f5bd | subs.w r12, sp, #0x2000
  0x7379f0f8: c000f8dc | ldr.w r12, [r12]
  0x7379f0fc: 45e0e92d | push.w {r5, r6, r7, r8, r10, lr}
  0x7379f100:     b08a | sub sp, #0x28
  0x7379f102:     9000 | str r0, [sp]
  0x7379f104:     1c0d | adds r5, r1, #0
  0x7379f106:     4690 | mov r8, r2
  0x7379f108: 0058f8df | ldr.w r0, [pc, #0x58]
  0x7379f10c:     1c29 | adds r1, r5, #0
  0x7379f10e: ff4df7fe | bl 0x7379dfac

Excerpt of java.nio.DirectByteBuffer.checkIsAccessible():

  0x7379dfac: 5c00f5bd | subs.w r12, sp, #0x2000
  0x7379dfb0: c000f8dc | ldr.w r12, [r12]
  0x7379dfb4: 41e0e92d | push.w {r5, r6, r7, r8, lr}
  0x7379dfb8:     b087 | sub sp, #0x1c
  0x7379dfba:     4680 | mov r8, r0
  0x7379dfbc:     9000 | str r0, [sp]
  0x7379dfbe:     1c0e | adds r6, r1, #0
  0x7379dfc0: 0060f8df | ldr.w r0, [pc, #0x60]
  0x7379dfc4:     1c31 | adds r1, r6, #0
  0x7379dfc6: f841f000 | bl 0x7379e04c

Excerpt of java.nio.DirectByteBuffer.checkNotFreed():

  0x7379e04c: 5c00f5bd | subs.w r12, sp, #0x2000
  0x7379e050: c000f8dc | ldr.w r12, [r12]
  0x7379e054: 41e0e92d | push.w {r5, r6, r7, r8, lr}
  0x7379e058:     b087 | sub sp, #0x1c
  0x7379e05a:     4680 | mov r8, r0
  0x7379e05c:     9000 | str r0, [sp]
  0x7379e05e:     1c0f | adds r7, r1, #0
  0x7379e060:     6abd | ldr r5, [r7, #0x28]
  0x7379e062:     1c29 | adds r1, r5, #0
  0x7379e064:     6808 | ldr r0, [r1]
  0x7379e066: 01c8f8d0 | ldr.w r0, [r0, #0x1c8]
  0x7379e06a: e024f8d0 | ldr.w lr, [r0, #0x24]
  0x7379e06e:     47f0 | blx lr
  0x7379e070:     1c05 | adds r5, r0, #0
  0x7379e072:     b1b5 | cbz r5, 0x7379e0a2
  
  0x7379e0a2: 2000f8b9 | ldrh.w r2, [r9]
  0x7379e0a6:     b912 | cbnz r2, 0x7379e0ae
  0x7379e0a8:     b007 | add sp, #0x1c
  0x7379e0aa: 81e0e8bd | pop.w {r5, r6, r7, r8, pc}
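The effect of that epilogue can be replayed on the stack words recovered from the tombstone. This is a minimal sketch, not an emulator: epilogue is a hypothetical helper that models only add sp, #0x1c followed by pop.w {r5, r6, r7, r8, pc}, and STACK contains just the five words the pop reads.

```python
# 32-bit words taken from `rd 87b3fb80 -e 0x87b3fbb0`.
STACK = {
    0x87B3FB9C: 0x12D32B80,
    0x87B3FBA0: 0x702492D0,
    0x87B3FBA4: 0x00000000,
    0x87B3FBA8: 0x00000001,
    0x87B3FBAC: 0x6FFC64A8,
}


def epilogue(sp, stack, local_bytes=0x1C,
             regs=("r5", "r6", "r7", "r8", "pc")):
    """Model `add sp, #local_bytes` followed by `pop.w {regs}`."""
    sp += local_bytes
    state = {}
    for reg in regs:  # registers are popped from ascending addresses
        state[reg] = stack[sp]
        sp += 4
    state["sp"] = sp
    return state


state = epilogue(0x87B3FB80, STACK)
for reg in ("r5", "r6", "r7", "r8", "sp", "pc"):
    print(reg, hex(state[reg]))
```

The values that come out for r5, r6, r7, r8, sp and pc are the ones recorded in the tombstone, including pc = 0x6ffc64a8.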

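Value Analysis

Combining the pieces of evidence with the final stack layout, we can conclude that 87b3fbb0 has clearly been overwritten with some other value. Its real value should be 0x70248e48, the art::ArtMethod address of checkIsAccessible.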

87b3fb70: 0000000070248e48  1302ee8000002fba  H.$p...../......
87b3fb80: 12d32b8070248e70  00034ccc12d3c2e0  p.$p.+.......L..
87b3fb90: 6ffc64a800034ccc  12d32b8087b3fd5c  .L...d.o\....+..
87b3fba0: 00000000702492d0  6ffc64a800000001  ..$p.........d.o
87b3fbb0: 0048000400000000  b6d5c27b8b23f000  ......H...#.{...
87b3fbc0: 0007d8a670248e70  12d32b8000034ccc  p.$p.....L...+..
87b3fbd0: 12d309400007d8a8  7270d0c30007d8a8  ....@.........pr
87b3fbe0: 00000000702492d0  0000000000000000  ..$p............
87b3fbf0: 0000000000000000  0000000000000000  ................
87b3fc00: 000088f66ff6aef8  0007d8a812d32b80  ...o.....+......
87b3fc10: 00000a7712d30940  72640287000053b8  @...w....S....dr
87b3fc20: 12d309407022c950  726bb5df00000000  P."p@.........kr
87b3fc30: 1302ee806ff6aef8  7268bb3300000000  ...o........3.hr
87b3fc40: 0000005b703e3e38  150e340000000a77  8>>p[...w....4..
87b3fc50: 150e340012d30940  726bfe5100000000  @....4......Q.kr
87b3fc60: 00000001703d8178  726be38314d56a60  x.=p....`j....kr
87b3fc70: 00000000703d7fc0  0000000100000000  ..=p............
87b3fc80: 00000000703d81c8  0000000000000001  ..=p............

Address 12d32b80 is a java.nio.DirectByteBuffer object.

core-parser> rd 12d32b80 -e 12d32bc0
12d32b80: 0000000070012e08  00000000715e7f9c  ...p......^q....
12d32b90: 004dc6c800000000  ffffffff004dc6c8  ......M...M.....
12d32ba0: 6ffc64a800000000  0000000012d3c2e0  .....d.o........
12d32bb0: 0000000000000000  0000000000000000  ................

Address 12d3c2e0 is a java.nio.MemoryBlock object.

core-parser> rd 0x12d3c2e0 -e 0x12d3c300
12d3c2e0: 0000000070014550  00000000715e7f9c  PE.p......^q....
12d3c2f0: 00000000004dc6c8  0000000000000001  ..M.............

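Afterword

This article sat in my drafts for a long time and was never finished. With such limited material, a single tombstone is not enough to explain the root cause of the problem. The main point is to show how core-parser can supply comparison data, and how parsing the memory of Java objects gives a thorough picture of the layout involved in this class of problem and a deeper understanding of why the program failed.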
