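Background
This problem came from one of our internal teams: after boot, a certain process would, with high probability, hit binder communication failures or memory-corruption errors during initialization. Memory analysis showed it to be a textbook out-of-bounds write (see "如何理解Native Crash问题"), which makes it a very good fit for the memory page screening method for locating the scene of the corruption.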
The Original Problem
Timestamp: 2025-03-02 14:19:22.261037539+0800
Process uptime: 0s
Cmdline: /odm/bin/hw/xxx-service
pid: 1240, tid: 1240, name: binder:1240_2 >>> /odm/bin/hw/xxx-service <<<
uid: 1000
tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x005d00310082ffea
x0 0000007a9f46cd36 x1 0000007fc4b568d0 x2 0000000000000018 x3 0000000021ae1554
x4 00ffffffffffffff x5 0000000000003126 x6 00000000ffffffff x7 7f7f7f7f7f7f7f7f
x8 0000007fc4b568d0 x9 b4000078ed412950 x10 005d003100830002 x11 0000000000000001
x12 000000000000001b x13 000000007fffffff x14 0000000000003126 x15 0000000021aead24
x16 0000007a9f4f82b8 x17 0000007a9f4916f0 x18 0000007aa01f0000 x19 0000007a9fe3ff80
x20 b4000078fd4052b0 x21 0000000659d30ba5 x22 000000000000001b x23 b4000078ad40f8d0
x24 0000007a9fe3ff80 x25 00000000107fbda5 x26 0000000000000000 x27 0000007fc4b568b9
x28 b40000788d4046f0 x29 0000007fc4b56880
lr 0000007a9f4a59b0 sp 0000007fc4b56840 pc 0000007a9f491750 pst 0000000060001000
6 total frames
backtrace:
#00 pc 0000000000055750 /system/lib64/libbinder.so (android::ProcessState::init(char const*, bool)+96)
#01 pc 00000000000699ac /system/lib64/libbinder.so (android::ServiceManagerShim::waitForService(android::String16 const&)+652)
#02 pc 000000000001d8fc /system/lib64/libbinder_ndk.so (AServiceManager_waitForService+92)
#03 pc 00000000000156fc /odm/bin/hw/xxx-service (SensorService::SensorService(std::__1::shared_ptr<aidl::xxx-service::impl>)+300)
#04 pc 00000000000106c8 /odm/bin/hw/xxx-service (main+360)
#05 pc 0000000000057674 /apex/com.android.runtime/lib64/bionic/libc.so (__libc_init+116)
Tombstone Analysis
Direct Cause
First, use core-parser to generate a fakecore from the tombstone, then open it in gdb. See "Android 墓碑文件转 FakeCore 开源拉!".
core-parser -t _exp_detail.txt --sysroot symbols/
Thread("1240")
x0 0x0000007a9f46cd36 x1 0x0000007fc4b568d0 x2 0x0000000000000018 x3 0x0000000021ae1554
x4 0x00ffffffffffffff x5 0x0000000000003126 x6 0x00000000ffffffff x7 0x7f7f7f7f7f7f7f7f
x8 0x0000007fc4b568d0 x9 0xb4000078ed412950 x10 0x005d003100830002 x11 0x0000000000000001
x12 0x000000000000001b x13 0x000000007fffffff x14 0x0000000000003126 x15 0x0000000021aead24
x16 0x0000007a9f4f82b8 x17 0x0000007a9f4916f0 x18 0x0000007aa01f0000 x19 0x0000007a9fe3ff80
x20 0xb4000078fd4052b0 x21 0x0000000659d30ba5 x22 0x000000000000001b x23 0xb4000078ad40f8d0
x24 0x0000007a9fe3ff80 x25 0x00000000107fbda5 x26 0x0000000000000000 x27 0x0000007fc4b568b9
x28 0xb40000788d4046f0 fp 0x0000007fc4b56880
lr 0x0000007a9f4a59b0 sp 0x0000007fc4b56840 pc 0x0000007a9f491750 pst 0x0000000060001000
Native: #0 0000007a9f491750 android::ProcessState::init(char const*, bool)+0x60
Native: #1 0000007a9f4a59ac android::ServiceManagerShim::waitForService(android::String16 const&)+0x28c
Native: #2 0000007a9f5218fc AServiceManager_waitForService+0x5c
Native: #3 00000056a50b16fc
Native: #4 00000056a50ac6c8
Native: #5 0000007a9f86c674 __libc_init+0x74
Native: #6 00000056a50ac070
core-parser> f 0
Native: #0 0000007a9f491750 android::ProcessState::init(char const*, bool)+0x60
{
library: symbols/system/lib64/libbinder.so
symbol: _ZN7android12ProcessState4initEPKcb
frame_fp: 0x7fc4b56880
frame_pc: 0x7a9f491750
ASM CODE:
0x7a9f49172c: 54000281 | b.ne 0x7a9f49177c
0x7a9f491730: f0000349 | adrp x9, 0x7a9f4fc000
0x7a9f491734: 37000821 | tbnz w1, #0, 0x7a9f491838
0x7a9f491738: f941f529 | ldr x9, [x9, #0x3e8]
0x7a9f49173c: 3943e12a | ldrb w10, [x9, #0xf8]
0x7a9f491740: 350006ca | cbnz w10, 0x7a9f491818
0x7a9f491744: f9000109 | str x9, [x8]
0x7a9f491748: aa0803e1 | mov x1, x8
0x7a9f49174c: f940012a | ldr x10, [x9]
0x7a9f491750: f85e814a | ldur x10, [x10, #-0x18]
0x7a9f491754: 8b0a0120 | add x0, x9, x10
0x7a9f491758: 94015066 | bl 0x7a9f4e58f0
}
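The last instruction shows that x10 = 0x005d003100830002 at this point. That is not a valid address, so the load through it fails and raises the segmentation fault. A quick look in gdb tells us which variable it is.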
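As a worked check of the arithmetic: the faulting instruction `ldur x10, [x10, #-0x18]` loads from x10 - 0x18, so the fault address in the tombstone should be the corrupted x10 minus 0x18. The sketch below only restates that calculation; the helper and constant names are hypothetical.

```cpp
#include <cstdint>

// Effective address of an AArch64 load/store with a signed immediate offset:
// the base register value plus the (possibly negative) offset.
constexpr uint64_t effective_address(uint64_t base, int64_t offset) {
    return base + static_cast<uint64_t>(offset);
}

// Values copied from the tombstone: x10 and the reported fault address.
constexpr uint64_t kX10 = 0x005d003100830002ULL;
constexpr uint64_t kFaultAddr = 0x005d00310082ffeaULL;

static_assert(effective_address(kX10, -0x18) == kFaultAddr,
              "fault addr should be x10 - 0x18");
```

The match confirms that the crash is a read through the bad value in x10 rather than through some other register.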
Indirect Cause
(gdb) bt
#0 0x0000007a9f491750 in android::sp<android::ProcessState>::sp (this=0x7fc4b568d0, other=...) at system/core/libutils/include/utils/StrongPointer.h:257
#1 android::ProcessState::init (driver=<optimized out>, requireDefault=<optimized out>) at frameworks/native/libs/binder/ProcessState.cpp:153
#2 0x0000007a9f4a59b0 in android::ProcessState::self () at frameworks/native/libs/binder/ProcessState.cpp:86
#3 android::ServiceManagerShim::waitForService (this=0xb4000078fd4052b0, name16=...) at frameworks/native/libs/binder/IServiceManager.cpp:448
#4 0x0000007a9f521900 in AServiceManager_waitForService (instance=0xb4000078fd405080 "") at frameworks/native/libs/binder/ndk/service_manager.cpp:114
(gdb) p *this
$1 = {
m_ptr = 0xb4000078ed412950
}
(gdb) p *this->m_ptr
$2 = {
<android::RefBase> = <invalid address>,
members of android::ProcessState:
_vptr$ProcessState = 0x5d003100830002,
mDriverName = <incomplete type>,
mDriverFD = 7864367,
mVMStart = 0x20003e0018002e,
mThreadCountLock = {
__private = {3145730, 3604523, 1310876, 3932224, 1835071, 7209019, 2621482, 7078063, 10747918, 4587606}
},
mThreadCountDecrement = {
__private = {7864353, 2424878, 2359532, 8782015, 15073443, 3866761, 8454350, 3014762, 7602231, 2359362, 5636111, 1376296}
},
mExecutingThreadsCount = 19984762003259401,
mWaitingForThreads = 7599996171255833,
mMaxThreads = 5911202150023217,
mCurrentThreads = 9570273762803741,
mKernelStartedThreads = 17732992255328281,
mStarvationStartTimeMs = 15199829133754428,
mLock = {
__m_ = {
__private = {2752566, 4849680, 1900595, 786468, 1376283, 2228260, 23, 0, 0, 0}
}
},
mHandleToObject = {
<android::VectorImpl> = {<No data fields>}, <No data fields>},
mForked = false,
mThreadPoolStarted = true,
mThreadPoolSeq = 4,
mCallRestriction = android::ProcessState::CallRestriction::NONE
}
The memory shows that x10 holds the vtable pointer, so part of the android::ProcessState object has been overwritten.
The Scudo Memory Allocator
Analyzing this kind of problem requires some knowledge of the memory allocator. For details, see "Android Native | Scudo内存分配器".
core-parser> rd 0xb4000078ed412900 -e 0xb4000078ed412a50
78ed412900: 0000000000000000 0000000000000000 ................
78ed412910: 0000000000000000 0000000000000000 ................
78ed412920: 0000000000000000 0000000000000000 ................
78ed412930: 0047001e0009000b 0063000b0022003a ......G.:."...c.
78ed412940: 00360070000f004d 0025004700050041 M...p.6.A...G.%.
78ed412950: 005d003100830002 0050007e00730059 ....1.].Y.s.~.P.
78ed412960: 004c006f0078002f 0020003e0018002e /.x.o.L.....>...
78ed412970: 0037002b00300002 003c00400014009c ..0.+.7.....@.<.
78ed412980: 006e003b001c003f 006c00af0028002a ?...;.n.*.(...l.
78ed412990: 0046005600a4000e 0025002e00780021 ....V.F.!.x...%.
78ed4129a0: 008600bf002400ec 003b008900e600a3 ..$...........;.
78ed4129b0: 002e006a008100ce 0024004200740037 ....j...7.t.B.$.
78ed4129c0: 001500280056000f 0047000900200009 ..V.(.........G.
78ed4129d0: 001b002800150019 0015003500590031 ....(...1.Y.5...
78ed4129e0: 0022001d0009001d 003f0010002f0019 ......".../...?.
78ed4129f0: 0036002a002a003c 004a0010002a0036 <.*.*.6.6.*...J.
78ed412a00: 000c0024001d0033 002200240015001b 3...$.......$.".
78ed412a10: 0000000000000017 0000000000000000 ................
78ed412a20: 0000007a9f4ed768 b4000078fd406678 [email protected]...
78ed412a30: 0000000000000003 0000000000000000 ................
78ed412a40: 0000000000000010 0000000400000100 ................
core-parser> scudo 0xb4000078ed412950
scudo::Chunk::UnpackedHeader
* ClassId: 0x4d
* State: 0x0
* OriginOrWasZeroed: 0x0
* SizeOrUnusedBytes: 0xf0
* Offset: 0x70
* Checksum: 0x36
core-parser>
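To make the scudo output above easier to check by hand, here is a minimal sketch that unpacks the 8-byte word stored 16 bytes in front of the object (0x00360070000f004d at 78ed412940). The struct and function names are hypothetical, and the bit layout (8/2/2/20/16/16 bits, lowest bits first) is an assumption that reproduces the values printed by core-parser; verify it against the Scudo version on your device.

```cpp
#include <cstdint>

// Unpacked view of the 64-bit chunk header (field names follow the
// core-parser output; the bit widths are an assumption, see above).
struct ChunkHeader {
    uint32_t class_id;
    uint32_t state;
    uint32_t origin_or_was_zeroed;
    uint32_t size_or_unused_bytes;
    uint32_t offset;
    uint32_t checksum;
};

constexpr ChunkHeader unpack_header(uint64_t raw) {
    return ChunkHeader{
        static_cast<uint32_t>(raw & 0xff),              // bits 0..7
        static_cast<uint32_t>((raw >> 8) & 0x3),        // bits 8..9
        static_cast<uint32_t>((raw >> 10) & 0x3),       // bits 10..11
        static_cast<uint32_t>((raw >> 12) & 0xfffff),   // bits 12..31
        static_cast<uint32_t>((raw >> 32) & 0xffff),    // bits 32..47
        static_cast<uint32_t>((raw >> 48) & 0xffff),    // bits 48..63
    };
}
```

Applied to the intact header word 09fa00000013810c that appears at 7eba459c00 in the full core dump later in this article, the same function gives ClassId 12 and a size of 0x138, which is what a healthy header for this size class looks like.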
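The chunk header has clearly been corrupted, and the pattern of the neighboring data indicates that a pointer of the same size class, located just before it, was written out of bounds. Because the tombstone does not carry much memory, we cannot tell directly from the dump which region the chunk belongs to, but the size of android::ProcessState gives us a reference point.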
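The "data pattern" can be made concrete: every corrupted 64-bit word in the dump splits into four small 16-bit values, whereas a real pointer does not. The check below is a minimal sketch of that heuristic; the function name and the 0x100 threshold are assumptions chosen for this dump, not a general rule.

```cpp
#include <cstdint>

// True when all four 16-bit lanes of a 64-bit word are below `limit`,
// i.e. the word looks like four small 16-bit samples rather than a pointer.
constexpr bool looks_like_u16_samples(uint64_t word, uint16_t limit = 0x100) {
    return ((word >> 0) & 0xffff) < limit &&
           ((word >> 16) & 0xffff) < limit &&
           ((word >> 32) & 0xffff) < limit &&
           ((word >> 48) & 0xffff) < limit;
}
```

The overwritten vtable pointer 0x005d003100830002 and the overwritten chunk header 0x00360070000f004d both pass this test, while the intact vtable pointer 0x0000007a9f4ed768 near the end of the object does not.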
(gdb) ptype /o *this
/* offset | size */ type = class android::sp<android::ProcessState> [with T = android::ProcessState] {
/* 0 | 8 */ T *m_ptr;
/* total size (bytes): 8 */
}
(gdb) ptype /o *this->m_ptr
/* offset | size */ type = class android::ProcessState : public virtual android::RefBase {
/* 8 | 0 */ class android::String8 {
<incomplete type>
/* total size (bytes): 0 */
} mDriverName;
/* XXX 8-byte hole */
/* 16 | 4 */ int mDriverFD;
/* XXX 4-byte hole */
/* 24 | 8 */ void *mVMStart;
/* 32 | 40 */ pthread_mutex_t mThreadCountLock;
/* 72 | 48 */ pthread_cond_t mThreadCountDecrement;
/* 120 | 8 */ size_t mExecutingThreadsCount;
/* 128 | 8 */ size_t mWaitingForThreads;
/* 136 | 8 */ size_t mMaxThreads;
/* 144 | 8 */ size_t mCurrentThreads;
/* 152 | 8 */ size_t mKernelStartedThreads;
/* 160 | 8 */ int64_t mStarvationStartTimeMs;
/* 168 | 40 */ class std::__1::mutex {
/* 168 | 40 */ std::__1::__libcpp_mutex_t __m_;
/* total size (bytes): 40 */
} mLock;
/* 208 | 40 */ class android::Vector<android::ProcessState::handle_entry> [with TYPE = android::ProcessState::handle_entry] : private android::VectorImpl {
/* total size (bytes): 40 */
} mHandleToObject;
/* 248 | 1 */ bool mForked;
/* 249 | 1 */ bool mThreadPoolStarted;
/* XXX 2-byte hole */
/* 252 | 4 */ volatile int32_t mThreadPoolSeq;
/* 256 | 4 */ android::ProcessState::CallRestriction mCallRestriction;
/* XXX 20-byte padding */
/* total size (bytes): 280 */
}
Class ID | 12 |
---|---|
Original size (bytes) | 352 |
Size without the chunk header (bytes) | 336 |
The size class that fits this object is ClassID=12.
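The numbers in the table can be cross-checked with a few lines of arithmetic. This is only a sketch built from the values in this article (a 16-byte chunk header and a 352-byte block for ClassID=12); the names are hypothetical.

```cpp
#include <cstddef>

constexpr size_t kChunkHeaderSize = 0x10;  // 16-byte header in front of the user data
constexpr size_t kClass12BlockSize = 352;  // block size from the table above

// Bytes available to the caller inside one block.
constexpr size_t usable_size(size_t block_size) {
    return block_size - kChunkHeaderSize;
}

// Whether a request of `request` bytes fits into a block of `block_size`.
constexpr bool fits_in_block(size_t request, size_t block_size) {
    return request + kChunkHeaderSize <= block_size;
}

static_assert(usable_size(kClass12BlockSize) == 336, "352 - 16 = 336");
static_assert(fits_in_block(280, kClass12BlockSize), "ProcessState (280 bytes) fits");
```

Any other allocation of up to 336 bytes that lands in the same class can therefore end up directly in front of the ProcessState object.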
Core Analysis
We reproduced the problem again, captured a full core file, and took another look at the details.
(gdb) p this->m_ptr
$2 = (android::ProcessState *) 0xb400007eba459ed0
(gdb) p *this->m_ptr
$1 = {
<android::RefBase> = <invalid address>,
members of android::ProcessState:
_vptr$ProcessState = 0x1c00480010003a,
mDriverName = {
mString = 0x3e000d00430031 <error: Cannot access memory at address 0x3e000d00430031>
},
mDriverFD = 4259910,
mVMStart = 0x6500200028001c,
mThreadCountLock = {
__private = {4194306, 5570621, 2621493, 2490377, 4784130, 983064, 589917, 6029348, 3473408, 720919}
},
mThreadCountDecrement = {
__private = {2949147, 6357072, 7209070, 1245217, 1507401, 3735567, 2097235, 4784318, 393265, 196728, 5505129, 11141288}
},
mExecutingThreadsCount = 50384802523578548,
mWaitingForThreads = 50384914194825391,
mMaxThreads = 48132964058661048,
mCurrentThreads = 48695931189330082,
mKernelStartedThreads = 48695836703064243,
mStarvationStartTimeMs = 48132856684282020,
mLock = {
__m_ = {
__private = {11731128, 11468968, 12517573, 11337927, 8847540, 11141268, 157, 0, 0, 0}
}
},
mHandleToObject = {
<android::VectorImpl> = {
_vptr$VectorImpl = 0x7f1b597768 <vtable for android::Vector<android::ProcessState::handle_entry>+16>,
mStorage = 0xb400007eca45c1e8,
mCount = 1,
mFlags = 0,
mItemSize = 16
}, <No data fields>},
mForked = false,
mThreadPoolStarted = true,
mThreadPoolSeq = 4,
mCallRestriction = android::ProcessState::CallRestriction::NONE
}
core-parser> rd 0xb400007eba459bd0 -e 0xb400007eba459fd0
7eba459bd0: 0000000000000000 0000000000000000 ................
7eba459be0: 0000000000000000 0000000000000000 ................
7eba459bf0: 0000000000000000 0000000000000000 ................
7eba459c00: 09fa00000013810c 0000000000000000 ................
7eba459c10: 002a003c002a0066 002f001200380038 f.*.<.*.8.8.../.
7eba459c20: 0057002f00490036 003f002d00240057 6.I./.W.W.$.-.?.
7eba459c30: 0035003e002f0028 0049001000500010 (./.>.5...P...I.
7eba459c40: 001f002f001b0026 00090020001d0009 &.../...........
7eba459c50: 00120026002d003a 002c0003000e0019 :.-.&.........,.
7eba459c60: 0041003000390050 0032003600560013 P.9.0.A...V.6.2.
7eba459c70: 00050014001f0029 001c003500420035 ).......5.B.5...
7eba459c80: 00300061003f0035 002b00290064003c 5.?.a.0.<.d.).+.
7eba459c90: 003600350044002e 00220050003e004d ..D.5.6.M.>.P.".
7eba459ca0: 00190025002c0032 0022003b0030002e 2.,.%.....0.;.".
7eba459cb0: 0035003400300030 00210029001e0023 0.0.4.5.#...).!.
7eba459cc0: 00220025002d0027 000d00110014001f '.-.%.".........
7eba459cd0: 0031001100260004 002300370026002a ..&...1.*.&.7.#.
7eba459ce0: 003c002900160026 00270023002d0029 &...).<.).-.#.'.
7eba459cf0: 001f001e004e0014 00360024002a0014 ..N.......*.$.6.
7eba459d00: 0044000f00210026 001d003300060006 &.!...D.....3...
7eba459d10: 002e000f0028001f 0042001f0009001c ..(...........B.
7eba459d20: 005e003300430050 003200540028002c P.C.3.^.,.(.T.2.
7eba459d30: 00260053003e002a 001f0026002e0093 *.>.S.&.....&...
7eba459d40: 0053005000280078 001a002b00150026 x.(.P.S.&...+...
7eba459d50: 001e002f0018000f 0014001f003c000b ..../.....<.....
7eba459d60: 0026001500100046 00300020001a001c F.....&.......0.
7eba459d70: 002f004b001c0036 002b0013001b0013 6...K./.......+.
7eba459d80: 0077002700340027 0055001500170039 '.4.'.w.9.....U.
7eba459d90: 004b002a0021000d 00200016002f004c ..!.*.K.L./.....
7eba459da0: 00140018005f001d 004c001600190042 .._.....B.....L.
7eba459db0: 0005000b003b001d 0067005c00160021 ..;.....!....g.
7eba459dc0: 003c003600970055 003100260025005f U...6.<._.%.&.1.
7eba459dd0: 003700e2004f0076 001b008f000b003b v.O...7.;.......
7eba459de0: 009b00af00470066 009b00c70097008e f.G.............
7eba459df0: 009b008c00a400c9 009d009500ad00ad ................
7eba459e00: 00bf008900730095 00aa00cc00b60089 ..s.............
7eba459e10: 00aa00b80099008e 00a400ad00a8008e ................
7eba459e20: 00a000b600c900d5 00aa009b00a200ab ................
7eba459e30: 005c003b0042009f 0030001b0043001c ..B.;....C...0.
7eba459e40: 0022003d00230043 0040003d00160014 C.#.=.".....=.@.
7eba459e50: 0009002d002e0026 002a002a00420032 &...-...2.B.*.*.
7eba459e60: 004800130029000c 0009003e001b0009 ..)...H.....>...
7eba459e70: 0021000300330014 0018000b00060006 ..3...!.........
7eba459e80: 0067003c00390046 0016002d0024002c F.9.<.g.,.$.-...
7eba459e90: 002a004c0032002a 0024002d002a009f *.2.L.*...*.-.$.
7eba459ea0: 001b005400140058 0011002700280018 X...T.....(.'...
7eba459eb0: 001f003100230020 001700230014001a ..#.1.......#...
7eba459ec0: 000b002b0019000b 00440011001c0018 ....+.........D.
7eba459ed0: 001c00480010003a 003e000d00430031 :...H...1.C...>.
7eba459ee0: 0074001b00410046 006500200028001c F.A...t...(...e.
7eba459ef0: 0055003d00400002 0026000900280035 ..@.=.U.5.(...&.
7eba459f00: 000f001800490002 005c00240009005d ..I.....]...$..
7eba459f10: 000b001700350000 00610050002d001b ..5.......-.P.a.
7eba459f20: 00130021006e006e 0039000f00170049 n.n.!...I.....9.
7eba459f30: 004900be00200053 0003007800060031 S.....I.1...x...
7eba459f40: 00aa00a800540069 00b300b6007f00b4 i.T.............
7eba459f50: 00b300d0009f00af 00ab00ad00b400b8 ................
7eba459f60: 00ad00b1008c00a2 00ad009b00ba00b3 ................
7eba459f70: 00ab009400b100a4 00af00a800b300b8 ................
7eba459f80: 00ad00c700bf00c5 00aa0094008700b4 ................
7eba459f90: 000000000000009d 0000000000000000 ................
7eba459fa0: 0000007f1b597768 b400007eca45c1e8 hwY.......E.~...
7eba459fb0: 0000000000000001 0000000000000000 ................
7eba459fc0: 0000000000000010 0000000400000100 ................
Chunk header address | 7eba459c00 | 7eba459d60 | 7eba459ec0 |
---|---|---|---|
Memory range | 7eba459c10 ~ 7eba459d60 | 7eba459d70 ~ 7eba459ec0 | 7eba459ed0 ~ 7eba45a020 |
We can see that the data written through the pointer 0x7eba459c10 spills over into two more chunks. One of them is the android::ProcessState object, that is, the memory at 7eba459ed0.
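The chunk boundaries in the table follow directly from the 0x160-byte block size. A small sketch (hypothetical helper names, addresses taken from the dump above) that maps an address to the chunk it falls in:

```cpp
#include <cstdint>

constexpr uint64_t kBlockSize = 0x160;              // header + user data
constexpr uint64_t kHeaderSize = 0x10;
constexpr uint64_t kFirstHeader = 0x7eba459c00ULL;  // first header in the table

// 0-based index of the chunk that contains `addr`, counted from kFirstHeader.
constexpr uint64_t chunk_index(uint64_t addr) {
    return (addr - kFirstHeader) / kBlockSize;
}

// Start of the user data of the chunk with the given index.
constexpr uint64_t user_ptr(uint64_t index) {
    return kFirstHeader + index * kBlockSize + kHeaderSize;
}

static_assert(user_ptr(2) == 0x7eba459ed0ULL, "third chunk is ProcessState");
static_assert(chunk_index(0x7eba459f90ULL) == 2, "last overwritten word is in chunk 2");
```

The last overwritten word (at 7eba459f90) lies in chunk 2, so a write that started in chunk 0 ran through chunk 1 and deep into the ProcessState object.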
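BPF Tracing
Because the problem occurs at startup and xxx-service is an auto-started service, we can patch its binary so that it parks at the start of main, attach the stackplz tracer, and then use core-parser to restore the original machine code so that the program continues to run.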
[8814|8897] event_addr:0x7c36f336a0 hit_count:1267, Regs:
[x0=0x138,x1=0x0,x2=0x0,x3=0x200007a357392e0,x4=0x200007a35739340,x5=0x4,x6=0xfb100000f4a,x7=0xd4700000fb1,x8=0x0,x9=0x200007a357392c0,x10=0xb69d57f1,x11=0x72000,x12=0x4,x13=0x4,x14=0x1282,x15=0x11b4,x16=0x799ba81538,x17=0x7c36f336a0,x18=0x79a4014000,x19=0x799ba8d098,x20=0x799ba884a8,x21=0x799ba8d5a8,x22=0x799ba8d5a8,x23=0x799ba8d594,x24=0x799ba8d098,x25=0xb400007b45738e30,x26=0x799ba88398,x27=0x5100,x28=0x799ba8d098,x29=0x79a55e4910,lr=0x799ba6f7cc,sp=0x79a55e48d0,pc=0x7c36f336a0]
Backtrace:
#00 pc 00000000000456a0 /apex/com.android.runtime/lib64/bionic/libc.so (malloc)
#01 pc 00000000000237c8 /odm/lib64/libxxx_hal.so
#02 pc 0000000000012f78 /odm/lib64/libxxx_hal.so
#03 pc 000000000000e8c4 /odm/lib64/libxxx_hal.so
#04 pc 00000000000068ec /odm/lib64/libxxx.so (xxx_daemon_main+184)
#05 pc 0000000000016e98 /odm/bin/hw/xxx-service (xxx_thread_func(void*)+920)
#06 pc 00000000000706a8 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+200)
#07 pc 0000000000061a40 /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)
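From the decompiled pseudocode it is easy to work out that the parameter ivar6 is 156 = 0x138 / 2. Searching the source then readily turns up the hal_xxx_num code, which matches this pattern.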
Quick Verification
Change the machine code at offset = 0x237c4 in libxxx_hal.so to D2820000, which sets the malloc argument at this call site to 0x1000.
Original: LDR X0, [SP, ...]
Patched: MOV X0, 0x1000
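The patch word D2820000 can be derived rather than looked up. For an immediate like 0x1000, `MOV X0, #0x1000` is the MOVZ form, and the sketch below encodes it (64-bit register, no shift); the function name is hypothetical.

```cpp
#include <cstdint>

// A64 `MOVZ Xd, #imm16` with no shift: 0xD2800000 | (imm16 << 5) | Rd.
constexpr uint32_t encode_movz_x(unsigned rd, uint16_t imm16) {
    return 0xD2800000u | (static_cast<uint32_t>(imm16) << 5) | (rd & 0x1fu);
}

static_assert(encode_movz_x(0, 0x1000) == 0xD2820000u,
              "MOV X0, #0x1000 encodes to D2820000");
```

A64 instructions are stored little-endian, so in a hex editor this word appears as the bytes 00 00 82 D2 at file offset 0x237c4.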
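Pushing the patched libxxx_hal.so to the phone and retesting confirmed that the change is effective. Since this pattern is rare inside the xxx-service module, reviewing the related code also makes it fairly easy to find the code that writes out of bounds.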
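The Memory Page Screening Method
Basic Principles
- Memory paging: the operating system divides memory into fixed-size pages (currently 4 KB) and manages each process's memory accesses through virtual memory.
- Page protection: each memory page carries permissions (read/write/execute). An access that violates them, such as a write to a read-only page, triggers a segmentation fault or access violation.
- Out-of-bounds detection: place the array or buffer in its own memory page, allocate guard pages before and after it, and set their permissions (inaccessible, non-writable, and so on). An out-of-bounds access that touches a guard page interrupts the program.
For example, the approach in "如何利用内存访问权限定位内存踩踏" is a special case of this method.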
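The principle above can be demonstrated in isolation. The following is a minimal sketch, not the code used in this investigation: it maps two pages, makes the second one read-only as the guard page, places a buffer so that it ends exactly at the guard page, and performs the write in a forked child so the parent can observe the signal. The function name is hypothetical and the sketch assumes a Linux/POSIX environment.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>   // assert, used in the usage example below
#include <csignal>
#include <cstddef>

// Returns true if writing `write_len` bytes into a `buf_size`-byte buffer that
// ends right at a read-only guard page kills the writer with a memory signal.
inline bool write_hits_guard_page(size_t buf_size, size_t write_len) {
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    if (buf_size > page || write_len > 2 * page) return false;
    void* mem = mmap(nullptr, 2 * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return false;
    char* base = static_cast<char*>(mem);
    bool hit = false;
    if (mprotect(base + page, page, PROT_READ) == 0) {  // second page = guard page
        pid_t pid = fork();
        if (pid == 0) {
            // Child: the buffer occupies the last buf_size bytes of page one.
            volatile char* buf = base + page - buf_size;
            for (size_t i = 0; i < write_len; ++i) buf[i] = 0x41;
            _exit(0);
        }
        int status = 0;
        if (pid > 0 && waitpid(pid, &status, 0) == pid) {
            hit = WIFSIGNALED(status) &&
                  (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
        }
    }
    munmap(mem, 2 * page);
    return hit;
}
```

With the sizes from this case, `assert(write_hits_guard_page(312, 546))` holds because the copy runs past the buffer, while `assert(!write_hits_guard_page(312, 312))` holds because an in-bounds write never touches the guard page.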
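Advantages | Disadvantages |
---|---|
Catches the out-of-bounds access immediately and pinpoints the faulting code. | High memory overhead: a guard page costs at least 4 KB. |
Can detect accesses beyond both the upper and the lower bound. | Cannot detect out-of-bounds accesses that stay within the same page. |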
GuardPage Code
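#include <sys/mman.h>
#include <sys/prctl.h>
#include <cinttypes>
#include <cstdint>
#include <cstdlib>
#include <set>

// RoundDown() and JNI_LOGI() are not defined in this snippet; they are
// assumed to come from the surrounding project.
static void monitor_thread(void*) {
    prctl(PR_SET_NAME, "open-monitor");
    std::set<uint64_t> free_ptr;   // bait chunks to hand back to the allocator
    std::set<uint64_t> alloc_ptr;  // every chunk allocated by this thread
    uint32_t chunk_size = 0x10;    // chunk header size
    uint32_t class_size = 0x150;   // usable size of a ClassID=12 chunk
    // Enough consecutive chunks to cover a whole page, plus two.
    uint32_t seq_num = (0x1000 / (class_size + chunk_size)) + 2;

    // Pre-allocate a few chunks first.
    for (int i = 0; i < 10; ++i) {
        uint64_t ptr = reinterpret_cast<uint64_t>(malloc(class_size));
        alloc_ptr.insert(ptr);
    }

    uint32_t k = 0;
    uint32_t guard_page_num = 10;
    while (k < guard_page_num) {
        uint64_t vaddr = reinterpret_cast<uint64_t>(malloc(class_size));
        alloc_ptr.insert(vaddr);
        // Candidate: it sits in the last 0x100 bytes of its page and the
        // chunk right before it is also one of ours.
        if ((vaddr & 0xFFF) >= 0xF00
                && alloc_ptr.count(vaddr - (class_size + chunk_size))) {
            // Allocate the following chunks so that we own the next page too.
            for (uint32_t j = 0; j < seq_num; ++j) {
                uint64_t seq_ptr = reinterpret_cast<uint64_t>(malloc(class_size));
                alloc_ptr.insert(seq_ptr);
            }
            // The chunks after vaddr must be contiguous and all ours.
            bool found = false;
            for (uint32_t loop = 1; loop < seq_num; ++loop) {
                if (!alloc_ptr.count(vaddr + (class_size + chunk_size) * loop))
                    break;
                if (loop == seq_num - 1)
                    found = true;
            }
            if (found) {
                // The chunk before vaddr becomes the bait; the page after it
                // becomes the read-only guard page.
                uint64_t free_vaddr = vaddr - (class_size + chunk_size);
                free_ptr.insert(free_vaddr);
                uint64_t guard_page_vaddr = RoundDown(vaddr + (class_size + chunk_size), 0x1000);
                mprotect((void *) guard_page_vaddr, 0x1000, PROT_READ);
                ++k;
                JNI_LOGI("Ptr: %" PRIx64 " Guard Page: %" PRIx64, vaddr, guard_page_vaddr);
            }
        }
    }

    // Release the bait chunks so that the target's next allocations of this
    // size class can land right in front of a guard page.
    for (uint64_t ptr : free_ptr) {
        free((void *)ptr);
        JNI_LOGI("Free ptr: %" PRIx64, ptr);
    }
}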
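The address arithmetic in monitor_thread can be checked against the log output shown further down. The sketch below uses a hypothetical `round_down` as a stand-in for the RoundDown() helper, which is not shown in the source, and assumes the alignment is a power of two.

```cpp
#include <cstdint>

// Stand-in for RoundDown(); assumes `align` is a power of two.
constexpr uint64_t round_down(uint64_t value, uint64_t align) {
    return value & ~(align - 1);
}

constexpr uint64_t kStride = 0x150 + 0x10;  // class_size + chunk_size

// Guard page picked for a candidate pointer: the page that contains the
// user pointer of the next chunk.
constexpr uint64_t guard_page_for(uint64_t vaddr) {
    return round_down(vaddr + kStride, 0x1000);
}

// Bait chunk handed back to the allocator: the chunk just before vaddr.
constexpr uint64_t bait_for(uint64_t vaddr) {
    return vaddr - kStride;
}

// seq_num as computed in monitor_thread: 0x1000 / 0x160 + 2.
static_assert(0x1000 / kStride + 2 == 13, "seq_num is 13");
```

For the first logged pointer b400007008eaef70 this yields the guard page b400007008eaf000 and the bait pointer b400007008eaee10, matching the "Guard Page" and "Free ptr" lines in the log.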
Setting the Trap
core-parser> disas main 10
LIB: /odm/bin/hw/xxx-service
main: [5db2a18560, 5db2a18810]
0x5db2a18560: d102c3ff | sub sp, sp, #0xb0
0x5db2a18564: a9087bfd | stp x29, x30, [sp, #0x80]
0x5db2a18568: f9004bf5 | str x21, [sp, #0x90]
0x5db2a1856c: a90a4ff4 | stp x20, x19, [sp, #0xa0]
0x5db2a18570: 910203fd | add x29, sp, #0x80
0x5db2a18574: d53bd055 | mrs x21, TPIDR_EL0
0x5db2a18578: 52800140 | mov w0, #0xa
0x5db2a1857c: f94016a8 | ldr x8, [x21, #0x28]
0x5db2a18580: f81f83a8 | stur x8, [x29, #-8]
0x5db2a18584: 94007343 | bl 0x5db2a35290
core-parser>
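First we park the program in main. Change the machine code 52800140 to 14000000, push the binary to the phone again, and then reboot or kill the process. After that, inject opencore and the guard page code prepared above.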
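The replacement word 14000000 is an unconditional branch to itself, which is why the process spins at that instruction until the original code is restored. A minimal sketch of the encoding (hypothetical function name):

```cpp
#include <cstdint>

// A64 `B label`: opcode 0b000101 in bits 31..26, followed by a signed 26-bit
// word offset (byte offset / 4) relative to the branch instruction itself.
constexpr uint32_t encode_b(int32_t byte_offset) {
    return 0x14000000u |
           (static_cast<uint32_t>(byte_offset / 4) & 0x03ffffffu);
}

static_assert(encode_b(0) == 0x14000000u, "`b .` branches to itself");
```

Because the offset is zero, the same word works at any address, so it can be dropped over any 4-byte instruction without recomputing a target.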
core-parser> remote hook --inject -l libopencore.so
arm64: hook inject "libopencore.so"
arm64: hook found "dlopen" address: 0x70bec2e020
arm64: target process current sp: 0x7fc5291c40
arm64: call dlopen(0x7fc5291c30 "libopencore.so", 0x2)
arm64: return 0xa4b722cf8c1b388b
core-parser> remote hook --inject -l libopen-monitor.so
arm64: hook inject "libopen-monitor.so"
arm64: hook found "dlopen" address: 0x70bec2e020
arm64: target process current sp: 0x7fc5291c40
arm64: call dlopen(0x7fc5291c20 "libopen-monitor.so", 0x2)
arm64: return 0xd8257b1bc91a8367
03-17 17:25:09.174 8023 8023 I opencore: Init inject opencore-1.4.14 environment..
03-17 17:25:19.255 8023 8023 I monitor : Init inject open-monitor 1.0.1 environment..
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008eaef70 Guard Page: b400007008eaf000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008ec2fd0 Guard Page: b400007008ec3000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008eddf90 Guard Page: b400007008ede000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008ef3f90 Guard Page: b400007008ef4000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008ef9fd0 Guard Page: b400007008efa000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008f03f50 Guard Page: b400007008f04000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008f0cfb0 Guard Page: b400007008f0d000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008f11f70 Guard Page: b400007008f12000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008f16f30 Guard Page: b400007008f17000
03-17 17:25:19.256 8023 8455 I monitor : Ptr: b400007008f1ef10 Guard Page: b400007008f1f000
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008eaee10
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008ec2e70
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008edde30
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008ef3e30
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008ef9e70
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008f03df0
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008f0ce50
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008f11e10
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008f16dd0
03-17 17:25:19.256 8023 8455 I monitor : Free ptr: b400007008f1edb0
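Here, once 10 guard pages have been found, the chunk just before each selected pointer is returned to the allocator as a free pointer. The aim is that, as far as possible, any pointer of this size class that the program requests during this window lands right in front of a GuardPage, so that an overflow is caught on the spot the moment it happens.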
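How large an overflow has to be before a given trap fires depends on the distance between the bait pointer and its guard page. The sketch below computes that distance for pairs taken from the log above; the helper names are hypothetical, and it assumes the write starts at the bait pointer itself.

```cpp
#include <cstdint>

// Number of bytes that can be written starting at `bait` before the first
// byte lands in the guard page.
constexpr uint64_t bytes_before_guard(uint64_t bait, uint64_t guard_page) {
    return guard_page - bait;
}

// Whether a write of `len` bytes starting at `bait` reaches the guard page.
constexpr bool write_would_trap(uint64_t bait, uint64_t guard_page, uint64_t len) {
    return len > bytes_before_guard(bait, guard_page);
}

// Bait b400007008f11e10 with guard page b400007008f12000: 0x1f0 bytes apart.
static_assert(bytes_before_guard(0xb400007008f11e10ULL, 0xb400007008f12000ULL) == 0x1f0,
              "496 bytes before the guard page");
```

A 0x222-byte write from b400007008f11e10 therefore trips the trap, whereas by the same arithmetic the bait at b400007008f1edb0 is 0x250 bytes from its guard page and would not catch a write of that length. This is the "same page" limitation from the table above in concrete numbers.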
Waiting for the Fish to Bite
Restore the machine code so that the program continues to run, then wait quietly.
core-parser> sysroot /odm
Mmap segment [6105348000, 6105355000) /odm/bin/hw/xxx-service [0]
Mmap segment [6105358000, 6105376000) /odm/bin/hw/xxx-service [10000]
Read symbols[330] (/odm/bin/hw/xxx-service)
core-parser> disas main 10
LIB: /odm/bin/hw/xxx-service
main: [6105358560, 6105358810]
0x6105358560: d102c3ff | sub sp, sp, #0xb0
0x6105358564: a9087bfd | stp x29, x30, [sp, #0x80]
0x6105358568: f9004bf5 | str x21, [sp, #0x90]
0x610535856c: a90a4ff4 | stp x20, x19, [sp, #0xa0]
0x6105358570: 910203fd | add x29, sp, #0x80
0x6105358574: d53bd055 | mrs x21, TPIDR_EL0
0x6105358578: 14000000 | b 0x6105358578
0x610535857c: f94016a8 | ldr x8, [x21, #0x28]
0x6105358580: f81f83a8 | stur x8, [x29, #-8]
0x6105358584: 94007343 | bl 0x6105375290
core-parser> remote wd 0x6105358578 -v f94016a852800140
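Judging by the value used, the command writes 8 bytes at once, restoring both the patched instruction at 0x6105358578 and the untouched one after it. On a little-endian target the first instruction occupies the low 32 bits of that value. A minimal sketch of the packing (hypothetical function name; how core-parser parses `-v` is an assumption inferred from this example):

```cpp
#include <cstdint>

// Packs two consecutive 32-bit A64 instructions into the 64-bit value that a
// little-endian 8-byte store places at the address of the first one.
constexpr uint64_t pack_two_insns(uint32_t first, uint32_t second) {
    return (static_cast<uint64_t>(second) << 32) | first;
}

// mov w0, #0xa (52800140) followed by ldr x8, [x21, #0x28] (f94016a8).
static_assert(pack_two_insns(0x52800140u, 0xf94016a8u) == 0xf94016a852800140ULL,
              "value passed to remote wd");
```

Getting the order wrong would put the `ldr` where the `mov` belongs, so it is worth checking the value against the earlier disassembly before writing it.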
03-17 17:25:53.806 8023 8537 I opencore: Wait (8572) coredump
03-17 17:25:53.807 8572 8572 I opencore: Coredump /data/cores/core.binder:8023_2_8023_1742203553 ...
03-17 17:25:54.085 8572 8572 I opencore: Finish done.
core-parser> bt
Thread("8537")
x0 0xb400007008f11e10 x1 0x00000070befd2a70 x2 0xffffffffffffffe1 x3 0xb400007008f11fc1
x4 0x00000070befd2ae1 x5 0xb400007008f12032 x6 0x0000000000000000 x7 0x7f7f7f7f7f7f7f7f
x8 0x0000000000000111 x9 0x0000000000001101 x10 0x0000000000000011 x11 0x0000000000000200
x12 0x00000070befd2b55 x13 0x0000000000000000 x14 0x000000000000000f x15 0x0000000000000000
x16 0x0000006e25649598 x17 0x00000070b8f53ac0 x18 0x0000006e1aed4000 x19 0x0000006e25650ec8
x20 0x0000006e2564dbfc x21 0x00000070befd2018 x22 0x0000000000000072 x23 0x000000000000057c
x24 0x0000000000000222 x25 0x00000000000008a7 x26 0x000000000000062e x27 0x0000006e25650448
x28 0x0000000000000510 fp 0x0000006e1b6bfa50
lr 0x0000006e25636d94 sp 0x0000006e1b6bfa50 pc 0x00000070b8f53bf4 pst 0x0000000080001000
Native: #0 00000070b8f53bf4 __memcpy_aarch64_simd+0x134
Native: #1 0000006e25636d90 A_FUNCTION+0x414
Native: #2 0000006e25637a5c /odm/lib64/libxxx_hal.so+0x23a5c
Native: #3 0000006e25627c44 /odm/lib64/libxxx_hal.so+0x13c44
Native: #4 0000006e256434d0 /odm/lib64/libxxx_hal.so+0x2f4d0
Native: #5 00000070b8f6c6a8 __pthread_start(void*)+0xc8
Native: #6 00000070b8f5da40 __start_thread+0x40
core-parser>
core-parser> f 1
Native: #1 0000006e25636d90 A_FUNCTION+0x414
{
library: /odm/lib64/libxxx_hal.so
symbol: A_FUNCTION
frame_fp: 0x6e1b6bfa50
frame_pc: 0x6e25636d90
ASM CODE:
0x6e25636d6c: 780d1269 | sturh w9, [x19, #0xd1]
0x6e25636d70: 39434a6a | ldrb w10, [x19, #0xd2]
0x6e25636d74: 39034a69 | strb w9, [x19, #0xd2]
0x6e25636d78: 3903466a | strb w10, [x19, #0xd1]
0x6e25636d7c: b40000e0 | cbz x0, 0x6e25636d98
0x6e25636d80: 788d1268 | ldursh x8, [x19, #0xd1]
0x6e25636d84: 8b1902a1 | add x1, x21, x25
0x6e25636d88: d37ff918 | lsl x24, x8, #1
0x6e25636d8c: aa1803e2 | mov x2, x24
0x6e25636d90: 940036c8 | bl 0x6e256448b0
0x6e25636d94: 8b190319 | add x25, x24, x25
0x6e25636d98: 78796aa8 | ldrh w8, [x21, x25]
}
core-parser>
Clearly, A_FUNCTION called memcpy(0xb400007008f11e10, xxxx, 0x222), and that call is what wrote out of bounds (0x222 / 2 = 273).
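Putting the numbers from the register dump together with the 0x138-byte allocation seen in the BPF trace gives the size of the overflow. This is a worked calculation only; the constant and function names are hypothetical.

```cpp
#include <cstdint>

constexpr uint64_t kDst = 0xb400007008f11e10ULL;        // x0: memcpy destination
constexpr uint64_t kCopyLen = 0x222;                    // x24: memcpy length
constexpr uint64_t kAllocLen = 0x138;                   // malloc size from the BPF trace
constexpr uint64_t kGuardPage = 0xb400007008f12000ULL;  // guard page from the log

// One past the last byte the copy intends to write.
constexpr uint64_t copy_end(uint64_t dst, uint64_t len) { return dst + len; }

// Bytes written beyond the end of the allocation.
constexpr uint64_t overflow_bytes(uint64_t alloc_len, uint64_t copy_len) {
    return copy_len > alloc_len ? copy_len - alloc_len : 0;
}

static_assert(copy_end(kDst, kCopyLen) == 0xb400007008f12032ULL, "same value as x5");
static_assert(overflow_bytes(kAllocLen, kCopyLen) == 234, "117 extra 2-byte elements");
static_assert(copy_end(kDst, kCopyLen) > kGuardPage, "the copy reaches the guard page");
```

The computed end address equals the value in x5, which is consistent with x5 holding the end of the destination range inside memcpy. The buffer holds 156 two-byte elements while the copy moves 273.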
core-parser> rd 0xb400007008f11e00 -e 0xb400007008f12210
7008f11e00: fe6300000013810c 0000000000000000 ......c.........
7008f11e10: 10002b0043002400 24003c0015003100 .$.C.+...1...<.$
7008f11e20: 150024001f002f00 26001f0017002200 ./...$...".....&
7008f11e30: 4e003e002a001d00 1d00260026003300 ...*.>.N.3.&.&..
7008f11e40: 24002d0019001500 120017000c003f00 .....-.$.?......
7008f11e50: 2a00360009000300 0a004a0012000c00 .....6.*.....J..
7008f11e60: 0a003c0030001f00 0200140026003e00 ...0.<...>.&....
7008f11e70: 0d00110011003500 29001f000e000e00 .5.............)
7008f11e80: 29000a002c000a00 2200110026003600 ...,...).6.&..."
7008f11e90: 1900170023000a00 0700070029000a00 ...#.......)....
7008f11ea0: 0d00390017001600 170022001a001a00 .....9......."..
7008f11eb0: 29002e0039002d00 29003c001f002600 .-.9...).&...<.)
7008f11ec0: 4300160026002500 3f0034004a001000 .%.&...C...J.4.?
7008f11ed0: 260026002a003000 300037002d001000 .0.*.&.&...-.7.0
7008f11ee0: 2d00290043002600 3d001f0029002c00 .&.C.).-.,.)...=
7008f11ef0: 2d00210043002c00 43005a0033001100 .,.C.!.-...3.Z.C
7008f11f00: 3a0076006f004000 4c004c005f004300 [email protected].:.C._.L.L
7008f11f10: 39006b0050004300 3f00500044006f00 .C.P.k.9.o.D.P.?
7008f11f20: 3300820043003700 31003a0042003f00 .7.C...3.?.B.:.1
7008f11f30: 2d004d00a9004800 3900490037003a00 .H...M.-.:.7.I.9
7008f11f40: 6b0084002f005500 1200310008001100 .U./...k.....1..
7008f11f50: 40003e002b002600 0f0015002f003200 .&.+.>[email protected]./....
7008f11f60: 480034005b001200 40003e0004000d00 ...[.4.H.....>.@
7008f11f70: 1a00130053004300 09001c0015003100 .C.S.....1......
7008f11f80: 1b0019001e006f00 30003c0026002a00 .o.......*.&.<.0
7008f11f90: 4700590053001400 7b003a0050004900 ...S.Y.G.I.P.:.{
7008f11fa0: 9b0046003e008600 5900550079004000 ...>[email protected]
7008f11fb0: 4f006f0046006f00 5100660062005300 .o.F.o.O.S.b.f.Q
7008f11fc0: 49004f004900bd00 f3005d003e004400 ...I.O.I.D.>.]..
7008f11fd0: 3f00480046004600 460058004f005000 .F.F.H.?.P.O.X.F
7008f11fe0: f800d9008e00b400 0901c900db00ca00 ................
7008f11ff0: 0000000000000000 0000000000000000 ................
...
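The fish took the bait: it was handed the trap pointer we had set up, wrote past the end of the buffer into the GuardPage, and triggered the fault.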
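Afterword
The core of the memory page screening method is how to select the pages and build the trap. There is no fixed recipe, and the developer needs to understand the memory allocator well. Few people know this method, and most developers have probably never heard of it. One reason is that it is not a general-purpose technique: it depends on how the user-space memory allocator behaves. It does, however, pair very well with scudo. In the end the fix itself was simple: change one parameter.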
- hal_xxx_num=156
+ hal_xxx_num=273