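Background
Our cross-platform solution needs to execute JS code dynamically on HarmonyOS, much like React Native (RN). HarmonyOS provides JSVM for this, which is a wrapper around V8. At runtime, however, we hit a crash inside JSVM. This article records how debugging techniques and a little assembly knowledge were enough to track down the cause of a crash in a system library.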
Symptoms
Running a JS script with JSVM crashed in WhiteToGreyAndPush. Two crash stacks were collected:
Reason:Signal:SIGSEGV(SEGV_MAPERR)@0xbebebebebebebec6
[libjsvm.so] v8::internal::MarkingBarrier::WhiteToGreyAndPush(v8::internal::HeapObject) Disassembly:201
[libjsvm.so] v8::internal::MarkingBarrier::Write(v8::internal::HeapObject, v8::internal::FullHeapObjectSlot, v8::internal::HeapObject) 0x0000005f572087bc
[libjsvm.so] v8::internal::Factory::CodeBuilder::BuildInternal(bool) 0x0000005f5718766c
[libjsvm.so] v8::internal::compiler::CodeGenerator::FinalizeCode() 0x0000005f578bd380
[libjsvm.so] auto v8::internal::compiler::PipelineImpl::Run<v8::internal::compiler::FinalizeCodePhase>() 0x0000005f57b25c3c
[libjsvm.so] v8::internal::compiler::PipelineImpl::FinalizeCode(bool) 0x0000005f57b19208
[libjsvm.so] v8::internal::compiler::PipelineCompilationJob::FinalizeJobImpl(v8::internal::Isolate*) 0x0000005f57b1903c
[libjsvm.so] v8::internal::Compiler::FinalizeTurbofanCompilationJob(v8::internal::TurbofanCompilationJob*, v8::internal::Isolate*) 0x0000005f57065b40
[libjsvm.so] v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions() 0x0000005f570b3b24
[libjsvm.so] v8::internal::StackGuard::HandleInterrupts() 0x0000005f57149b20
[libjsvm.so] v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) 0x0000005f57592604
[libjsvm.so] Builtins_CEntry_Return1_ArgvOnStack_NoBuiltinExit 0x0000005f56af20b8
[libjsvm.so] Builtins_ProxyConstructor 0x0000005f56b53190
[libjsvm.so] Builtins_JSBuiltinsConstructStub 0x0000005f56a6677c
[??] ?? 0x0000005f3f4c9b84
Reason:Signal:SIGSEGV(SEGV_MAPERR)@0xbebebebebebebec6
[libjsvm.so] v8::internal::MarkingBarrier::WhiteToGreyAndPush(v8::internal::HeapObject) Disassembly:201
[libjsvm.so] v8::internal::MarkingBarrier::Write(v8::internal::HeapObject, v8::internal::FullHeapObjectSlot, v8::internal::HeapObject) 0x0000005f37f887bc
[libjsvm.so] v8::internal::Dictionary<v8::internal::NameDictionary, v8::internal::NameDictionaryShape>::SetEntry(v8::internal::InternalIndex, v8::internal::Object, v8::internal::Object, v8::internal::PropertyDetails) 0x0000005f381d2c0c
[libjsvm.so] v8::internal::Handle<v8::internal::NameDictionary> v8::internal::Dictionary<v8::internal::NameDictionary, v8::internal::NameDictionaryShape>::Add<v8::internal::Isolate, (v8::internal::AllocationType)0>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::NameDictionary>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyDetails, v8::internal::InternalIndex*) 0x0000005f381d32dc
[libjsvm.so] v8::internal::BaseNameDictionary<v8::internal::NameDictionary, v8::internal::NameDictionaryShape>::Add(v8::internal::Isolate*, v8::internal::Handle<v8::internal::NameDictionary>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyDetails, v8::internal::InternalIndex*) 0x0000005f381d2e64
[libjsvm.so] v8::internal::JSObject::MigrateToMap(v8::internal::Isolate*, v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::Map>, int) 0x0000005f3815bd74
[libjsvm.so] v8::internal::JSObject::OptimizeAsPrototype(v8::internal::Handle<v8::internal::JSObject>, bool) 0x0000005f3815fa68
[libjsvm.so] v8::internal::Map::SetPrototype(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::HeapObject>, bool) 0x0000005f381b389c
[libjsvm.so] v8::internal::JSFunction::SetInitialMap(v8::internal::Isolate*, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>) 0x0000005f3814a2a4
[libjsvm.so] v8::internal::ApiNatives::CreateApiFunction(v8::internal::Isolate*, v8::internal::Handle<v8::internal::NativeContext>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::InstanceType, v8::internal::MaybeHandle<v8::internal::Name>) 0x0000005f37d14208
[libjsvm.so] v8::internal::(anonymous namespace)::InstantiateFunction(v8::internal::Isolate*, v8::internal::Handle<v8::internal::NativeContext>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::MaybeHandle<v8::internal::Name>) 0x0000005f37d122ac
[libjsvm.so] v8::internal::ApiNatives::InstantiateFunction(v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::MaybeHandle<v8::internal::Name>) 0x0000005f37d12c4c
[libjsvm.so] v8::FunctionTemplate::GetFunction(v8::Local<v8::Context>) 0x0000005f37d318a4
[libjsvm.so] v8::Function::New(v8::Local<v8::Context>, void (*)(v8::FunctionCallbackInfo<v8::Value> const&), v8::Local<v8::Value>, int, v8::ConstructorBehavior, v8::SideEffectType) 0x0000005f37d317e0
[libjsvm.so] OH_JSVM_CreateFunction 0x0000005f37569780
Investigation
The bad MarkingBarrier
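Reproducing the crash shows that it happens at the instruction ldr x20, [x23, #0x8]. Printing x23 shows that it holds an invalid pointer, 0xbebebebebebebebe; adding the 0x8 offset gives exactly the fault address in the crash report, 0xbebebebebebebec6.
The preceding instruction is ldr x23, [x0, #0x50], so x23 is loaded from the field at offset 0x50 of the object x0 points to. Under the AArch64 calling convention for C++ member functions, x0 carries the first argument of MarkingBarrier::WhiteToGreyAndPush, which is the implicit this pointer, i.e. the MarkingBarrier.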
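The address arithmetic in this step can be double-checked with a minimal C++ sketch. The struct below is a hypothetical stand-in: only the 0x50 offset and the two addresses come from the disassembly and the crash report, while the type and field names are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the object that x0 (this) points to.
// Only the 0x50 offset is taken from the disassembly; the names are invented.
struct FakeBarrier {
  unsigned char padding[0x50];  // whatever occupies the first 0x50 bytes
  std::uint64_t field_at_0x50;  // the value loaded by: ldr x23, [x0, #0x50]
};

static_assert(offsetof(FakeBarrier, field_at_0x50) == 0x50,
              "the field must sit at offset 0x50");

// Address touched by a load of the form: ldr x20, [base, #offset]
constexpr std::uint64_t LoadAddress(std::uint64_t base, std::uint64_t offset) {
  return base + offset;
}

// A 0xbe-filled pointer plus 0x8 is exactly the reported fault address.
static_assert(LoadAddress(0xbebebebebebebebeULL, 0x8) == 0xbebebebebebebec6ULL,
              "fault address should match the crash report");
```

Both checks are static_assert, so the sketch verifies itself at compile time.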
Set a breakpoint on MarkingBarrier::WhiteToGreyAndPush and inspect the contents of the MarkingBarrier in the normal case:


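As before, x0 is the first argument, i.e. this, which inside WhiteToGreyAndPush is the current MarkingBarrier. Viewing the contents of the MarkingBarrier (0x0000006185bc3640) in the memory view:
The value at offset 0x50 of the current MarkingBarrier is valid, so this call will not crash. We set a watchpoint here to see whether, and when, the value at offset 0x50 gets corrupted: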
(lldb) watchpoint set expression 0x0000006185bc3690
Watchpoint created: Watchpoint 1: addr = 0x6185bc3690 size = 8 state = enabled type = w
new value: 418855532128
Watchpoint address = 0x0000006185bc3640 (MarkingBarrier) + 0x50 = 0x0000006185bc3690
The current value at 0x0000006185bc3690 is 418855532128, which is 0x6185bc3660 in hexadecimal and matches what we saw in the memory view.
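The two numbers used for the watchpoint can be verified the same way. This is only a sanity-check sketch; the constant and function names are made up, and the values are the ones from the lldb session above.

```cpp
#include <cstdint>

constexpr std::uint64_t kBarrierAddress = 0x6185bc3640ULL;  // x0 at the breakpoint
constexpr std::uint64_t kFieldOffset = 0x50;                // offset from the ldr

// Address to watch: object base plus field offset.
constexpr std::uint64_t WatchAddress(std::uint64_t object,
                                     std::uint64_t offset) {
  return object + offset;
}

static_assert(WatchAddress(kBarrierAddress, kFieldOffset) == 0x6185bc3690ULL,
              "watchpoint address");

// lldb reports the new value in decimal; the memory view shows it in hex.
static_assert(418855532128ULL == 0x6185bc3660ULL,
              "decimal and hex forms are the same value");
```

Inside lldb the conversion can also be done directly with p/x 418855532128.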
When the watchpoint is hit, the instruction just before the stop location is str x12, [x11, #0x50], which stores x12 at offset 0x50 from x11. Printing x11 and x12:
(lldb) p/x $x12
(unsigned long) $3 = 0x0000006185bc3660
(lldb) p/x $x11
(unsigned long) $4 = 0x0000006185bc3640
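x11 points to the same MarkingBarrier object as before, and x12 matches the value previously stored at 0x0000006185bc3690. This store just writes the same value again, so it can be ignored. Continuing execution then triggers the crash:
Printing x0 (the MarkingBarrier) shows that the current MarkingBarrier is at 0x00000061860d8f40, a different object from the one at 0x0000006185bc3640 that we hit at the earlier breakpoint. Viewing the memory at 0x00000061860d8f40:
The value at offset 0x50 of this later MarkingBarrier is bad. In other words, the original MarkingBarrier was not corrupted in place; a different, bad MarkingBarrier was being used instead.
Source of the MarkingBarrier
Download the V8 source from github.com/v8/v8; here I used ... version 12.0.46.
Look at the source of Dictionary::SetEntry from the second crash stack to find out where the MarkingBarrier comes from: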
template <typename Derived, typename Shape>
void Dictionary<Derived, Shape>::SetEntry(InternalIndex entry,
                                          Tagged<Object> key,
                                          Tagged<Object> value,
                                          PropertyDetails details) {
  DCHECK(Dictionary::kEntrySize == 2 || Dictionary::kEntrySize == 3);
  DCHECK(!IsName(key) || details.dictionary_index() > 0);
  int index = DerivedHashTable::EntryToIndex(entry);
  DisallowGarbageCollection no_gc;
  WriteBarrierMode mode = this->GetWriteBarrierMode(no_gc);
  this->set(index + Derived::kEntryKeyIndex, key, mode);
  this->set(index + Derived::kEntryValueIndex, value, mode);
  if (Shape::kHasDetails) DetailsAtPut(entry, details);
}
The this->set method is implemented as follows (Dictionary inheritance chain: Dictionary -> HashTable -> HashTableBase -> FixedArray):
void FixedArray::set(int index, Tagged<Object> value, WriteBarrierMode mode) {
  DCHECK_NE(map(), GetReadOnlyRoots().fixed_cow_array_map());
  DCHECK_LT(static_cast<unsigned>(index), static_cast<unsigned>(length()));
  int offset = OffsetOfElementAt(index);
  RELAXED_WRITE_FIELD(*this, offset, value);
  CONDITIONAL_WRITE_BARRIER(*this, offset, value, mode);
}
CONDITIONAL_WRITE_BARRIER is a macro, defined as follows:
#define CONDITIONAL_WRITE_BARRIER(object, offset, value, mode)             \
  do {                                                                     \
    DCHECK_NOT_NULL(GetHeapFromWritableObject(object));                    \
    CombinedWriteBarrier(object, (object)->RawField(offset), value, mode); \
  } while (false)
The key parts of CombinedWriteBarrier and the functions it calls are as follows:
inline void CombinedWriteBarrier(Tagged<HeapObject> host, MaybeObjectSlot slot,
                                 MaybeObject value, WriteBarrierMode mode) {
  ...
  heap_internals::CombinedWriteBarrierInternal(host, HeapObjectSlot(slot),
                                               value_object, mode);
}

inline void CombinedWriteBarrierInternal(Tagged<HeapObject> host,
                                         HeapObjectSlot slot,
                                         Tagged<HeapObject> value,
                                         WriteBarrierMode mode) {
  ...
  if (V8_UNLIKELY(is_marking)) {
    WriteBarrier::MarkingSlow(host, HeapObjectSlot(slot), value);
  }
}

void WriteBarrier::MarkingSlow(Tagged<HeapObject> host, HeapObjectSlot slot,
                               Tagged<HeapObject> value) {
  MarkingBarrier* marking_barrier = CurrentMarkingBarrier(host);
  marking_barrier->Write(host, slot, value);
}
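To make the shape of this call chain easier to follow, here is a toy model of a marking write barrier. It is not V8's implementation: every type and function name is invented, and it only mirrors the general idea visible in the excerpts (store first, then, only while marking is active, hand the written value to a barrier that shades white objects grey and queues them).

```cpp
#include <cassert>
#include <vector>

// Toy model only: names and structure are invented, not V8's implementation.
enum class Color { kWhite, kGrey, kBlack };

struct ToyObject {
  Color color = Color::kWhite;
  ToyObject* field = nullptr;
};

struct ToyMarkingBarrier {
  bool is_marking = false;
  std::vector<ToyObject*> worklist;  // grey objects waiting to be scanned

  // Rough analogue of WhiteToGreyAndPush: shade a white object and queue it.
  bool WhiteToGreyAndPush(ToyObject* object) {
    if (object->color != Color::kWhite) return false;
    object->color = Color::kGrey;
    worklist.push_back(object);
    return true;
  }

  // Rough analogue of Write: called after the store into the host object.
  void Write(ToyObject* value) { WhiteToGreyAndPush(value); }
};

// Rough analogue of FixedArray::set plus the conditional write barrier.
void SetField(ToyMarkingBarrier& barrier, ToyObject* host, ToyObject* value) {
  host->field = value;       // the plain store
  if (barrier.is_marking) {  // slow path is taken only while marking
    barrier.Write(value);
  }
}

void ToyBarrierDemo() {
  ToyMarkingBarrier barrier;
  ToyObject host, value;
  SetField(barrier, &host, &value);
  assert(value.color == Color::kWhite);  // not marking: no barrier work
  barrier.is_marking = true;
  SetField(barrier, &host, &value);
  assert(value.color == Color::kGrey);   // marking: value shaded and queued
  assert(barrier.worklist.size() == 1);
}
```

In this model the barrier is passed in explicitly. The V8 excerpt above instead looks it up through CurrentMarkingBarrier, which is what the rest of the investigation focuses on.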
In WriteBarrier::MarkingSlow, the MarkingBarrier object is obtained through CurrentMarkingBarrier, which is implemented as follows:
namespace {
thread_local MarkingBarrier* current_marking_barrier = nullptr;
}  // namespace

MarkingBarrier* WriteBarrier::CurrentMarkingBarrier(
    Tagged<HeapObject> verification_candidate) {
  MarkingBarrier* marking_barrier = current_marking_barrier;
  DCHECK_NOT_NULL(marking_barrier);
  ...
  return marking_barrier;
}
A thread_local variable holds the MarkingBarrier pointer, so the crash above means that current_marking_barrier was changed to point at a bad MarkingBarrier. Next, find where current_marking_barrier is assigned:
MarkingBarrier* WriteBarrier::SetForThread(MarkingBarrier* marking_barrier) {
  MarkingBarrier* existing = current_marking_barrier;
  current_marking_barrier = marking_barrier;
  return existing;
}
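SetForThread returns the pointer it replaces, which makes a save-and-restore pattern possible. The sketch below shows that pattern in isolation. It is purely hypothetical: DemoBarrier, DemoSetForThread and ScopedBarrier are invented names, and the excerpts here do not show whether or where V8 or JSVM restores the previous pointer.

```cpp
#include <cassert>

struct DemoBarrier {};  // hypothetical placeholder type

namespace {
thread_local DemoBarrier* demo_current_barrier = nullptr;
}  // namespace

// Same shape as SetForThread above: install a new pointer, return the old one.
DemoBarrier* DemoSetForThread(DemoBarrier* barrier) {
  DemoBarrier* existing = demo_current_barrier;
  demo_current_barrier = barrier;
  return existing;
}

DemoBarrier* DemoCurrentBarrier() { return demo_current_barrier; }

// Hypothetical scope guard: installs a barrier for the current thread and
// puts the previous one back when the scope ends.
class ScopedBarrier {
 public:
  explicit ScopedBarrier(DemoBarrier* barrier)
      : previous_(DemoSetForThread(barrier)) {}
  ~ScopedBarrier() { DemoSetForThread(previous_); }
  ScopedBarrier(const ScopedBarrier&) = delete;
  ScopedBarrier& operator=(const ScopedBarrier&) = delete;

 private:
  DemoBarrier* previous_;
};

void ScopedBarrierDemo() {
  DemoBarrier outer, inner;
  DemoSetForThread(&outer);
  {
    ScopedBarrier scope(&inner);
    assert(DemoCurrentBarrier() == &inner);
  }
  assert(DemoCurrentBarrier() == &outer);  // previous pointer restored
  DemoSetForThread(nullptr);
}
```

If a caller discards the return value and never restores it, the last pointer installed stays in effect for everything that runs on that thread afterwards.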
Set a breakpoint on WriteBarrier::SetForThread:

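Because the crash is on the main thread, the many current_marking_barrier updates on other threads can be ignored. Keep running until an update on the main thread is hit:
The call stack is not displayed here, so we use register read and look at the lr register, which holds the return address into the caller: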
(lldb) register read
General Purpose Registers:
x0 = 0x0000000000000000
x1 = 0x0000000000000020
x2 = 0x0000000000000020
x3 = 0x0000005aeb606a40 ld-musl-aarch64.so.1`memset
x4 = 0x0000005cee8a7f30
x5 = 0x0000000000000001
x6 = 0x0000007fddc54000
x7 = 0x0000000000000001
x8 = 0x0000000000000000
x9 = 0x0000000000000000
x10 = 0x0000005aeb606a40 ld-musl-aarch64.so.1`memset
x11 = 0x000000000b18d225
x12 = 0xffffffffffffffff
x13 = 0x0000000000000000
x14 = 0x0000007fffffffff
x15 = 0x0000007fffffffff
x16 = 0x0000005aecaf4fd8
x17 = 0x0000005aeb71d450 ld-musl-aarch64.so.1`tss_get
x18 = 0xffff000000000006
x19 = 0x0000007b6f867800
x20 = 0x0000005cee8a7e50
x21 = 0x0000007b6f874c40
x22 = 0x0000005cf0298650
x23 = 0x0000007b76a68800
x24 = 0x0000000000000000
x25 = 0x0000005aeb96c570 ld-musl-aarch64.so.1`ohos_malloc_hook_shared_library
x26 = 0x0000000000000007
x27 = 0x0000007b714fb7e8 libace_napi.z.so`ArkNativeEngine::napiProfilerEnabled
x28 = 0x0000007fde44d3a0
fp = 0x0000007fde44a880
lr = 0x0000007c95cc05a8 libjsvm.so`v8::internal::Isolate::Enter() + 232
sp = 0x0000007fde44a880
pc = 0x0000007c95d2c2ac libjsvm.so`v8::internal::WriteBarrier::SetForThread(v8::internal::MarkingBarrier*)
cpsr = 0x20001000
Continuing execution and reading lr at each stop collects the following callers:
libjsvm.so`v8::internal::Isolate::Enter() + 232
libjsvm.so`v8::internal::Isolate::Init(v8::internal::SnapshotData*, v8::internal::SnapshotData*, v8::internal::Snapshot
libjsvm.so`v8::Isolate::Initialize(v8::Isolate*, v8::Isolate::CreateParams const&) + 496
libjsvm.so`OH_JSVM_CreateVM + 332
Besides V8's own Isolate logic, the callers collected from lr include a familiar one: OH_JSVM_CreateVM. The full call stack for OH_JSVM_CreateVM is as follows:
[libjsvm.so] v8::internal::WriteBarrier::SetForThread(v8::internal::MarkingBarrier*) Disassembly:401
[??] ?? 0x004f007b25fa494c(OH_JSVM_CreateVM)
[libmylib.so] MyEngineInstance::MyEngineInstance(napi_env__*, napi_value__*, napi_value__*, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>> const&, unsigned long const&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>> const&) MyEngineInstance.cpp:686
[libmylib.so] std::__n1::__shared_ptr_emplace<MyEngineInstance, std::__n1::allocator<MyEngineInstance>>::__shared_ptr_emplace[abi:v15004]<napi_env__*&, napi_value__*&, napi_value__*&, char const (&) [1], unsigned long&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>>&>(std::__n1::allocator<MyEngineInstance>, napi_env__*&, napi_value__*&, napi_value__*&, char const (&) [1], unsigned long&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>>&) shared_ptr.h:294
[libmylib.so] std::__n1::shared_ptr<MyEngineInstance> std::__n1::allocate_shared[abi:v15004]<MyEngineInstance, std::__n1::allocator<MyEngineInstance>, napi_env__*&, napi_value__*&, napi_value__*&, char const (&) [1], unsigned long&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>>&, void>(std::__n1::allocator<MyEngineInstance> const&, napi_env__*&, napi_value__*&, napi_value__*&, char const (&) [1], unsigned long&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>>&) shared_ptr.h:953
[libmylib.so] std::__n1::shared_ptr<MyEngineInstance> std::__n1::make_shared[abi:v15004]<MyEngineInstance, napi_env__*&, napi_value__*&, napi_value__*&, char const (&) [1], unsigned long&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>>&, void>(napi_env__*&, napi_value__*&, napi_value__*&, char const (&) [1], unsigned long&, std::__n1::basic_string<char, std::__n1::char_traits<char>, std::__n1::allocator<char>>&) shared_ptr.h:962
[libmylib.so] createMyEngine(napi_env__*, napi_callback_info__*) napi_init.cpp:82
[libace_napi.z.so] panda::JSValueRef ArkNativeFunctionCallBack<true>(panda::JsiRuntimeCallInfo*) 0x0000007a07ebdebc
[JIT(0x777c880400)] RTStub_PushCallRangeAndDispatchNative 0x0000007a1c874eb0
[JIT(0x777c880400)] BCStubInterpreterRoutine 0x0000007a1c48c6bc
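Here createMyEngine creates a new MyEngineInstance, which in turn creates a new JSVM VM. The new VM has its own Isolate, and setting it up changes where the thread_local current_marking_barrier points. When the main thread runs several JSVM VMs at the same time, they all share a single current_marking_barrier, and that is what causes the problem.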
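The sharing problem can be reproduced in miniature. The sketch below is a toy model under stated assumptions: ToyVm and ToyBarrier are invented, and the only behavior borrowed from the analysis is that creating a VM installs its barrier in a thread_local pointer and that writes look the barrier up through that pointer.

```cpp
#include <cassert>

struct ToyBarrier {
  int writes = 0;  // counts barrier calls routed to this barrier
};

namespace {
thread_local ToyBarrier* toy_current_barrier = nullptr;
}  // namespace

// Hypothetical VM: creating one installs its barrier for the current thread,
// as the OH_JSVM_CreateVM -> ... -> SetForThread stack above suggests.
class ToyVm {
 public:
  ToyVm() { toy_current_barrier = &barrier_; }

  // A heap write finds the barrier through the thread-local pointer,
  // not through the VM that owns the object being written.
  void WriteField() { ++toy_current_barrier->writes; }

  int barrier_writes() const { return barrier_.writes; }

 private:
  ToyBarrier barrier_;
};

void SharedBarrierDemo() {
  ToyVm vm1;
  vm1.WriteField();
  assert(vm1.barrier_writes() == 1);  // routed to vm1's own barrier

  ToyVm vm2;                          // second VM on the same thread
  vm1.WriteField();                   // vm1 writes again...
  assert(vm1.barrier_writes() == 1);  // ...but its own barrier never sees it
  assert(vm2.barrier_writes() == 1);  // vm2's barrier was used instead
}
```

In the toy the misrouted write is harmless. In the real crash the barrier reached this way had an invalid value at offset 0x50, so the same misrouting ended in SIGSEGV.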
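Solution
Use a multithreaded design in which each thread runs only one JSVM VM. JSVM is a wrapper around V8, and judging from this analysis, multiple VM instances created this way cannot safely share one thread.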
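Why one VM per thread avoids the problem can be seen with a small standard C++ sketch. It does not use the JSVM API at all: WorkerBarrier and the thread_local pointer are stand-ins, and the code only shows that a thread_local installed on one thread is invisible to, and untouched by, another thread.

```cpp
#include <cassert>
#include <thread>

struct WorkerBarrier {};  // hypothetical placeholder type

namespace {
thread_local WorkerBarrier* worker_current_barrier = nullptr;
}  // namespace

// Returns true if a "VM" set up on another thread leaves this thread's
// thread-local pointer untouched.
bool OtherThreadDoesNotDisturbUs() {
  WorkerBarrier mine;
  worker_current_barrier = &mine;  // "VM 1" on the calling thread

  WorkerBarrier* seen_by_other = &mine;
  std::thread other([&seen_by_other] {
    seen_by_other = worker_current_barrier;  // fresh thread: still nullptr
    WorkerBarrier theirs;
    worker_current_barrier = &theirs;        // "VM 2" on its own thread
    worker_current_barrier = nullptr;        // clean up before the thread exits
  });
  other.join();

  bool intact = (worker_current_barrier == &mine) && (seen_by_other == nullptr);
  worker_current_barrier = nullptr;
  return intact;
}

void OneVmPerThreadDemo() { assert(OtherThreadDoesNotDisturbUs()); }
```

Depending on the toolchain, building this may require linking the threading library (for example -pthread with GCC or Clang).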