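Preface
In the previous article, "Android15 Framework (1): Analysis of Init, the First User-Space Process", we looked at the Init process. Now let's see what the Zygote process does.
Note: the source code in this article is based on Android 15.0.0_r1. The article focuses on the main logic and omits some code.
1. Android System Boot Flow
This article is about the Zygote process, so as usual we start with the Android boot flow:
Power on and system startup -> Bootloader -> Linux kernel startup -> Init -> Zygote -> SystemServer -> Launcher
2. The Zygote Process
Zygote is started by Init, is responsible for forking all app processes, and improves system efficiency through preloading. Let's go through the relevant code in detail.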
2.1 app_main.cpp::main
Init first calls fork and then execv to run the executable /system/bin/app_process64, which starts the zygote process. Let's go straight to the entry function of the Zygote process: /frameworks/base/cmds/app_process/app_main.cpp
C++
int main(int argc, char* const argv[])
{
...
// Create the AppRuntime
AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
...
// Declare the zygote flag
bool zygote = false;
bool startSystemServer = false;
bool application = false;
String8 niceName;
String8 className;
++i; // Skip unused "parent dir" argument.
while (i < argc) {
const char* arg = argv[i++];
// If argv contains --zygote, set zygote to true
if (strcmp(arg, "--zygote") == 0) {
zygote = true;
niceName = ZYGOTE_NICE_NAME;
} else if (strcmp(arg, "--start-system-server") == 0) {
startSystemServer = true;
} else if (strcmp(arg, "--application") == 0) {
application = true;
} else if (strncmp(arg, "--nice-name=", 12) == 0) {
niceName = (arg + 12);
} else if (strncmp(arg, "--", 2) != 0) {
className = arg;
break;
} else {
--i;
break;
}
}
...
// zygote is true, so this branch runs
if (zygote) {
runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
} else if (!className.empty()) {
runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
} else {
fprintf(stderr, "Error: no class name or --zygote supplied.\n");
app_usage();
LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
}
}
The argv here comes from the init.zygotexxx.rc file: -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote. So strcmp(arg, "--zygote") == 0 is true, and the zygote variable is set to true.
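The flag loop above can be sketched in isolation. This is a minimal sketch, not the AOSP code: the ParsedArgs struct and parseArgs function are hypothetical names, the literal "zygote" stands in for ZYGOTE_NICE_NAME, and the input is assumed to be the arguments that remain after the runtime options and the "parent dir" argument have been skipped.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type; app_main.cpp uses separate local variables.
struct ParsedArgs {
    bool zygote = false;
    bool startSystemServer = false;
    bool application = false;
    std::string niceName;
    std::string className;
};

// Consumes known "--" flags and stops at the first class name or at the
// first "--" option it does not recognize.
ParsedArgs parseArgs(const std::vector<std::string>& args) {
    ParsedArgs out;
    const std::string nicePrefix = "--nice-name=";
    for (const std::string& arg : args) {
        if (arg == "--zygote") {
            out.zygote = true;
            out.niceName = "zygote";  // stand-in for ZYGOTE_NICE_NAME
        } else if (arg == "--start-system-server") {
            out.startSystemServer = true;
        } else if (arg == "--application") {
            out.application = true;
        } else if (arg.compare(0, nicePrefix.size(), nicePrefix) == 0) {
            out.niceName = arg.substr(nicePrefix.size());
        } else if (arg.compare(0, 2, "--") != 0) {
            out.className = arg;  // first non-option argument is the class
            break;
        } else {
            break;  // unknown "--" option: leave it for later processing
        }
    }
    return out;
}
```

With the arguments from the rc file, parsing sets zygote and startSystemServer and stops at --socket-name=zygote, which is not one of the flags this loop handles.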
Next, follow runtime.start("com.android.internal.os.ZygoteInit", args, zygote);. runtime is declared as an AppRuntime, which does not implement start itself, so we look at its parent class, AndroidRuntime.
2.2 AndroidRuntime
/frameworks/base/core/jni/AndroidRuntime.cpp
C++
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
...
// Load the .so
JniInvocation jni_invocation;
jni_invocation.Init(NULL);
JNIEnv* env;
// Start the virtual machine
if (startVm(&mJavaVM, &env, zygote, primary_zygote) != 0) {
return;
}
onVmCreated(env);
/*
* Register android functions.
*/
// Register JNI functions
if (startReg(env) < 0) {
ALOGE("Unable to register all android natives\n");
return;
}
...
char* slashClassName = toSlashClassName(className != NULL ? className : "");
jclass startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
// Enter Zygote's Java code
env->CallStaticVoidMethod(startClass, startMeth, strArray);
#if 0
if (env->ExceptionCheck())
threadExitUncaughtException(env);
#endif
}
}
...
}
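AndroidRuntime::start mainly does the following:
- Loads the .so
- Starts the Java virtual machine
- Registers JNI functions
- Enters Zygote's Java code
Let's look at how each step works.
2.2.1 Loading the .so
jni_invocation.Init(NULL) calls the Init method of the JniInvocation class defined in JniInvocation.h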
/libnativehelper/include_platform/nativehelper/JniInvocation.h
C++
bool Init(const char* library) {
return JniInvocationInit(impl_, library) != 0;
}
This method is simple: it calls JniInvocationInit. Here is its implementation:
/libnativehelper/JniInvocation.c
C++
bool JniInvocationInit(struct JniInvocationImpl* instance, const char* library_name) {
#ifdef __ANDROID__
char buffer[PROP_VALUE_MAX];
#else
char* buffer = NULL;
#endif
library_name = JniInvocationGetLibrary(library_name, buffer);
DlLibrary library = DlOpenLibrary(library_name);
...
DlSymbol JNI_CreateJavaVM_ = FindSymbol(library, "JNI_CreateJavaVM");
...
}
JniInvocationInit is implemented in JniInvocation.c. It first calls JniInvocationGetLibrary to get the library name and then calls DlOpenLibrary to load the library. Let's look at JniInvocationGetLibrary first.
/libnativehelper/JniInvocation.c
C++
const char* JniInvocationGetLibrary(const char* library, char* buffer) {
bool debuggable = IsDebuggable();
const char* system_preferred_library = NULL;
if (buffer != NULL && (GetLibrarySystemProperty(buffer) > 0)) {
system_preferred_library = buffer;
}
return JniInvocationGetLibraryWith(library, debuggable, system_preferred_library);
}
JniInvocationGetLibrary first calls IsDebuggable to get the debuggable flag and then calls JniInvocationGetLibraryWith. Let's look at IsDebuggable first.
/libnativehelper/JniInvocation.c
C++
static bool IsDebuggable() {
#ifdef __ANDROID__
char debuggable[PROP_VALUE_MAX] = {0};
__system_property_get("ro.debuggable", debuggable);
return strcmp(debuggable, "1") == 0;
#else
// Host is always treated as debuggable, which allows choice of library to be overridden.
return true;
#endif
}
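IsDebuggable first checks whether the __ANDROID__ macro is defined. If it is, the function reads the ro.debuggable property and returns true when the value is 1, false otherwise. If the macro is not defined, it returns true directly. On Android __ANDROID__ is defined, and on production (user) builds ro.debuggable is 0, so IsDebuggable returns false.
System properties can be read with adb shell getprop ro.xxx, for example adb shell getprop ro.debuggable.
Knowing that debuggable is false, we continue with the rest of JniInvocationGetLibrary, namely JniInvocationGetLibraryWith.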
/libnativehelper/JniInvocation.c
C++
const char* JniInvocationGetLibraryWith(const char* library,
bool is_debuggable,
const char* system_preferred_library) {
if (is_debuggable) {
// Debuggable property is set. Allow library providing JNI Invocation API to be overridden.
// Choose the library parameter (if provided).
if (library != NULL) {
return library;
}
// If the debug library is installed, use it.
// TODO(b/216099383): Do this in the test harness instead.
struct stat st;
if (stat(kDebugJniInvocationLibraryPath, &st) == 0) {
return kDebugJniInvocationLibrary;
} else if (errno != ENOENT) {
ALOGW("Failed to stat %s: %s", kDebugJniInvocationLibraryPath, strerror(errno));
}
// Choose the system_preferred_library (if provided).
if (system_preferred_library != NULL) {
return system_preferred_library;
}
}
return kDefaultJniInvocationLibrary;
}
Since the is_debuggable argument is false, JniInvocationGetLibraryWith returns kDefaultJniInvocationLibrary, which is libart.so.
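The selection order in JniInvocationGetLibraryWith can be sketched as a pure function. This is a simplified sketch under stated assumptions: chooseJniLibrary is a hypothetical name, an empty string plays the role of NULL, the stat() check is replaced by a boolean parameter, and the value used for the debug library name is an assumption rather than something shown in the excerpt.

```cpp
#include <cassert>
#include <string>

// Stand-ins for kDefaultJniInvocationLibrary and kDebugJniInvocationLibrary.
// The debug library name is assumed for illustration.
const std::string kDefaultLibrary = "libart.so";
const std::string kDebugLibrary = "libartd.so";

// Overrides are only honored on debuggable builds; otherwise the default
// library always wins.
std::string chooseJniLibrary(const std::string& requested,
                             bool isDebuggable,
                             bool debugLibraryInstalled,
                             const std::string& systemPreferred) {
    if (isDebuggable) {
        if (!requested.empty()) return requested;              // explicit parameter
        if (debugLibraryInstalled) return kDebugLibrary;       // debug library present
        if (!systemPreferred.empty()) return systemPreferred;  // system property
    }
    return kDefaultLibrary;
}
```

On a non-debuggable build every override is ignored, which matches the result traced above.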
So far we know that JniInvocationGetLibrary(library_name, buffer) returns "libart.so", which is assigned to library_name. Next, let's see how DlOpenLibrary(library_name) loads the library.
/libnativehelper/DlHelp.c
C++
DlLibrary DlOpenLibrary(const char* filename) {
#ifdef _WIN32
return LoadLibrary(filename);
#else
// Load with RTLD_NODELETE in order to ensure that libart.so is not unmapped when it is closed.
// This is due to the fact that it is possible that some threads might have yet to finish
// exiting even after JNI_DeleteJavaVM returns, which can lead to segfaults if the library is
// unloaded.
return dlopen(filename, RTLD_NOW | RTLD_NODELETE);
#endif
}
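DlOpenLibrary checks whether the _WIN32 macro is defined: if so it calls LoadLibrary, otherwise dlopen. _WIN32 is a macro predefined on Windows targets, so on Android dlopen is used to load the .so file. We won't follow what dlopen does internally.
Finally, look at DlSymbol JNI_CreateJavaVM_ = FindSymbol(library, "JNI_CreateJavaVM") in JniInvocationInit. This line looks up the function implemented in the library that provides the VM, i.e. JNI_CreateJavaVM in java_vm_ext.cc. That function is called later, so it is worth noting here.
To summarize loading the .so: first the name of the library to load, libart.so, is determined, and then dlopen is called to load it.
2.2.2 Creating the Java Virtual Machine
With the library loaded, let's see how startVm creates the Java virtual machine.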
/frameworks/base/core/jni/AndroidRuntime.cpp
C++
int AndroidRuntime::startVm(JavaVM** pJavaVM, JNIEnv** pEnv, bool zygote, bool primary_zygote)
{
JavaVMInitArgs initArgs;
// Declare the option buffers
char propBuf[PROPERTY_VALUE_MAX];
char jniOptsBuf[sizeof("-Xjniopts:")-1 + PROPERTY_VALUE_MAX];
char heapstartsizeOptsBuf[sizeof("-Xms")-1 + PROPERTY_VALUE_MAX];
...
// Read system properties, build option strings, and add them to mOptions
parseRuntimeOption("dalvik.vm.heapstartsize", heapstartsizeOptsBuf, "-Xms", "4m");
parseRuntimeOption("dalvik.vm.heapsize", heapsizeOptsBuf, "-Xmx", "16m");
parseRuntimeOption("dalvik.vm.heapgrowthlimit", heapgrowthlimitOptsBuf, "-XX:HeapGrowthLimit=");
...
// Put the parsed options into JavaVMInitArgs
initArgs.version = JNI_VERSION_1_4;
initArgs.options = mOptions.editArray();
initArgs.nOptions = mOptions.size();
initArgs.ignoreUnrecognized = JNI_FALSE;
// Call JNI_CreateJavaVM to create and start the Java virtual machine
if (JNI_CreateJavaVM(pJavaVM, pEnv, &initArgs) < 0) {
ALOGE("JNI_CreateJavaVM failed\n");
return -1;
}
return 0;
}
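AndroidRuntime::startVm mainly does four things:
- Declares the option buffers
- Reads system properties, builds option strings from them, and adds them to mOptions
- Puts the parsed options into JavaVMInitArgs
- Calls JNI_CreateJavaVM to create and start the Java virtual machine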
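The "read a property, prepend a prefix, add it to the options" step can be sketched as follows. This is only an illustration of the pattern: addRuntimeOption and the Properties map are hypothetical stand-ins for parseRuntimeOption and the system property store, and the sketch assumes that an unset property with no default contributes no option.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for the system property store.
using Properties = std::map<std::string, std::string>;

// Appends "<prefix><value>" to options. Falls back to defaultValue when the
// property is unset or empty, and adds nothing when both are empty.
bool addRuntimeOption(const Properties& props, const std::string& key,
                      const std::string& prefix, const std::string& defaultValue,
                      std::vector<std::string>& options) {
    auto it = props.find(key);
    const std::string value = (it != props.end() && !it->second.empty())
                                  ? it->second
                                  : defaultValue;
    if (value.empty()) return false;
    options.push_back(prefix + value);
    return true;
}
```

For example, with only dalvik.vm.heapsize set to 512m, the three calls shown in startVm would yield the options -Xms4m and -Xmx512m under these assumptions, and no heap growth limit option.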
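Here we only look at how JNI_CreateJavaVM creates and starts the Java virtual machine.
In section 2.2.1 we saw that JniInvocationInit calls FindSymbol(library, "JNI_CreateJavaVM") to find the function implemented in the library that provides the VM, i.e. JNI_CreateJavaVM in java_vm_ext.cc. Here is its code /art/runtime/jni/java_vm_ext.cc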
C++
extern "C" EXPORT jint JNI_CreateJavaVM(JavaVM** p_vm, JNIEnv** p_env, void* vm_args) {
...
// Create the virtual machine
if (!Runtime::Create(options, ignore_unrecognized)) {
return JNI_ERR;
}
...
// Initialize the native library loader
android::InitializeNativeLoader();
Runtime* runtime = Runtime::Current();
// Start the virtual machine
bool started = runtime->Start();
...
return JNI_OK;
}
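JNI_CreateJavaVM first calls Runtime::Create to create the virtual machine and then runtime->Start to start it. Runtime::Create performs a lot of setup through Runtime::Init; the part followed here is the creation of the JavaVMExt object: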
/art/runtime/jni/java_vm_ext.cc
C++
std::unique_ptr<JavaVMExt> JavaVMExt::Create(Runtime* runtime, const RuntimeArgumentMap& runtime_options, std::string* error_msg) {
std::unique_ptr<JavaVMExt> java_vm(new JavaVMExt(runtime, runtime_options));
if (!java_vm->Initialize(error_msg)) {
return nullptr;
}
return java_vm;
}
JavaVMExt::Create, reached from Runtime::Create, calls JavaVMExt::Initialize. Continuing:
/art/runtime/jni/java_vm_ext.cc
C++
bool JavaVMExt::Initialize(std::string* error_msg) {
return globals_.Initialize(kGlobalsMax, error_msg) &&
weak_globals_.Initialize(kWeakGlobalsMax, error_msg);
}
JavaVMExt::Initialize is simple: it initializes globals_ and weak_globals_, which are declared as follows:
/art/runtime/jni/java_vm_ext.h
C++
IndirectReferenceTable globals_;
IndirectReferenceTable weak_globals_;
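globals_ holds JNI global (strong) references, whose targets are not reclaimed by the GC; weak_globals_ holds JNI weak global references, whose targets can be reclaimed by the GC.
So the part of Runtime::Create followed here boils down to initializing globals_ and weak_globals_.
Next, let's look at runtime->Start, which starts the virtual machine.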
/art/runtime/runtime.cc
C++
bool Runtime::Start() {
...
// Register the JNI native methods the runtime needs.
RegisterRuntimeNativeMethods(self->GetJniEnv());
class_linker_->RunEarlyRootClinits(self);
InitializeIntrinsics();
self->TransitionFromRunnableToSuspended(ThreadState::kNative);
InitNativeMethods();
...
// Start daemon threads and set the system class loader
StartDaemonThreads();
...
finished_starting_ = true;
return true;
}
2.2.3 Registering JNI
After the .so is loaded and the virtual machine is started, JNI functions are registered. This code lives in AndroidRuntime::startReg, which we look at next.
/frameworks/base/core/jni/AndroidRuntime.cpp
C++
/*static*/ int AndroidRuntime::startReg(JNIEnv* env)
{
ATRACE_NAME("RegisterAndroidNatives");
// Set the thread creation function
androidSetCreateThreadFunc((android_create_thread_fn) javaCreateThreadEtc);
ALOGV("--- registering native functions ---\n");
env->PushLocalFrame(200);
// Register JNI functions
if (register_jni_procs(gRegJNI, NELEM(gRegJNI), env) < 0) {
env->PopLocalFrame(NULL);
return -1;
}
env->PopLocalFrame(NULL);
//createJavaThread("fubar", quickTest, (void*) "hello");
return 0;
}
startReg calls androidSetCreateThreadFunc and register_jni_procs. Let's look at androidSetCreateThreadFunc first.
/system/core/libutils/Threads.cpp
C++
static android_create_thread_fn gCreateThreadFn = androidCreateRawThreadEtc;
int androidCreateThreadEtc(android_thread_func_t entryFunction,
void *userData,
const char* threadName,
int32_t threadPriority,
size_t threadStackSize,
android_thread_id_t *threadId)
{
return gCreateThreadFn(entryFunction, userData, threadName,
threadPriority, threadStackSize, threadId);
}
void androidSetCreateThreadFunc(android_create_thread_fn func)
{
gCreateThreadFn = func;
}
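androidSetCreateThreadFunc simply assigns the function passed in to gCreateThreadFn, so later thread creation through androidCreateThreadEtc uses the function that was passed to androidSetCreateThreadFunc. With that covered, let's see what register_jni_procs does.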
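Before moving on, the hook pattern behind androidSetCreateThreadFunc can be sketched in isolation. This is a minimal sketch with hypothetical names: the two functions only return a tag (1 or 2) so that it is visible which implementation ran, and they play the roles of androidCreateRawThreadEtc and javaCreateThreadEtc without creating any threads.

```cpp
#include <cassert>
#include <string>

using CreateThreadFn = int (*)(const std::string& name);

// Default implementation, in the role of androidCreateRawThreadEtc.
int createRawThread(const std::string& name) { return name.empty() ? -1 : 1; }

// Replacement, in the role of javaCreateThreadEtc.
int createJavaAttachedThread(const std::string& name) { return name.empty() ? -1 : 2; }

// Global function pointer that every caller goes through.
static CreateThreadFn gCreateThreadFn = createRawThread;

// Swaps the implementation used by all later createThread calls.
void setCreateThreadFunc(CreateThreadFn fn) { gCreateThreadFn = fn; }

// Public entry point: always dispatches through the current pointer.
int createThread(const std::string& name) { return gCreateThreadFn(name); }
```

Callers keep using createThread unchanged; only the function behind the pointer differs after setCreateThreadFunc runs.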
/frameworks/base/core/jni/AndroidRuntime.cpp
C++
static int register_jni_procs(const RegJNIRec array[], size_t count, JNIEnv* env)
{
for (size_t i = 0; i < count; i++) {
if (array[i].mProc(env) < 0) {
#ifndef NDEBUG
ALOGD("----------!!! %s failed to load\n", array[i].mName);
#endif
return -1;
}
}
return 0;
}
static const RegJNIRec gRegJNI[] = {
REG_JNI(register_com_android_internal_os_RuntimeInit),
REG_JNI(register_com_android_internal_os_ZygoteInit_nativeZygoteInit),
REG_JNI(register_android_os_SystemClock),
REG_JNI(register_android_util_CharsetUtils),
...
};
In register_jni_procs, array[i].mProc(env) < 0 means a registration failed, and -1 is returned; this propagates to AndroidRuntime::start, which then returns. Let's take the first element of the gRegJNI array (register_com_android_internal_os_RuntimeInit) and see what it does /frameworks/base/core/jni/AndroidRuntime.cpp
C++
int register_com_android_internal_os_RuntimeInit(JNIEnv* env)
{
const JNINativeMethod methods[] = {
{"nativeFinishInit", "()V",
(void*)com_android_internal_os_RuntimeInit_nativeFinishInit},
{"nativeSetExitWithoutCleanup", "(Z)V",
(void*)com_android_internal_os_RuntimeInit_nativeSetExitWithoutCleanup},
};
return jniRegisterNativeMethods(env, "com/android/internal/os/RuntimeInit",
methods, NELEM(methods));
}
register_com_android_internal_os_RuntimeInit calls jniRegisterNativeMethods to register two native methods, com_android_internal_os_RuntimeInit_nativeFinishInit and com_android_internal_os_RuntimeInit_nativeSetExitWithoutCleanup. We won't follow how jniRegisterNativeMethods performs the registration.
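The table-driven registration used by register_jni_procs can be sketched without JNI. Everything here is illustrative: Env is a hypothetical stand-in for JNIEnv, RegRec mimics RegJNIRec, and the three register functions are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for JNIEnv: just records what was registered.
struct Env {
    std::vector<std::string> registered;
};

// One table entry: a registration function plus a name for diagnostics.
struct RegRec {
    int (*proc)(Env&);
    const char* name;
};

int registerRuntimeInit(Env& env) { env.registered.push_back("RuntimeInit"); return 0; }
int registerSystemClock(Env& env) { env.registered.push_back("SystemClock"); return 0; }
int registerBroken(Env&) { return -1; }  // simulates a failed registration

// Runs every entry in order and stops at the first failure.
int registerAll(const RegRec table[], std::size_t count, Env& env) {
    for (std::size_t i = 0; i < count; i++) {
        if (table[i].proc(env) < 0) return -1;
    }
    return 0;
}
```

Adding a new native module then only requires appending one entry to the table; a single failing entry aborts the whole registration.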
2.2.4 Entering Zygote's Java Code
With the steps above covered, let's finally see how control passes from AndroidRuntime into Zygote's Java code: /frameworks/base/core/jni/AndroidRuntime.cpp
C++
char* slashClassName = toSlashClassName(className != NULL ? className : "");
jclass startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);
}
}
First, let's see what toSlashClassName does.
/frameworks/base/core/jni/AndroidRuntime.cpp
C++
char* AndroidRuntime::toSlashClassName(const char* className)
{
char* result = strdup(className);
for (char* cp = result; *cp != '\0'; cp++) {
if (*cp == '.') {
*cp = '/';
}
}
return result;
}
The code is simple: it replaces . with /, so com.android.internal.os.ZygoteInit becomes com/android/internal/os/ZygoteInit.
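For comparison, the same conversion can be written with std::string. This is a sketch of an alternative, not what AndroidRuntime does: the original returns a strdup'd buffer that the caller must free, while this version returns a value.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Converts a dotted class name into the slash-separated form that
// FindClass expects. Takes the argument by value and edits the copy.
std::string toSlashClassName(std::string className) {
    std::replace(className.begin(), className.end(), '.', '/');
    return className;
}
```

Returning a std::string removes the need for a matching free() at the call site, at the cost of requiring .c_str() when passing the result to a C API.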
Then env->FindClass looks up the class, env->GetStaticMethodID finds ZygoteInit's static main method, which takes a string array, and env->CallStaticVoidMethod invokes it. That enters Zygote's Java code, i.e. ZygoteInit.main.
2.3 ZygoteInit
Entry function: /frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
Java
@UnsupportedAppUsage
public static void main(String[] argv) {
...
Runnable caller;
try {
...
boolean startSystemServer = false;
String zygoteSocketName = "zygote";
String abiList = null;
boolean enableLazyPreload = false;
for (int i = 1; i < argv.length; i++) {
// The arguments include start-system-server, so startSystemServer is true
if ("start-system-server".equals(argv[i])) {
startSystemServer = true;
} else if ("--enable-lazy-preload".equals(argv[i])) {
enableLazyPreload = true;
} else if (argv[i].startsWith(ABI_LIST_ARG)) {
abiList = argv[i].substring(ABI_LIST_ARG.length());
} else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
zygoteSocketName = argv[i].substring(SOCKET_NAME_ARG.length());
} else {
throw new RuntimeException("Unknown command line argument: " + argv[i]);
}
}
final boolean isPrimaryZygote = zygoteSocketName.equals(Zygote.PRIMARY_SOCKET_NAME);
...
// The arguments do not include --enable-lazy-preload, so enableLazyPreload is false and this block runs
if (!enableLazyPreload) {
bootTimingsTraceLog.traceBegin("ZygotePreload");
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());
// Preload resources
preload(bootTimingsTraceLog);
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());
bootTimingsTraceLog.traceEnd(); // ZygotePreload
}
// Do an initial gc to clean up after startup
bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
gcAndFinalize();
bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC
bootTimingsTraceLog.traceEnd(); // ZygoteInit
Zygote.initNativeState(isPrimaryZygote);
ZygoteHooks.stopZygoteNoThreadCreation();
zygoteServer = new ZygoteServer(isPrimaryZygote);
if (startSystemServer) {
// forkSystemServer is called here
Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);
// {@code r == null} in the parent (zygote) process, and {@code r != null} in the
// child (system_server) process.
if (r != null) {
r.run();
return;
}
}
Log.i(TAG, "Accepting command socket connections");
// Enter the loop and wait for client requests
caller = zygoteServer.runSelectLoop(abiList);
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with fatal exception", ex);
throw ex;
} finally {
if (zygoteServer != null) {
zygoteServer.closeServerSocket();
}
}
// We're in the child process and have exited the select loop. Proceed to execute the
// command.
if (caller != null) {
caller.run();
}
}
ZygoteInit.main does quite a lot; the three most important calls are preload, forkSystemServer, and runSelectLoop. Let's look at each in turn.
2.3.1 preload
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
java
static void preload(TimingsTraceLog bootTimingsTraceLog) {
...
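preloadClasses(); // Preload classes, e.g. android.app.Activity
...
Resources.preloadResources(); // Preload system resources, e.g. drawables
...
nativePreloadAppProcessHALs(); // Hardware abstraction layer (HAL)
...
preloadSharedLibraries(); // Preload shared libraries
preloadTextResources(); // Preload text/font resources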
...
}
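preload loads classes, system resources, and more in advance. We won't follow every method; let's look at what preloadResources, which preloads system resources, does.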
/frameworks/base/core/java/android/content/res/Resources.java
java
@UnsupportedAppUsage
public static void preloadResources() {
try {
final Resources sysRes = Resources.getSystem();
sysRes.startPreloading();
if (PRELOAD_RESOURCES) {
...
TypedArray ar = sysRes.obtainTypedArray(
com.android.internal.R.array.preloaded_drawables);
// Load drawables
int numberOfEntries = preloadDrawables(sysRes, ar);
...
ar = sysRes.obtainTypedArray(
com.android.internal.R.array.preloaded_color_state_lists);
// Load color state lists
numberOfEntries = preloadColorStateLists(sysRes, ar);
...
}
sysRes.finishPreloading();
} catch (RuntimeException e) {
Log.w(TAG, "Failure preloading resources", e);
}
}
preloadResources calls preloadDrawables and preloadColorStateLists. R.array.preloaded_drawables and R.array.preloaded_color_state_lists are defined as follows:
/frameworks/base/core/res/res/values/arrays.xml
.xml
<!-- Do not translate. These are all of the drawable resources that should be preloaded by
the zygote process before it starts forking application processes. -->
<array name="preloaded_drawables">
<item>@drawable/action_bar_item_background_material</item>
<item>@drawable/activated_background_material</item>
<item>@drawable/btn_borderless_material</item>
<item>@drawable/btn_check_material_anim</item>
<item>@drawable/btn_colored_material</item>
...
</array>
<!-- Do not translate. These are all of the color state list resources that should be
preloaded by the zygote process before it starts forking application processes. -->
<array name="preloaded_color_state_lists">
<item>@color/primary_text_dark</item>
<item>@color/primary_text_dark_disable_only</item>
<item>@color/primary_text_dark_nodisable</item>
<item>@color/primary_text_disable_only_holo_dark</item>
<item>@color/primary_text_disable_only_holo_light</item>
...
</array>
R.array.preloaded_drawables lists many drawables, and preloaded_color_state_lists lists many color state lists. We won't follow the loading logic; interested readers can dig into it.
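Why preloading in the parent pays off can be sketched with plain fork. This is a toy model under stated assumptions: gPreloaded and preload() are hypothetical stand-ins for Zygote's preloaded classes and resources, and the child reports what it sees through its exit code only so that the result is observable.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Stand-in for the classes and resources held by the parent process.
static std::vector<std::string> gPreloaded;

// Loads the shared data once, in the parent, before any fork.
void preload() {
    gPreloaded = {"android.app.Activity", "btn_check_material_anim"};
}

// Forks a child and reports how many preloaded entries the child can see
// without loading anything itself. Returns -1 on failure.
int preloadedEntriesSeenByChild() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: inherits a copy-on-write view of the parent's memory.
        _exit(static_cast<int>(gPreloaded.size()));
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

A child forked before preload() sees an empty list, while one forked afterwards sees both entries without repeating the load, which is the effect Zygote relies on for every app process.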
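After preload finishes, forkSystemServer is called. This is one of Zygote's most important duties: it forks its first child process, system_server, which starts and manages all system services (AMS, WMS, and so on). Only the role of system_server is outlined here; the detailed call path inside forkSystemServer is left for a later article.
With forkSystemServer, Zygote completes its job of "hatching" the core system process, and then enters the runSelectLoop loop to wait for requests to hatch ordinary app processes.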
2.3.2 runSelectLoop
After preload and forkSystemServer finish, runSelectLoop is called.
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
java
Runnable runSelectLoop(String abiList) {
// Initialization: Zygote's server socket and the peers list
ArrayList<FileDescriptor> socketFDs = new ArrayList<>();
ArrayList<ZygoteConnection> peers = new ArrayList<>();
socketFDs.add(mZygoteSocket.getFileDescriptor());
peers.add(null);
mUsapPoolRefillTriggerTimestamp = INVALID_TIMESTAMP;
while (true) {
// Refresh the USAP pool policy
fetchUsapPoolPolicyPropsWithMinInterval();
mUsapPoolRefillAction = UsapPoolRefillAction.NONE;
int[] usapPipeFDs = null;
StructPollfd[] pollFDs;
// Build the list of FDs for poll to watch
if (mUsapPoolEnabled) { // 1
usapPipeFDs = Zygote.getUsapPipeFDs();
pollFDs = new StructPollfd[socketFDs.size() + 1 + usapPipeFDs.length];
} else {
pollFDs = new StructPollfd[socketFDs.size()];
}
int pollIndex = 0;
// Add every socket FD to the poll list
for (FileDescriptor socketFD : socketFDs) { // 2
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = socketFD;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
}
final int usapPoolEventFDIndex = pollIndex;
if (mUsapPoolEnabled) { // 3
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = mUsapPoolEventFD;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
// The usapPipeFDs array will always be filled in if the USAP Pool is enabled.
assert usapPipeFDs != null;
for (int usapPipeFD : usapPipeFDs) {
FileDescriptor managedFd = new FileDescriptor();
managedFd.setInt$(usapPipeFD);
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = managedFd;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
}
}
int pollTimeoutMs;
...
int pollReturnValue;
try {
// Wait for events
pollReturnValue = Os.poll(pollFDs, pollTimeoutMs);
} catch (ErrnoException ex) {
throw new RuntimeException("poll failed", ex);
}
if (pollReturnValue == 0) {
// Timed out, no fd is readable
mUsapPoolRefillTriggerTimestamp = INVALID_TIMESTAMP;
mUsapPoolRefillAction = UsapPoolRefillAction.DELAYED;
} else {
boolean usapPoolFDRead = false;
// Process fds in reverse order
while (--pollIndex >= 0) { // 4
if ((pollFDs[pollIndex].revents & POLLIN) == 0) {
continue;
}
if (pollIndex == 0) {
// Zygote server socket
ZygoteConnection newPeer = acceptCommandPeer(abiList);
peers.add(newPeer);
socketFDs.add(newPeer.getFileDescriptor());
} else if (pollIndex < usapPoolEventFDIndex) {
// Session socket accepted from the Zygote server socket
try {
// An already-connected client socket sent a command
ZygoteConnection connection = peers.get(pollIndex);
boolean multipleForksOK = !isUsapPoolEnabled()
&& ZygoteHooks.isIndefiniteThreadSuspensionSafe();
final Runnable command =
connection.processCommand(this, multipleForksOK);
// TODO (chriswailes): Is this extra check necessary?
if (mIsForkChild) {
if (command == null) {
throw new IllegalStateException("command == null");
}
// Child process: return the command and leave the loop
return command;
} else {
// We're in the server - we should never have any commands to run.
if (command != null) {
throw new IllegalStateException("command != null");
}
// Parent process: clean up the connection and close the socket
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(pollIndex);
socketFDs.remove(pollIndex);
}
}
} catch (Exception e) {
...
} finally {
mIsForkChild = false;
}
} else {
...
}
...
}
...
}
}
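runSelectLoop is long, so we focus on the most important parts.
When an app is launched, Zygote forks a process and the app's code runs in it. To speed up app startup, some processes can be forked in advance and handed out when a launch request arrives. This is the USAP (Unspecialized App Process) mechanism.
In other words, Zygote forks some processes into a USAP pool ahead of time and takes one from the pool when an app needs to start. That is what markers 1 and 3 in the code are for. At marker 2: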
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
java
int pollIndex = 0;
// Add every socket FD to the poll list
for (FileDescriptor socketFD : socketFDs) { // 2
pollFDs[pollIndex] = new StructPollfd();
pollFDs[pollIndex].fd = socketFD;
pollFDs[pollIndex].events = (short) POLLIN;
++pollIndex;
}
This builds the list of FDs to watch, and shortly afterwards the code calls Os.poll:
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
java
pollReturnValue = Os.poll(pollFDs, pollTimeoutMs);
This calls poll to wait for events. If it does not time out, execution enters marker 4, which processes the fds in reverse order:
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
java
boolean usapPoolFDRead = false;
// Process fds in reverse order
while (--pollIndex >= 0) { // 4
if ((pollFDs[pollIndex].revents & POLLIN) == 0) {
continue;
}
if (pollIndex == 0) {
// Zygote server socket
ZygoteConnection newPeer = acceptCommandPeer(abiList);
peers.add(newPeer);
socketFDs.add(newPeer.getFileDescriptor());
} else if (pollIndex < usapPoolEventFDIndex) {
// Session socket accepted from the Zygote server socket
try {
// An already-connected client socket sent a command
ZygoteConnection connection = peers.get(pollIndex);
boolean multipleForksOK = !isUsapPoolEnabled()
&& ZygoteHooks.isIndefiniteThreadSuspensionSafe();
final Runnable command =
connection.processCommand(this, multipleForksOK);
// TODO (chriswailes): Is this extra check necessary?
if (mIsForkChild) {
if (command == null) {
throw new IllegalStateException("command == null");
}
// Child process: return the command and leave the loop
return command;
} else {
// We're in the server - we should never have any commands to run.
if (command != null) {
throw new IllegalStateException("command != null");
}
// Parent process: clean up the connection and close the socket
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(pollIndex);
socketFDs.remove(pollIndex);
}
}
} catch (Exception e) {
...
}
} else {
...
}
}
Inside this loop, (pollFDs[pollIndex].revents & POLLIN) == 0 is checked first. If the result is 0, no readable event occurred on that FD, so it is skipped.
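The poll-then-filter step can be sketched directly against POSIX poll. This is a minimal sketch, not the ZygoteServer code: readableIndices is a hypothetical helper, plain file descriptors replace StructPollfd, and errors are treated like a timeout to keep it short.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>
#include <vector>

// Polls all fds for POLLIN and returns the indices that are readable,
// highest index first (the same direction runSelectLoop walks).
std::vector<int> readableIndices(const std::vector<int>& fds, int timeoutMs) {
    std::vector<pollfd> pollFds;
    for (int fd : fds) {
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;
        pollFds.push_back(p);
    }
    std::vector<int> ready;
    int n = poll(pollFds.data(), pollFds.size(), timeoutMs);
    if (n <= 0) return ready;  // timeout or error: nothing to handle
    for (int i = static_cast<int>(pollFds.size()) - 1; i >= 0; --i) {
        if ((pollFds[i].revents & POLLIN) == 0) continue;  // not readable, skip
        ready.push_back(i);
    }
    return ready;
}
```

With two pipes where only the second has pending data, the function returns just index 1; the first pipe's revents has no POLLIN bit and is skipped.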
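Next it checks pollIndex == 0. Index 0 is the Zygote server socket, so a readable event there means a new client is connecting through Zygote's listening socket. usapPoolEventFDIndex is the value pollIndex had right after all socket FDs were added to the poll list at marker 2, so pollIndex < usapPoolEventFDIndex identifies an already-accepted session socket. For the pollIndex == 0 case:
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
java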
ZygoteConnection newPeer = acceptCommandPeer(abiList);
peers.add(newPeer);
socketFDs.add(newPeer.getFileDescriptor());
Zygote calls acceptCommandPeer(...) to create a ZygoteConnection for the connection, adds it to peers, and adds its fd to socketFDs. Then comes the main part:
/frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
java
try {
// An already-connected client socket sent a command
ZygoteConnection connection = peers.get(pollIndex);
boolean multipleForksOK = !isUsapPoolEnabled()
&& ZygoteHooks.isIndefiniteThreadSuspensionSafe();
final Runnable command =
connection.processCommand(this, multipleForksOK);
// TODO (chriswailes): Is this extra check necessary?
if (mIsForkChild) {
if (command == null) {
throw new IllegalStateException("command == null");
}
// Child process: return the command and leave the loop
return command;
} else {
// We're in the server - we should never have any commands to run.
if (command != null) {
throw new IllegalStateException("command != null");
}
// Parent process: clean up the connection and close the socket
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(pollIndex);
socketFDs.remove(pollIndex);
}
}
} catch (Exception e) {
...
}
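When pollIndex is nonzero and less than usapPoolEventFDIndex, a connected client has sent a command, and processCommand is called to handle it. If processCommand(...) returns in the child process (the one just forked), it returns a Runnable that is then executed in the child (typically the entry logic of the app process, such as ActivityThread.main(...)). In the parent, the connection is cleaned up and its socket closed if the peer has closed it, and the loop continues.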
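The parent-side cleanup removes entries from peers and socketFDs by index while the loop is still walking those indices. The source does not say why the loop runs from high to low, but one consequence, sketched here as an assumption about the design, is that removing an entry never shifts the indices that are still to be visited. Peer and dropClosedPeers are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for ZygoteConnection.
struct Peer {
    std::string name;
    bool closedByPeer;
};

// Walks indices from high to low and drops closed peers from both
// parallel lists, keeping them aligned.
void dropClosedPeers(std::vector<Peer>& peers, std::vector<int>& socketFds) {
    assert(peers.size() == socketFds.size());
    for (std::size_t i = peers.size(); i-- > 0;) {
        if (!peers[i].closedByPeer) continue;
        peers.erase(peers.begin() + static_cast<std::ptrdiff_t>(i));
        socketFds.erase(socketFds.begin() + static_cast<std::ptrdiff_t>(i));
    }
}
```

Walking upward instead would require adjusting the index after each erase, or an element directly after a removed one would be skipped.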
Let's look at the processCommand code in detail frameworks/base/core/java/com/android/internal/os/ZygoteConnection.java
java
Runnable processCommand(ZygoteServer zygoteServer, boolean multipleOK) {
ZygoteArguments parsedArgs;
try (ZygoteCommandBuffer argBuffer = new ZygoteCommandBuffer(mSocket)) {
while (true) {
...
if (parsedArgs.mInvokeWith != null || parsedArgs.mStartChildZygote || ...) {
pid = Zygote.forkAndSpecialize(parsedArgs.mUid, parsedArgs.mGid,
parsedArgs.mGids, parsedArgs.mRuntimeFlags, rlimits,
parsedArgs.mMountExternal, parsedArgs.mSeInfo, parsedArgs.mNiceName,
fdsToClose, fdsToIgnore, parsedArgs.mStartChildZygote,
parsedArgs.mInstructionSet, parsedArgs.mAppDataDir,
parsedArgs.mIsTopApp, parsedArgs.mPkgDataInfoList,
parsedArgs.mAllowlistedDataInfoList, parsedArgs.mBindMountAppDataDirs,
parsedArgs.mBindMountAppStorageDirs,
parsedArgs.mBindMountSyspropOverrides);
try {
if (pid == 0) {
// in child
zygoteServer.setForkChild();
zygoteServer.closeServerSocket();
IoUtils.closeQuietly(serverPipeFd);
serverPipeFd = null;
return handleChildProc(parsedArgs, childPipeFd,
parsedArgs.mStartChildZygote);
} else {
// In the parent. A pid < 0 indicates a failure and will be handled in
// handleParentProc.
IoUtils.closeQuietly(childPipeFd);
childPipeFd = null;
handleParentProc(pid, serverPipeFd);
return null;
}
} finally {
IoUtils.closeQuietly(childPipeFd);
IoUtils.closeQuietly(serverPipeFd);
}
} else {
...
}
}
}
...
}
In processCommand, forkAndSpecialize is called first. In the child (pid == 0), zygoteServer.closeServerSocket() then closes the server socket, and finally handleChildProc is called.
Let's look at the forkAndSpecialize code first frameworks/base/core/java/com/android/internal/os/Zygote.java
java
static int forkAndSpecialize(int uid, int gid, int[] gids, int runtimeFlags,
...
int pid = nativeForkAndSpecialize(
uid, gid, gids, runtimeFlags, rlimits, mountExternal, seInfo, niceName, fdsToClose,
fdsToIgnore, startChildZygote, instructionSet, appDataDir, isTopApp,
pkgDataInfoList, allowlistedDataInfoList, bindMountAppDataDirs,
bindMountAppStorageDirs, bindMountSyspropOverrides);
...
return pid;
}
forkAndSpecialize calls nativeForkAndSpecialize, which is a native method
/frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
C++
static jint com_android_internal_os_Zygote_nativeForkAndSpecialize(
JNIEnv* env, jclass, jint uid, jint gid, jintArray gids, jint runtime_flags,
jobjectArray rlimits, jint mount_external, jstring se_info, jstring nice_name,
jintArray managed_fds_to_close, jintArray managed_fds_to_ignore, jboolean is_child_zygote,
jstring instruction_set, jstring app_data_dir, jboolean is_top_app,
jobjectArray pkg_data_info_list, jobjectArray allowlisted_data_info_list,
jboolean mount_data_dirs, jboolean mount_storage_dirs, jboolean mount_sysprop_overrides) {
...
pid_t pid = zygote::ForkCommon(env, /* is_system_server= */ false, fds_to_close, fds_to_ignore,
true);
if (pid == 0) {
SpecializeCommon(env, uid, gid, gids, runtime_flags, rlimits, capabilities, capabilities,
bounding_capabilities, mount_external, se_info, nice_name, false,
is_child_zygote == JNI_TRUE, instruction_set, app_data_dir,
is_top_app == JNI_TRUE, pkg_data_info_list, allowlisted_data_info_list,
mount_data_dirs == JNI_TRUE, mount_storage_dirs == JNI_TRUE,
mount_sysprop_overrides == JNI_TRUE);
}
return pid;
}
com_android_internal_os_Zygote_nativeForkAndSpecialize calls zygote::ForkCommon and, in the child, SpecializeCommon. Following ForkCommon:
/frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
C++
pid_t zygote::ForkCommon(JNIEnv* env, bool is_system_server,
const std::vector<int>& fds_to_close,
const std::vector<int>& fds_to_ignore,
bool is_priority_fork,
bool purge) {
...
// Fork the process requested by the client.
pid_t pid = fork();
...
return pid;
}
The implementation of ForkCommon calls fork.
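How a single fork call yields two code paths, which is what the pid == 0 checks above rely on, can be sketched as follows. runInChild is a hypothetical helper; unlike Zygote, the parent here waits for the child, purely so that the child's result can be observed.

```cpp
#include <cassert>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks once. The child runs childWork and exits with its result; the
// parent waits and returns that exit code (or -1 if fork or wait failed).
int runInChild(const std::function<int()>& childWork) {
    pid_t pid = fork();
    if (pid < 0) return -1;   // fork failed, still in the parent
    if (pid == 0) {
        _exit(childWork());   // pid == 0: this is the child
    }
    int status = 0;           // pid > 0: this is the parent
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Both processes return from the same fork call; only the return value tells them apart, which is why the Java and native code above branch on pid immediately afterwards.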
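So when processCommand calls forkAndSpecialize, it ultimately calls fork. Afterwards, the child calls zygoteServer.closeServerSocket() to close the server socket: the child is a copy of the Zygote process and inherited the socket, which it does not need. handleChildProc then handles the child-side work, so let's follow it /frameworks/base/core/java/com/android/internal/os/ZygoteConnection.java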
java
private Runnable handleChildProc(ZygoteArguments parsedArgs,
FileDescriptor pipeFd, boolean isZygote) {
closeSocket();
Zygote.setAppProcessName(parsedArgs, TAG);
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
if (parsedArgs.mInvokeWith != null) {
WrapperInit.execApplication(parsedArgs.mInvokeWith,
parsedArgs.mNiceName, parsedArgs.mTargetSdkVersion,
VMRuntime.getCurrentInstructionSet(),
pipeFd, parsedArgs.mRemainingArgs);
// Should not get here.
throw new IllegalStateException("WrapperInit.execApplication unexpectedly returned");
} else {
if (!isZygote) {
return ZygoteInit.zygoteInit(parsedArgs.mTargetSdkVersion,
parsedArgs.mDisabledCompatChanges,
parsedArgs.mRemainingArgs, null /* classLoader */);
} else {
return ZygoteInit.childZygoteInit(
parsedArgs.mRemainingArgs /* classLoader */);
}
}
}
handleChildProc first checks parsedArgs.mInvokeWith; if it is not null, WrapperInit.execApplication runs. When AMS (ActivityManagerService) starts an app process normally, the --invoke-with argument is generally not used, so we follow the else branch. There, if isZygote is false, ZygoteInit.zygoteInit runs; otherwise ZygoteInit.childZygoteInit runs. Following ZygoteInit.zygoteInit:
/frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
java
public static Runnable zygoteInit(int targetSdkVersion, long[] disabledCompatChanges,
String[] argv, ClassLoader classLoader) {
if (RuntimeInit.DEBUG) {
Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
}
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
RuntimeInit.redirectLogStreams();
RuntimeInit.commonInit();
ZygoteInit.nativeZygoteInit();
return RuntimeInit.applicationInit(targetSdkVersion, disabledCompatChanges, argv,
classLoader);
}
zygoteInit calls RuntimeInit.applicationInit. Continuing: /frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
java
protected static Runnable applicationInit(int targetSdkVersion, long[] disabledCompatChanges,
String[] argv, ClassLoader classLoader) {
// If the application calls System.exit(), terminate the process
// immediately without running any shutdown hooks. It is not possible to
// shutdown an Android application gracefully. Among other things, the
// Android runtime shutdown hooks close the Binder driver, which can cause
// leftover running threads to crash before the process actually exits.
nativeSetExitWithoutCleanup(true);
VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
VMRuntime.getRuntime().setDisabledCompatChanges(disabledCompatChanges);
final Arguments args = new Arguments(argv);
// The end of of the RuntimeInit event (see #zygoteInit).
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
// Find the class named in the client's arguments and return a Runnable that calls its main method
return findStaticMain(args.startClass, args.startArgs, classLoader);
}
protected static Runnable findStaticMain(String className, String[] argv,
ClassLoader classLoader) {
Class<?> cl;
try {
// Return the Class object for the given class name
cl = Class.forName(className, true, classLoader);
} catch (ClassNotFoundException ex) {
throw new RuntimeException(
"Missing class when invoking static main " + className,
ex);
}
Method m;
try {
// Look up the main method of that class
m = cl.getMethod("main", new Class[] { String[].class });
} catch (NoSuchMethodException ex) {
throw new RuntimeException(
"Missing static main on " + className, ex);
} catch (SecurityException ex) {
throw new RuntimeException(
"Problem getting static main on " + className, ex);
}
int modifiers = m.getModifiers();
if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
throw new RuntimeException(
"Main method is not public and static on " + className);
}
/*
* This throw gets caught in ZygoteInit.main(), which responds
* by invoking the exception's run() method. This arrangement
* clears up all the stack frames that were required in setting
* up the process.
*/
return new MethodAndArgsCaller(m, argv);
}
applicationInit calls findStaticMain to locate the main method of the target class. findStaticMain first calls Class.forName to get the Class object for the given name, then cl.getMethod to find its main method, and finally returns a MethodAndArgsCaller object /frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
java
static class MethodAndArgsCaller implements Runnable {
/** method to call */
private final Method mMethod;
/** argument array */
private final String[] mArgs;
public MethodAndArgsCaller(Method method, String[] args) {
mMethod = method;
mArgs = args;
}
public void run() {
try {
// Invoke the main method of the target class
mMethod.invoke(null, new Object[] { mArgs });
} catch (IllegalAccessException ex) {
throw new RuntimeException(ex);
} catch (InvocationTargetException ex) {
Throwable cause = ex.getCause();
if (cause instanceof RuntimeException) {
throw (RuntimeException) cause;
} else if (cause instanceof Error) {
throw (Error) cause;
}
throw new RuntimeException(ex);
}
}
}
When the run method of MethodAndArgsCaller executes, it calls mMethod.invoke, which invokes the main method of the class the client requested.
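The "find the entry point now, call it later" shape of findStaticMain and MethodAndArgsCaller can be sketched without reflection. This is an analogy only: mainRegistry is a hypothetical lookup table standing in for Class.forName plus getMethod("main"), and demoMain is a made-up entry point.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using MainFn = int (*)(const std::vector<std::string>&);
using Runnable = std::function<int()>;

// Made-up entry point: returns the number of arguments it received.
int demoMain(const std::vector<std::string>& args) {
    return static_cast<int>(args.size());
}

// Hypothetical registry standing in for Class.forName + getMethod("main").
const std::map<std::string, MainFn>& mainRegistry() {
    static const std::map<std::string, MainFn> registry = {{"demo.Main", demoMain}};
    return registry;
}

// Looks up the entry point and returns a deferred call instead of
// invoking it, in the role of returning a MethodAndArgsCaller.
Runnable findStaticMain(const std::string& className, std::vector<std::string> args) {
    auto it = mainRegistry().find(className);
    if (it == mainRegistry().end()) {
        throw std::runtime_error("Missing static main on " + className);
    }
    MainFn fn = it->second;
    return [fn, args]() { return fn(args); };
}
```

Because the lookup returns a callable rather than running it, the caller can unwind back to its top level first and only then invoke the entry point, which mirrors caller.run() at the end of ZygoteInit.main.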
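2.4 Why Does Zygote Use a Socket Instead of Binder?
In the previous section we saw that Zygote listens for requests on a socket. Why not use Binder?
One reason is that using Binder in the Zygote process could cause deadlocks. Binder is essentially designed for multithreaded use: a process acting as a Binder server automatically gets a Binder thread pool to handle concurrent requests. A Zygote that used Binder would therefore necessarily be a multithreaded process.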
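The fork() system call on Linux is risky when duplicating a multithreaded process: it copies only the calling thread. Lock state held by other threads is inherited by the child, but the threads holding those locks do not exist there. Such locks can never be released in the child, which leads to deadlocks or other failures. A socket served from a single thread is therefore the better choice for Zygote.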
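The inherited-lock problem can be demonstrated in a controlled way. This is a demonstration sketch, not production code: childInheritsLockedMutex is a hypothetical helper, and it uses try_lock in the child so that the test finishes instead of hanging. Strictly speaking, the child of a multithreaded process should restrict itself to async-signal-safe calls, so the try_lock is there purely to make the inherited state visible.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true when a forked child finds the mutex locked even though the
// thread that locked it was not copied into the child.
bool childInheritsLockedMutex() {
    std::mutex lock;
    std::atomic<bool> held{false};
    std::atomic<bool> release{false};

    std::thread worker([&] {
        std::lock_guard<std::mutex> guard(lock);
        held = true;
        while (!release) std::this_thread::yield();
    });
    while (!held) std::this_thread::yield();  // wait until worker owns the lock

    pid_t pid = fork();                       // only the calling thread is copied
    if (pid == 0) {
        bool acquired = lock.try_lock();      // lock() here would block forever
        _exit(acquired ? 1 : 0);
    }

    release = true;                           // parent: let the worker finish
    worker.join();
    if (pid < 0) return false;
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return false;
    return WEXITSTATUS(status) == 0;
}
```

In the parent the worker eventually releases the mutex, but in the child nobody ever will, so a blocking lock() there would never return.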
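The other reason is ordering. Init creates ServiceManager first and the Zygote process afterwards, but there is no guarantee that ServiceManager has finished initializing by the time Zygote would try to register with Binder.
Note: Binder is a complex mechanism; a dedicated article on Binder will follow.
3. Summary
That completes the analysis of the Zygote process. During system initialization, Zygote initializes the virtual machine (startVm) and preloads common classes, resources, and libraries (preload()). Afterwards it uses fork to copy its own state into each new app process, which avoids re-initializing the virtual machine and reloading system resources and substantially speeds up app startup.
Thanks for reading. I hope this article helps; if anything is wrong, corrections are welcome.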