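Warning: this article contains a large amount of AOSP source code. Dizziness, blurred vision, nausea, and drowsiness while reading are normal. The author suffers from the same symptoms and therefore accepts no legal liability.
SystemServer is one of the most important processes in the Java layer of the Android system; almost all Java-layer Binder services run in this process.
SystemServer startup can be roughly divided into two stages:
- The Zygote process calls the fork system call to create the SystemServer process
- The main method of the SystemServer class runs and starts the system services
This section analyzes the first stage.
In the analysis of Zygote startup, we saw that the init.rc file defines the startup parameters of the Zygote process: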
bash
service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
    class main
    priority -20
    user root
    group root readproc reserved_disk
    socket zygote stream 660 root system
    socket usap_pool_primary stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart audioserver
    onrestart restart cameraserver
    onrestart restart media
    onrestart restart netd
    onrestart restart wificond
    writepid /dev/cpuset/foreground/tasks
The --start-system-server argument is passed here, and the app_process source code parses it:
cpp
// frameworks/base/cmds/app_process/app_main.cpp
// Values to be parsed from the arguments
bool zygote = false;
bool startSystemServer = false;
bool application = false;
String8 niceName;
String8 className;
++i;  // Skip unused "parent dir" argument.
while (i < argc) {
    const char* arg = argv[i++];
    if (strcmp(arg, "--zygote") == 0) {
        zygote = true;
        niceName = ZYGOTE_NICE_NAME;
    } else if (strcmp(arg, "--start-system-server") == 0) {
        // The --start-system-server argument is recognized here
        startSystemServer = true;
    } else if (strcmp(arg, "--application") == 0) {
        application = true;
    } else if (strncmp(arg, "--nice-name=", 12) == 0) {
        niceName.setTo(arg + 12);
    } else if (strncmp(arg, "--", 2) != 0) {
        className.setTo(arg);
        break;
    } else {
        --i;
        break;
    }
}
The loop recognizes the --start-system-server argument and sets the startSystemServer variable to true.
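To make the loop easier to experiment with, here is a rough Java port of the same parsing logic. It is only a sketch: the class AppProcessArgsSketch, the Parsed holder, and the "zygote" string used in place of ZYGOTE_NICE_NAME are assumptions made for illustration and do not exist in AOSP.

```java
import java.util.Arrays;

public class AppProcessArgsSketch {
    /** Holds the values that the app_main.cpp loop extracts from argv (illustrative only). */
    static final class Parsed {
        boolean zygote;
        boolean startSystemServer;
        boolean application;
        String niceName = "";
        String className = "";
        String[] rest = new String[0]; // arguments the loop did not consume
    }

    static Parsed parse(String[] argv, int start) {
        Parsed p = new Parsed();
        int i = start;
        while (i < argv.length) {
            String arg = argv[i++];
            if (arg.equals("--zygote")) {
                p.zygote = true;
                p.niceName = "zygote"; // assumed stand-in for ZYGOTE_NICE_NAME
            } else if (arg.equals("--start-system-server")) {
                p.startSystemServer = true;
            } else if (arg.equals("--application")) {
                p.application = true;
            } else if (arg.startsWith("--nice-name=")) {
                p.niceName = arg.substring("--nice-name=".length());
            } else if (!arg.startsWith("--")) {
                p.className = arg; // first non-option argument is the class name
                break;
            } else {
                --i; // unknown "--" option: step back and leave it for the caller
                break;
            }
        }
        p.rest = Arrays.copyOfRange(argv, i, argv.length);
        return p;
    }

    public static void main(String[] args) {
        String[] argv = {"--zygote", "--start-system-server", "--socket-name=zygote"};
        Parsed p = parse(argv, 0);
        // prints: true true [--socket-name=zygote]
        System.out.println(p.zygote + " " + p.startSystemServer + " " + Arrays.toString(p.rest));
    }
}
```

Note how --socket-name=zygote stops the loop: it starts with "--" but matches none of the known options, so the index is stepped back and the argument is left for later processing.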
When app_process reaches the main function of ZygoteInit in the Java layer, startSystemServer is true, so forkSystemServer is called:
java
// frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
if (startSystemServer) {
    // fork SystemServer
    Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);
    // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
    // child (system_server) process.
    // If r is null, we are in the zygote process: do nothing and carry on
    if (r != null) {
        // If r is not null, we are in the forked child (system_server): run it and return
        r.run();
        return;
    }
}
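forkSystemServer() ends up in the native fork system call, which starts a new process. In the new process, the main method of the target class is wrapped in a Runnable and returned; the caller then invokes the Runnable's run method, which executes that main method.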
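The "null in the parent, Runnable in the child" convention is easy to miss, so here is a tiny sketch of just that control flow. Nothing is forked: the boolean inChild is an assumed stand-in for the real pid == 0 check, and ForkResultSketch and its methods are hypothetical names.

```java
public class ForkResultSketch {
    /**
     * Stand-in for forkSystemServer(): returns null on the "parent" path and a
     * Runnable wrapping the child's entry point on the "child" path.
     */
    static Runnable forkSystemServer(boolean inChild, StringBuilder log) {
        if (inChild) {
            return () -> log.append("system_server main");
        }
        return null;
    }

    /** Mirrors the branch taken in ZygoteInit.main(). */
    static String zygoteMain(boolean inChild) {
        StringBuilder log = new StringBuilder();
        Runnable r = forkSystemServer(inChild, log);
        if (r != null) {
            r.run(); // child: run the wrapped main, then return
            return log.toString();
        }
        log.append("zygote keeps running"); // parent: carry on with its own work
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(zygoteMain(true));  // prints: system_server main
        System.out.println(zygoteMain(false)); // prints: zygote keeps running
    }
}
```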
forkSystemServer() is implemented as follows:
java
private static Runnable forkSystemServer(String abiList, String socketName,
        ZygoteServer zygoteServer) {
    // ......
    // Prepare the startup arguments for SystemServer
    String args[] = {
            "--setuid=1000",
            "--setgid=1000",
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,"
                    + "1024,1032,1065,3001,3002,3003,3006,3007,3009,3010",
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server",
            "--runtime-args",
            "--target-sdk-version=" + VMRuntime.SDK_VERSION_CUR_DEVELOPMENT,
            "com.android.server.SystemServer",
    };
    ZygoteArguments parsedArgs = null;
    int pid;
    try {
        parsedArgs = new ZygoteArguments(args);
        Zygote.applyDebuggerSystemProperty(parsedArgs);
        Zygote.applyInvokeWithSystemProperty(parsedArgs);
        boolean profileSystemServer = SystemProperties.getBoolean(
                "dalvik.vm.profilesystemserver", false);
        if (profileSystemServer) {
            parsedArgs.mRuntimeFlags |= Zygote.PROFILE_SYSTEM_SERVER;
        }
        // Call Zygote.forkSystemServer() to fork the SystemServer process
        pid = Zygote.forkSystemServer(
                parsedArgs.mUid, parsedArgs.mGid,
                parsedArgs.mGids,
                parsedArgs.mRuntimeFlags,
                null,
                parsedArgs.mPermittedCapabilities,
                parsedArgs.mEffectiveCapabilities);
    } catch (IllegalArgumentException ex) {
        throw new RuntimeException(ex);
    }
    /* For child process */
    if (pid == 0) {
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        zygoteServer.closeServerSocket();
        return handleSystemServerProcess(parsedArgs);
    }
    return null;
}
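The main flow of the function:
- Prepare the startup arguments for SystemServer
- Set the user ID (uid) and group ID (gid) to 1000
- Set the process name to system_server
- Specify the SystemServer entry class, com.android.server.SystemServer
- Call the forkSystemServer() method of the Zygote class to fork the SystemServer process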
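To get a feel for how the argument strings above turn into fields such as mUid, mGid, and mGids, here is a deliberately simplified parser. It is a sketch under assumptions: ZygoteArgsSketch is a hypothetical class, it understands only four options, and the real ZygoteArguments handles many more and validates them.

```java
import java.util.Arrays;

public class ZygoteArgsSketch {
    int uid = -1;
    int gid = -1;
    int[] gids = new int[0];
    String niceName;
    String startClass;

    /** Hypothetical, simplified parser for "--key=value" style zygote arguments. */
    static ZygoteArgsSketch parse(String[] args) {
        ZygoteArgsSketch out = new ZygoteArgsSketch();
        for (String arg : args) {
            String value = arg.substring(arg.indexOf('=') + 1);
            if (arg.startsWith("--setuid=")) {
                out.uid = Integer.parseInt(value);
            } else if (arg.startsWith("--setgid=")) {
                out.gid = Integer.parseInt(value);
            } else if (arg.startsWith("--setgroups=")) {
                out.gids = Arrays.stream(value.split(","))
                        .mapToInt(Integer::parseInt)
                        .toArray();
            } else if (arg.startsWith("--nice-name=")) {
                out.niceName = value;
            } else if (!arg.startsWith("--")) {
                out.startClass = arg; // first non-option argument is the class to run
                break;
            }
            // Any other "--" option is ignored in this sketch.
        }
        return out;
    }

    public static void main(String[] args) {
        ZygoteArgsSketch parsed = parse(new String[] {
                "--setuid=1000",
                "--setgid=1000",
                "--setgroups=1001,1002,1003",
                "--nice-name=system_server",
                "--runtime-args",
                "com.android.server.SystemServer",
        });
        // prints: 1000 1000 3 system_server com.android.server.SystemServer
        System.out.println(parsed.uid + " " + parsed.gid + " " + parsed.gids.length
                + " " + parsed.niceName + " " + parsed.startClass);
    }
}
```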
Next, let's look at the forkSystemServer() method of the Zygote class:
java
// frameworks/base/core/java/com/android/internal/os/Zygote.java
public static int forkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
        int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
    ZygoteHooks.preFork();
    // Resets nice priority for zygote process.
    resetNicePriority();
    int pid = nativeForkSystemServer(
            uid, gid, gids, runtimeFlags, rlimits,
            permittedCapabilities, effectiveCapabilities);
    // Enable tracing as soon as we enter the system_server.
    if (pid == 0) {
        Trace.setTracingEnabled(true, runtimeFlags);
    }
    ZygoteHooks.postForkCommon();
    return pid;
}
private static native int nativeForkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
        int[][] rlimits, long permittedCapabilities, long effectiveCapabilities);
Here nativeForkSystemServer is a native method; its corresponding JNI function is com_android_internal_os_Zygote_nativeForkSystemServer:
cpp
// frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint runtime_flags, jobjectArray rlimits, jlong permitted_capabilities,
        jlong effective_capabilities) {
    std::vector<int> fds_to_close(MakeUsapPipeReadFDVector()),
                     fds_to_ignore(fds_to_close);
    fds_to_close.push_back(gUsapPoolSocketFD);
    if (gUsapPoolEventFD != -1) {
        fds_to_close.push_back(gUsapPoolEventFD);
        fds_to_ignore.push_back(gUsapPoolEventFD);
    }
    // ForkCommon wraps the fork system call
    pid_t pid = ForkCommon(env, true,
                           fds_to_close,
                           fds_to_ignore);
    if (pid == 0) {
        // Configure the child process according to the arguments
        SpecializeCommon(env, uid, gid, gids, runtime_flags, rlimits,
                         permitted_capabilities, effective_capabilities,
                         MOUNT_EXTERNAL_DEFAULT, nullptr, nullptr, true,
                         false, nullptr, nullptr);
    } else if (pid > 0) {
        // The zygote process checks whether the child process has died or not.
        ALOGI("System server process %d has been created", pid);
        gSystemServerPid = pid;
        // There is a slight window that the system server process has crashed
        // but it went unnoticed because we haven't published its pid yet. So
        // we recheck here just to make sure that all is well.
        int status;
        // Zygote uses a non-blocking waitpid() to check whether system_server has
        // already died; if it has, Zygote aborts and is restarted
        if (waitpid(pid, &status, WNOHANG) == pid) {
            ALOGE("System server process %d has died. Restarting Zygote!", pid);
            RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
        }
        if (UsePerAppMemcg()) {
            // Assign system_server to the correct memory cgroup.
            // Not all devices mount memcg so check if it is mounted first
            // to avoid unnecessarily printing errors and denials in the logs.
            if (!SetTaskProfiles(pid, std::vector<std::string>{"SystemMemoryProcess"})) {
                ALOGE("couldn't add process %d into system memcg group", pid);
            }
        }
    }
    return pid;
}
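- Call ForkCommon, which wraps the fork system call. fork creates one new process and returns twice: once in the parent and once in the child
- In the child process, call SpecializeCommon to configure the child according to the arguments
- In the parent, that is, the Zygote process, call waitpid() with WNOHANG to check whether the SystemServer process has already died; if it has, the Zygote process aborts and is restarted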
Next, let's look at the implementations of ForkCommon and SpecializeCommon:
cpp
static pid_t ForkCommon(JNIEnv* env, bool is_system_server,
                        const std::vector<int>& fds_to_close,
                        const std::vector<int>& fds_to_ignore) {
    // Install the signal handlers
    SetSignalHandlers();
    // Failure handler
    auto fail_fn = std::bind(ZygoteFailure, env, is_system_server ? "system_server" : "zygote",
                             nullptr, _1);
    // Temporarily block SIGCHLD so that its handler cannot run and cause errors during the fork
    BlockSignal(SIGCHLD, fail_fn);
    // Close all logging-related file descriptors
    __android_log_close();
    stats_log_close();
    // If this is the first fork for this zygote, create the file descriptor table
    if (gOpenFdTable == nullptr) {
        gOpenFdTable = FileDescriptorTable::Create(fds_to_ignore, fail_fn);
    } else {
        // Otherwise check whether the open descriptors have changed relative to the table
        gOpenFdTable->Restat(fds_to_ignore, fail_fn);
    }
    android_fdsan_error_level fdsan_error_level = android_fdsan_get_error_level();
    pid_t pid = fork();
    if (pid == 0) {  // Child process
        // Basic initialization
        PreApplicationInit();
        DetachDescriptors(env, fds_to_close, fail_fn);
        ClearUsapTable();
        gOpenFdTable->ReopenOrDetach(fail_fn);
        android_fdsan_set_error_level(fdsan_error_level);
    } else {
        ALOGD("Forked child process %d", pid);
    }
    // We blocked SIGCHLD prior to a fork, we unblock it here.
    UnblockSignal(SIGCHLD, fail_fn);
    return pid;
}
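ForkCommon wraps the fork system call. Before forking, it also has to deal with signal handling and file descriptors. Two vectors describe the file descriptors: fds_to_close holds the descriptors the child process must close, and fds_to_ignore holds the descriptors that the file descriptor table skips when it is created or re-checked. The descriptors that the table does track are reopened in the child process (ReopenOrDetach), so the child does not share them with Zygote.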
cpp
static void SpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray gids, jint runtime_flags,
                             jobjectArray rlimits, jlong permitted_capabilities,
                             jlong effective_capabilities, jint mount_external,
                             jstring managed_se_info, jstring managed_nice_name,
                             bool is_system_server, bool is_child_zygote,
                             jstring managed_instruction_set, jstring managed_app_data_dir,
                             bool is_top_app, jobjectArray pkg_data_info_list,
                             jobjectArray allowlisted_data_info_list, bool mount_data_dirs,
                             bool mount_storage_dirs) {
    ···
    if (!is_system_server && getuid() == 0) {
        // Create the process group
        const int rc = createProcessGroup(uid, getpid());
        if (rc == -EROFS) {
            ALOGW("createProcessGroup failed, kernel missing CONFIG_CGROUP_CPUACCT?");
        } else if (rc != 0) {
            ALOGE("createProcessGroup(%d, %d) failed: %s", uid, /* pid= */ 0, strerror(-rc));
        }
    }
    // Set the supplementary group IDs
    SetGids(env, gids, is_child_zygote, fail_fn);
    // Set the resource limits
    SetRLimits(env, rlimits, fail_fn);
    if (need_pre_initialize_native_bridge) {
        // Due to the logic behind need_pre_initialize_native_bridge we know that
        // instruction_set contains a value.
        android::PreInitializeNativeBridge(app_data_dir.has_value() ? app_data_dir.value().c_str()
                                                                    : nullptr,
                                           instruction_set.value().c_str());
    }
    if (setresgid(gid, gid, gid) == -1) {
        fail_fn(CREATE_ERROR("setresgid(%d) failed: %s", gid, strerror(errno)));
    }
    SetUpSeccompFilter(uid, is_child_zygote);
    // Set the scheduling policy
    SetSchedulerPolicy(fail_fn, is_top_app);
    ···
    // Give the main thread of the child process a name
    if (nice_name.has_value()) {
        SetThreadName(nice_name.value());
    } else if (is_system_server) {
        SetThreadName("system_server");
    }
    // Restore the default handling of SIGCHLD
    UnsetChldSignalHandler();
    if (is_system_server) {
        env->CallStaticVoidMethod(gZygoteClass, gCallPostForkSystemServerHooks, runtime_flags);
        if (env->ExceptionCheck()) {
            fail_fn("Error calling post fork system server hooks.");
        }
        // TODO(b/117874058): Remove hardcoded label here.
        static const char* kSystemServerLabel = "u:r:system_server:s0";
        if (selinux_android_setcon(kSystemServerLabel) != 0) {
            fail_fn(CREATE_ERROR("selinux_android_setcon(%s)", kSystemServerLabel));
        }
    }
    if (is_child_zygote) {
        initUnsolSocketToSystemServer();
    }
    // Equivalent to calling Zygote.callPostForkChildHooks
    env->CallStaticVoidMethod(gZygoteClass, gCallPostForkChildHooks, runtime_flags,
                              is_system_server, is_child_zygote, managed_instruction_set);
    // Set the default process priority
    setpriority(PRIO_PROCESS, 0, PROCESS_PRIORITY_DEFAULT);
    if (env->ExceptionCheck()) {
        fail_fn("Error calling post fork hooks.");
    }
}
SpecializeCommon mainly configures the forked child process according to the arguments parsed earlier.
Now let's return to the forkSystemServer(String abiList, String socketName, ZygoteServer zygoteServer) function we started with:
java
private static Runnable forkSystemServer(String abiList, String socketName,
        ZygoteServer zygoteServer) {
    // ......
    try {
        // ......
        // Call Zygote.forkSystemServer() to fork the SystemServer process
        pid = Zygote.forkSystemServer(
                parsedArgs.mUid, parsedArgs.mGid,
                parsedArgs.mGids,
                parsedArgs.mRuntimeFlags,
                null,
                parsedArgs.mPermittedCapabilities,
                parsedArgs.mEffectiveCapabilities);
    } catch (IllegalArgumentException ex) {
        throw new RuntimeException(ex);
    }
    if (pid == 0) { // Child process
        if (hasSecondZygote(abiList)) {
            waitForSecondaryZygote(socketName);
        }
        zygoteServer.closeServerSocket();
        return handleSystemServerProcess(parsedArgs);
    }
    return null;
}
After the fork completes, the child process calls handleSystemServerProcess, which wraps the main method of SystemServer in a Runnable and returns it. Let's see how this function is implemented:
java
private static Runnable handleSystemServerProcess(ZygoteArguments parsedArgs) {
    // Set the umask to 0077 (the complement of the permission bits), so that
    // files created by SystemServer get at most 0700 and are accessible only
    // to the process's own user
    Os.umask(S_IRWXG | S_IRWXO);
    // Set the process name
    if (parsedArgs.mNiceName != null) {
        Process.setArgV0(parsedArgs.mNiceName);
    }
    // Set up the classpath
    final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
    if (systemServerClasspath != null) {
        if (performSystemServerDexOpt(systemServerClasspath)) {
            // Throw away the cached classloader. If we compiled here, the classloader would
            // not have had AoT-ed artifacts.
            // Note: This only works in a very special environment where selinux enforcement is
            // disabled, e.g., Mac builds.
            sCachedSystemServerClassLoader = null;
        }
        // Capturing profiles is only supported for debug or eng builds since selinux normally
        // prevents it.
        boolean profileSystemServer = SystemProperties.getBoolean(
                "dalvik.vm.profilesystemserver", false);
        if (profileSystemServer && (Build.IS_USERDEBUG || Build.IS_ENG)) {
            try {
                prepareSystemServerProfile(systemServerClasspath);
            } catch (Exception e) {
                Log.wtf(TAG, "Failed to set up system server profile", e);
            }
        }
    }
    // invokeWith is usually null
    if (parsedArgs.mInvokeWith != null) {
        //......
    } else { // The usual path
        createSystemServerClassLoader();
        ClassLoader cl = sCachedSystemServerClassLoader;
        if (cl != null) {
            Thread.currentThread().setContextClassLoader(cl);
        }
        // Look up the main method of the startup class and return it wrapped in a Runnable
        return ZygoteInit.zygoteInit(parsedArgs.mTargetSdkVersion,
                parsedArgs.mRemainingArgs, cl);
    }
    /* should never reach here */
}
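The umask step at the top of handleSystemServerProcess is plain bit arithmetic: the permission bits a new file receives are the requested mode with the umask bits cleared. The sketch below works this out for a mask of 0077. UmaskSketch and effectiveMode are hypothetical names, and the two constants simply reuse the usual POSIX bit values of S_IRWXG and S_IRWXO.

```java
public class UmaskSketch {
    // Same bit values as the POSIX constants S_IRWXG and S_IRWXO (Java octal literals).
    static final int S_IRWXG = 0070;
    static final int S_IRWXO = 0007;

    /** Permission bits a new file gets: the requested mode with the umask bits cleared. */
    static int effectiveMode(int requestedMode, int umask) {
        return requestedMode & ~umask & 0777;
    }

    public static void main(String[] args) {
        int umask = S_IRWXG | S_IRWXO; // 0077
        // A directory requested with 0777 ends up as 0700.
        System.out.println(Integer.toOctalString(effectiveMode(0777, umask))); // prints: 700
        // A regular file requested with 0666 ends up as 0600.
        System.out.println(Integer.toOctalString(effectiveMode(0666, umask))); // prints: 600
    }
}
```

In other words, 0700 is the upper bound: group and other never get any access, and the owner gets whatever the creating call asked for.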
After some configuration work, handleSystemServerProcess finally calls zygoteInit, which looks up the main method of the startup class and returns it wrapped in a Runnable.
java
// frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
public static final Runnable zygoteInit(int targetSdkVersion, String[] argv,
        ClassLoader classLoader) {
    if (RuntimeInit.DEBUG) {
        Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
    }
    Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
    RuntimeInit.redirectLogStreams();
    RuntimeInit.commonInit();
    ZygoteInit.nativeZygoteInit();
    return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
}
ZygoteInit.zygoteInit in turn calls three key methods: RuntimeInit.commonInit(), ZygoteInit.nativeZygoteInit(), and RuntimeInit.applicationInit(), and finally returns a Runnable to its caller.
commonInit() performs common initialization:
- Set KillApplicationHandler as the default UncaughtExceptionHandler
- Set the time zone
- Set the http.agent property, used by HttpURLConnection
- Reset the Android Log system
- Set socket tags through NetworkManagementSocketTagger, used for traffic statistics
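The first item in the list relies on a standard Java mechanism, Thread.setDefaultUncaughtExceptionHandler. The sketch below shows only that mechanism, under the assumption that recording a message is enough for illustration; UncaughtHandlerSketch is a hypothetical class, and this handler does not reproduce what KillApplicationHandler actually does.

```java
public class UncaughtHandlerSketch {
    /** Runs a thread that throws and returns what the default handler observed. */
    static String runFailingThread() throws InterruptedException {
        final String[] seen = new String[1];
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        // Install a process-wide handler for exceptions no thread catches itself.
        Thread.setDefaultUncaughtExceptionHandler(
                (thread, ex) -> seen[0] = thread.getName() + ": " + ex.getMessage());
        try {
            Thread t = new Thread(() -> {
                throw new IllegalStateException("boom");
            }, "worker");
            t.start();
            t.join(); // the handler has run by the time join() returns
        } finally {
            // Put the previous handler back so the sketch has no lasting side effect.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runFailingThread()); // prints: worker: boom
    }
}
```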
nativeZygoteInit() is a native method:
java
private static final native void nativeZygoteInit();
// frameworks/base/core/jni/AndroidRuntime.cpp
static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
    // gCurRuntime is the AppRuntime instance
    gCurRuntime->onZygoteInit();
}
This in turn calls the onZygoteInit() method of AppRuntime:
cpp
// frameworks/base/cmds/app_process/app_main.cpp
virtual void onZygoteInit()
{
    sp<ProcessState> proc = ProcessState::self();
    ALOGV("App process: starting thread pool.\n");
    proc->startThreadPool();
}
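I have already covered this code in the Binder articles. It initializes the Binder environment (ProcessState and the Binder thread pool) so that the new process can use Binder.
Execution then reaches applicationInit: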
java
protected static Runnable applicationInit(int targetSdkVersion, String[] argv,
        ClassLoader classLoader) {
    // If the application calls System.exit(), terminate the process
    // immediately without running any shutdown hooks. It is not possible to
    // shutdown an Android application gracefully. Among other things, the
    // Android runtime shutdown hooks close the Binder driver, which can cause
    // leftover running threads to crash before the process actually exits.
    nativeSetExitWithoutCleanup(true);
    // We want to be fairly aggressive about heap utilization, to avoid
    // holding on to a lot of memory that isn't needed.
    VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
    VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
    final Arguments args = new Arguments(argv);
    // The end of of the RuntimeInit event (see #zygoteInit).
    Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
    // Remaining arguments are passed to the start class's static main
    return findStaticMain(args.startClass, args.startArgs, classLoader);
}
- Set the VM's target heap utilization to 0.75f
- Set the target SDK version
- Call findStaticMain() to look up the main method of the Java class and wrap it in a Runnable
java
protected static Runnable findStaticMain(String className, String[] argv,
        ClassLoader classLoader) {
    Class<?> cl;
    try {
        cl = Class.forName(className, true, classLoader);
    } catch (ClassNotFoundException ex) {
        throw new RuntimeException(
                "Missing class when invoking static main " + className,
                ex);
    }
    Method m;
    try {
        m = cl.getMethod("main", new Class[] { String[].class });
    } catch (NoSuchMethodException ex) {
        throw new RuntimeException(
                "Missing static main on " + className, ex);
    } catch (SecurityException ex) {
        throw new RuntimeException(
                "Problem getting static main on " + className, ex);
    }
    int modifiers = m.getModifiers();
    if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
        throw new RuntimeException(
                "Main method is not public and static on " + className);
    }
    /*
     * This throw gets caught in ZygoteInit.main(), which responds
     * by invoking the exception's run() method. This arrangement
     * clears up all the stack frames that were required in setting
     * up the process.
     */
    return new MethodAndArgsCaller(m, argv);
}
static class MethodAndArgsCaller implements Runnable {
    private final Method mMethod;
    private final String[] mArgs;
    ......
    public void run() {
        ......
        mMethod.invoke(null, new Object[] { mArgs });
        ......
    }
}
This uses reflection to obtain the main method, which is then invoked from the run method of the Runnable.
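The same reflection pattern can be tried outside of Android with nothing but the JDK. The sketch below is an assumption-laden miniature: StaticMainSketch is a hypothetical class, FakeServer stands in for com.android.server.SystemServer, and a lambda replaces MethodAndArgsCaller.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class StaticMainSketch {
    /** Hypothetical target class standing in for com.android.server.SystemServer. */
    public static class FakeServer {
        static String lastArg;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : null;
        }
    }

    /** Looks up a public static main(String[]) and wraps the call in a Runnable. */
    static Runnable findStaticMain(String className, String[] argv, ClassLoader loader) {
        final Method m;
        try {
            Class<?> cl = Class.forName(className, true, loader);
            m = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException ex) {
            throw new RuntimeException("Cannot find static main on " + className, ex);
        }
        int modifiers = m.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("Main method is not public and static on " + className);
        }
        // Nothing is invoked yet; the caller decides when to run it.
        return () -> {
            try {
                m.invoke(null, new Object[] { argv });
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        };
    }

    public static void main(String[] args) {
        Runnable r = findStaticMain(FakeServer.class.getName(),
                new String[] {"hello"}, StaticMainSketch.class.getClassLoader());
        r.run();
        System.out.println(FakeServer.lastArg); // prints: hello
    }
}
```

Returning a Runnable instead of calling main directly lets the caller unwind the setup stack frames first and only then start the real entry point.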
Finally, back in the main function of ZygoteInit:
java
// frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
if (startSystemServer) {
    // fork SystemServer
    Runnable r = forkSystemServer(abiList, zygoteSocketName, zygoteServer);
    // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
    // child (system_server) process.
    // If r is null, we are in the zygote process: do nothing and carry on
    if (r != null) {
        // If r is not null, we are in the forked child (system_server): run it and return
        r.run();
        return;
    }
}
Here the Runnable's run method is executed, and the program enters the SystemServer class.