Java Native 方法底层原理深度解析：从 JNI 注册到 Native Wrapper 生成

本文基于 OpenJDK 17 源码，以 java.lang.Thread 的 native 方法为例，完整剖析 JNI 动态注册、Method::register_native、generate_native_wrapper 等核心机制，揭开 Java 调用 native 方法的神秘面纱。

一、从一段 Java 代码开始

java

ini 复制代码

Thread t = new Thread(() -> System.out.println("Hello"));
t.start();

Thread.start() 是一个 native 方法，它的实现位于 C++ 层。这个调用是如何从 Java 世界跨入 C/C++ 世界的？背后涉及哪些关键步骤？

注册：将 Java 方法名与 C 函数指针绑定。
调用：生成适配器代码（native wrapper），处理参数转换、状态切换、异常传递等。

本文会带着这些疑问，逐段解析 OpenJDK 17 中相关的源码。注意：全文以 Linux x86-64 平台为例，调用约定遵循 System V AMD64 ABI。

二、JNI 动态注册：以 `Thread.registerNatives` 为例

2.1 Java 侧的静态初始化

java.lang.Thread 类在类加载时会执行静态代码块：

java

csharp 复制代码

// java.base/share/classes/java/lang/Thread.java
private static native void registerNatives();
static {
    registerNatives();
}

registerNatives 是一个 native 方法，它通过静态注册（名字为 Java_java_lang_Thread_registerNatives）与 C 函数关联。但它的作用却是动态注册该类中所有其他的 native 方法。

2.2 C 侧的方法表与注册函数

在 Thread.c 中定义了方法表和注册函数：

arduino 复制代码

// jdk/src/java.base/share/native/libjava/Thread.c

// 定义 native 方法映射表：每个条目包含 Java 方法名、签名和对应的 C 函数指针
static JNINativeMethod methods[] = {
    {"start0",           "()V",        (void *)&JVM_StartThread},
    {"stop0",            "(" OBJ ")V", (void *)&JVM_StopThread},
    {"isAlive",          "()Z",        (void *)&JVM_IsThreadAlive},
    {"suspend0",         "()V",        (void *)&JVM_SuspendThread},
    {"resume0",          "()V",        (void *)&JVM_ResumeThread},
    {"setPriority0",     "(I)V",       (void *)&JVM_SetThreadPriority},
    {"yield",            "()V",        (void *)&JVM_Yield},
    {"sleep",            "(J)V",       (void *)&JVM_Sleep},
    {"currentThread",    "()" THD,     (void *)&JVM_CurrentThread},
    {"interrupt0",       "()V",        (void *)&JVM_Interrupt},
    {"holdsLock",        "(" OBJ ")Z", (void *)&JVM_HoldsLock},
    {"getThreads",       "()[" THD,    (void *)&JVM_GetAllThreads},
    {"dumpThreads",      "([" THD ")[[" STE, (void *)&JVM_DumpThreads},
    {"setNativeName",    "(" STR ")V", (void *)&JVM_SetNativeThreadName},
};

// 静态注册的入口函数，由 JVM 在类初始化时调用
JNIEXPORT void JNICALL
Java_java_lang_Thread_registerNatives(JNIEnv *env, jclass cls)
{
    // 调用 JNI 函数 RegisterNatives 进行动态注册
    (*env)->RegisterNatives(env, cls, methods, ARRAY_LENGTH(methods));
}

关键点：

JNINativeMethod 结构包含 name、signature、fnPtr。
JVM_StartThread 等是 JVM 内部实现的 C 函数，位于 jvm.cpp 中。
RegisterNatives 是 JNI 规范提供的函数，由 JVM 实现。

2.3 JVM 层 `RegisterNatives` 的实现

JNI 函数 RegisterNatives 在 JVM 中的入口是 jni_RegisterNatives（位于 jni.cpp）。下面逐段分析源码，并添加中文注释。

cpp

ini 复制代码

// src/hotspot/share/prims/jni.cpp

JNI_ENTRY(jint, jni_RegisterNatives(JNIEnv *env, jclass clazz,
                                    const JNINativeMethod *methods,
                                    jint nMethods))
  HOTSPOT_JNI_REGISTERNATIVES_ENTRY(env, clazz, (void *) methods, nMethods);
  jint ret = 0;
  DT_RETURN_MARK(RegisterNatives, jint, (const jint&)ret);

  // 1. 将 jclass 转换为 JVM 内部的 Klass* 指针
  Klass* k = java_lang_Class::as_Klass(JNIHandles::resolve_non_null(clazz));

  // 2. 检查是否需要发出警告（平台类被非引导类加载器重新注册）
  bool do_warning = false;
  if (k->is_instance_klass()) {
    oop cl = k->class_loader();
    InstanceKlass* ik = InstanceKlass::cast(k);
    if ((cl == NULL || SystemDictionary::is_platform_class_loader(cl)) &&
        ik->module()->is_named()) {
      Klass* caller = thread->security_get_caller_class(1);
      do_warning = (caller == NULL) || caller->class_loader() != cl;
    }
  }

  // 3. 遍历每个要注册的方法
  for (int index = 0; index < nMethods; index++) {
    const char* meth_name = methods[index].name;
    const char* meth_sig = methods[index].signature;
    int meth_name_len = (int)strlen(meth_name);

    // 从符号表中查找方法名和签名（必须已存在，否则类加载阶段就失败了）
    // SymbolTable::probe 只查找不创建，提高性能
    TempNewSymbol name = SymbolTable::probe(meth_name, meth_name_len);
    TempNewSymbol signature = SymbolTable::probe(meth_sig, (int)strlen(meth_sig));

    if (name == NULL || signature == NULL) {
      // 找不到符号 -> 抛出 NoSuchMethodError
      ResourceMark rm(THREAD);
      stringStream st;
      st.print("Method %s.%s%s not found", k->external_name(), meth_name, meth_sig);
      THROW_MSG_(vmSymbols::java_lang_NoSuchMethodError(), st.as_string(), -1);
    }

    if (do_warning) {
      ResourceMark rm(THREAD);
      log_warning(jni, resolve)("Re-registering of platform native method: %s.%s%s "
              "from code in a different classloader", k->external_name(), meth_name, meth_sig);
    }

    // 核心：调用 Method::register_native 完成绑定
    bool res = Method::register_native(k, name, signature,
                                       (address) methods[index].fnPtr, THREAD);
    if (!res) {
      ret = -1;
      break;
    }
  }
  return ret;
JNI_END

代码注释解读：

JNI_ENTRY/JNI_END 是宏，用于定义 JNI 函数，包含异常处理和线程状态管理。
JNIHandles::resolve_non_null 将 JNI 引用转为 oop。
SymbolTable::probe 快速查找符号，如果类加载时方法未加载则失败。
最终调用 Method::register_native 真正建立 Java 方法对象与本地函数地址的关联。

三、`Method::register_native`：绑定元数据与函数指针

Method::register_native 位于 method.cpp，它的职责是：在 Klass 中找到对应的 Method 对象，检查是否为 native 方法，然后设置其 _native_function 指针。

cpp

scss 复制代码

// src/hotspot/share/oops/method.cpp

bool Method::register_native(Klass* k, Symbol* name, Symbol* signature, address entry, TRAPS) {
  // 1. 在 Klass 中查找方法（通过名称和签名）
  Method* method = k->lookup_method(name, signature);
  if (method == NULL) {
    ResourceMark rm(THREAD);
    stringStream st;
    st.print("Method '");
    print_external_name(&st, k, name, signature);
    st.print("' name or signature does not match");
    THROW_MSG_(vmSymbols::java_lang_NoSuchMethodError(), st.as_string(), false);
  }

  // 2. 如果不是 native 方法，尝试查找加了前缀的版本（用于 JVM TI 代理）
  if (!method->is_native()) {
    method = find_prefixed_native(k, name, signature, THREAD);
    if (method == NULL) {
      ResourceMark rm(THREAD);
      stringStream st;
      st.print("Method '");
      print_external_name(&st, k, name, signature);
      st.print("' is not declared as native");
      THROW_MSG_(vmSymbols::java_lang_NoSuchMethodError(), st.as_string(), false);
    }
  }

  // 3. 设置或清除 native 函数指针
  if (entry != NULL) {
    // 第二个参数表示是否要发送 JVMTI 事件（通常为 true）
    method->set_native_function(entry, native_bind_event_is_interesting);
  } else {
    method->clear_native_function();
  }

  // 记录调试日志
  if (log_is_enabled(Debug, jni, resolve)) {
    ResourceMark rm(THREAD);
    log_debug(jni, resolve)("[Registering JNI native method %s.%s]",
                            method->method_holder()->external_name(),
                            method->name()->as_C_string());
  }
  return true;
}

关键点：

lookup_method 遍历类的虚函数表（vtable）和 itable 查找方法。
find_prefixed_native 用于 JVM TI 代理可能给方法名添加前缀（如 prefix_methodName）的场景。
set_native_function 最终将函数地址写入 Method 对象。

3.1 `set_native_function` 的内存写入

cpp

scss 复制代码

// method.hpp 中定义
address* native_function_addr() const {
  assert(is_native(), "must be native");
  // Method 对象之后紧跟着一个指针槽位，用来存放 native 函数地址
  return (address*) (this + 1);
}

// method.cpp
void Method::set_native_function(address function, bool post_event_flag) {
  assert(function != NULL, "use clear_native_function to unregister");
  address* native_function = native_function_addr();

  address current = *native_function;
  if (current == function) return;   // 已经指向同一地址

  // 如果 JVMTI 需要发送 native 方法绑定事件，则允许代理修改 function 指针
  if (post_event_flag && JvmtiExport::should_post_native_method_bind() &&
      function != NULL) {
    JvmtiExport::post_native_method_bind(this, &function);
  }

  // 原子更新（实际是简单赋值，因为该字段只在此处修改）
  *native_function = function;

  // 如果该方法已经有编译后的 native wrapper（nmethod），需要使其失效
  // 因为旧的 wrapper 可能内联了旧的函数地址
  CompiledMethod* nm = code();
  if (nm != NULL) {
    nm->make_not_entrant();
  }
}

解释：

Method 对象的内存布局中，紧跟着对象头部的是 _native_function 指针（只对 native 方法有意义）。这个指针通过 native_function_addr() 计算得到。
当函数地址发生变化（比如重新注册），需要让已编译的 native wrapper（nmethod）失效，因为 wrapper 中可能直接调用了旧地址。

至此，注册阶段完成：Java 层的 Thread.start0() 已经被绑定到 JVM_StartThread 函数地址，并且该地址存储在 Method 对象的 _native_function 字段中。

四、Native 方法的调用触发：解释器 / JIT 的入口

当一个 native 方法被调用时，如果还未生成 native wrapper，JVM 会调用 SharedRuntime::generate_native_wrapper 为该方法生成一段特定的机器码（nmethod），这段代码负责：

将 Java 层参数转换为 C 调用约定。
切换线程状态（_thread_in_Java → _thread_in_native）。
调用实际的 C 函数。
处理返回值和异常。
恢复状态并返回。

我们接下来深度解析 generate_native_wrapper，这也是整个 native 调用机制中最复杂的部分。

五、`SharedRuntime::generate_native_wrapper` 全面解析

该函数定义在 src/hotspot/share/runtime/sharedRuntime.cpp（x86_64 版本）。由于代码非常长，我们将其分成几个逻辑阶段讲解。

5.1 函数签名与初始检查

cpp

ini 复制代码

nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
                                                const methodHandle& method,
                                                int compile_id,
                                                BasicType* in_sig_bt,
                                                VMRegPair* in_regs,
                                                BasicType ret_type,
                                                address critical_entry) {
  // 处理 MethodHandle 内部 intrinsic 的特殊情况（非本文重点）
  if (method->is_method_handle_intrinsic()) {
    // ... 省略 ...
  }

  bool is_critical_native = true;
  address native_func = critical_entry;
  if (native_func == NULL) {
    native_func = method->native_function();   // 获取之前注册的函数地址
    is_critical_native = false;
  }
  assert(native_func != NULL, "must have function");

method 是 methodHandle，代表当前 native 方法。
in_sig_bt 和 in_regs 描述了 Java 调用约定下参数的类型和存放位置（寄存器或栈）。
critical_entry 用于 Critical Native（一种优化，直接传递原始数组指针），此处不深入。
获取真正的 C 函数地址：method->native_function()。

5.2 计算参数个数与 C 调用约定签名

Java 方法有隐含参数：JNIEnv*（总是第一个）和（若是静态方法）jclass。因此需要构建 C 函数的参数类型数组 out_sig_bt。

cpp

ini 复制代码

  const int total_in_args = method->size_of_parameters();          // Java 参数个数
  int total_c_args = total_in_args;
  if (!is_critical_native) {
    total_c_args += 1;                    // +1 for JNIEnv*
    if (method->is_static()) {
      total_c_args++;                     // +1 for jclass
    }
  } else {
    // Critical native 的特殊处理：数组会展开为 (length, pointer)
    for (int i = 0; i < total_in_args; i++) {
      if (in_sig_bt[i] == T_ARRAY) {
        total_c_args++;
      }
    }
  }

  BasicType* out_sig_bt = NEW_RESOURCE_ARRAY(BasicType, total_c_args);
  VMRegPair* out_regs   = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args);

  int argc = 0;
  if (!is_critical_native) {
    out_sig_bt[argc++] = T_ADDRESS;       // JNIEnv*
    if (method->is_static()) {
      out_sig_bt[argc++] = T_OBJECT;      // jclass
    }
    for (int i = 0; i < total_in_args ; i++ ) {
      out_sig_bt[argc++] = in_sig_bt[i];
    }
  } else {
    // ... 省略 critical native 的签名构造 ...
  }

  // 根据 C 调用约定确定每个参数应该放在哪个寄存器或栈上
  int out_arg_slots = c_calling_convention(out_sig_bt, out_regs, NULL, total_c_args);

c_calling_convention 根据平台（Linux x64 用 System V AMD64 ABI）决定前几个整数参数用 RDI, RSI, RDX, RCX, R8, R9，浮点参数用 XMM0-XMM7，多余参数压栈。
返回的 out_arg_slots 是栈上所需的总槽位数（每个 slot 8 字节）。

5.3 计算栈帧大小

需要容纳：ABI 保留区、C 参数需要的栈空间、保存寄存器（如传入的 Java 参数寄存器）、同步锁的 lock record、静态方法的 klass 槽等。

cpp

ini 复制代码

  // 基本栈大小 = ABI 保留区 + C 参数需要的栈空间
  int stack_slots = SharedRuntime::out_preserve_stack_slots() + out_arg_slots;

  // 用于保存传入寄存器中的 oop 句柄的空间（最多 6 个寄存器参数）
  int total_save_slots = 6 * VMRegImpl::slots_per_word;
  // ... 对于 critical native 会重新计算实际使用的寄存器数量
  int oop_handle_offset = stack_slots;
  stack_slots += total_save_slots;

  // 静态方法需要额外空间存放 klass 的 handle
  if (method->is_static()) {
    klass_slot_offset = stack_slots;
    stack_slots += VMRegImpl::slots_per_word;
    klass_offset = klass_slot_offset * VMRegImpl::stack_slot_size;
    is_static = true;
  }

  // 同步方法需要 lock record
  if (method->is_synchronized()) {
    lock_slot_offset = stack_slots;
    stack_slots += VMRegImpl::slots_per_word;
  }

  // 一些临时 slot（用于参数移动时的交换）
  stack_slots += 6;

  // 对齐栈（16 字节）
  stack_slots = align_up(stack_slots, StackAlignmentInSlots);
  int stack_size = stack_slots * VMRegImpl::stack_slot_size;

内存布局（从高地址到低地址）：

text

csharp 复制代码

[rbp]          <- 旧的 rbp
[return addr]
[6个临时slot]
[lock record (若同步)]
[klass handle (若静态)]
[oop handle area]   <- 保存从寄存器中来的对象引用
[C 参数所需栈空间]
[ABI 保留区]
[rsp]

5.4 生成入口代码：栈溢出检查、帧建立

cpp

scss 复制代码

  // 内联缓存检查（确保接收者的 klass 与预期的相符）
  const Register ic_reg = rax;
  const Register receiver = j_rarg0;   // 第一个参数寄存器（通常是 RDI）
  __ load_klass(rscratch1, receiver, rscratch2);
  __ cmpq(ic_reg, rscratch1);
  __ jcc(Assembler::equal, hit);
  __ jump(RuntimeAddress(SharedRuntime::get_ic_miss_stub()));
  __ bind(hit);

  // 栈溢出检查
  __ bang_stack_with_offset((int)StackOverflow::stack_shadow_zone_size());

  // 建立新栈帧
  __ enter();                         // push rbp ; mov rbp, rsp
  __ subptr(rsp, stack_size - 2*wordSize);

ic_reg 是内联缓存寄存器，在首次调用时包含预期 klass。
bang_stack_with_offset 防止栈溢出（触碰保护页）。
enter 指令生成标准的栈帧。

5.5 "大洗牌"：将 Java 参数移动到 C 参数位置

这是最核心的部分：根据输入位置 in_regs 和输出位置 out_regs，生成移动指令。由于可能存在寄存器重叠（比如同一个寄存器既保存了某个 Java 参数，又是某个 C 参数的目标），需要合理安排移动顺序，甚至使用临时寄存器。

cpp

ini 复制代码

  GrowableArray<int> arg_order(2 * total_in_args);
  VMRegPair tmp_vmreg;
  tmp_vmreg.set2(rbx->as_VMReg());   // 使用 RBX 作为临时寄存器

  if (!is_critical_native) {
    // 普通 JNI：从后向前移动可以避免覆盖尚未移动的值
    for (int i = total_in_args - 1, c_arg = total_c_args - 1; i >= 0; i--, c_arg--) {
      arg_order.push(i);
      arg_order.push(c_arg);
    }
  } else {
    // Critical native 需要用算法计算无冲突的移动顺序
    ComputeMoveOrder cmo(...);
  }

  int temploc = -1;
  for (int ai = 0; ai < arg_order.length(); ai += 2) {
    int i = arg_order.at(ai);          // Java 参数索引
    int c_arg = arg_order.at(ai + 1);  // C 参数索引
    if (c_arg == -1) {
      // 需要将参数移动到临时位置
      __ mov(tmp_vmreg.first()->as_Register(), in_regs[i].first()->as_Register());
      in_regs[i] = tmp_vmreg;
      temploc = i;
      continue;
    } else if (i == -1) {
      i = temploc;
      temploc = -1;
    }

    switch (in_sig_bt[i]) {
      case T_OBJECT:
        // 对象引用需要转换为 JNI 本地引用，并保存在 oop handle area
        __ object_move(map, oop_handle_offset, stack_slots, in_regs[i], out_regs[c_arg],
                       ((i == 0) && (!is_static)), &receiver_offset);
        break;
      case T_LONG:
        __ long_move(in_regs[i], out_regs[c_arg]);
        break;
      // 其他类型类似...
    }
  }

object_move 会为对象创建 JNI 本地引用（handlize），并在 OopMap 中标记位置，以便 GC 扫描。
long_move、float_move 等处理基本类型的宽度和寄存器类别差异。

5.6 同步方法的锁获取

对于 synchronized native 方法（Thread.start() 不是 synchronized，但其他方法可能有），需要在调用前获取锁。

cpp

scss 复制代码

  if (method->is_synchronized()) {
    // 获取对象的 oop（非静态为 this，静态为 klass mirror）
    __ movptr(obj_reg, Address(oop_handle_reg, 0));
    if (UseBiasedLocking) {
      __ biased_locking_enter(lock_reg, obj_reg, swap_reg, rscratch1, rscratch2, false, lock_done, &slow_path_lock);
    }
    // 尝试轻量级锁：CAS 对象头中的 mark word
    __ movptr(swap_reg, 1);
    __ orptr(swap_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
    __ movptr(Address(lock_reg, mark_word_offset), swap_reg);
    __ lock();
    __ cmpxchgptr(lock_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
    __ jcc(Assembler::equal, lock_done);
    // 慢路径：膨胀为重量级锁
    // ...
  }

5.7 设置线程状态并调用 native 函数

在调用 C 函数前，必须将线程状态改为 _thread_in_native，并设置 last_Java_frame 以便 GC 可以扫描栈。

cpp

scss 复制代码

  intptr_t the_pc = (intptr_t) __ pc();
  oop_maps->add_gc_map(the_pc - start, map);

  __ set_last_Java_frame(rsp, noreg, (address)the_pc);

  if (!is_critical_native) {
    // 将 JNIEnv* 放入第一个参数寄存器（RDI）
    __ lea(c_rarg0, Address(r15_thread, in_bytes(JavaThread::jni_environment_offset())));
    // 设置线程状态
    __ movl(Address(r15_thread, JavaThread::thread_state_offset()), _thread_in_native);
  }

  // 调用真正的 native 函数
  __ call(RuntimeAddress(native_func));

set_last_Java_frame 存储当前栈指针和程序计数器到 JavaThread 对象中，便于 GC 时扫描栈上的 oop。
c_rarg0 在 Linux x64 上是 RDI，对应 C 函数的第一个参数 JNIEnv*。
thread_state 字段保证 GC 不会在 native 代码执行时暂停线程（因为 native 代码不操作 Java 堆）。

5.8 从 native 返回后的状态切换与安全点检查

cpp

scss 复制代码

  Label after_transition;
  if (is_critical_native) {
    // 检查安全点或挂起请求
    __ safepoint_poll(needs_safepoint, r15_thread, false, false);
    __ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
    __ jcc(Assembler::equal, after_transition);
    __ bind(needs_safepoint);
  }

  // 过渡状态
  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), _thread_in_native_trans);
  __ membar(Assembler::StoreLoad);   // 确保状态变更对其他线程可见

  // 检查安全点或挂起
  {
    Label slow_path;
    __ safepoint_poll(slow_path, r15_thread, true, false);
    __ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
    __ jcc(Assembler::equal, Continue);
    __ bind(slow_path);
    // 调用 C 函数处理挂起和安全点
    save_native_result(masm, ret_type, stack_slots);
    __ mov(c_rarg0, r15_thread);
    __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans)));
    restore_native_result(masm, ret_type, stack_slots);
    __ bind(Continue);
  }

  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), _thread_in_Java);
  __ bind(after_transition);

从 native 返回后，JVM 需要检查是否在 native 执行期间发生了 GC 请求（安全点）或线程挂起请求。先切换到 _thread_in_native_trans 状态，然后调用 check_special_condition_for_native_trans 处理。
最后恢复状态为 _thread_in_Java。

5.9 处理返回值与异常

cpp

scss 复制代码

  // 解包 native 结果（例如，将 jint 零扩展为 Java int）
  switch (ret_type) {
    case T_BOOLEAN: __ c2bool(rax); break;
    case T_BYTE:    __ sign_extend_byte(rax); break;
    // ...
  }

  // 解锁（如果是同步方法）...
  // 处理异常：检查 JavaThread::pending_exception
  __ cmpptr(Address(r15_thread, Thread::pending_exception_offset()), (int32_t)NULL_WORD);
  __ jcc(Assembler::notEqual, exception_pending);

  // 清除 JNI 本地引用表
  __ movptr(rcx, Address(r15_thread, JavaThread::active_handles_offset()));
  __ movl(Address(rcx, JNIHandleBlock::top_offset_in_bytes()), 0);

  __ leave();
  __ ret(0);

  // 异常处理代码
  __ bind(exception_pending);
  __ jump(RuntimeAddress(StubRoutines::forward_exception_entry()));

如果返回值是对象引用，还需要调用 resolve_jobject 将 JNI 本地引用转为 Java 可用的 oop。
异常处理会跳转到 StubRoutines::forward_exception_entry()，最终抛出异常。

5.10 生成 nmethod 并返回

cpp

css 复制代码

  nmethod *nm = nmethod::new_native_nmethod(method,
                                            compile_id,
                                            masm->code(),
                                            vep_offset,
                                            frame_complete,
                                            stack_slots / VMRegImpl::slots_per_word,
                                            (is_static ? in_ByteSize(klass_offset) : in_ByteSize(receiver_offset)),
                                            in_ByteSize(lock_slot_offset*VMRegImpl::stack_slot_size),
                                            oop_maps);
  return nm;

new_native_nmethod 会创建 nmethod 对象，并将生成的机器码嵌入其中。
返回的 nmethod 会被安装到 Method 对象中，后续调用直接执行该 wrapper。

六、完整调用链路总结

Java 代码调用 Thread.start() → start0()（native）。
解释器/JIT 发现 Method 的 _from_compiled_entry 还未设置，调用 SharedRuntime::generate_native_wrapper。
Wrapper 生成：
- 计算 C 调用约定的参数位置。
- 移动参数（handlize 对象）。
- 建立栈帧、保存寄存器。
- 设置 last_Java_frame、切换线程状态。
- 调用 native_func（即 JVM_StartThread）。
- 返回后处理安全点、恢复状态、处理结果。
JVM_StartThread 在 jvm.cpp 中实现，最终调用操作系统的线程创建函数（如 pthread_create）。
控制权回到 wrapper，然后返回到 Java 代码。

七、总结与延伸思考

通过以上源码分析，我们可以得出以下结论：

JNI 动态注册 使用 RegisterNatives 方法表，比静态命名查找更高效、灵活。
Method 对象 通过额外的指针槽位存储 native 函数地址，允许运行时重新绑定。
Native wrapper 是 JVM 自动生成的适配层，负责解决 Java 与 C 调用约定不匹配的问题，并处理 GC、安全点、异常等复杂机制。
参数传递 的"大洗牌"算法确保了即使寄存器有重叠，也能正确移动所有参数。

了解这些底层实现，不仅能帮助我们编写更高效的 JNI 代码，还能在遇到 UnsatisfiedLinkError、SIGSEGV 等疑难杂症时，从根源上分析问题。例如，如果 JNI 代码错误地缓存了 jclass 或 jmethodID 而未考虑类卸载，就可能导致崩溃------理解了 JNIHandles 和 oopMap 的作用后，就能明白为什么需要正确使用全局引用。

Native 方法的背后，是 JVM 精心设计的一套安全、高效的跨语言调用机制。希望本文能帮助读者揭开这一神秘面纱，为进一步研究 OpenJDK 源码打下基础。 #源码

static 复制代码

    {"start0",           "()V",        (void *)&JVM_StartThread},
    {"stop0",            "(" OBJ ")V", (void *)&JVM_StopThread},
    {"isAlive",          "()Z",        (void *)&JVM_IsThreadAlive},
    {"suspend0",         "()V",        (void *)&JVM_SuspendThread},
    {"resume0",          "()V",        (void *)&JVM_ResumeThread},
    {"setPriority0",     "(I)V",       (void *)&JVM_SetThreadPriority},
    {"yield",            "()V",        (void *)&JVM_Yield},
    {"sleep",            "(J)V",       (void *)&JVM_Sleep},
    {"currentThread",    "()" THD,     (void *)&JVM_CurrentThread},
    {"interrupt0",       "()V",        (void *)&JVM_Interrupt},
    {"holdsLock",        "(" OBJ ")Z", (void *)&JVM_HoldsLock},
    {"getThreads",        "()[" THD,   (void *)&JVM_GetAllThreads},
    {"dumpThreads",      "([" THD ")[[" STE, (void *)&JVM_DumpThreads},
    {"setNativeName",    "(" STR ")V", (void *)&JVM_SetNativeThreadName},
};

JNIEXPORT void JNICALL
Java_java_lang_Thread_registerNatives(JNIEnv *env, jclass cls)
{
    (*env)->RegisterNatives(env, cls, methods, ARRAY_LENGTH(methods));
}

#注册JNI
JNI_ENTRY(jint, jni_RegisterNatives(JNIEnv *env, jclass clazz,
                                    const JNINativeMethod *methods,
                                    jint nMethods))
  HOTSPOT_JNI_REGISTERNATIVES_ENTRY(env, clazz, (void *) methods, nMethods);
  jint ret = 0;
  DT_RETURN_MARK(RegisterNatives, jint, (const jint&)ret);

  Klass* k = java_lang_Class::as_Klass(JNIHandles::resolve_non_null(clazz));

  // There are no restrictions on native code registering native methods,
  // which allows agents to redefine the bindings to native methods, however
  // we issue a warning if any code running outside of the boot/platform
  // loader is rebinding any native methods in classes loaded by the
  // boot/platform loader that are in named modules. That will catch changes
  // to platform classes while excluding classes added to the bootclasspath.
  bool do_warning = false;

  // Only instanceKlasses can have native methods
  if (k->is_instance_klass()) {
    oop cl = k->class_loader();
    InstanceKlass* ik = InstanceKlass::cast(k);
    // Check for a platform class
    if ((cl ==  NULL || SystemDictionary::is_platform_class_loader(cl)) &&
        ik->module()->is_named()) {
      Klass* caller = thread->security_get_caller_class(1);
      // If no caller class, or caller class has a different loader, then
      // issue a warning below.
      do_warning = (caller == NULL) || caller->class_loader() != cl;
    }
  }


  for (int index = 0; index < nMethods; index++) {
    const char* meth_name = methods[index].name;
    const char* meth_sig = methods[index].signature;
    int meth_name_len = (int)strlen(meth_name);

    // The class should have been loaded (we have an instance of the class
    // passed in) so the method and signature should already be in the symbol
    // table.  If they're not there, the method doesn't exist.
    TempNewSymbol  name = SymbolTable::probe(meth_name, meth_name_len);
    TempNewSymbol  signature = SymbolTable::probe(meth_sig, (int)strlen(meth_sig));

    if (name == NULL || signature == NULL) {
      ResourceMark rm(THREAD);
      stringStream st;
      st.print("Method %s.%s%s not found", k->external_name(), meth_name, meth_sig);
      // Must return negative value on failure
      THROW_MSG_(vmSymbols::java_lang_NoSuchMethodError(), st.as_string(), -1);
    }

    if (do_warning) {
      ResourceMark rm(THREAD);
      log_warning(jni, resolve)("Re-registering of platform native method: %s.%s%s "
              "from code in a different classloader", k->external_name(), meth_name, meth_sig);
    }

    bool res = Method::register_native(k, name, signature,
                                       (address) methods[index].fnPtr, THREAD);
    if (!res) {
      ret = -1;
      break;
    }
  }
  return ret;
JNI_END


bool Method::register_native(Klass* k, Symbol* name, Symbol* signature, address entry, TRAPS) {
  Method* method = k->lookup_method(name, signature);
  if (method == NULL) {
    ResourceMark rm(THREAD);
    stringStream st;
    st.print("Method '");
    print_external_name(&st, k, name, signature);
    st.print("' name or signature does not match");
    THROW_MSG_(vmSymbols::java_lang_NoSuchMethodError(), st.as_string(), false);
  }
  if (!method->is_native()) {
    // trying to register to a non-native method, see if a JVM TI agent has added prefix(es)
    method = find_prefixed_native(k, name, signature, THREAD);
    if (method == NULL) {
      ResourceMark rm(THREAD);
      stringStream st;
      st.print("Method '");
      print_external_name(&st, k, name, signature);
      st.print("' is not declared as native");
      THROW_MSG_(vmSymbols::java_lang_NoSuchMethodError(), st.as_string(), false);
    }
  }

  if (entry != NULL) {
    method->set_native_function(entry, native_bind_event_is_interesting);
  } else {
    method->clear_native_function();
  }
  if (log_is_enabled(Debug, jni, resolve)) {
    ResourceMark rm(THREAD);
    log_debug(jni, resolve)("[Registering JNI native method %s.%s]",
                            method->method_holder()->external_name(),
                            method->name()->as_C_string());
  }
  return true;
}


void Method::set_native_function(address function, bool post_event_flag) {
  assert(function != NULL, "use clear_native_function to unregister natives");
  assert(!is_method_handle_intrinsic() || function == SharedRuntime::native_method_throw_unsatisfied_link_error_entry(), "");
  address* native_function = native_function_addr();

  // We can see racers trying to place the same native function into place. Once
  // is plenty.
  address current = *native_function;
  if (current == function) return;
  if (post_event_flag && JvmtiExport::should_post_native_method_bind() &&
      function != NULL) {
    // native_method_throw_unsatisfied_link_error_entry() should only
    // be passed when post_event_flag is false.
    assert(function !=
      SharedRuntime::native_method_throw_unsatisfied_link_error_entry(),
      "post_event_flag mis-match");

    // post the bind event, and possible change the bind function
    JvmtiExport::post_native_method_bind(this, &function);
  }
  *native_function = function;
  // This function can be called more than once. We must make sure that we always
  // use the latest registered method -> check if a stub already has been generated.
  // If so, we have to make it not_entrant.
  CompiledMethod* nm = code(); // Put it into local variable to guard against concurrent updates
  if (nm != NULL) {
    nm->make_not_entrant();
  }
}

address* native_function_addr() const          { assert(is_native(), "must be native"); return (address*) (this+1); }

// ---------------------------------------------------------------------------
// Generate a native wrapper for a given method.  The method takes arguments
// in the Java compiled code convention, marshals them to the native
// convention (handlizes oops, etc), transitions to native, makes the call,
// returns to java state (possibly blocking), unhandlizes any result and
// returns.
//
// Critical native functions are a shorthand for the use of
// GetPrimtiveArrayCritical and disallow the use of any other JNI
// functions.  The wrapper is expected to unpack the arguments before
// passing them to the callee. Critical native functions leave the state _in_Java,
// since they cannot stop for GC.
// Some other parts of JNI setup are skipped like the tear down of the JNI handle
// block and the check for pending exceptions it's impossible for them
// to be thrown.
//
nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
                                                const methodHandle& method,
                                                int compile_id,
                                                BasicType* in_sig_bt,
                                                VMRegPair* in_regs,
                                                BasicType ret_type,
                                                address critical_entry) {
  if (method->is_method_handle_intrinsic()) {
    vmIntrinsics::ID iid = method->intrinsic_id();
    intptr_t start = (intptr_t)__ pc();
    int vep_offset = ((intptr_t)__ pc()) - start;
    gen_special_dispatch(masm,
                         method,
                         in_sig_bt,
                         in_regs);
    int frame_complete = ((intptr_t)__ pc()) - start;  // not complete, period
    __ flush();
    int stack_slots = SharedRuntime::out_preserve_stack_slots();  // no out slots at all, actually
    return nmethod::new_native_nmethod(method,
                                       compile_id,
                                       masm->code(),
                                       vep_offset,
                                       frame_complete,
                                       stack_slots / VMRegImpl::slots_per_word,
                                       in_ByteSize(-1),
                                       in_ByteSize(-1),
                                       (OopMapSet*)NULL);
  }
  bool is_critical_native = true;
  address native_func = critical_entry;
  if (native_func == NULL) {
    native_func = method->native_function();
    is_critical_native = false;
  }
  assert(native_func != NULL, "must have function");

  // An OopMap for lock (and class if static)
  OopMapSet *oop_maps = new OopMapSet();
  intptr_t start = (intptr_t)__ pc();

  // We have received a description of where all the java arg are located
  // on entry to the wrapper. We need to convert these args to where
  // the jni function will expect them. To figure out where they go
  // we convert the java signature to a C signature by inserting
  // the hidden arguments as arg[0] and possibly arg[1] (static method)

  const int total_in_args = method->size_of_parameters();
  int total_c_args = total_in_args;
  if (!is_critical_native) {
    total_c_args += 1;
    if (method->is_static()) {
      total_c_args++;
    }
  } else {
    for (int i = 0; i < total_in_args; i++) {
      if (in_sig_bt[i] == T_ARRAY) {
        total_c_args++;
      }
    }
  }

  BasicType* out_sig_bt = NEW_RESOURCE_ARRAY(BasicType, total_c_args);
  VMRegPair* out_regs   = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args);
  BasicType* in_elem_bt = NULL;

  int argc = 0;
  if (!is_critical_native) {
    out_sig_bt[argc++] = T_ADDRESS;
    if (method->is_static()) {
      out_sig_bt[argc++] = T_OBJECT;
    }

    for (int i = 0; i < total_in_args ; i++ ) {
      out_sig_bt[argc++] = in_sig_bt[i];
    }
  } else {
    in_elem_bt = NEW_RESOURCE_ARRAY(BasicType, total_in_args);
    SignatureStream ss(method->signature());
    for (int i = 0; i < total_in_args ; i++ ) {
      if (in_sig_bt[i] == T_ARRAY) {
        // Arrays are passed as int, elem* pair
        out_sig_bt[argc++] = T_INT;
        out_sig_bt[argc++] = T_ADDRESS;
        ss.skip_array_prefix(1);  // skip one '['
        assert(ss.is_primitive(), "primitive type expected");
        in_elem_bt[i] = ss.type();
      } else {
        out_sig_bt[argc++] = in_sig_bt[i];
        in_elem_bt[i] = T_VOID;
      }
      if (in_sig_bt[i] != T_VOID) {
        assert(in_sig_bt[i] == ss.type() ||
               in_sig_bt[i] == T_ARRAY, "must match");
        ss.next();
      }
    }
  }

  // Now figure out where the args must be stored and how much stack space
  // they require.
  int out_arg_slots;
  out_arg_slots = c_calling_convention(out_sig_bt, out_regs, NULL, total_c_args);

  // Compute framesize for the wrapper.  We need to handlize all oops in
  // incoming registers

  // Calculate the total number of stack slots we will need.

  // First count the abi requirement plus all of the outgoing args
  int stack_slots = SharedRuntime::out_preserve_stack_slots() + out_arg_slots;

  // Now the space for the inbound oop handle area
  int total_save_slots = 6 * VMRegImpl::slots_per_word;  // 6 arguments passed in registers
  if (is_critical_native) {
    // Critical natives may have to call out so they need a save area
    // for register arguments.
    int double_slots = 0;
    int single_slots = 0;
    for ( int i = 0; i < total_in_args; i++) {
      if (in_regs[i].first()->is_Register()) {
        const Register reg = in_regs[i].first()->as_Register();
        switch (in_sig_bt[i]) {
          case T_BOOLEAN:
          case T_BYTE:
          case T_SHORT:
          case T_CHAR:
          case T_INT:  single_slots++; break;
          case T_ARRAY:  // specific to LP64 (7145024)
          case T_LONG: double_slots++; break;
          default:  ShouldNotReachHere();
        }
      } else if (in_regs[i].first()->is_XMMRegister()) {
        switch (in_sig_bt[i]) {
          case T_FLOAT:  single_slots++; break;
          case T_DOUBLE: double_slots++; break;
          default:  ShouldNotReachHere();
        }
      } else if (in_regs[i].first()->is_FloatRegister()) {
        ShouldNotReachHere();
      }
    }
    total_save_slots = double_slots * 2 + single_slots;
    // align the save area
    if (double_slots != 0) {
      stack_slots = align_up(stack_slots, 2);
    }
  }

  int oop_handle_offset = stack_slots;
  stack_slots += total_save_slots;

  // Now any space we need for handlizing a klass if static method

  int klass_slot_offset = 0;
  int klass_offset = -1;
  int lock_slot_offset = 0;
  bool is_static = false;

  if (method->is_static()) {
    klass_slot_offset = stack_slots;
    stack_slots += VMRegImpl::slots_per_word;
    klass_offset = klass_slot_offset * VMRegImpl::stack_slot_size;
    is_static = true;
  }

  // Plus a lock if needed

  if (method->is_synchronized()) {
    lock_slot_offset = stack_slots;
    stack_slots += VMRegImpl::slots_per_word;
  }

  // Now a place (+2) to save return values or temp during shuffling
  // + 4 for return address (which we own) and saved rbp
  stack_slots += 6;

  // Ok The space we have allocated will look like:
  //
  //
  // FP-> |                     |
  //      |---------------------|
  //      | 2 slots for moves   |
  //      |---------------------|
  //      | lock box (if sync)  |
  //      |---------------------| <- lock_slot_offset
  //      | klass (if static)   |
  //      |---------------------| <- klass_slot_offset
  //      | oopHandle area      |
  //      |---------------------| <- oop_handle_offset (6 java arg registers)
  //      | outbound memory     |
  //      | based arguments     |
  //      |                     |
  //      |---------------------|
  //      |                     |
  // SP-> | out_preserved_slots |
  //
  //


  // Now compute actual number of stack words we need rounding to make
  // stack properly aligned.
  stack_slots = align_up(stack_slots, StackAlignmentInSlots);

  int stack_size = stack_slots * VMRegImpl::stack_slot_size;

  // First thing make an ic check to see if we should even be here

  // We are free to use all registers as temps without saving them and
  // restoring them except rbp. rbp is the only callee save register
  // as far as the interpreter and the compiler(s) are concerned.


  const Register ic_reg = rax;
  const Register receiver = j_rarg0;

  Label hit;
  Label exception_pending;

  assert_different_registers(ic_reg, receiver, rscratch1);
  __ verify_oop(receiver);
  __ load_klass(rscratch1, receiver, rscratch2);
  __ cmpq(ic_reg, rscratch1);
  __ jcc(Assembler::equal, hit);

  __ jump(RuntimeAddress(SharedRuntime::get_ic_miss_stub()));

  // Verified entry point must be aligned
  __ align(8);

  __ bind(hit);

  int vep_offset = ((intptr_t)__ pc()) - start;

  if (VM_Version::supports_fast_class_init_checks() && method->needs_clinit_barrier()) {
    Label L_skip_barrier;
    Register klass = r10;
    __ mov_metadata(klass, method->method_holder()); // InstanceKlass*
    __ clinit_barrier(klass, r15_thread, &L_skip_barrier /*L_fast_path*/);

    __ jump(RuntimeAddress(SharedRuntime::get_handle_wrong_method_stub())); // slow path

    __ bind(L_skip_barrier);
  }

#ifdef COMPILER1
  // For Object.hashCode, System.identityHashCode try to pull hashCode from object header if available.
  if ((InlineObjectHash && method->intrinsic_id() == vmIntrinsics::_hashCode) || (method->intrinsic_id() == vmIntrinsics::_identityHashCode)) {
    inline_check_hashcode_from_object_header(masm, method, j_rarg0 /*obj_reg*/, rax /*result*/);
  }
#endif // COMPILER1

  // The instruction at the verified entry point must be 5 bytes or longer
  // because it can be patched on the fly by make_non_entrant. The stack bang
  // instruction fits that requirement.

  // Generate stack overflow check
  __ bang_stack_with_offset((int)StackOverflow::stack_shadow_zone_size());

  // Generate a new frame for the wrapper.
  __ enter();
  // -2 because return address is already present and so is saved rbp
  __ subptr(rsp, stack_size - 2*wordSize);

  BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
  bs->nmethod_entry_barrier(masm);

  // Frame is now completed as far as size and linkage.
  int frame_complete = ((intptr_t)__ pc()) - start;

    if (UseRTMLocking) {
      // Abort RTM transaction before calling JNI
      // because critical section will be large and will be
      // aborted anyway. Also nmethod could be deoptimized.
      __ xabort(0);
    }

#ifdef ASSERT
    {
      Label L;
      __ mov(rax, rsp);
      __ andptr(rax, -16); // must be 16 byte boundary (see amd64 ABI)
      __ cmpptr(rax, rsp);
      __ jcc(Assembler::equal, L);
      __ stop("improperly aligned stack");
      __ bind(L);
    }
#endif /* ASSERT */


  // We use r14 as the oop handle for the receiver/klass
  // It is callee save so it survives the call to native

  const Register oop_handle_reg = r14;

  //
  // We immediately shuffle the arguments so that any vm call we have to
  // make from here on out (sync slow path, jvmti, etc.) we will have
  // captured the oops from our caller and have a valid oopMap for
  // them.

  // -----------------
  // The Grand Shuffle

  // The Java calling convention is either equal (linux) or denser (win64) than the
  // c calling convention. However the because of the jni_env argument the c calling
  // convention always has at least one more (and two for static) arguments than Java.
  // Therefore if we move the args from java -> c backwards then we will never have
  // a register->register conflict and we don't have to build a dependency graph
  // and figure out how to break any cycles.
  //

  // Record esp-based slot for receiver on stack for non-static methods
  int receiver_offset = -1;

  // This is a trick. We double the stack slots so we can claim
  // the oops in the caller's frame. Since we are sure to have
  // more args than the caller doubling is enough to make
  // sure we can capture all the incoming oop args from the
  // caller.
  //
  OopMap* map = new OopMap(stack_slots * 2, 0 /* arg_slots*/);

  // Mark location of rbp (someday)
  // map->set_callee_saved(VMRegImpl::stack2reg( stack_slots - 2), stack_slots * 2, 0, vmreg(rbp));

  // Use eax, ebx as temporaries during any memory-memory moves we have to do
  // All inbound args are referenced based on rbp and all outbound args via rsp.


#ifdef ASSERT
  bool reg_destroyed[RegisterImpl::number_of_registers];
  bool freg_destroyed[XMMRegisterImpl::number_of_registers];
  for ( int r = 0 ; r < RegisterImpl::number_of_registers ; r++ ) {
    reg_destroyed[r] = false;
  }
  for ( int f = 0 ; f < XMMRegisterImpl::number_of_registers ; f++ ) {
    freg_destroyed[f] = false;
  }

#endif /* ASSERT */

  // This may iterate in two different directions depending on the
  // kind of native it is.  The reason is that for regular JNI natives
  // the incoming and outgoing registers are offset upwards and for
  // critical natives they are offset down.
  GrowableArray<int> arg_order(2 * total_in_args);

  VMRegPair tmp_vmreg;
  tmp_vmreg.set2(rbx->as_VMReg());

  if (!is_critical_native) {
    for (int i = total_in_args - 1, c_arg = total_c_args - 1; i >= 0; i--, c_arg--) {
      arg_order.push(i);
      arg_order.push(c_arg);
    }
  } else {
    // Compute a valid move order, using tmp_vmreg to break any cycles
    ComputeMoveOrder cmo(total_in_args, in_regs, total_c_args, out_regs, in_sig_bt, arg_order, tmp_vmreg);
  }

  int temploc = -1;
  for (int ai = 0; ai < arg_order.length(); ai += 2) {
    int i = arg_order.at(ai);
    int c_arg = arg_order.at(ai + 1);
    __ block_comment(err_msg("move %d -> %d", i, c_arg));
    if (c_arg == -1) {
      assert(is_critical_native, "should only be required for critical natives");
      // This arg needs to be moved to a temporary
      __ mov(tmp_vmreg.first()->as_Register(), in_regs[i].first()->as_Register());
      in_regs[i] = tmp_vmreg;
      temploc = i;
      continue;
    } else if (i == -1) {
      assert(is_critical_native, "should only be required for critical natives");
      // Read from the temporary location
      assert(temploc != -1, "must be valid");
      i = temploc;
      temploc = -1;
    }
#ifdef ASSERT
    if (in_regs[i].first()->is_Register()) {
      assert(!reg_destroyed[in_regs[i].first()->as_Register()->encoding()], "destroyed reg!");
    } else if (in_regs[i].first()->is_XMMRegister()) {
      assert(!freg_destroyed[in_regs[i].first()->as_XMMRegister()->encoding()], "destroyed reg!");
    }
    if (out_regs[c_arg].first()->is_Register()) {
      reg_destroyed[out_regs[c_arg].first()->as_Register()->encoding()] = true;
    } else if (out_regs[c_arg].first()->is_XMMRegister()) {
      freg_destroyed[out_regs[c_arg].first()->as_XMMRegister()->encoding()] = true;
    }
#endif /* ASSERT */
    switch (in_sig_bt[i]) {
      case T_ARRAY:
        if (is_critical_native) {
          unpack_array_argument(masm, in_regs[i], in_elem_bt[i], out_regs[c_arg + 1], out_regs[c_arg]);
          c_arg++;
#ifdef ASSERT
          if (out_regs[c_arg].first()->is_Register()) {
            reg_destroyed[out_regs[c_arg].first()->as_Register()->encoding()] = true;
          } else if (out_regs[c_arg].first()->is_XMMRegister()) {
            freg_destroyed[out_regs[c_arg].first()->as_XMMRegister()->encoding()] = true;
          }
#endif
          break;
        }
      case T_OBJECT:
        assert(!is_critical_native, "no oop arguments");
        __ object_move(map, oop_handle_offset, stack_slots, in_regs[i], out_regs[c_arg],
                    ((i == 0) && (!is_static)),
                    &receiver_offset);
        break;
      case T_VOID:
        break;

      case T_FLOAT:
        __ float_move(in_regs[i], out_regs[c_arg]);
          break;

      case T_DOUBLE:
        assert( i + 1 < total_in_args &&
                in_sig_bt[i + 1] == T_VOID &&
                out_sig_bt[c_arg+1] == T_VOID, "bad arg list");
        __ double_move(in_regs[i], out_regs[c_arg]);
        break;

      case T_LONG :
        __ long_move(in_regs[i], out_regs[c_arg]);
        break;

      case T_ADDRESS: assert(false, "found T_ADDRESS in java args");

      default:
        __ move32_64(in_regs[i], out_regs[c_arg]);
    }
  }

  int c_arg;

  // Pre-load a static method's oop into r14.  Used both by locking code and
  // the normal JNI call code.
  if (!is_critical_native) {
    // point c_arg at the first arg that is already loaded in case we
    // need to spill before we call out
    c_arg = total_c_args - total_in_args;

    if (method->is_static()) {

      //  load oop into a register
      __ movoop(oop_handle_reg, JNIHandles::make_local(method->method_holder()->java_mirror()));

      // Now handlize the static class mirror it's known not-null.
      __ movptr(Address(rsp, klass_offset), oop_handle_reg);
      map->set_oop(VMRegImpl::stack2reg(klass_slot_offset));

      // Now get the handle
      __ lea(oop_handle_reg, Address(rsp, klass_offset));
      // store the klass handle as second argument
      __ movptr(c_rarg1, oop_handle_reg);
      // and protect the arg if we must spill
      c_arg--;
    }
  } else {
    // For JNI critical methods we need to save all registers in save_args.
    c_arg = 0;
  }

  // Change state to native (we save the return address in the thread, since it might not
  // be pushed on the stack when we do a a stack traversal). It is enough that the pc()
  // points into the right code segment. It does not have to be the correct return pc.
  // We use the same pc/oopMap repeatedly when we call out

  intptr_t the_pc = (intptr_t) __ pc();
  oop_maps->add_gc_map(the_pc - start, map);

  __ set_last_Java_frame(rsp, noreg, (address)the_pc);


  // We have all of the arguments setup at this point. We must not touch any register
  // argument registers at this point (what if we save/restore them there are no oop?

  {
    SkipIfEqual skip(masm, &DTraceMethodProbes, false);
    // protect the args we've loaded
    save_args(masm, total_c_args, c_arg, out_regs);
    __ mov_metadata(c_rarg1, method());
    __ call_VM_leaf(
      CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_entry),
      r15_thread, c_rarg1);
    restore_args(masm, total_c_args, c_arg, out_regs);
  }

  // RedefineClasses() tracing support for obsolete method entry
  if (log_is_enabled(Trace, redefine, class, obsolete)) {
    // protect the args we've loaded
    save_args(masm, total_c_args, c_arg, out_regs);
    __ mov_metadata(c_rarg1, method());
    __ call_VM_leaf(
      CAST_FROM_FN_PTR(address, SharedRuntime::rc_trace_method_entry),
      r15_thread, c_rarg1);
    restore_args(masm, total_c_args, c_arg, out_regs);
  }

  // Lock a synchronized method

  // Register definitions used by locking and unlocking

  const Register swap_reg = rax;  // Must use rax for cmpxchg instruction
  const Register obj_reg  = rbx;  // Will contain the oop
  const Register lock_reg = r13;  // Address of compiler lock object (BasicLock)
  const Register old_hdr  = r13;  // value of old header at unlock time

  Label slow_path_lock;
  Label lock_done;

  if (method->is_synchronized()) {
    assert(!is_critical_native, "unhandled");


    const int mark_word_offset = BasicLock::displaced_header_offset_in_bytes();

    // Get the handle (the 2nd argument)
    __ mov(oop_handle_reg, c_rarg1);

    // Get address of the box

    __ lea(lock_reg, Address(rsp, lock_slot_offset * VMRegImpl::stack_slot_size));

    // Load the oop from the handle
    __ movptr(obj_reg, Address(oop_handle_reg, 0));

    if (UseBiasedLocking) {
      __ biased_locking_enter(lock_reg, obj_reg, swap_reg, rscratch1, rscratch2, false, lock_done, &slow_path_lock);
    }

    // Load immediate 1 into swap_reg %rax
    __ movl(swap_reg, 1);

    // Load (object->mark() | 1) into swap_reg %rax
    __ orptr(swap_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));

    // Save (object->mark() | 1) into BasicLock's displaced header
    __ movptr(Address(lock_reg, mark_word_offset), swap_reg);

    // src -> dest iff dest == rax else rax <- dest
    __ lock();
    __ cmpxchgptr(lock_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
    __ jcc(Assembler::equal, lock_done);

    // Hmm should this move to the slow path code area???

    // Test if the oopMark is an obvious stack pointer, i.e.,
    //  1) (mark & 3) == 0, and
    //  2) rsp <= mark < mark + os::pagesize()
    // These 3 tests can be done by evaluating the following
    // expression: ((mark - rsp) & (3 - os::vm_page_size())),
    // assuming both stack pointer and pagesize have their
    // least significant 2 bits clear.
    // NOTE: the oopMark is in swap_reg %rax as the result of cmpxchg

    __ subptr(swap_reg, rsp);
    __ andptr(swap_reg, 3 - os::vm_page_size());

    // Save the test result, for recursive case, the result is zero
    __ movptr(Address(lock_reg, mark_word_offset), swap_reg);
    __ jcc(Assembler::notEqual, slow_path_lock);

    // Slow path will re-enter here

    __ bind(lock_done);
  }

  // Finally just about ready to make the JNI call

  // get JNIEnv* which is first argument to native
  if (!is_critical_native) {
    __ lea(c_rarg0, Address(r15_thread, in_bytes(JavaThread::jni_environment_offset())));

    // Now set thread in native
    __ movl(Address(r15_thread, JavaThread::thread_state_offset()), _thread_in_native);
  }

  __ call(RuntimeAddress(native_func));

  // Verify or restore cpu control state after JNI call
  __ restore_cpu_control_state_after_jni();

  // Unpack native results.
  switch (ret_type) {
  case T_BOOLEAN: __ c2bool(rax);            break;
  case T_CHAR   : __ movzwl(rax, rax);      break;
  case T_BYTE   : __ sign_extend_byte (rax); break;
  case T_SHORT  : __ sign_extend_short(rax); break;
  case T_INT    : /* nothing to do */        break;
  case T_DOUBLE :
  case T_FLOAT  :
    // Result is in xmm0 we'll save as needed
    break;
  case T_ARRAY:                 // Really a handle
  case T_OBJECT:                // Really a handle
      break; // can't de-handlize until after safepoint check
  case T_VOID: break;
  case T_LONG: break;
  default       : ShouldNotReachHere();
  }

  Label after_transition;

  // If this is a critical native, check for a safepoint or suspend request after the call.
  // If a safepoint is needed, transition to native, then to native_trans to handle
  // safepoints like the native methods that are not critical natives.
  if (is_critical_native) {
    Label needs_safepoint;
    __ safepoint_poll(needs_safepoint, r15_thread, false /* at_return */, false /* in_nmethod */);
    __ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
    __ jcc(Assembler::equal, after_transition);
    __ bind(needs_safepoint);
  }

  // Switch thread to "native transition" state before reading the synchronization state.
  // This additional state is necessary because reading and testing the synchronization
  // state is not atomic w.r.t. GC, as this scenario demonstrates:
  //     Java thread A, in _thread_in_native state, loads _not_synchronized and is preempted.
  //     VM thread changes sync state to synchronizing and suspends threads for GC.
  //     Thread A is resumed to finish this native method, but doesn't block here since it
  //     didn't see any synchronization is progress, and escapes.
  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), _thread_in_native_trans);

  // Force this write out before the read below
  __ membar(Assembler::Membar_mask_bits(
              Assembler::LoadLoad | Assembler::LoadStore |
              Assembler::StoreLoad | Assembler::StoreStore));

  // check for safepoint operation in progress and/or pending suspend requests
  {
    Label Continue;
    Label slow_path;

    __ safepoint_poll(slow_path, r15_thread, true /* at_return */, false /* in_nmethod */);

    __ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
    __ jcc(Assembler::equal, Continue);
    __ bind(slow_path);

    // Don't use call_VM as it will see a possible pending exception and forward it
    // and never return here preventing us from clearing _last_native_pc down below.
    // Also can't use call_VM_leaf either as it will check to see if rsi & rdi are
    // preserved and correspond to the bcp/locals pointers. So we do a runtime call
    // by hand.
    //
    __ vzeroupper();
    save_native_result(masm, ret_type, stack_slots);
    __ mov(c_rarg0, r15_thread);
    __ mov(r12, rsp); // remember sp
    __ subptr(rsp, frame::arg_reg_save_area_bytes); // windows
    __ andptr(rsp, -16); // align stack as required by ABI
    __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans)));
    __ mov(rsp, r12); // restore sp
    __ reinit_heapbase();
    // Restore any method result value
    restore_native_result(masm, ret_type, stack_slots);
    __ bind(Continue);
  }

  // change thread state
  __ movl(Address(r15_thread, JavaThread::thread_state_offset()), _thread_in_Java);
  __ bind(after_transition);

  Label reguard;
  Label reguard_done;
  __ cmpl(Address(r15_thread, JavaThread::stack_guard_state_offset()), StackOverflow::stack_guard_yellow_reserved_disabled);
  __ jcc(Assembler::equal, reguard);
  __ bind(reguard_done);

  // native result if any is live

  // Unlock
  Label unlock_done;
  Label slow_path_unlock;
  if (method->is_synchronized()) {

    // Get locked oop from the handle we passed to jni
    __ movptr(obj_reg, Address(oop_handle_reg, 0));

    Label done;

    if (UseBiasedLocking) {
      __ biased_locking_exit(obj_reg, old_hdr, done);
    }

    // Simple recursive lock?

    __ cmpptr(Address(rsp, lock_slot_offset * VMRegImpl::stack_slot_size), (int32_t)NULL_WORD);
    __ jcc(Assembler::equal, done);

    // Must save rax if if it is live now because cmpxchg must use it
    if (ret_type != T_FLOAT && ret_type != T_DOUBLE && ret_type != T_VOID) {
      save_native_result(masm, ret_type, stack_slots);
    }


    // get address of the stack lock
    __ lea(rax, Address(rsp, lock_slot_offset * VMRegImpl::stack_slot_size));
    //  get old displaced header
    __ movptr(old_hdr, Address(rax, 0));

    // Atomic swap old header if oop still contains the stack lock
    __ lock();
    __ cmpxchgptr(old_hdr, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
    __ jcc(Assembler::notEqual, slow_path_unlock);

    // slow path re-enters here
    __ bind(unlock_done);
    if (ret_type != T_FLOAT && ret_type != T_DOUBLE && ret_type != T_VOID) {
      restore_native_result(masm, ret_type, stack_slots);
    }

    __ bind(done);

  }
  {
    SkipIfEqual skip(masm, &DTraceMethodProbes, false);
    save_native_result(masm, ret_type, stack_slots);
    __ mov_metadata(c_rarg1, method());
    __ call_VM_leaf(
         CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_exit),
         r15_thread, c_rarg1);
    restore_native_result(masm, ret_type, stack_slots);
  }

  __ reset_last_Java_frame(false);

  // Unbox oop result, e.g. JNIHandles::resolve value.
  if (is_reference_type(ret_type)) {
    __ resolve_jobject(rax /* value */,
                       r15_thread /* thread */,
                       rcx /* tmp */);
  }

  if (CheckJNICalls) {
    // clear_pending_jni_exception_check
    __ movptr(Address(r15_thread, JavaThread::pending_jni_exception_check_fn_offset()), NULL_WORD);
  }

  if (!is_critical_native) {
    // reset handle block
    __ movptr(rcx, Address(r15_thread, JavaThread::active_handles_offset()));
    __ movl(Address(rcx, JNIHandleBlock::top_offset_in_bytes()), (int32_t)NULL_WORD);
  }

  // pop our frame

  __ leave();

  if (!is_critical_native) {
    // Any exception pending?
    __ cmpptr(Address(r15_thread, in_bytes(Thread::pending_exception_offset())), (int32_t)NULL_WORD);
    __ jcc(Assembler::notEqual, exception_pending);
  }

  // Return

  __ ret(0);

  // Unexpected paths are out of line and go here

  if (!is_critical_native) {
    // forward the exception
    __ bind(exception_pending);

    // and forward the exception
    __ jump(RuntimeAddress(StubRoutines::forward_exception_entry()));
  }

  // Slow path locking & unlocking
  if (method->is_synchronized()) {

    // BEGIN Slow path lock
    __ bind(slow_path_lock);

    // has last_Java_frame setup. No exceptions so do vanilla call not call_VM
    // args are (oop obj, BasicLock* lock, JavaThread* thread)

    // protect the args we've loaded
    save_args(masm, total_c_args, c_arg, out_regs);

    __ mov(c_rarg0, obj_reg);
    __ mov(c_rarg1, lock_reg);
    __ mov(c_rarg2, r15_thread);

    // Not a leaf but we have last_Java_frame setup as we want
    __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_locking_C), 3);
    restore_args(masm, total_c_args, c_arg, out_regs);

#ifdef ASSERT
    { Label L;
    __ cmpptr(Address(r15_thread, in_bytes(Thread::pending_exception_offset())), (int32_t)NULL_WORD);
    __ jcc(Assembler::equal, L);
    __ stop("no pending exception allowed on exit from monitorenter");
    __ bind(L);
    }
#endif
    __ jmp(lock_done);

    // END Slow path lock

    // BEGIN Slow path unlock
    __ bind(slow_path_unlock);

    // If we haven't already saved the native result we must save it now as xmm registers
    // are still exposed.
    __ vzeroupper();
    if (ret_type == T_FLOAT || ret_type == T_DOUBLE ) {
      save_native_result(masm, ret_type, stack_slots);
    }

    __ lea(c_rarg1, Address(rsp, lock_slot_offset * VMRegImpl::stack_slot_size));

    __ mov(c_rarg0, obj_reg);
    __ mov(c_rarg2, r15_thread);
    __ mov(r12, rsp); // remember sp
    __ subptr(rsp, frame::arg_reg_save_area_bytes); // windows
    __ andptr(rsp, -16); // align stack as required by ABI

    // Save pending exception around call to VM (which contains an EXCEPTION_MARK)
    // NOTE that obj_reg == rbx currently
    __ movptr(rbx, Address(r15_thread, in_bytes(Thread::pending_exception_offset())));
    __ movptr(Address(r15_thread, in_bytes(Thread::pending_exception_offset())), (int32_t)NULL_WORD);

    // args are (oop obj, BasicLock* lock, JavaThread* thread)
    __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C)));
    __ mov(rsp, r12); // restore sp
    __ reinit_heapbase();
#ifdef ASSERT
    {
      Label L;
      __ cmpptr(Address(r15_thread, in_bytes(Thread::pending_exception_offset())), (int)NULL_WORD);
      __ jcc(Assembler::equal, L);
      __ stop("no pending exception allowed on exit complete_monitor_unlocking_C");
      __ bind(L);
    }
#endif /* ASSERT */

    __ movptr(Address(r15_thread, in_bytes(Thread::pending_exception_offset())), rbx);

    if (ret_type == T_FLOAT || ret_type == T_DOUBLE ) {
      restore_native_result(masm, ret_type, stack_slots);
    }
    __ jmp(unlock_done);

    // END Slow path unlock

  } // synchronized

  // SLOW PATH Reguard the stack if needed

  __ bind(reguard);
  __ vzeroupper();
  save_native_result(masm, ret_type, stack_slots);
  __ mov(r12, rsp); // remember sp
  __ subptr(rsp, frame::arg_reg_save_area_bytes); // windows
  __ andptr(rsp, -16); // align stack as required by ABI
  __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, SharedRuntime::reguard_yellow_pages)));
  __ mov(rsp, r12); // restore sp
  __ reinit_heapbase();
  restore_native_result(masm, ret_type, stack_slots);
  // and continue
  __ jmp(reguard_done);



  __ flush();

  nmethod *nm = nmethod::new_native_nmethod(method,
                                            compile_id,
                                            masm->code(),
                                            vep_offset,
                                            frame_complete,
                                            stack_slots / VMRegImpl::slots_per_word,
                                            (is_static ? in_ByteSize(klass_offset) : in_ByteSize(receiver_offset)),
                                            in_ByteSize(lock_slot_offset*VMRegImpl::stack_slot_size),
                                            oop_maps);

  return nm;
}

address native_function() const                { return *(native_function_addr()); }

Java Native 方法底层原理深度解析：从 JNI 注册到 Native Wrapper 生成

一、从一段 Java 代码开始

二、JNI 动态注册：以 Thread.registerNatives 为例

2.1 Java 侧的静态初始化

2.2 C 侧的方法表与注册函数

2.3 JVM 层 RegisterNatives 的实现

三、Method::register_native：绑定元数据与函数指针

3.1 set_native_function 的内存写入

四、Native 方法的调用触发：解释器 / JIT 的入口

五、SharedRuntime::generate_native_wrapper 全面解析

5.1 函数签名与初始检查

5.2 计算参数个数与 C 调用约定签名

5.3 计算栈帧大小

5.4 生成入口代码：栈溢出检查、帧建立

5.5 "大洗牌"：将 Java 参数移动到 C 参数位置

5.6 同步方法的锁获取

5.7 设置线程状态并调用 native 函数

5.8 从 native 返回后的状态切换与安全点检查

5.9 处理返回值与异常

5.10 生成 nmethod 并返回

六、完整调用链路总结

七、总结与延伸思考

二、JNI 动态注册：以 `Thread.registerNatives` 为例

2.3 JVM 层 `RegisterNatives` 的实现

三、`Method::register_native`：绑定元数据与函数指针

3.1 `set_native_function` 的内存写入

五、`SharedRuntime::generate_native_wrapper` 全面解析