How Paused Threads Are Woken After a Safepoint Completes

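In the OpenJDK 17 source, waking paused threads after a safepoint completes involves several key steps, concentrated in SafepointSynchronize::end() and disarm_safepoint(). The complete wake-up flow is analyzed below: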


1. Safepoint exit entry point: SafepointSynchronize::end()

  • Call path
    VMThread::inner_execute() → SafepointSynchronize::end() (called once the safepoint operation has been evaluated).

  • Core operations

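    cpp

    // Abridged; the full function appears in the source listing at the end.
    void SafepointSynchronize::end() {
      EventSafepointEnd event;
      assert(Thread::current()->is_VM_thread(), "Only VM thread can execute a safepoint");
      disarm_safepoint();           // leave the safepoint and wake the threads
      Universe::heap()->safepoint_synchronize_end(); // notify the heap
      post_safepoint_end_event(event, safepoint_id()); // record the event
    }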
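
The call path can be modeled in a few lines. This is a simplified, self-contained sketch rather than HotSpot code: ModelVM, its members, and run_nested_example are hypothetical names, and the real VMThread::inner_execute() (see the source listing at the end) does considerably more. The sketch shows one property of that function: only the call that began the safepoint ends it, so a nested operation leaves the safepoint in place.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the VM thread's view of the safepoint.
struct ModelVM {
  bool at_safepoint = false;
  std::vector<std::string> log;  // records begin/end calls in order

  void begin() { at_safepoint = true;  log.push_back("begin"); }
  void end()   { at_safepoint = false; log.push_back("end"); }

  // Same shape as VMThread::inner_execute(): begin a safepoint only if the
  // operation needs one and none is active, and end it only if this call
  // began it.
  void inner_execute(bool needs_safepoint, const std::function<void()>& op) {
    bool end_safepoint = false;
    if (needs_safepoint && !at_safepoint) {
      begin();
      end_safepoint = true;
    }
    op();  // evaluate the operation (it may trigger a nested operation)
    if (end_safepoint) {
      end();
    }
  }
};

// Runs an outer safepoint operation that triggers a nested one and returns
// the recorded sequence of events.
inline std::vector<std::string> run_nested_example() {
  ModelVM vm;
  vm.inner_execute(true, [&] {
    vm.inner_execute(true, [&] { vm.log.push_back("nested op"); });
    vm.log.push_back("outer op");
  });
  return vm.log;
}
```

In this model the nested call sees at_safepoint already set, so it neither begins nor ends the safepoint; the single end comes from the outer call.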

2. Core wake-up logic: disarm_safepoint()

This function does the actual waking and proceeds in four phases:

(1) Update the global state

cpp

_state = _not_synchronized;  // mark the safepoint as ended
Atomic::release_store(&_safepoint_counter, _safepoint_counter + 1); // advance the safepoint counter (odd → even)
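  • Purpose

    Changes the safepoint state from _synchronized to _not_synchronized, so that no thread observes _synchronized once it is running again.

  • Meaning of the counter

    An odd value means a safepoint is active; an even value means none is. The full source asserts that the counter is odd before advancing it to the next even (dormant) safepoint id, so code that samples the counter can tell from its parity whether a safepoint is in progress.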
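
The odd/even convention can be illustrated with a small model. This is a sketch under stated assumptions, not the HotSpot implementation: SafepointCounterModel and its methods are hypothetical names, std::atomic stands in for HotSpot's Atomic helpers, and the arm() step is assumed to mirror the increment that the source shows for disarming.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical model of the safepoint counter: odd while a safepoint is
// active, even while none is. Only one thread (the VM thread in HotSpot)
// is assumed to write it, so a plain load followed by a store is enough.
class SafepointCounterModel {
 public:
  // Assumed counterpart of disarming: even -> odd when a safepoint begins.
  void arm() {
    assert((counter_.load() & 1) == 0 && "must be even");
    counter_.store(counter_.load() + 1, std::memory_order_release);
  }

  // Same shape as the source: assert odd, then store counter + 1.
  void disarm() {
    assert((counter_.load() & 1) == 1 && "must be odd");
    counter_.store(counter_.load() + 1, std::memory_order_release);
  }

  // Readers only need the lowest bit to tell the two situations apart.
  bool safepoint_active() const {
    return (counter_.load(std::memory_order_acquire) & 1) == 1;
  }

  uint64_t value() const { return counter_.load(std::memory_order_acquire); }

 private:
  std::atomic<uint64_t> counter_{0};
};
```

Because the counter only ever grows, each safepoint also gets a distinct value, which is why the source can use it as a safepoint id.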

(2) Restore thread state

cpp

for (; JavaThread *current = jtiwh.next(); ) {
  ThreadSafepointState* cur_state = current->safepoint_state();
  cur_state->restart(); // mark the thread as running (TSS _running)
}
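  • Purpose

    Iterates over all Java threads and resets each thread's safepoint state (ThreadSafepointState) from stopped-at-safepoint to running. The full source asserts !is_running() before restart() and is_running() after it.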
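
The reset loop and its pre- and post-conditions can be sketched as follows. ThreadStateModel and restart_all are hypothetical names used only for illustration; the real ThreadSafepointState carries more fields than the single flag assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread safepoint record with a single running flag.
struct ThreadStateModel {
  bool running = true;
  void stop_at_safepoint() { running = false; }
  void restart() { running = true; }
  bool is_running() const { return running; }
};

// Restarts every record and returns how many were restarted. The two
// assertions correspond to the ones around restart() in the source.
inline std::size_t restart_all(std::vector<ThreadStateModel>& threads) {
  std::size_t restarted = 0;
  for (ThreadStateModel& t : threads) {
    assert(!t.is_running() && "thread not suspended at safepoint");
    t.restart();
    assert(t.is_running() && "safepoint state has not been reset");
    ++restarted;
  }
  return restarted;
}
```

Note that this step only updates bookkeeping; in the source the threads are still blocked at this point and are released later by the barrier.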

(3) Release Threads_lock

cpp

Threads_lock->unlock(); // threads can be created/destroyed again
  • Purpose

    Releases Threads_lock, which is held for the duration of the safepoint, so that the JVM can create and destroy threads again.

(4) Wake all waiting threads

cpp

_wait_barrier->disarm(); // wake the threads through the barrier
  • Purpose

    Calls disarm() on the WaitBarrier (implemented on Linux by LinuxWaitBarrier), which wakes every thread blocked at the safepoint.


3. Barrier wake-up implementation: LinuxWaitBarrier::disarm()

cpp

void LinuxWaitBarrier::disarm() {
  _futex_barrier = 0; // reset the barrier value to 0 (disarmed)
  futex(&_futex_barrier, FUTEX_WAKE_PRIVATE, INT_MAX); // wake all waiting threads
}
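  • futex system call

    FUTEX_WAKE_PRIVATE wakes the threads waiting on _futex_barrier; passing INT_MAX as the maximum number of threads to wake effectively means all of them.

  • Barrier state

    Setting _futex_barrier to 0 makes the loop condition in wait() false, so woken threads leave the loop (see the logic below).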
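
The two futex operations can be tried in isolation on Linux. The sketch below is not HotSpot code: futex_call, wake_all, and wait_refused_because_disarmed are hypothetical helper names, and it deliberately stays single-threaded. It exercises the case where the barrier word no longer matches the expected tag, in which the kernel refuses to block and reports EAGAIN, and it shows that FUTEX_WAKE returns the number of threads woken.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Thin wrapper over the raw futex system call (Linux only).
static long futex_call(volatile int* addr, int futex_op, int op_arg) {
  return syscall(SYS_futex, addr, futex_op, op_arg, nullptr, nullptr, 0);
}

// Wakes up to INT_MAX waiters on `word` and returns how many were woken.
inline long wake_all(volatile int* word) {
  return futex_call(word, FUTEX_WAKE_PRIVATE, INT_MAX);
}

// Tries to wait on `word` while expecting it to hold `tag`.
// Only call this when *word != tag: the kernel then returns at once with
// errno set to EAGAIN. If *word == tag, the call would block until another
// thread issues a wake, which this single-threaded sketch never does.
inline bool wait_refused_because_disarmed(volatile int* word, int tag) {
  errno = 0;
  long s = futex_call(word, FUTEX_WAIT_PRIVATE, tag);
  return s == -1 && errno == EAGAIN;
}
```

The EAGAIN case is what closes the race between a thread that is about to block and a concurrent disarm(): if the word has already been reset to 0, the wait fails immediately instead of sleeping through the wake-up.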


4. How are threads woken?

  • How threads pause

    During a safepoint, threads block in LinuxWaitBarrier::wait():

    cpp

    // Simplified; the full function appears in the source listing at the end.
    void LinuxWaitBarrier::wait(int barrier_tag) {
      while (barrier_tag == _futex_barrier) {
        futex(&_futex_barrier, FUTEX_WAIT_PRIVATE, barrier_tag); // block the thread
      }
    }
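  • Behavior after wake-up

    After disarm() sets _futex_barrier to 0:

    1. The futex call wakes all waiting threads.

    2. Each thread finds barrier_tag != _futex_barrier (because _futex_barrier is now 0) and leaves the loop.

    3. The thread continues with the code that follows the safepoint.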
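
The re-check in the wait loop matters because a blocked thread can return from the futex call without the barrier having been disarmed (a spurious wake-up or a signal). The following single-threaded model illustrates that. It is a sketch with hypothetical names (wait_on_barrier, simulate_spurious_wakeups); the block_once callback stands in for one FUTEX_WAIT call.

```cpp
#include <atomic>
#include <cassert>

// Models the wait loop. `block_once` stands in for a single FUTEX_WAIT
// call and may return even though the barrier is still armed.
// Returns how many times the caller had to block.
template <typename BlockFn>
int wait_on_barrier(const std::atomic<int>& barrier, int tag, BlockFn block_once) {
  assert(tag != 0);                     // 0 is reserved for "disarmed"
  if (tag != barrier.load()) return 0;  // already disarmed: do not block
  int blocks = 0;
  do {
    block_once();                       // returns on any wake-up
    ++blocks;
  } while (tag == barrier.load());      // still armed: wait again
  return blocks;
}

// Simulates two spurious wake-ups followed by a real disarm.
inline int simulate_spurious_wakeups() {
  std::atomic<int> barrier{5};          // armed with tag 5
  int calls = 0;
  return wait_on_barrier(barrier, 5, [&] {
    if (++calls == 3) barrier.store(0); // the third return follows a disarm
  });
}
```

In this model the first two returns find the barrier still equal to the tag and block again; only the third, which follows the reset to 0, lets the caller proceed.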


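Key design summary

  1. State synchronization

    The global _state and _safepoint_counter ensure that threads leave the safepoint safely.

  2. Barrier mechanism

    A futex provides efficient waiting and waking without busy-waiting.

  3. Thread state reset

    Each thread's ThreadSafepointState is marked as running (restart()).

  4. Lock release

    Threads_lock is released when the safepoint ends, returning the JVM to normal operation.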

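Note: memory ordering is enforced with OrderAccess::fence(). In the source below, disarm_safepoint() fences around the state updates, the comment on WaitBarrier's disarm() documents a trailing fence, and LinuxWaitBarrier::wait() fences on its early-return path when the barrier is already disarmed, so that resumed threads see a consistent post-safepoint memory state.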

## Source code

cpp
void VMThread::inner_execute(VM_Operation* op) {
  assert(Thread::current()->is_VM_thread(), "Must be the VM thread");

  VM_Operation* prev_vm_operation = NULL;
  if (_cur_vm_operation != NULL) {
    // Check that the VM operation allows nested VM operation.
    // This is normally not the case, e.g., the compiler
    // does not allow nested scavenges or compiles.
    if (!_cur_vm_operation->allow_nested_vm_operations()) {
      fatal("Unexpected nested VM operation %s requested by operation %s",
            op->name(), _cur_vm_operation->name());
    }
    op->set_calling_thread(_cur_vm_operation->calling_thread());
    prev_vm_operation = _cur_vm_operation;
  }

  _cur_vm_operation = op;

  HandleMark hm(VMThread::vm_thread());
  EventMarkVMOperation em("Executing %sVM operation: %s", prev_vm_operation != NULL ? "nested " : "", op->name());

  log_debug(vmthread)("Evaluating %s %s VM operation: %s",
                       prev_vm_operation != NULL ? "nested" : "",
                      _cur_vm_operation->evaluate_at_safepoint() ? "safepoint" : "non-safepoint",
                      _cur_vm_operation->name());

  bool end_safepoint = false;
  if (_cur_vm_operation->evaluate_at_safepoint() &&
      !SafepointSynchronize::is_at_safepoint()) {
    SafepointSynchronize::begin();
    if (_timeout_task != NULL) {
      _timeout_task->arm();
    }
    end_safepoint = true;
  }

  evaluate_operation(_cur_vm_operation);

  if (end_safepoint) {
    if (_timeout_task != NULL) {
      _timeout_task->disarm();
    }
    SafepointSynchronize::end();
  }

  _cur_vm_operation = prev_vm_operation;
}


// Wake up all threads, so they are ready to resume execution after the safepoint
// operation has been carried out
void SafepointSynchronize::end() {
  assert(Threads_lock->owned_by_self(), "must hold Threads_lock");
  EventSafepointEnd event;
  assert(Thread::current()->is_VM_thread(), "Only VM thread can execute a safepoint");

  disarm_safepoint();

  Universe::heap()->safepoint_synchronize_end();

  SafepointTracing::end();

  post_safepoint_end_event(event, safepoint_id());
}


void SafepointSynchronize::disarm_safepoint() {
  uint64_t active_safepoint_counter = _safepoint_counter;
  {
    JavaThreadIteratorWithHandle jtiwh;
#ifdef ASSERT
    // A pending_exception cannot be installed during a safepoint.  The threads
    // may install an async exception after they come back from a safepoint into
    // pending_exception after they unblock.  But that should happen later.
    for (; JavaThread *cur = jtiwh.next(); ) {
      assert (!(cur->has_pending_exception() &&
                cur->safepoint_state()->is_at_poll_safepoint()),
              "safepoint installed a pending exception");
    }
#endif // ASSERT

    OrderAccess::fence(); // keep read and write of _state from floating up
    assert(_state == _synchronized, "must be synchronized before ending safepoint synchronization");

    // Change state first to _not_synchronized.
    // No threads should see _synchronized when running.
    _state = _not_synchronized;

    // Set the next dormant (even) safepoint id.
    assert((_safepoint_counter & 0x1) == 1, "must be odd");
    Atomic::release_store(&_safepoint_counter, _safepoint_counter + 1);

    OrderAccess::fence(); // Keep the local state from floating up.

    jtiwh.rewind();
    for (; JavaThread *current = jtiwh.next(); ) {
      // Clear the visited flag to ensure that the critical counts are collected properly.
      DEBUG_ONLY(current->reset_visited_for_critical_count(active_safepoint_counter);)
      ThreadSafepointState* cur_state = current->safepoint_state();
      assert(!cur_state->is_running(), "Thread not suspended at safepoint");
      cur_state->restart(); // TSS _running
      assert(cur_state->is_running(), "safepoint state has not been reset");
    }
  } // ~JavaThreadIteratorWithHandle

  // Release threads lock, so threads can be created/destroyed again.
  Threads_lock->unlock();

  // Wake threads after local state is correctly set.
  _wait_barrier->disarm();
}

  // Guarantees any thread that called wait() will be awake when it returns.
  // Provides a trailing fence.
  void disarm() {
    assert(_owner == Thread::current(), "Not owner thread");
    _impl.disarm();
  }


void LinuxWaitBarrier::disarm() {
  assert(_futex_barrier != 0, "Should be armed/non-zero.");
  _futex_barrier = 0;
  int s = futex(&_futex_barrier,
                FUTEX_WAKE_PRIVATE,
                INT_MAX /* wake a max of this many threads */);
  guarantee_with_errno(s > -1, "futex FUTEX_WAKE failed");
}

void LinuxWaitBarrier::wait(int barrier_tag) {
  assert(barrier_tag != 0, "Trying to wait on disarmed value");
  if (barrier_tag == 0 ||
      barrier_tag != _futex_barrier) {
    OrderAccess::fence();
    return;
  }
  do {
    int s = futex(&_futex_barrier,
                  FUTEX_WAIT_PRIVATE,
                  barrier_tag /* should be this tag */);
    guarantee_with_errno((s == 0) ||
                         (s == -1 && errno == EAGAIN) ||
                         (s == -1 && errno == EINTR),
                         "futex FUTEX_WAIT failed");
    // Return value 0: woken up, but re-check in case of spurious wakeup.
    // Error EINTR: woken by signal, so re-check and re-wait if necessary.
    // Error EAGAIN: we are already disarmed and so will pass the check.
  } while (barrier_tag == _futex_barrier);
}