In OpenJDK, every Java thread is ultimately created through the operating system's thread-creation interface. On Linux that interface is POSIX's pthread_create. The JVM passes its own thread entry function, thread_native_entry, to pthread_create. So how does glibc, step by step, "install" this function into the new thread and eventually call it? This article reconstructs the process from the glibc 2.34+ sources (the NPTL thread library).
## 1. The JVM's Calling Convention
In OpenJDK's os::create_thread, the call ultimately made is:

```c
pthread_create(&tid, &attr, thread_native_entry, thread)
```

- `thread_native_entry` is a static function with the prototype `void* thread_native_entry(void*)`.
- The fourth argument, `thread`, is a pointer to the JVM-internal `Thread` object.

The JVM expects the new thread to execute `thread_native_entry(thread)` as soon as it starts. Let's see how glibc handles this request.
## 2. __pthread_create_2_1: Allocating the Thread Descriptor and Saving the User Function
The actual implementation behind glibc's pthread_create is __pthread_create_2_1 (a versioned symbol). Its main tasks are:

- Process the thread attributes (fetching the defaults if `attr` is NULL).
- Allocate the stack and thread descriptor: call `allocate_stack` to obtain a `struct pthread *pd`, the structure that serves as NPTL's thread control block (TCB).
- Save the user function and its argument:

```c
pd->start_routine = start_routine; // here: thread_native_entry
pd->arg = arg;                     // the JVM's Thread object pointer
pd->c11 = c11;                     // whether this is a C11 thread
```

- Initialize various fields (scheduling parameters, signal mask, etc.).
- Call `create_thread` to actually create the kernel thread.

Key point: at this stage the new thread does not exist yet, but the user function has already been safely stashed in `pd`.
## 3. create_thread: Preparing the clone System Call and Choosing the Entry Point
The create_thread function (in sysdeps/unix/sysv/linux/createthread.c, or inlined into pthread_create.c) builds the arguments for clone/clone3. The most important lines are:

```c
const int clone_flags = ...; // share VM, FS, files, signals, etc.
TLS_DEFINE_INIT_TP (tp, pd);
struct clone_args args = { ... };
int ret = __clone_internal (&args, &start_thread, pd);
```

Note that the second argument to __clone_internal is &start_thread, not the user-supplied start_routine. The third argument, pd, is what gets passed to start_thread.

In other words, glibc does not hand thread_native_entry to the kernel directly. Instead it hands the kernel a wrapper function, start_thread, which, once the new thread is running, fetches the real user function from pd and calls it.
## 4. start_thread: The Real Thread Entry Point
start_thread is a static function marked _Noreturn. Its core logic, condensed:

```c
static int _Noreturn start_thread (void *arg) {
  struct pthread *pd = arg;
  // 1. If startup synchronization is needed (e.g. a debugger is attaching
  //    or scheduling attributes are being set), take the lock first.
  if (pd->stopped_start) {
    lll_lock (pd->lock, LLL_PRIVATE);
    // ... check setup_failed, etc.
    lll_unlock (pd->lock, LLL_PRIVATE);
  }
  // 2. Initialize TLS, the signal mask, robust mutexes, ...
  // ...
  // 3. Establish the cancellation context via setjmp.
  // ...
  // 4. Finally, call the user-provided function.
  void *ret;
  if (pd->c11)
    ret = (void*)(uintptr_t)((int(*)(void*))pd->start_routine)(pd->arg);
  else
    ret = pd->start_routine (pd->arg); // here: thread_native_entry
  // 5. Save the return value, run TLS destructors, then exit.
  THREAD_SETMEM(pd, result, ret);
  // ... cleanup ...
  while (1) INTERNAL_SYSCALL_CALL(exit, 0);
}
```

The key line, pd->start_routine(pd->arg), is exactly the thread_native_entry(thread) call the JVM has been waiting for. At this point the new thread finally executes the Java thread's low-level initialization code.
## 5. Why the start_thread Wrapper?
Wouldn't it be simpler to let the kernel call thread_native_entry directly? There are several reasons not to:

- A uniform POSIX thread model: pthread_create requires that a thread terminate by returning or by calling pthread_exit, and the library must run thread-local storage destructors, clean up the stack, notify debuggers, and so on. The user function cannot do this work itself; library code has to wrap it.
- Synchronized startup: some scenarios require the new thread to pause right after creation while the creating thread sets the scheduling policy or CPU affinity (the stopped_start mechanism). The lock handshake in start_thread guarantees this.
- Cancellation and unwinding: setjmp and the unwind_buf support pthread cancellation, which requires establishing a context at the thread's entry point.
- A robust exit path: whether the user function returns normally or calls pthread_exit, start_thread cleans up resources in one uniform place.

So glibc uses start_thread as the single entry point for every POSIX thread, and the user function is merely the "payload" it invokes.
## 6. The Full Call Chain, Reviewed
```text
JVM:    os::create_thread
          └─> pthread_create(..., thread_native_entry, thread)
glibc:  __pthread_create_2_1
          ├─> allocate_stack() → struct pthread *pd
          ├─> pd->start_routine = thread_native_entry
          ├─> pd->arg = thread
          └─> create_thread(pd, ...)
                └─> __clone_internal(..., start_thread, pd)
kernel: clone / clone3 system call
          └─> once the new thread exists, user space starts at start_thread
glibc:  start_thread(pd)
          ├─> synchronization, initialization
          └─> pd->start_routine(pd->arg)   // calls thread_native_entry
JVM:    thread_native_entry(thread)
          └─> Thread::call_run() → Java code
```
## 7. Summary
From the JVM's thread_native_entry to its eventual execution, glibc plays the role of a behind-the-scenes director: it does not simply pass the user's function address to the kernel, but inserts the wrapper start_thread, which handles all the initialization, synchronization, and cleanup that POSIX threads require, and calls back into the user function at the right moment.

With this layer understood, it becomes clear why a pthread_create entry function must have the void *(void *) signature, and why the stack and TLS can be released automatically when a thread exits. For JVM developers this is more than background knowledge: it is a practical aid when debugging thread-creation problems such as pthread_create failures or threads hanging at startup.

The next time you see thread_native_entry in the JVM sources, picture the glibc stack frame behind it: an unsung start_thread, lock in hand, everything prepared, quietly saying, "Run, Java thread."
## Source Code
/* CREATE THREAD NOTES:
create_thread must initialize PD->stopped_start. It should be true
if the STOPPED_START parameter is true, or if create_thread needs the
new thread to synchronize at startup for some other implementation
reason. If STOPPED_START will be true, then create_thread is obliged
to lock PD->lock before starting the thread. Then pthread_create
unlocks PD->lock which synchronizes-with create_thread in the
child thread which does an acquire/release of PD->lock as the last
action before calling the user entry point. The goal of all of this
is to ensure that the required initial thread attributes are applied
(by the creating thread) before the new thread runs user code. Note
that the the functions pthread_getschedparam, pthread_setschedparam,
pthread_setschedprio, __pthread_tpp_change_priority, and
__pthread_current_priority reuse the same lock, PD->lock, for a
similar purpose e.g. synchronizing the setting of similar thread
attributes. These functions are never called before the thread is
created, so don't participate in startup synchronization, but given
that the lock is present already and in the unlocked state, reusing
it saves space.
The return value is zero for success or an errno code for failure.
If the return value is ENOMEM, that will be translated to EAGAIN,
so create_thread need not do that. On failure, *THREAD_RAN should
be set to true iff the thread actually started up but before calling
the user code (*PD->start_routine). */
static int _Noreturn start_thread (void *arg);
static int create_thread (struct pthread *pd, const struct pthread_attr *attr,
bool *stopped_start, void *stackaddr,
size_t stacksize, bool *thread_ran)
{
/* Determine whether the newly created threads has to be started
stopped since we have to set the scheduling parameters or set the
affinity. */
bool need_setaffinity = (attr != NULL && attr->extension != NULL
&& attr->extension->cpuset != 0);
if (attr != NULL
&& (__glibc_unlikely (need_setaffinity)
|| __glibc_unlikely ((attr->flags & ATTR_FLAG_NOTINHERITSCHED) != 0)))
*stopped_start = true;
pd->stopped_start = *stopped_start;
if (__glibc_unlikely (*stopped_start))
lll_lock (pd->lock, LLL_PRIVATE);
/* We rely heavily on various flags the CLONE function understands:
CLONE_VM, CLONE_FS, CLONE_FILES
These flags select semantics with shared address space and
file descriptors according to what POSIX requires.
CLONE_SIGHAND, CLONE_THREAD
This flag selects the POSIX signal semantics and various
other kinds of sharing (itimers, POSIX timers, etc.).
CLONE_SETTLS
The sixth parameter to CLONE determines the TLS area for the
new thread.
CLONE_PARENT_SETTID
The kernels writes the thread ID of the newly created thread
into the location pointed to by the fifth parameters to CLONE.
Note that it would be semantically equivalent to use
CLONE_CHILD_SETTID but it is be more expensive in the kernel.
CLONE_CHILD_CLEARTID
The kernels clears the thread ID of a thread that has called
sys_exit() in the location pointed to by the seventh parameter
to CLONE.
The termination signal is chosen to be zero which means no signal
is sent. */
const int clone_flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SYSVSEM
| CLONE_SIGHAND | CLONE_THREAD
| CLONE_SETTLS | CLONE_PARENT_SETTID
| CLONE_CHILD_CLEARTID
| 0);
TLS_DEFINE_INIT_TP (tp, pd);
struct clone_args args =
{
.flags = clone_flags,
.pidfd = (uintptr_t) &pd->tid,
.parent_tid = (uintptr_t) &pd->tid,
.child_tid = (uintptr_t) &pd->tid,
.stack = (uintptr_t) stackaddr,
.stack_size = stacksize,
.tls = (uintptr_t) tp,
};
int ret = __clone_internal (&args, &start_thread, pd);
if (__glibc_unlikely (ret == -1))
return errno;
/* It's started now, so if we fail below, we'll have to let it clean itself
up. */
*thread_ran = true;
/* Now we have the possibility to set scheduling parameters etc. */
if (attr != NULL)
{
/* Set the affinity mask if necessary. */
if (need_setaffinity)
{
assert (*stopped_start);
int res = INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid,
attr->extension->cpusetsize,
attr->extension->cpuset);
if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (res)))
return INTERNAL_SYSCALL_ERRNO (res);
}
/* Set the scheduling parameters. */
if ((attr->flags & ATTR_FLAG_NOTINHERITSCHED) != 0)
{
assert (*stopped_start);
int res = INTERNAL_SYSCALL_CALL (sched_setscheduler, pd->tid,
pd->schedpolicy, &pd->schedparam);
if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (res)))
return INTERNAL_SYSCALL_ERRNO (res);
}
}
return 0;
}
/* Local function to start thread and handle cleanup. */
static int _Noreturn
start_thread (void *arg)
{
struct pthread *pd = arg;
/* We are either in (a) or (b), and in either case we either own PD already
(2) or are about to own PD (1), and so our only restriction would be that
we can't free PD until we know we have ownership (see CONCURRENCY NOTES
above). */
if (pd->stopped_start)
{
bool setup_failed = false;
/* Get the lock the parent locked to force synchronization. */
lll_lock (pd->lock, LLL_PRIVATE);
/* We have ownership of PD now, for detached threads with setup failure
we set it as joinable so the creating thread could synchronous join
and free any resource prior return to the pthread_create caller. */
setup_failed = pd->setup_failed == 1;
if (setup_failed)
pd->joinid = NULL;
/* And give it up right away. */
lll_unlock (pd->lock, LLL_PRIVATE);
if (setup_failed)
goto out;
}
/* Initialize resolver state pointer. */
__resp = &pd->res;
/* Initialize pointers to locale data. */
__ctype_init ();
/* Name the thread stack if kernel supports it. */
name_stack_maps (pd, true);
/* Register rseq TLS to the kernel. */
{
bool do_rseq = THREAD_GETMEM (pd, flags) & ATTR_FLAG_DO_RSEQ;
if (!rseq_register_current_thread (pd, do_rseq) && do_rseq)
__libc_fatal ("Fatal glibc error: rseq registration failed\n");
}
#ifndef __ASSUME_SET_ROBUST_LIST
if (__nptl_set_robust_list_avail)
#endif
{
/* This call should never fail because the initial call in init.c
succeeded. */
INTERNAL_SYSCALL_CALL (set_robust_list, &pd->robust_head,
sizeof (struct robust_list_head));
}
/* This is where the try/finally block should be created. For
compilers without that support we do use setjmp. */
struct pthread_unwind_buf unwind_buf;
int not_first_call;
DIAG_PUSH_NEEDS_COMMENT;
#if __GNUC_PREREQ (7, 0)
/* This call results in a -Wstringop-overflow warning because struct
pthread_unwind_buf is smaller than jmp_buf. setjmp and longjmp
do not use anything beyond the common prefix (they never access
the saved signal mask), so that is a false positive. */
DIAG_IGNORE_NEEDS_COMMENT (11, "-Wstringop-overflow=");
#endif
not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
DIAG_POP_NEEDS_COMMENT;
/* No previous handlers. NB: This must be done after setjmp since the
private space in the unwind jump buffer may overlap space used by
setjmp to store extra architecture-specific information which is
never used by the cancellation-specific __libc_unwind_longjmp.
The private space is allowed to overlap because the unwinder never
has to return through any of the jumped-to call frames, and thus
only a minimum amount of saved data need be stored, and for example,
need not include the process signal mask information. This is all
an optimization to reduce stack usage when pushing cancellation
handlers. */
unwind_buf.priv.data.prev = NULL;
unwind_buf.priv.data.cleanup = NULL;
/* Allow setxid from now onwards. */
if (__glibc_unlikely (atomic_exchange_acquire (&pd->setxid_futex, 0) == -2))
futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE);
if (__glibc_likely (! not_first_call))
{
/* Store the new cleanup handler info. */
THREAD_SETMEM (pd, cleanup_jmp_buf, &unwind_buf);
internal_signal_restore_set (&pd->sigmask);
LIBC_PROBE (pthread_start, 3, (pthread_t) pd, pd->start_routine, pd->arg);
/* Run the code the user provided. */
void *ret;
if (pd->c11)
{
/* The function pointer of the c11 thread start is cast to an incorrect
type on __pthread_create_2_1 call, however it is casted back to correct
one so the call behavior is well-defined (it is assumed that pointers
to void are able to represent all values of int. */
int (*start)(void*) = (int (*) (void*)) pd->start_routine;
ret = (void*) (uintptr_t) start (pd->arg);
}
else
ret = pd->start_routine (pd->arg);
THREAD_SETMEM (pd, result, ret);
}
/* Call destructors for the thread_local TLS variables. */
call_function_static_weak (__call_tls_dtors);
/* Run the destructor for the thread-local data. */
__nptl_deallocate_tsd ();
/* Clean up any state libc stored in thread-local variables. */
__libc_thread_freeres ();
/* Report the death of the thread if this is wanted. */
if (__glibc_unlikely (pd->report_events))
{
/* See whether TD_DEATH is in any of the mask. */
const int idx = __td_eventword (TD_DEATH);
const uint32_t mask = __td_eventmask (TD_DEATH);
if ((mask & (__nptl_threads_events.event_bits[idx]
| pd->eventbuf.eventmask.event_bits[idx])) != 0)
{
/* Yep, we have to signal the death. Add the descriptor to
the list but only if it is not already on it. */
if (pd->nextevent == NULL)
{
pd->eventbuf.eventnum = TD_DEATH;
pd->eventbuf.eventdata = pd;
do
pd->nextevent = __nptl_last_event;
while (atomic_compare_and_exchange_bool_acq (&__nptl_last_event,
pd, pd->nextevent));
}
/* Now call the function which signals the event. See
CONCURRENCY NOTES for the nptl_db interface comments. */
__nptl_death_event ();
}
}
/* The thread is exiting now. Don't set this bit until after we've hit
the event-reporting breakpoint, so that td_thr_get_info on us while at
the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
atomic_fetch_or_relaxed (&pd->cancelhandling, EXITING_BITMASK);
if (__glibc_unlikely (atomic_fetch_add_relaxed (&__nptl_nthreads, -1) == 1))
/* This was the last thread. */
exit (0);
/* This prevents sending a signal from this thread to itself during
its final stages. This must come after the exit call above
because atexit handlers must not run with signals blocked.
Do not block SIGSETXID. The setxid handshake below expects the
signal to be delivered. (SIGSETXID cannot run application code,
nor does it use pthread_kill.) Reuse the pd->sigmask space for
computing the signal mask, to save stack space. */
internal_sigfillset (&pd->sigmask);
internal_sigdelset (&pd->sigmask, SIGSETXID);
INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_BLOCK, &pd->sigmask, NULL,
__NSIG_BYTES);
/* Tell __pthread_kill_internal that this thread is about to exit.
If there is a __pthread_kill_internal in progress, this delays
the thread exit until the signal has been queued by the kernel
(so that the TID used to send it remains valid). */
__libc_lock_lock (pd->exit_lock);
pd->exiting = true;
__libc_lock_unlock (pd->exit_lock);
#ifndef __ASSUME_SET_ROBUST_LIST
/* If this thread has any robust mutexes locked, handle them now. */
# if __PTHREAD_MUTEX_HAVE_PREV
void *robust = pd->robust_head.list;
# else
__pthread_slist_t *robust = pd->robust_list.__next;
# endif
/* We let the kernel do the notification if it is able to do so.
If we have to do it here there for sure are no PI mutexes involved
since the kernel support for them is even more recent. */
if (!__nptl_set_robust_list_avail
&& __builtin_expect (robust != (void *) &pd->robust_head, 0))
{
do
{
struct __pthread_mutex_s *this = (struct __pthread_mutex_s *)
((char *) robust - offsetof (struct __pthread_mutex_s,
__list.__next));
robust = *((void **) robust);
# if __PTHREAD_MUTEX_HAVE_PREV
this->__list.__prev = NULL;
# endif
this->__list.__next = NULL;
atomic_fetch_or_acquire (&this->__lock, FUTEX_OWNER_DIED);
futex_wake ((unsigned int *) &this->__lock, 1,
/* XYZ */ FUTEX_SHARED);
}
while (robust != (void *) &pd->robust_head);
}
#endif
if (!pd->user_stack)
advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
pd->guardsize);
if (__glibc_unlikely (pd->cancelhandling & SETXID_BITMASK))
{
/* Some other thread might call any of the setXid functions and expect
us to reply. In this case wait until we did that. */
do
/* XXX This differs from the typical futex_wait_simple pattern in that
the futex_wait condition (setxid_futex) is different from the
condition used in the surrounding loop (cancelhandling). We need
to check and document why this is correct. */
futex_wait_simple (&pd->setxid_futex, 0, FUTEX_PRIVATE);
while (pd->cancelhandling & SETXID_BITMASK);
/* Reset the value so that the stack can be reused. */
pd->setxid_futex = 0;
}
/* If the thread is detached free the TCB. */
if (IS_DETACHED (pd))
/* Free the TCB. */
__nptl_free_tcb (pd);
/* Remove the associated name from the thread stack. */
name_stack_maps (pd, false);
out:
/* We cannot call '_exit' here. '_exit' will terminate the process.
The 'exit' implementation in the kernel will signal when the
process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
flag. The 'tid' field in the TCB will be set to zero.
rseq TLS is still registered at this point. Rely on implicit
unregistration performed by the kernel on thread teardown. This is not a
problem because the rseq TLS lives on the stack, and the stack outlives
the thread. If TCB allocation is ever changed, additional steps may be
required, such as performing explicit rseq unregistration before
reclaiming the rseq TLS area memory. It is NOT sufficient to block
signals because the kernel may write to the rseq area even without
signals.
The exit code is zero since in case all threads exit by calling
'pthread_exit' the exit status must be 0 (zero). */
while (1)
INTERNAL_SYSCALL_CALL (exit, 0);
/* NOTREACHED */
}
int
__pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
void *(*start_routine) (void *), void *arg)
{
void *stackaddr = NULL;
size_t stacksize = 0;
/* Avoid a data race in the multi-threaded case, and call the
deferred initialization only once. */
if (__libc_single_threaded_internal)
{
late_init ();
__libc_single_threaded_internal = 0;
/* __libc_single_threaded can be accessed through copy relocations, so
it requires to update the external copy. */
__libc_single_threaded = 0;
}
const struct pthread_attr *iattr = (struct pthread_attr *) attr;
union pthread_attr_transparent default_attr;
bool destroy_default_attr = false;
bool c11 = (attr == ATTR_C11_THREAD);
if (iattr == NULL || c11)
{
int ret = __pthread_getattr_default_np (&default_attr.external);
if (ret != 0)
return ret;
destroy_default_attr = true;
iattr = &default_attr.internal;
}
struct pthread *pd = NULL;
int err = allocate_stack (iattr, &pd, &stackaddr, &stacksize);
int retval = 0;
if (__glibc_unlikely (err != 0))
/* Something went wrong. Maybe a parameter of the attributes is
invalid or we could not allocate memory. Note we have to
translate error codes. */
{
retval = err == ENOMEM ? EAGAIN : err;
goto out;
}
/* Initialize the TCB. All initializations with zero should be
performed in 'get_cached_stack'. This way we avoid doing this if
the stack freshly allocated with 'mmap'. */
#if TLS_TCB_AT_TP
/* Reference to the TCB itself. */
pd->header.self = pd;
/* Self-reference for TLS. */
pd->header.tcb = pd;
#endif
/* Store the address of the start routine and the parameter. Since
we do not start the function directly the stillborn thread will
get the information from its thread descriptor. */
pd->start_routine = start_routine;
pd->arg = arg;
pd->c11 = c11;
/* Copy the thread attribute flags. */
struct pthread *self = THREAD_SELF;
pd->flags = ((iattr->flags & ~(ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET))
| (self->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)));
/* Inherit rseq registration state. Without seccomp filters, rseq
registration will either always fail or always succeed. */
if ((int) THREAD_GETMEM_VOLATILE (self, rseq_area.cpu_id) >= 0)
pd->flags |= ATTR_FLAG_DO_RSEQ;
/* Initialize the field for the ID of the thread which is waiting
for us. This is a self-reference in case the thread is created
detached. */
pd->joinid = iattr->flags & ATTR_FLAG_DETACHSTATE ? pd : NULL;
/* The debug events are inherited from the parent. */
pd->eventbuf = self->eventbuf;
/* Copy the parent's scheduling parameters. The flags will say what
is valid and what is not. */
pd->schedpolicy = self->schedpolicy;
pd->schedparam = self->schedparam;
/* Copy the stack guard canary. */
#ifdef THREAD_COPY_STACK_GUARD
THREAD_COPY_STACK_GUARD (pd);
#endif
/* Copy the pointer guard value. */
#ifdef THREAD_COPY_POINTER_GUARD
THREAD_COPY_POINTER_GUARD (pd);
#endif
/* Setup tcbhead. */
tls_setup_tcbhead (pd);
/* Verify the sysinfo bits were copied in allocate_stack if needed. */
#ifdef NEED_DL_SYSINFO
CHECK_THREAD_SYSINFO (pd);
#endif
/* Determine scheduling parameters for the thread. */
if (__builtin_expect ((iattr->flags & ATTR_FLAG_NOTINHERITSCHED) != 0, 0)
&& (iattr->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)) != 0)
{
/* Use the scheduling parameters the user provided. */
if (iattr->flags & ATTR_FLAG_POLICY_SET)
{
pd->schedpolicy = iattr->schedpolicy;
pd->flags |= ATTR_FLAG_POLICY_SET;
}
if (iattr->flags & ATTR_FLAG_SCHED_SET)
{
/* The values were validated in pthread_attr_setschedparam. */
pd->schedparam = iattr->schedparam;
pd->flags |= ATTR_FLAG_SCHED_SET;
}
if ((pd->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET))
!= (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET))
collect_default_sched (pd);
}
if (__glibc_unlikely (__nptl_nthreads == 1))
_IO_enable_locks ();
/* Pass the descriptor to the caller. */
*newthread = (pthread_t) pd;
LIBC_PROBE (pthread_create, 4, newthread, attr, start_routine, arg);
/* One more thread. We cannot have the thread do this itself, since it
might exist but not have been scheduled yet by the time we've returned
and need to check the value to behave correctly. We must do it before
creating the thread, in case it does get scheduled first and then
might mistakenly think it was the only thread. In the failure case,
we momentarily store a false value; this doesn't matter because there
is no kosher thing a signal handler interrupting us right here can do
that cares whether the thread count is correct. */
atomic_fetch_add_relaxed (&__nptl_nthreads, 1);
/* Our local value of stopped_start and thread_ran can be accessed at
any time. The PD->stopped_start may only be accessed if we have
ownership of PD (see CONCURRENCY NOTES above). */
bool stopped_start = false; bool thread_ran = false;
/* Block all signals, so that the new thread starts out with
signals disabled. This avoids race conditions in the thread
startup. */
internal_sigset_t original_sigmask;
internal_signal_block_all (&original_sigmask);
if (iattr->extension != NULL && iattr->extension->sigmask_set)
/* Use the signal mask in the attribute. The internal signals
have already been filtered by the public
pthread_attr_setsigmask_np interface. */
internal_sigset_from_sigset (&pd->sigmask, &iattr->extension->sigmask);
else
{
/* Conceptually, the new thread needs to inherit the signal mask
of this thread. Therefore, it needs to restore the saved
signal mask of this thread, so save it in the startup
information. */
pd->sigmask = original_sigmask;
/* Reset the cancellation signal mask in case this thread is
running cancellation. */
internal_sigdelset (&pd->sigmask, SIGCANCEL);
}
/* Start the thread. */
if (__glibc_unlikely (report_thread_creation (pd)))
{
stopped_start = true;
/* We always create the thread stopped at startup so we can
notify the debugger. */
retval = create_thread (pd, iattr, &stopped_start, stackaddr,
stacksize, &thread_ran);
if (retval == 0)
{
/* We retain ownership of PD until (a) (see CONCURRENCY NOTES
above). */
/* Assert stopped_start is true in both our local copy and the
PD copy. */
assert (stopped_start);
assert (pd->stopped_start);
/* Now fill in the information about the new thread in
the newly created thread's data structure. We cannot let
the new thread do this since we don't know whether it was
already scheduled when we send the event. */
pd->eventbuf.eventnum = TD_CREATE;
pd->eventbuf.eventdata = pd;
/* Enqueue the descriptor. */
do
pd->nextevent = __nptl_last_event;
while (atomic_compare_and_exchange_bool_acq (&__nptl_last_event,
pd, pd->nextevent)
!= 0);
/* Now call the function which signals the event. See
CONCURRENCY NOTES for the nptl_db interface comments. */
__nptl_create_event ();
}
}
else
retval = create_thread (pd, iattr, &stopped_start, stackaddr,
stacksize, &thread_ran);
/* Return to the previous signal mask, after creating the new
thread. */
internal_signal_restore_set (&original_sigmask);
if (__glibc_unlikely (retval != 0))
{
if (thread_ran)
/* State (c) and we not have PD ownership (see CONCURRENCY NOTES
above). We can assert that STOPPED_START must have been true
because thread creation didn't fail, but thread attribute setting
did. */
{
assert (stopped_start);
/* Signal the created thread to release PD ownership and early
exit so it could be joined. */
pd->setup_failed = 1;
lll_unlock (pd->lock, LLL_PRIVATE);
/* Similar to pthread_join, but since thread creation has failed at
startup there is no need to handle all the steps. */
pid_t tid;
while ((tid = atomic_load_acquire (&pd->tid)) != 0)
__futex_abstimed_wait_cancelable64 ((unsigned int *) &pd->tid,
tid, 0, NULL, LLL_SHARED);
}
/* State (c) or (d) and we have ownership of PD (see CONCURRENCY
NOTES above). */
/* Oops, we lied for a second. */
atomic_fetch_add_relaxed (&__nptl_nthreads, -1);
/* Free the resources. */
__nptl_deallocate_stack (pd);
/* We have to translate error codes. */
if (retval == ENOMEM)
retval = EAGAIN;
}
else
{
/* We don't know if we have PD ownership. Once we check the local
stopped_start we'll know if we're in state (a) or (b) (see
CONCURRENCY NOTES above). */
if (stopped_start)
/* State (a), we own PD. The thread blocked on this lock either
because we're doing TD_CREATE event reporting, or for some
other reason that create_thread chose. Now let it run
free. */
lll_unlock (pd->lock, LLL_PRIVATE);
/* We now have for sure more than one thread. The main thread might
not yet have the flag set. No need to set the global variable
again if this is what we use. */
THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
}
out:
if (destroy_default_attr)
__pthread_attr_destroy (&default_attr.external);
return retval;
}
versioned_symbol (libc, __pthread_create_2_1, pthread_create, GLIBC_2_34);
libc_hidden_ver (__pthread_create_2_1, __pthread_create)
#ifndef SHARED
strong_alias (__pthread_create_2_1, __pthread_create)
#endif