Rust 中的 Pin UnPin Async Await 实现机制1

原文地址

为了保证概念的严谨性,翻译时保留了英文原文。

In this post, we explore cooperative multitasking and the async/await feature of Rust. We take a detailed look at how async/await works in Rust, including the design of the Future trait, the state machine transformation, and pinning. We then add basic support for async/await to our kernel by creating an asynchronous keyboard task and a basic executor.

在这篇文章中,我们将探讨 Rust 的协作多任务处理和异步/等待功能。我们详细了解 async/await 在 Rust 中的工作原理,包括 Future 特征的设计、状态机转换和固定。然后,我们通过创建异步键盘任务和基本执行器来向内核添加对 async/await 的基本支持。

This blog is openly developed on GitHub. If you have any problems or questions, please open an issue there. You can also leave comments at the bottom. The complete source code for this post can be found in the post-12 branch.

该博客是在 GitHub 上公开开发的。如果您有任何问题或疑问,请在那里提出问题。也可以在底部留言评论。这篇文章的完整源代码可以在 post-12 分支中找到。

1. Multitasking 多任务处理

One of the fundamental features of most operating systems is multitasking, which is the ability to execute multiple tasks concurrently. For example, you probably have other programs open while looking at this post, such as a text editor or a terminal window. Even if you have only a single browser window open, there are probably various background tasks for managing your desktop windows, checking for updates, or indexing files.

大多数操作系统的基本功能之一是多​​任务处理,即同时执行多个任务的能力。例如,您在查看本文时可能打开了其他程序,例如文本编辑器或终端窗口。即使您只打开一个浏览器窗口,也可能有各种后台任务用于管理桌面窗口、检查更新或索引文件。

While it seems like all tasks run in parallel, only a single task can be executed on a CPU core at a time. To create the illusion that the tasks run in parallel, the operating system rapidly switches between active tasks so that each one can make a bit of progress. Since computers are fast, we don't notice these switches most of the time.

虽然看起来所有任务都是并行运行的,但一次只能在一个 CPU 核心上执行一个任务。为了制造任务并行运行的假象,操作系统在活动任务之间快速切换,以便每个任务都能取得一点进展。由于计算机速度很快,我们大多数时候都不会注意到这些切换。

While single-core CPUs can only execute a single task at a time, multi-core CPUs can run multiple tasks in a truly parallel way. For example, a CPU with 8 cores can run 8 tasks at the same time. We will explain how to setup multi-core CPUs in a future post. For this post, we will focus on single-core CPUs for simplicity. (It's worth noting that all multi-core CPUs start with only a single active core, so we can treat them as single-core CPUs for now.)

单核 CPU 一次只能执行一个任务,而多核 CPU 可以以真正并行的方式运行多个任务。例如,8核CPU可以同时运行8个任务。我们将在以后的文章中解释如何设置多核 CPU。为了简单起见,在这篇文章中,我们将重点关注单核 CPU。 (值得注意的是,所有多核 CPU 都只有一个活动核心启动,因此我们现在可以将它们视为单核 CPU。)

There are two forms of multitasking: Cooperative multitasking requires tasks to regularly give up control of the CPU so that other tasks can make progress. Preemptive multitasking uses operating system functionality to switch threads at arbitrary points in time by forcibly pausing them. In the following we will explore the two forms of multitasking in more detail and discuss their respective advantages and drawbacks.

多任务处理有两种形式: 协作多任务处理要求任务定期放弃对 CPU 的控制,以便其他任务能够取得进展。抢占式多任务处理使用操作系统功能通过强制暂停线程来在任意时间点切换线程。下面我们将更详细地探讨这两种形式的多任务处理,并讨论它们各自的优点和缺点。

1.1 Preemptive Multitasking 抢占式多任务处理

The idea behind preemptive multitasking is that the operating system controls when to switch tasks. For that, it utilizes the fact that it regains control of the CPU on each interrupt. This makes it possible to switch tasks whenever new input is available to the system. For example, it would be possible to switch tasks when the mouse is moved or a network packet arrives. The operating system can also determine the exact time that a task is allowed to run by configuring a hardware timer to send an interrupt after that time.

抢占式多任务处理背后的想法是操作系统控制何时切换任务。为此,它利用了在每次中断时重新获得 CPU 控制权的事实。这使得每当系统有新输入时就可以切换任务。例如,当鼠标移动或网络数据包到达时,可以切换任务。操作系统还可以通过配置硬件定时器在该时间之后发送中断来确定允许任务运行的确切时间。

The following graphic illustrates the task switching process on a hardware interrupt:

下图说明了硬件中断时的任务切换过程:

In the first row, the CPU is executing task A1 of program A. All other tasks are paused. In the second row, a hardware interrupt arrives at the CPU. As described in the Hardware Interrupts post, the CPU immediately stops the execution of task A1 and jumps to the interrupt handler defined in the interrupt descriptor table (IDT). Through this interrupt handler, the operating system now has control of the CPU again, which allows it to switch to task B1 instead of continuing task A1.

第一行,CPU 正在执行程序 A 的任务 A1 。所有其他任务都暂停。在第二行中,硬件中断到达 CPU。正如硬件中断帖子中所述,CPU 立即停止任务 A1 的执行,并跳转到中断描述符表 (IDT) 中定义的中断处理程序。通过这个中断处理程序,操作系统现在再次控制了 CPU,这允许它切换到任务 B1 而不是继续执行任务 A1

1.1.1 Saving State 保存状态

Since tasks are interrupted at arbitrary points in time, they might be in the middle of some calculations. In order to be able to resume them later, the operating system must backup the whole state of the task, including its call stack and the values of all CPU registers. This process is called a context switch.

由于任务会在任意时间点中断,因此它们可能正在进行某些计算。为了能够稍后恢复它们,操作系统必须备份任务的整个状态,包括其调用堆栈和所有 CPU 寄存器的值。这个过程称为上下文切换。

As the call stack can be very large, the operating system typically sets up a separate call stack for each task instead of backing up the call stack content on each task switch. Such a task with its own stack is called a thread of execution or thread for short. By using a separate stack for each task, only the register contents need to be saved on a context switch (including the program counter and stack pointer). This approach minimizes the performance overhead of a context switch, which is very important since context switches often occur up to 100 times per second.

由于调用堆栈可能非常大,操作系统通常会为每个任务设置一个单独的调用堆栈,而不是在每次任务切换时备份调用堆栈内容。这种拥有自己的堆栈的任务称为执行线程或简称线程。通过为每个任务使用单独的堆栈,上下文切换时只需保存寄存器内容(包括程序计数器和堆栈指针)。这种方法最大限度地减少了上下文切换的性能开销,这一点非常重要,因为上下文切换通常每秒发生多达 100 次。

1.1.2 Discussion 讨论

The main advantage of preemptive multitasking is that the operating system can fully control the allowed execution time of a task. This way, it can guarantee that each task gets a fair share of the CPU time, without the need to trust the tasks to cooperate. This is especially important when running third-party tasks or when multiple users share a system.

抢占式多任务的主要优点是操作系统可以完全控制任务允许的执行时间。这样,就可以保证每个任务获得公平份额的CPU时间,而不需要信任任务之间的协作。当运行第三方任务或多个用户共享系统时,这一点尤其重要。

The disadvantage of preemption is that each task requires its own stack. Compared to a shared stack, this results in higher memory usage per task and often limits the number of tasks in the system. Another disadvantage is that the operating system always has to save the complete CPU register state on each task switch, even if the task only used a small subset of the registers.

抢占的缺点是每个任务都需要自己的堆栈。与共享堆栈相比,这会导致每个任务使用更高的内存,并且通常会限制系统中的任务数量。另一个缺点是,操作系统始终必须在每次任务切换时保存完整的 CPU 寄存器状态,即使任务仅使用寄存器的一小部分。

Preemptive multitasking and threads are fundamental components of an operating system because they make it possible to run untrusted userspace programs. We will discuss these concepts in full detail in future posts. For this post, however, we will focus on cooperative multitasking, which also provides useful capabilities for our kernel.

抢占式多任务处理和线程是操作系统的基本组件,因为它们使得运行不受信任的用户空间程序成为可能。我们将在以后的帖子中详细讨论这些概念。然而,在这篇文章中,我们将重点关注协作多任务处理,这也为我们的内核提供了有用的功能。

1.2 Cooperative Multitasking 协作多任务处理

Instead of forcibly pausing running tasks at arbitrary points in time, cooperative multitasking lets each task run until it voluntarily gives up control of the CPU. This allows tasks to pause themselves at convenient points in time, for example, when they need to wait for an I/O operation anyway.

协作多任务处理不是在任意时间点强制暂停正在运行的任务,而是让每个任务一直运行,直到它自愿放弃对 CPU 的控制。这允许任务在方便的时间点自行暂停,例如,当它们无论如何都需要等待 I/O 操作时。

Cooperative multitasking is often used at the language level, like in the form of coroutines or async/await. The idea is that either the programmer or the compiler inserts yield operations into the program, which give up control of the CPU and allow other tasks to run. For example, a yield could be inserted after each iteration of a complex loop.

协作多任务处理通常在语言级别使用,例如以协程或异步/等待的形式。这个想法是,程序员或编译器将yield操作插入到程序中,从而放弃对CPU的控制并允许其他任务运行。例如,可以在复杂循环的每次迭代之后插入一个yield。

It is common to combine cooperative multitasking with asynchronous operations. Instead of waiting until an operation is finished and preventing other tasks from running during this time, asynchronous operations return a "not ready" status if the operation is not finished yet. In this case, the waiting task can execute a yield operation to let other tasks run.

将协作多任务处理与异步操作相结合是很常见的。如果操作尚未完成,异步操作不会等待操作完成并阻止其他任务在此期间运行,而是返回"未就绪"状态。在这种情况下,等待任务可以执行yield操作来让其他任务运行。

1.2.1 Saving State 保存状态

Since tasks define their pause points themselves, they don't need the operating system to save their state. Instead, they can save exactly the state they need for continuation before they pause themselves, which often results in better performance. For example, a task that just finished a complex computation might only need to backup the final result of the computation since it does not need the intermediate results anymore.

由于任务自己定义其暂停点,因此它们不需要操作系统来保存其状态。相反,他们可以在暂停之前准确保存继续所需的状态,这通常会带来更好的性能。例如,刚刚完成复杂计算的任务可能只需要备份计算的最终结果,因为它不再需要中间结果。

Language-supported implementations of cooperative tasks are often even able to backup the required parts of the call stack before pausing. As an example, Rust's async/await implementation stores all local variables that are still needed in an automatically generated struct (see below). By backing up the relevant parts of the call stack before pausing, all tasks can share a single call stack, which results in much lower memory consumption per task. This makes it possible to create an almost arbitrary number of cooperative tasks without running out of memory.

语言支持的协作任务实现通常甚至能够在暂停之前备份调用堆栈的所需部分。举个例子,Rust 的 async/await 实现存储了自动生成的结构中仍然需要的所有局部变量(见下文)。通过在暂停之前备份调用堆栈的相关部分,所有任务可以共享一个调用堆栈,从而大大降低每个任务的内存消耗。这使得创建几乎任意数量的协作任务而不会耗尽内存成为可能。

1.2.2 Discussion 讨论

The drawback of cooperative multitasking is that an uncooperative task can potentially run for an unlimited amount of time. Thus, a malicious or buggy task can prevent other tasks from running and slow down or even block the whole system. For this reason, cooperative multitasking should only be used when all tasks are known to cooperate. As a counterexample, it's not a good idea to make the operating system rely on the cooperation of arbitrary user-level programs.

协作式多任务处理的缺点是不协作的任务可能会无限期地运行。因此,恶意或有错误的任务可以阻止其他任务运行并减慢甚至阻塞整个系统。因此,只有当所有任务都已知可以协作时,才应使用协作式多任务处理。作为反例,让操作系统依赖于任意用户级程序的协作并不是一个好主意。

However, the strong performance and memory benefits of cooperative multitasking make it a good approach for usage within a program, especially in combination with asynchronous operations. Since an operating system kernel is a performance-critical program that interacts with asynchronous hardware, cooperative multitasking seems like a good approach for implementing concurrency.

然而,协作多任务处理的强大性能和内存优势使其成为在程序中使用的好方法,特别是与异步操作结合使用。由于操作系统内核是与异步硬件交互的性能关键型程序,因此协作多任务处理似乎是实现并发的好方法。

2. Async/Await in Rust

The Rust language provides first-class support for cooperative multitasking in the form of async/await. Before we can explore what async/await is and how it works, we need to understand how futures and asynchronous programming work in Rust.

Rust 语言以 async/await 的形式为协作多任务提供一流的支持。在我们探索什么是 async/await 及其工作原理之前,我们需要了解 Rust 中的 future 和异步编程是如何工作的。

2.1 Futures

A future represents a value that might not be available yet. This could be, for example, an integer that is computed by another task or a file that is downloaded from the network. Instead of waiting until the value is available, futures make it possible to continue execution until the value is needed.

Future代表一种可能尚不可用的价值。例如,这可以是由另一个任务计算的整数或从网络下载的文件。 Future可以继续执行,直到需要该值,而不是等到该值可用为止。

2.1.1 Example 例子

The concept of futures is best illustrated with a small example:

Future的概念可以用一个小例子来最好地说明:

This sequence diagram shows a main function that reads a file from the file system and then calls a function foo. This process is repeated two times: once with a synchronous read_file call and once with an asynchronous async_read_file call.

此序列图显示了一个 main 函数,该函数从文件系统读取文件,然后调用函数 foo 。此过程重复两次:一次使用同步 read_file 调用,一次使用异步 async_read_file 调用。

With the synchronous call, the main function needs to wait until the file is loaded from the file system. Only then can it call the foo function, which requires it to again wait for the result.

通过同步调用, main 函数需要等待,直到文件从文件系统加载。只有这样它才能调用 foo 函数,这需要它再次等待结果。

With the asynchronous async_read_file call, the file system directly returns a future and loads the file asynchronously in the background. This allows the main function to call foo much earlier, which then runs in parallel with the file load. In this example, the file load even finishes before foo returns, so main can directly work with the file without further waiting after foo returns.

通过异步 async_read_file 调用,文件系统直接返回 future 并在后台异步加载文件。这允许 main 函数更早地调用 foo ,然后与文件加载并行运行。在此示例中,文件加载甚至在 foo 返回之前完成,因此 main 可以直接使用该文件,而无需在 foo 返回后进一步等待。

2.1.2 Futures in Rust

In Rust, futures are represented by the Future trait, which looks like this:

在 Rust 中,futures 由 Future 特征表示,如下所示:

rust 复制代码
pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output>;
}

The associated type Output specifies the type of the asynchronous value. For example, the async_read_file function in the diagram above would return a Future instance with Output set to File.

关联类型 Output 指定异步值的类型。例如,上图中的 async_read_file 函数将返回一个 Future 实例,其中 Output 设置为 File

The poll method allows to check if the value is already available. It returns a Poll enum, which looks like this:

poll 方法允许检查该值是否已经可用。它返回一个 Poll 枚举,如下所示:

rust 复制代码
pub enum Poll<T> {
    Ready(T),
    Pending,
}

When the value is already available (e.g. the file was fully read from disk), it is returned wrapped in the Ready variant. Otherwise, the Pending variant is returned, which signals to the caller that the value is not yet available.

当该值已经可用时(例如,文件已从磁盘完全读取),它将以 Ready 形式返回。否则,将返回 Pending ,这向调用者发出该值尚不可用的信号。

The poll method takes two arguments: self: Pin<&mut Self> and cx: &mut Context. The former behaves similarly to a normal &mut self reference, except that the Self value is pinned to its memory location. Understanding Pin and why it is needed is difficult without understanding how async/await works first. We will therefore explain it later in this post.

poll 方法采用两个参数: self: Pin<&mut Self>cx: &mut Context 。前者的行为与普通 &mut self 引用类似,只是 Self 值固定到其内存位置。如果不先了解 async/await 的工作原理,就很难理解 Pin 以及为什么需要它。因此,我们将在本文后面对此进行解释。

The purpose of the cx: &mut Context parameter is to pass a Waker instance to the asynchronous task, e.g., the file system load. This Waker allows the asynchronous task to signal that it (or a part of it) is finished, e.g., that the file was loaded from disk. Since the main task knows that it will be notified when the Future is ready, it does not need to call poll over and over again. We will explain this process in more detail later in this post when we implement our own waker type.

cx: &mut Context 参数的用途是将 Waker 实例传递给异步任务,例如文件系统加载。这个 Waker 允许异步任务发出信号,表明它(或其一部分)已完成,例如,文件已从磁盘加载。由于主任务知道当 Future 准备好时它会收到通知,因此不需要一遍又一遍地调用 poll 。当我们实现我们自己的唤醒器类型时,我们将在本文后面更详细地解释这个过程。

2.2 Working with Futures 与 Future合作

We now know how futures are defined and understand the basic idea behind the poll method. However, we still don't know how to effectively work with futures. The problem is that futures represent the results of asynchronous tasks, which might not be available yet. In practice, however, we often need these values directly for further calculations. So the question is: How can we efficiently retrieve the value of a future when we need it?

我们现在知道如何定义 Future 并理解 poll 方法背后的基本思想。然而,我们仍然不知道如何有效地使用 Future 。问题在于 Future 代表异步任务的结果,而这些结果可能尚不可用。然而,在实践中,我们经常直接需要这些值来进行进一步的计算。所以问题是:当我们需要时,我们如何才能有效地检索Future的值?

2.2.1 Waiting on Futures

One possible answer is to wait until a future becomes ready. This could look something like this:

一种可能的答案是等到Future准备好。这可能看起来像这样:

rust 复制代码
let future = async_read_file("foo.txt");
let file_content = loop {
    match future.poll(...) {
        Poll::Ready(value) => break value,
        Poll::Pending => {}, // do nothing
    }
}

Here we actively wait for the future by calling poll over and over again in a loop. The arguments to poll don't matter here, so we omitted them. While this solution works, it is very inefficient because we keep the CPU busy until the value becomes available.

在这里,我们通过在循环中一遍又一遍地调用 poll 来主动等待Future。 poll 的参数在这里并不重要,因此我们省略了它们。虽然此解决方案有效,但效率非常低,因为我们让 CPU 保持忙碌状态,直到该值可用为止。

A more efficient approach could be to block the current thread until the future becomes available. This is, of course, only possible if you have threads, so this solution does not work for our kernel, at least not yet. Even on systems where blocking is supported, it is often not desired because it turns an asynchronous task into a synchronous task again, thereby inhibiting the potential performance benefits of parallel tasks.

更有效的方法可能是阻塞当前线程,直到将来可用。当然,这只有在有线程的情况下才有可能,因此该解决方案不适用于我们的内核,至少目前还不行。即使在支持阻塞的系统上,通常也不需要这样做,因为它会再次将异步任务变成同步任务,从而抑制并行任务的潜在性能优势。

2.2.2 Future Combinators Future组合器

An alternative to waiting is to use future combinators. Future combinators are methods like map that allow chaining and combining futures together, similar to the methods of the Iterator trait. Instead of waiting on the future, these combinators return a future themselves, which applies the mapping operation on poll.

等待的另一种选择是使用Future的组合器。 Future 组合器是像 map 这样的方法,允许将 future 链接和组合在一起,类似于 Iterator 特征的方法。这些组合器本身返回一个 future,而不是等待 future,这会在 poll 上应用映射操作。

As an example, a simple string_len combinator for converting a Future<Output = String> to a Future<Output = usize> could look like this:

例如,用于将 Future<Output = String> 转换为 Future<Output = usize> 的简单 string_len 组合器可能如下所示:

rust 复制代码
struct StringLen<F> {
    inner_future: F,
}

impl<F> Future for StringLen<F> where F: Future<Output = String> {
    type Output = usize;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        match self.inner_future.poll(cx) {
            Poll::Ready(s) => Poll::Ready(s.len()),
            Poll::Pending => Poll::Pending,
        }
    }
}

fn string_len(string: impl Future<Output = String>)
    -> impl Future<Output = usize>
{
    StringLen {
        inner_future: string,
    }
}

// Usage
fn file_len() -> impl Future<Output = usize> {
    let file_content_future = async_read_file("foo.txt");
    string_len(file_content_future)
}

This code does not quite work because it does not handle pinning, but it suffices as an example. The basic idea is that the string_len function wraps a given Future instance into a new StringLen struct, which also implements Future. When the wrapped future is polled, it polls the inner future. If the value is not ready yet, Poll::Pending is returned from the wrapped future too. If the value is ready, the string is extracted from the Poll::Ready variant and its length is calculated. Afterwards, it is wrapped in Poll::Ready again and returned.

该代码不太有效,因为它不处理固定,但作为示例就足够了。基本思想是 string_len 函数将给定的 Future 实例包装到新的 StringLen 结构中,该结构还实现 Future 。当对包裹的 future 进行轮询时,它会轮询内部的 future。如果该值尚未准备好,则 Poll::Pending 也会从包装的 future 中返回。如果值已准备好,则从 Poll::Ready 变体中提取字符串并计算其长度。然后再次用 Poll::Ready 包裹并返回。

With this string_len function, we can calculate the length of an asynchronous string without waiting for it. Since the function returns a Future again, the caller can't work directly on the returned value, but needs to use combinator functions again. This way, the whole call graph becomes asynchronous and we can efficiently wait for multiple futures at once at some point, e.g., in the main function.

通过这个 string_len 函数,我们可以计算异步字符串的长度,而无需等待。由于函数再次返回 Future ,因此调用者无法直接处理返回值,而是需要再次使用组合器函数。这样,整个调用图就变得异步了,我们可以在某个时刻(例如在主函数中)有效地同时等待多个 future。

Because manually writing combinator functions is difficult, they are often provided by libraries. While the Rust standard library itself provides no combinator methods yet, the semi-official (and no_std compatible) futures crate does. Its FutureExt trait provides high-level combinator methods such as map or then, which can be used to manipulate the result with arbitrary closures.

由于手动编写组合器函数很困难,因此它们通常由库提供。虽然 Rust 标准库本身尚未提供组合器方法,但半官方(与 no_std 兼容) futures 箱提供了。它的 FutureExt 特性提供了高级组合器方法,例如 mapthen ,可用于通过任意闭包操作结果。

2.2.2.1 Advantages 优点

The big advantage of future combinators is that they keep the operations asynchronous. In combination with asynchronous I/O interfaces, this approach can lead to very high performance. The fact that future combinators are implemented as normal structs with trait implementations allows the compiler to excessively optimize them. For more details, see the Zero-cost futures in Rust post, which announced the addition of futures to the Rust ecosystem.

Future组合器的一大优点是它们保持操作异步。与异步 I/O 接口相结合,这种方法可以带来非常高的性能。事实上,Future的组合器被实现为具有特征实现的普通结构,这使得编译器可以过度优化它们。有关更多详细信息,请参阅 Rust 中的零成本Future 帖子,该帖子宣布在 Rust 生态系统中添加Future。

2.2.2.2 Drawbacks 缺点

While future combinators make it possible to write very efficient code, they can be difficult to use in some situations because of the type system and the closure-based interface. For example, consider code like this:

虽然Future的组合器可以编写非常高效的代码,但由于类型系统和基于闭包的接口,它们在某些情况下可能很难使用。例如,考虑这样的代码:

rust 复制代码
fn example(min_len: usize) -> impl Future<Output = String> {
    async_read_file("foo.txt").then(move |content| {
        if content.len() < min_len {
            Either::Left(async_read_file("bar.txt").map(|s| content + &s))
        } else {
            Either::Right(future::ready(content))
        }
    })
}

(Try it on the playground)

Here we read the file foo.txt and then use the then combinator to chain a second future based on the file content. If the content length is smaller than the given min_len, we read a different bar.txt file and append it to content using the map combinator. Otherwise, we return only the content of foo.txt.

在这里,我们读取文件 foo.txt ,然后使用 then 组合器根据文件内容链接第二个 future。如果内容长度小于给定的 min_len ,我们读取不同的 bar.txt 文件并使用 map 组合器将其附加到 content 。否则,我们仅返回 foo.txt 的内容。

We need to use the move keyword for the closure passed to then because otherwise there would be a lifetime error for min_len. The reason for the Either wrapper is that if and else blocks must always have the same type. Since we return different future types in the blocks, we must use the wrapper type to unify them into a single type. The ready function wraps a value into a future, which is immediately ready. The function is required here because the Either wrapper expects that the wrapped value implements Future.

我们需要使用 move 关键字作为传递给 then 的闭包,否则 min_len 将会出现生命周期错误。使用 Either 包装器的原因是 ifelse 块必须始终具有相同的类型。由于我们在块中返回不同的Future类型,因此我们必须使用包装器类型将它们统一为单一类型。 ready 函数将一个值包装到 future 中,该值立即准备就绪。这里需要该函数,因为 Either 包装器期望包装的值实现 Future

As you can imagine, this can quickly lead to very complex code for larger projects. It gets especially complicated if borrowing and different lifetimes are involved. For this reason, a lot of work was invested in adding support for async/await to Rust, with the goal of making asynchronous code radically simpler to write.

正如您可以想象的那样,这很快就会导致大型项目的代码变得非常复杂。如果涉及借用和不同的生命周期,事情就会变得特别复杂。出于这个原因,我们投入了大量的工作来向 Rust 添加对 async/await 的支持,目的是使异步代码的编写变得更加简单。

2.3 The Async/Await Pattern 异步/等待模式

The idea behind async/await is to let the programmer write code that looks like normal synchronous code, but is turned into asynchronous code by the compiler. It works based on the two keywords async and await. The async keyword can be used in a function signature to turn a synchronous function into an asynchronous function that returns a future:

async/await 背后的想法是让程序员编写看起来像普通同步代码的代码,但由编译器将其转换为异步代码。它基于两个关键字 asyncawait 工作。 async 关键字可以在函数签名中使用,将同步函数转换为返回 future 的异步函数:

rust 复制代码
async fn foo() -> u32 {
    0
}

// the above is roughly translated by the compiler to:
fn foo() -> impl Future<Output = u32> {
    future::ready(0)
}

This keyword alone wouldn't be that useful. However, inside async functions, the await keyword can be used to retrieve the asynchronous value of a future:

单独这个关键字并没有那么有用。但是,在 async 函数内部, await 关键字可用于检索 future 的异步值:

rust 复制代码
async fn example(min_len: usize) -> String {
    let content = async_read_file("foo.txt").await;
    if content.len() < min_len {
        content + &async_read_file("bar.txt").await
    } else {
        content
    }
}

(Try it on the playground)

This function is a direct translation of the example function from above that used combinator functions. Using the .await operator, we can retrieve the value of a future without needing any closures or Either types. As a result, we can write our code like we write normal synchronous code, with the difference that this is still asynchronous code.

该函数是上面使用组合器函数的 example 函数的直接翻译。使用 .await 运算符,我们可以检索 future 的值,而不需要任何闭包或 Either 类型。因此,我们可以像编写普通同步代码一样编写代码,不同之处在于这仍然是异步代码。

2.3.1 State Machine Transformation 状态机

Behind the scenes, the compiler converts the body of the async function into a state machine, with each .await call representing a different state. For the above example function, the compiler creates a state machine with the following four states:

在幕后,编译器将 async 函数的主体转换为状态机,每个 .await 调用代表不同的状态。对于上面的 example 函数,编译器创建了一个具有以下四种状态的状态机:

Each state represents a different pause point in the function. The "Start" and "End" states represent the function at the beginning and end of its execution. The "Waiting on foo.txt" state represents that the function is currently waiting for the first async_read_file result. Similarly, the "Waiting on bar.txt" state represents the pause point where the function is waiting on the second async_read_file result.

每个状态代表函数中不同的暂停点。 "开始"和"结束"状态代表函数执行的开始和结束。 "Waiting on foo.txt"状态表示该函数当前正在等待第一个 async_read_file 结果。类似地,"Waiting on bar.txt"状态表示函数等待第二个 async_read_file 结果的暂停点。

The state machine implements the Future trait by making each poll call a possible state transition:

状态机通过使每个 poll 调用成为可能的状态转换来实现 Future 特征:

The diagram uses arrows to represent state switches and diamond shapes to represent alternative ways. For example, if the foo.txt file is not ready, the path marked with "no" is taken and the "Waiting on foo.txt" state is reached. Otherwise, the "yes" path is taken. The small red diamond without a caption represents the if content.len() < 100 branch of the example function.

该图使用箭头表示状态开关,使用菱形表示替代方式。例如,如果 foo.txt 文件未准备好,则采用标记为"no"的路径,并达到"Waiting on foo.txt"状态。否则,将采取"是"路径。没有标题的红色小菱形代表 example 函数的 if content.len() < 100 分支。

We see that the first poll call starts the function and lets it run until it reaches a future that is not ready yet. If all futures on the path are ready, the function can run till the "End" state, where it returns its result wrapped in Poll::Ready. Otherwise, the state machine enters a waiting state and returns Poll::Pending. On the next poll call, the state machine then starts from the last waiting state and retries the last operation.

我们看到第一个 poll 调用启动了该函数并让它运行,直到到达尚未准备好的 future。如果路径上的所有 future 都已准备好,则该函数可以运行直到"End"状态,并返回包装在 Poll::Ready 中的结果。否则,状态机进入等待状态并返回 Poll::Pending 。在下一次 poll 调用中,状态机将从上次等待状态开始并重试上次操作。

2.3.2 Saving State 保存状态

In order to be able to continue from the last waiting state, the state machine must keep track of the current state internally. In addition, it must save all the variables that it needs to continue execution on the next poll call. This is where the compiler can really shine: Since it knows which variables are used when, it can automatically generate structs with exactly the variables that are needed.

为了能够从最后的等待状态继续,状态机必须在内部跟踪当前状态。此外,它必须保存在下一个 poll 调用中继续执行所需的所有变量。这就是编译器真正发挥作用的地方:因为它知道何时使用哪些变量,所以它可以自动生成包含所需变量的结构。

As an example, the compiler generates structs like the following for the above example function:

例如,编译器为上面的 example 函数生成如下结构:

rust 复制代码
// The `example` function again so that you don't have to scroll up
async fn example(min_len: usize) -> String {
    let content = async_read_file("foo.txt").await;
    if content.len() < min_len {
        content + &async_read_file("bar.txt").await
    } else {
        content
    }
}

// The compiler-generated state structs:

struct StartState {
    min_len: usize,
}

struct WaitingOnFooTxtState {
    min_len: usize,
    foo_txt_future: impl Future<Output = String>,
}

struct WaitingOnBarTxtState {
    content: String,
    bar_txt_future: impl Future<Output = String>,
}

struct EndState {}

In the "start" and "Waiting on foo.txt" states, the min_len parameter needs to be stored for the later comparison with content.len(). The "Waiting on foo.txt" state additionally stores a foo_txt_future, which represents the future returned by the async_read_file call. This future needs to be polled again when the state machine continues, so it needs to be saved.

在"start"和"Waiting on foo.txt"状态下,需要存储 min_len 参数,以便稍后与 content.len() 进行比较。 "Waiting on foo.txt"状态还存储 foo_txt_future ,它表示 async_read_file 调用返回的 future。当状态机继续运行时,这个 future 需要再次轮询,因此需要保存。

The "Waiting on bar.txt" state contains the content variable for the later string concatenation when bar.txt is ready. It also stores a bar_txt_future that represents the in-progress load of bar.txt. The struct does not contain the min_len variable because it is no longer needed after the content.len() comparison. In the "end" state, no variables are stored because the function has already run to completion.

"Waiting on bar.txt"状态包含 content 变量,用于在 bar.txt 准备就绪时进行后续字符串连接。它还存储一个 bar_txt_future 表示正在进行的加载 bar.txt 。该结构不包含 min_len 变量,因为在 content.len() 比较之后不再需要它。在"结束"状态下,不会存储任何变量,因为函数已经运行完成。

Keep in mind that this is only an example of the code that the compiler could generate. The struct names and the field layout are implementation details and might be different.

请记住,这只是编译器可以生成的代码的示例。结构名称和字段布局是实现细节,可能会有所不同。

2.3.3 The Full State Machine Type

完整状态机类型

While the exact compiler-generated code is an implementation detail, it helps in understanding to imagine how the generated state machine could look for the example function. We already defined the structs representing the different states and containing the required variables. To create a state machine on top of them, we can combine them into an enum:

虽然编译器生成的确切代码是实现细节,但它有助于理解想象生成的状态机如何查找 example 函数。我们已经定义了表示不同状态并包含所需变量的结构。要在它们之上创建状态机,我们可以将它们组合成 enum

rust 复制代码
enum ExampleStateMachine {
    Start(StartState),
    WaitingOnFooTxt(WaitingOnFooTxtState),
    WaitingOnBarTxt(WaitingOnBarTxtState),
    End(EndState),
}

We define a separate enum variant for each state and add the corresponding state struct to each variant as a field. To implement the state transitions, the compiler generates an implementation of the Future trait based on the example function:

我们为每个状态定义一个单独的枚举变体,并将相应的状态结构作为字段添加到每个变体中。为了实现状态转换,编译器根据 example 函数生成 Future 特征的实现:

rust 复制代码
impl Future for ExampleStateMachine {
    type Output = String; // return type of `example`

    fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
        loop {
            match self { // TODO: handle pinning
                ExampleStateMachine::Start(state) => {...}
                ExampleStateMachine::WaitingOnFooTxt(state) => {...}
                ExampleStateMachine::WaitingOnBarTxt(state) => {...}
                ExampleStateMachine::End(state) => {...}
            }
        }
    }
}

The Output type of the future is String because it's the return type of the example function. To implement the poll function, we use a match statement on the current state inside a loop. The idea is that we switch to the next state as long as possible and use an explicit return Poll::Pending when we can't continue.

future 的 Output 类型是 String ,因为它是 example 函数的返回类型。为了实现 poll 函数,我们在 loop 内的当前状态上使用 match 语句。我们的想法是,我们尽可能长时间地切换到下一个状态,并在无法继续时使用显式 return Poll::Pending

For simplicity, we only show simplified code and don't handle pinning, ownership, lifetimes, etc. So this and the following code should be treated as pseudo-code and not used directly. Of course, the real compiler-generated code handles everything correctly, albeit possibly in a different way.

为了简单起见,我们只显示简化的代码,不处理固定、所有权、生命周期等。因此,这段代码和下面的代码应该被视为伪代码,而不是直接使用。当然,真正的编译器生成的代码可以正确处理所有内容,尽管可能以不同的方式。

To keep the code excerpts small, we present the code for each match arm separately. Let's begin with the Start state:

为了保持代码篇幅短点,我们分别呈现每个 match 臂的代码。让我们从 Start 状态开始:

rust 复制代码
ExampleStateMachine::Start(state) => {
    // from body of `example`
    let foo_txt_future = async_read_file("foo.txt");
    // `.await` operation
    let state = WaitingOnFooTxtState {
        min_len: state.min_len,
        foo_txt_future,
    };
    *self = ExampleStateMachine::WaitingOnFooTxt(state);
}

The state machine is in the Start state when it is right at the beginning of the function. In this case, we execute all the code from the body of the example function until the first .await. To handle the .await operation, we change the state of the self state machine to WaitingOnFooTxt, which includes the construction of the WaitingOnFooTxtState struct.

当状态机位于函数开头时,状态机处于 Start 状态。在本例中,我们执行从 example 函数主体到第一个 .await 的所有代码。为了处理 .await 操作,我们将 self 状态机的状态更改为 WaitingOnFooTxt ,其中包括 WaitingOnFooTxtState 结构体的构造。

Since the match self {...} statement is executed in a loop, the execution jumps to the WaitingOnFooTxt arm next:

由于 match self {...} 语句是在循环中执行的,所以接下来执行会跳转到 WaitingOnFooTxt 分支:

rust 复制代码
ExampleStateMachine::WaitingOnFooTxt(state) => {
    match state.foo_txt_future.poll(cx) {
        Poll::Pending => return Poll::Pending,
        Poll::Ready(content) => {
            // from body of `example`
            if content.len() < state.min_len {
                let bar_txt_future = async_read_file("bar.txt");
                // `.await` operation
                let state = WaitingOnBarTxtState {
                    content,
                    bar_txt_future,
                };
                *self = ExampleStateMachine::WaitingOnBarTxt(state);
            } else {
                *self = ExampleStateMachine::End(EndState);
                return Poll::Ready(content);
            }
        }
    }
}

In this match arm, we first call the poll function of the foo_txt_future. If it is not ready, we exit the loop and return Poll::Pending. Since self stays in the WaitingOnFooTxt state in this case, the next poll call on the state machine will enter the same match arm and retry polling the foo_txt_future.

在此 match 臂中,我们首先调用 foo_txt_futurepoll 函数。如果还没有准备好,我们退出循环并返回 Poll::Pending 。由于在这种情况下 self 保持在 WaitingOnFooTxt 状态,因此状态机上的下一个 poll 调用将进入相同的 match 臂,并且重试轮询 foo_txt_future

When the foo_txt_future is ready, we assign the result to the content variable and continue to execute the code of the example function: If content.len() is smaller than the min_len saved in the state struct, the bar.txt file is read asynchronously. We again translate the .await operation into a state change, this time into the WaitingOnBarTxt state. Since we're executing the match inside a loop, the execution directly jumps to the match arm for the new state afterward, where the bar_txt_future is polled.

foo_txt_future 准备好后,我们将结果赋值给 content 变量,并继续执行 example 函数的代码: If content.len() 小于状态结构体中保存的 min_len ,则异步读取 bar.txt 文件。我们再次将 .await 操作转换为状态更改,这次转换为 WaitingOnBarTxt 状态。由于我们在循环内执行 match ,因此执行会直接跳转到 match 臂以获取新状态,其中 bar_txt_future 被轮询。

In case we enter the else branch, no further .await operation occurs. We reach the end of the function and return content wrapped in Poll::Ready. We also change the current state to the End state.

如果我们进入 else 分支,则不会发生进一步的 .await 操作。我们到达函数末尾并返回包裹在 Poll::Ready 中的 content 。我们还将当前状态更改为 End 状态。

The code for the WaitingOnBarTxt state looks like this:

WaitingOnBarTxt 状态的代码如下所示:

rust 复制代码
ExampleStateMachine::WaitingOnBarTxt(state) => {
    match state.bar_txt_future.poll(cx) {
        Poll::Pending => return Poll::Pending,
        Poll::Ready(bar_txt) => {
            *self = ExampleStateMachine::End(EndState);
            // from body of `example`
            return Poll::Ready(state.content + &bar_txt);
        }
    }
}

Similar to the WaitingOnFooTxt state, we start by polling the bar_txt_future. If it is still pending, we exit the loop and return Poll::Pending. Otherwise, we can perform the last operation of the example function: concatenating the content variable with the result from the future. We update the state machine to the End state and then return the result wrapped in Poll::Ready.

WaitingOnFooTxt 状态类似,我们首先轮询 bar_txt_future 。如果它仍然处于待处理状态,我们退出循环并返回 Poll::Pending 。否则,我们可以执行 example 函数的最后一个操作:将 content 变量与Future的结果连接起来。我们将状态机更新为 End 状态,然后返回包装在 Poll::Ready 中的结果。

Finally, the code for the End state looks like this:

最后, End 状态的代码如下所示:

rust 复制代码
ExampleStateMachine::End(_) => {
    panic!("poll called after Poll::Ready was returned");
}

Futures should not be polled again after they returned Poll::Ready, so we panic if poll is called while we are already in the End state.

Future 在返回 Poll::Ready 后不应再次轮询,因此如果在我们已经处于 End 状态时调用 poll ,我们会感到恐慌。

We now know what the compiler-generated state machine and its implementation of the Future trait could look like. In practice, the compiler generates code in a different way. (In case you're interested, the implementation is currently based on generators, but this is only an implementation detail.)

我们现在知道编译器生成的状态机及其 Future 特征的实现可能是什么样子。实际上,编译器以不同的方式生成代码。 (如果您感兴趣,目前的实现是基于生成器的,但这只是一个实现细节。)

The last piece of the puzzle is the generated code for the example function itself. Remember, the function header was defined like this:

最后一个难题是为 example 函数本身生成的代码。请记住,函数头的定义如下:

rust 复制代码
async fn example(min_len: usize) -> String

Since the complete function body is now implemented by the state machine, the only thing that the function needs to do is to initialize the state machine and return it. The generated code for this could look like this:

由于现在完整的函数体是由状态机实现的,因此该函数唯一需要做的就是初始化状态机并返回它。生成的代码可能如下所示:

rust 复制代码
fn example(min_len: usize) -> ExampleStateMachine {
    ExampleStateMachine::Start(StartState {
        min_len,
    })
}

The function no longer has an async modifier since it now explicitly returns an ExampleStateMachine type, which implements the Future trait. As expected, the state machine is constructed in the Start state and the corresponding state struct is initialized with the min_len parameter.

该函数不再具有 async 修饰符,因为它现在显式返回 ExampleStateMachine 类型,该类型实现 Future 特征。正如预期的那样,状态机在 Start 状态下构建,并使用 min_len 参数初始化相应的状态结构。

Note that this function does not start the execution of the state machine. This is a fundamental design decision of futures in Rust: they do nothing until they are polled for the first time.

请注意,此函数不会启动状态机的执行。这是 Rust 中 future 的一个基本设计决策:在第一次轮询之前它们不执行任何操作。

2.4 Pinning 固定

We already stumbled across pinning multiple times in this post. Now is finally the time to explore what pinning is and why it is needed.

我们已经在这篇文章中多次偶然发现固定。现在终于是时候探讨什么是固定以及为什么需要它了。

2.4.1 Self-Referential Structs 自引用结构

As explained above, the state machine transformation stores the local variables of each pause point in a struct. For small examples like our example function, this was straightforward and did not lead to any problems. However, things become more difficult when variables reference each other. For example, consider this function:

如上所述,状态机转换将每个暂停点的局部变量存储在结构中。对于像我们的 example 函数这样的小例子,这很简单并且不会导致任何问题。然而,当变量相互引用时,事情会变得更加困难。例如,考虑这个函数:

rust 复制代码
async fn pin_example() -> i32 {
    let array = [1, 2, 3];
    let element = &array[2];
    async_write_file("foo.txt", element.to_string()).await;
    *element
}

This function creates a small array with the contents 1, 2, and 3. It then creates a reference to the last array element and stores it in an element variable. Next, it asynchronously writes the number converted to a string to a foo.txt file. Finally, it returns the number referenced by element.

此函数创建一个小的 array ,其内容为 123 。然后,它创建对最后一个数组元素的引用并将其存储在 element 变量中。接下来,它将转换为字符串的数字异步写入 foo.txt 文件。最后,它返回 element 引用的数字。

Since the function uses a single await operation, the resulting state machine has three states: start, end, and "waiting on write". The function takes no arguments, so the struct for the start state is empty. Like before, the struct for the end state is empty because the function is finished at this point. The struct for the "waiting on write" state is more interesting:

由于该函数使用单个 await 操作,因此生成的状态机具有三种状态:开始、结束和"等待写入"。该函数不带参数,因此起始状态的结构体为空。和之前一样,结束状态的结构体是空的,因为函数此时已完成。 "等待写入"状态的结构更有趣:

rust 复制代码
struct WaitingOnWriteState {
    array: [1, 2, 3],
    element: 0x1001c, // address of the last array element
}

We need to store both the array and element variables because element is required for the return value and array is referenced by element. Since element is a reference, it stores a pointer (i.e., a memory address) to the referenced element. We used 0x1001c as an example memory address here. In reality, it needs to be the address of the last element of the array field, so it depends on where the struct lives in memory. Structs with such internal pointers are called self-referential structs because they reference themselves from one of their fields.

我们需要存储 arrayelement 变量,因为返回值需要 element ,而 arrayelement 。由于 element 是一个引用,因此它存储指向所引用元素的指针(即内存地址)。我们在这里使用 0x1001c 作为示例内存地址。实际上,它需要是 array 字段最后一个元素的地址,因此它取决于结构在内存中的位置。具有此类内部指针的结构称为自引用结构,因为它们从其字段之一引用自身。

2.4.2 The Problem with Self-Referential Structs 自引用结构的问题

The internal pointer of our self-referential struct leads to a fundamental problem, which becomes apparent when we look at its memory layout:

自引用结构的内部指针导致了一个基本问题,当我们查看其内存布局时,这个问题就变得显而易见:

The array field starts at address 0x10014 and the element field at address 0x10020. It points to address 0x1001c because the last array element lives at this address. At this point, everything is still fine. However, an issue occurs when we move this struct to a different memory address:

array 字段从地址 0x10014 开始, element 字段从地址 0x10020 开始。它指向地址 0x1001c,因为最后一个数组元素位于该地址。此时,一切都还好。但是,当我们将此结构移动到不同的内存地址时,就会出现问题:

We moved the struct a bit so that it starts at address 0x10024 now. This could, for example, happen when we pass the struct as a function argument or assign it to a different stack variable. The problem is that the element field still points to address 0x1001c even though the last array element now lives at address 0x1002c. Thus, the pointer is dangling, with the result that undefined behavior occurs on the next poll call.

我们稍微移动了结构体,使其现在从地址 0x10024 开始。例如,当我们将结构体作为函数参数传递或将其分配给不同的堆栈变量时,可能会发生这种情况。问题在于,即使最后一个 array 元素现在位于地址 0x1002c 处, element 字段仍然指向地址 0x1001c 。因此,指针悬空,导致下一次 poll 调用时出现未定义的行为。

2.4.3 Possible Solutions 可能的解决方案

There are three fundamental approaches to solving the dangling pointer problem:

解决悬空指针问题的基本方法有以下三种:

  • Update the pointer on move: The idea is to update the internal pointer whenever the struct is moved in memory so that it is still valid after the move. Unfortunately, this approach would require extensive changes to Rust that would result in potentially huge performance losses. The reason is that some kind of runtime would need to keep track of the type of all struct fields and check on every move operation whether a pointer update is required.

    移动时更新指针:这个想法是每当结构体在内存中移动时更新内部指针,以便它在移动后仍然有效。不幸的是,这种方法需要对 Rust 进行大量更改,这可能会导致巨大的性能损失。原因是某种运行时需要跟踪所有结构体字段的类型,并检查每个移动操作是否需要更新指针。

  • Store an offset instead of self-references: : To avoid the requirement for updating pointers, the compiler could try to store self-references as offsets from the struct's beginning instead. For example, the element field of the above WaitingOnWriteState struct could be stored in the form of an element_offset field with a value of 8 because the array element that the reference points to starts 8 bytes after the struct's beginning. Since the offset stays the same when the struct is moved, no field updates are required.

    存储偏移量而不是自引用:为了避免更新指针的要求,编译器可以尝试将自引用存储为距结构开头的偏移量。例如,上述 WaitingOnWriteState 结构体的 element 字段可以以值为 8 的 element_offset 字段的形式存储,因为该数组元素引用指向结构体开始后 8 个字节的位置。由于移动结构时偏移量保持不变,因此不需要更新字段。

    The problem with this approach is that it requires the compiler to detect all self-references. This is not possible at compile-time because the value of a reference might depend on user input, so we would need a runtime system again to analyze references and correctly create the state structs. This would not only result in runtime costs but also prevent certain compiler optimizations, so that it would cause large performance losses again.

    这种方法的问题在于它要求编译器检测所有自引用。这在编译时是不可能的,因为引用的值可能取决于用户输入,因此我们再次需要一个运行时系统来分析引用并正确创建状态结构。这不仅会导致运行时成本,还会阻止某些编译器优化,从而再次造成较大的性能损失。

  • Forbid moving the struct: As we saw above, the dangling pointer only occurs when we move the struct in memory. By completely forbidding move operations on self-referential structs, the problem can also be avoided. The big advantage of this approach is that it can be implemented at the type system level without additional runtime costs. The drawback is that it puts the burden of dealing with move operations on possibly self-referential structs on the programmer.

    禁止移动结构体:正如我们在上面看到的,悬空指针仅当我们在内存中移动结构体时才会出现。通过完全禁止自引用结构上的移动操作,也可以避免该问题。这种方法的一大优点是它可以在类型系统级别实现,而无需额外的运行时成本。缺点是它给程序员带来了处理可能自引用结构上的移动操作的负担。

Rust chose the third solution because of its principle of providing zero cost abstractions , which means that abstractions should not impose additional runtime costs. The pinning API was proposed for this purpose in RFC 2349. In the following, we will give a short overview of this API and explain how it works with async/await and futures.

Rust 选择第三种解决方案是因为它提供零成本抽象的原则,这意味着抽象不应该施加额外的运行时成本。 RFC 2349 中为此目的提出了固定 API。下面,我们将简要概述该 API,并解释它如何与 async/await 和 future 配合使用。

2.4.4 Heap Values 堆上分配的值

The first observation is that heap-allocated values already have a fixed memory address most of the time. They are created using a call to allocate and then referenced by a pointer type such as Box<T>. While moving the pointer type is possible, the heap value that the pointer points to stays at the same memory address until it is freed through a deallocate call again.

第一个观察结果是,堆分配的值在大多数情况下已经具有固定的内存地址。它们是通过调用 allocate 创建的,然后由诸如 Box<T> 之类的指针类型引用。虽然可以移动指针类型,但指针指向的堆值将保留在同一内存地址,直到再次通过 deallocate 调用将其释放。

Using heap allocation, we can try to create a self-referential struct:

使用堆分配,我们可以尝试创建一个自引用结构:

rust 复制代码
fn main() {
    let mut heap_value = Box::new(SelfReferential {
        self_ptr: 0 as *const _,
    });
    let ptr = &*heap_value as *const SelfReferential;
    heap_value.self_ptr = ptr;
    println!("heap value at: {:p}", heap_value);
    println!("internal reference: {:p}", heap_value.self_ptr);
}

struct SelfReferential {
    self_ptr: *const Self,
}

(Try it on the playground)

We create a simple struct named SelfReferential that contains a single pointer field. First, we initialize this struct with a null pointer and then allocate it on the heap using Box::new. We then determine the memory address of the heap-allocated struct and store it in a ptr variable. Finally, we make the struct self-referential by assigning the ptr variable to the self_ptr field.

我们创建一个名为 SelfReferential 的简单结构,其中包含单个指针字段。首先,我们使用空指针初始化该结构,然后使用 Box::new 在堆上分配它。然后,我们确定堆分配结构的内存地址并将其存储在 ptr 变量中。最后,我们通过将 ptr 变量分配给 self_ptr 字段来使结构体自引用。

When we execute this code on the playground, we see that the address of the heap value and its internal pointer are equal, which means that the self_ptr field is a valid self-reference. Since the heap_value variable is only a pointer, moving it (e.g., by passing it to a function) does not change the address of the struct itself, so the self_ptr stays valid even if the pointer is moved.

当我们在 Playground 上执行此代码时,我们看到堆值的地址与其内部指针相等,这意味着 self_ptr 字段是有效的自引用。由于 heap_value 变量只是一个指针,移动它(例如,通过将其传递给函数)不会更改结构本身的地址,因此 self_ptr 保持有效,即使指针被移动。

However, there is still a way to break this example: We can move out of a Box<T> or replace its content:

然而,仍然有一种方法可以打破这个例子:我们可以移出 Box<T> 或替换它的内容:

rust 复制代码
let stack_value = mem::replace(&mut *heap_value, SelfReferential {
    self_ptr: 0 as *const _,
});
println!("value at: {:p}", &stack_value);
println!("internal reference: {:p}", stack_value.self_ptr);

(Try it on the playground)

Here we use the mem::replace function to replace the heap-allocated value with a new struct instance. This allows us to move the original heap_value to the stack, while the self_ptr field of the struct is now a dangling pointer that still points to the old heap address. When you try to run the example on the playground, you see that the printed "value at:" and "internal reference:" lines indeed show different pointers. So heap allocating a value is not enough to make self-references safe.

这里我们使用 mem::replace 函数将堆分配的值替换为新的结构体实例。这允许我们将原始的 heap_value 移动到堆栈,而结构的 self_ptr 字段现在是一个悬空指针,仍然指向旧的堆地址。当您尝试在操场上运行该示例时,您会看到打印的"value at:"和"internal reference:"行确实显示了不同的指针。因此,堆分配值不足以使自引用安全。

The fundamental problem that allowed the above breakage is that Box<T> allows us to get a &mut T reference to the heap-allocated value. This &mut reference makes it possible to use methods like mem::replace or mem::swap to invalidate the heap-allocated value. To resolve this problem, we must prevent &mut references to self-referential structs from being created.

导致上述损坏的根本问题是 Box<T> 允许我们获取对堆分配值的 &mut T 引用。此 &mut 引用使得可以使用 mem::replacemem::swap 等方法来使堆分配的值无效。要解决此问题,我们必须防止创建对自引用结构的 &mut 引用。

2.4.5 Pin<Box<T>> and Unpin

The pinning API provides a solution to the &mut T problem in the form of the Pin wrapper type and the Unpin marker trait. The idea behind these types is to gate all methods of Pin that can be used to get &mut references to the wrapped value (e.g. get_mut or deref_mut) on the Unpin trait. The Unpin trait is an auto trait, which is automatically implemented for all types except those that explicitly opt-out. By making self-referential structs opt-out of Unpin, there is no (safe) way to get a &mut T from a Pin<Box<T>> type for them. As a result, their internal self-references are guaranteed to stay valid.

固定 API 以 Pin 包装类型和 Unpin 标记特征的形式提供了 &mut T 问题的解决方案。这些类型背后的想法是控制 Pin某些方法 ,而这些方法 可用于获取对包装值的 &mut 引用(例如 Unpin特征上的 get_mutderef_mut 方法)。 Unpin 特征是一个自动特征,它会自动为所有类型实现 ,除了那些明确选择退出的类型 。通过使自引用结构选择退出 Unpin ,没有(安全)方法从 Pin<Box<T>> 类型中获取 &mut T 。因此,他们的内部自我引用保证保持有效。

UnPin 直接翻译过来就是不固定,也就是容许 移动(move), 基本上所有的类型默认都是 UnPin的,也就是可以移动的

As an example, let's update the SelfReferential type from above to opt-out of Unpin:

例如,让我们将上面的 SelfReferential 类型更新为选择退出 Unpin

rust 复制代码
use core::marker::PhantomPinned;

struct SelfReferential {
    self_ptr: *const Self,
    _pin: PhantomPinned,
}

We opt-out by adding a second _pin field of type PhantomPinned. This type is a zero-sized marker type whose only purpose is to not implement the Unpin trait. Because of the way auto traits work, a single field that is not Unpin suffices to make the complete struct opt-out of Unpin.

我们通过添加第二个 PhantomPinned 类型的 _pin 字段来选择退出。此类型是零大小的标记类型,其唯一目的是不实现 Unpin 特征。由于自动特征的工作方式,非 Unpin 的单个字段足以使 Unpin 的完整结构选择退出。

The second step is to change the Box<SelfReferential> type in the example to a Pin<Box<SelfReferential>> type. The easiest way to do this is to use the Box::pin function instead of Box::new for creating the heap-allocated value:

第二步是将示例中的 Box<SelfReferential> 类型更改为 Pin<Box<SelfReferential>> 类型。最简单的方法是使用 Box::pin 函数而不是 Box::new 来创建堆分配值:

rust 复制代码
let mut heap_value = Box::pin(SelfReferential {
    self_ptr: 0 as *const _,
    _pin: PhantomPinned,
});

In addition to changing Box::new to Box::pin, we also need to add the new _pin field in the struct initializer. Since PhantomPinned is a zero-sized type, we only need its type name to initialize it.

除了将 Box::new 更改为 Box::pin 之外,我们还需要在结构初始值设定项中添加新的 _pin 字段。由于 PhantomPinned 是一个零大小的类型,因此我们只需要它的类型名称来初始化它。

When we try to run our adjusted example now, we see that it no longer works:

当我们现在尝试运行调整后的示例时,我们发现它不再起作用:

rust 复制代码
error[E0594]: cannot assign to data in a dereference of `std::pin::Pin<std::boxed::Box<SelfReferential>>`
  --> src/main.rs:10:5
   |
10 |     heap_value.self_ptr = ptr;
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^ cannot assign
   |
   = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::pin::Pin<std::boxed::Box<SelfReferential>>`

error[E0596]: cannot borrow data in a dereference of `std::pin::Pin<std::boxed::Box<SelfReferential>>` as mutable
  --> src/main.rs:16:36
   |
16 |     let stack_value = mem::replace(&mut *heap_value, SelfReferential {
   |                                    ^^^^^^^^^^^^^^^^ cannot borrow as mutable
   |
   = help: trait `DerefMut` is required to modify through a dereference, but it is not implemented for `std::pin::Pin<std::boxed::Box<SelfReferential>>`

Both errors occur because the Pin<Box<SelfReferential>> type no longer implements the DerefMut trait. This is exactly what we wanted because the DerefMut trait would return a &mut reference, which we wanted to prevent. This only happens because we both opted-out of Unpin and changed Box::new to Box::pin.

这两个错误的发生是因为 Pin<Box<SelfReferential>> 类型不再实现 DerefMut 特征。这正是我们想要的,因为 DerefMut 特征将返回 &mut 引用,这是我们想要阻止的。这只发生是因为我们都选择退出 Unpin 并将 Box::new 更改为 Box::pin

The problem now is that the compiler does not only prevent moving the type in line 16, but also forbids initializing the self_ptr field in line 10. This happens because the compiler can't differentiate between valid and invalid uses of &mut references. To get the initialization working again, we have to use the unsafe get_unchecked_mut method:

现在的问题是,编译器不仅阻止移动第 16 行中的类型,还禁止初始化第 10 行中的 self_ptr 字段。发生这种情况是因为编译器无法区分有效和无效的使用 &mut 参考文献。为了让初始化再次工作,我们必须使用不安全的 get_unchecked_mut 方法:

rust 复制代码
use std::mem;
use std::marker::PhantomPinned;
use std::pin::Pin;

fn main() {
    let mut heap_value = Box::pin(SelfReferential {
        self_ptr: 0 as *const _,
        _pin: PhantomPinned,
    });
    let ptr = &*heap_value as *const SelfReferential;
    
    // safe because modifying a field doesn't move the whole struct
    unsafe {
        let mut_ref = Pin::as_mut(&mut heap_value);
        Pin::get_unchecked_mut(mut_ref).self_ptr = ptr;
    }

    println!("heap value at: {:p}", heap_value);
    println!("internal reference: {:p}", heap_value.self_ptr);
    
    // break it
    
    let stack_value = mem::replace(&mut *heap_value, SelfReferential {
        self_ptr: 0 as *const _,
        _pin: PhantomPinned,
    });
    println!("value at: {:p}", &stack_value);
    println!("internal reference: {:p}", stack_value.self_ptr);
}

struct SelfReferential {
    self_ptr: *const Self,
    _pin: PhantomPinned,
}

(Try it on the playground)

The get_unchecked_mut function works on a Pin<&mut T> instead of a Pin<Box<T>>, so we have to use Pin::as_mut for converting the value. Then we can set the self_ptr field using the &mut reference returned by get_unchecked_mut.

get_unchecked_mut 函数适用于 Pin<&mut T> 而不是 Pin<Box<T>> ,因此我们必须使用 Pin::as_mut 来转换值。然后我们可以使用 get_unchecked_mut 返回的 &mut 引用来设置 self_ptr 字段。

Now the only error left is the desired error on mem::replace. Remember, this operation tries to move the heap-allocated value to the stack, which would break the self-reference stored in the self_ptr field. By opting out of Unpin and using Pin<Box<T>>, we can prevent this operation at compile time and thus safely work with self-referential structs. As we saw, the compiler is not able to prove that the creation of the self-reference is safe (yet), so we need to use an unsafe block and verify the correctness ourselves.

现在剩下的唯一错误是 mem::replace 上所需的错误。请记住,此操作尝试将堆分配的值移动到堆栈,这会破坏 self_ptr 字段中存储的自引用。通过选择退出 Unpin 并使用 Pin<Box<T>> ,我们可以在编译时阻止此操作,从而安全地使用自引用结构。正如我们所看到的,编译器还无法证明自引用的创建是安全的,因此我们需要使用不安全的块并自己验证正确性。

2.4.6 Stack Pinning and Pin<&mut T>

In the previous section, we learned how to use Pin<Box<T>> to safely create a heap-allocated self-referential value. While this approach works fine and is relatively safe (apart from the unsafe construction), the required heap allocation comes with a performance cost. Since Rust strives to provide zero-cost abstractions whenever possible, the pinning API also allows to create Pin<&mut T> instances that point to stack-allocated values.

在上一节中,我们学习了如何使用 Pin<Box<T>> 安全地创建堆分配的自引用值。虽然这种方法工作正常并且相对安全(除了不安全的构造),但所需的堆分配会带来性能成本。由于 Rust 致力于尽可能提供零成本抽象,因此固定(pinning ) API 还允许创建指向栈分配值的 Pin<&mut T> 实例。

Unlike Pin<Box<T>> instances, which have ownership of the wrapped value, Pin<&mut T> instances only temporarily borrow the wrapped value. This makes things more complicated, as it requires the programmer to ensure additional guarantees themselves. Most importantly, a Pin<&mut T> must stay pinned for the whole lifetime of the referenced T, which can be difficult to verify for stack-based variables. To help with this, crates like pin-utils exist, but I still wouldn't recommend pinning to the stack unless you really know what you're doing.

拥有包装值所有权Pin<Box<T>> 实例不同, Pin<&mut T> 实例仅临时借用 包装值。这使事情变得更加复杂,因为它需要程序员自己确保额外的保证。最重要的是, Pin<&mut T> 必须在引用的 T 的整个生命周期内保持固定,这对于基于栈的变量来说很难验证。为了帮助解决这个问题,存在像 pin-utils 这样的包,但我仍然不建议固定到栈,除非你真的知道自己在做什么。

For further reading, check out the documentation of the pin module and the Pin::new_unchecked method.

如需进一步阅读,请查看 pin 模块和 Pin::new_unchecked 方法的文档。

2.4.7 Pinning and Futures

As we already saw in this post, the Future::poll method uses pinning in the form of a Pin<&mut Self> parameter:

正如我们在这篇文章中已经看到的, Future::poll 方法以 Pin<&mut Self> 参数的形式使用固定:

rust 复制代码
fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output>

The reason that this method takes self: Pin<&mut Self> instead of the normal &mut self is that future instances created from async/await are often self-referential, as we saw above. By wrapping Self into Pin and letting the compiler opt-out of Unpin for self-referential futures generated from async/await, it is guaranteed that the futures are not moved in memory between poll calls. This ensures that all internal references are still valid.

此方法采用 self: Pin<&mut Self> 而不是正常的 &mut self 的原因是,从 async/await 创建的Future实例通常是自引用的,正如我们上面所看到的。通过将 Self 包装到 Pin 中,并让编译器针对从 async/await 生成的自引用 Future 并且显式退出 Unpin ,可以保证 Future 在 poll 调用之间不会在内存中移动。这可确保所有内部引用仍然有效。

It is worth noting that moving futures before the first poll call is fine. This is a result of the fact that futures are lazy and do nothing until they're polled for the first time. The start state of the generated state machines therefore only contains the function arguments but no internal references. In order to call poll, the caller must wrap the future into Pin first, which ensures that the future cannot be moved in memory anymore. Since stack pinning is more difficult to get right, I recommend to always use Box::pin combined with Pin::as_mut for this.

值得注意的是,在第一个 poll 调用之前 Future 是可以的移动的。这是因为 Future 很懒惰,在第一次轮询之前什么都不做。因此,生成的状态机的 start 状态仅包含函数参数,但不包含内部引用,自然也不会包含自引用 。为了调用 poll ,调用者必须首先将 Future 包装到 Pin 中,这确保 Future 不能再在内存中移动。由于 栈固定更难正确执行,因此我建议始终将 Box::pinPin::as_mut 结合使用。

In case you're interested in understanding how to safely implement a future combinator function using stack pinning yourself, take a look at the relatively short source of the map combinator method of the futures crate and the section about projections and structural pinning of the pin documentation.

如果您有兴趣了解如何使用栈固定来安全地实Future的组合器函数,请查看 futures crate 的 map 组合器方法的源码以及文档 projections and structural pinning 中有关Pin的部分。

2.5 Executors and Wakers 执行器和唤醒器

Using async/await, it is possible to ergonomically work with futures in a completely asynchronous way. However, as we learned above, futures do nothing until they are polled. This means we have to call poll on them at some point, otherwise the asynchronous code is never executed.

使用 async/await,可以以完全异步的方式符合人体工程学地处理 Future。然而,正如我们在上面了解到的,Future在被轮询之前不会执行任何操作。这意味着我们必须在某个时刻对它们调用 poll ,否则异步代码永远不会执行。

With a single future, we can always wait for each future manually using a loop as described above. However, this approach is very inefficient and not practical for programs that create a large number of futures. The most common solution to this problem is to define a global executor that is responsible for polling all futures in the system until they are finished.

对于单个 Future,我们总是可以使用如上所述的循环手动等待每个 future。然而,这种方法效率非常低,对于创建大量Future的程序来说并不实用。这个问题最常见的解决方案是定义一个全局执行器,负责轮询系统中的所有 Future,直到它们完成。

2.5.1 Executors 执行器

The purpose of an executor is to allow spawning futures as independent tasks, typically through some sort of spawn method. The executor is then responsible for polling all futures until they are completed. The big advantage of managing all futures in a central place is that the executor can switch to a different future whenever a future returns Poll::Pending. Thus, asynchronous operations are run in parallel and the CPU is kept busy.

执行器的目的是允许将 Future 作为独立任务生成,通常通过某种 spawn 方法。然后,执行者负责轮询所有 future,直到完成。集中管理所有 Future 的一大优点是,每当 Future 返回 Poll::Pending 时,执行者就可以切换到不同的 Future 。因此,异步操作并行运行,CPU 保持忙碌状态。

Many executor implementations can also take advantage of systems with multiple CPU cores. They create a thread pool that is able to utilize all cores if there is enough work available and use techniques such as work stealing to balance the load between cores. There are also special executor implementations for embedded systems that optimize for low latency and memory overhead.

许多执行器实现还可以利用具有多个 CPU 核心的系统。他们创建了一个线程池,如果有足够的可用工作,则能够利用所有核心,并使用工作窃取等技术来平衡核心之间的负载。还有针对嵌入式系统的特殊执行器实现,可优化低延迟和内存开销。

To avoid the overhead of polling futures repeatedly, executors typically take advantage of the waker API supported by Rust's futures.

为了避免重复轮询 Future 的开销,执行器通常利用 Rust Future 支持的 waker API。

2.5.2 Wakers 唤醒器

The idea behind the waker API is that a special Waker type is passed to each invocation of poll, wrapped in the Context type. This Waker type is created by the executor and can be used by the asynchronous task to signal its (partial) completion. As a result, the executor does not need to call poll on a future that previously returned Poll::Pending until it is notified by the corresponding waker.

waker API 背后的想法是将特殊的 Waker 类型传递给 poll 的每次调用,并封装在 Context 类型中。此 Waker 类型由执行器创建,异步任务可以使用它来指示其(部分)完成。因此,执行器不需要在之前返回 Poll::Pending 的 Future 上调用 poll ,直到收到相应 Waker 的通知。

This is best illustrated by a small example:

通过一个小例子可以最好地说明这一点:

rust 复制代码
async fn write_file() {
    async_write_file("foo.txt", "Hello").await;
}

This function asynchronously writes the string "Hello" to a foo.txt file. Since hard disk writes take some time, the first poll call on this future will likely return Poll::Pending. However, the hard disk driver will internally store the Waker passed to the poll call and use it to notify the executor when the file is written to disk. This way, the executor does not need to waste any time trying to poll the future again before it receives the waker notification.

此函数将字符串"Hello"异步写入 foo.txt 文件。由于硬盘写入需要一些时间,因此对该 Future 的第一个 poll 调用可能会返回 Poll::Pending 。但是,硬盘驱动程序将在内部保存传递给 poll 调用的 Waker ,并在文件写入磁盘时使用它来通知执行器。这样,执行器在收到唤醒通知之前就不需要浪费任何时间再次尝试 poll Future。

We will see how the Waker type works in detail when we create our own executor with waker support in the implementation section of this post.

当我们在本文的实现部分创建自己的具有唤醒支持的执行器时,我们将详细了解 Waker 类型的工作原理。

2.6 Cooperative Multitasking? 协作多任务处理?

At the beginning of this post, we talked about preemptive and cooperative multitasking. While preemptive multitasking relies on the operating system to forcibly switch between running tasks, cooperative multitasking requires that the tasks voluntarily give up control of the CPU through a yield operation on a regular basis. The big advantage of the cooperative approach is that tasks can save their state themselves, which results in more efficient context switches and makes it possible to share the same call stack between tasks.

在这篇文章的开头,我们讨论了抢占式和协作式多任务处理。抢占式多任务处理依赖操作系统在正在运行的任务之间强制切换,而协作式多任务处理则要求任务定期通过让出操作自愿放弃对 CPU 的控制。协作方法的一大优点是任务可以自己保存其状态,这会导致更有效的上下文切换,并使任务之间共享相同的调用堆栈成为可能。

It might not be immediately apparent, but futures and async/await are an implementation of the cooperative multitasking pattern:

它可能不会立即显而易见,但 Future 和 async/await 是协作多任务模式的实现:

  • Each future that is added to the executor is basically a cooperative task.

    添加到执行器的每个 Future 基本上都是一个合作任务。

  • Instead of using an explicit yield operation, futures give up control of the CPU core by returning Poll::Pending (or Poll::Ready at the end).

    Future 不使用显式的让出 (yield) 操作,而是通过返回 Poll::Pending (或最后的 Poll::Ready )来放弃对 CPU 核心的控制。

    • There is nothing that forces futures to give up the CPU. If they want, they can never return from poll, e.g., by spinning endlessly in a loop.

      没有什么可以迫使Future放弃CPU。如果他们愿意,他们永远无法从 poll 返回,例如,通过循环无休止地自旋。

    • Since each future can block the execution of the other futures in the executor, we need to trust them to not be malicious.

      由于每个 Future 都可以阻止执行器中其他 Future 的执行,因此我们需要相信它们不是恶意的。

  • Futures internally store all the state they need to continue execution on the next poll call. With async/await, the compiler automatically detects all variables that are needed and stores them inside the generated state machine.

    Future 内部存储在下一个 poll 调用中继续执行所需的所有状态 。使用 async/await,编译器会自动检测所需的所有变量并将它们存储在生成的状态机中

    • Only the minimum state required for continuation is saved.

      仅保存继续所需的最低状态。

    • Since the poll method gives up the call stack when it returns, the same stack can be used for polling other futures.

      由于 poll 方法在返回时放弃调用堆栈,因此可以使用相同的堆栈来轮询其他 Future。

We see that futures and async/await fit the cooperative multitasking pattern perfectly; they just use some different terminology. In the following, we will therefore use the terms "task" and "future" interchangeably.

我们看到 Futures 和 async/await 非常适合协作多任务模式;他们只是使用一些不同的术语。因此,在下文中,我们将互换使用术语"Task"和"Future"。

相关推荐
Benaso1 小时前
Rust 快速入门(一)
开发语言·后端·rust
姜学迁20 小时前
Rust-枚举
开发语言·后端·rust
凌云行者20 小时前
rust的迭代器方法——collect
开发语言·rust
QMCY_jason1 天前
Ubuntu 安装RUST
linux·ubuntu·rust
碳苯1 天前
【rCore OS 开源操作系统】Rust 枚举与模式匹配
开发语言·人工智能·后端·rust·操作系统·os
zaim11 天前
计算机的错误计算(一百一十四)
java·c++·python·rust·go·c·多项式
凌云行者2 天前
使用rust写一个Web服务器——单线程版本
服务器·前端·rust
cyz1410012 天前
vue3+vite@4+ts+elementplus创建项目详解
开发语言·后端·rust
超人不怕冷2 天前
[rust]多线程通信之通道
rust
逢生博客2 天前
Rust 语言开发 ESP32C3 并在 Wokwi 电子模拟器上运行(esp-hal 非标准库、LCD1602、I2C)
开发语言·后端·嵌入式硬件·rust