A Brief Look at C++20 Coroutines

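I was misled by a lot of coroutine articles and videos, and for a long time I swallowed the material whole without really understanding it. So I am writing down a few notes here, in the hope that they help whoever comes next.

This article focuses on the basic principles and basic usage of coroutines, to help you understand and master C++20 coroutines. Before reading, make sure you have already seen at least a little of how C++20 coroutines are used.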

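Contents

A brief look at coroutines
- Functions, threads, and coroutines
- Awaiters
Coroutine switching
- Advantages
- Context switching
In practice: epoll + coroutines
- Build and run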

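A brief look at coroutines

Most articles open either by introducing coroutine concepts (symmetric versus asymmetric, stackful versus stackless) or by walking through the C++20 coroutine functions, types, usage, and pitfalls.

I am more pragmatic than that. Those openings are too abstract: for someone who does not yet know coroutines, being hit straight away with stackful versus stackless and symmetric versus asymmetric is, in my view, rote learning.

Spending a lot of effort on C++20 usage alone, on the other hand, is too shallow. What I would like you to take away is how a coroutine actually executes; beyond that, a coroutine is not very different from ordinary program code.

Functions, threads, and coroutines

Let's start with code: a function call written in assembly: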


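section .text
    global _start   ; program entry point

; Function: eax_to_ebx
; Purpose:  copy the value of eax into ebx
; Params:   none (operates on registers directly)
; Returns:  nothing
eax_to_ebx:
    push ebp        ; save the caller's frame base (start of the prologue)
    mov  ebp, esp   ; set up the new frame base

    mov  ebx, eax   ; the actual work: copy eax into ebx

    pop  ebp        ; restore the caller's frame base
    ret             ; return (pops the return address into eip)

; Program entry
_start:
    ; Step 1: put a test value in eax
    mov  eax, 0x12345678

    ; Step 2: call the function, which copies eax into ebx
    call eax_to_ebx

    ; ebx now equals 0x12345678; exit through a system call to check it
    mov  eax, 1     ; Linux system call number 1: sys_exit
    int  0x80       ; trigger the system call (ebx is the exit status; only its low 8 bits, 0x78, reach the parent)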

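This assembly does nothing interesting; it is only here to show what a function call really consists of:

- Save the caller's stack frame
- Set up a new stack frame
- Push local variables onto the stack (if any)
- Restore the stack
- Return (by changing the instruction pointer)

So a function is, at its core, a linear, synchronous, one-shot mode of execution: the caller must wait until the callee has finished and returned.

When the program reaches call eax_to_ebx, control moves into the function body, and it comes back to the caller's code only after the whole function has run.

Note that control returns only once the entire function has finished (ignoring exceptions, early returns, signals, and the like).

Now look at this code: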


#include <thread>
#include <iostream>

using namespace std;

int main()
{
	jthread t([] {
		cout << "hello thread" << endl;
	});
	return 0;
}

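As you can see, this code creates a thread that runs a lambda. Because the lambda runs on a new thread, main and the lambda have independent flows of execution and do not interfere with each other.

If main needs to synchronize with the lambda, or to exclude it from a critical section, you have to bring in locks, condition variables, and similar mechanisms to control the flow of execution.

For example, a lock can block a function until the lock is released, after which it continues running.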

Now for coroutine code, a simple number-guessing game:


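#include <iostream>
#include <coroutine>
#include <exception>   // std::exception_ptr, std::current_exception
#include <thread>
#include <cstdint>
#include <cstdlib>     // rand, srand
#include <ctime>       // time
#include <chrono>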

using namespace std;

struct CoRet
{
    struct promise_type
    {
        int _out;
        int _res;
        std::exception_ptr exception_;

        suspend_always initial_suspend() { return {}; }
        suspend_always final_suspend() noexcept { return {}; }
        void unhandled_exception()
        {
            exception_ = std::current_exception();
        }
        CoRet get_return_object()
        {
            return
            {
                coroutine_handle<promise_type>::from_promise(*this)
            };
        }
        suspend_never yield_value(int r)
        {
            _out = r;
            return {};
        }
        void return_value(int r)
        {
            _res = r;
            cout << "coroutine: set res " << r << endl;
        }
    };

    struct Note
    {
        int guess;
    };

    struct Input
    {
        Note& _in;

        bool await_ready() { return false; }
        void await_suspend(coroutine_handle<CoRet::promise_type> h)
        {
            std::thread([this, h]() {
                std::cin >> _in.guess;
                cout << "suspend finish: You input: " << _in.guess << endl;
                h.resume();
            }).detach();
        }
        int await_resume() { return _in.guess; }
    };

    coroutine_handle<promise_type> _h;

};


CoRet Guess()
{
    int res = (rand() % 30) + 1;
    CoRet::Note note = {};
    CoRet::Input input{ note };
    cout << "coroutine: Init Finish" << endl;
    while (true)
    {
        int g = co_await input;
        cout << "coroutine: You guess " << g << ", res: " << res << endl;
        int result = res < g ? 1 : (res == g ? 0 : -1);
        cout << "coroutine: result is " <<
            ((result == 1) ? "larger" :
                ((result == 0) ? "the same" : "smaller")) << endl;
        if (result == 0)
        {
            break;
        }
    }

    cout << "coroutine: the game is finish" << endl;
    co_return res;
}

int main()
{
    srand(static_cast<unsigned>(time(nullptr)));  // full-width seed; a uint8_t cast leaves only 256 possible seeds
    auto coroutine = Guess();
    cout << "main: make a guess ..." << endl;
    // Start coroutine...
    coroutine._h.resume();
    int count = 0;
    while (true)
    {
        if (coroutine._h.done())
        {
            cout << "main: the coroutine result is " << coroutine._h.promise()._res << endl;
            coroutine._h.destroy();
            break;
        }
        count += 1;  // one second elapses per loop iteration
        std::this_thread::sleep_for(std::chrono::seconds(1));
        cout << "main: sleep wait coroutine " << count << " s" << endl;
    }
    return 0;
}

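The execution flow helps show how this code fits together.

The coroutine function is simply suspended, resumed, suspended again, and resumed again, and the coroutine body itself uses no locks or condition variables along the way. (One caveat: main polls done() while a worker thread resumes the coroutine, which is strictly speaking an unsynchronized access; it is tolerated here only because this is a demo.)

So on top of call and return, a coroutine adds two more operations: suspend and resume.

From the compiler's (or the assembly) point of view, a coroutine is a function with one or more suspension points. When execution reaches a suspension point, the coroutine stops running and control jumps back to the caller. A suspended coroutine can later be resumed, or it can simply be destroyed.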


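// Sketch: coro_func is assumed to return a std::coroutine_handle<>
// for a coroutine that starts out suspended.
void func()
{
    auto coro_handle = coro_func();
    coro_handle.resume();              // run to the first suspension point
    if (!coro_handle.done())           // or any condition of your own
    {
        coro_handle.resume();          // continue from where it stopped
    }
}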

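When a coroutine runs, it gets a stack frame just like an ordinary function call, but it also uses an extra block of storage to hold its state, so that its "context" survives while it is suspended. This block is called the coroutine frame. The coroutine frame is created when the coroutine is first called. It is destroyed when the coroutine runs to completion (without suspending at its final suspension point) or when the caller destroys it through the handle.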

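Awaiters

Stripped down, a coroutine suspends and then waits to be resumed. C++20 coroutines add the concept of an awaiter on top of that.

An awaiter has to implement three member functions: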


1. bool await_ready();
2. void await_suspend(std::coroutine_handle<>);  // may also return bool or a coroutine handle
3. auto await_resume();

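- await_ready: is the result already available? If it returns false, await_suspend is called.
- await_suspend: what to do when suspending. Its argument is the handle of the coroutine that executed co_await.
- await_resume: called when the coroutine continues after the wait; its return value becomes the result of the co_await expression.

Many code examples call handle.resume() directly inside await_suspend, which means the coroutine is resumed immediately, at the very moment it is suspended. Examples like that have no asynchronous feel at all and show none of the effect or benefit of an asynchronous operation.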

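That made me curious what this is actually good for. My intuition is that await_suspend should record the coroutine handle and start an asynchronous operation (for example, a thread that performs file I/O), and then, once the operation completes, resume the coroutine and hand it the result.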

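I expect many kinds of awaiters to appear for co_await in the future, such as AIO, asynchronous networking, and asynchronous database access. This is probably where C++ coroutines will develop the most.

co_await can take several forms: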


co_ret = co_await  awaiter;

co_await calls the awaiter's member functions. co_ret is the return value of the awaiter's await_resume.


co_ret = co_await  fun();

fun() returns an awaiter object, and co_ret is again the return value of that awaiter's await_resume.

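Coroutine switching

Advantages

Coroutines have the following advantages:

- Efficient use of resources: coroutines are lighter than threads. Creating one and switching between them both cost less than the thread equivalents, because a switch stays in user space and a coroutine runs on the stack of whichever thread resumes it instead of owning a full thread stack.
- Simpler asynchronous programming: asynchronous code written with coroutines reads like synchronous code, which makes it considerably easier to read and maintain.
- Non-blocking operation: a coroutine can suspend instead of blocking, so the program can keep running other tasks while it waits for I/O or other slow work.
- A good fit for high concurrency: because the overhead is small, coroutines are naturally suited to I/O-bound workloads with large numbers of concurrent tasks.

Context switching

The rest of this section looks at the stackful style of coroutine, where every coroutine has its own stack and a switch is done by hand-written assembly; C++20 coroutines are stackless and leave this work to the compiler. The stack layout of such a coroutine is similar to an ordinary call stack, but with some features of its own: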


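High address
+------------------------+
| Arguments              |  <- where the coroutine's arguments are stored
+------------------------+
| Return address         |  <- entry address of the coroutine function
+------------------------+
| Local variables        |  <- data private to the coroutine
+------------------------+
| Saved registers        |  <- saved on a context switch
+------------------------+
| Frame pointer (RBP)    |  <- base of the current stack frame
+------------------------+  <- RSP points here
Low address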

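The x86_64 ABI requires the stack to be 16-byte aligned:

- RSP must be 16-byte aligned immediately before a function call
- The alignment ensures that aligned SIMD instructions work correctly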
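When a library prepares a private stack for a coroutine, it therefore has to pick an aligned starting address. A minimal sketch of that rounding step follows; the helper name aligned_stack_top is hypothetical and does not come from any particular library.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the highest 16-byte-aligned address that is
// not above the end of the buffer. Stacks grow downward on x86_64, so the
// initial stack pointer is taken from the top of the buffer.
char* aligned_stack_top(char* base, std::size_t size)
{
    auto top = reinterpret_cast<std::uintptr_t>(base) + size;
    top &= ~static_cast<std::uintptr_t>(15);   // clear the low 4 bits
    return reinterpret_cast<char*>(top);
}
```

Masking off the low four bits rounds the address down, so at most 15 bytes at the top of the buffer go unused.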

Now the switch code for x86_64:


# Save the current context into the first argument (%rdi)
leaq (%rsp),%rax        # take the current stack pointer
movq %rax, 104(%rdi)    # save RSP in regs[13]
movq %rbx, 96(%rdi)     # save RBX in regs[12]
movq %rcx, 88(%rdi)     # save RCX in regs[11]
movq %rdx, 80(%rdi)     # save RDX in regs[10]
movq 0(%rax), %rax      # load the return address from the top of the stack
movq %rax, 72(%rdi)     # save the return address in regs[9]
movq %rsi, 64(%rdi)     # save RSI in regs[8]
movq %rdi, 56(%rdi)     # save RDI in regs[7]
movq %rbp, 48(%rdi)     # save RBP in regs[6]
movq %r8, 40(%rdi)      # save R8 in regs[5]
movq %r9, 32(%rdi)      # save R9 in regs[4]
movq %r12, 24(%rdi)     # save R12 in regs[3]
movq %r13, 16(%rdi)     # save R13 in regs[2]
movq %r14, 8(%rdi)      # save R14 in regs[1]
movq %r15, (%rdi)       # save R15 in regs[0]

# Restore the target context from the second argument (%rsi)
movq 48(%rsi), %rbp     # restore RBP
movq 104(%rsi), %rsp    # restore RSP
movq (%rsi), %r15       # restore R15
# ... restore the other registers ...
leaq 8(%rsp), %rsp      # move the stack pointer up by 8 bytes
pushq 72(%rsi)          # push the return address
movq 64(%rsi), %rsi     # restore RSI last
ret                     # jump to the pushed address

So to implement the switch logic for coroutines, we first need a type that describes a coroutine's context:


struct coctx_t
{
#if defined(__i386__)
    void *regs[8];      // 32-bit: 8 register slots
#else
    void *regs[14];     // 64-bit: 14 register slots
#endif
    size_t ss_size;     // size of the coroutine's stack
    char *ss_sp;        // pointer to the coroutine's stack memory
};

Suppose there are two coroutines:


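// Pseudocode: the constructor call stands for whatever setup binds a
// context to its entry function and stack.
void fun1();
void fun2();
coctx_t* ctx1 = new coctx_t(fun1);  // coroutine 1
coctx_t* ctx2 = new coctx_t(fun2);  // coroutine 2
void fun1() 
{
    coctx_swap(ctx1, ctx2);         // save into ctx1, then switch to ctx2
}
void fun2() 
{

}
int main() 
{
    fun1();
}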

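Switching from ctx1 to the target coroutine ctx2 then takes two steps.

Step 1: save the current function's stack state and registers into ctx1.

Building on the function-call background from the previous section, this step stores each register, the stack pointer, and the return address into its slot in ctx1's regs array.

Step 2: take the saved state out of ctx2, restore the registers and the stack, and jump to the new coroutine's code.

The operations mirror step 1, so there is no separate breakdown for them: each register is simply loaded back from its slot in the regs array.

The one thing to pay attention to is how execution ends up in the new coroutine's code: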


leaq 8(%rsp), %rsp      # move the stack pointer up by 8 bytes
pushq 72(%rsi)          # push the second coroutine's return address
movq 64(%rsi), %rsi     # restore the rsi register
ret

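Recall that calling coctx_swap pushes the return address into the calling function (here fun1, which is in effect coroutine 1) onto the stack. At that moment the memory rsp points to holds that return address in fun1, and this is the value saved in ctx1.regs[9].

The code here first moves rsp up by 8 bytes and then pushes ctx2.regs[9], the new coroutine's return address. When ret executes, it pops that address into rip, and the program jumps to the new coroutine's code.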
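The same save-and-restore idea can be tried without writing assembly by using the POSIX ucontext functions (getcontext, makecontext, swapcontext). This is a substitute technique for illustration, not the coctx_swap code shown above, and the names co_body and run_swap_demo are hypothetical. The sketch assumes Linux with glibc.

```cpp
#include <ucontext.h>
#include <string>
#include <vector>

namespace {
ucontext_t main_ctx;                 // saved state of the caller
ucontext_t co_ctx;                   // saved state of the coroutine
std::vector<std::string> events;     // records the order of execution

void co_body()
{
    events.push_back("co: step 1");
    swapcontext(&co_ctx, &main_ctx); // save our registers, jump back to the caller
    events.push_back("co: step 2");
    // Returning from here continues at co_ctx.uc_link, which is main_ctx.
}
}  // namespace

std::vector<std::string> run_swap_demo()
{
    events.clear();
    static char stack[64 * 1024];    // the coroutine's private stack

    getcontext(&co_ctx);             // initialize, then customize below
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = sizeof(stack);
    co_ctx.uc_link = &main_ctx;      // where to go when co_body returns
    makecontext(&co_ctx, co_body, 0);

    events.push_back("main: switch in");
    swapcontext(&main_ctx, &co_ctx); // save the caller, run co_body until it swaps back
    events.push_back("main: switch in again");
    swapcontext(&main_ctx, &co_ctx); // continue co_body after its swapcontext
    events.push_back("main: coroutine finished");
    return events;
}
```

Each swapcontext call does what the two halves of the assembly do: it stores the current registers and stack pointer in its first argument and loads them from its second.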

In practice: epoll + coroutines

Integrating epoll with a custom scheduler:


#pragma once

#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <fcntl.h>
#include <cerrno>      // errno, EEXIST
#include <coroutine>
#include <exception>   // std::terminate
#include <vector>
#include <unordered_map>
#include <stdexcept>
#include <span>
#include <cstring>

// Forward declaration
class Scheduler;

// --------------- Scheduler ---------------
class Scheduler {
public:
    Scheduler() {
        epfd_ = epoll_create1(EPOLL_CLOEXEC);
        if (epfd_ == -1) throw std::runtime_error("epoll_create1");
    }
    Scheduler(const Scheduler&) = delete;
    Scheduler& operator=(const Scheduler&) = delete;
    ~Scheduler() { close(epfd_); }

    void run();
    void arm(int fd, bool read, std::coroutine_handle<> h);

private:
    int epfd_;
    std::unordered_map<int, std::coroutine_handle<>> waiters_;
};

inline void Scheduler::arm(int fd, bool read, std::coroutine_handle<> h) {
    epoll_event ev{};
    ev.events   = read ? EPOLLIN : EPOLLOUT;
    ev.data.fd  = fd;

    if (epoll_ctl(epfd_, EPOLL_CTL_ADD, fd, &ev) == -1) {
        if (errno == EEXIST) {        
            epoll_ctl(epfd_, EPOLL_CTL_MOD, fd, &ev);
        } else {
            throw std::runtime_error("epoll_ctl arm");
        }
    }
    waiters_[fd] = h;                 
}

inline void Scheduler::run() {
    std::vector<epoll_event> evs(1024);
    while (!waiters_.empty()) {
        int n = epoll_wait(epfd_, evs.data(), evs.size(), -1);
        for (int i = 0; i < n; ++i) {
            int fd = evs[i].data.fd;
            auto it = waiters_.find(fd);
            if (it != waiters_.end()) {
                auto h = it->second;  
                waiters_.erase(it);    
                h.resume();           
            }
        }
    }
}

// --------------- Awaiter base ---------------
namespace detail {
    template<bool IsRead>
    struct IoAwaiter {
        Scheduler* sched;
        int        fd;
        std::span<char> buf;
        ssize_t    result = 0;
    
        bool await_ready() const noexcept { return false; }
    
        void await_suspend(std::coroutine_handle<> h) {
            sched->arm(fd, IsRead, h);      
        }
    
        ssize_t await_resume() noexcept {   
            if constexpr (IsRead)
                result = ::read(fd, buf.data(), buf.size());
            else
                result = ::write(fd, buf.data(), buf.size());
            return result;
        }
    };
} // namespace detail

inline auto async_read(Scheduler& s, int fd, std::span<char> buf) {
    detail::IoAwaiter<true> aw{};
    aw.sched = &s;
    aw.fd    = fd;
    aw.buf   = buf;
    return aw;
}

inline auto async_write(Scheduler& s, int fd, std::span<const char> buf) {
    detail::IoAwaiter<false> aw{};
    aw.sched = &s;
    aw.fd    = fd;
    aw.buf   = std::span<char>(const_cast<char*>(buf.data()), buf.size());
    return aw;
}

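// Fire-and-forget task: nobody keeps a handle, so the frame must free itself.
struct Task {
    struct promise_type {
        Task get_return_object() { return {}; }
        std::suspend_never initial_suspend() noexcept { return {}; }
        // suspend_never destroys the coroutine frame when the body finishes;
        // suspend_always would leak one frame per finished session.
        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() { std::terminate(); }
    };
};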


struct AcceptAwaiter {
    Scheduler* sched;
    int        lfd;
    int        fd = -1;

    bool await_ready() const noexcept { return false; }
    void await_suspend(std::coroutine_handle<> h) {
        sched->arm(lfd, true, h);          
    }
    int await_resume() noexcept {         
        sockaddr_in addr{};
        socklen_t len = sizeof(addr);
        fd = ::accept(lfd, (sockaddr*)&addr, &len);
        if (fd != -1) {
            int fl = fcntl(fd, F_GETFL, 0);
            fcntl(fd, F_SETFL, fl | O_NONBLOCK);
        }
        return fd;
    }
};

Save the header above as async_epoll.hpp; main.cpp then uses it like this:

#include "async_epoll.hpp"
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <cerrno>        // errno
#include <cstring>
#include <iostream>
#include <string_view>
#include <unistd.h>

static void set_nonblock(int fd) {
    int fl = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

Task session(Scheduler& sched, int fd) {
    char buf[1024];
    std::cout << "[+] fd=" << fd << " connected\n" << std::flush;
    for (;;) {
        ssize_t n = co_await async_read(sched, fd, std::span<char>(buf, sizeof(buf)));
        if (n <= 0) break;                 // peer closed the connection, or an error occurred
        std::cout << "[<] fd=" << fd << " " << std::string_view(buf, n) << std::flush;
        co_await async_write(sched, fd, std::span<const char>(buf, n));
    }
    std::cout << "[-] fd=" << fd << " disconnected\n" << std::flush;
    close(fd);
    co_return;
}

Task listener(Scheduler& sched, int lfd) {
    for (;;) {
        int fd = co_await AcceptAwaiter{&sched, lfd};
        if (fd == -1) {            
            std::cerr << "accept: " << strerror(errno) << '\n';
            continue;
        }
        session(sched, fd);
    }
}

int main() {
    try {
        Scheduler sched;

        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        if (lfd == -1) throw std::runtime_error("socket");

        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        set_nonblock(lfd);               // important: the listening fd must be non-blocking too

        sockaddr_in addr{};
        addr.sin_family      = AF_INET;
        addr.sin_port        = htons(8080);
        addr.sin_addr.s_addr = INADDR_ANY;

        if (bind(lfd, (sockaddr*)&addr, sizeof(addr)) == -1)
            throw std::runtime_error("bind");
        if (listen(lfd, 128) == -1)
            throw std::runtime_error("listen");

        std::cout << "echo server listen on 0.0.0.0:8080 ...\n" << std::flush;

        listener(sched, lfd);  
        sched.run();           
    } catch (const std::exception& ex) {
        std::cerr << "exception: " << ex.what() << '\n';
    }
}

Build and run


g++ -std=c++20 -O2 -pthread main.cpp -o demo
./demo
nc 127.0.0.1 8080