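C++11 Concurrent Programming

I. Thread Basics: `std::thread`

C++11 represents a thread of execution with a `std::thread` object, so portable code no longer has to call platform-specific APIs such as pthread directly.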

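```cpp
#include <iostream>
#include <thread>

void task() { std::cout << "Hello from thread " << std::this_thread::get_id() << '\n'; }

int main() {
    // Create the thread; it starts running immediately
    std::thread t1(task);
    // The owner must choose: wait for it (join) or let it run on its own (detach)
    t1.join(); // wait for t1 to finish
    return 0;
}
```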

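Key points:

- Move semantics: a `thread` object can only be moved (`std::move`), never copied.
- Argument passing: every argument is copied (or moved) into storage owned by the new thread. To pass a reference, wrap it in `std::ref`.
- Lifetime management: before a joinable `thread` object is destroyed you must call either `join()` (wait for it to finish) or `detach()` (let it run independently); otherwise the destructor calls `std::terminate` and the program aborts.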
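The first two points are easy to get wrong, so here is a minimal sketch of both. The names `add_to` and `run_ref_demo` are made up for illustration. Without `std::ref`, the call would not compile, because the thread's private copy of `total` cannot bind to an `int&` parameter.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical worker that modifies a variable owned by the caller
void add_to(int& target, int amount) { target += amount; }

int run_ref_demo() {
    int total = 0;
    // std::ref makes the thread operate on `total` itself rather than a copy
    std::thread t1(add_to, std::ref(total), 5);
    // Ownership of the running thread moves from t1 to t2
    std::thread t2 = std::move(t1);
    assert(!t1.joinable()); // t1 no longer represents a thread
    t2.join();              // join through the current owner
    return total;           // 5: the write is visible after join()
}
```

Calling `run_ref_demo()` from `main` returns 5. Because `join()` synchronizes with the end of the thread, reading `total` afterwards needs no extra locking.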

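II. The Cornerstone of Synchronization: Mutexes (`mutex`) and Lock Guards

When several threads access the same data and at least one of them writes, the result is a data race. A mutex ensures that only one thread at a time executes the critical section.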

```cpp
#include <mutex>

std::mutex g_mutex;
int shared_data = 0;

void unsafe_increment() { ++shared_data; } // Wrong: data race when called from several threads
void safe_increment() {
    std::lock_guard<std::mutex> lock(g_mutex); // Right: locks on construction, unlocks on destruction
    ++shared_data;
}
```

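C++11 provides several mutex types. Their main differences are:

| Type | Characteristics | Typical use |
|---|---|---|
| `std::mutex` | Basic mutex, not reentrant. | General purpose. |
| `std::recursive_mutex` | The same thread may lock it multiple times. | Recursive functions or reentrant classes. |
| `std::timed_mutex` | Adds `try_lock_for` / `try_lock_until`. | Lock attempts that must give up after a timeout. |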
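As a rough illustration of the last row, the sketch below tries to take a `std::timed_mutex` that another thread is holding and gives up after a timeout. The names `resource_mutex`, `try_work` and `contended_attempt_succeeds` are invented for this example. It uses the `std::unique_lock` constructor that takes a duration, which calls `try_lock_for` internally.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

std::timed_mutex resource_mutex; // hypothetical shared resource lock

// Tries to acquire the lock for at most `timeout`; returns whether it succeeded.
bool try_work(std::chrono::milliseconds timeout) {
    std::unique_lock<std::timed_mutex> lock(resource_mutex, timeout); // calls try_lock_for
    if (!lock.owns_lock()) {
        return false; // timed out: do something else instead of blocking forever
    }
    // ... critical section would go here ...
    return true; // lock is released when `lock` goes out of scope
}

// Holds the mutex on the calling thread while a second thread attempts a timed lock.
bool contended_attempt_succeeds() {
    std::lock_guard<std::timed_mutex> hold(resource_mutex);
    bool acquired = true;
    std::thread other([&acquired] {
        acquired = try_work(std::chrono::milliseconds(20));
    });
    other.join();
    return acquired; // false: the mutex was held for the whole timeout
}
```

Note that `try_lock_for` is allowed to fail spuriously even on a free mutex, so code should treat a `false` result as "not acquired" rather than as proof that someone else holds the lock.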

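RAII lock managers (the key tools):

`std::lock_guard`: the simplest RAII wrapper. It locks in its constructor and unlocks in its destructor, and supports neither manual unlocking nor transfer of ownership.

```cpp
{
    std::lock_guard<std::mutex> lock(mtx);
    // critical section
} // leaving the scope unlocks automatically
```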

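`std::unique_lock`: a more capable RAII wrapper. It supports deferred locking, manual unlocking, and transfer of ownership, and it is the lock type that `std::condition_variable` requires.

```cpp
std::unique_lock<std::mutex> lock(mtx, std::defer_lock); // associate only, do not lock yet
lock.lock();   // lock now (std::lock is only for locking two or more lockables at once)
lock.unlock(); // manual unlock is allowed
// ... run code that does not need the lock
lock.lock();   // lock again; the destructor unlocks if the lock is still held
```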

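Techniques for avoiding deadlock:

Lock in a fixed order: every thread acquires the locks in the same order (for example, always A before B).

Use `std::lock`: it locks several mutexes in one call using a deadlock-avoidance algorithm, so the order in which they are named no longer matters.

```cpp
std::unique_lock<std::mutex> lock1(mtx1, std::defer_lock);
std::unique_lock<std::mutex> lock2(mtx2, std::defer_lock);
std::lock(lock1, lock2); // acquires both without deadlock (all-or-nothing, though not one atomic step)
```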

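III. Advanced Synchronization: Condition Variables (`condition_variable`)

A mutex provides mutual exclusion; a condition variable lets threads cooperate through a wait/notify mechanism. One thread blocks until another thread notifies it that some condition may now be true.

Typical pattern: producer-consumer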

    Producer thread          Shared queue (guarded by mutex)        Consumer thread
        |                            |                               |
        v                            |                               |
    produce data ---------------> push to queue ----------------> pop data
        |      (notify_one/all)      |          (wait)               |
        |-------------------------> wake up ------------------------->|

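```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>

std::queue<int> data_queue;
std::mutex mtx;
std::condition_variable cv;
bool finished = false; // protected by mtx, like the queue

// Producer
void producer() {
    for (int i = 0; i < 10; ++i) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            data_queue.push(i);
        }
        cv.notify_one(); // wake one waiting consumer
    }
    {
        // Consumers read `finished` under mtx, so it must be written under mtx as well;
        // otherwise there is a data race and a wakeup can be missed
        std::lock_guard<std::mutex> lock(mtx);
        finished = true;
    }
    cv.notify_all(); // tell every consumer to finish
}

// Consumer
void consumer(int id) {
    while (true) {
        std::unique_lock<std::mutex> lock(mtx);
        // wait releases the lock while blocked and reacquires it before returning
        cv.wait(lock, [] { return !data_queue.empty() || finished; });
        if (finished && data_queue.empty()) break;
        int val = data_queue.front();
        data_queue.pop();
        lock.unlock(); // release the lock as early as possible
        std::cout << id << " consumed " << val << '\n';
    }
}
```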

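Key point: the second argument to `wait` (the predicate) guards against spurious wakeups and against notifications that were sent before the wait began. The call is equivalent to:

```cpp
while (!pred()) {   // check the condition
    cv.wait(lock);  // not satisfied: release the lock and block
}
// condition satisfied, continue with the lock held
```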

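IV. One-Time Initialization: `std::call_once`

`std::call_once` ensures that an operation (such as initializing a global resource) runs exactly once, even when several threads reach it concurrently.

```cpp
#include <mutex>

std::once_flag init_flag;
void init_resource() { /* runs exactly once */ }

void worker() {
    std::call_once(init_flag, init_resource);
    // from here on the resource is safe to use
}
```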
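To see the guarantee in action, the following sketch lets eight threads race to the same `std::call_once` and counts how often the initializer actually ran. The names `demo_flag`, `init_calls`, `init_once` and `run_call_once_demo` are invented for this illustration.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

std::once_flag demo_flag;        // hypothetical flag for this demo
std::atomic<int> init_calls{0};  // counts how many times the initializer ran

void init_once() { ++init_calls; }

int run_call_once_demo() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 8; ++i) {
        // Every thread asks for initialization; only one call goes through
        threads.emplace_back([] { std::call_once(demo_flag, init_once); });
    }
    for (auto& t : threads) t.join();
    return init_calls.load(); // 1
}
```

Threads that lose the race block until the winning call has finished, so none of them can observe a half-initialized resource.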

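V. The Foundation of Lock-Free Programming: `std::atomic`

For a simple shared variable such as a counter, a mutex is often heavier than necessary. The `std::atomic` template provides thread-safe atomic operations without an explicit lock; for small types such as `int` they are normally lock-free, which `is_lock_free()` reports.

```cpp
#include <atomic>

std::atomic<int> counter{0};

void increment() {
    for (int i = 0; i < 10000; ++i) {
        ++counter; // atomic read-modify-write, thread safe
        // Alternative with weaker ordering: counter.fetch_add(1, std::memory_order_relaxed);
    }
}
// After two threads have each run increment() and been joined, counter is guaranteed to be 20000
```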
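The claim about the final value can be checked with a small driver. This is only a sketch; `hits`, `add_hits` and `run_counter_demo` are illustrative names, and it uses the relaxed `fetch_add` variant mentioned in the comment above.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> hits{0}; // hypothetical shared counter

void add_hits(int n) {
    for (int i = 0; i < n; ++i) {
        // Relaxed ordering is enough: only the count matters, not ordering with other data
        hits.fetch_add(1, std::memory_order_relaxed);
    }
}

int run_counter_demo() {
    hits.store(0);
    std::thread a(add_hits, 10000);
    std::thread b(add_hits, 10000);
    a.join();
    b.join(); // join() makes every increment visible to this thread
    return hits.load(); // 20000
}
```

With a plain `int` instead of `std::atomic<int>`, the same program would contain a data race and could return less than 20000.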

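Core principle: CAS (Compare-And-Swap)

The `compare_exchange_strong` / `compare_exchange_weak` operations on atomic types expose CAS, a key primitive of lock-free programming; many other read-modify-write operations can be built as CAS loops.

```cpp
// Conceptual semantics only. The real operation performs all of this as ONE atomic step,
// and the weak form may also fail spuriously (return false even if value == expected).
bool compare_exchange_weak(T& expected, T desired) {
    if (this->value == expected) {
        this->value = desired;
        return true;
    } else {
        expected = this->value;
        return false;
    }
}

// Typical usage pattern (pop of a lock-free stack; safe memory reclamation is omitted):
struct Node { int value; Node* next; };
std::atomic<Node*> head{nullptr};

Node* pop() {
    Node* old_head = head.load();
    // On failure compare_exchange_weak reloads old_head, so each retry sees the current head
    while (old_head && !head.compare_exchange_weak(old_head, old_head->next)) {
    }
    return old_head; // nullptr if the stack was empty
}
```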
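A CAS loop is also how you build atomic operations that `std::atomic` does not offer directly. As a minimal sketch, here is an "atomic maximum"; `atomic_max` and `run_max_demo` are hypothetical helpers, not standard library functions.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Atomically raises `target` to at least `candidate` (illustrative helper).
void atomic_max(std::atomic<int>& target, int candidate) {
    int current = target.load();
    // Retry while our candidate is still larger than the stored value.
    // A failed CAS refreshes `current`, so the comparison uses up-to-date data.
    while (current < candidate &&
           !target.compare_exchange_weak(current, candidate)) {
    }
}

int run_max_demo() {
    std::atomic<int> highest{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&highest, t] {
            for (int i = 0; i < 100; ++i) {
                atomic_max(highest, t * 100 + i); // values 0..399 across all threads
            }
        });
    }
    for (auto& th : threads) th.join();
    return highest.load(); // 399, the largest value any thread proposed
}
```

`compare_exchange_weak` is the usual choice inside a loop like this, since a spurious failure just costs one more iteration.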

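VI. The Memory Model: Controlling Instruction Reordering

Modern compilers and CPUs reorder instructions to improve performance. In multithreaded code, an unexpected reordering can break the program's logic. C++11 defines six memory orders (`std::memory_order`) that let the programmer choose how strict the synchronization is.

The three most commonly used:

`memory_order_seq_cst`: sequential consistency. The default and the strictest: all threads observe a single total order of sequentially consistent operations. It is usually the most expensive option.

`memory_order_acquire` / `memory_order_release`: acquire-release semantics, a cheaper synchronization pattern.

- Release operation: no read or write that precedes it may be reordered after it.
- Acquire operation: no read or write that follows it may be reordered before it.
- A release store synchronizes with an acquire load of the same atomic variable in another thread that reads the stored value.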

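```cpp
// Shared state: int data = 0; std::atomic<bool> ready{false};

// Thread A (producer)
data = 42;                                        // 1. prepare the data
ready.store(true, std::memory_order_release);     // 2. publish the flag (1 cannot move after 2)

// Thread B (consumer)
while (!ready.load(std::memory_order_acquire)) {} // 3. wait for the flag (4 cannot move before 3)
std::cout << data;                                // 4. guaranteed to see data == 42
```

`memory_order_relaxed`: relaxed ordering. It guarantees only atomicity and a consistent modification order for each individual atomic variable; it establishes no synchronization between threads. Suitable for things like statistics counters.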
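The producer/consumer fragment above can be turned into a complete program along the following lines. This is a sketch under the same assumptions; `payload`, `payload_ready` and `run_handoff` are illustrative names.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                        // ordinary, non-atomic data
std::atomic<bool> payload_ready{false}; // flag that publishes the data

int run_handoff() {
    payload = 0;
    payload_ready.store(false);
    int seen = -1;

    std::thread consumer([&seen] {
        // Acquire load: once it reads true, the write to payload is visible
        while (!payload_ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });
    std::thread producer([] {
        payload = 42;                                         // write the data first
        payload_ready.store(true, std::memory_order_release); // then publish it
    });

    producer.join();
    consumer.join();
    return seen; // 42
}
```

If both operations on `payload_ready` were relaxed instead, the consumer could legally see the flag set and still read a stale `payload`, which is exactly the data race that the release/acquire pair rules out.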

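VII. Asynchronous Operations: `std::future`, `std::promise`, `std::async`

These tools deal with results that become available at some point in the future, decoupling the launch of a task from the retrieval of its result.

1. `std::async`: the simplest asynchronous task

```cpp
#include <future>
#include <iostream>

int compute() { /* expensive computation */ return 42; }

int main() {
    // std::launch::async runs compute() asynchronously, as if on a new thread
    std::future<int> fut = std::async(std::launch::async, compute);
    // ... do other work
    int result = fut.get(); // blocks until the result is ready, then returns it
    std::cout << "Result: " << result << '\n'; // 42
}
```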

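2. `std::promise` and `std::future`: setting the result explicitly

A promise/future pair forms a one-shot communication channel: the `promise` sets the value (or an exception) and the `future` retrieves it.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <utility>

void worker(std::promise<int> prom) {
    std::this_thread::sleep_for(std::chrono::seconds(1)); // the `1s` literal would require C++14
    prom.set_value(99); // set the result
}

int main() {
    std::promise<int> prom;
    std::future<int> fut = prom.get_future(); // obtain the associated future
    std::thread t(worker, std::move(prom));
    int value = fut.get(); // wait for and retrieve the result (99)
    t.join();
}
```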
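The "or an exception" part deserves a quick illustration. In this sketch, `risky_worker` and `run_promise_demo` are made-up names; the worker forwards any exception through the promise, and `get()` rethrows it on the waiting thread.

```cpp
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>
#include <utility>

// Hypothetical worker: delivers either a value or an exception through the promise.
void risky_worker(std::promise<int> prom, bool fail) {
    try {
        if (fail) throw std::runtime_error("worker failed");
        prom.set_value(99);
    } catch (...) {
        prom.set_exception(std::current_exception()); // forward the error to the future
    }
}

std::string run_promise_demo(bool fail) {
    std::promise<int> prom;
    std::future<int> fut = prom.get_future();
    std::thread t(risky_worker, std::move(prom), fail);

    std::string outcome;
    try {
        outcome = std::to_string(fut.get()); // get() rethrows a stored exception
    } catch (const std::runtime_error& e) {
        outcome = e.what();
    }
    t.join();
    return outcome; // "99" on success, "worker failed" on failure
}
```

This keeps error handling on the thread that consumes the result, instead of letting an uncaught exception in the worker terminate the whole program.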

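3. `std::packaged_task`: wrapping a callable as an asynchronous task

It binds a function to a future, which makes it convenient to put tasks in a queue or hand them to a thread.

```cpp
// Requires <future>, <thread>, <utility>; the statements belong inside a function
std::packaged_task<int()> task([]() { return 7 * 6; });
std::future<int> fut = task.get_future();
std::thread t(std::move(task)); // run the task on another thread
int result = fut.get(); // 42
t.join();
```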

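4. `std::shared_future`: a future that can be shared

An ordinary `future` can only be read with `get()` once. A `shared_future` is copyable, so several threads can wait for and read the same result.

```cpp
std::promise<int> prom;
std::shared_future<int> sf = prom.get_future().share(); // convert to a shared_future
// any number of threads can now call sf.get(), ideally each on its own copy
```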
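A short sketch of several threads reading one result may help here; `run_shared_future_demo` is an illustrative name. Each thread captures its own copy of the `shared_future`, which is the recommended way to share it.

```cpp
#include <atomic>
#include <future>
#include <thread>
#include <vector>

int run_shared_future_demo() {
    std::promise<int> prom;
    std::shared_future<int> sf = prom.get_future().share();
    std::atomic<int> sum{0};

    std::vector<std::thread> waiters;
    for (int i = 0; i < 3; ++i) {
        // Capture sf by value: every thread waits on its own copy
        waiters.emplace_back([sf, &sum] { sum += sf.get(); });
    }

    prom.set_value(14); // releases all three waiters at once
    for (auto& t : waiters) t.join();
    return sum.load(); // 3 * 14 = 42
}
```

Unlike `future::get()`, `shared_future::get()` returns a reference to the stored value and can be called repeatedly.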

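Summary and Selection Guide

| Tool | Core purpose | One-line analogy |
|---|---|---|
| `std::thread` | Create and manage threads. | Hiring a worker. |
| `std::mutex` + `lock_guard` | Protect shared data from simultaneous access. | The lock on a restroom door. |
| `std::condition_variable` | Let threads wait until a condition holds. | A restaurant pager that buzzes when your table is ready. |
| `std::atomic` | Lock-free operations on simple shared variables. | A single indivisible action. |
| `std::async` | The simplest way to launch an asynchronous task. | Ordering a one-off takeout delivery. |
| `std::promise` / `std::future` | Pass exactly one result between threads. | A single-use parcel. |
| `std::packaged_task` | Package a task so it can be stored and passed around. | A filled-in shipping label waiting to be sent. |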

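VIII. Reader-Writer Locks: Better Performance for Read-Heavy Workloads

Background

When most operations on shared data are reads and only a few are writes, an ordinary mutex limits performance: threads that only want to read still have to queue up one at a time.

The C++14 solution: `std::shared_timed_mutex`

C++14 introduced a reader-writer lock in the form of `std::shared_timed_mutex`; C++17 added `std::shared_mutex`, which drops the timed operations.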

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

class ThreadSafeDictionary {
private:
    std::map<std::string, std::string> data_;
    mutable std::shared_timed_mutex rw_mutex_; // reader-writer lock

public:
    // Read: many threads may read at the same time
    std::string find(const std::string& key) const {
        std::shared_lock<std::shared_timed_mutex> lock(rw_mutex_); // shared lock
        auto it = data_.find(key);
        return (it != data_.end()) ? it->second : "";
    }

    // Write: only one thread may write at a time
    void insert(const std::string& key, const std::string& value) {
        std::unique_lock<std::shared_timed_mutex> lock(rw_mutex_); // exclusive lock
        data_[key] = value;
    }
};
```

How a reader-writer lock works

    Reader 1 --- shared_lock ---+
                                |--- shared region (many concurrent readers) ---|
    Reader 2 --- shared_lock ---+
    Writer ----- unique_lock ------------------ exclusive region (one writer at a time)

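C++17's `std::shared_mutex` (often more efficient)

```cpp
#include <mutex>
#include <shared_mutex>

// ConfigData is assumed to be a copyable type defined elsewhere.
class Configuration {
    mutable std::shared_mutex config_mutex_; // mutable so the const getter can lock it
    ConfigData config_;

public:
    // Many threads may read the configuration at the same time
    ConfigData get_config() const {
        std::shared_lock lock(config_mutex_); // C++17 CTAD deduces the mutex type
        return config_;
    }

    // Updating the configuration needs exclusive access
    void update_config(const ConfigData& new_config) {
        std::unique_lock lock(config_mutex_);
        config_ = new_config;
    }
};
```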

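IX. The Thread Pool Pattern

The C++11 standard library does not provide a thread pool, but a simple one can be built from the tools covered so far.

A basic thread pool implementation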

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <iostream>
#include <memory>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <thread>
#include <type_traits>
#include <vector>

class ThreadPool {
public:
    // Note: hardware_concurrency() may return 0 if the value cannot be determined
    explicit ThreadPool(size_t num_threads = std::thread::hardware_concurrency()) {
        for (size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] {
                while (true) {
                    std::function<void()> task;

                    {
                        std::unique_lock<std::mutex> lock(queue_mutex_);
                        condition_.wait(lock, [this] {
                            return stop_ || !tasks_.empty();
                        });

                        // Drain remaining tasks before exiting
                        if (stop_ && tasks_.empty()) return;

                        task = std::move(tasks_.front());
                        tasks_.pop();
                    }

                    task(); // run the task outside the lock
                }
            });
        }
    }

    // std::invoke_result is C++17; with C++11/14 use std::result_of<F(Args...)>::type instead
    template<class F, class... Args>
    auto enqueue(F&& f, Args&&... args)
        -> std::future<typename std::invoke_result<F, Args...>::type> {

        using return_type = typename std::invoke_result<F, Args...>::type;

        // shared_ptr because packaged_task is move-only but std::function needs a copyable callable
        auto task = std::make_shared<std::packaged_task<return_type()>>(
            std::bind(std::forward<F>(f), std::forward<Args>(args)...)
        );

        std::future<return_type> result = task->get_future();

        {
            std::unique_lock<std::mutex> lock(queue_mutex_);
            if (stop_) throw std::runtime_error("enqueue on stopped ThreadPool");
            tasks_.emplace([task]() { (*task)(); });
        }

        condition_.notify_one();
        return result;
    }

    ~ThreadPool() {
        {
            std::unique_lock<std::mutex> lock(queue_mutex_);
            stop_ = true;
        }
        condition_.notify_all();
        for (std::thread& worker : workers_) {
            worker.join();
        }
    }

private:
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex queue_mutex_;
    std::condition_variable condition_;
    bool stop_ = false; // protected by queue_mutex_
};

// Usage example
int main() {
    ThreadPool pool(4); // 4 worker threads

    std::vector<std::future<int>> results;

    // Submit tasks
    for (int i = 0; i < 8; ++i) {
        results.emplace_back(pool.enqueue([i] {
            std::this_thread::sleep_for(std::chrono::seconds(1));
            return i * i;
        }));
    }

    // Collect results in submission order
    for (auto& result : results) {
        std::cout << result.get() << ' ';
    }
    // Output: 0 1 4 9 16 25 36 49
    return 0;
}
```

X. Lock-Free Data Structure Examples

Simple lock-free data structures can be built on top of atomic operations.

Lock-free stack

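```cpp
#include <atomic>
#include <memory>

template<typename T>
class LockFreeStack {
private:
    struct Node {
        std::shared_ptr<T> data;
        Node* next = nullptr;
        explicit Node(const T& value) : data(std::make_shared<T>(value)) {}
    };

    // Brace-initialize: copy-initializing a std::atomic is ill-formed before C++17
    std::atomic<Node*> head_{nullptr};

public:
    void push(const T& value) {
        Node* new_node = new Node(value);
        new_node->next = head_.load(std::memory_order_relaxed);
        while (!head_.compare_exchange_weak(
            new_node->next,
            new_node,
            std::memory_order_release,
            std::memory_order_relaxed)) {
            // CAS failed: new_node->next now holds the current head, retry
        }
    }

    // NOTE: popped nodes are never deleted. The leak is deliberate in this sketch: freeing a
    // node that another thread may still be reading needs a reclamation scheme (hazard
    // pointers, epochs, ...), and reusing freed nodes would expose the ABA problem.
    std::shared_ptr<T> pop() {
        // Acquire so that the node's contents are visible before old_head->next is read
        Node* old_head = head_.load(std::memory_order_acquire);
        while (old_head &&
               !head_.compare_exchange_weak(
                   old_head,
                   old_head->next,
                   std::memory_order_acquire,
                   std::memory_order_acquire)) {
            // CAS failed: old_head was reloaded, retry
        }
        return old_head ? old_head->data : std::shared_ptr<T>();
    }

    bool empty() const {
        return head_.load(std::memory_order_relaxed) == nullptr;
    }
};
```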

Lock-free queue (Michael-Scott queue)

```cpp
#include <atomic>
#include <memory>

template<typename T>
class LockFreeQueue {
private:
    struct Node {
        std::shared_ptr<T> data;
        std::atomic<Node*> next;
        Node() : next(nullptr) {}
    };

    std::atomic<Node*> head_; // always points at a dummy node
    std::atomic<Node*> tail_;

public:
    LockFreeQueue() {
        Node* dummy = new Node();
        head_ = tail_ = dummy;
    }

    // Must only run when no other thread is using the queue
    ~LockFreeQueue() {
        while (Node* node = head_.load()) {
            head_ = node->next.load();
            delete node;
        }
    }

    void push(const T& value) {
        Node* new_node = new Node();
        new_node->data = std::make_shared<T>(value);

        while (true) {
            Node* old_tail = tail_.load();
            Node* next = old_tail->next.load();

            // Check that tail was not updated by another thread in the meantime
            if (old_tail == tail_.load()) {
                if (next == nullptr) {
                    // Try to link the new node at the end of the list
                    if (old_tail->next.compare_exchange_weak(next, new_node)) {
                        // Linked; try to swing tail (if this fails, another thread already helped)
                        tail_.compare_exchange_strong(old_tail, new_node);
                        return;
                    }
                } else {
                    // tail is lagging: help the other thread finish its push
                    tail_.compare_exchange_strong(old_tail, next);
                }
            }
        }
    }

    std::shared_ptr<T> pop() {
        while (true) {
            Node* old_head = head_.load();
            Node* old_tail = tail_.load();
            Node* next = old_head->next.load();

            if (old_head == head_.load()) {
                if (old_head == old_tail) {
                    if (next == nullptr) {
                        return std::shared_ptr<T>(); // queue is empty
                    }
                    // tail is lagging: help the other thread finish its push
                    tail_.compare_exchange_strong(old_tail, next);
                } else {
                    std::shared_ptr<T> result = next->data;
                    if (head_.compare_exchange_weak(old_head, next)) {
                        // The old dummy is NOT deleted here: another thread may still be
                        // reading it, so `delete old_head` would be a use-after-free.
                        // This sketch leaks it; real code needs hazard pointers or epochs.
                        return result;
                    }
                }
            }
        }
    }
};
```

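XI. Thread-Local Storage

The `thread_local` keyword

Each thread gets its own independent copy of the variable, which suits global data that should not be shared between threads.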

```cpp
#include <chrono>
#include <iostream>
#include <thread>

thread_local int thread_specific_value = 0; // every thread has its own copy

void thread_func(int id) {
    thread_specific_value = id; // modifies this thread's copy only
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    std::cout << "Thread " << id << ": value = " << thread_specific_value << std::endl;
}

int main() {
    std::thread t1(thread_func, 1);
    std::thread t2(thread_func, 2);

    thread_specific_value = 100; // modifies the main thread's copy

    t1.join();
    t2.join();

    std::cout << "Main thread: value = " << thread_specific_value << std::endl;
    // Possible output (the two worker lines may appear in either order or interleave):
    // Thread 1: value = 1
    // Thread 2: value = 2
    // Main thread: value = 100
}
```
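Because console output from several threads can interleave, it can be easier to verify the per-thread copies by returning values instead of printing them. The following is a small sketch with invented names (`calls_on_this_thread`, `count_calls`, `run_tls_demo`).

```cpp
#include <thread>
#include <vector>

thread_local int calls_on_this_thread = 0; // one independent counter per thread

// Increments the calling thread's counter n times and returns its value.
int count_calls(int n) {
    for (int i = 0; i < n; ++i) ++calls_on_this_thread;
    return calls_on_this_thread;
}

std::vector<int> run_tls_demo() {
    std::vector<int> seen(3, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < 3; ++i) {
        // Each thread starts from its own zero-initialized copy and
        // writes only to its own slot of `seen`, so no locking is needed
        threads.emplace_back([&seen, i] { seen[i] = count_calls(i + 1); });
    }
    for (auto& t : threads) t.join();
    return seen; // {1, 2, 3}
}
```

If the counter were an ordinary global, the three threads would race on it and the results would depend on scheduling.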

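XII. Performance Optimization and Pitfalls

1. False Sharing

When several threads frequently modify different data that happens to sit in the same cache line, the cores keep invalidating each other's copy of that line, and the resulting cache-coherence traffic slows every thread down.

Problem example: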

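```cpp
#include <thread>

struct Data {
    int x; // modified frequently by thread 1
    int y; // modified frequently by thread 2
    // x and y very likely share one cache line (typically 64 bytes)
};

int main() {
    Data data{};
    std::thread t1([&] { for (int i = 0; i < 1000000000; ++i) ++data.x; });
    std::thread t2([&] { for (int i = 0; i < 1000000000; ++i) ++data.y; });
    t1.join();
    t2.join();
}
```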

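Solution: padding or alignment

```cpp
// Option 1: give each value its own cache-line-sized struct
struct alignas(64) PaddedValue { // alignas is available since C++11
    int x;
    char padding[60]; // pad to 64 bytes (assuming a 4-byte int)
};

// Option 2: align each member to its own cache line
struct AlignedData {
    alignas(64) int x;
    alignas(64) int y;
};
```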
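A layout like this can be checked at compile time. The sketch below assumes a 64-byte cache line, which is common but not universal (C++17 offers `std::hardware_destructive_interference_size` as a portable hint). `PackedCounters`, `SeparatedCounters` and `member_distance` are names made up for this example.

```cpp
#include <cstddef>

struct PackedCounters {    // x and y are adjacent: likely in the same cache line
    int x;
    int y;
};

struct SeparatedCounters { // each member starts on its own 64-byte boundary
    alignas(64) int x;
    alignas(64) int y;
};

// The compiler confirms the intended layout (64 is an assumed cache-line size)
static_assert(alignof(SeparatedCounters) == 64, "struct must be cache-line aligned");
static_assert(sizeof(SeparatedCounters) == 128, "two members, one cache line each");
static_assert(sizeof(PackedCounters) < 64, "packed members fit in a single line");

// Distance in bytes between the two members of the separated layout.
std::size_t member_distance() {
    return offsetof(SeparatedCounters, y) - offsetof(SeparatedCounters, x); // 64
}
```

The cost is memory: the separated struct occupies 128 bytes instead of 8, so this is worth doing only for data that is genuinely contended.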

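2. Lock Granularity

```cpp
// mutex_, calculate_step1, calculate_step2 and store_result are assumed to be defined elsewhere.

// Poor: the lock is held far too long
void process_data_coarse(std::vector<int>& data) {
    std::lock_guard<std::mutex> lock(mutex_);
    // long computation 1
    // long computation 2
    // I/O
}

// Better: hold the lock only where it is needed
void process_data_fine(std::vector<int>& data) {
    // Phase 1: protect only the access to shared data
    int intermediate_result;
    {
        std::lock_guard<std::mutex> lock(mutex_);
        intermediate_result = calculate_step1(data);
    }

    // Phase 2: compute without holding the lock
    int final_result = calculate_step2(intermediate_result);

    // Phase 3: lock again to publish the result
    {
        std::lock_guard<std::mutex> lock(mutex_);
        store_result(data, final_result);
    }
}
```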

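3. Avoiding Priority Inversion

```cpp
// Requires <mutex>. Account is assumed to look like:
//   struct Account { std::mutex mutex; int balance; };
// std::lock prevents deadlock whatever order the two accounts are passed in. It does not
// prevent priority inversion by itself; keeping the critical section this short limits how
// long a high-priority thread can be held up by a low-priority lock holder.
void transfer(Account& from, Account& to, int amount) {
    if (&from == &to) return; // locking the same mutex twice would be undefined behavior
    std::lock(from.mutex, to.mutex); // lock both, in no fixed order
    std::lock_guard<std::mutex> lock1(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> lock2(to.mutex, std::adopt_lock);

    from.balance -= amount;
    to.balance += amount;
}
```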
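To show why `std::lock` matters here, the sketch below runs transfers in both directions at once, which is the classic recipe for deadlock if each thread simply locked "its" first account. `BankAccount`, `safe_transfer` and `run_transfer_demo` are illustrative names, and this version uses deferred `std::unique_lock` objects instead of `std::adopt_lock`.

```cpp
#include <mutex>
#include <thread>

struct BankAccount { // hypothetical account type for this demo
    std::mutex mutex;
    int balance;
    explicit BankAccount(int initial) : balance(initial) {}
};

void safe_transfer(BankAccount& from, BankAccount& to, int amount) {
    if (&from == &to) return; // never lock the same mutex twice
    std::unique_lock<std::mutex> lock_from(from.mutex, std::defer_lock);
    std::unique_lock<std::mutex> lock_to(to.mutex, std::defer_lock);
    std::lock(lock_from, lock_to); // deadlock-free even with opposite argument orders
    from.balance -= amount;
    to.balance += amount;
}

int run_transfer_demo() {
    BankAccount a(1000);
    BankAccount b(1000);
    // The two threads name the accounts in opposite order
    std::thread t1([&] { for (int i = 0; i < 10000; ++i) safe_transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < 10000; ++i) safe_transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance + b.balance; // 2000: money is neither created nor lost
}
```

Since both threads move the same total amount in opposite directions, each account also ends up back at 1000.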

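XIII. Decision Tree for Choosing C++11 Concurrency Tools

Start a concurrent task: what kind of task is it?

- Simple asynchronous computation: use `std::async`. Do you need a return value?
  - Yes: pair it with `std::future`.
  - No: use `std::launch::async`.
- Precise control needed: use `std::thread`. Is data shared?
  - No: use `thread_local`.
  - Yes: what is the access pattern?
    - Read-only: no synchronization needed.
    - Read-write: what is the read/write ratio?
      - Reads far outnumber writes: use `shared_mutex` (read: `shared_lock`, write: `unique_lock`).
      - Writes are frequent or balanced with reads: how large is the critical section?
        - Very small: use `atomic`. Do you need ordering guarantees?
          - Yes: `memory_order_seq_cst`.
          - No, and performance is critical: `memory_order_relaxed`.
          - Typical synchronization: acquire/release.
        - Larger: use `mutex`. Do you need timeouts?
          - Yes: use `timed_mutex`.
          - No: use `mutex`. Do you need recursion?
            - Yes: use `recursive_mutex`.
            - No: a basic `mutex` is enough.
          - In every case, pair it with `lock_guard` / `unique_lock`.

Every branch ends at: task complete.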

XIV. Complete Example: A Simple Web Server Model

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Simulated HTTP request
struct HttpRequest {
    int id;
    std::string method;
    std::string path;
};

// Simulated HTTP response
struct HttpResponse {
    int request_id;
    int status_code;
    std::string body;
};

// A simple thread-pool server
class HttpServer {
private:
    std::vector<std::thread> workers_;
    std::queue<HttpRequest> request_queue_;
    mutable std::mutex queue_mutex_; // mutable so the const queue_size() can lock it
    std::condition_variable queue_cv_;
    std::atomic<bool> running_{true};
    std::atomic<int> request_count_{0};

    // Handle a single request
    HttpResponse handle_request(const HttpRequest& req) {
        std::this_thread::sleep_for(std::chrono::milliseconds(50)); // simulated processing time

        HttpResponse resp;
        resp.request_id = req.id;
        resp.status_code = 200;
        resp.body = "Hello from request #" + std::to_string(req.id);

        return resp;
    }

    // Worker thread function
    void worker_thread(int id) {
        std::cout << "Worker " << id << " started\n";

        while (running_) {
            HttpRequest req;

            {
                std::unique_lock<std::mutex> lock(queue_mutex_);
                queue_cv_.wait(lock, [this] {
                    return !running_ || !request_queue_.empty();
                });

                if (!running_ && request_queue_.empty()) {
                    break;
                }

                req = request_queue_.front();
                request_queue_.pop();
            }

            // Process the request outside the lock
            HttpResponse resp = handle_request(req);

            std::cout << "Worker " << id << " processed request "
                      << req.id << ": " << req.path
                      << " -> " << resp.status_code << "\n";
        }

        std::cout << "Worker " << id << " stopped\n";
    }

public:
    explicit HttpServer(int num_workers = 4) {
        for (int i = 0; i < num_workers; ++i) {
            workers_.emplace_back(&HttpServer::worker_thread, this, i);
        }
    }

    ~HttpServer() {
        stop();
    }

    // Accept a request (simulates a client sending one)
    void receive_request(const std::string& method, const std::string& path) {
        HttpRequest req;
        req.id = ++request_count_;
        req.method = method;
        req.path = path;

        {
            std::lock_guard<std::mutex> lock(queue_mutex_);
            request_queue_.push(req);
        }

        queue_cv_.notify_one();
    }

    // Stop the server (safe to call more than once)
    void stop() {
        {
            // Change the flag under the mutex so that a worker cannot miss the wakeup
            // between evaluating the predicate and going to sleep
            std::lock_guard<std::mutex> lock(queue_mutex_);
            running_ = false;
        }
        queue_cv_.notify_all();

        for (auto& worker : workers_) {
            if (worker.joinable()) {
                worker.join();
            }
        }
    }

    // Number of requests currently waiting in the queue
    int queue_size() const {
        std::lock_guard<std::mutex> lock(queue_mutex_);
        return static_cast<int>(request_queue_.size());
    }
};

int main() {
    HttpServer server(4); // 4 worker threads

    // Simulate 100 concurrent clients
    std::vector<std::thread> clients;
    for (int i = 0; i < 100; ++i) {
        clients.emplace_back([&server, i]() {
            std::this_thread::sleep_for(std::chrono::milliseconds(i * 10));
            server.receive_request("GET", "/api/data/" + std::to_string(i));
        });
    }

    // Wait until every request has been sent
    for (auto& client : clients) {
        client.join();
    }

    // Wait for the queue to drain
    while (server.queue_size() > 0) {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }

    server.stop(); // joins the workers, so in-flight requests finish first

    std::cout << "All requests processed.\n";
    return 0;
}
```

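XV. Concurrency Enhancements in C++17/C++20 (Preview)

C++17 improvements

`std::shared_mutex`: a lighter-weight reader-writer lock

Parallel algorithms: `std::execution::par`

```cpp
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<int> data(1000000);
    // The execution policy allows the library to sort on several threads
    std::sort(std::execution::par, data.begin(), data.end());
}
```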

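Key C++20 features

- Coroutines: a new model for asynchronous programming
- Semaphores: a more flexible synchronization primitive
- Barriers: coordinating groups of threads

`std::jthread`: a thread that joins automatically

```cpp
#include <stop_token>
#include <thread>

std::jthread worker([](std::stop_token token) {
    while (!token.stop_requested()) {
        // do work
    }
});
// The destructor requests a stop and then joins automatically
```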

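Summary:

- Prefer high-level abstractions: `async` > `packaged_task` > `promise` > raw `thread`.
- RAII is your friend: always use `lock_guard` / `unique_lock` rather than locking and unlocking by hand.
- Lock-free is not a cure-all: use atomics only in simple cases where a performance need has been demonstrated.
- Minimize synchronization: keep critical sections as small as possible, and consider a reader-writer lock for read-heavy data.
- Prevent deadlock: lock in a fixed order or use `std::lock`.
- Mind cache effects: watch for false sharing and align data sensibly.
- Exception safety: make sure locks are released correctly when an exception is thrown.
- Measure performance: profile before optimizing concurrent code, and avoid premature optimization.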
