Modern C++: Why Shared-Ownership Pointers Destroy Their Managed Object Safely

Outline

The paradox

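In the earlier article "Modern C++: Making Unique Ownership Explicit" we introduced std::unique_ptr. When the hand-off of ownership is already clear at the time the code is written, std::unique_ptr is both a good fit and efficient. In the code below, for example, Custom ends up being used inside a thread, so we simply "transfer" its ownership from main to process.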

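```cpp
#include <iostream>
#include <memory>
#include <thread>
#include <utility>

// Minimal stand-in for the Custom class from the earlier article
// (an assumption: only the constructor and get_value() are needed here).
class Custom {
public:
    explicit Custom(int value) : value_(value) {}
    int get_value() const { return value_; }
private:
    int value_;
};

void process(std::unique_ptr<Custom> ptr) {
    std::cout << "Processing value: " << ptr->get_value() << std::endl;
}

int main() {
    std::unique_ptr<Custom> unique_ptr_custom = std::make_unique<Custom>(30);
    std::thread t1(process, std::move(unique_ptr_custom));

    // unique_ptr_custom is now nullptr
    if (!unique_ptr_custom) {
        std::cout << "unique_ptr_custom is now nullptr" << std::endl;
    }

    t1.join();

    return 0;
}
```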

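In some more complex scenarios, however, ownership cannot be pinned down when the code is written, and that is where the "shared ownership" of shared_ptr comes in. In the code below, threads t1 and t2 run concurrently, so should ownership of shared_ptr_custom belong to process or to print_use_count? That in turn raises another question: which function destroys the object managed by shared_ptr_custom?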

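```cpp
#include <iostream>
#include <memory>
#include <thread>

// Minimal stand-in for the Custom class from the earlier article
// (an assumption: only the constructor and get_value() are needed here).
class Custom {
public:
    explicit Custom(int value) : value_(value) {}
    int get_value() const { return value_; }
private:
    int value_;
};

void print_use_count(std::shared_ptr<Custom> shared_ptr_custom) {
    std::cout << "shared_ptr_custom.use_count() = " << shared_ptr_custom.use_count() << std::endl;
}

void process(std::shared_ptr<Custom> shared_ptr_custom) {
    std::cout << "Processing value: " << shared_ptr_custom->get_value() << std::endl;
}

void start_thread() {
    std::shared_ptr<Custom> shared_ptr_custom = std::make_shared<Custom>(1);
    // Each thread receives its own copy of the shared_ptr.
    std::thread t1(process, shared_ptr_custom);
    std::thread t2(print_use_count, shared_ptr_custom);
    t1.join();
    t2.join();
}

int main(int argc, char* argv[]) {
    std::thread t(start_thread);
    t.join();

    return 0;
}
```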

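Under the hood, shared_ptr keeps a member variable that is updated with atomic operations and counts how many shared_ptr objects currently hold the managed pointer. (The excerpts in this article come from libstdc++, where that member is _M_use_count.) When a shared_ptr is copy-constructed, the count is atomically incremented: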

```cpp
  // From bits/shared_ptr_base.h (_Sp_counted_base)
  // Increment the use count (used when the count is greater than zero).
  void
  _M_add_ref_copy()
  { __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }

  // From ext/atomicity.h
  inline void
  __attribute__ ((__always_inline__))
  __atomic_add_dispatch(_Atomic_word* __mem, int __val)
  {
    if (__is_single_threaded())
      __atomic_add_single(__mem, __val);
    else
      __atomic_add(__mem, __val);
  }
```
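The effect of that counter can be observed from outside through use_count(). The following is only a small single-threaded sketch, not part of the library code above; the function name observe_use_counts is made up for illustration. It records the count after each copy and again after the copies are destroyed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical helper: records use_count() at each step so the effect of
// copying and destroying shared_ptr objects can be inspected afterwards.
std::vector<long> observe_use_counts() {
    std::vector<long> seen;

    auto first = std::make_shared<int>(42);    // a single owner
    seen.push_back(first.use_count());
    {
        std::shared_ptr<int> second = first;   // copy construction: count goes up
        seen.push_back(first.use_count());

        std::shared_ptr<int> third = second;   // another copy: count goes up again
        seen.push_back(first.use_count());
    }                                          // third and second are destroyed here
    seen.push_back(first.use_count());         // only first is left

    assert(first.use_count() == 1);
    return seen;
}
```

Calling observe_use_counts() from a single thread yields the sequence 1, 2, 3, 1. In multithreaded code use_count() is only an approximate snapshot, so it is better suited to demonstrations like this one than to program logic.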

When a shared_ptr object is destroyed, the count is atomically decremented:

```cpp
template<>
    inline void
    _Sp_counted_base<_S_atomic>::_M_release() noexcept
    {
      _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(&_M_use_count);
#if ! _GLIBCXX_TSAN
      constexpr bool __lock_free
        = __atomic_always_lock_free(sizeof(long long), 0)
        && __atomic_always_lock_free(sizeof(_Atomic_word), 0);
      constexpr bool __double_word
        = sizeof(long long) == 2 * sizeof(_Atomic_word);
      // The ref-count members follow the vptr, so are aligned to
      // alignof(void*).
      constexpr bool __aligned = __alignof(long long) <= alignof(void*);
      if _GLIBCXX17_CONSTEXPR (__lock_free && __double_word && __aligned)
        {
          constexpr int __wordbits = __CHAR_BIT__ * sizeof(_Atomic_word);
          constexpr int __shiftbits = __double_word ? __wordbits : 0;
          constexpr long long __unique_ref = 1LL + (1LL << __shiftbits);
          auto __both_counts = reinterpret_cast<long long*>(&_M_use_count);

          _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(&_M_weak_count);
          if (__atomic_load_n(__both_counts, __ATOMIC_ACQUIRE) == __unique_ref)
            {
              // Both counts are 1, so there are no weak references and
              // we are releasing the last strong reference. No other
              // threads can observe the effects of this _M_release()
              // call (e.g. calling use_count()) without a data race.
              _M_weak_count = _M_use_count = 0;
              _GLIBCXX_SYNCHRONIZATION_HAPPENS_AFTER(&_M_use_count);
              _GLIBCXX_SYNCHRONIZATION_HAPPENS_AFTER(&_M_weak_count);
              _M_dispose();
              _M_destroy();
              return;
            }
          if (__gnu_cxx::__exchange_and_add_dispatch(&_M_use_count, -1) == 1)
            [[__unlikely__]]
            {
              _M_release_last_use_cold();
              return;
            }
        }
      // (The excerpt ends here; the rest of _M_release() is not shown.)
```

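In _M_release(), __atomic_load_n reads a value that is then compared with __unique_ref. The value read is __both_counts, which views _M_use_count and _M_weak_count together as a single long long, and __unique_ref is the value that word has when both counts are 1. If the two differ, the code goes on to atomically decrement _M_use_count; if they are equal, it destroys the managed object.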
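To make the comparison against __unique_ref more concrete, here is a minimal sketch of the idea of reading two adjacent 32-bit counters as one 64-bit word. It is not the library's code: the names Counts, pack and is_last_owner are hypothetical, the counters are assumed to be 32 bits wide, and no atomics are involved because only the layout is being illustrated.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the two adjacent counters in the control block.
struct Counts {
    std::int32_t use_count;
    std::int32_t weak_count;
};

// The value of the packed 64-bit word when both counters are exactly 1.
constexpr std::int64_t unique_ref = 1LL + (1LL << 32);

// Copies both counters into one 64-bit integer.
std::int64_t pack(const Counts& c) {
    static_assert(sizeof(Counts) == sizeof(std::int64_t),
                  "this sketch assumes two 32-bit counters with no padding");
    std::int64_t both = 0;
    std::memcpy(&both, &c, sizeof both);
    return both;
}

// True only when there is exactly one strong reference and the weak count
// is 1, so a single comparison checks both counters at once.
bool is_last_owner(const Counts& c) {
    return pack(c) == unique_ref;
}
```

With this layout, is_last_owner(Counts{1, 1}) is true, while Counts{2, 1} and Counts{1, 2} both fail the test. In libstdc++ the weak count includes one reference held on behalf of all strong owners, which is why a weak count of 1 means that no weak_ptr exists.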

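Readers who regularly debug multithreading problems may have a question about _M_release(): the atomic operation only loads the value, and the comparison and the destruction that follow are not protected by anything. Is code like this really free of thread-safety problems?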

Rewriting that part of _M_release() as the following pseudocode makes the concern easier to see.

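```cpp
// Pseudocode: a simplified view of the fast path in _M_release().
auto tmp = __atomic_load_n(__both_counts, __ATOMIC_ACQUIRE);
// Code in another thread may have modified __both_counts by now.

if (tmp == __unique_ref) {
    // Code in another thread may have changed __both_counts, so the
    // value held in tmp may already be stale.
    _M_use_count = 0;
    // Another thread may have modified _M_use_count atomically in the meantime.
    _M_weak_count = 0;
    // Another thread may have modified _M_weak_count atomically in the meantime.

    // Another thread may still want to use the managed pointer.
    _M_dispose();
}
```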

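Normally, an atomic compare-then-replace is done with a function such as compare_exchange_strong. It packages the "compare" and the "assign", which would otherwise be several separate CPU instructions, into a single atomic operation, so no other thread can alter the data between the steps.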

```cpp
#include <iostream>
#include <atomic>

int main() {
    std::atomic<int> value(1);

    // Try to change value from 1 to 0.
    int expected = 1;
    bool success = value.compare_exchange_strong(expected, 0);

    if (success) {
        std::cout << "Value was 1, changed to 0." << std::endl;
    } else {
        // On failure, expected has been overwritten with the current value.
        std::cout << "Value was not 1, it was " << expected << "." << std::endl;
    }

    return 0;
}
```

shared_ptr, however, performs several steps that are not thread-safe in themselves before it destroys the managed object. So is this code safe?

The paradox

The answer is yes, it is safe.

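This is a matter of logic. Suppose that, at some moment while a shared_ptr (call it A) is being destroyed, __atomic_load_n(__both_counts, __ATOMIC_ACQUIRE) == __unique_ref holds. Both counts are then 1, so A is the last remaining copy and, as the comment in the source notes, there are no weak references either. If a new copy (B) were to be created after that load, it would have to be copied from an existing copy, and the only one left is A. But if some other code still held a copy from which B could be made, the use count would be at least 2, and the comparison, which requires a count of 1, could not have succeeded. The assumption therefore contradicts itself. Because of this paradox, no other thread can legitimately touch the counts or the managed object once the comparison has succeeded, and that is why the destruction of the managed object by shared_ptr is thread-safe.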

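So the claim often seen online, that shared_ptr guarantees safe destruction because it uses atomic operations under the hood, is not rigorous.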
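The practical consequence of the argument can be checked with a small experiment. The sketch below is an illustration under assumptions, not a proof: Tracked, destroyed and count_destructions are hypothetical names. Several threads each own a copy of the same shared_ptr and drop it at unpredictable times, and the code counts how often the destructor of the managed object runs.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

std::atomic<int> destroyed{0};   // how many times ~Tracked() has run

// Hypothetical payload type that reports its own destruction.
struct Tracked {
    int value = 5;
    ~Tracked() { destroyed.fetch_add(1); }
};

// Starts thread_count threads that each own a copy of the same shared_ptr,
// lets every owner go out of scope, and returns the number of destructions.
int count_destructions(int thread_count) {
    destroyed = 0;
    std::atomic<int> sum{0};
    std::vector<std::thread> workers;
    {
        auto owner = std::make_shared<Tracked>();
        for (int i = 0; i < thread_count; ++i) {
            // Passing owner by value gives every thread its own copy.
            workers.emplace_back(
                [&sum](std::shared_ptr<Tracked> p) { sum.fetch_add(p->value); },
                owner);
        }
    }   // the creating scope drops its copy here, possibly before the threads finish
    for (auto& w : workers) {
        w.join();
    }
    assert(sum.load() == 5 * thread_count);   // every thread saw a live object
    return destroyed.load();
}
```

Whichever owner happens to release last, count_destructions returns 1: the managed object is destroyed exactly once, and only after every thread has finished using its copy.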

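It also implies something else: if we want shared_ptr to keep destroying its managed object safely, we should use shared_ptr through the construction and destruction order that the compiler arranges, rather than bypassing those mechanisms. Otherwise thread-safety problems will appear.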
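As a rough illustration of what staying within those mechanisms looks like, the sketch below gives the worker thread its own copy of the shared_ptr. The function name read_through_copy is hypothetical. The unsafe alternatives are described only in a comment and are deliberately not executed.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Safe pattern: the worker receives its own copy, so the managed int stays
// alive for as long as the worker needs it.
int read_through_copy() {
    int result = 0;
    std::thread worker;
    {
        auto owner = std::make_shared<int>(7);
        // The copy for the thread is made here, in the calling thread,
        // before owner goes out of scope.
        worker = std::thread(
            [&result](std::shared_ptr<int> p) { result = *p; },
            owner);
    }   // owner is destroyed here; the worker's copy keeps the int alive
    worker.join();   // after join(), reading result is properly synchronized
    return result;

    // Unsafe variants (not executed): handing the thread std::ref(owner) or
    // the raw pointer from owner.get() creates no new owner, so the int may
    // be destroyed while the worker is still reading it.
}
```

Here read_through_copy() returns 7 regardless of whether the worker runs before or after the creating scope ends, because the reference count reflects every party that still needs the object.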