This is Chapter 6 of the "Rust from Zero" (零基础 Rust 入门) series, shared by front-end expert 零弌. If you want to explore the endless possibilities of front-end technology, follow us! 🤗
- 🧑💻 Who we are: the Infrastructure Services team of Alipay's Experience Technology Department (支付宝体验技术部-基础服务团队)
- 📖 Our column: 前沿视点 (Frontier Perspectives)
An overview of unsafe
- Dereference (read) raw pointers
- Call unsafe functions and methods
- Implement unsafe traits
- Access or modify mutable statics
- Access fields of unions
Why do we need unsafe traits? Rust currently has the following three unsafe traits (there are also unstable unsafe traits, which are not listed here).
- Send is a marker trait indicating that a struct can safely be moved to another thread.
- Sync is a marker trait promising that threads can safely share a struct through shared references.
- GlobalAlloc lets you replace the memory allocator for the whole program, e.g. to switch to jemalloc.
Whether to mark a trait unsafe is an API design choice. A safe trait is easier to implement, but any unsafe code that depends on it must defend against incorrect behavior; marking the trait unsafe shifts that responsibility onto the implementor. Rust has traditionally avoided marking traits unsafe, because doing so would make Unsafe Rust pervasive, which is not what we want.
GlobalAlloc is marked unsafe because it manages all of the program's memory, and everything else, like Box or Vec, is built on top of it. If it does something weird (such as handing the same block of memory to another request while it is still in use), there is no chance to detect the problem and do anything about it.
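To make "shifting the responsibility to the implementor" concrete, here is a minimal, hypothetical sketch: a struct wrapping a raw pointer is not Send or Sync by default, and opting in via unsafe impl is the implementor's promise to the compiler, one the compiler cannot verify.
rust
use std::ptr::NonNull;

// Hypothetical wrapper around memory that is managed elsewhere.
struct MyHandle(NonNull<u8>);

// Raw pointers are neither Send nor Sync, so MyHandle isn't either by default.
// Writing these impls is us promising that moving/sharing MyHandle across
// threads is actually sound; the compiler simply takes our word for it.
unsafe impl Send for MyHandle {}
unsafe impl Sync for MyHandle {}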
A bit of mockery from Rust
All of the following behaviors are considered safe in Rust:
- Deadlock
- Have a race condition
- Leak memory (memory safety != no memory leaks; a leak is usually a logic problem)
- Overflow integers (with the built-in operators such as + etc.)
- Abort the program
- Delete the production database
There is no such thing as bug-free code, only relatively safe code.
It is not Rust that is dangerous; it is the programmer.
unsafe vs. safe
unsafe and safe code cannot really trust each other, and the relationship is extremely subtle.
In the following code, whether the unsafe block is correct depends entirely on the correctness of the surrounding safe code.
rust
fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    if idx < arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}
If the bounds check on idx is wrong, as in the following code, we get an out-of-bounds access:
rust
fn index(idx: usize, arr: &[u8]) -> Option<u8> {
    // off-by-one: `idx == arr.len()` passes the check but is out of bounds
    if idx <= arr.len() {
        unsafe {
            Some(*arr.get_unchecked(idx))
        }
    } else {
        None
    }
}
Another example, from Vec. Safe Rust alone can write code that breaks the invariants unsafe code relies on, so Rust takes the necessary defensive measure here: make_room is not marked pub.
rust
pub struct Vec<T> {
    ptr: *mut T,
    len: usize,
    cap: usize,
}

impl<T> Vec<T> {
    pub fn push(&mut self, elem: T) {
        if self.len == self.cap {
            // not important for this example
            self.reallocate();
        }
        unsafe {
            ptr::write(self.ptr.add(self.len), elem);
            self.len += 1;
        }
    }

    fn make_room(&mut self) {
        // grow the capacity
        // NOTE: this "breaks" Vec's invariant (cap grows without allocating);
        // it is only sound because make_room is private.
        self.cap += 1;
    }
}
We have seen that unsafe code must trust some safe code, but it should not trust generic safe code. For similar reasons, privacy is important to unsafe code: it prevents us from having to trust all the safe code in the universe not to mess with our trusted state.
Data Layout
By default, composite structures have an alignment equal to the maximum of their fields' alignments. Rust will consequently insert padding where necessary to ensure that all fields are properly aligned and that the overall type's size is a multiple of its alignment.
rust
struct A {
    a: u8,
    b: u32,
    c: u16,
}
Conceptually (if the fields are kept in declaration order), this is laid out as:
rust
struct A {
    a: u8,
    _pad1: [u8; 3], // to align `b`
    b: u32,
    c: u16,
    _pad2: [u8; 2], // to make overall size multiple of 4
}
The layout of an enum is more involved.
rust
enum Foo {
    A(u32),
    B(u64),
    C(u8),
}

struct FooRepr {
    data: u64, // this is either a u64, u32, or u8 based on `tag`
    tag: u8,   // 0 = A, 1 = B, 2 = C
}
However there are several cases where such a representation is inefficient. The classic case of this is Rust's "null pointer optimization": an enum consisting of a single outer unit variant (e.g. None) and a (potentially nested) non- nullable pointer variant (e.g. Some(&T)) makes the tag unnecessary. A null pointer can safely be interpreted as the unit (None) variant. The net result is that, for example, size_of::<Option<&T>>() == size_of::<&T>().
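A quick way to see the null-pointer optimization is to compare sizes; a minimal sketch (the second size is printed rather than asserted, since its exact value depends on the target):
rust
use std::mem::size_of;

fn main() {
    // Guaranteed: `None` is represented by the null pointer, so no separate tag is needed.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());
    // For comparison, a type without such a niche needs room for the tag (plus padding):
    println!("{}", size_of::<Option<u64>>()); // typically 16 on 64-bit targets
}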
DSTs (Dynamically Sized Types) have no fixed size at compile time; they can only be used behind a pointer, and their data usually lives on the heap.
- trait objects: dyn MyTrait
- slices: [T], str, and others
ZSTs (Zero Sized Types) are a rather magical kind of type: they occupy no storage at all. This is what lets you implement Set<Key> as Map<Key, ()> with no extra cost.
rust
struct Nothing; // No fields = no size

// All fields have no size = no size
struct LotsOfNothing {
    foo: Nothing,
    qux: (),      // empty tuple has no size
    baz: [u8; 0], // empty array has no size
}
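A minimal sketch to confirm this, assuming the definitions above are in scope:
rust
use std::mem::size_of;

fn main() {
    // ZSTs occupy no storage at all:
    assert_eq!(size_of::<Nothing>(), 0);
    assert_eq!(size_of::<LotsOfNothing>(), 0);
    assert_eq!(size_of::<()>(), 0);
    // which is why Map<Key, ()> stores nothing for its values,
    // making it a cheap way to get Set<Key>.
}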
When communicating across FFI you cannot rely on Rust's native layout, because it differs from C's; for that, there is #[repr(C)]:
rust
#[repr(C)]
struct A {
    a: u8,
    b: u32,
    c: u16,
}
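A small sketch of what #[repr(C)] pins down, reusing the struct above (the concrete numbers in the comment assume a target where u32 is 4-byte aligned):
rust
use std::mem::{align_of, size_of};

fn main() {
    // With #[repr(C)], the fields stay in declaration order, so padding is inserted:
    // a (1) + pad (3) + b (4) + c (2) + pad (2) = 12 bytes on typical targets.
    println!("size = {}, align = {}", size_of::<A>(), align_of::<A>());
}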
Ownership in depth
reference
How could the following code be optimized?
rust
fn compute(input: &u32, output: &mut u32) {
    if *input > 10 {
        *output = 1;
    }
    if *input > 5 {
        *output *= 2;
    }
}
We can cache *input in a register and save one memory access:
rust
fn compute(input: &u32, output: &mut u32) {
    let cached_input = *input;
    if cached_input > 10 {
        // If the input is greater than 10, the previous code would set the output to 1 and then double it,
        // resulting in an output of 2 (because `>10` implies `>5`).
        // Here, we avoid the double assignment and just set it directly to 2.
        *output = 2;
    } else if cached_input > 5 {
        *output *= 2;
    }
}
Rust can apply this kind of optimization by default, because the compiler knows input and output must be two different variables: a mutable reference and an immutable reference to the same variable cannot coexist. It never needs to consider the case where input and output alias, in which case writing to output would force *input to be read again and the second if to be re-evaluated.
Lifetimes
rust
let x = 0;
let z;
let y = &x;
z = y;
From the lifetime system's point of view, the lifetimes of these objects look like this:
rust
// NOTE: `'a: {` and `&'b x` are not valid syntax; they are just annotations for illustration
'a: {
    let x: i32 = 0;
    'b: {
        let y: &'b i32 = &'b x;
        'c: {
            let z: &'c &'b i32 = &'c y;
        }
    }
}
Limitations of lifetimes
rust
#[derive(Debug)]
struct Foo;

impl Foo {
    fn mutate_and_share(&mut self) -> &Self { &*self }
    fn share(&self) {}
}

fn main() {
    let mut foo = Foo;
    let loan = foo.mutate_and_share();
    foo.share();
    println!("{:?}", loan);
}
The lifetime system is forced to extend the &mut foo borrow to cover the lifetime of loan, due to mutate_and_share's signature. Then, when we try to call share, it sees that we are trying to alias that still-live &mut foo and blows up in our face!
Implementing a map's common get_default operation runs straight into Rust's borrow conflicts (a common workaround is sketched after the snippet).
rust
use std::collections::HashMap;
use std::hash::Hash;

fn get_default<'m, K, V>(map: &'m mut HashMap<K, V>, key: K) -> &'m mut V
where
    K: Clone + Eq + Hash,
    V: Default,
{
    // first borrow of `map`
    match map.get_mut(&key) {
        Some(value) => value,
        None => {
            // second borrow of `map`, while the first is still live via the return value
            map.insert(key.clone(), V::default());
            map.get_mut(&key).unwrap()
        }
    }
}
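One common way around this conflict (not shown in the snippet above) is the standard HashMap entry API, which performs the lookup only once:
rust
use std::collections::HashMap;
use std::hash::Hash;

fn get_default<'m, K, V>(map: &'m mut HashMap<K, V>, key: K) -> &'m mut V
where
    K: Eq + Hash,
    V: Default,
{
    // `entry` takes the key by value, so `Clone` is no longer needed,
    // and there is only a single mutable borrow of `map`.
    map.entry(key).or_insert_with(V::default)
}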
Unbounded Lifetime
The most common source is taking a reference to a dereferenced raw pointer, which produces a reference with an unbounded lifetime. Such a lifetime becomes as big as the context demands. This is in fact more powerful than simply becoming 'static: for example, &'static &'a T would fail to type-check, but an unbounded lifetime will happily shape itself into &'a &'a T.
As a rough approximation, you can think of an unbounded lifetime as 'static.
rust
fn get_str<'a>(s: *const String) -> &'a str {
    unsafe { &*s }
}

fn main() {
    let soon_dropped = String::from("hello");
    let dangling = get_str(&soon_dropped);
    drop(soon_dropped);
    println!("Invalid str: {}", dangling); // prints garbage, e.g. `gӚ_`: a use after free
}
Splitting Borrow
Splitting borrows is a fairly common pattern; for example, the code below splits one mutable slice into two mutable slices.
rust
// from the standard library's slice implementation; `from_raw_parts_mut` is std::slice::from_raw_parts_mut
pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
    let len = self.len();
    let ptr = self.as_mut_ptr();
    unsafe {
        assert!(mid <= len);
        (from_raw_parts_mut(ptr, mid),
         from_raw_parts_mut(ptr.add(mid), len - mid))
    }
}
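For reference, a small usage sketch of the standard slice::split_at_mut, which is exactly what the snippet above implements:
rust
fn main() {
    let mut data = [1, 2, 3, 4, 5];
    // Two disjoint mutable borrows of the same array, obtained safely.
    let (left, right) = data.split_at_mut(2);
    left[0] = 10;
    right[0] = 30;
    assert_eq!(data, [10, 2, 30, 4, 5]);
}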
Types
Automatic conversions
- Direct call: First, the compiler checks if it can call T::foo(value) directly. This is called a "by value" method call.
- Autoref: If it can't call this function (for example, if the function has the wrong type or a trait isn't implemented for Self), then the compiler tries to add in an automatic reference. This means that the compiler tries <&T>::foo(value) and <&mut T>::foo(value). This is called an "autoref" method call.
- Autoderef: If none of these candidates worked, it dereferences T and tries again. This uses the Deref trait - if T: Deref<Target = U> then it tries again with type U instead of T. If it can't dereference T, it can also try unsizing T. This just means that if T has a size parameter known at compile time, it "forgets" it for the purpose of resolving methods. For instance, this unsizing step can convert [i32; 2] into [i32] by "forgetting" the size of the array.
rust
let array: Rc<Box<[T; 3]>> = ...;
let first_entry = array[0];
- array[0]
- ✅ desugars to: array.index(0)
- ❌ Rc<Box<[T; 3]>> does not implement Index
- ❌ &Rc<Box<[T; 3]>> does not implement Index
- ❌ &mut Rc<Box<[T; 3]>> does not implement Index
- ✅ Rc<Box<[T; 3]>> -- deref --> Box<[T; 3]>
- ❌ Box<[T; 3]> does not implement Index
- ❌ &Box<[T; 3]> does not implement Index
- ❌ &mut Box<[T; 3]> does not implement Index
- ✅ Box<[T; 3]> -- deref --> [T; 3]
- ❌ [T; 3] does not implement Index
- ❌ &[T; 3] does not implement Index
- ❌ &mut [T; 3] does not implement Index
- ✅ [T; 3] -- unsize --> [T]
- ✅ [T] implements Index, so [T]::index is called
Manual conversions (casts)
Manual casts must be valid at the type level, or they are rejected statically. For example, 7u8 as bool will not compile.
Slice lengths
When casting raw slice pointers, the length is not adjusted: casting *const [u16] to *const [u8] produces a slice covering only half of the original memory. Casts are also not transitive; that is, even if e as U1 as U2 is a valid expression, e as U2 is not necessarily valid.
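A few quick illustrations of these rules (a sketch; the u8 → char → u32 chain is just one example of a non-transitive cast):
rust
fn main() {
    // Numeric casts truncate: only the low bits survive.
    assert_eq!(300u16 as u8, 44);

    // Valid: u8 -> char is allowed, and char -> u32 is allowed...
    let c = 65u8 as char; // 'A'
    let n = c as u32;     // 65

    // ...but casts are not transitive: `65u32 as char` would not compile,
    // and neither would `7u8 as bool`.
    println!("{} {}", c, n);
}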
Transmute
As long as the two types have the same size, transmute will reinterpret one as the other.
rust
fn foo() -> i32 {
    0
}

// Crucially, we `as`-cast to a raw pointer before `transmute`ing to a function pointer.
// This avoids an integer-to-pointer `transmute`, which can be problematic.
// Transmuting between raw pointers and function pointers (i.e., two pointer types) is fine.
// first cast to a raw pointer
let pointer = foo as *const ();
let function = unsafe {
    // transmute between the two pointer types
    std::mem::transmute::<*const (), fn() -> i32>(pointer)
};
Concurrency
Race
A data race is defined as:
- two or more threads concurrently accessing a location of memory
- one or more of them is a write
- one or more of them is unsynchronized
rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4];
    let idx = Arc::new(AtomicUsize::new(0));
    let other_idx = idx.clone();

    thread::spawn(move || {
        // a separate thread adds 10 to idx
        other_idx.fetch_add(10, Ordering::SeqCst);
    });

    // naively check whether idx is in bounds
    if idx.load(Ordering::SeqCst) < data.len() {
        unsafe {
            // idx may have changed between the check above and this load
            println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst)));
        }
    }
}
Atomic
Rust essentially just inherits the memory model for atomics from C++20. This is not because the model is particularly excellent or easy to understand; indeed, it is quite complex and known to have several flaws. Rather, it is a pragmatic concession to the fact that everyone is pretty bad at modeling atomics. At the very least, we can benefit from the existing tooling and research around the C/C++ memory model. (You'll often see this model referred to as "C/C++11" or just "C11": C just copies the C++ memory model, and C++11 was the first version of that model, though it has received bugfixes since then.)
Original code: x is assigned twice.
rust
x = 1;
y = 3;
x = 2;
The compiler can optimize this by assigning x = 2 directly:
rust
x = 2;
y = 3;
But our program may well be multi-threaded, and another thread may rely on x being set to 1 before y is assigned. For example:
ini
initial state: x = 0, y = 1

THREAD 1          THREAD 2
y = 3;            if x == 1 {
x = 1;                y *= 2;
                  }
It is common to separate hardware into two categories: strongly-ordered and weakly-ordered. Most notably x86/64 provides strong ordering guarantees, while ARM provides weak ordering guarantees. This has two consequences for concurrent programming:
Asking for stronger guarantees on strongly-ordered hardware may be cheap or even free because they already provide strong guarantees unconditionally. Weaker guarantees may only yield performance wins on weakly-ordered hardware.
Asking for guarantees that are too weak on strongly-ordered hardware is more likely to happen to work, even though your program is strictly incorrect. If possible, concurrent algorithms should be tested on weakly-ordered hardware.
In short: x86/64 gives strong ordering guarantees, while ARM gives weak ones. Asking for strong ordering on x86/64 is cheap or even free, and weak ordering only yields performance wins on ARM. Concurrent algorithms should be tested on weakly-ordered hardware such as ARM whenever possible 😊.
Memory ordering options:
- Sequentially Consistent (SeqCst): the strongest ordering of all, implying the restrictions of every other ordering. Intuitively, a sequentially consistent operation cannot be reordered: all accesses that happen before and after a SeqCst access stay before and after it.
- Acquire/Release: a natural fit for acquiring and releasing locks and for making sure critical sections don't overlap. An Acquire operation ensures that every access after it stays after it; however, operations that occur before an Acquire are free to be reordered to after it. Similarly, a Release operation ensures that every access before it stays before it; however, operations that occur after a Release are free to be reordered to before it.
- Relaxed: relaxed operations can be freely reordered and provide no happens-before relationship. Still, they are atomic: they do not count as data accesses, and any read-modify-write operations performed on them happen atomically. If you are not using a counter to synchronize any other accesses, a relaxed fetch_add lets multiple threads increment it safely (see the sketch below).
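A minimal sketch of that last point: a counter bumped with Relaxed ordering, which is fine as long as the counter is not used to synchronize access to any other data.
rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    // Relaxed is enough: we only need the increment itself to be atomic,
                    // not any ordering with respect to other memory accesses.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(counter.load(Ordering::Relaxed), 4000);
}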
Implementing a Vec
layout
rust
use std::ptr::NonNull;

pub struct Vec<T> {
    // We need the following properties, so we pick the NonNull smart pointer:
    // - coercions keep working, e.g. the way &str can be coerced to &dyn std::fmt::Display
    // - it declares that we own values of type T, which the compiler derives from the generic parameter
    // - the pointer can never be null
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

// If T is Send/Sync, Vec<T> should be Send/Sync as well
unsafe impl<T: Send> Send for Vec<T> {}
unsafe impl<T: Sync> Sync for Vec<T> {}
allocate
rust
use std::mem;
use std::alloc::{self, Layout};

impl<T> Vec<T> {
    pub fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        Vec {
            // a dangling pointer lets us allocate lazily: len and cap are both 0 here
            ptr: NonNull::dangling(),
            len: 0,
            cap: 0,
        }
    }

    fn grow(&mut self) {
        // compute the new capacity and the new memory Layout
        let (new_cap, new_layout) = if self.cap == 0 {
            (1, Layout::array::<T>(1).unwrap())
        } else {
            // This can't overflow since self.cap <= isize::MAX.
            let new_cap = 2 * self.cap;

            // `Layout::array` checks that the number of bytes is <= usize::MAX,
            // but this is redundant since old_layout.size() <= isize::MAX,
            // so the `unwrap` should never fail.
            let new_layout = Layout::array::<T>(new_cap).unwrap();
            (new_cap, new_layout)
        };

        // Ensure that the new allocation doesn't exceed `isize::MAX` bytes.
        assert!(new_layout.size() <= isize::MAX as usize, "Allocation too large");

        let new_ptr = if self.cap == 0 {
            // first allocation
            unsafe { alloc::alloc(new_layout) }
        } else {
            let old_layout = Layout::array::<T>(self.cap).unwrap();
            let old_ptr = self.ptr.as_ptr() as *mut u8;
            // reallocate the existing block
            unsafe { alloc::realloc(old_ptr, old_layout, new_layout.size()) }
        };

        // If allocation fails, `new_ptr` will be null, in which case we abort (OOM).
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}
push/pop
rust
pub fn push(&mut self, elem: T) {
    if self.len == self.cap { self.grow(); }

    unsafe {
        // write elem to ptr + size_of::<T>() * len
        ptr::write(self.ptr.as_ptr().add(self.len), elem);
    }

    // defensive: len is only bumped after the element has been written
    // Can't fail, we'll OOM first.
    self.len += 1;
}

pub fn pop(&mut self) -> Option<T> {
    if self.len == 0 {
        None
    } else {
        self.len -= 1;
        unsafe {
            // read the value at ptr + size_of::<T>() * len
            Some(ptr::read(self.ptr.as_ptr().add(self.len)))
        }
    }
}
deallocate
rust
impl<T> Drop for Vec<T> {
    fn drop(&mut self) {
        // when cap == 0 we never allocated, so we must not touch ptr
        if self.cap != 0 {
            while let Some(_) = self.pop() { }
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe {
                alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout);
            }
        }
    }
}
deref
rust
use std::ops::Deref;
use std::ops::DerefMut;

impl<T> Deref for Vec<T> {
    type Target = [T];
    fn deref(&self) -> &[T] {
        unsafe {
            std::slice::from_raw_parts(self.ptr.as_ptr(), self.len)
        }
    }
}

impl<T> DerefMut for Vec<T> {
    fn deref_mut(&mut self) -> &mut [T] {
        unsafe {
            std::slice::from_raw_parts_mut(self.ptr.as_ptr(), self.len)
        }
    }
}
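With Deref/DerefMut in place, our Vec already behaves a lot like the standard one. A small usage sketch, assuming the code above is in scope:
rust
fn main() {
    let mut v: Vec<i32> = Vec::new();
    v.push(1);
    v.push(2);
    v.push(3);
    // slice methods come for free via Deref to [T]
    assert_eq!(v.len(), 3);
    assert_eq!(v.first(), Some(&1));
    assert_eq!(v.iter().sum::<i32>(), 6);
    assert_eq!(v.pop(), Some(3));
    assert_eq!(v.len(), 2);
}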
Homework
Implement insert/remove/IntoIter/RawVec/Drain and ZST support for this Vec.