Rust: zero-cost abstraction

[Iterators vs. For Loops](#Iterators vs. For Loops)
[Generics and Monomorphization](#Generics and Monomorphization)
[Trait-based Abstraction](#Trait-based Abstraction)
[Closures vs. Function Pointers](#Closures vs. Function Pointers)
[Enums and Pattern Matching](#Enums and Pattern Matching)
[Smart Pointers (e.g., Box, Rc, Arc)](#Smart Pointers (e.g., Box, Rc, Arc))
[Ownership and Borrowing](#Ownership and Borrowing)
[Static Dispatch vs. Dynamic Dispatch](#Static Dispatch vs. Dynamic Dispatch)

Iterators vs. For Loops

Rust's "zero-cost abstractions" are high-level constructs that compile down to low-level, optimized code, so using them adds no runtime cost compared to writing the low-level version by hand. The examples below show how Rust achieves this, starting with iterators:

Using Iterators:

fn sum_of_squares(v: &[i32]) -> i32 {
    v.iter().map(|&x| x * x).sum()
}

Manual For Loop:

fn sum_of_squares(v: &[i32]) -> i32 {
    let mut sum = 0;
    for &x in v {
        sum += x * x;
    }
    sum
}

To compare iterators and for loops empirically, we can write a small benchmark that times both approaches on the sum of squares, using std::time::Instant to measure the elapsed time. Build in release mode (cargo run --release) so that the optimizer is enabled.

use std::time::Instant;

fn sum_of_squares_iterator(v: &[i32]) -> i32 {
    v.iter().map(|&x| x * x).sum()
}

fn sum_of_squares_for_loop(v: &[i32]) -> i32 {
    let mut sum = 0;
    for &x in v {
        sum += x * x;
    }
    sum
}

fn main() {
    // Create the input vector: the integers 1..=1000
    let size = 1_000;
    let v: Vec<i32> = (1..=size).collect();

    // Benchmarking Iterators
    let start_iter = Instant::now();
    let sum_iter = sum_of_squares_iterator(&v);
    let duration_iter = start_iter.elapsed();
    println!("Iterator: Sum = {}, Time = {:?}", sum_iter, duration_iter);

    // Benchmarking For Loop
    let start_loop = Instant::now();
    let sum_loop = sum_of_squares_for_loop(&v);
    let duration_loop = start_loop.elapsed();
    println!("For Loop: Sum = {}, Time = {:?}", sum_loop, duration_loop);
}

We define two functions:

1. sum_of_squares_iterator uses iterators with iter(), map(), and sum().

2. sum_of_squares_for_loop uses a for loop to calculate the sum of squares.

The vector v contains the integers 1 to 1000, which keeps the sum of squares within the range of i32.

We measure the time taken for each approach using Instant::now() and compare the results.

Output:

The output shows the total sum (which should be the same for both methods) and the time taken by each approach. In this run the two timings are within a few microseconds of each other:

Iterator: Sum = 333833500, Time = 24.291µs
For Loop: Sum = 333833500, Time = 26.75µs

Using u128 and 10,000,000 elements:

use std::time::Instant;

fn sum_of_squares_iterator(v: &[u128]) -> u128 {
    v.iter()
        .map(|&x| x * x) // Square each element; the total exceeds u64::MAX, hence u128
        .sum()
}

fn sum_of_squares_for_loop(v: &[u128]) -> u128 {
    let mut sum: u128 = 0;
    for &x in v {
        sum += x * x; // Accumulate in u128 so the total cannot overflow
    }
    sum
}

fn main() {
    // Create a large vector for benchmarking
    let size = 10_000_000;
    let v: Vec<u128> = (1..=size).collect(); // Explicitly type the vector as Vec<u128>

    // Benchmarking Iterators
    let start_iter = Instant::now();
    let sum_iter: u128 = sum_of_squares_iterator(&v);
    let duration_iter = start_iter.elapsed();
    println!("Iterator: Sum = {}, Time = {:?}", sum_iter, duration_iter);

    // Benchmarking For Loop
    let start_loop = Instant::now();
    let sum_loop = sum_of_squares_for_loop(&v);
    let duration_loop = start_loop.elapsed();
    println!("For Loop: Sum = {}, Time = {:?}", sum_loop, duration_loop);
}

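With size = 10_000_000, the two timings differ by roughly a factor of four in this run. A single Instant measurement like this is sensitive to the build profile (debug versus --release) and to which version runs first, so the gap by itself does not show that iterators are inherently slower: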

Iterator: Sum = 333333383333335000000, Time = 254.317167ms
For Loop: Sum = 333333383333335000000, Time = 61.483042ms
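
A slightly more careful measurement can be sketched as follows. This is only an illustration, not a replacement for a real benchmarking harness: the helper name best_of, the warm-up run, the choice of five timed runs, and the 1,000,000-element input are all assumptions made for this example. It uses std::hint::black_box so that the optimizer cannot discard or precompute the work being timed.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn sum_of_squares_iterator(v: &[u128]) -> u128 {
    v.iter().map(|&x| x * x).sum()
}

fn sum_of_squares_for_loop(v: &[u128]) -> u128 {
    let mut sum: u128 = 0;
    for &x in v {
        sum += x * x;
    }
    sum
}

/// Hypothetical helper: runs `f` once as an untimed warm-up, then `runs`
/// more times, and returns the last result with the fastest observed time.
fn best_of(runs: u32, v: &[u128], f: fn(&[u128]) -> u128) -> (u128, Duration) {
    let mut result = black_box(f(black_box(v))); // warm-up, not timed
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        result = black_box(f(black_box(v)));
        best = best.min(start.elapsed());
    }
    (result, best)
}

fn main() {
    let v: Vec<u128> = (1..=1_000_000).collect();

    let (sum_iter, time_iter) = best_of(5, &v, sum_of_squares_iterator);
    let (sum_loop, time_loop) = best_of(5, &v, sum_of_squares_for_loop);

    // Both versions must agree before the timings mean anything
    assert_eq!(sum_iter, sum_loop);
    println!("Iterator: Sum = {}, best of 5 = {:?}", sum_iter, time_iter);
    println!("For Loop: Sum = {}, best of 5 = {:?}", sum_loop, time_loop);
}
```

Run it with cargo run --release; in a debug build the iterator adapters are not inlined, which can distort the comparison considerably.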

Generics and Monomorphization

Rust allows for generic functions and types that work with any type, but at compile time, it monomorphizes them, generating specialized versions of the function or type for each concrete type used. This eliminates the overhead of dynamic dispatch or boxing.

Generic Function:

fn add<T: std::ops::Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

If you call add(3, 4) and add(1.5, 2.0), the compiler generates specific versions of the function for i32 and f64, respectively. Generics cost nothing at runtime because the specialized code is generated during compilation. Conceptually, the result is equivalent to writing these two functions by hand:

fn add_i32(a: i32, b: i32) -> i32 { a + b }
fn add_f64(a: f64, b: f64) -> f64 { a + b }
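
To make the idea concrete, the following sketch calls the generic add with two different types and then names the two instantiations explicitly. The variable names are illustrative only; the turbofish syntax add::<i32> is standard Rust for selecting one monomorphized instance.

```rust
use std::ops::Add;

fn add<T: Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

fn main() {
    // Each call site fixes a concrete T, so the compiler emits one
    // specialized copy of `add` per type that is actually used.
    let int_sum = add(3, 4); // T = i32
    let float_sum = add(1.5, 2.0); // T = f64
    println!("{} {}", int_sum, float_sum);

    // The instantiations are ordinary functions with different signatures,
    // which is why they can be stored as plain function pointers.
    let add_i32: fn(i32, i32) -> i32 = add::<i32>;
    let add_f64: fn(f64, f64) -> f64 = add::<f64>;
    println!("{} {}", add_i32(10, 20), add_f64(0.5, 0.25));
}
```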

Trait-based Abstraction

Traits define behavior shared across types. When a trait method is called on a concrete type or through a generic parameter, the call is resolved at compile time, so the abstraction adds no runtime overhead.

Using Traits:

trait Area {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Area for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

struct Square {
    side: f64,
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

Even though this uses a high-level trait abstraction, calls to area() are resolved at compile time whenever the concrete type is known. The compiler can inline them like any other function, so there is no additional cost compared to calling a method defined directly on the concrete type.
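
As a rough illustration of that claim, the sketch below puts an inherent method next to the trait method and calls the trait method through a generic function. The names area_direct and scaled_area are hypothetical and exist only for this comparison.

```rust
trait Area {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

impl Square {
    // Inherent method (hypothetical): a direct call with no trait involved.
    fn area_direct(&self) -> f64 {
        self.side * self.side
    }
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Generic over any `Area` implementor. The compiler generates a copy of this
// function for `Square`, in which `shape.area()` is a statically known call.
fn scaled_area<T: Area>(shape: &T, factor: f64) -> f64 {
    shape.area() * factor
}

fn main() {
    let square = Square { side: 3.0 };
    // Both paths compute the same value; neither needs a runtime lookup.
    println!("direct:  {}", square.area_direct() * 2.0);
    println!("generic: {}", scaled_area(&square, 2.0));
}
```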

Closures vs. Function Pointers

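Closures in Rust are anonymous functions that can capture variables from their environment. The compiler turns each closure into its own anonymous type that stores the captured values and implements one of the Fn, FnMut, or FnOnce traits, so passing a closure to a generic function uses static dispatch and adds no runtime overhead.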

Using Closures:

fn apply<F>(f: F, x: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(x)
}

fn main() {
    let result = apply(|x| x + 1, 5);
    println!("{}", result); // 6
}

This example passes a closure to a generic function. Because apply is monomorphized for the closure's concrete type, the call f(x) is a direct, statically dispatched call that the optimizer can inline; no function pointer or heap allocation is involved.
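
One way to see how closures differ from function pointers is to look at their sizes. The sketch below is illustrative: the variable names are made up for this example, and the exact closure sizes are an observation about current compilers rather than a documented guarantee. A non-capturing closure can also be coerced to a real function pointer when one is needed.

```rust
use std::mem::{size_of, size_of_val};

fn apply<F>(f: F, x: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(x)
}

fn main() {
    let offset: i32 = 10;

    // Captures nothing: the closure's anonymous type has no fields.
    let add_one = |x: i32| x + 1;
    // Captures `offset` by value: the closure's type stores one i32.
    let add_offset = move |x: i32| x + offset;

    println!("add_one size:    {} bytes", size_of_val(&add_one));
    println!("add_offset size: {} bytes", size_of_val(&add_offset));
    println!("fn pointer size: {} bytes", size_of::<fn(i32) -> i32>());

    // Static dispatch: `apply` is instantiated once per closure type.
    println!("{} {}", apply(add_one, 5), apply(add_offset, 5));

    // A non-capturing closure can be coerced to a plain function pointer.
    let as_fn_pointer: fn(i32) -> i32 = |x| x + 1;
    println!("{}", apply(as_fn_pointer, 5));
}
```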

Enums and Pattern Matching

Enums are a zero-cost abstraction for handling different types or states. Pattern matching is highly optimized and incurs no runtime overhead compared to manual branching.

Enum Example:

enum Shape {
    Circle(f64),
    Square(f64),
}

fn area(shape: Shape) -> f64 {
    match shape {
        Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
        Shape::Square(side) => side * side,
    }
}

The match expression compiles down to an efficient branch or jump table on the enum's discriminant. Rust doesn't add any runtime cost beyond what manually written conditionals would incur, and the compiler additionally checks that every variant is handled.
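
For comparison, here is a sketch of what the "manually written" alternative might look like: an explicit tag plus payload, with the branching done by hand. RawShape, its tag encoding, and area_manual are hypothetical names introduced only for this illustration. The last line shows a related layout guarantee from the standard library: Option<Box<T>> is the same size as Box<T>, because None is represented by the null pointer.

```rust
use std::f64::consts::PI;
use std::mem::size_of;

enum Shape {
    Circle(f64),
    Square(f64),
}

fn area(shape: Shape) -> f64 {
    match shape {
        Shape::Circle(radius) => PI * radius * radius,
        Shape::Square(side) => side * side,
    }
}

// Hand-rolled equivalent (hypothetical): the tag and payload are managed
// manually, and nothing stops a caller from passing an invalid tag.
struct RawShape {
    tag: u8, // 0 = circle, anything else = square
    value: f64,
}

fn area_manual(shape: RawShape) -> f64 {
    if shape.tag == 0 {
        PI * shape.value * shape.value
    } else {
        shape.value * shape.value
    }
}

fn main() {
    println!("{}", area(Shape::Circle(1.0)));
    println!("{}", area(Shape::Square(2.0)));
    println!("{}", area_manual(RawShape { tag: 1, value: 2.0 }));

    // Option<Box<T>> needs no separate tag: None is the null pointer.
    println!("{}", size_of::<Option<Box<i32>>>() == size_of::<Box<i32>>());
}
```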

Smart Pointers (e.g., Box, Rc, Arc)

Rust's smart pointers like Box, Rc, and Arc are abstractions over raw pointers, and they are zero-cost in the sense that they add only the bookkeeping their job requires: none for Box beyond the allocation itself, a reference count for Rc, and an atomic reference count for Arc. There's no unnecessary overhead for using smart pointers, and they're often as fast as managing raw memory by hand in C.

Boxed Values:

let x = Box::new(5); // Allocates `5` on the heap, but no unnecessary overhead

In this case, the Box type ensures that the value is allocated on the heap, but the abstraction adds no runtime cost beyond the heap allocation itself and the matching deallocation when x goes out of scope.
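
The bookkeeping described above can be observed directly. The following sketch is only illustrative, and the variable names are made up for this example: it checks that a Box<i32> is the size of a raw pointer, and that cloning an Rc or Arc bumps a counter instead of copying the data.

```rust
use std::mem::size_of;
use std::rc::Rc;
use std::sync::Arc;

fn main() {
    // Box<T> for a sized T is a single pointer; nothing extra is stored.
    let boxed = Box::new(5);
    println!(
        "Box<i32> is pointer-sized: {}",
        size_of::<Box<i32>>() == size_of::<*const i32>()
    );
    println!("boxed value: {}", *boxed);

    // Rc keeps a non-atomic reference count next to the value.
    let first = Rc::new(String::from("shared"));
    let second = Rc::clone(&first); // increments the count; the String is not copied
    println!("Rc strong count: {}", Rc::strong_count(&first));
    println!("same allocation: {}", Rc::ptr_eq(&first, &second));

    // Arc does the same with an atomic counter, so it can cross threads.
    let shared = Arc::new(5);
    let other = Arc::clone(&shared);
    println!("Arc strong count: {}", Arc::strong_count(&other));
}
```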

Ownership and Borrowing

Rust's ownership system is enforced at compile time, meaning it incurs no runtime cost. The borrow checker ensures that memory is managed safely without needing garbage collection or reference counting unless you explicitly choose to use Rc or Arc.

Ownership:

fn main() {
    let x = String::from("hello");
    let y = x; // Ownership of the String moves from `x` to `y`; `x` can no longer be used
    println!("{}", y); // Safe access with no runtime cost
}

In other languages, managing memory safely often relies on a garbage collector or other runtime checks. Rust verifies ownership and borrowing entirely at compile time, making ownership a zero-cost abstraction.
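
A small sketch can show what a move does and does not do at runtime. The helper byte_len and the variable names are hypothetical, added only for this illustration. Moving a String transfers ownership of the same heap buffer, so the data pointer is unchanged, and borrowing with & passes a reference without copying or counting anything.

```rust
// Hypothetical helper: borrows the text instead of taking ownership.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let x = String::from("hello");
    let buffer_before = x.as_ptr();

    let y = x; // move: ownership of the heap buffer passes to `y`
    // println!("{}", x); // would not compile: `x` was moved

    // The heap buffer itself was not copied or reallocated by the move.
    assert_eq!(buffer_before, y.as_ptr());

    // Borrowing lends `y` to the function; `y` is still usable afterwards.
    let len = byte_len(&y);
    println!("{} has {} bytes", y, len);
}
```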

Static Dispatch vs. Dynamic Dispatch

In Rust, you can choose between static dispatch (zero-cost) and dynamic dispatch when using traits. With static dispatch, the compiler knows at compile time which implementation to call, and there is no additional overhead. Dynamic dispatch (dyn Trait) uses a vtable lookup, but it's only used when explicitly chosen.

Static Dispatch:

fn process<T: Area>(shape: T) {
    println!("Area: {}", shape.area());
}

This incurs no dispatch cost at runtime: process is monomorphized for each concrete type T, so the compiler knows exactly which area implementation to call.

Dynamic Dispatch (slight overhead):

fn process(shape: &dyn Area) {
    println!("Area: {}", shape.area());
}

This adds an indirect call through the trait object's vtable, which the compiler usually cannot inline, but the cost is paid only when dynamic dispatch is explicitly requested with dyn.
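
The two forms can be put side by side in one self-contained sketch. The function names area_static, area_dynamic, and total_area are hypothetical and chosen for this example. It also shows what dynamic dispatch buys in exchange for the lookup: a single collection can hold different shape types. The final line prints whether &dyn Area occupies two machine words (data pointer plus vtable pointer), which is how trait object references are laid out in practice.

```rust
use std::mem::size_of;

trait Area {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Area for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Area for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Static dispatch: one copy of this function per concrete T.
fn area_static<T: Area>(shape: &T) -> f64 {
    shape.area()
}

// Dynamic dispatch: one copy in total; the call goes through the vtable.
fn area_dynamic(shape: &dyn Area) -> f64 {
    shape.area()
}

// Only possible with trait objects: a slice mixing different shape types.
fn total_area(shapes: &[Box<dyn Area>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let square = Square { side: 2.0 };
    println!("{} {}", area_static(&square), area_dynamic(&square));

    let shapes: Vec<Box<dyn Area>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    println!("total: {}", total_area(&shapes));

    println!(
        "&dyn Area is two words: {}",
        size_of::<&dyn Area>() == 2 * size_of::<usize>()
    );
}
```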