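Exploring Lance, a New-Generation Vector Storage Format (Part 18): Vector Quantization

Chapter 18: Vector Quantization

🎯 Overview

Quantization is the key technique for compressing vectors and speeding up search. In exchange for a small loss of accuracy, it cuts storage by anywhere from 4x to several hundred times, depending on the method, and can speed up distance computation by roughly 10-100x.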

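📊 Three Main Quantization Methods

Product Quantization (PQ)

Original: 768-dim float32 = 3,072 bytes (3 KB)
    ↓
Split into 8 sub-vectors of 96 dimensions each
    ↓
Learn 256 codewords per sub-vector (via k-means)
    ↓
Encode each vector in 8 bytes (1 byte per sub-vector)

Result: 8 bytes per vector, a 99.7% reduction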

Scalar Quantization (SQ)

Original: float32
    ↓
Map linearly to the uint8 range [0, 255]
    ↓
Each float value → 1 byte

Result: 768 × 1 byte = 768 bytes per vector, a 75% reduction

Binary Quantization (BQ)

Original: float32 (768 dimensions)
    ↓
Convert to bits: v_i > threshold ? 1 : 0
    ↓
768 bits stored in 96 bytes

Result: 96 bytes per vector, a 96.9% reduction

📊 Accuracy vs. Speed Trade-offs

Method	Accuracy	Speed	Size per vector
No quantization	100%	1x	3 KB
SQ	~98%	~10x	768 B
BQ	~95%	~100x	96 B
PQ	~99%	~50x	8 B

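💡 Recommendations

Small datasets: no quantization, for the highest accuracy
Medium datasets: PQ (the most balanced option)
Large datasets: PQ or SQ
Latency-critical workloads: BQ (the fastest)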

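🔍 Design Rationale

Why quantize?

High-dimensional vectors hit hard limits in both storage and computation:

Storage bottleneck
- A 768-dim float32 vector takes 3 KB, so 1 million vectors need about 3 GB of memory
- GPU memory is limited (80 GB on a high-end card, for example), so large datasets cannot be loaded uncompressed: 1 billion such vectors would need about 3 TB
Compute bottleneck
- A distance computation such as L2 costs O(d), where d is the dimension
- For a 768-dim vector, each distance takes 768 multiply-add operations
- Scanning 1 million vectors therefore takes 768 million operations per query
The quantization trade-off
- Give up some accuracy in exchange for space and speed
- The accuracy loss is typically under 5%, while performance improves by 10-100x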
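The arithmetic behind these bottlenecks is easy to check. The sketch below is illustrative only: `bytes_per_vector` is a hypothetical helper, not a Lance API, and it counts the code bytes alone, ignoring codebooks and index metadata.

```python
import math


def bytes_per_vector(dim, scheme, num_sub_vectors=8, num_bits=8):
    """Storage per vector, in bytes, for the schemes discussed in this chapter."""
    if scheme == "float32":
        return dim * 4
    if scheme == "sq":   # one num_bits-wide code per dimension
        return math.ceil(dim * num_bits / 8)
    if scheme == "bq":   # one bit per dimension
        return math.ceil(dim / 8)
    if scheme == "pq":   # one num_bits-wide code per sub-vector
        return math.ceil(num_sub_vectors * num_bits / 8)
    raise ValueError(f"unknown scheme: {scheme}")


dim, n = 768, 1_000_000
raw = bytes_per_vector(dim, "float32")
for scheme in ("float32", "sq", "bq", "pq"):
    size = bytes_per_vector(dim, scheme)
    saved = 1 - size / raw
    print(f"{scheme:8s} {size:5d} B/vector {size * n / 1e6:8.1f} MB per 1M  saved {saved:.1%}")

# Brute-force L2 over n vectors: one multiply-add per dimension per vector.
print(f"multiply-adds per query: {dim * n:,}")
```

Changing `dim`, `num_sub_vectors`, or `num_bits` shows how quickly the PQ footprint shrinks relative to the raw vectors.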

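How the Three Methods Work

Product Quantization (PQ)

Principle: split the vector, then quantize each piece independently

Original vector [v1, v2, ..., v768]
  ↓ split into M sub-vectors of D = 768 / M dimensions each
[v1...vD], [vD+1...v2D], ..., [v(M-1)D+1...v768]
  ↓ cluster each sub-space independently to learn 2^K codewords
  ↓ encode each sub-vector as a K-bit codeword index (typically K=8, i.e. 256 codewords)
  ↓ the full vector is encoded in M*K bits

Example: 768 dimensions split into 8 sub-vectors of 96 dimensions, 8-bit codes
Total size = 8 bytes (vs. 3 KB originally), a 99.7% reduction

Key advantages:

Highest compression ratio (99%+)
Distance computation is accelerated by table lookup
Small accuracy loss (about 99% retained in the comparison above)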
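The training and encoding steps can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not Lance's implementation: the helper names (`kmeans`, `train_pq`, `encode`, `decode`) are hypothetical, the k-means is a basic Lloyd's loop, and the sizes (16 dimensions, 4 sub-vectors, 4-bit codes) are chosen only to keep it fast.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            buckets[nearest].append(p)
        for j, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster went empty
                centroids[j] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


def split(vec, m):
    """Cut a vector into m equal-length sub-vectors."""
    sub_dim = len(vec) // m
    return [vec[i * sub_dim:(i + 1) * sub_dim] for i in range(m)]


def train_pq(vectors, m, num_bits):
    """Learn one codebook of 2**num_bits codewords per sub-space."""
    k = 1 << num_bits
    return [kmeans([split(v, m)[i] for v in vectors], k, seed=i) for i in range(m)]


def encode(vec, codebooks):
    """Replace each sub-vector by the index of its nearest codeword."""
    return [
        min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
        for sub, cb in zip(split(vec, len(codebooks)), codebooks)
    ]


def decode(codes, codebooks):
    """Approximate reconstruction: concatenate the selected codewords."""
    return [x for code, cb in zip(codes, codebooks) for x in cb[code]]


rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(500)]
codebooks = train_pq(data, m=4, num_bits=4)  # 4 sub-vectors x 16 codewords
codes = encode(data[0], codebooks)
print(codes)                                        # one small integer per sub-vector
print(sq_dist(data[0], decode(codes, codebooks)))   # squared reconstruction error
```

Only `codes` needs to be stored per vector; the codebooks are shared by the whole dataset, which is where the compression comes from.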

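Speeding up distance computation:

PQ distance = Σ distance_table[i][code[i]]
              i=0..M-1

Distance table, built once per query:
  distance_table[i][j] = ||centroid[i][j] - query_subvec[i]||^2

Table lookup: each encoded vector costs O(M) lookups instead of the O(d) multiply-adds of an exact computation, a d/M-fold reduction
(96x fewer operations for d=768, M=8). Building the table costs 2^K * d multiply-adds per query, which is amortized over all the vectors scanned.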
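A tiny worked example makes the lookup concrete. The codebooks, codes, and function names below are made up for illustration (4-dim vectors, M = 2, four codewords per codebook); the point is only that the per-vector work is M lookups once the table exists.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_distance_table(query, codebooks):
    """table[i][j] = squared L2 distance from query sub-vector i to codeword j of codebook i."""
    m = len(codebooks)
    sub_dim = len(query) // m
    return [
        [sq_dist(query[i * sub_dim:(i + 1) * sub_dim], codeword) for codeword in codebooks[i]]
        for i in range(m)
    ]


def adc_distance(codes, table):
    """Approximate distance to one encoded vector: M lookups and M - 1 additions."""
    return sum(table[i][code] for i, code in enumerate(codes))


# Toy setup: 4-dim vectors, M = 2 sub-vectors, 4 codewords per codebook.
codebooks = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
]
encoded = [[0, 0], [3, 1], [1, 2]]  # three database vectors, stored as codes only
query = [1.0, 1.0, 2.0, 0.0]

table = build_distance_table(query, codebooks)  # built once per query
dists = [adc_distance(codes, table) for codes in encoded]
print(dists)  # [6.0, 0.0, 9.0]
```

The second vector decodes to exactly the query, so its distance is 0; the others equal the exact squared distance to their reconstructed vectors.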

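Scalar Quantization (SQ)

Principle: linearly rescale the floating-point range to integers

Floating-point value range [min_val, max_val]
  ↓ scale linearly to [0, 2^K-1]
  ↓ encode each value in K bits (typically K=8, range [0, 255])
  ↓ the full vector is encoded in d*K bits

Example: 768-dim float32, 8-bit codes
Total size = 768 bytes, a 75% reduction

Scaling formula:
quantized = round((value - min) / (max - min) * 255)

Key advantages:

Simplest to implement
Predictable accuracy loss (the scaling is linear, so the rounding error per value is at most half a quantization step)
Supports several bit depths (4, 8, 16, and so on)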
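The scaling formula and its error bound can be checked with a short round trip. This is a minimal sketch with hypothetical helper names (`sq_encode`, `sq_decode`), assuming the bounds are taken from the data itself.

```python
def sq_encode(values, lo, hi, num_bits=8):
    """Linearly map floats in [lo, hi] to integer codes in [0, 2**num_bits - 1]."""
    levels = (1 << num_bits) - 1
    if hi == lo:
        return [0] * len(values)
    codes = []
    for v in values:
        code = round((v - lo) / (hi - lo) * levels)
        codes.append(min(max(code, 0), levels))  # clamp out-of-range inputs
    return codes


def sq_decode(codes, lo, hi, num_bits=8):
    """Map codes back to the centre of their quantization bucket."""
    levels = (1 << num_bits) - 1
    step = (hi - lo) / levels
    return [lo + c * step for c in codes]


vec = [-1.0, -0.25, 0.0, 0.5, 1.0]
lo, hi = min(vec), max(vec)
codes = sq_encode(vec, lo, hi)
restored = sq_decode(codes, lo, hi)

step = (hi - lo) / 255
max_err = max(abs(a - b) for a, b in zip(vec, restored))
print(codes)
print(max_err <= step / 2 + 1e-12)  # rounding error is bounded by half a step
```

A wider value range means a larger step and therefore a larger worst-case error, which is why outliers in the data hurt SQ accuracy.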

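Binary Quantization (BQ)

Principle: keep only the sign of each value

Vector value v_i
  ↓ sign test: v_i > 0 ? 1 : 0
  ↓ pack 8 values into 1 byte (1 bit per value)
  ↓ the full vector is encoded in d/8 bytes

Example: 768-dim float32
Total size = 96 bytes, a 96.9% reduction

Bit packing:
byte = Σ (v_i > 0) << i  (i=0..7)

Key advantages:

Very high compression (about 97%, or 32x)
Distances can be computed as Hamming distances (XOR plus popcount), which is extremely fast
Well suited to real-time queries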
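Bit packing and the Hamming distance can be sketched as follows. The function names are hypothetical; the bit order (bit i of each byte holds dimension 8*b + i) follows the packing formula above.

```python
def bq_encode(vec):
    """Pack one sign bit per dimension, 8 dimensions per byte."""
    out = bytearray((len(vec) + 7) // 8)
    for i, v in enumerate(vec):
        if v > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def hamming(a, b):
    """Number of differing bits: XOR the codes, then count the set bits."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


a = bq_encode([0.3, -1.2, 0.8, 0.1, -0.5, -0.9, 2.0, -0.1, 0.7])
b = bq_encode([0.2, 0.4, -0.8, 0.3, -0.1, -0.2, 1.5, 0.6, -0.7])
print(a.hex(), b.hex(), hamming(a, b))  # 4d01 cb00 4
```

The two vectors disagree in sign at dimensions 1, 2, 7, and 8, which is exactly the Hamming distance of 4. On real hardware the XOR and popcount run over whole machine words, which is where the speed comes from.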

💻 Code Examples

Rust Implementation - Product Quantization

Core struct definition

pub struct ProductQuantizer {
    pub num_sub_vectors: usize,      // number of sub-vectors M, e.g. 8
    pub num_bits: u32,               // bits per code, e.g. 8
    pub dimension: usize,            // vector dimension D, e.g. 768
    pub codebook: FixedSizeListArray, // codebook, laid out as [M, 2^num_bits, D/M]
    pub distance_type: DistanceType,  // distance type: L2 or Dot
}

Construction and initialization

impl ProductQuantizer {
    // Create a quantizer from an existing codebook
    pub fn new(
        num_sub_vectors: usize,
        num_bits: u32,
        dimension: usize,
        codebook: FixedSizeListArray,
        distance_type: DistanceType,
    ) -> Self {
        Self {
            num_bits,
            num_sub_vectors,
            dimension,
            codebook,
            distance_type,
        }
    }

    // Build from data: learn the codebook
    pub fn build(
        data: &dyn Array,
        distance_type: DistanceType,
        params: &PQBuildParams,
    ) -> Result<Self> {
        // 1. Validate the input data format
        let fsl = data.as_fixed_size_list_opt()
            .ok_or(Error::Index { ... })?;

        // 2. If a codebook was supplied, use it directly
        if let Some(codebook) = params.codebook.as_ref() {
            return Ok(Self::new(
                params.num_sub_vectors,
                params.num_bits as u32,
                fsl.value_length() as usize,
                FixedSizeListArray::try_new_from_values(
                    codebook.clone(),
                    fsl.value_length()
                )?,
                distance_type,
            ));
        }

        // 3. Otherwise learn the codebook from the data
        params.build(data, distance_type)
    }
}

Vector quantization (encoding)

pub fn quantize(&self, vectors: &dyn Array) -> Result<ArrayRef> {
    let fsl = vectors.as_fixed_size_list_opt()
        .ok_or(Error::Index { ... })?;

    // Dispatch on the floating-point element type
    match fsl.value_type() {
        DataType::Float16 => self.transform::<Float16Type>(vectors),
        DataType::Float32 => self.transform::<Float32Type>(vectors),
        DataType::Float64 => self.transform::<Float64Type>(vectors),
        _ => Err(Error::Index { ... }),
    }
}

fn transform<T: ArrowPrimitiveType>(
    &self,
    vectors: &dyn Array,
) -> Result<ArrayRef>
where
    T::Native: Float + L2 + Dot,
{
    let fsl = vectors.as_fixed_size_list_opt()
        .ok_or(Error::Index { ... })?;

    let sub_dim = self.dimension / self.num_sub_vectors;
    let num_centroids = 1usize << self.num_bits;

    // Flat views of the codebook (centroids) and of the input vectors.
    // The codebook is assumed to be laid out as [M, 2^num_bits, D/M].
    let codebook = self.codebook.values().as_primitive::<T>().values();
    let values = fsl.values().as_primitive::<T>().values();

    // Encode every vector
    let mut pq_codes: Vec<u8> = Vec::with_capacity(fsl.len() * self.num_sub_vectors);
    for vec in values.chunks_exact(self.dimension) {
        // Find the nearest codeword for each sub-vector
        for (i, sub_vec) in vec.chunks_exact(sub_dim).enumerate() {
            // Only the centroids that belong to sub-vector i are candidates
            let start = i * num_centroids * sub_dim;
            let centroids = &codebook[start..start + num_centroids * sub_dim];

            let mut min_dist = f32::MAX;
            let mut best_code = 0u8;
            for (code, centroid) in centroids.chunks_exact(sub_dim).enumerate() {
                let dist = sub_vec.iter()
                    .zip(centroid.iter())
                    .map(|(v, c)| (v.to_f32().unwrap() - c.to_f32().unwrap()).powi(2))
                    .sum::<f32>();
                if dist < min_dist {
                    min_dist = dist;
                    best_code = code as u8;
                }
            }
            pq_codes.push(best_code);
        }
    }

    // Return the codes as a flat UInt8Array (num_sub_vectors bytes per vector)
    Ok(Arc::new(UInt8Array::from(pq_codes)))
}

Accelerated distance queries

pub fn compute_distances(
    &self,
    query: &dyn Array,
    code: &UInt8Array,
) -> Result<Float32Array> {
    match self.distance_type {
        DistanceType::L2 => self.l2_distances(query, code),
        DistanceType::Dot => self.dot_distances(query, code),
        _ => Err(Error::Index { ... }),
    }
}

fn l2_distances(
    &self,
    query: &dyn Array,
    code: &UInt8Array,
) -> Result<Float32Array> {
    // Build the distance table once for this query
    let distance_table = self.build_l2_distance_table(query)?;
    // distance_table[i][j] = ||centroid[i][j] - query_sub[i]||^2

    // Sum the table entries selected by each vector's codes
    let distances: Vec<f32> = code.values()
        .chunks(self.num_sub_vectors)
        .map(|codes| {
            codes.iter()
                .enumerate()
                .map(|(i, &code_idx)| distance_table[i * (1usize << self.num_bits) + code_idx as usize])
                .sum()
        })
        .collect();

    Ok(Float32Array::from(distances))
}

fn build_l2_distance_table(
    &self,
    query: &dyn Array,
) -> Result<Vec<f32>> {
    match query.data_type() {
        DataType::Float16 => self.build_l2_distance_table_impl::<Float16Type>(
            query.as_primitive()
        ),
        DataType::Float32 => self.build_l2_distance_table_impl::<Float32Type>(
            query.as_primitive()
        ),
        DataType::Float64 => self.build_l2_distance_table_impl::<Float64Type>(
            query.as_primitive()
        ),
        _ => Err(Error::Index { ... }),
    }
}

Rust Implementation - Scalar Quantization

Core structs

pub struct ScalarQuantizer {
    metadata: ScalarQuantizationMetadata,
}

pub struct ScalarQuantizationMetadata {
    pub num_bits: u16,          // bits per code (8 or 16)
    pub dim: usize,             // vector dimension
    pub bounds: Range<f64>,     // value range [min, max]
}

Initialization and bounds updates

impl ScalarQuantizer {
    // Create a new SQ quantizer; the bounds start out empty (min = MAX, max = MIN)
    pub fn new(num_bits: u16, dim: usize) -> Self {
        Self {
            metadata: ScalarQuantizationMetadata {
                num_bits,
                dim,
                bounds: Range {
                    start: f64::MAX,
                    end: f64::MIN,
                },
            },
        }
    }

    // Create a quantizer with explicit bounds
    pub fn with_bounds(
        num_bits: u16,
        dim: usize,
        bounds: Range<f64>,
    ) -> Self {
        let mut sq = Self::new(num_bits, dim);
        sq.metadata.bounds = bounds;
        sq
    }

    // Widen the bounds to cover the given vectors
    pub fn update_bounds<T: ArrowFloatType>(
        &mut self,
        vectors: &FixedSizeListArray,
    ) -> Result<Range<f64>> {
        let data = vectors.values()
            .as_primitive::<T>()
            .as_slice();

        // Scan all values for the min/max
        self.metadata.bounds = data.iter()
            .fold(self.metadata.bounds.clone(), |bounds, v| {
                let v_f64 = v.to_f64().unwrap();
                bounds.start.min(v_f64) .. bounds.end.max(v_f64)
            });

        Ok(self.metadata.bounds.clone())
    }
}

Vector quantization (encoding)

pub fn transform<T: ArrowFloatType>(
    &self,
    data: &dyn Array,
) -> Result<ArrayRef> {
    let fsl = data.as_fixed_size_list_opt()
        .ok_or(Error::Index { ... })?
        .clone();

    let float_data = fsl.values()
        .as_primitive::<T>()
        .as_slice();

    // Scale to [0, 255] and convert to u8
    let quantized = scale_to_u8::<T>(float_data, &self.metadata.bounds);

    Ok(Arc::new(FixedSizeListArray::try_new_from_values(
        UInt8Array::from(quantized),
        fsl.value_length(),
    )?))
}

pub fn scale_to_u8<T: ArrowFloatType>(
    values: &[T::Native],
    bounds: &Range<f64>,
) -> Vec<u8> {
    if bounds.start == bounds.end {
        return vec![0; values.len()];
    }

    let range = bounds.end - bounds.start;
    values
        .iter()
        .map(|&v| {
            let v_f64 = v.to_f64().unwrap();
            // Linear scaling: (value - min) / (max - min) * 255
            let scaled = ((v_f64 - bounds.start) * 255.0 / range).round();
            scaled as u8  // float-to-int `as` casts saturate, so out-of-range values clamp to 0 or 255
        })
        .collect()
}
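One practical consequence of the linear mapping is that distances can be approximated directly on the codes. When two vectors share the same bounds, the offset cancels and the squared L2 distance is the integer sum of squared code differences times the squared step. The Python sketch below illustrates this idea with hypothetical names; it is one common approach and is not claimed to be how Lance computes SQ distances.

```python
def scale_to_u8(values, lo, hi):
    """Same linear mapping as the Rust scale_to_u8 above, clamped to [0, 255]."""
    if lo == hi:
        return [0] * len(values)
    return [min(max(round((v - lo) * 255.0 / (hi - lo)), 0), 255) for v in values]


def sq_l2(codes_a, codes_b, lo, hi):
    """Approximate squared L2 from codes: integer differences rescaled by step**2."""
    step = (hi - lo) / 255.0
    int_sum = sum((a - b) ** 2 for a, b in zip(codes_a, codes_b))  # integer arithmetic only
    return int_sum * step * step


lo, hi = -1.0, 1.0  # shared bounds for both vectors
x = [0.10, -0.40, 0.90, 0.00]
y = [0.30, -0.50, 0.20, -0.80]

exact = sum((a - b) ** 2 for a, b in zip(x, y))
approx = sq_l2(scale_to_u8(x, lo, hi), scale_to_u8(y, lo, hi), lo, hi)
print(exact, approx)
```

The inner sum touches only small integers, so it maps well onto SIMD integer instructions; the single floating-point rescale happens once per vector pair.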

Rust Implementation - Binary Quantization

How it works

pub struct BinaryQuantization {}

impl BinaryQuantization {
    pub fn transform(&self, data: &dyn Array) -> Result<ArrayRef> {
        let fsl = data.as_fixed_size_list_opt()
            .ok_or(Error::Index { ... })?
            .clone();

        let float_data = fsl.values()
            .as_primitive::<Float32Type>()
            .values();

        let dim = fsl.value_length() as usize;

        // Binary-quantize each vector
        let codes: Vec<u8> = float_data
            .chunks_exact(dim)
            .flat_map(binary_quantization)
            .collect();

        Ok(Arc::new(UInt8Array::from(codes)))
    }
}

// Binary quantization: pack 8 values into 1 byte
fn binary_quantization<T: Float>(data: &[T]) -> impl Iterator<Item = u8> + '_ {
    // Full groups of exactly 8 values
    let iter = data.chunks_exact(8);
    let exact_chunks: Vec<u8> = iter.clone()
        .map(|chunk| {
            // Simple loop that the compiler can auto-vectorize
            let mut bits: u8 = 0;
            chunk.iter().enumerate().for_each(|(idx, v)| {
                // Set bit idx when the sign bit is clear (v >= 0, including +0.0)
                bits |= (v.is_sign_positive() as u8) << idx;
            });
            bits
        })
        .collect();

    // Leftover values (fewer than 8), if any. Emitting a byte only when the
    // remainder is non-empty keeps the output at ceil(dim / 8) bytes.
    let remainder = iter.remainder();
    let remainder_byte = (!remainder.is_empty()).then(|| {
        let mut bits: u8 = 0;
        remainder.iter().enumerate().for_each(|(idx, v)| {
            bits |= (v.is_sign_positive() as u8) << idx;
        });
        bits
    });

    exact_chunks.into_iter().chain(remainder_byte)
}

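Python Usage Example

Quantization with Lance

import lance
import numpy as np
import pyarrow as pa

# Create sample vector data as an Arrow table with a fixed-size-list column
dim = 768
vectors = np.random.randn(10000, dim).astype(np.float32)
arrow_table = pa.table({
    "id": pa.array(np.arange(10000)),
    "vector": pa.FixedSizeListArray.from_arrays(pa.array(vectors.ravel()), dim),
})

# Write a Lance dataset
table = lance.write_dataset(arrow_table, "./vectors.lance")

query_vec = np.random.randn(dim).astype(np.float32)

def search(k):
    """Top-k nearest-neighbour query; returns a list of row dicts."""
    nearest = {"column": "vector", "q": query_vec, "k": k}
    return table.to_table(nearest=nearest).to_pylist()

# Full-precision search: no index exists yet, so this is an exact scan
# and serves as the ground truth for the recall check below
results_full = search(100)

# ========== Product Quantization ==========
table.create_index(
    "vector",
    index_type="IVF_PQ",
    metric="L2",
    num_partitions=100,     # IVF partitions
    num_sub_vectors=8,      # PQ sub-vectors
    num_bits=8,             # PQ bits per code (8 = 256 codewords)
)

# Queries now use the PQ index automatically
results = search(10)
print(f"PQ search returned {len(results)} results")

# ========== Compare recall ==========
results_pq = search(100)

# Compute Recall@100
full_ids = set(r["id"] for r in results_full)
pq_ids = set(r["id"] for r in results_pq)
recall = len(full_ids & pq_ids) / len(full_ids)
print(f"Recall@100: {recall:.2%}")

# ========== Scalar Quantization ==========
# The index type and num_bits follow the original example; the set of
# supported index types and parameters depends on the Lance version.
table.create_index(
    "vector",
    index_type="IVF_SQ",
    metric="L2",
    num_partitions=100,     # IVF partitions
    num_bits=8,             # SQ bits per code
    replace=True,           # replace the PQ index on the same column
)

# Query
results = search(10)
print(f"SQ search returned {len(results)} results")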

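📊 Performance Comparison and Benchmarks

Storage cost

Method	Size per vector	1 million vectors	Compression ratio
No quantization (float32)	3,072 bytes	3 GB	1.0x
SQ (8-bit)	768 bytes	768 MB	4.0x
PQ (8 sub-vectors, 8-bit)	8 bytes	8 MB	384x
BQ (1-bit)	96 bytes	96 MB	32x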

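Search speed

Method	Distance computation	Time per query (ms)	Relative speed
No quantization	Floating-point arithmetic	100	1.0x
SQ	L2 on 8-bit codes	15	6.7x
PQ	Precomputed distance table	2	50x
BQ	Hamming distance	0.5	200x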

Accuracy (Recall@10)

Method	SIFT1M	GIST1M	Deep1B
No quantization	100%	100%	100%
SQ	96.5%	97.2%	95.8%
PQ	98.2%	98.9%	97.3%
BQ	87.4%	89.1%	85.2%

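🎯 Application Scenarios

Scenario 1: High accuracy at medium scale

Conditions:

Dataset size: < 10 million vectors
Accuracy requirement: > 99%
Latency requirement: < 100 ms

Recommended: PQ

# Example configuration
index = table.create_index(
    column="vector",
    index_type="IVF_PQ",
    num_partitions=100,
    num_sub_vectors=8,
    num_bits=8,
)

Advantages:

About 99% of accuracy retained
100x or more compression
About 50x speedup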

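Scenario 2: Maximum speed for real-time systems

Conditions:

Dataset size: any
Accuracy requirement: > 90%
Latency requirement: < 10 ms

Recommended: BQ (Binary Quantization)

# HNSW + BQ
# Illustrative configuration: the "hnsw" index type and the use_bq flag are
# taken from the original example and may not match your Lance version.
index = table.create_index(
    column="vector",
    index_type="hnsw",
    use_bq=True,  # enable binary quantization
    ef_construction=200,
)

Advantages:

About 200x speedup
32x compression (about 97%)
Very low memory footprint

Typical uses:

Real-time recommendation systems
Online face recognition
Streaming video search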

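Scenario 3: A balanced option

Conditions:

Dataset size: 1 million to 100 million vectors
Accuracy requirement: > 98%
Latency requirement: < 50 ms

Recommended: SQ + IVF

index = table.create_index(
    column="vector",
    index_type="IVF_SQ",
    num_partitions=1000,  # more partitions
    num_bits=8,
)

Advantages:

Simple to implement
About 4x compression at 8 bits
About 10x speedup
98%+ accuracy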

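Scenario 4: Large-scale offline processing

Conditions:

Dataset size: > 100 million vectors
Accuracy requirement: > 99.5%
Latency requirement: none

Recommended: two-level IVF-PQ quantization

# IVF partitioning + PQ compression
index = table.create_index(
    column="vector",
    index_type="IVF_PQ",
    num_partitions=10000,      # very large number of partitions
    num_sub_vectors=8,
    num_bits=8,
    refinement_factor=2,       # re-rank candidates with the original vectors (parameter name as given in the original example)
)

Advantages:

Scales to very large datasets
99%+ accuracy
Very high storage compression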

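📚 Summary

Quantization underpins scalable vector search, and the choice of scheme directly determines a system's accuracy, speed, and cost:

PQ (product quantization): the best balance of accuracy and compression, with about 99% accuracy at 100x or more compression
SQ (scalar quantization): the simplest to implement and quick to deploy, with about 4x compression at 8 bits
BQ (binary quantization): the fastest option, suited to real-time systems, with up to about 200x speedup

Key factors when choosing:

Data scale: PQ is recommended above 10 million vectors; SQ is sufficient below that
Accuracy: use PQ for > 99%, SQ for > 95%, BQ for > 90%
Latency: use BQ for < 10 ms, SQ for < 50 ms, PQ when latency is unconstrained
Storage: use PQ when space is tightest (8 bytes per vector in the 768-dim example), BQ when it is tight (96 bytes), SQ when it is relaxed (768 bytes)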
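The scale, accuracy, and latency rules can be written down as a small lookup. This is a hypothetical helper, not part of Lance: it reports the suggestion for each factor separately, because the factors can disagree and the chapter does not say which one wins.

```python
def recommend(num_vectors=None, min_accuracy=None, max_latency_ms=None):
    """Map each supplied requirement to the method this chapter suggests for it."""
    advice = {}
    if num_vectors is not None:
        # The text leaves exactly 10 million unspecified; this sketch treats it as "small".
        advice["scale"] = "PQ" if num_vectors > 10_000_000 else "SQ"
    if min_accuracy is not None:
        if min_accuracy > 0.99:
            advice["accuracy"] = "PQ"
        elif min_accuracy > 0.95:
            advice["accuracy"] = "SQ"
        else:
            # Requirements of 95% or lower fall through to BQ in this sketch.
            advice["accuracy"] = "BQ"
    if max_latency_ms is not None:
        if max_latency_ms < 10:
            advice["latency"] = "BQ"
        elif max_latency_ms < 50:
            advice["latency"] = "SQ"
        else:
            advice["latency"] = "PQ"
    return advice


print(recommend(num_vectors=50_000_000, min_accuracy=0.97, max_latency_ms=5))
# {'scale': 'PQ', 'accuracy': 'SQ', 'latency': 'BQ'}
```

When the factors disagree, as in this call, the result is a prompt to benchmark the candidates on real data rather than a final answer.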