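Spring AI Alibaba Vector Database Integration: Configuring Milvus and Elasticsearch

Overview: The vector database is the infrastructure layer of a RAG system, and both the choice of store and its tuning directly affect retrieval quality and system performance. This article walks through production-grade integration with Milvus and Elasticsearch, covering index type selection, performance tuning parameters, metadata-filtered queries, and backup and restore strategies.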

1. Vector Store Selection Guide

Before picking a VectorStore, it helps to be clear about where each mainstream option fits:

+---------------+----------------+-------------+----------+-------+-----------------------------------+
| Vector store  | Specialization | Performance | Ops cost | Scale | Recommended scenario              |
+---------------+----------------+-------------+----------+-------+-----------------------------------+
| Milvus        | ★★★★★          | ★★★★★       | ★★★      | 100M+ | Large-scale vector search         |
| Elasticsearch | ★★★★           | ★★★★        | ★★★      | 10M+  | Hybrid search (ES already in use) |
| Redis Vector  | ★★★            | ★★★★        | ★★       | 1M+   | Rapid prototyping / medium scale  |
| PGVector      | ★★★            | ★★★         | ★★       | 1M+   | PostgreSQL already in use         |
| Chroma        | ★★★            | ★★★         | ★        | Small | Local development / prototyping   |
+---------------+----------------+-------------+----------+-------+-----------------------------------+

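Conclusions:

New project with a large dataset: choose Milvus;
Team already runs an Elasticsearch cluster: use ES dense vector fields directly;
Quick PoC validation: Redis Vector is the least effort;
No appetite for operating extra middleware: use PGVector inside the existing PostgreSQL database.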

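2. Milvus Integration

2.1 Milvus Dependencies

<!-- Spring AI Milvus Vector Store: Spring Boot starter, pre-1.0 milestone naming
     (renamed to spring-ai-starter-vector-store-milvus in Spring AI 1.0) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-milvus-store-spring-boot-starter</artifactId>
</dependency>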

<!-- Milvus Java SDK -->
<dependency>
    <groupId>io.milvus</groupId>
    <artifactId>milvus-sdk-java</artifactId>
    <version>2.3.7</version>
</dependency>

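2.2 Milvus YAML Configuration

spring:
  ai:
    vectorstore:
      milvus:
        client:
          # Milvus server address (connection settings live under "client")
          host: localhost
          port: 19530
        # Database name (Milvus 2.3+ supports multiple databases)
        database-name: ai_knowledge
        # Collection name (comparable to a table in a relational database)
        collection-name: document_vectors
        # Vector dimension (must match the embedding model's output dimension)
        embedding-dimension: 1024
        # Index type (explained in detail below)
        index-type: IVF_FLAT
        # Distance metric: IP (inner product) suits normalized vectors, L2 suits unnormalized ones
        metric-type: IP
        # Create the collection automatically at startup
        initialize-schema: true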

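2.3 Milvus Index Types in Depth

The index type is the core of Milvus performance tuning; choosing a different index can change query performance by an order of magnitude: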

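+------------+-------------------------------------+-----------+-----------+--------------------------------------------+
| Index type | How it works                        | Accuracy  | Speed     | Best for                                   |
+------------+-------------------------------------+-----------+-----------+--------------------------------------------+
| FLAT       | Brute-force (exact) search          | 100%      | Slowest   | Under 1M vectors, exact results required   |
| IVF_FLAT   | Inverted file + exact distances     | High      | Medium    | General purpose, millions of vectors       |
| IVF_SQ8    | Inverted file + scalar quantization | Medium    | Fast      | Memory-constrained, small accuracy loss OK |
| HNSW       | Graph index                         | Very high | Very fast | High QPS, hundreds of millions of vectors  |
| DISKANN    | On-disk index                       | High      | Fast      | Very large datasets that exceed memory     |
+------------+-------------------------------------+-----------+-----------+--------------------------------------------+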

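Parameter tuning (using IVF_FLAT as the example):

import io.milvus.client.MilvusServiceClient;
import io.milvus.param.IndexType;
import io.milvus.param.MetricType;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.MilvusVectorStore;
import org.springframework.ai.vectorstore.MilvusVectorStore.MilvusVectorStoreConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MilvusConfig {

    @Bean
    public MilvusVectorStore vectorStore(
            MilvusServiceClient milvusClient,
            EmbeddingModel embeddingModel) {

        MilvusVectorStoreConfig config = MilvusVectorStoreConfig.builder()
                .withCollectionName("document_vectors")
                .withDatabaseName("ai_knowledge")
                .withEmbeddingDimension(1024)
                // IVF_FLAT index settings
                .withIndexType(IndexType.IVF_FLAT)
                .withMetricType(MetricType.IP)
                // nlist: number of cluster centroids
                // Rule of thumb: sqrt(row count) to 4 * sqrt(row count)
                // 1 million rows -> nlist = 1000 to 4000
                // Index parameters are passed as a JSON string
                .withIndexParameters("{\"nlist\":1024}")
                .build();

        // Constructor arguments differ between Spring AI milestones; check the version in use
        return new MilvusVectorStore(milvusClient, embeddingModel, config);
    }
}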
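The nlist rule of thumb in the comments above is easy to turn into a small helper. The following is a minimal sketch, not part of Spring AI or the Milvus SDK: the class name `NlistHeuristic` is hypothetical, and the upper clamp of 65536 is an assumption based on the nlist range Milvus documents for IVF indexes.

```java
/** Rule-of-thumb nlist range for IVF indexes: sqrt(n) to 4 * sqrt(n). */
final class NlistHeuristic {

    // Assumed upper limit for nlist in Milvus IVF indexes
    private static final long MAX_NLIST = 65_536L;

    /** Lower bound: roughly sqrt(rowCount), never below 1. */
    static int lower(long rowCount) {
        long estimate = Math.round(Math.sqrt((double) rowCount));
        return (int) Math.min(MAX_NLIST, Math.max(1L, estimate));
    }

    /** Upper bound: roughly 4 * sqrt(rowCount), clamped to MAX_NLIST. */
    static int upper(long rowCount) {
        return (int) Math.min(MAX_NLIST, 4L * lower(rowCount));
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        System.out.println("nlist range: " + lower(rows) + " to " + upper(rows));
    }
}
```

Treat the result as a starting range only; the right value still depends on measured recall and build time for the actual dataset.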

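Search-time tuning:

// nprobe: number of clusters scanned per query.
// A larger nprobe raises recall but slows the query down.
// Starting point: nprobe ≈ 0.1 * nlist, then tune against measured recall and latency.
// nprobe is a Milvus search parameter and is not part of SearchRequest;
// withFilterExpression only restricts results by metadata.
SearchRequest searchRequest = SearchRequest.defaults()
        .withTopK(10)
        .withSimilarityThreshold(0.7)
        .withFilterExpression("department == 'tech'");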
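To make the nprobe guidance concrete, here is a minimal sketch that derives a starting nprobe from nlist and reports what share of the clusters each query will scan. The class and method names are hypothetical; the 10% ratio is the article's heuristic, not a Milvus default.

```java
/** Starting-point nprobe derived from nlist (about 10% of the clusters). */
final class NprobeHeuristic {

    /** Returns round(0.1 * nlist), kept within the valid range [1, nlist]. */
    static int forNlist(int nlist) {
        if (nlist < 1) {
            throw new IllegalArgumentException("nlist must be at least 1");
        }
        long estimate = Math.round(nlist * 0.1);
        return (int) Math.min(nlist, Math.max(1L, estimate));
    }

    /** Fraction of clusters scanned per query: a rough proxy for the work per search. */
    static double scannedFraction(int nprobe, int nlist) {
        return (double) nprobe / nlist;
    }

    public static void main(String[] args) {
        int nlist = 1024;
        int nprobe = forNlist(nlist);
        System.out.println("nprobe=" + nprobe
                + ", scanned fraction=" + scannedFraction(nprobe, nlist));
    }
}
```

When calling the Milvus SDK directly, the value would typically be supplied as a JSON search parameter such as `{"nprobe": 102}`.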

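2.4 Milvus Partitioning Strategy

For large datasets, partitioning by a business dimension narrows the search scope:

import io.milvus.client.MilvusServiceClient;
import io.milvus.param.dml.InsertParam;
import io.milvus.param.partition.CreatePartitionParam;
import java.util.List;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;

/**
 * Vector storage service partitioned by department.
 * All departments share one collection; each department's data goes into its own partition.
 */
@Service
@RequiredArgsConstructor
public class PartitionedVectorService {

    private final MilvusServiceClient milvusClient;

    /**
     * Creates a partition for a department.
     */
    public void createPartition(String collectionName, String department) {
        CreatePartitionParam param = CreatePartitionParam.newBuilder()
                .withCollectionName(collectionName)
                .withPartitionName("dept_" + department)
                .build();
        milvusClient.createPartition(param);
    }

    /**
     * Inserts one vector into the given department's partition.
     */
    public void insertToPartition(String department, List<Float> vector,
                                   String docId, String content) {
        // Field names must match the collection schema. Spring AI's default Milvus schema
        // uses doc_id, content, metadata and embedding, so a collection created by Spring AI
        // also needs a value for the metadata field.
        InsertParam param = InsertParam.newBuilder()
                .withCollectionName("document_vectors")
                .withPartitionName("dept_" + department)
                .withFields(List.of(
                        new InsertParam.Field("embedding", List.of(vector)),
                        new InsertParam.Field("doc_id", List.of(docId)),
                        new InsertParam.Field("content", List.of(content))
                ))
                .build();
        milvusClient.insert(param);
    }
}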
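Building the partition name as `"dept_" + department` only works if the department value is already a valid identifier. The sketch below shows one way to sanitize it first. It assumes the Milvus naming rule that partition names may contain only letters, digits, and underscores and must be shorter than 255 characters; the class name `PartitionNames` is hypothetical.

```java
import java.util.Locale;

/** Derives a Milvus-safe partition name from a free-form department label. */
final class PartitionNames {

    // Assumed Milvus limit on identifier length
    private static final int MAX_LENGTH = 255;

    static String forDepartment(String department) {
        if (department == null || department.isBlank()) {
            throw new IllegalArgumentException("department must not be blank");
        }
        // Lower-case the label and replace every character outside [a-z0-9_] with '_'
        String cleaned = department.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9_]", "_");
        String name = "dept_" + cleaned;
        if (name.length() >= MAX_LENGTH) {
            throw new IllegalArgumentException("partition name too long: " + name.length());
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(forDepartment("R&D Team"));
    }
}
```

Labels written entirely in non-ASCII characters collapse to underscores under this scheme, so a stable department code is a better input than a display name.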

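3. Elasticsearch Vector Integration

3.1 ES Vector Dependencies

<!-- Spring AI Elasticsearch Vector Store: Spring Boot starter, pre-1.0 milestone naming
     (renamed to spring-ai-starter-vector-store-elasticsearch in Spring AI 1.0) -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-elasticsearch-store-spring-boot-starter</artifactId>
</dependency>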

<!-- Elasticsearch Java Client -->
<dependency>
    <groupId>co.elastic.clients</groupId>
    <artifactId>elasticsearch-java</artifactId>
    <version>8.13.0</version>
</dependency>

3.2 ES Vector YAML Configuration

spring:
  ai:
    vectorstore:
      elasticsearch:
        # ES index name
        index-name: ai-knowledge-vectors
        # Vector dimension
        dimensions: 1024
        # Similarity function: cosine / dot_product (inner product) / l2_norm (Euclidean distance)
        similarity: cosine
        # Create the index automatically at startup
        initialize-schema: true

  elasticsearch:
    uris: http://localhost:9200
    username: ${ES_USERNAME:elastic}
    password: ${ES_PASSWORD:}

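3.3 ES Index Mapping Design

With initialize-schema enabled, Spring AI maps the embedding field as a dense_vector and leaves the remaining fields to dynamic mapping. Knowing this structure helps with further optimization. The explicit mapping below is a tuned variant that you create yourself before the application starts; it pins the analyzer for content and the types of the metadata fields that are filtered most often: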

{
  "mappings": {
    "properties": {
      "id": { "type": "keyword" },
      "content": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "embedding": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "cosine"
      },
      "metadata": {
        "type": "object",
        "dynamic": true,
        "properties": {
          "source":    { "type": "keyword" },
          "category":  { "type": "keyword" },
          "ingestTime":{ "type": "long" }
        }
      }
    }
  }
}

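For Chinese content, installing the IK analyzer is strongly recommended because it improves full-text search quality. The plugin version must match the Elasticsearch version:

# Install the IK analyzer plugin
./bin/elasticsearch-plugin install \
  https://github.com/infinilabs/analysis-ik/releases/download/v8.13.0/elasticsearch-analysis-ik-8.13.0.zip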

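3.4 kNN Search vs. script_score

ES offers two ways to run a vector search. The standalone _knn_search endpoint has been deprecated since ES 8.4; the same approximate kNN search is now requested through the knn option of the _search API.

knn search (recommended for pure vector search)
+----------------------------------------------------------+
| Pros:                                                    |
|   Optimized for kNN; uses the HNSW index automatically   |
|   Concise API; supports filtering during the kNN search  |
|   Faster than script_score                               |
| Cons:                                                    |
|   No complex custom scoring                              |
+----------------------------------------------------------+

script_score (for scenarios that need blended scoring)
+----------------------------------------------------------+
| Pros:                                                    |
|   Custom scoring formulas                                |
|   Can blend vector similarity with BM25 keyword scores   |
| Cons:                                                    |
|   Bypasses HNSW; brute-force scan, so slower             |
+----------------------------------------------------------+

Spring AI uses kNN search by default. Hybrid search needs a custom implementation: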

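import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.elasticsearch.core.SearchResponse;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import lombok.RequiredArgsConstructor;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.stereotype.Component;

@Component
@RequiredArgsConstructor // generates the constructor for the two final fields
public class HybridSearchService {

    private final ElasticsearchClient esClient;
    private final EmbeddingModel embeddingModel;

    /**
     * Hybrid retrieval: vector similarity plus keyword full-text search.
     * ES adds the two boosted scores, which helps with exact technical terms.
     */
    @SuppressWarnings({"rawtypes", "unchecked"})
    public List<Map<String, Object>> hybridSearch(
            String query, int topK, String categoryFilter) throws IOException {

        // 1. Embed the query text
        float[] queryVector = embeddingModel.embed(query);

        // 2. Build the hybrid search request (kNN + full text)
        SearchResponse<Map> response = esClient.search(s -> s
                        .index("ai-knowledge-vectors")
                        // kNN vector retrieval
                        .knn(k -> k
                                .field("embedding")
                                .queryVector(toFloatList(queryVector))
                                .k(topK)
                                .numCandidates(topK * 5) // candidates per shard; affects recall
                                .filter(f -> f
                                        .term(t -> t
                                                .field("metadata.category")
                                                .value(categoryFilter)))
                        )
                        // Full-text retrieval (keyword match)
                        .query(q -> q
                                .match(m -> m
                                        .field("content")
                                        .query(query)
                                        .boost(0.5f) // weight of the full-text score
                                ))
                        .size(topK),
                Map.class);

        return response.hits().hits().stream()
                .map(hit -> {
                    // source() can be null when _source is disabled
                    Map<String, Object> result = hit.source() == null
                            ? new HashMap<>() : new HashMap<>(hit.source());
                    result.put("_score", hit.score());
                    return result;
                })
                .toList();
    }

    // elasticsearch-java 8.13 expects the query vector as List<Float>
    private List<Float> toFloatList(float[] floats) {
        List<Float> list = new ArrayList<>(floats.length);
        for (float f : floats) list.add(f);
        return list;
    }
}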
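When a search request carries both a `knn` section and a `query`, Elasticsearch returns the union of both result sets and scores each hit as the sum of its boosted kNN score and its boosted query score. The following plain-Java sketch reproduces that arithmetic on in-memory maps so the effect of the 0.5 boost is easier to reason about. The class name and sample scores are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Reproduces how boosted kNN and full-text scores add up per document. */
final class HybridScoreSketch {

    static Map<String, Double> combine(Map<String, Double> knnScores, double knnBoost,
                                       Map<String, Double> textScores, double textBoost) {
        Map<String, Double> combined = new HashMap<>();
        // A document found by only one of the two searches keeps just that part of the score
        knnScores.forEach((id, score) -> combined.merge(id, knnBoost * score, Double::sum));
        textScores.forEach((id, score) -> combined.merge(id, textBoost * score, Double::sum));
        return combined;
    }

    public static void main(String[] args) {
        Map<String, Double> knn = Map.of("doc-a", 0.9, "doc-b", 0.8);
        Map<String, Double> text = Map.of("doc-b", 2.0, "doc-c", 4.0);
        System.out.println(combine(knn, 1.0, text, 0.5));
    }
}
```

Because BM25 scores are unbounded while cosine-based kNN scores stay within a fixed range, the boosts usually need tuning against real queries rather than being fixed up front.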

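4. Metadata Filtering: Combining Scalar Fields with Vector Search

Spring AI's SearchRequest.withFilterExpression() accepts SQL-like filter expressions:

// Single condition: search only documents in the tech category
SearchRequest.defaults()
    .withTopK(5)
    .withFilterExpression("category == 'tech'");

// Multiple conditions joined with AND
SearchRequest.defaults()
    .withTopK(5)
    .withFilterExpression("category == 'tech' AND department == 'rd'");

// Time-range filter (documents ingested in the last 30 days)
long thirtyDaysAgo = System.currentTimeMillis() - 30L * 24 * 3600 * 1000;
SearchRequest.defaults()
    .withTopK(5)
    .withFilterExpression("ingestTime >= " + thirtyDaysAgo);

// IN condition (match any of several values)
SearchRequest.defaults()
    .withTopK(5)
    .withFilterExpression("category IN ['tech', 'product', 'support']");

Note: VectorStore implementations differ in how much of the filter expression language they support. Milvus and Elasticsearch support a rich set of filter operations, while Redis Vector support is more limited. Check the documentation of the specific implementation before relying on an operator.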
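Filter strings are often assembled from request parameters, and plain concatenation lets a stray quote change the meaning of the expression. The sketch below builds the same expression strings shown above while rejecting suspicious input. `FilterExpr` is a hypothetical helper, not a Spring AI class, and refusing quotes and backslashes outright (instead of escaping them) is a deliberately conservative assumption.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Builds filter expression strings while rejecting keys and values that could break out of them. */
final class FilterExpr {

    private static final Pattern SAFE_KEY = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");
    // Values must not contain single quotes, double quotes, or backslashes
    private static final Pattern SAFE_VALUE = Pattern.compile("[^'\"\\\\]*");

    static String eq(String key, String value) {
        return checkKey(key) + " == " + quote(value);
    }

    static String in(String key, List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        return checkKey(key) + " IN " + values.stream()
                .map(FilterExpr::quote)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    /** Joins two simple comparisons; nested boolean logic is out of scope for this sketch. */
    static String and(String left, String right) {
        return left + " AND " + right;
    }

    private static String checkKey(String key) {
        if (key == null || !SAFE_KEY.matcher(key).matches()) {
            throw new IllegalArgumentException("Unsafe metadata key: " + key);
        }
        return key;
    }

    private static String quote(String value) {
        if (value == null || !SAFE_VALUE.matcher(value).matches()) {
            throw new IllegalArgumentException("Unsafe filter value: " + value);
        }
        return "'" + value + "'";
    }

    public static void main(String[] args) {
        System.out.println(and(eq("category", "tech"), eq("department", "rd")));
        System.out.println(in("category", List.of("tech", "product", "support")));
    }
}
```

Spring AI also ships a programmatic `FilterExpressionBuilder`, which avoids string assembly altogether and may be the better fit when it is available in the version in use.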

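5. Data Synchronization and Disaster Recovery

5.1 Incremental Sync

Document change event
    |
    | (triggered via CDC or a message queue)
    v
VectorSyncService
    |
    +-- Document added   → embed → insert into the VectorStore
    |
    +-- Document updated → delete old vectors → embed new content → insert
    |
    +-- Document deleted → delete its vectors (matched by metadata filter)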

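import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;
import org.springframework.util.DigestUtils;

@Service
@RequiredArgsConstructor
@Slf4j
public class VectorSyncService {

    private final VectorStore vectorStore;
    private final EmbeddingModel embeddingModel;

    /**
     * Incremental sync based on a content hash.
     * Documents whose content has not changed are not ingested again.
     */
    public void syncDocument(String docId, String content,
                              Map<String, Object> metadata) {
        String contentHash = DigestUtils.md5DigestAsHex(
                content.getBytes(StandardCharsets.UTF_8));

        // Look up the stored version through its doc_id metadata field.
        // docId is concatenated into the filter, so it must come from a trusted source.
        List<Document> existing = vectorStore.similaritySearch(
                SearchRequest.query(content)
                        .withTopK(1)
                        .withFilterExpression("doc_id == '" + docId + "'")
        );

        if (!existing.isEmpty()) {
            String existingHash = (String) existing.get(0)
                    .getMetadata().getOrDefault("contentHash", "");
            if (contentHash.equals(existingHash)) {
                log.debug("[docId={}] content unchanged, skipping sync", docId);
                return;
            }
            // Content changed: delete the old version.
            // delete() takes the store's own document IDs, not the doc_id metadata value.
            log.info("[docId={}] content change detected, updating vectors", docId);
            vectorStore.delete(existing.stream().map(Document::getId).toList());
        }

        // Ingest the new version; copy the map so the caller's instance is not modified
        Map<String, Object> enriched = new HashMap<>(metadata);
        enriched.put("doc_id", docId);
        enriched.put("contentHash", contentHash);
        enriched.put("syncTime", System.currentTimeMillis());

        Document document = new Document(content, enriched);
        vectorStore.add(List.of(document));
        log.info("[docId={}] vector sync complete", docId);
    }
}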
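The decision logic in the service (insert, update, or skip) can be exercised without a vector store. The sketch below isolates it, with two deliberate substitutions that are assumptions of this example: an in-memory map stands in for the metadata lookup, and SHA-256 replaces MD5 as the content hash. The class name `ChangeDetector` is hypothetical, and `HexFormat` requires Java 17 or later.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HexFormat;
import java.util.Map;

/** Decides whether a document needs to be inserted, updated, or skipped. */
final class ChangeDetector {

    enum Action { INSERT, UPDATE, SKIP }

    // Stand-in for the contentHash stored in vector store metadata
    private final Map<String, String> hashByDocId = new HashMap<>();

    Action decide(String docId, String content) {
        String hash = sha256Hex(content);
        String previous = hashByDocId.put(docId, hash);
        if (previous == null) {
            return Action.INSERT;            // first time this docId is seen
        }
        return previous.equals(hash) ? Action.SKIP : Action.UPDATE;
    }

    static String sha256Hex(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(bytes);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ChangeDetector detector = new ChangeDetector();
        System.out.println(detector.decide("doc-1", "hello"));
        System.out.println(detector.decide("doc-1", "hello"));
        System.out.println(detector.decide("doc-1", "hello world"));
    }
}
```

Keeping the hash in a side table like this, rather than reading it back through a similarity search, also removes one embedding call per sync.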

5.2 Milvus Backup (milvus-backup)

# Milvus backups are taken with the separate milvus-backup tool (zilliztech/milvus-backup).
# Connection settings for Milvus and its object storage go in configs/backup.yaml.

# Create a backup of one collection
./milvus-backup create -n backup_20250320 -c document_vectors

# List existing backups
./milvus-backup list

# Restore the backup; restored collections get the given suffix
# (document_vectors is restored as document_vectors_restore)
./milvus-backup restore -n backup_20250320 -s _restore

5.3 ES Index Backup (Snapshot API)

# 1. Register a snapshot repository (shared file system).
#    The location must be listed under path.repo in elasticsearch.yml on every node.
curl -X PUT "localhost:9200/_snapshot/my_backup" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "fs",
    "settings": {
      "location": "/mount/backups/es-snapshots"
    }
  }'

# 2. Create a snapshot
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_20250320?wait_for_completion=true" \
  -H "Content-Type: application/json" \
  -d '{
    "indices": "ai-knowledge-vectors",
    "ignore_unavailable": true
  }'

# 3. Restore the snapshot under a new index name
curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_20250320/_restore" \
  -H "Content-Type: application/json" \
  -d '{
    "indices": "ai-knowledge-vectors",
    "rename_pattern": "(.+)",
    "rename_replacement": "$1_restore"
  }'
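A scheduled backup job needs a predictable snapshot name and a rule for pruning old snapshots. Here is a minimal sketch of both, following the `snapshot_yyyyMMdd` naming used in the commands above. The class name and the retention policy are assumptions for illustration; the actual create and delete calls against the snapshot API are left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

/** Date-based snapshot names plus a simple age-based retention check. */
final class SnapshotNames {

    private static final String PREFIX = "snapshot_";
    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, for example 20250320
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE;

    static String nameFor(LocalDate day) {
        return PREFIX + day.format(DAY);
    }

    /** True when the snapshot is more than retentionDays old relative to today. */
    static boolean isExpired(String snapshotName, LocalDate today, int retentionDays) {
        LocalDate taken = LocalDate.parse(snapshotName.substring(PREFIX.length()), DAY);
        return ChronoUnit.DAYS.between(taken, today) > retentionDays;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        System.out.println("Next snapshot: " + nameFor(today));
        System.out.println("Week-old snapshot expired? "
                + isExpired(nameFor(today.minusDays(8)), today, 7));
    }
}
```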

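6. Performance Tuning Reference

6.1 Milvus Tuning Parameters

# Milvus memory settings (set in milvus.yaml; key names and defaults vary by Milvus version)
queryNode:
  # Multiplier used to estimate memory usage when loading segments
  loadMemoryUsageFactor: 3.0
  # Memory usage threshold in percent; above it the query node stops loading more segments
  overloadedMemoryThresholdPercentage: 90

indexNode:
  scheduler:
    # Number of index build tasks that run in parallel
    buildParallel: 4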

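6.2 ES Vector Index Tuning

// Adjust the HNSW index parameters (takes effect only after the index is rebuilt)
PUT /ai-knowledge-vectors
{
  "mappings": {
    "properties": {
      "embedding": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "cosine",
        "index_options": {
          "type": "hnsw",
          "m": 16,             // connections per node; higher means better recall and more memory
          "ef_construction": 100  // candidate list size at build time; higher means better recall and slower indexing
        }
      }
    }
  }
}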

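7. Choosing the Vector Dimension

+-------------------+------------+------------------------+-------------+-----------------------------------+
| Embedding model   | Dimensions | Storage per 1M vectors | Query speed | Recommended scenario              |
+-------------------+------------+------------------------+-------------+-----------------------------------+
| text-embedding-v3 | 512        | ~2 GB                  | Very fast   | Latency-sensitive workloads       |
| text-embedding-v3 | 1024       | ~4 GB                  | Fast        | General-purpose RAG (recommended) |
| text-embedding-v3 | 2048       | ~8 GB                  | Medium      | High-accuracy requirements        |
| OpenAI v3         | 1536       | ~6 GB                  | Fast        | OpenAI ecosystem                  |
| OpenAI v3         | 3072       | ~12 GB                 | Slower      | OpenAI, high accuracy             |
+-------------------+------------+------------------------+-------------+-----------------------------------+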
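The storage column follows from simple arithmetic: each float32 component takes 4 bytes, so raw vector storage is vector count × dimensions × 4 bytes. The sketch below reproduces the table's figures; it counts raw vectors only, and index structures, metadata, and replicas add to the total. The class name is hypothetical.

```java
/** Raw float32 storage estimate: vectorCount * dimensions * 4 bytes. */
final class VectorStorageEstimate {

    static long rawBytes(long vectorCount, int dimensions) {
        // multiplyExact guards against silent overflow for very large collections
        long components = Math.multiplyExact(vectorCount, (long) dimensions);
        return Math.multiplyExact(components, (long) Float.BYTES);
    }

    static double rawGigabytes(long vectorCount, int dimensions) {
        return rawBytes(vectorCount, dimensions) / 1e9;
    }

    public static void main(String[] args) {
        for (int dims : new int[] {512, 1024, 1536, 3072}) {
            System.out.printf("%d dims, 1M vectors: %.2f GB%n",
                    dims, rawGigabytes(1_000_000L, dims));
        }
    }
}
```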

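Normalization: cosine similarity is unaffected by vector length, so normalization matters mainly for the inner-product metrics (IP in Milvus, dot_product in ES). With unit-length vectors the inner product equals the cosine similarity, and ES requires unit-length vectors for dot_product:

private float[] normalizeVector(float[] vector) {
    double norm = 0;
    for (float v : vector) norm += (double) v * v;
    norm = Math.sqrt(norm);
    if (norm == 0) {
        // A zero vector has no direction and cannot be normalized
        throw new IllegalArgumentException("Cannot normalize a zero vector");
    }
    float[] normalized = new float[vector.length];
    for (int i = 0; i < vector.length; i++) {
        normalized[i] = (float) (vector[i] / norm);
    }
    return normalized;
}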
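The relationship between the metrics is easy to check numerically. The following sketch compares cosine similarity, the raw inner product, and the inner product of the normalized vectors for one pair of vectors. The class name and sample vectors are illustrative only.

```java
/** Shows that the inner product of unit-length vectors equals cosine similarity. */
final class VectorMath {

    static double dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += (double) a[i] * b[i];
        return sum;
    }

    static float[] normalize(float[] v) {
        double norm = Math.sqrt(dot(v, v));
        if (norm == 0) {
            throw new IllegalArgumentException("Cannot normalize a zero vector");
        }
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) out[i] = (float) (v[i] / norm);
        return out;
    }

    static double cosine(float[] a, float[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    public static void main(String[] args) {
        float[] a = {3f, 4f};
        float[] b = {4f, 3f};
        System.out.println("cosine         = " + cosine(a, b));
        System.out.println("raw IP         = " + dot(a, b));
        System.out.println("unit-length IP = " + dot(normalize(a), normalize(b)));
    }
}
```

The raw inner product depends on vector length, while the normalized one matches the cosine value up to float rounding, which is why IP is a safe metric only for normalized embeddings.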

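8. Summary

Choosing and tuning the vector database is a key step in taking a RAG system to production:

Milvus: a purpose-built vector database and the first choice for large datasets; HNSW is the usual pick for high-QPS workloads;
Elasticsearch: a strong platform for hybrid retrieval (vector + keyword) and a natural extension for teams already running ES;
Metadata filtering: combining scalar-field filters with vector search narrows the search scope and improves both precision and speed;
Backups: a backup mechanism is a production requirement; schedule it automatically, for example every day in the early morning;
Vector dimension: 1024 is the recommended dimension for Tongyi text-embedding-v3 and balances accuracy against performance.

The next article focuses on intelligent document processing: an end-to-end implementation of PDF parsing, Markdown ingestion, content cleaning, and smart chunking strategies.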
