RAG Question Answering with Spring Boot + Milvus + LangChain4j: From Vector Ingestion to DeepSeek Generation

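This article walks through the com.haiwei.javaai.demo1 package and lays out a complete retrieval-augmented generation (RAG) flow. At startup the application creates the Milvus database and collection, loads a document, splits it into chunks, embeds them, and stores the vectors. When a user asks a question, it retrieves the most similar chunks, assembles a prompt, and calls the DeepSeek LLM to generate the answer.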


1. Overall Architecture

[Figure: architecture flowchart with two stages]
Application startup stage: Spring Boot startup → MilvusCollectionCreator → create/recreate the demo1 database → create the my_collection_1 collection → MilvusService.loadFile → read Smartshell.txt → DocumentSplitter chunking → EmbeddingModelUtil embedding → write to Milvus
Q&A request stage: GET /ai/rag/call → embed the user message → RetrievalService vector search (TopK) → buildRagPrompt assembles the prompt → DeepSeekChatModel generates the answer → return generation

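Core component responsibilities

Component | Responsibility
MilvusConfig | Registers the MilvusClientV2 bean
MilvusClientService | Milvus connection settings (URI, token)
MilvusConstant | Constants for the database name and collection name
MilvusCollectionCreator | Creates the database and collection at startup and triggers document ingestion
MilvusService | Loads the document, splits it, embeds the chunks, and inserts them into Milvus
EmbeddingModelUtil | Local ONNX embedding model (384 dimensions)
RetrievalService | Searches Milvus for the top 3 matches of a query vector
ChatRAGController | RAG Q&A entry point: retrieval + prompt + LLM
SearchResultWithScore | Search result DTO (id, score, message)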

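Data flow overview

  1. Offline/startup ingestion: Smartshell.txt → text chunking → 384-dimensional AllMiniLmL6V2 vectors → Milvus vector field, with the original text in the message field.
  2. Online Q&A: user question → embedded with the same model → Milvus L2 nearest-neighbor search → the 3 most similar chunks → inserted into the prompt → DeepSeek generates the answer.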
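
To make the retrieval step concrete, the following is a minimal brute-force sketch of an L2 top-K search in plain Java. It is not how Milvus executes the query (the article's collection uses an IVF_FLAT index); the class name L2TopKSketch and both methods are hypothetical and exist only for illustration. The point is the ordering rule: smaller distance means more similar.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

/** Brute-force L2 nearest-neighbor search; a hypothetical stand-in for what a vector index approximates. */
class L2TopKSketch {

    /** Euclidean (L2) distance between two vectors of equal length. */
    static double l2(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the indices of the k stored vectors closest to the query, nearest first. */
    static List<Integer> topK(List<float[]> stored, float[] query, int k) {
        return IntStream.range(0, stored.size())
                .boxed()
                .sorted(Comparator.comparingDouble(i -> l2(stored.get(i), query)))
                .limit(k)
                .toList();
    }
}
```

With 384-dimensional vectors the logic is unchanged; the dimension check is what fails if ingestion and querying ever use different embedding models.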

2. Maven Dependencies (pom.xml)

The demo1 module relies on the following core libraries (excerpt):

<properties>
    <java.version>17</java.version>
</properties>

<dependencies>
    <!-- LangChain4j: document processing and local embedding model -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j</artifactId>
        <version>1.11.8</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-document-parser-apache-tika</artifactId>
        <version>1.11.8-beta19</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-embeddings</artifactId>
        <version>1.11.8-beta19</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-embeddings-all-minilm-l6-v2-q</artifactId>
        <version>1.11.8-beta19</version>
    </dependency>

    <!-- Milvus Java SDK V2 -->
    <dependency>
        <groupId>io.milvus</groupId>
        <artifactId>milvus-sdk-java</artifactId>
        <version>2.6.18</version>
    </dependency>

    <!-- Spring AI DeepSeek -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-deepseek</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>2.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

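Dependency notes:

Dependency | Purpose
langchain4j-embeddings-all-minilm-l6-v2-q | Local quantized embedding model that outputs 384-dimensional vectors, matching dimension(384) on the Milvus collection
milvus-sdk-java | Milvus V2 API: database and collection creation, inserts, vector search
spring-ai-starter-model-deepseek | Calls the DeepSeek Chat API through Spring AI
gson | Builds the JSON rows used for Milvus inserts (not listed in the excerpt above)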

3. Configuration and Environment Variables (application.properties)

server.port=${SSHELL_PORT:8080}
server.servlet.context-path=/javaai

# DeepSeek (Spring AI)
spring.ai.deepseek.api-key=${DEEPSEEK_API_KEY:}
spring.ai.deepseek.base-url=${DEEPSEEK_API_BASE_URL:https://api.deepseek.com}

deepseek.api-base-url=${DEEPSEEK_API_BASE_URL:https://api.deepseek.com}
deepseek.api-key=${DEEPSEEK_API_KEY:}
deepseek.model=deepseek-chat

Environment variables / startup parameters

Variable | Default | Description
SSHELL_PORT | 8080 | HTTP port
DEEPSEEK_API_KEY | (none) | Required; the DeepSeek API key
DEEPSEEK_API_BASE_URL | https://api.deepseek.com | DeepSeek API base URL
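
The ${NAME:default} placeholders in the properties file resolve to the variable's value when it is defined and to the default otherwise. The sketch below only mimics that rule in plain Java; it is not Spring's implementation, and PlaceholderDefaultSketch, resolve, and requireNonBlank are hypothetical names. It also shows why ${DEEPSEEK_API_KEY:} deserves attention: the default is an empty string, so the placeholder itself never reports a missing key.

```java
import java.util.Map;

/** Hypothetical illustration of the ${NAME:default} resolution rule. */
class PlaceholderDefaultSketch {

    /** Uses the variable when it is defined, otherwise the default. */
    static String resolve(Map<String, String> env, String name, String defaultValue) {
        String value = env.get(name);
        return value != null ? value : defaultValue;
    }

    /** Fails fast when a required setting resolved to null or a blank string. */
    static String requireNonBlank(String name, String value) {
        if (value == null || value.isBlank()) {
            throw new IllegalStateException(name + " must be set");
        }
        return value;
    }
}
```

A call such as resolve(System.getenv(), "SSHELL_PORT", "8080") would yield 8080 on a machine where the variable is not set.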

Milvus connection (hard-coded)

The connection is currently configured in MilvusClientService rather than in application.properties:

private static final String CLUSTER_ENDPOINT = "http://localhost:19530";
private static final String TOKEN = "root:Milvus";

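Before starting the application, make sure a local Milvus instance is running and that its address and credentials match the values above.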

Full Q&A URL example

GET http://localhost:8080/javaai/ai/rag/call?message=SmartShell和传统堡垒机有什么区别?
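
The URL above carries raw Chinese characters in the query string. Browsers usually percent-encode them automatically, but a programmatic client may have to do it itself. A minimal sketch using the JDK's URLEncoder follows; the class RagUrlSketch and the method buildCallUrl are hypothetical names. Note that URLEncoder produces form encoding, so a space becomes "+".

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the RAG endpoint URL with an encoded message parameter. */
class RagUrlSketch {

    /** Appends the endpoint path and the UTF-8 form-encoded message to the base URL. */
    static String buildCallUrl(String baseUrl, String message) {
        return baseUrl + "/ai/rag/call?message="
                + URLEncoder.encode(message, StandardCharsets.UTF_8);
    }
}
```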

4. Milvus Connection and Bean Configuration

MilvusConfig registers the singleton client in the Spring container:

@Configuration
public class MilvusConfig {

    @Bean
    public MilvusClientV2 milvusClient() {
        return MilvusClientService.getClient();
    }
}

MilvusClientService uses a lazily initialized singleton:

public class MilvusClientService {
    private static final String CLUSTER_ENDPOINT = "http://localhost:19530";
    private static final String TOKEN = "root:Milvus";
    private static MilvusClientV2 instance;

    public static MilvusClientV2 getClient() {
        if (instance == null) {
            ConnectConfig connectConfig = ConnectConfig.builder()
                    .uri(CLUSTER_ENDPOINT)
                    .token(TOKEN)
                    .build();
            instance = new MilvusClientV2(connectConfig);
        }
        return instance;
    }
}
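
The null check in getClient() is not synchronized. That is harmless while the only caller is the single @Bean method, but two threads calling it at the same time could each build a client. One common alternative is the initialization-on-demand holder idiom, sketched below with a stand-in FakeClient class (hypothetical; it replaces MilvusClientV2 so the example runs without the SDK).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a thread-safe lazy singleton using the holder idiom. */
class LazyHolderSketch {

    /** Stand-in for an expensive client such as MilvusClientV2; counts constructions. */
    static final class FakeClient {
        static final AtomicInteger CREATED = new AtomicInteger();

        FakeClient() {
            CREATED.incrementAndGet();
        }
    }

    private LazyHolderSketch() {
    }

    /** The JVM initializes Holder at most once, on first access, so no explicit locking is needed. */
    private static final class Holder {
        static final FakeClient INSTANCE = new FakeClient();
    }

    static FakeClient getClient() {
        return Holder.INSTANCE;
    }
}
```

Since Spring already treats the @Bean result as a singleton, another option is to drop the static accessor and build the client inside the @Bean method.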

Constant definitions:

public class MilvusConstant {
    public static final String DATA_BASE = "demo1";
    public static final String MY_COLLECTION_1 = "my_collection_1";
}

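5. Creating the Milvus Database and Collection at Startup, and Ingesting Data

MilvusCollectionCreator implements InitializingBean, so afterPropertiesSet() runs once all of its bean properties have been injected.

5.1 Startup logic (note: this wipes the demo1 database)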

[Figure: startup flowchart]
Check whether the demo1 database exists → if it exists: delete all collections → dropDatabase demo1 → createDatabase demo1; if it does not exist: createDatabase demo1 → useDatabase demo1 → create my_collection_1 → milvusService.loadFile

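Important behavior: if the demo1 database already exists, the creator first deletes every collection in it and drops the whole database, then recreates it. Every startup therefore re-imports the document. That is fine for a demo, but it has to be reworked for production.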

5.2 Collection Schema

Field | Type | Description
id | Int64 | Primary key, autoID=true
vector | FloatVector | 384 dimensions, matching the embedding model
message | VarChar(2000) | Original text of the chunk
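
A quick size check relates the message field to the 500-character chunk limit used later in the article. The sketch assumes that VarChar max length is counted in bytes, which should be confirmed against the Milvus documentation for the version in use. Under that assumption, each Java char needs at most 3 UTF-8 bytes, so a 500-char chunk needs at most 1500 bytes and fits within 2000. VarCharLimitSketch and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical guard that checks a chunk against the message field limit before insert. */
class VarCharLimitSketch {

    /** Mirrors maxLength(2000) from the schema; assumed here to be a byte limit. */
    static final int MESSAGE_MAX_LENGTH = 2000;

    /** Number of bytes the text occupies when encoded as UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True when the text fits into the message field under the byte-limit assumption. */
    static boolean fitsMessageField(String text) {
        return utf8Length(text) <= MESSAGE_MAX_LENGTH;
    }
}
```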

5.3 Index configuration

  • vector: IVF_FLAT, metric L2, nlist=128
  • message: AUTOINDEX (scalar field index)

5.4 Key code: creating the collection

fieldSchemaList1.add(CreateCollectionReq.FieldSchema.builder()
        .name("id")
        .dataType(DataType.Int64)
        .isPrimaryKey(true)
        .autoID(true)
        .build());
fieldSchemaList1.add(CreateCollectionReq.FieldSchema.builder()
        .name("vector")
        .dataType(DataType.FloatVector)
        .dimension(384)
        .build());
fieldSchemaList1.add(CreateCollectionReq.FieldSchema.builder()
        .name("message")
        .dataType(DataType.VarChar)
        .maxLength(2000)
        .build());

indexParams1.add(IndexParam.builder()
        .indexName("vector_index")
        .fieldName("vector")
        .indexType(IndexParam.IndexType.IVF_FLAT)
        .metricType(IndexParam.MetricType.L2)
        .extraParams(Collections.singletonMap("nlist", 128))
        .build());

milvusClientV2.createCollection(createCollectionReq);
// After the collection has been created successfully
milvusService.loadFile();

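6. Loading, Splitting, Embedding, and Storing the Document

6.1 Embedding model

Embedding runs on a local ONNX model provided by LangChain4j, so no external API is needed:
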
public class EmbeddingModelUtil {
    private static final EmbeddingModel embeddingModel =
            new AllMiniLmL6V2QuantizedEmbeddingModel();

    public static float[] embed(String text) {
        Response<Embedding> response = embeddingModel.embed(text);
        return response.content().vector();
    }
}

6.2 Document processing flow (MilvusService.loadFile)

[Figure: document processing flowchart]
ClassPathResource data/Smartshell.txt → Document.from (full text) → recursive splitter 500/50 → iterate over TextSegment → EmbeddingModelUtil.embed → insertData vector + message → Milvus insert

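Chunking parameters:

  • MAX_SEGMENT_SIZE = 500: maximum number of characters per segment
  • MAX_OVERLAP_SIZE = 50: overlap between adjacent segments, so that meaning is not cut off at a boundary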
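
To show what the two parameters mean, here is a deliberately simplified fixed-window splitter in plain Java. It is not the LangChain4j recursive splitter, which tries to break on paragraph and sentence boundaries first; OverlapChunkerSketch and split are hypothetical names used only to illustrate size and overlap.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical fixed-window splitter that illustrates segment size and overlap. */
class OverlapChunkerSketch {

    /** Cuts text into windows of at most maxSize chars; consecutive windows share overlap chars. */
    static List<String> split(String text, int maxSize, int overlap) {
        if (maxSize <= 0 || overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("need 0 <= overlap < maxSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

With 500/50 the window advances 450 characters at a time, so a 1200-character text yields three segments under this simplified scheme.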

Source document: src/main/resources/data/Smartshell.txt (product description of the SmartShell intelligent operations platform)

6.3 Key code: loading and ingestion

public void loadFile() {
    ClassPathResource resource = new ClassPathResource("data/Smartshell.txt");
    String text = resource.getContentAsString(StandardCharsets.UTF_8);
    Document document = Document.from(text);

    DocumentSplitter splitter = DocumentSplitters.recursive(MAX_SEGMENT_SIZE, MAX_OVERLAP_SIZE);
    List<TextSegment> segments = splitter.split(document);

    for (TextSegment segment : segments) {
        String segmentText = segment.text().trim();
        if (segmentText.isEmpty()) continue;

        float[] vector = EmbeddingModelUtil.embed(segmentText);
        insertData(vector, segmentText);
    }
}

public void insertData(float[] vector, String message) {
    JsonObject row = new JsonObject();
    row.add("vector", JsonUtil.gson.toJsonTree(vector));
    row.addProperty("message", message);

    milvusClientV2.insert(InsertReq.builder()
            .collectionName(MilvusConstant.MY_COLLECTION_1)
            .data(List.of(row))
            .build());
}
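
The loop above sends one insert request per segment. Because InsertReq's data(...) takes a list of rows, the rows could instead be collected and sent in groups. The sketch below shows only the grouping step in plain Java; BatchSketch and partition are hypothetical names, and the right batch size depends on row size and server limits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that groups rows into fixed-size batches. */
class BatchSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(List.copyOf(items.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch would then be passed to a single insert call in place of List.of(row).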

Using the same embedding model for ingestion and retrieval is a prerequisite for RAG to work well.


7. Q&A Pipeline: ChatRAGController

7.1 Request handling flow

[Figure: sequence diagram; participants: User, ChatRAGController, EmbeddingModelUtil, RetrievalService, Milvus, DeepSeekChatModel]
Messages in order: GET /ai/rag/call?message=... → embed(message) → queryVector (384) → retrieveDocuments(queryVector) → search TopK=3, L2 → id, message, score → List of SearchResultWithScore → buildRagPrompt + sort by score → Prompt(promptMessage) → ChatResponse → { generation: "..." }

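7.2 Vector search (RetrievalService)

  • topK = 3: return at most 3 relevant chunks
  • Metric type L2: the same as the vector_index created with the collection
  • A smaller L2 distance means higher similarity (the controller sorts by score in ascending order)
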
SearchReq searchReq = SearchReq.builder()
        .collectionName(MilvusConstant.MY_COLLECTION_1)
        .data(Collections.singletonList(new FloatVec(queryVector)))
        .topK(topK)
        .metricType(IndexParam.MetricType.L2)
        .outputFields(List.of("id", "message"))
        .build();

SearchResp searchResp = milvusClient.search(searchReq);

7.3 Prompt assembly

The retrieved chunks are numbered and presented as "reference articles", followed by the user's original question:

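private String buildRagPrompt(List<SearchResultWithScore> results, String question) {
    StringBuilder prompt = new StringBuilder();
    // Header; the literal means "Reference articles:"
    prompt.append("参考文章:\n\n");
    // Number each retrieved chunk as [1], [2], ...
    for (int i = 0; i < results.size(); i++) {
        prompt.append("[").append(i + 1).append("] ")
                .append(results.get(i).getMessage())
                .append("\n\n");
    }
    // Instruction; the literal means "Answer the following question based on the information above:"
    prompt.append("请根据如上信息回答如下问题:\n")
            .append(question);
    return prompt.toString();
}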
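
One possible refinement is to cap how much reference text goes into the prompt. The sketch below is a standalone variant under stated assumptions: it takes plain strings instead of SearchResultWithScore, uses English labels in place of the original Chinese literals, and the maxReferenceChars budget and the class name PromptBudgetSketch are hypothetical. Chunks are expected in ascending score order, so the least similar ones are dropped first.

```java
import java.util.List;

/** Hypothetical prompt builder that stops adding chunks once a character budget is used up. */
class PromptBudgetSketch {

    /** Builds the prompt from whole chunks only, never exceeding maxReferenceChars of chunk text. */
    static String build(List<String> chunks, String question, int maxReferenceChars) {
        StringBuilder prompt = new StringBuilder("Reference articles:\n\n");
        int used = 0;
        int index = 1;
        for (String chunk : chunks) {
            if (used + chunk.length() > maxReferenceChars) {
                break; // keep whole chunks only; later chunks are dropped
            }
            prompt.append('[').append(index++).append("] ").append(chunk).append("\n\n");
            used += chunk.length();
        }
        return prompt
                .append("Answer the following question based on the information above:\n")
                .append(question)
                .toString();
    }
}
```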

7.4 Calling DeepSeek to generate the answer

@GetMapping("/call")
public Map<String, String> call(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
    float[] vectors = EmbeddingModelUtil.embed(message);

    List<SearchResultWithScore> searchResultWithScores = retrievalService.retrieveDocuments(vectors);
    String promptMessage = message;
    if (!searchResultWithScores.isEmpty()) {
        searchResultWithScores = searchResultWithScores.stream()
                .sorted(Comparator.comparingDouble(SearchResultWithScore::getScore))
                .toList();
        promptMessage = buildRagPrompt(searchResultWithScores, message);
    }

    DeepSeekChatOptions options = DeepSeekChatOptions.builder()
            .model(DeepSeekApi.ChatModel.DEEPSEEK_V4_PRO.getValue())
            .temperature(0.8)
            .build();
    Prompt prompt = new Prompt(promptMessage, options);
    ChatResponse response = chatModel.call(prompt);
    return Map.of("generation", response.getResult().getOutput().getText());
}

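Note: the whole RAG prompt is sent to the model as a single user message, without a system prompt or a multi-turn message structure. That is enough for a demo; in production the prompt could be split into a SystemMessage and a UserMessage.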


8. Startup and Verification

8.1 Prerequisites

  1. Milvus running at localhost:19530 (Docker or a local installation)
  2. The DEEPSEEK_API_KEY environment variable set
  3. Java 17 + Maven

8.2 Starting the application

# Windows PowerShell example
$env:DEEPSEEK_API_KEY="your-api-key"
mvn spring-boot:run

Or:

java -jar target/javaai-0.0.1-SNAPSHOT.jar

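The startup log should contain the following messages (the Chinese text is logged verbatim):

  • start create milvus collections
  • 文件切分为 N 个文本片段 ("file split into N text segments")
  • 成功插入 1 条数据 ("inserted 1 row successfully"; printed once per segment)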

8.3 Calling the Q&A endpoint

curl "http://localhost:8080/javaai/ai/rag/call?message=SmartShell和传统堡垒机有什么区别?"

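Expected response (exact wording will vary between runs; the sample answer is in Chinese and explains that SmartShell uses AI to assess the actual risk of commands such as rm -rf or DELETE FROM instead of blocking them unconditionally like a traditional bastion host):
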
{
    "generation": "SmartShell 是一款**智能化的运维与数据库管理工具**。它通过 SSH 连接主机和数据库,但不同于传统堡垒机的死板命令拦截,SmartShell 使用 AI 实时分析命令的实际风险。例如,面对类似 `rm -rf` 或 `DELETE FROM` 这类高危指令,它会结合当前目录内容或表中是否有数据等真实场景进行判断:如果目标是空目录、日志文件或空表,会放行并给予提示;如果表中有数据,则会先备份再执行,而不是像传统堡垒机那样无条件禁止。"
}

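9. Design Notes and Possible Improvements

Topic | Current implementation | Suggestion
Startup reset | Drops demo1 on every startup | In production, switch to incremental updates or versioned collections
Milvus configuration | Hard-coded URI/token | Move to application.properties or environment variables
Ingestion efficiency | One insert per segment | Batch inserts for better performance
Retrieval failure | Throws RuntimeException when there are no results | Degrade to answering directly without context
Prompt | Simple string concatenation | Add a system role, citation markers, and length truncation
Document source | A single classpath file only | Extend to PDF/Word parsing with Tika, object storage, and so on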
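
The "Retrieval failure" row can be sketched in a few lines. The controller already falls back to the bare question when the result list is empty, so it may be enough to turn a retrieval exception into an empty list. RetrievalFallbackSketch and retrieveOrEmpty are hypothetical names; the Supplier stands in for a call such as retrievalService.retrieveDocuments(vectors).

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical wrapper that degrades a failed retrieval to "no context". */
class RetrievalFallbackSketch {

    /** Runs the retrieval and returns an empty list when it throws or yields null. */
    static <T> List<T> retrieveOrEmpty(Supplier<List<T>> retrieval) {
        try {
            List<T> results = retrieval.get();
            return results != null ? results : List.of();
        } catch (RuntimeException e) {
            // A real service would log the failure here before degrading.
            return List.of();
        }
    }
}
```

Swallowing every RuntimeException is a blunt choice; a narrower exception type would keep genuine connectivity problems visible.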

10. Class and File Index

src/main/java/com/haiwei/javaai/demo1/
├── MilvusConstant.java          # database and collection names
├── MilvusConfig.java            # MilvusClientV2 bean
├── MilvusCollectionCreator.java # creates database/collection at startup, triggers loadFile
├── MilvusService.java           # document chunking, embedding, writing
├── RetrievalService.java        # Milvus vector search
├── ChatRAGController.java       # RAG Q&A API
└── SearchResultWithScore.java   # search result model

src/main/java/com/haiwei/javaai/langchain4j/
└── EmbeddingModelUtil.java      # 384-dimensional local embedding

src/main/java/com/haiwei/javaai/service/impl/
└── MilvusClientService.java     # Milvus connection singleton

src/main/resources/
├── application.properties
└── data/Smartshell.txt          # RAG knowledge base document

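Summary

demo1 implements a clear end-to-end RAG loop. In the startup stage it uses the Milvus V2 SDK to create the database and collection and to ingest the vectors. In the request stage it embeds the user question with the same model, runs an L2 search for the top 3 matches, assembles a "reference articles + question" prompt, and calls DeepSeek through Spring AI to generate the answer. Keeping the embedding layer and the vector store consistent in dimension (384) and metric (L2) is the key constraint that makes the whole pipeline work.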