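RAG Q&A with Spring Boot + Milvus + LangChain4j: From Vector Ingestion to DeepSeek Generation
Based on the com.haiwei.javaai.demo1 package, this article walks through a complete Retrieval-Augmented Generation (RAG) flow. At application startup it automatically creates the Milvus database and collection, loads a document, splits it into chunks, embeds the chunks, and stores them. When a user asks a question, it retrieves similar passages, assembles a Prompt, and calls the DeepSeek large language model to generate the answer.
1. Overall architecture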
Flowchart (two phases):
- Application startup phase: Spring Boot starts → MilvusCollectionCreator → create/recreate the demo1 database → create the my_collection_1 Collection → MilvusService.loadFile → read Smartshell.txt → DocumentSplitter chunking → EmbeddingModelUtil embedding → write to Milvus
- Q&A request phase: GET /ai/rag/call → embed the user message → RetrievalService vector search (TopK) → buildRagPrompt assembles the Prompt → DeepSeekChatModel generates the answer → return generation
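Core component responsibilities
| Class | Responsibility |
|---|---|
| MilvusConfig | Registers the MilvusClientV2 Bean |
| MilvusClientService | Milvus connection settings (URI, Token) |
| MilvusConstant | Constants for the database name and Collection name |
| MilvusCollectionCreator | Creates the database and collection at startup and triggers document ingestion |
| MilvusService | Loads the document, splits it, embeds the chunks, and inserts them into Milvus |
| EmbeddingModelUtil | Local ONNX embedding model (384 dimensions) |
| RetrievalService | Searches Milvus for the top 3 matches to the query vector |
| ChatRAGController | RAG Q&A entry point: retrieval + Prompt + LLM |
| SearchResultWithScore | Search result DTO (id, score, message) |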
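Data flow overview
- Offline / startup ingestion: Smartshell.txt → text chunking → AllMiniLmL6V2 embedding (384-dimensional vectors) → Milvus vector field, with the original text in the message field.
- Online Q&A: user question → embedded with the same model → Milvus L2 nearest-neighbor search → take the 3 most similar passages → insert them into the Prompt → DeepSeek generates the answer.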
2. Maven dependencies (pom.xml)
demo1 depends on the following core libraries (excerpt):
xml
<properties>
    <java.version>17</java.version>
</properties>
<dependencies>
    <!-- LangChain4j: document processing and the local embedding model -->
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j</artifactId>
        <version>1.11.8</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-document-parser-apache-tika</artifactId>
        <version>1.11.8-beta19</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-embeddings</artifactId>
        <version>1.11.8-beta19</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-embeddings-all-minilm-l6-v2-q</artifactId>
        <version>1.11.8-beta19</version>
    </dependency>
    <!-- Milvus Java SDK V2 -->
    <dependency>
        <groupId>io.milvus</groupId>
        <artifactId>milvus-sdk-java</artifactId>
        <version>2.6.18</version>
    </dependency>
    <!-- Spring AI DeepSeek -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-deepseek</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>2.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
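Dependency notes:
| Dependency | Purpose |
|---|---|
| langchain4j-embeddings-all-minilm-l6-v2-q | Local quantized embedding model; outputs 384-dimensional vectors, matching dimension(384) on the Milvus Collection |
| milvus-sdk-java | Milvus V2 API: database and collection creation, insert, vector search |
| spring-ai-starter-model-deepseek | Calls the DeepSeek Chat API through Spring AI |
| gson | Builds the JSON row data for Milvus inserts |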
3. Configuration and environment variables (application.properties)
properties
server.port=${SSHELL_PORT:8080}
server.servlet.context-path=/javaai
# DeepSeek (Spring AI)
spring.ai.deepseek.api-key=${DEEPSEEK_API_KEY:}
spring.ai.deepseek.base-url=${DEEPSEEK_API_BASE_URL:https://api.deepseek.com}
deepseek.api-base-url=${DEEPSEEK_API_BASE_URL:https://api.deepseek.com}
deepseek.api-key=${DEEPSEEK_API_KEY:}
deepseek.model=deepseek-chat
Environment variables / startup parameters
| Variable | Default | Description |
|---|---|---|
| SSHELL_PORT | 8080 | HTTP port |
| DEEPSEEK_API_KEY | (empty) | Required. DeepSeek API key |
| DEEPSEEK_API_BASE_URL | https://api.deepseek.com | DeepSeek API base URL |
Milvus connection (hard-coded)
The connection is currently configured inside MilvusClientService and does not go through application.properties:
java
private static final String CLUSTER_ENDPOINT = "http://localhost:19530";
private static final String TOKEN = "root:Milvus";
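Before starting the application, make sure a local Milvus instance is running and that its address and credentials match the values above.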
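One way to move these values out of the source is to read them from environment variables and fall back to the current defaults. The following is a minimal sketch; the variable names MILVUS_URI and MILVUS_TOKEN and the MilvusSettings class are assumptions for illustration and do not appear in the original project.

```java
import java.util.Map;

// Hypothetical helper: resolves Milvus connection settings from a map of
// environment variables, falling back to the demo's hard-coded defaults.
final class MilvusSettings {
    static final String DEFAULT_URI = "http://localhost:19530";
    static final String DEFAULT_TOKEN = "root:Milvus";

    final String uri;
    final String token;

    private MilvusSettings(String uri, String token) {
        this.uri = uri;
        this.token = token;
    }

    // MILVUS_URI and MILVUS_TOKEN are assumed names, not part of the demo.
    static MilvusSettings fromEnv(Map<String, String> env) {
        return new MilvusSettings(
                valueOrDefault(env.get("MILVUS_URI"), DEFAULT_URI),
                valueOrDefault(env.get("MILVUS_TOKEN"), DEFAULT_TOKEN));
    }

    // Treats missing or blank values as "not set".
    private static String valueOrDefault(String value, String fallback) {
        return (value == null || value.isBlank()) ? fallback : value;
    }
}
```

In the application this could be called as MilvusSettings.fromEnv(System.getenv()), with the resulting uri and token passed to ConnectConfig.builder() in place of the constants.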
Full Q&A URL example
GET http://localhost:8080/javaai/ai/rag/call?message=SmartShell和传统堡垒机有什么区别?
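The example URL carries the Chinese question in the query string unescaped. Browsers and some HTTP clients encode it automatically, but others may reject or mangle it, so percent-encoding the parameter as UTF-8 is the safer habit. A small sketch using only the JDK (the RagUrlBuilder class is hypothetical):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the request URI for the RAG endpoint.
final class RagUrlBuilder {
    // Percent-encodes the question as UTF-8 and appends it as the
    // "message" query parameter.
    static URI build(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "?message=" + encoded);
    }
}
```

The resulting URI contains only ASCII characters and can be passed to curl or to java.net.http.HttpClient as-is.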
4. Milvus connection and Bean configuration
MilvusConfig exposes the singleton client to the Spring container:
java
@Configuration
public class MilvusConfig {
    @Bean
    public MilvusClientV2 milvusClient() {
        return MilvusClientService.getClient();
    }
}
MilvusClientService uses a lazily initialized singleton:
java
public class MilvusClientService {
    private static final String CLUSTER_ENDPOINT = "http://localhost:19530";
    private static final String TOKEN = "root:Milvus";
    private static MilvusClientV2 instance;

    public static MilvusClientV2 getClient() {
        if (instance == null) {
            ConnectConfig connectConfig = ConnectConfig.builder()
                    .uri(CLUSTER_ENDPOINT)
                    .token(TOKEN)
                    .build();
            instance = new MilvusClientV2(connectConfig);
        }
        return instance;
    }
}
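One thing to note about getClient(): the null check is not synchronized, so two threads calling it at the same moment could each build a client. With a single call from the @Bean method that does not happen, but if the method is used elsewhere, the initialization-on-demand holder idiom is one simple alternative. The sketch below uses a stand-in FakeClient class (not part of the project) so that it compiles without the Milvus SDK:

```java
// Stand-in for MilvusClientV2 so the sketch compiles without the SDK.
final class FakeClient {
    static int created = 0;

    FakeClient() {
        created++;
    }
}

final class ClientHolder {
    private ClientHolder() {
    }

    // The JVM initializes Lazy only on first access and guarantees that
    // class initialization runs once, even with concurrent callers.
    private static final class Lazy {
        static final FakeClient INSTANCE = new FakeClient();
    }

    static FakeClient getClient() {
        return Lazy.INSTANCE;
    }
}
```

Compared with adding synchronized to the method, this keeps later calls lock-free; either approach would work for a demo of this size.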
Constant definitions:
java
public class MilvusConstant {
    public static final String DATA_BASE = "demo1";
    public static final String MY_COLLECTION_1 = "my_collection_1";
}
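5. Creating the Milvus database and collection at startup, then ingesting
MilvusCollectionCreator implements InitializingBean, so afterPropertiesSet() runs once the bean's properties have been injected.
5.1 Startup logic (note: this wipes the demo1 database)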
Flowchart: check whether the demo1 database exists.
- Exists: delete all Collections → dropDatabase demo1 → createDatabase demo1
- Does not exist: createDatabase demo1
- Then in both cases: useDatabase demo1 → create my_collection_1 → milvusService.loadFile
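Important behavior: if the demo1 database already exists, the creator first deletes every Collection in it and drops the whole database, then recreates it. The document is therefore re-imported on every startup, which suits a demo but has to be reworked for production.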
5.2 Collection Schema
| Field | Type | Description |
|---|---|---|
| id | Int64 | Primary key, autoID=true |
| vector | FloatVector | Dimension 384, matching the embedding model |
| message | VarChar(2000) | Original text of the chunk |
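A detail worth checking for the message field: to my understanding the Milvus documentation describes the VarChar max_length as a byte length rather than a character count, and a Chinese character usually takes 3 bytes in UTF-8. With the 500-character chunk limit used later in this article, a chunk can therefore reach roughly 1,500 bytes, which still fits in 2000. The sketch below shows how such a check could look; the VarCharCheck class is hypothetical and the byte-based limit should be confirmed against the Milvus version in use.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical guard for text that will be stored in a VarChar field.
final class VarCharCheck {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the text fits a field whose limit is counted in bytes.
    static boolean fits(String text, int maxBytes) {
        return utf8Length(text) <= maxBytes;
    }
}
```

If the chunk size were raised much beyond 600 Chinese characters, a check like this before insert would flag chunks that exceed the field limit.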
5.3 Index configuration
- vector: IVF_FLAT, metric L2, nlist=128
- message: AUTOINDEX (scalar field index)
5.4 Key code: creating the collection
java
// Primary key, generated by Milvus
fieldSchemaList1.add(CreateCollectionReq.FieldSchema.builder()
        .name("id")
        .dataType(DataType.Int64)
        .isPrimaryKey(true)
        .autoID(true)
        .build());
// 384-dimensional embedding of the chunk
fieldSchemaList1.add(CreateCollectionReq.FieldSchema.builder()
        .name("vector")
        .dataType(DataType.FloatVector)
        .dimension(384)
        .build());
// Original chunk text
fieldSchemaList1.add(CreateCollectionReq.FieldSchema.builder()
        .name("message")
        .dataType(DataType.VarChar)
        .maxLength(2000)
        .build());
indexParams1.add(IndexParam.builder()
        .indexName("vector_index")
        .fieldName("vector")
        .indexType(IndexParam.IndexType.IVF_FLAT)
        .metricType(IndexParam.MetricType.L2)
        .extraParams(Collections.singletonMap("nlist", 128))
        .build());
// Construction of createCollectionReq is omitted in this excerpt
milvusClientV2.createCollection(createCollectionReq);
// Once the collection has been created, ingest the document
milvusService.loadFile();
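6. Document loading, splitting, embedding, and storage
6.1 Embedding model
The demo uses a local ONNX model shipped with LangChain4j, so no external API is needed: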
java
public class EmbeddingModelUtil {
    private static final EmbeddingModel embeddingModel =
            new AllMiniLmL6V2QuantizedEmbeddingModel();

    public static float[] embed(String text) {
        Response<Embedding> response = embeddingModel.embed(text);
        return response.content().vector();
    }
}
6.2 Document processing flow (MilvusService.loadFile)
Flowchart: ClassPathResource data/Smartshell.txt → Document.from (full text) → recursive splitter 500/50 → iterate over TextSegment → EmbeddingModelUtil.embed → insertData (vector + message) → Milvus insert
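Chunking parameters:
- MAX_SEGMENT_SIZE = 500: maximum number of characters per segment
- MAX_OVERLAP_SIZE = 50: overlap between adjacent segments, so that meaning is not cut off at a boundary
Source document: src/main/resources/data/Smartshell.txt (product description of the SmartShell intelligent operations platform)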
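To make the 500/50 parameters concrete, here is a deliberately simplified fixed-window splitter. It is not LangChain4j's recursive splitter, which tries to break at natural boundaries such as paragraphs and sentences before falling back to smaller units; it only illustrates how a size limit and an overlap interact. The WindowSplitter class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; not the algorithm used by
// DocumentSplitters.recursive.
final class WindowSplitter {
    // Splits text into windows of at most maxSize chars. Each window starts
    // (maxSize - overlap) chars after the previous one, so neighbouring
    // windows share `overlap` chars.
    static List<String> split(String text, int maxSize, int overlap) {
        if (maxSize <= 0 || overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("require 0 <= overlap < maxSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

For a 1,200-character text with 500/50, this yields three chunks covering characters 0-500, 450-950, and 900-1200, so each boundary region appears in two chunks.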
6.3 Key code: loading and ingestion
java
// getContentAsString declares IOException, so the method propagates it
public void loadFile() throws IOException {
    ClassPathResource resource = new ClassPathResource("data/Smartshell.txt");
    String text = resource.getContentAsString(StandardCharsets.UTF_8);
    Document document = Document.from(text);
    DocumentSplitter splitter = DocumentSplitters.recursive(MAX_SEGMENT_SIZE, MAX_OVERLAP_SIZE);
    List<TextSegment> segments = splitter.split(document);
    for (TextSegment segment : segments) {
        String segmentText = segment.text().trim();
        if (segmentText.isEmpty()) continue;
        // Embed each chunk and store it together with its text
        float[] vector = EmbeddingModelUtil.embed(segmentText);
        insertData(vector, segmentText);
    }
}

public void insertData(float[] vector, String message) {
    // The id field is omitted because autoID is enabled
    JsonObject row = new JsonObject();
    row.add("vector", JsonUtil.gson.toJsonTree(vector));
    row.addProperty("message", message);
    milvusClientV2.insert(InsertReq.builder()
            .collectionName(MilvusConstant.MY_COLLECTION_1)
            .data(List.of(row))
            .build());
}
Ingestion and retrieval must use the same embedding model; this is a prerequisite for RAG to work well.
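insertData sends one insert request per segment. Since the data(...) builder method already takes a List, rows could instead be collected and sent in batches. The sketch below covers only the batching step with a generic helper; the Batches class and any particular batch size are assumptions, not part of the project.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for grouping rows before a bulk insert.
final class Batches {
    // Partitions rows into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rows.size());
            // Copy so each batch is independent of the source list
            batches.add(new ArrayList<>(rows.subList(from, to)));
        }
        return batches;
    }
}
```

Each resulting batch could then be passed to a single InsertReq in place of List.of(row), cutting the number of round trips to Milvus.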
7. Q&A pipeline: ChatRAGController
7.1 Request handling flow
Sequence diagram (participants: User, ChatRAGController, EmbeddingModelUtil, RetrievalService, Milvus, DeepSeekChatModel):
- User → ChatRAGController: GET /ai/rag/call?message=...
- ChatRAGController → EmbeddingModelUtil: embed(message), returns a 384-dimensional queryVector
- ChatRAGController → RetrievalService: retrieveDocuments(queryVector)
- RetrievalService → Milvus: search TopK=3, L2, returns id, message, score
- RetrievalService → ChatRAGController: List of SearchResultWithScore
- ChatRAGController: buildRagPrompt + sort by score
- ChatRAGController → DeepSeekChatModel: Prompt(promptMessage), returns ChatResponse
- ChatRAGController → User: { generation: "..." }
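7.2 Vector search (RetrievalService)
- topK = 3: at most 3 relevant passages are returned
- Metric type L2: matches the vector_index defined when the collection was created
- A smaller L2 distance means higher similarity (the Controller sorts by score in ascending order)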
java
SearchReq searchReq = SearchReq.builder()
        .collectionName(MilvusConstant.MY_COLLECTION_1)
        .data(Collections.singletonList(new FloatVec(queryVector)))
        .topK(topK)
        .metricType(IndexParam.MetricType.L2)
        .outputFields(List.of("id", "message"))
        .build();
SearchResp searchResp = milvusClient.search(searchReq);
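To make "a smaller L2 distance means higher similarity" concrete, here is a brute-force sketch of what a top-K L2 search computes. It is not how Milvus executes the query (an IVF_FLAT index scans only a subset of clusters rather than every vector), and the BruteForceL2 class is made up for illustration. It ranks by squared distance, which gives the same ordering as the true L2 distance.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

// Illustration only: exhaustive nearest-neighbour search in plain Java.
final class BruteForceL2 {
    // Squared Euclidean distance; smaller means more similar.
    static double squaredL2(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the indices of the k stored vectors closest to the query,
    // nearest first.
    static List<Integer> topK(float[] query, List<float[]> stored, int k) {
        return IntStream.range(0, stored.size())
                .boxed()
                .sorted(Comparator.comparingDouble(
                        (Integer i) -> squaredL2(query, stored.get(i))))
                .limit(k)
                .toList();
    }
}
```

With the query (0, 0) and stored vectors (3, 4), (1, 0), (0, 2), (10, 10), the three nearest are indices 1, 2, 0 in that order, which mirrors the ascending sort the Controller applies to the scores.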
7.3 Prompt assembly
The retrieved passages are numbered and presented as "reference articles", followed by the user's original question:
java
private String buildRagPrompt(List<SearchResultWithScore> results, String question) {
    StringBuilder prompt = new StringBuilder();
    // Header literal means "Reference articles:"
    prompt.append("参考文章:\n\n");
    // Number each retrieved passage as [1], [2], ...
    for (int i = 0; i < results.size(); i++) {
        prompt.append("[").append(i + 1).append("] ")
                .append(results.get(i).getMessage())
                .append("\n\n");
    }
    // Literal means "Answer the following question based on the information above:"
    prompt.append("请根据如上信息回答如下问题:\n")
            .append(question);
    return prompt.toString();
}
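With three passages of at most 500 characters the prompt stays small, but if topK or the chunk size grows it may be worth capping the context. The following sketch keeps the same layout as buildRagPrompt while stopping once a character budget is reached. It takes plain strings instead of SearchResultWithScore, uses English stand-ins for the prompt literals, and assumes the passages are already sorted with the most similar first; the BoundedPromptBuilder class and the maxContextChars parameter are hypothetical.

```java
import java.util.List;

// Hypothetical variant of buildRagPrompt with a context budget.
final class BoundedPromptBuilder {
    // Adds passages in order until the next one would push the total
    // passage length over maxContextChars, then appends the question.
    static String build(List<String> passages, String question, int maxContextChars) {
        StringBuilder prompt = new StringBuilder("Reference passages:\n\n");
        int used = 0;
        int index = 1;
        for (String passage : passages) {
            if (used + passage.length() > maxContextChars) {
                break; // budget exhausted; drop the remaining passages
            }
            prompt.append("[").append(index++).append("] ")
                    .append(passage).append("\n\n");
            used += passage.length();
        }
        prompt.append("Answer the following question based on the information above:\n")
                .append(question);
        return prompt.toString();
    }
}
```

Dropping whole passages rather than cutting one mid-sentence keeps each reference readable; a token-based budget would be more precise but needs a tokenizer.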
7.4 Calling DeepSeek to generate the answer
java
@GetMapping("/call")
public Map<String, String> call(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
    // Embed the question with the same model used at ingestion time
    float[] vectors = EmbeddingModelUtil.embed(message);
    List<SearchResultWithScore> searchResultWithScores = retrievalService.retrieveDocuments(vectors);
    // Fall back to the bare question if nothing was retrieved
    String promptMessage = message;
    if (!searchResultWithScores.isEmpty()) {
        // Ascending L2 score: most similar passage first
        searchResultWithScores = searchResultWithScores.stream()
                .sorted(Comparator.comparingDouble(SearchResultWithScore::getScore))
                .toList();
        promptMessage = buildRagPrompt(searchResultWithScores, message);
    }
    DeepSeekChatOptions options = DeepSeekChatOptions.builder()
            .model(DeepSeekApi.ChatModel.DEEPSEEK_V4_PRO.getValue())
            .temperature(0.8)
            .build();
    Prompt prompt = new Prompt(promptMessage, options);
    ChatResponse response = chatModel.call(prompt);
    return Map.of("generation", response.getResult().getOutput().getText());
}
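Note: the whole RAG Prompt is sent to the model as a single user message, without a System Prompt or a multi-turn Message structure. That is enough for a demo; in production, consider splitting it into a SystemMessage and a UserMessage.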
8. Startup and verification steps
8.1 Prerequisites
- Milvus running at localhost:19530 (Docker or a local installation)
- The DEEPSEEK_API_KEY environment variable is set
- Java 17 + Maven
8.2 Start the application
powershell
# Windows PowerShell example
$env:DEEPSEEK_API_KEY="your-api-key"
mvn spring-boot:run
Or run the packaged jar:
bash
java -jar target/javaai-0.0.1-SNAPSHOT.jar
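The startup log should contain:
- start create milvus collections
- 文件切分为 N 个文本片段 (the file was split into N text segments)
- 成功插入 1 条数据 (1 row inserted successfully; printed once per segment)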
8.3 Call the Q&A endpoint
bash
curl "http://localhost:8080/javaai/ai/rag/call?message=SmartShell和传统堡垒机有什么区别?"
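Expected response (the answer is in Chinese, matching the question and the source document; in short, it says SmartShell uses AI to assess the real risk of a command instead of blocking by fixed rules as a traditional bastion host does):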
json
{
"generation": "SmartShell 是一款**智能化的运维与数据库管理工具**。它通过 SSH 连接主机和数据库,但不同于传统堡垒机的死板命令拦截,SmartShell 使用 AI 实时分析命令的实际风险。例如,面对类似 `rm -rf` 或 `DELETE FROM` 这类高危指令,它会结合当前目录内容或表中是否有数据等真实场景进行判断:如果目标是空目录、日志文件或空表,会放行并给予提示;如果表中有数据,则会先备份再执行,而不是像传统堡垒机那样无条件禁止。"
}
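9. Design notes and possible improvements
| Topic | Current implementation | Suggestion |
|---|---|---|
| Startup wipe | Drops the demo1 database on every startup | In production, switch to incremental updates or versioned Collections |
| Milvus configuration | Hard-coded URI/Token | Move to application.properties or environment variables |
| Ingestion efficiency | One insert per segment | Batch inserts for better performance |
| Retrieval failure | Throws RuntimeException when there are no results | Degrade to answering directly without context |
| Prompt | Simple string concatenation | Add a System role, citation markers, and length truncation |
| Document source | A single classpath file only | Extend with Tika parsing for PDF/Word, object storage, etc. |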
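For the retrieval failure row, one way to degrade gracefully is to wrap the retrieval call so that an exception turns into an empty result; the existing isEmpty() check in the controller then sends the bare question to the model. A minimal sketch, with a hypothetical SafeRetrieval helper that is not part of the project:

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical wrapper that converts retrieval failures into "no context".
final class SafeRetrieval {
    // Runs the retrieval and returns an empty list if it throws or
    // returns null, so the caller can fall back to a plain question.
    static <T> List<T> orEmpty(Supplier<List<T>> retrieval) {
        try {
            List<T> results = retrieval.get();
            return results == null ? List.of() : results;
        } catch (RuntimeException e) {
            // In a real service this should be logged, otherwise a Milvus
            // outage would silently look like "no matching documents".
            return List.of();
        }
    }
}
```

In the controller this would be used as SafeRetrieval.orEmpty(() -> retrievalService.retrieveDocuments(vectors)).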
10. Class and file index
src/main/java/com/haiwei/javaai/demo1/
├── MilvusConstant.java # database and Collection names
├── MilvusConfig.java # MilvusClientV2 Bean
├── MilvusCollectionCreator.java # creates database/collection at startup + triggers loadFile
├── MilvusService.java # document chunking, embedding, writing
├── RetrievalService.java # Milvus vector search
├── ChatRAGController.java # RAG Q&A API
└── SearchResultWithScore.java # search result model
src/main/java/com/haiwei/javaai/langchain4j/
└── EmbeddingModelUtil.java # 384-dimensional local embedding
src/main/java/com/haiwei/javaai/service/impl/
└── MilvusClientService.java # Milvus connection singleton
src/main/resources/
├── application.properties
└── data/Smartshell.txt # RAG knowledge base document
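Summary
demo1 implements a clear, closed RAG loop. In the startup phase it uses the Milvus V2 SDK to create the database and collection and to ingest the embedded document. In the request phase it embeds the user's question with the same model, runs an L2 search for the top 3 matches, assembles a "reference articles + question" Prompt, and calls DeepSeek through Spring AI to generate the answer. Keeping the embedding layer and the vector store consistent in dimension (384) and metric (L2) is the key constraint that makes the whole pipeline work.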