4-RAG-Helping the AI Assistant Get to Know Me Quickly

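1-Background

In the previous article the AI assistant learned to use tools, so she can finally do a lot of things for me.

But as we kept chatting, I found that she knows nothing about me and still feels like a stranger: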

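```shell
# Send message: "我毕业几年了" (How many years has it been since I graduated?)
curl "http://localhost:8080/ai/chat?message=%E6%88%91%E6%AF%95%E4%B8%9A%E5%87%A0%E5%B9%B4%E4%BA%86"
# AI assistant's reply:
How long ago you graduated depends on your situation. Which year did you graduate? If you tell me the exact year, I can work out how many years it has been.
```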

Is there a way for her to get to know me quickly? This is where RAG (Retrieval-Augmented Generation) comes in.

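2-Hands-on Practice

Spring AI once again offers a shortcut for implementing RAG.

Even so, the materials and steps RAG requires are still a bit involved.

Don't worry, though: follow my pace and you will get RAG working, and once it works we can go back and think about why.

To help you quickly see what we are going to build, I drew a simplified RAG diagram for reference: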
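The general RAG flow can also be sketched in a few lines of plain Java: retrieve the most relevant documents, append them to the question, then hand the result to the chat model. This is only a toy under loud assumptions: `RagFlowSketch`, `score`, `retrieve` and `augment` are names I made up, the character-overlap score merely stands in for real embedding similarity, and the call to the chat model is left out. Spring AI takes care of all of this in the steps that follow.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class RagFlowSketch {

    // Toy relevance score: the share of the query's distinct characters that also occur in the document.
    // A real system compares embedding vectors instead.
    static double score(String query, String doc) {
        Set<Integer> queryChars = query.codePoints().boxed().collect(Collectors.toSet());
        if (queryChars.isEmpty()) {
            return 0.0;
        }
        Set<Integer> docChars = doc.codePoints().boxed().collect(Collectors.toSet());
        long shared = queryChars.stream().filter(docChars::contains).count();
        return (double) shared / queryChars.size();
    }

    // Step 1, retrieve: keep the topK highest-scoring documents.
    static List<String> retrieve(String query, List<String> docs, int topK) {
        return docs.stream()
                .sorted(Comparator.comparingDouble((String doc) -> score(query, doc)).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }

    // Step 2, augment: append the retrieved text to the user's question.
    static String augment(String query, List<String> context) {
        return query + "\n\nReference information found in the knowledge base:\n"
                + String.join("\n", context);
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "My name is Zhang San and I graduated from university in 2022",
                "My dad is Li Lei, born on February 2, 1975");
        String query = "When did I graduate";
        // Step 3, generate: this augmented prompt is what would be sent to the chat model.
        System.out.println(augment(query, retrieve(query, docs, 1)));
    }
}
```

The point to take away is that the model itself is unchanged; it simply receives a longer question that already contains the relevant facts.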

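2.1 Preparation

To complete this exercise, you additionally need:

a pgvector vector database;
the m3e Chinese-English embedding model.

💡 Chances are you have neither of these on hand. As long as Docker or Podman is installed, the following commands will set them up quickly: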

```shell
# 1. Start the pgvector database; if you already used pgvector in "Giving the AI Assistant Memory", you can reuse it
docker run -d \
    --name ai-assistant-pgvector \
    -p 5432:5432 \
    -e POSTGRES_DB=ai-assistant-db \
    -e POSTGRES_USER=ai-assistant \
    -e POSTGRES_PASSWORD=123456 \
    pgvector/pgvector:pg17

# 2. Start the m3e Chinese-English embedding model
docker run -d \
    --name=ai-assistant-m3e \
    -p 6008:6008 \
    registry.cn-hangzhou.aliyuncs.com/fastgpt_docker/m3e-large-api:latest
```

2.2 Add Dependencies

Add the vector store and RAG dependencies to pom.xml:

```xml
<!-- ... -->
<!-- pgvector vector search -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
</dependency>
<!-- Vector store Advisor for RAG -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-advisors-vector-store</artifactId>
</dependency>
<!-- ... -->
```

2.3 Add Vector Store Configuration

Add the following to application.yaml:

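```yaml
spring:
  #...
  # Database connection settings; skip these if you configured them earlier
  datasource:
    driver-class-name: org.postgresql.Driver
    url: jdbc:postgresql://localhost:5432/ai-assistant-db
    username: ai-assistant
    password: 123456
  ai:
    openai:
      # Embedding model client settings
      embedding:
        # m3e service address; this is the value if you started it locally with the docker command above
        base-url: "http://localhost:6008"
        # m3e API key; this is the default if you started it locally with the docker command above
        api-key: "sk-aaabbbcccdddeeefffggghhhiiijjjkkk"
        options:
          model: m3e
    # Vector store client settings
    vectorstore:
      pgvector:
        # Create the vector search table automatically
        initialize-schema: true
        # Use an HNSW index on the vectors
        index-type: HNSW
        # Use cosine distance for indexing and searching (a good fit for text search)
        distance-type: COSINE_DISTANCE
        # Vector dimensions; must match the embedding model's output (1536 by default for the m3e-large image above)
        dimensions: 1536
```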
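Two of these settings are easier to digest with a little arithmetic. Cosine distance is one minus cosine similarity, and both only make sense when the two vectors have the same number of dimensions, which is why `dimensions` has to match the embedding model. Here is a minimal plain-Java sketch of the formula; `CosineSketch` and its methods are names I made up for illustration, and the database does the real computation for you.

```java
class CosineSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // 1.0 means the vectors point the same way, 0.0 means they are orthogonal.
    // Note: an all-zero vector has no direction, so the result is NaN in that case.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                    "dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Cosine distance: smaller means more similar.
    static double cosineDistance(double[] a, double[] b) {
        return 1.0 - cosineSimilarity(a, b);
    }

    public static void main(String[] args) {
        double[] a = {1.0, 0.0};
        double[] b = {1.0, 1.0};
        System.out.println(cosineSimilarity(a, b)); // about 0.7071
        System.out.println(cosineDistance(a, b));   // about 0.2929
    }
}
```

Real embeddings have 1536 entries instead of two, but the calculation is the same.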

2.4 Code Changes

To keep responsibilities separate, we create a new vector controller, VectorStoreController:

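```java
import java.util.List;
import java.util.Map;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class VectorStoreController {
    private final VectorStore vectorStore;

    // Inject the vector store instance
    public VectorStoreController(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Initialize the knowledge base (calling this more than once stores the documents again)
    @GetMapping("vector/init")
    public String init() {
        List<Document> documents = List.of(
                // "My name is Zhang San, born on January 1, 2000. I graduated in 2022 from Wudaokou Vocational and Technical College and now work at Kaixin Technology Co., Ltd. in Nanshan Science and Technology Park, Shenzhen"
                new Document("我叫张三，出生于2000年1月1日，于2022年大学毕业于五道口职业技术学院，现工作于深圳南山科技园开心科技有限公司", Map.of("user", "zhangsan", "category", "resume")),
                // "My dad is Li Lei, born on February 2, 1975"
                new Document("我爸叫李雷，出生于1975年2月2日", Map.of("user", "lilei", "category", "resume")),
                // "My mom is Han Meimei, born on March 3, 1975"
                new Document("我妈叫韩梅梅，出生于1975年3月3日", Map.of("user", "hanmeimei", "category", "resume")));
        vectorStore.add(documents);
        return "ok";
    }

    @GetMapping("vector/search")
    public List<Document> search(@RequestParam String query) {
        // Retrieve documents similar to the query
        SearchRequest request = SearchRequest.builder()
                .query(query)
                .topK(5)
                .similarityThreshold(0.3)
                .build();
        return this.vectorStore.similaritySearch(request);
    }
}
```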
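The two knobs in the search request, `topK(5)` and `similarityThreshold(0.3)`, work together: results scoring below the threshold are dropped, and of what remains at most K of the best matches come back. Here is a minimal plain-Java sketch of that selection. `TopKSketch` and the `Scored` record are hypothetical, and the scores in `main` are invented for illustration; the vector store does this inside the database.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TopKSketch {

    // Hypothetical holder for a document's text and its similarity to the query (0.0 to 1.0).
    record Scored(String text, double similarity) {}

    // Keep candidates at or above the threshold, best first, at most topK of them.
    static List<Scored> select(List<Scored> candidates, double threshold, int topK) {
        return candidates.stream()
                .filter(c -> c.similarity() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::similarity).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented scores, only to show the effect of the two parameters.
        List<Scored> candidates = List.of(
                new Scored("resume of zhangsan", 0.82),
                new Scored("resume of lilei", 0.41),
                new Scored("unrelated note", 0.12));
        // With threshold 0.3 and topK 5, the unrelated note is dropped and two results remain.
        select(candidates, 0.3, 5).forEach(System.out::println);
    }
}
```

Raising the threshold trades recall for precision, while lowering `topK` keeps the prompt short, so both are worth tuning against your own data.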

Below is the code that formally brings RAG into the chat:

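```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.ChatMemoryRepository;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MyChatController {
    private final ChatClient chatClient;

    public MyChatController(ChatClient.Builder chatClientBuilder, ChatMemoryRepository chatMemoryRepository, VectorStore vectorStore) {
        ChatMemory chatMemory = MessageWindowChatMemory.builder()
                .chatMemoryRepository(chatMemoryRepository) // Store messages in the database
                .maxMessages(10) // Remember at most 10 chat messages
                .build();
        this.chatClient = chatClientBuilder
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
                // ++++++ Add the RAG Advisor ++++++++++
                .defaultAdvisors(
                        QuestionAnswerAdvisor.builder(vectorStore)
                                // Rewrite the user request: combine the user's {query} with the vector search results {question_answer_context}
                                // The Chinese line in the template reads: "Refer to the following information found in the knowledge base:"
                                .promptTemplate(new PromptTemplate("""
                                         {query}
                                        
                                         ** 参考以下从资料库搜索到的信息：
                                         {question_answer_context}
                                        """))
                                // Vector store request parameters: similarity threshold and maximum number of results; tune them against real results
                                .searchRequest(SearchRequest.builder().similarityThreshold(0.3d).topK(5).build())
                                .build())
                // ================================
                .defaultTools(new DateTimeTools())
                .build();
    }
    //...
}
```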
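To get a feel for what the chat model actually receives, it helps to fill in the two placeholders by hand. The sketch below does so with plain string replacement. It is not how Spring AI renders templates internally, `PromptSketch` and `render` are names I made up, and the template text is an English rendering of the one above.

```java
import java.util.List;

class PromptSketch {

    // English rendering of the template used in the advisor above.
    static final String TEMPLATE = """
            {query}

            ** Refer to the following information found in the knowledge base:
            {question_answer_context}
            """;

    // Substitute the user's question and the retrieved documents into the template.
    static String render(String query, List<String> retrievedDocs) {
        String context = String.join("\n", retrievedDocs);
        return TEMPLATE
                .replace("{query}", query)
                .replace("{question_answer_context}", context);
    }

    public static void main(String[] args) {
        String prompt = render(
                "How many years has it been since I graduated?",
                List.of("My name is Zhang San ... I graduated from university in 2022 ..."));
        // The printed text is the user message with the retrieved facts appended.
        System.out.println(prompt);
    }
}
```

Seen this way, the advisor's job is modest: the question stays intact and the search results are appended as reference material.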

2.5 Verify the Result

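```shell
# Step 1: initialize the knowledge base
curl http://localhost:8080/vector/init

# Step 2: start chatting

# Send message: "我毕业几年了" (How many years has it been since I graduated?)
curl "http://localhost:8080/ai/chat?message=%E6%88%91%E6%AF%95%E4%B8%9A%E5%87%A0%E5%B9%B4%E4%BA%86"
# AI assistant's reply (she seems to have forgotten to call the tool):
You graduated on **June 15, 2020**. Today is **November 9, 2023**, so it has been **3 years, 4 months and 24 days** since you graduated.

# Send a message to remind her: "今天的日期" (today's date)
curl "http://localhost:8080/ai/chat?message=%E4%BB%8A%E5%A4%A9%E7%9A%84%E6%97%A5%E6%9C%9F"
# AI assistant's reply:
Today's date is **June 6, 2025**.

# Ask again: "我毕业几年了"
curl "http://localhost:8080/ai/chat?message=%E6%88%91%E6%AF%95%E4%B8%9A%E5%87%A0%E5%B9%B4%E4%BA%86"
# AI assistant's reply (finally correct):
Based on your graduation date of **June 30, 2022**, and today being **June 6, 2025**, it has been **3 years** since you graduated.
```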

Now she finally knows something about me 🎉
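The year count is easy to double-check by hand. The knowledge base entry only records the graduation year (2022), and the tool reported June 6, 2025 as today's date, so the grounded calculation is a difference of calendar years. Here is a small sketch with `java.time`; `GraduationYearsSketch` and its methods are names I made up for illustration.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class GraduationYearsSketch {

    // Difference between calendar years, which is all the knowledge base entry supports.
    static int calendarYearsSince(int graduationYear, LocalDate today) {
        return today.getYear() - graduationYear;
    }

    // Whole years completed between two exact dates (stricter than the calendar-year difference).
    static long fullYearsBetween(LocalDate from, LocalDate to) {
        return ChronoUnit.YEARS.between(from, to);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2025, 6, 6); // the date returned by the tool in the transcript
        System.out.println(calendarYearsSince(2022, today)); // 3
        // Taking the June 30, 2022 date from the reply literally, the third year is not quite complete.
        System.out.println(fullYearsBetween(LocalDate.of(2022, 6, 30), today)); // 2
    }
}
```

Which of the two counts you want depends on the question, so it is worth spelling out in the prompt or the tool when the difference matters.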

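3-Further Exploration

Look at the console logs or debug the code to see what the whole call flow and the prompts look like;
Dig deeper into what RAG is;
How documents are actually ingested has a big impact on both the search results and the answers, so learn about the complete RAG pipeline and how vector retrieval is tuned;
What could you use RAG for?
Try improving the document ingestion process and load more documents.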
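As a starting point for the ingestion exercise, longer documents are usually cut into smaller pieces before they are embedded, often with some overlap so that a sentence cut at a boundary still appears whole in one of the pieces. Below is a naive fixed-size chunker in plain Java. `ChunkSketch` and `chunk` are names I made up, and real pipelines typically split on tokens, sentences or headings rather than raw character counts.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSketch {

    // Split text into pieces of at most `size` characters; consecutive pieces share `overlap` characters.
    // Counts UTF-16 code units, so characters outside the basic plane could be split in two.
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("need size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last piece reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Each piece starts with the last character of the previous one.
        System.out.println(chunk("abcdefghij", 4, 1)); // [abcd, defg, ghij]
    }
}
```

Each piece would then become its own `Document`, ideally with metadata that points back to the source file.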

4-References

Spring AI official documentation: RAG