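Upgrading an AI Customer Service System in Practice: Multi-Agent Routing + Multi-Turn Memory + Sensitive-Word Filtering

From a single Agent to a team of specialized Agents, wiring up two real business needs along the way: ChatMemory and sensitive-word filtering

Conclusion first

The previous article set up the basic framework: emotion analysis → intent recognition → Agent tool calls. In a real customer service scenario, though, letting one do-everything Agent handle every request causes obvious problems: the Prompt keeps growing, more and more tools get attached, and eventually the model's attention starts to drift.

This article walks through the three changes made since then, one at a time: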

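Already in the previous article	New in this article
Single CustomerServiceAgent	3 specialized Agents + Router dispatch
Custom Redis ChatMemory	Spring AI's official JdbcChatMemoryRepository
Pure LLM intent recognition	LLM recognition + keyword fallback as a double safeguard
No safety filtering	Two-way sensitive-word filtering (input + output)

The short version: the value of Router + multiple Agents is not that it looks sophisticated, but that each Agent gets a focused Prompt and a clean tool set. Migrating multi-turn memory to Spring AI's official implementation turned out to be less work than the hand-written version; the pitfalls were mostly in understanding the API design.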

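1. Multi-Agent split: don't make one Agent carry everything

Why split

The original CustomerServiceAgent had to handle product questions, order lookups, refund requests, logistics tracking, complaint soothing, and more. The Prompt ran past 500 characters and 6 tools were attached. Once it was running, the model frequently got confused: when a user said "this thing doesn't work well", it sometimes treated the message as a RAG query (searching the knowledge base) and sometimes as a complaint (preparing a human handoff). The behavior was very erratic.

The fix was to split it into 3 focused Agents, each doing one thing: ComplaintAgent (angry users and complaints), PostSalesAgent (orders, refunds, logistics), and PreSalesAgent (product questions and everything else).

BaseAgent design

The three Agents share one set of ChatClient assembly logic, pulled up into a base class: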

public abstract class BaseAgent {

    protected final ChatClient chatClient;
    protected final ChatMemory chatMemory;

    protected BaseAgent(ChatClient chatClient, ChatMemory chatMemory) {
        this.chatClient = chatClient;
        this.chatMemory = chatMemory;
    }

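    /**
     * Core chat method: memory Advisor + extra Advisors + tools
     *
     * @param extraAdvisors extras passed in by the subclass: Advisors (e.g. a RAG Advisor)
     *                      join the advisor chain, anything else is treated as a tool
     */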
    protected String chatWithAdvisors(String userMessage, String systemPrompt,
                                      String sessionId, Object... extraAdvisors) {
        var memoryAdvisor = MessageChatMemoryAdvisor.builder(chatMemory)
                .conversationId(sessionId)
                .build();

        // Separate Advisors from tools
        List<Advisor> advisors = new ArrayList<>();
        advisors.add(memoryAdvisor);
        List<Object> tools = new ArrayList<>();

        for (Object extra : extraAdvisors) {
            if (extra instanceof Advisor a) {
                advisors.add(a);
            } else {
                tools.add(extra);
            }
        }

        // ... build the ChatClient call chain
    }
}

Routing rules

CustomerSupportRouter is a pure rule-based router. It never calls the LLM, so its latency is stable:

public String chat(RoutingContext ctx, String message,
                   String emotionStrategy, String sessionId) {
    // 1. Emotion first: ANGRY or a complaint intent goes straight to the complaint Agent
    if (ctx.emotionLevel() == EmotionLevel.ANGRY
            || ctx.intentType() == IntentType.COMPLAINT) {
        return complaintAgent.chat(message, emotionStrategy, sessionId);
    }

    // 2. Post-sales intent → post-sales Agent
    if (isPostSalesIntent(ctx.intentType())) {
        return postSalesAgent.chat(message, emotionStrategy, sessionId);
    }

    // 3. Everything else → pre-sales Agent (product questions + RAG as the catch-all)
    return preSalesAgent.chat(message, emotionStrategy, sessionId);
}

private boolean isPostSalesIntent(IntentType intent) {
    return intent == IntentType.ORDER_QUERY
            || intent == IntentType.REFUND
            || intent == IntentType.LOGISTICS;
}

RoutingContext carries three decision dimensions:

public record RoutingContext(
    EmotionLevel emotionLevel,    // from emotion analysis
    IntentType   intentType,      // from intent recognition
    SessionPhase sessionPhase     // current session phase (pre-sales / post-sales / in progress)
) {}

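Pitfall: the first design had 5 Agents (adding TechnicalSupportAgent and RecommendationAgent). It turned out that IntentType.RAG covered both technical support and product recommendation, so those two Agents could never be routed to. Trimming down to 3 made things clearer.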
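
One way to catch this kind of dead Agent early is to keep the intent-to-Agent mapping in a table and check it once at startup. The sketch below is an illustration under assumptions, not the project's code: `Intent`, `AgentKind`, and `unreachableAgents` are hypothetical stand-ins, and the route table only mirrors the five-Agent design described above.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

final class RouteTableSketch {
    // Hypothetical stand-ins for the article's IntentType and Agent classes.
    enum Intent { RAG, ORDER_QUERY, REFUND, LOGISTICS, COMPLAINT }

    enum AgentKind { PRE_SALES, POST_SALES, COMPLAINT, TECH_SUPPORT, RECOMMENDATION }

    private RouteTableSketch() {}

    /** Returns every Agent that no intent routes to. */
    static Set<AgentKind> unreachableAgents(Map<Intent, AgentKind> routes) {
        Set<AgentKind> unreachable = EnumSet.allOf(AgentKind.class);
        unreachable.removeAll(routes.values());
        return unreachable;
    }

    /** The five-Agent design: the single RAG intent can point at only one Agent. */
    static Map<Intent, AgentKind> fiveAgentRoutes() {
        Map<Intent, AgentKind> routes = new EnumMap<>(Intent.class);
        routes.put(Intent.RAG, AgentKind.PRE_SALES);
        routes.put(Intent.ORDER_QUERY, AgentKind.POST_SALES);
        routes.put(Intent.REFUND, AgentKind.POST_SALES);
        routes.put(Intent.LOGISTICS, AgentKind.POST_SALES);
        routes.put(Intent.COMPLAINT, AgentKind.COMPLAINT);
        return routes;
    }

    public static void main(String[] args) {
        // Lists the Agents that would never receive a request.
        System.out.println(unreachableAgents(fiveAgentRoutes()));
    }
}
```

If the returned set is non-empty, either an intent is missing or an Agent is redundant, which is the situation that led to the 3-Agent layout.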

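Why not use Spring AI Alibaba's LlmRoutingAgent

When I saw LlmRoutingAgent in the Spring AI Alibaba source, I wanted to use it directly. It turned out that spring-ai-alibaba-graph 1.1.2.1 did not exist in Maven Central at all (404); 1.1.2.2 did, but the project had not upgraded to that version at the time.

In the end I wrote the rule-based router by hand. For a customer service scenario, rule routing has advantages of its own:

Stable latency, with no extra LLM call needed to make the routing decision
Predictable behavior, which makes it easier for QA to write test cases
Easy troubleshooting, because the logs show clearly which branch was taken

There is a downside too: adding a new intent type means changing code. Upgrading to LlmRoutingAgent will be re-evaluated when there is a chance.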
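
The predictability argument can be made concrete with a small, dependency-free version of the routing rules. This is a sketch under assumptions: the enums are trimmed-down copies of the article's types, `CALM` is a hypothetical value added so there is a non-angry case, and `route` returns the Agent's name instead of calling it.

```java
final class RuleRouterSketch {
    // Trimmed-down, hypothetical copies of the article's enums.
    enum EmotionLevel { CALM, ANGRY }

    enum IntentType { GENERAL, ORDER_QUERY, REFUND, LOGISTICS, COMPLAINT }

    private RuleRouterSketch() {}

    /** Same branch order as CustomerSupportRouter, returning the chosen Agent's name. */
    static String route(EmotionLevel emotion, IntentType intent) {
        // Emotion and complaints win over every other rule.
        if (emotion == EmotionLevel.ANGRY || intent == IntentType.COMPLAINT) {
            return "ComplaintAgent";
        }
        if (intent == IntentType.ORDER_QUERY
                || intent == IntentType.REFUND
                || intent == IntentType.LOGISTICS) {
            return "PostSalesAgent";
        }
        return "PreSalesAgent";
    }

    public static void main(String[] args) {
        System.out.println(route(EmotionLevel.ANGRY, IntentType.REFUND));  // ComplaintAgent
        System.out.println(route(EmotionLevel.CALM, IntentType.REFUND));   // PostSalesAgent
        System.out.println(route(EmotionLevel.CALM, IntentType.GENERAL));  // PreSalesAgent
    }
}
```

Because the function is pure, every branch can be covered by a table of (emotion, intent) pairs without mocking an LLM.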

Intent recognition fallback strategy

LLM recognition is not 100% reliable, so a confidence threshold acts as the safety net:

private IntentClassifier.IntentResult classifyIntent(String message, String sessionId) {
    try {
        IntentClassifier.IntentResult result = intentClassifier.classify(message, sessionId);
        // Confidence too low: fall back to keyword matching
        if (result == null || result.confidence() < 0.3) {
            log.warn("Intent confidence too low ({}), falling back to keyword matching",
                     result != null ? result.confidence() : "null");
            return quickRuleMatch(message);
        }
        return result;
    } catch (Exception e) {
        log.warn("Intent recognition failed, falling back to keyword matching", e);
        return quickRuleMatch(message);
    }
}

private IntentClassifier.IntentResult quickRuleMatch(String message) {
    // "退款" = refund, "退货" = return goods
    if (message.contains("退款") || message.contains("退货")) {
        return IntentClassifier.IntentResult.of(IntentType.REFUND, 0.8);
    }
    // "订单" = order, "快递" = courier, "物流" = logistics
    if (message.contains("订单") || message.contains("快递") || message.contains("物流")) {
        return IntentClassifier.IntentResult.of(IntentType.ORDER_QUERY, 0.8);
    }
    // "转人工" = transfer to a human, "人工客服" = human agent
    if (message.contains("转人工") || message.contains("人工客服")) {
        return IntentClassifier.IntentResult.of(IntentType.HUMAN_TRANSFER, 0.95);
    }
    // Nothing matched: treat it as a general inquiry
    return IntentClassifier.IntentResult.of(IntentType.GENERAL, 0.5);
}

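Pitfall: do not skip the keyword fallback. Under high concurrency GLM-4-Flash occasionally returns incomplete JSON, at which point parsing confidence fails; without the fallback, that goes straight to an NPE.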
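
The failure modes above (a null result, a missing confidence, a thrown parse error) can all be exercised without a model. The following is a minimal sketch, assuming a simplified `IntentResult` with a nullable `Double` confidence; the record, the `classify` helper, and the English keywords are illustrative stand-ins for the article's Chinese keyword rules, not the project's classes.

```java
import java.util.function.Supplier;

final class IntentFallbackSketch {

    /** Confidence is a nullable Double so a half-parsed response can be represented. */
    record IntentResult(String intent, Double confidence) {}

    static final double MIN_CONFIDENCE = 0.3;

    private IntentFallbackSketch() {}

    /** Runs the classifier and falls back to keyword rules on any weak or broken result. */
    static IntentResult classify(Supplier<IntentResult> llmClassifier, String message) {
        try {
            IntentResult result = llmClassifier.get();
            if (result == null
                    || result.confidence() == null
                    || result.confidence() < MIN_CONFIDENCE) {
                return keywordMatch(message);
            }
            return result;
        } catch (RuntimeException e) {
            // For example, a JSON parse error caused by a truncated model response.
            return keywordMatch(message);
        }
    }

    /** Keyword rules; English keywords stand in for the article's Chinese ones. */
    static IntentResult keywordMatch(String message) {
        if (message.contains("refund")) {
            return new IntentResult("REFUND", 0.8);
        }
        if (message.contains("order")) {
            return new IntentResult("ORDER_QUERY", 0.8);
        }
        return new IntentResult("GENERAL", 0.5);
    }
}
```

Checking `confidence() == null` explicitly keeps the normal path free of exceptions; the `catch` block then only has to cover genuine classifier failures.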

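2. Multi-turn memory: replacing hand-written Redis with Spring AI's official implementation

The original problem

The old system manually called agent.recordMemory() in ConversationService after every turn to write to Redis, and then re-injected the history when CustomerServiceAgent was constructed. With memory managed in two places, an odd bug occasionally appeared where the second turn of a conversation did not get the history.

After switching to Spring AI's official MessageChatMemoryAdvisor, all reads and writes are handled automatically by the Advisor.

Understand the three layers before you start

Look at this relationship diagram first; otherwise the pieces are easy to mix up: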

MessageChatMemoryAdvisor
  before: reads the history and injects it into the prompt
  after: writes this turn's messages
  │
  ▼
MessageWindowChatMemory (implements ChatMemory)
  window truncation: maxMessages
  │
  ▼
JdbcChatMemoryRepository (implements ChatMemoryRepository)
  actual reads and writes against PostgreSQL

Three interfaces, three layers of responsibility. Do not assign a JdbcChatMemoryRepository directly to a ChatMemory; they are different interfaces.
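
To make the split between the memory layer and the storage layer tangible, here is a dependency-free sketch. It is a simplified model under assumptions, not Spring AI's code: `Repo` and `WindowMemory` are hypothetical stand-ins for ChatMemoryRepository and MessageWindowChatMemory, messages are plain strings, and the window simply keeps the newest entries (the real class has additional rules this sketch ignores). The Advisor layer would sit on top, calling `get` before the model call and `add` after it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MemoryLayersSketch {

    /** Storage layer: plays the role of ChatMemoryRepository. */
    interface Repo {
        List<String> findByConversationId(String conversationId);

        void saveAll(String conversationId, List<String> messages);
    }

    /** In-memory stand-in for a JDBC-backed repository. */
    static final class InMemoryRepo implements Repo {
        private final Map<String, List<String>> store = new HashMap<>();

        @Override
        public List<String> findByConversationId(String conversationId) {
            return new ArrayList<>(store.getOrDefault(conversationId, List.of()));
        }

        @Override
        public void saveAll(String conversationId, List<String> messages) {
            store.put(conversationId, new ArrayList<>(messages));
        }
    }

    /** Memory layer: plays the role of ChatMemory and applies the window. */
    static final class WindowMemory {
        private final Repo repo;
        private final int maxMessages;

        WindowMemory(Repo repo, int maxMessages) {
            this.repo = repo;
            this.maxMessages = maxMessages;
        }

        void add(String conversationId, String message) {
            List<String> all = repo.findByConversationId(conversationId);
            all.add(message);
            // Keep only the newest maxMessages entries.
            int from = Math.max(0, all.size() - maxMessages);
            repo.saveAll(conversationId, all.subList(from, all.size()));
        }

        List<String> get(String conversationId) {
            return repo.findByConversationId(conversationId);
        }
    }
}
```

The point of the shape: `WindowMemory` owns the truncation policy and only wraps a `Repo`, which is why a repository cannot be used where a memory is expected.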

Configuration code

@Configuration
public class ChatMemoryConfig {

    @Bean
    public JdbcChatMemoryRepository jdbcChatMemoryRepository(JdbcTemplate jdbcTemplate) {
        // Creates the table automatically (idempotent), PostgreSQL dialect
        return JdbcChatMemoryRepository.builder()
                .jdbcTemplate(jdbcTemplate)
                .dialect(new PostgresChatMemoryRepositoryDialect())
                .build();
    }

    @Bean
    public ChatMemory chatMemory(JdbcChatMemoryRepository repository) {
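        // maxMessages comes from DB config so the ops console can adjust it (read once at startup).
        // dict is the project's config accessor (declaration omitted here); rounds * 2 = one user + one assistant message per round.
        int maxMessages = dict.getInt("session.history_max_rounds", 10) * 2;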
        return MessageWindowChatMemory.builder()
                .chatMemoryRepository(repository)
                .maxMessages(maxMessages)
                .build();
    }
}

The auto-created table structure:

CREATE TABLE IF NOT EXISTS SPRING_AI_CHAT_MEMORY (
    conversation_id  VARCHAR(255)  NOT NULL,
    content          TEXT          NOT NULL,
    type             VARCHAR(50)   NOT NULL,   -- USER / ASSISTANT / SYSTEM
    "timestamp"      TIMESTAMP     NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- timestamp is an SQL keyword; double-quote the column name to avoid conflicts
CREATE INDEX IF NOT EXISTS idx_chat_memory_conv_id
    ON SPRING_AI_CHAT_MEMORY (conversation_id, "timestamp");

Agent-side usage (minimal)

public String chatWithTools(String userMessage, String systemPrompt, String sessionId) {
    var memoryAdvisor = MessageChatMemoryAdvisor.builder(chatMemory)
            .conversationId(sessionId)   // sessionId keeps each session's history separate
            .build();

    return chatClient.prompt()
            .system(systemPrompt)
            .user(userMessage)
            .advisors(
                    memoryAdvisor,   // inject the history first, then run RAG
                    ragAdvisor
            )
            .tools(tools)
            .call()
            .content();
}

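All the manual recordMemory() calls are deleted, and ConversationService no longer has to deal with this.

Pitfall: in Spring AI 1.1.x, MessageChatMemoryAdvisor.Builder has no windowSize() method. The window size is controlled by maxMessages on MessageWindowChatMemory, so do not look for this setting at the Advisor layer.

Pitfall roundup: ChatMemory dependencies

Before this change I mixed up Spring AI's ChatMemory-related classes several times. Here is the relationship table, to save you the trouble: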

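Class/interface	Jar	Responsibility
`ChatMemory`	`spring-ai-model`	Top-level interface; the Advisor depends on it
`MessageWindowChatMemory`	`spring-ai-model`	`ChatMemory` implementation, responsible for window truncation
`ChatMemoryRepository`	`spring-ai-model`	Storage-layer interface
`JdbcChatMemoryRepository`	`spring-ai-model-chat-memory-repository-jdbc`	JDBC storage implementation
`MessageChatMemoryAdvisor`	`spring-ai-client-chat`	Advisor; reads and writes history automatically

Only this one dependency needs to go into the pom; the rest comes in through transitive dependencies: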

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-model-chat-memory-repository-jdbc</artifactId>
</dependency>

The spring-ai-model-chat-memory artifact does not need to be added separately; ChatMemory and MessageWindowChatMemory both live in spring-ai-model.

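3. Sensitive-word filtering: two-way filtering, configuration stored in the DB

Design approach

The sensitive-word list lives in the system_config table (as JSON), is loaded and cached by SensitiveWordService, and is applied at two points in ConversationService: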

User input → SensitiveWordService: filter(userInput)
SensitiveWordService → Agent: filtered text
Agent → SensitiveWordService: filter(aiResponse)
SensitiveWordService → AI output: filtered reply

Core implementation

@Service
public class SensitiveWordServiceImpl implements SensitiveWordService {

    // Loaded from the DB, cached in memory
    private volatile Set<String> sensitiveWords = new HashSet<>();

    @Override
    public String filter(String text) {
        if (!isEnabled() || text == null) return text;

        String result = text;
        String replaceChar = getReplaceChar();  // defaults to "***"
        for (String word : sensitiveWords) {
            result = result.replace(word, replaceChar);
        }
        return result;
    }

    @Override
    public boolean containsSensitiveWord(String text) {
        if (!isEnabled() || text == null) return false;
        return sensitiveWords.stream().anyMatch(text::contains);
    }
}
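
One detail a plain replace loop leaves open is overlap: when one listed word is a prefix of another, the result depends on iteration order. The sketch below shows one possible variation that tries longer words first. It is not the implementation above; the class name, method signature, and English sample words are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

final class LongestFirstFilterSketch {
    private LongestFirstFilterSketch() {}

    /** Replaces every listed word with the mask, trying longer words first. */
    static String filter(String text, Set<String> words, String mask) {
        if (text == null) {
            return null;
        }
        List<String> ordered = new ArrayList<>(words);
        // Longest first, so "badword" is masked before its prefix "bad".
        ordered.sort(Comparator.comparingInt(String::length).reversed());
        String result = text;
        for (String word : ordered) {
            // An empty entry would match between every character, so skip it.
            if (!word.isEmpty()) {
                result = result.replace(word, mask);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(filter("a badword here", Set.of("bad", "badword"), "***"));
    }
}
```

With shortest-first ordering the same input would come out as `a ***word here`, leaking part of the longer word. For large word lists a trie or Aho-Corasick matcher is the usual next step, which is beyond this sketch.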

Embedding it in ConversationService

public String process(String userId, String source, String message) {
    // 0. Filter the user input
    String filteredMessage = message;
    if (sensitiveWordService.isEnabled() && sensitiveWordService.isFilterUserInput()) {
        if (sensitiveWordService.containsSensitiveWord(message)) {
            filteredMessage = sensitiveWordService.filter(message);
            log.warn("User input contained sensitive words and was filtered");
        }
    }

    // ... normal conversation flow (works on filteredMessage) ...

    // 9. Filter the AI output
    if (sensitiveWordService.isEnabled() && sensitiveWordService.isFilterAiOutput()) {
        response = sensitiveWordService.filter(response);
    }

    return response;
}

DB configuration example:

INSERT INTO system_config (config_type, config_key, config_value) VALUES
('JSON', 'sensitive.words',       '["违禁词1","违禁词2"]'),
('JSON', 'sensitive.replace_char','***'),
('JSON', 'sensitive.enabled',     'true'),
('JSON', 'sensitive.filter_user_input',  'true'),
('JSON', 'sensitive.filter_ai_output',   'true');

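Pitfall: the first load of the sensitive-word list depends on the DB. If the DB has no data yet when @PostConstruct runs, sensitiveWords ends up as an empty set. It is worth adding an isEmpty() check to the init method and logging a warn when the set is empty.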
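
A minimal sketch of that check, assuming the list is fetched through a supplier so the same method can serve both the startup load and a later refresh. The class name, the `reload` method, and the use of `java.util.logging` are assumptions for illustration; the project itself logs through Lombok's `@Slf4j`.

```java
import java.util.Collection;
import java.util.Set;
import java.util.function.Supplier;
import java.util.logging.Logger;

final class SensitiveWordCacheSketch {
    private static final Logger LOG = Logger.getLogger("SensitiveWordCacheSketch");

    // Readers always see either the old set or the new one, never a half-filled set.
    private volatile Set<String> words = Set.of();

    /** Loads the list from the given source and swaps it in; returns the new size. */
    int reload(Supplier<Collection<String>> source) {
        Collection<String> loaded = source.get();
        Set<String> fresh = (loaded == null) ? Set.of() : Set.copyOf(loaded);
        if (fresh.isEmpty()) {
            // Warn but keep serving: an empty list must not block normal requests.
            LOG.warning("Sensitive-word list is empty; filtering is effectively off");
        }
        words = fresh;
        return fresh.size();
    }

    boolean contains(String text) {
        return text != null && words.stream().anyMatch(text::contains);
    }
}
```

Building an immutable copy and assigning it to the `volatile` field in one step means concurrent `contains` calls need no lock.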

4. The full conversation flow: tying the three pieces together

The complete execution chain of ConversationService.process() after the upgrade:

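User input
  │
  ▼
Sensitive-word filtering (user input)
  │
  ▼
Session create/restore (Redis + PostgreSQL)
  │
  ▼
Emotion pre-check (ANGRY → transfer straight to a human, skipping the Agent)
  │
  ▼
Intent recognition (GLM few-shot → accepted if confidence ≥ 0.3, otherwise keyword fallback)
  │
  ▼
ConversationContext bound to a ThreadLocal (Tools read userId/sessionId from here)
  │
  ▼
CustomerSupportRouter.chat()
  ├─ ComplaintAgent (emotion or complaint intent)
  ├─ PostSalesAgent (orders/refunds/logistics)
  └─ PreSalesAgent (products/knowledge base/everything else)
      │
      ├─ MessageChatMemoryAdvisor (reads SPRING_AI_CHAT_MEMORY)
      ├─ KnowledgeRetrievalAdvisor (RAG retrieval)
      └─ @Tool calls (ReAct style)
  │
  ▼
Sensitive-word filtering (AI output)
  │
  ▼
Record messages to PostgreSQL + return the reply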
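
The ThreadLocal step in the flow above deserves care on pooled threads, where a value left behind leaks into the next request. Below is a minimal sketch of a bind-and-always-clear helper. The holder class, the `Context` record, and `callWith` are hypothetical names for illustration and not the project's ConversationContext.

```java
import java.util.function.Supplier;

final class ConversationContextSketch {

    /** What a tool needs to know about the current request. */
    record Context(String userId, String sessionId) {}

    private static final ThreadLocal<Context> CURRENT = new ThreadLocal<>();

    private ConversationContextSketch() {}

    /** Binds the context for the duration of the call, then always clears it. */
    static <T> T callWith(Context context, Supplier<T> body) {
        CURRENT.set(context);
        try {
            return body.get();
        } finally {
            // Pooled threads are reused; a leftover value would leak into the next request.
            CURRENT.remove();
        }
    }

    /** Called from inside a tool; returns null when no request is bound. */
    static Context current() {
        return CURRENT.get();
    }
}
```

A caveat that applies to any ThreadLocal: if a tool runs on a different thread than the one that bound the context, it will not see the value.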

Fallback chain design

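try {
    // Main path: Router + Agent
    response = router.chat(routingContext, message, emotionStrategy, sessionId);
} catch (Exception e) {
    log.warn("Router failed, falling back to keyword rule routing", e);
    try {
        // First-level fallback: keyword rule routing
        response = fallbackRoute(userId, message, emotion);
    } catch (Exception e2) {
        log.error("Keyword fallback failed as well, transferring to a human", e2);
        // Second-level fallback: transfer straight to a human.
        // The literals are user-facing Chinese text: "System error, transferring you to a human agent."
        response = "系统异常，已为您转接人工客服。\n"
                 + humanTransferTool.transferToHuman("系统异常，自动转人工");
    }
}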
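
Nested try/catch works for two levels; if the chain grows, the same idea can be written as an ordered list of steps. This is a sketch of one possible shape rather than the project's code: `firstSuccessful` is a hypothetical helper, and it catches only `RuntimeException` because `Supplier` cannot throw checked exceptions.

```java
import java.util.List;
import java.util.function.Supplier;

final class FallbackChainSketch {
    private FallbackChainSketch() {}

    /** Tries each step in order and returns the first reply that does not throw. */
    static String firstSuccessful(List<Supplier<String>> steps, String lastResort) {
        for (Supplier<String> step : steps) {
            try {
                return step.get();
            } catch (RuntimeException e) {
                // A real service would log the failure here before moving on.
            }
        }
        // Every step failed: hand back the fixed last-resort reply.
        return lastResort;
    }

    public static void main(String[] args) {
        Supplier<String> router = () -> { throw new IllegalStateException("router down"); };
        Supplier<String> keywordRules = () -> "reply from keyword rules";
        System.out.println(firstSuccessful(List.of(router, keywordRules), "transfer to human"));
    }
}
```

The last resort is a plain value rather than another step so that the final answer cannot itself throw.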

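This article's design vs. before the change

Dimension	Before	After
Number of Agents	1 do-everything Agent	3 specialized Agents + rule routing
Prompt length	500+ characters, a grab bag	≤ 200 characters per Agent, focused and clear
Multi-turn memory	Hand-written Redis + manual recordMemory()	Spring AI's official JdbcChatMemoryRepository, read and written automatically by the Advisor
Memory dependency	Redis (extra deployment cost)	PostgreSQL, shared with the business DB
Sensitive words	None	Two-way filtering, configuration in the DB, manageable from the ops console
Intent recognition	LLM as a single point of recognition	LLM + keyword fallback as a double safeguard
Fallback strategy	None	Three levels: Router failure → keyword rules → human transfer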

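A few reflections after finishing

On the multi-Agent split: at first 3 Agents felt like over-engineering. Once it was running, the most obvious benefit was that debugging became much easier. When the complaint Agent says something odd, I only need to look at ComplaintAgent's Prompt and context, instead of hunting for the cause in a 500-plus-character grab-bag Prompt.

On Spring AI ChatMemory: the official wrapper is less work than writing Redis code yourself, provided you understand the three-layer interface structure. The documentation is not very intuitive here, and many people (me included) first try to use JdbcChatMemoryRepository directly as a ChatMemory and then hit a compile error. I suggest printing that relationship diagram and sticking it on your desk.

On sensitive-word filtering: it looks simple, but after building it a few operational details turned out to matter:

Scenario	Suggested handling
User message contains a sensitive word	Replace it and reply normally; do not refuse outright (avoids false positives hurting users)
AI output contains a sensitive word	Replace it before returning, and log a warn (so operations can review it)
Sensitive-word list is empty	Log a warn prompting the admin to configure it; do not block normal requests
New sensitive word added	After updating the DB, a cache refresh must be triggered, otherwise it only takes effect after a service restart

The last item, cache refresh, is the part that is not finished yet: the SensitiveWordController admin endpoints and the cache-refresh API are scheduled for the next iteration.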

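How to get the source code

Follow the WeChat official account「亦暖筑序」and use the bottom menu【获取源码】(Get Source Code); the Gitee repository can be pulled directly.

Besides what this article covers, the source also includes:

The complete KnowledgeRetrievalAdvisor implementation (hybridSearch: vector + keyword dual-path retrieval)
The few-shot Prompt template for intent recognition (with slot-extraction logic, 6 intent types)
The SensitiveWordService admin endpoints (create, read, update, and delete sensitive words; still in progress)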

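Appendix: pitfall quick-reference table

Here are the pitfalls this article touched on, collected for quick lookup:

Pitfall	Symptom	Fix
Using 5 Agents	`IntentType.RAG` covers several Agents at once, so some Agents are never routed to	Trim to 3, so that every IntentType maps to exactly one Agent
`spring-ai-alibaba-graph` 1.1.2.1	Maven download returns 404	Use the hand-written `CustomerSupportRouter` rule router instead
Assigning `JdbcChatMemoryRepository` to `ChatMemory`	Compile error: incompatible types	Wrap it in `MessageWindowChatMemory` first, then assign that to `ChatMemory`
`windowSize()` not found on `MessageChatMemoryAdvisor`	Compile error: cannot find method	The window is controlled by `MessageWindowChatMemory.maxMessages`, not at the Advisor layer
`spring-ai-model-chat-memory` fails to download	Cannot find symbol `ChatMemory`	It actually lives in `spring-ai-model`; no separate dependency is needed
Table creation fails on `timestamp`	Conflict with the `timestamp` keyword in PG	Double-quote the column name: `"timestamp"`
Sensitive-word list is empty at startup	The DB has no data, or an initialization-order problem	Add an `isEmpty()` check and log a warn; do not block requests
LLM intent confidence fails to parse	Incomplete JSON returned under high concurrency, then an NPE	Add try-catch + keyword fallback; use rule matching when confidence is null
Lombok @Slf4j fails to compile with 36 errors	Looks as if the annotation processor did not run	The root cause was duplicate method definitions in the code; fix the code errors first, then look at the annotation processor

Every row in this table is something I actually ran into, not padding.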

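Next steps

That wraps up this article. The main path of the system is now reasonably complete.

With the main path working, though, the first thing to add is not RAG but security: an AI endpoint with no authentication and no rate limiting is essentially running naked in production.

The next article is dedicated to that:

[04] Hardening the AI Customer Service System: JWT Authentication + Bucket4j Three-Layer Rate Limiting

It covers wiring a JWT Filter chain into Spring Security, three-layer token-bucket rate limiting (global / per user / LLM endpoint), a request-tracing Filter, and strict validation of production secrets. All of this is already fully implemented in the project, and the next article takes it apart layer by layer.

The RAG knowledge base (vector retrieval, document chunking, hybrid retrieval) and the human-transfer flow (HumanTransferTool + the receiving side of the agent workbench) are planned as separate later articles.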