Spring AI (3): Chat Memory

Introduction to Chat Memory

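Large language models (LLMs) are stateless: they retain no information about previous interactions. To address this, Spring AI provides Chat Memory, which lets an application store and retrieve information across multiple interactions with an LLM.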

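The underlying storage for chat memory is handled by ChatMemoryRepository, whose sole responsibility is to store and retrieve messages. Deciding which messages to keep and when to remove them is up to the ChatMemory implementation. Possible strategies include keeping the last N messages, keeping messages for a certain time period, or keeping messages up to a maximum token count.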

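ChatMemory currently has one implementation: MessageWindowChatMemory. It maintains a window of messages up to a specified maximum size (default: 20 messages). When the number of messages exceeds this limit, older messages are evicted, but system messages are preserved. If a new system message is added, all previous system messages are removed from the memory. This keeps the most recent context available to the conversation while bounding memory usage.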
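The eviction rules described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the MessageWindowChatMemory source: the `Msg` and `WindowSketch` types and the role strings are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical message type for this sketch; not Spring AI's Message. */
final class Msg {
    final String role;
    final String text;

    Msg(String role, String text) {
        this.role = role;
        this.text = text;
    }

    boolean isSystem() {
        return "system".equals(role);
    }
}

/** Illustrative sliding window; an assumption-based sketch of the documented rules. */
final class WindowSketch {
    private final int maxMessages;
    private final List<Msg> window = new ArrayList<>();

    WindowSketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be positive");
        }
        this.maxMessages = maxMessages;
    }

    void add(Msg incoming) {
        // A new system message replaces every earlier system message.
        if (incoming.isSystem()) {
            window.removeIf(Msg::isSystem);
        }
        window.add(incoming);

        // Evict the oldest non-system messages until the window fits.
        int i = 0;
        while (window.size() > maxMessages && i < window.size()) {
            if (window.get(i).isSystem()) {
                i++;              // system messages are preserved
            } else {
                window.remove(i); // drop the oldest remaining non-system message
            }
        }
    }

    List<Msg> messages() {
        return new ArrayList<>(window); // defensive copy
    }
}
```

With a window of 3, adding one system message followed by three chat messages leaves the system message plus the two most recent chat messages.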

Using Chat Memory with a ChatModel

What happens without Chat Memory

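    @GetMapping("/chat")
    public String chat() {
        // First call: tell the model who we are.
        String answer = this.chatModel.call("Hello, I am 老任与码");
        System.out.println(answer);

        // Second call: sent on its own, without the first exchange.
        String answer2 = this.chatModel.call("Who am I?");
        System.out.println(answer2);
        return "success";
    }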

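Output:

The output shows that although the first prompt told the model the user's name, the model cannot give the correct answer to the second prompt. In this example the two calls are independent of each other: the model is stateless.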

Using Chat Memory

A ChatMemory object can be injected directly, with no configuration:

@RestController
@RequestMapping("/memory")
public class MemoryController {

    @Resource
    private ZhiPuAiChatModel chatModel;

    @Resource
    private ChatClient chatClient;

    // Can be injected directly, or created by a custom bean definition
    @Resource
    private ChatMemory chatMemory;

    ......

}

Alternatively, create the ChatMemory object in a configuration class and then inject it:

    @Bean
    public ChatMemory chatMemory() {
        MessageWindowChatMemory memory = MessageWindowChatMemory.builder()
                .maxMessages(10)
                .chatMemoryRepository(new InMemoryChatMemoryRepository())
                .build();
        return memory;
    }

This example builds the ChatMemory object from MessageWindowChatMemory. maxMessages(10) sets the maximum number of stored messages to 10, and chatMemoryRepository(new InMemoryChatMemoryRepository()) keeps the conversation context in memory.

Test code:

    @GetMapping("/chat3")
    public String chat3(String conversationId) {
        UserMessage userMessage1 = new UserMessage("Hello, I am 老任与码" + conversationId);
        // Add the message to the ChatMemory.
        // conversationId identifies the conversation with the model; a user id is a common choice,
        // so that each user's chat context is kept separate.
        chatMemory.add(conversationId, userMessage1);
        // Ask the model, sending the stored history as the prompt.
        ChatResponse response1 = this.chatModel.call(new Prompt(chatMemory.get(conversationId)));
        System.out.println("answer1:" + response1.getResult().getOutput().getText());
        // Add the model's answer to the ChatMemory as well.
        chatMemory.add(conversationId, response1.getResult().getOutput());

        // The second question must also be added to the ChatMemory.
        UserMessage userMessage2 = new UserMessage("Who am I?");
        chatMemory.add(conversationId, userMessage2);
        System.out.println(chatMemory.get(conversationId));
        ChatResponse response2 = this.chatModel.call(new Prompt(chatMemory.get(conversationId)));
        System.out.println("answer2:" + response2.getResult().getOutput().getText());
        return "success";
    }

Output:

The output shows that the model used the conversation context to give a plausible answer.

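Inspecting the ChatMemory object in the debugger:

Debugging shows that the chat context held by the ChatMemory object is stored in a ConcurrentHashMap: the key is the conversationId, and the value is the list of question and answer messages from the conversation.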
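The storage layout seen in the debugger can be approximated with a small map-backed store. This is only a sketch under stated assumptions, not the InMemoryChatMemoryRepository source: the class name `InMemoryStoreSketch`, its method names, and the use of plain strings for messages are choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative store only; names and types are assumptions of this sketch. */
final class InMemoryStoreSketch {
    // key: conversationId, value: that conversation's messages in order
    private final Map<String, List<String>> store = new ConcurrentHashMap<>();

    List<String> findByConversationId(String conversationId) {
        List<String> messages = store.get(conversationId);
        // Unknown conversations yield an empty history rather than null.
        return messages == null ? Collections.<String>emptyList() : new ArrayList<>(messages);
    }

    void saveAll(String conversationId, List<String> messages) {
        // Replace the stored history with a defensive copy.
        store.put(conversationId, new ArrayList<>(messages));
    }

    void deleteByConversationId(String conversationId) {
        store.remove(conversationId);
    }
}
```

Because every read and write goes through the conversationId key, two users with different ids never see each other's history, which is the isolation the test endpoint relies on.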

Storage options supported by ChatMemory

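InMemoryChatMemoryRepository: stores context messages in memory
JdbcChatMemoryRepository: stores messages in a relational database via JDBC; supported databases include PostgreSQL, MySQL / MariaDB, SQL Server, and HSQLDB
CassandraChatMemoryRepository: stores messages in the Apache Cassandra distributed database
Neo4jChatMemoryRepository: stores chat messages as nodes and relationships in a Neo4j graph database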

Using Chat Memory with a ChatClient

To use ChatMemory with a ChatClient, specify the corresponding Advisor. The supported Advisors are:

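MessageChatMemoryAdvisor: manages conversation memory using the provided ChatMemory implementation. On each interaction, it retrieves the conversation history from memory and includes it in the prompt as a collection of messages.
PromptChatMemoryAdvisor: manages conversation memory using the provided ChatMemory implementation. On each interaction, it retrieves the conversation history from memory and appends it to the system prompt (SystemMessage) as plain text.
VectorStoreChatMemoryAdvisor: manages conversation memory using the provided VectorStore implementation. On each interaction, it retrieves the conversation history from the vector store and appends it to the system message as plain text.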
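The difference between the two injection styles in the list above (history as separate messages versus history flattened into the system text) can be sketched without any framework code. Everything here is hypothetical: the `Turn` and `HistoryInjectionSketch` types and the "Conversation so far:" wording are inventions of this sketch, not Spring AI's classes or prompt template.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical role/text pair for this sketch. */
final class Turn {
    final String role;
    final String text;

    Turn(String role, String text) {
        this.role = role;
        this.text = text;
    }
}

final class HistoryInjectionSketch {

    /** Message-list style: history entries stay separate messages ahead of the new user turn. */
    static List<Turn> asMessages(List<Turn> history, String userText) {
        List<Turn> prompt = new ArrayList<>(history);
        prompt.add(new Turn("user", userText));
        return prompt;
    }

    /** System-text style: history is flattened to plain text and appended to the system text. */
    static List<Turn> asSystemText(String systemText, List<Turn> history, String userText) {
        StringBuilder sb = new StringBuilder(systemText);
        sb.append("\n\nConversation so far:"); // wording is this sketch's own assumption
        for (Turn t : history) {
            sb.append('\n').append(t.role).append(": ").append(t.text);
        }
        List<Turn> prompt = new ArrayList<>();
        prompt.add(new Turn("system", sb.toString()));
        prompt.add(new Turn("user", userText));
        return prompt;
    }
}
```

The first style preserves each message's role for the model, while the second sends only a system message and the new user message, with the history carried as text.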

This example uses MessageChatMemoryAdvisor:

    @Bean
    public ChatClient chatClient(ZhiPuAiChatModel chatModel) {
        return ChatClient
                .builder(chatModel)
                // MessageChatMemoryAdvisor: the advisor that provides chat memory
                .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory()).build())
                .build();
    }

Test code:

    @GetMapping("/chat4")
    public String chat4(String conversationId) {
        String answer1 = chatClient.prompt()
                .user("My name is 老任与码")
                .advisors(a -> a.param(AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY, conversationId))
                .call()
                .content();

        System.out.println(answer1);

        String answer2 = chatClient.prompt()
                .user("What is my name?")
                .advisors(a -> a.param(AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY, conversationId))
                .call()
                .content();

        System.out.println(answer2);

        System.out.println(chatMemory.get(conversationId));

        return "success";
    }

In the code above, the following must be specified:

advisors(a -> a.param(AbstractChatMemoryAdvisor.CHAT_MEMORY_CONVERSATION_ID_KEY, conversationId))

It tells the advisor which conversationId to use, so that each conversation's context is kept separate.