Building a RAG System with pipecat and mem0

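Introduction

This post walks through reference code that shows how to use memory inside the pipecat framework to implement RAG. The reference code is at: github.com/pipecat-ai/...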

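The example relies on a tool called mem0. Its GitHub repository and website are listed below.

GitHub: github.com/mem0ai/mem0

Website: mem0.ai/

In essence, mem0 is a vector store for persisting memories. It can be used in two ways: as a cloud service, or with the vector store kept locally.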
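To make the "vector store for memories" idea concrete, here is a minimal sketch. It is not the mem0 API: `ToyMemoryStore`, `embed`, and the `limit` and `threshold` arguments are hypothetical names, and the bag-of-words "embedding" is only a stand-in for a real embedding model. It shows the basic loop of storing text as vectors and retrieving the entries most similar to a query.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemoryStore:
    """Hypothetical in-memory store for illustration; not the mem0 client."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        # Store the raw text together with its vector.
        self._items.append((text, embed(text)))

    def search(self, query: str, limit: int = 10, threshold: float = 0.3):
        # Score every memory, drop weak matches, return the best ones first.
        q = embed(query)
        scored = [(text, cosine(q, vec)) for text, vec in self._items]
        hits = [(text, score) for text, score in scored if score >= threshold]
        hits.sort(key=lambda pair: pair[1], reverse=True)
        return hits[:limit]


store = ToyMemoryStore()
store.add("the user's dog is named Rex")
store.add("home city is Berlin")
results = store.search("what is the name of the user's dog")
```

A real deployment would replace `embed` with a learned embedding model and the list scan with an indexed vector database, but the add/search shape stays the same.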

Usage

The code below shows how mem0 is used in pipecat. Two options are provided: cloud and local. The local option uses ChromaDB, with data stored under ./chroma.

    # =====================================================================
    # OPTION 1: Using Mem0 API (cloud-based approach)
    # This approach uses Mem0's cloud service for memory management
    # Requires: MEM0_API_KEY set in your environment
    # =====================================================================
    memory = Mem0MemoryService(
        api_key=os.getenv("MEM0_API_KEY"),  # Mem0 cloud API key
        user_id=USER_ID,                    # unique identifier for the user
        agent_id="agent1",                  # identifier for the agent (optional)
        run_id="session1",                  # identifier for the session (optional)
        params=Mem0MemoryService.InputParams(
            search_limit=10,                # retrieve at most 10 memories
            search_threshold=0.3,           # relevance threshold (0-1)
            api_version="v2",               # API version
            system_prompt="Based on previous conversations, I recall: \n\n",  # prefix for recalled memories
            add_as_system_message=True,     # inject the memories as a system message
            position=1,                     # insert position (after the first message)
        ),
    )
    # =====================================================================
    # OPTION 2: Using Mem0 with local configuration (self-hosted approach)
    # This approach uses a local LLM configuration for memory management
    # Requires: Anthropic API key if using Claude model
    # =====================================================================
    # Uncomment the following code and comment out the previous memory initialization to use local config

    # local_config = {
    #     "llm": {
    #         "provider": "anthropic",
    #         "config": {
    #             "model": "claude-3-5-sonnet-20240620",
    #             "api_key": os.getenv("ANTHROPIC_API_KEY"),  # Make sure to set this in your .env
    #         }
    #     },
    #     "embedder": {
    #         "provider": "openai",
    #         "config": {
    #             "model": "text-embedding-3-large"
    #         }
    #     }
    # }

    # # Initialize Mem0 memory service with local configuration
    # memory = Mem0MemoryService(
    #     local_config=local_config,  # Use local LLM for memory processing
    #     user_id=USER_ID,            # Unique identifier for the user
    #     # agent_id="agent1",        # Optional identifier for the agent
    #     # run_id="session1",        # Optional identifier for the run
    # )

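Pipeline task

Audio and video data travel over the transport (for example Daily/WebRTC), while RTVI mainly carries control information and metadata.

In this setup, the full aggregated history is sent to the LLM on every call.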

pipeline = Pipeline([
    transport.input(),      # 1. receive audio
    rtvi,                   # 2. RTVI event handling
    stt,                    # 3. speech -> text
    user_aggregator,        # 4. aggregate user messages
    memory,                 # 5. Mem0 memory retrieval/storage
    llm,                    # 6. LLM generates the response
    tts,                    # 7. text -> speech
    transport.output(),     # 8. play audio
    assistant_aggregator,   # 9. aggregate assistant responses
])
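
The ordering matters: the memory stage sits before the LLM so that recalled memories are already attached when the model runs. The sketch below is purely conceptual and does not use pipecat's frame-processor API; `run_pipeline` and the three stage functions are hypothetical stand-ins that pass a plain dict from one stage to the next.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def run_pipeline(stages: list[Stage], frame: dict) -> dict:
    """Pass the frame through each stage in order, like a linear pipeline."""
    for stage in stages:
        frame = stage(frame)
    return frame


def fake_stt(frame: dict) -> dict:
    # Pretend transcription: decode the bytes as text.
    return {**frame, "text": frame["audio"].decode("utf-8")}


def fake_memory(frame: dict) -> dict:
    # Pretend retrieval: attach a fixed memory to the frame.
    return {**frame, "memories": ["User's name is John"]}


def fake_llm(frame: dict) -> dict:
    # Pretend generation: the reply can see both the text and the memories.
    count = len(frame["memories"])
    return {**frame, "reply": f"({count} memories) You said: {frame['text']}"}


out = run_pipeline([fake_stt, fake_memory, fake_llm], {"audio": b"hello"})
```

If `fake_memory` were placed after `fake_llm`, the LLM stage would fail with a missing `memories` key, which mirrors why the memory service comes before the LLM in the real pipeline.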

The request sent on each call looks like this:

# On every LLM call, the complete context is sent:
{
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "system", "content": "You are a personal assistant..."},
        {"role": "system", "content": "Based on memories: User's name is John..."},  # injected by Mem0
        {"role": "user", "content": "Hello!"},              # history
        {"role": "assistant", "content": "Hi John!"},       # history
        {"role": "user", "content": "What's my dog's name?"} # new message
    ]
}

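Key point: the aggregators are only responsible for collecting and storing messages; how those messages are sent to the LLM is decided by the LLM service. If the model service keeps conversation state, it only needs the newly aggregated message on each turn. If it is an ordinary stateless model, the entire history has to be sent every time, as in the example above.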
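
As a rough illustration of the stateless case, the sketch below assembles the message list for one call and inserts the recalled memories as a system message at a given position. `build_messages` is a hypothetical helper, not part of pipecat or mem0; its `memory_prefix` and `position` arguments simply mirror the `system_prompt` and `position` settings shown earlier.

```python
def build_messages(
    system_prompt: str,
    history: list[dict],
    new_user_text: str,
    memories: list[str],
    memory_prefix: str = "Based on previous conversations, I recall: \n\n",
    position: int = 1,
) -> list[dict]:
    """Assemble the full context for a stateless chat-completion call."""
    messages = [
        {"role": "system", "content": system_prompt},
        *history,
        {"role": "user", "content": new_user_text},
    ]
    if memories:
        # Join the memories into one system message and insert it at `position`.
        content = memory_prefix + "\n".join(f"- {m}" for m in memories)
        messages.insert(position, {"role": "system", "content": content})
    return messages


messages = build_messages(
    system_prompt="You are a personal assistant...",
    history=[
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi John!"},
    ],
    new_user_text="What's my dog's name?",
    memories=["User's name is John"],
)
```

With `position=1` the memory message lands right after the main system prompt, so the conversation history and the new user message keep their original order.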


Task evaluation:

task = PipelineTask(
    pipeline,
    params=PipelineParams(
        enable_metrics=True,        # collect performance metrics (TTFB, etc.)
        enable_usage_metrics=True,  # collect token usage
    ),
    idle_timeout_secs=...,          # idle timeout
    observers=[RTVIObserver(rtvi)], # RTVI observer
)

Message events

@rtvi.event_handler("on_client_ready")
async def on_client_ready(rtvi):
    # 1. Tell the client that the bot is ready
    await rtvi.set_bot_ready()

    # 2. Fetch a personalized greeting (based on stored memories)
    greeting = await get_initial_greeting(
        memory_client=memory.memory_client,
        user_id=USER_ID,
        agent_id=None,   # do not filter by agent
        run_id=None      # do not filter by session
    )

    # 3. Add the greeting to the context (as an assistant message)
    context.add_message({"role": "assistant", "content": greeting})

    # 4. Trigger an LLM run -> the greeting is spoken
    await task.queue_frames([LLMRunFrame()])
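
The body of `get_initial_greeting` is not shown here. As a hedged guess at the general idea only, the sketch below turns a list of recalled memory strings into a greeting. `build_greeting` and its `max_items` argument are hypothetical, and the step that actually fetches memories from the Mem0 client is deliberately left out.

```python
def build_greeting(memories: list[str], max_items: int = 3) -> str:
    """Compose a greeting from recalled memories (illustrative only)."""
    if not memories:
        # No history for this user: fall back to a generic greeting.
        return "Hello! How can I help you today?"
    # Mention only the first few memories to keep the greeting short.
    recalled = "; ".join(memories[:max_items])
    return (
        "Hello again! From our earlier conversations I remember: "
        f"{recalled}. How can I help you today?"
    )


first_time = build_greeting([])
returning = build_greeting(["your name is John", "your dog is named Rex"])
```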

Summary

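The core of a RAG system lies in how the vector store is built in a consistent way, and in matching the current query against the stored vectors with high precision.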