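Agent Workflow Orchestration

Runnable is the unified execution protocol in LangChain.

Prompts, models, output parsers, plain functions, branching logic, and parallel logic can all be treated as Runnables.

Keep one sentence in mind: Runnable is not a specific feature but a uniform way of calling things.

It standardizes the following:

invoke: process a single input.
batch: process multiple inputs in one call.
stream: return results incrementally.
config: attach configuration to the current run.
|: chain multiple Runnables into a pipeline.

This chaining style is also called LCEL, the LangChain Expression Language.

The point of LCEL is not new syntax. It expresses an Agent workflow as a chain that can be composed, traced, batched, and streamed.

The most typical formula is: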

Prompt | Model | Parser

Why Runnable is needed

Without Runnable, a minimal pipeline is usually written like this:

formatted_prompt = prompt.format(text=input_text)
response = model.invoke(formatted_prompt)
result = parser.invoke(response)

This code is imperative: you call each step by hand and pass each result to the next step yourself.

With Runnable, the same pipeline becomes:

chain = prompt | model | parser
result = chain.invoke({"text": input_text})

No step is skipped here. The steps are simply assembled into a chain ahead of time.

Input
  -> prompt.invoke(...)
  -> model.invoke(...)
  -> parser.invoke(...)
  -> Output

So chain.invoke(...) is the single entry point, and each Runnable inside the chain still runs in order.
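To make the "assemble first, run later" idea concrete, here is a toy sketch in plain Python. MiniRunnable is a hypothetical class written only for illustration. It is not how LangChain implements Runnable, but it shows how an overloaded | can build a new object whose invoke calls each step in order.

```python
from typing import Any, Callable


class MiniRunnable:
    """Toy stand-in for a Runnable; not LangChain's actual implementation."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "MiniRunnable") -> "MiniRunnable":
        # a | b builds a new object whose invoke runs a, then feeds the result to b.
        return MiniRunnable(lambda value: other.invoke(self.invoke(value)))


prompt = MiniRunnable(lambda data: f"Summarize: {data['text']}")
model = MiniRunnable(lambda text: text.upper())
parser = MiniRunnable(lambda text: text.strip())

# Nothing runs here; the three steps are only wired together.
chain = prompt | model | parser

# A single call on the chain triggers prompt, model, and parser in order.
result = chain.invoke({"text": "hello "})
print(result)
```

Building the chain and running it are separate moments, which is what lets the same chain object be reused for batch and stream calls later.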

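The real value of Runnable is more than writing less code.

Once every component is a Runnable, they all share the same capabilities:

Execution: invoke / batch / stream
Configuration: config
Composition: RunnableSequence / RunnableParallel / RunnableBranch
Enhancement: with_config / with_retry / with_fallbacks
Observability: with_listeners / callbacks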

Installing dependencies

uv add langchain-core

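Core Runnable APIs

Don't rush into the composition classes yet.

At its core, Runnable has three execution APIs:

invoke: process one input at a time
batch: process multiple inputs at once
stream: return output while processing

All three accept config as their second parameter.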

chain.invoke(input_data, config=config)
chain.batch(input_list, config=config)
chain.stream(input_data, config=config)

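Composition, retries, and history management are all built on top of these three entry points.

invoke

invoke is the most basic single call.

A plain function has no invoke method, so wrap it with RunnableLambda first.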

from langchain_core.runnables import RunnableLambda


def add_one(number: int) -> int:
    return number + 1


chain = RunnableLambda(add_one)

print(chain.invoke(1))
# 2

Once wrapped as a Runnable, it can take part in LangChain's unified execution system.

batch

batch runs the same chain over multiple inputs.

from langchain_core.runnables import RunnableLambda


chain = RunnableLambda(lambda text: text.upper())

print(chain.batch(["hello", "langchain", "runnable"]))
# ['HELLO', 'LANGCHAIN', 'RUNNABLE']

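batch fits scenarios such as:

Processing document chunks in bulk.
Generating summaries in bulk.
Calling a classification chain in bulk.

If the inputs do not depend on each other, prefer batch over writing your own loop of invoke calls. The default implementation runs the invoke calls concurrently in a thread pool, so I/O-bound chains usually finish sooner.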

stream

stream fits scenarios where output is returned as it is generated.

Here a plain generator function simulates streaming output.

from collections.abc import Iterator

from langchain_core.runnables import RunnableLambda


def stream_words(text: str) -> Iterator[str]:
    for word in text.split():
        yield word


chain = RunnableLambda(stream_words)

for chunk in chain.stream("LangChain Runnable stream"):
    print(chunk)

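With a real model, each chunk is usually a small piece of generated content.

In an Agent, stream is commonly used to:

Display the answer in the UI as it is generated.
Show live progress for long-running tasks.
Inspect intermediate steps while debugging a chain.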

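Run configuration: config

config is the run configuration accepted by every Runnable execution API.

It is not business input. It is extra information attached to this particular run.

Common fields look like this: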

config = {
    "configurable": {
        "locale": "zh-CN",
        "session_id": "user-001",
    },
    "tags": ["demo", "runnable"],
    "metadata": {
        "page": "agent-runnable",
    },
    "run_name": "translate_demo",
}

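configurable is the place for custom parameters this run needs, such as language, user id, or session id.

tags, metadata, and run_name are geared toward observability and tracing, making it easier to locate a run later in logs or in LangSmith.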

from langchain_core.runnables import RunnableConfig, RunnableLambda


translations = {
    "zh-CN": {"greeting": "你好"},
    "en-US": {"greeting": "Hello"},
}


def translate(key: str, config: RunnableConfig) -> str:
    locale = config.get("configurable", {}).get("locale", "zh-CN")
    return translations[locale].get(key, key)


chain = RunnableLambda(translate)

print(chain.invoke("greeting", config={"configurable": {"locale": "zh-CN"}}))
print(chain.invoke("greeting", config={"configurable": {"locale": "en-US"}}))

The business input is "greeting".

The run configuration is locale.

Keep the two separate.

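LCEL and the composition APIs

The core APIs answer "how to execute".

The composition APIs answer "how to organize multiple Runnables into one workflow".

There are four common kinds of composition:

Sequential: RunnableSequence / |
Parallel: RunnableParallel
Keeping intermediate results: RunnablePassthrough
Conditional branching: RunnableBranch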

Sequential composition: RunnableSequence

In sequential composition, the output of one step automatically becomes the input of the next.

input -> add_one -> multiply_two -> to_text -> output

from langchain_core.runnables import RunnableLambda


add_one = RunnableLambda(lambda number: number + 1)
multiply_two = RunnableLambda(lambda number: number * 2)
to_text = RunnableLambda(lambda number: f"Final result: {number}")

chain = add_one | multiply_two | to_text

print(chain.invoke(3))
# Final result: 8

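This is a RunnableSequence.

In Python, the most common way to write it is with |. You can also construct RunnableSequence explicitly, but | reads more naturally in everyday docs and application code.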

Prompt | Model | Parser

In real LangChain code, the most common Runnable chain is:

Prompt -> Model -> Parser

Here a plain function stands in for the model so the demo does not depend on a real LLM setup.

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda


prompt = PromptTemplate.from_template("Explain in one sentence: {concept}")


def fake_model(prompt_value) -> str:
    # PromptTemplate.invoke(...) returns a PromptValue, not a plain string.
    # to_string() converts it into the prompt text that is actually sent to the model.
    prompt_text = prompt_value.to_string()
    return f"Model received: {prompt_text}\nAnswer: Runnable is a unified execution protocol."


model = RunnableLambda(fake_model)
parser = StrOutputParser()

chain = prompt | model | parser

print(chain.invoke({"concept": "Runnable"}))

The data flow in this code is:

{"concept": "Runnable"}
  -> PromptTemplate
  -> fake_model
  -> StrOutputParser
  -> str

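PromptTemplate, RunnableLambda, and StrOutputParser all follow the Runnable protocol, so they can be chained with |.

Parallel composition: RunnableParallel

Sometimes one input needs to go through several pieces of processing at the same time.

For example, given a topic from the user, generate a title, a summary, and keywords together.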

from langchain_core.runnables import RunnableLambda, RunnableParallel


title = RunnableLambda(lambda text: f"Title: {text}")
summary = RunnableLambda(lambda text: f"Summary: this is an introduction to {text}.")
keywords = RunnableLambda(lambda text: ["Runnable", "LCEL", text])

chain = RunnableParallel(
    title=title,
    summary=summary,
    keywords=keywords,
)

print(chain.invoke("LangChain"))

The result is a dict:

{
    "title": "Title: LangChain",
    "summary": "Summary: this is an introduction to LangChain.",
    "keywords": ["Runnable", "LCEL", "LangChain"],
}

The difference between sequential and parallel:

RunnableSequence: previous output -> next input
RunnableParallel: same input -> multiple branches -> merged into a dict

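Keeping intermediate results: RunnablePassthrough

A sequential chain returns only the result of its last step by default.

If later steps still need the original input or an intermediate result, use RunnablePassthrough.assign(...).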

from langchain_core.runnables import RunnableLambda, RunnablePassthrough


chain = (
    RunnableLambda(lambda text: {"text": text})
    | RunnablePassthrough.assign(
        length=lambda data: len(data["text"]),
        upper=lambda data: data["text"].upper(),
    )
)

print(chain.invoke("hello runnable"))

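The output merges the original dict with the newly added fields:

{
    "text": "hello runnable",
    "length": 14,
    "upper": "HELLO RUNNABLE",
}

In Agent development, it is often used to keep:

The original user question.
Retrieved documents.
The formatted prompt.
Intermediate state before the model output.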

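Conditional branching: RunnableBranch

RunnableBranch works like if / elif / else.

It evaluates the conditions from top to bottom and runs the Runnable of the first branch that matches. If none matches, the default branch at the end runs.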

from langchain_core.runnables import RunnableBranch, RunnableLambda


branch = RunnableBranch(
    (
        RunnableLambda(lambda user: user["type"] == "vip"),
        RunnableLambda(lambda user: {**user, "discount": 0.7, "message": "VIP exclusive: 30% off"}),
    ),
    (
        RunnableLambda(lambda user: user["orders"] == 0),
        RunnableLambda(lambda user: {**user, "discount": 0.8, "message": "New user: 20% off the first order"}),
    ),
    RunnableLambda(lambda user: {**user, "discount": 1, "message": "Regular price"}),
)

print(branch.invoke({"name": "Zhang San", "type": "normal", "orders": 0}))

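The evaluation order here is:

Is the user a VIP?
  -> yes: VIP branch
  -> no: check whether the user is new
      -> yes: new-user branch
      -> no: default branch

Common branching scenarios in an Agent:

Deciding whether the user's question requires a tool call.
Deciding whether the input should go through RAG.
Deciding whether the result needs structured parsing.
Deciding whether an error should be retried.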

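Enhancement APIs

An enhancement API is not a new business workflow. It wraps an existing Runnable with an extra layer of capability.

Common enhancements:

with_config: bind default run configuration
with_retry: retry after failure
with_fallbacks: switch to a backup after failure
with_listeners: listen for start, end, and error
callbacks: a finer-grained callback mechanism

These capabilities are usually used together with config.

Binding configuration: with_config

If a piece of configuration is used repeatedly, with_config(...) binds it into a new Runnable.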

from langchain_core.runnables import RunnableConfig, RunnableLambda


def translate(key: str, config: RunnableConfig) -> str:
    locale = config.get("configurable", {}).get("locale", "zh-CN")
    return f"{locale}: {key}"


translate_chain = RunnableLambda(translate)

zh_chain = translate_chain.with_config(
    configurable={"locale": "zh-CN"},
    tags=["zh"],
    metadata={"source": "docs"},
)

en_chain = translate_chain.with_config(
    configurable={"locale": "en-US"},
    tags=["en"],
    metadata={"source": "docs"},
)

print(zh_chain.invoke("greeting"))
print(en_chain.invoke("greeting"))

with_config(...) does not modify the original Runnable. It returns a new Runnable that carries the default configuration.

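Retrying on failure: with_retry

When an Agent calls external services, transient errors are common.

Examples include model timeouts, flaky retrieval services, and tool endpoints that fail occasionally.

with_retry(...) adds retry logic to a Runnable.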

from langchain_core.runnables import RunnableLambda


class TemporaryServiceError(RuntimeError):
    pass


attempts = 0


def unstable_search(query: str) -> str:
    global attempts
    attempts += 1

    if attempts < 3:
        raise TemporaryServiceError("Retrieval service temporarily unavailable")

    return f"Search result: {query}"


# The first two attempts fail and the third succeeds, which fits within stop_after_attempt=3.
search_chain = RunnableLambda(unstable_search).with_retry(
    retry_if_exception_type=(TemporaryServiceError,),
    stop_after_attempt=3,
    wait_exponential_jitter=False,
)

print(search_chain.invoke("What is Runnable?"))

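Retries suit transient failures.

If the error comes from bad arguments, a bug in the code, or missing permissions, retrying is usually pointless.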

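Degrading on failure: with_fallbacks

with_fallbacks(...) defines backup options.

For example, a RAG pipeline can try vector search first, fall back to keyword search if that fails, and finally return an empty result.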

from langchain_core.runnables import RunnableLambda


def vector_search(query: str) -> list[str]:
    raise RuntimeError("Vector database connection failed")


def keyword_search(query: str) -> list[str]:
    return [f"Keyword result: {query}"]


def empty_result(query: str) -> list[str]:
    return []


retrieve = RunnableLambda(vector_search).with_fallbacks(
    [
        RunnableLambda(keyword_search),
        RunnableLambda(empty_result),
    ]
)

print(retrieve.invoke("expense reimbursement policy"))

with_retry means "try the same approach a few more times".

with_fallbacks means "when this approach fails, switch to another one".

Run listeners: with_listeners

with_listeners(...) listens for the start, end, and error of a Runnable.

It suits lightweight logging.

from langchain_core.runnables import RunnableLambda
from langchain_core.tracers.schemas import Run


def normalize_text(text: str) -> str:
    return text.strip().lower()


def on_start(run: Run) -> None:
    print("Run started:", run.name)


def on_end(run: Run) -> None:
    print("Run finished:", run.name)


def on_error(run: Run) -> None:
    print("Run failed:", run.name)


chain = RunnableLambda(normalize_text).with_listeners(
    on_start=on_start,
    on_end=on_end,
    on_error=on_error,
)

print(chain.invoke("  HELLO Runnable  "))

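If you only care about the lifecycle of a chain, with_listeners(...) is enough.

To trace finer-grained events from LLMs, tools, retrievers, and so on, you need callbacks.

Callbacks

Callbacks are the more complete callback mechanism.

They can listen for finer-grained events, such as:

Chain start / end / error
LLM start / end / error / new token
Tool start / end / error
Retriever start / end / error

In a real Agent, callbacks are commonly used to:

Write logs.
Measure latency.
Record token usage.
Report intermediate steps to an observability system.
Show in the UI what the Agent is doing.

Conceptually, it looks like this: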

config = {
    "callbacks": [callback_handler],
    "tags": ["agent", "prod"],
    "metadata": {"user_id": "user-001"},
}

chain.invoke(input_data, config=config)

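with_listeners is the simplified version that only sees the Runnable lifecycle.

callbacks are more complete and cover fine-grained events from models, tools, and retrievers.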

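Runnables with history

A model has no memory of its own.

RunnableWithMessageHistory wraps an ordinary conversation chain into one that reads and writes message history automatically.

This example uses a real chat model, so install the chat model dependency first:

uv add langchain-openai

The code reads the model name, API key, and base URL from the AI_MODEL, AI_KEY, and AI_BASE_URL environment variables.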

import os

from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI


model = ChatOpenAI(
    model=os.environ["AI_MODEL"],
    api_key=os.environ["AI_KEY"],
    base_url=os.environ["AI_BASE_URL"],
    temperature=0,
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an assistant that answers questions based on the conversation history."),
        # This variable name must match history_messages_key in RunnableWithMessageHistory.
        # On every call, the history messages are filled in at this position automatically.
        MessagesPlaceholder("history"),
        # "question" must match input_messages_key.
        # RunnableWithMessageHistory wraps this turn's user input as a HumanMessage and writes it to the history.
        ("human", "{question}"),
    ]
)

# base_chain is still an ordinary Runnable chain; it knows nothing about how history is stored.
# The history capability is added below by wrapping it in RunnableWithMessageHistory.
base_chain = prompt | model | StrOutputParser()

histories: dict[str, InMemoryChatMessageHistory] = {}


def get_history(session_id: str) -> InMemoryChatMessageHistory:
    # session_id distinguishes users or conversations.
    # The same session_id reuses the same history; different session_ids do not affect each other.
    if session_id not in histories:
        histories[session_id] = InMemoryChatMessageHistory()

    return histories[session_id]


chain = RunnableWithMessageHistory(
    base_chain,
    # Before each call, LangChain calls this function with config["configurable"]["session_id"] to fetch the history.
    get_history,
    # The question field of the input dict is treated as this turn's user message.
    input_messages_key="question",
    # The fetched history messages are filled into MessagesPlaceholder("history").
    history_messages_key="history",
)

config = {"configurable": {"session_id": "user-001"}}

# The first call writes "My name is Zhang San." into the user-001 history.
print(chain.invoke({"question": "My name is Zhang San."}, config=config))
# The second call first reads the user-001 history, so the model can see the name from the previous turn.
print(chain.invoke({"question": "What is my name?"}, config=config))

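There are only three key points:

MessagesPlaceholder("history"): tells the prompt where the history messages go.
get_history(session_id): finds the history for a given session id.
configurable.session_id: tells the chain on each call which session is current.

Different session_id values keep their histories separate.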

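Choosing a Runnable

A quick way to remember the common choices:

Single execution: invoke
Batch processing: batch
Streaming output: stream
A single pipeline: RunnableSequence / |
One input through multiple branches: RunnableParallel
Keeping the original input or intermediate results: RunnablePassthrough.assign
if-else branching: RunnableBranch
Binding configuration: with_config
Retrying transient failures: with_retry
Switching to a backup after failure: with_fallbacks
Lightweight lifecycle logging: with_listeners
Fine-grained observability: callbacks
Multi-turn conversation history: RunnableWithMessageHistory

The value of Runnable is not "writing a few fewer lines of code".

What it really does is put an Agent's prompts, models, parsing, retrieval, tool calls, and history management under one composable execution protocol.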

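Summary

The core points to remember from this article:

Runnable is LangChain's unified execution protocol.
invoke / batch / stream are the core execution APIs of Runnable.
config is the run configuration that all of these execution APIs accept.
| creates a sequential chain in which each step's output is passed to the next step automatically.
RunnableParallel sends the same input into multiple branches at once.
RunnablePassthrough.assign is suited to keeping and merging intermediate results.
RunnableBranch is suited to expressing conditional branches.
with_retry is suited to retrying transient errors.
with_fallbacks is suited to backup options after a failure.
with_config binds run configuration.
with_listeners and callbacks are for observing how a Runnable runs.
RunnableWithMessageHistory wraps an ordinary conversation chain into one with history.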