A Guide to LangChain's Data Containers and How to Use Them

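In the LangChain ecosystem, State, Runtime, and RunnableConfig are three core concepts that serve as the state, environment, and configuration data containers, respectively. How well these containers are managed and used directly affects how well an agent performs. This article explains the three concepts and how to use them.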

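1. Concepts

1.1 State: the state data container

State is the shared internal data structure that represents the agent's current state. It is passed between the agent's execution steps, or nodes.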

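1.1.1 State in LangChain

Default state: AgentState
An agent created with create_agent uses langchain.agents.AgentState as its default state. It is a dictionary (a TypedDict) whose only required key is messages, which holds the agent's short-term memory.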

class AgentState(TypedDict, Generic[ResponseT]):
    """State schema for the agent."""
    messages: Required[Annotated[list[AnyMessage], add_messages]]
    jump_to: NotRequired[Annotated[JumpTo | None, EphemeralValue, PrivateStateAttr]]
    structured_response: NotRequired[Annotated[ResponseT, OmitFromInput]]

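Custom state

By default, an agent maintains the conversation history automatically through the message state. Developers can also configure a custom state schema so that the agent records additional information during the conversation. A custom state schema is defined as a TypedDict that extends AgentState and is passed to create_agent through the state_schema parameter.

In LangChain 1.0, custom state schemas must be TypedDict types; Pydantic models and dataclasses are no longer supported. Custom state can also be defined through middleware.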

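from langchain.agents import create_agent, AgentState
from langchain.tools import tool


@tool
def get_weather(city: str) -> str:
    """Return the weather for a city (placeholder so the example is self-contained)."""
    return f"It is sunny in {city}."


class CustomState(AgentState):
    """Custom agent state schema."""
    # TypedDict fields cannot declare default values; pass the initial value at invoke time.
    counter: int


agent = create_agent(
    model="openai:gpt-4o",
    tools=[get_weather],
    system_prompt="You are a helpful personal assistant.",
    state_schema=CustomState,
)
# Run the agent
agent.invoke(
    {"messages": [{"role": "user", "content": "What's the weather like in Beijing today?"}], "counter": 0}
)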

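1.1.2 State in LangGraph

In LangGraph, the state type can be a TypedDict, a dataclass, or a Pydantic model. TypedDict is the officially recommended way to define a graph's state schema. Use a dataclass if the state needs default values. Use a Pydantic BaseModel if you need recursive data validation, keeping in mind that Pydantic is slower than TypedDict or dataclass. The state schema serves as the input schema for every node and edge in the graph. Each node emits updates to the state, and those updates are then applied to the state by the specified reducer functions.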
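The point about default values can be checked with the standard library alone. The sketch below does not use LangGraph at all; the class names DictState and DataclassState are made up for illustration. It only shows why a dataclass is the natural choice when a field needs a default.

```python
from dataclasses import dataclass, field
from typing import TypedDict


class DictState(TypedDict):
    """TypedDict schema: describes keys and types only, with no default values."""
    counter: int


@dataclass
class DataclassState:
    """Dataclass schema: fields may carry defaults."""
    counter: int = 0
    notes: list[str] = field(default_factory=list)


# A TypedDict is a plain dict at runtime, so a missing key is simply absent.
dict_state: DictState = {}  # type checkers flag this line; the runtime does not
missing = "counter" not in dict_state

# A dataclass fills in its defaults when no value is supplied.
dc_state = DataclassState()

print(missing)           # True
print(dc_state.counter)  # 0
print(dc_state.notes)    # []
```

With a TypedDict schema, the initial value therefore has to come from the caller, for example in the input passed to invoke.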

LangGraph provides a basic message state, langgraph.graph.MessagesState. Developers can subclass MessagesState to define the state they need.

class MessagesState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]
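
The Annotated[..., add_messages] pattern attaches a reducer to a state key. As a rough mental model only, the standard-library sketch below reads the reducer from the annotation and uses it to merge a node's partial update. The helper apply_update and the schema SketchState are hypothetical, and operator.add stands in for add_messages, which additionally merges messages by ID.

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints


class SketchState(TypedDict):
    # The second Annotated argument plays the role of the reducer.
    messages: Annotated[list, add]
    counter: int  # no reducer: a new value overwrites the old one


def apply_update(schema: type, state: dict, update: dict) -> dict:
    """Merge a node's partial update into state (hypothetical helper)."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in state:
            merged[key] = metadata[0](state[key], value)  # reduce old and new
        else:
            merged[key] = value  # plain overwrite
    return merged


state = {"messages": ["hi"], "counter": 0}
state = apply_update(SketchState, state, {"messages": ["hello!"], "counter": 1})
print(state)  # {'messages': ['hi', 'hello!'], 'counter': 1}
```

This is why a node can return only the keys it changed: keys with a reducer are combined with the existing value, and the rest are replaced.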

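1.2 Runtime: the environment data container

Agents created with LangChain's create_agent run on top of the LangGraph runtime. LangGraph exposes a langgraph.runtime.Runtime object that carries the following information:

Context: static information such as a user ID, a database connection, or other dependencies that an agent invocation needs.
Store: a BaseStore instance used for long-term memory.
Stream writer: an object used to emit information through the custom stream mode.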
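
To make the three parts concrete, here is a minimal standard-library sketch of a runtime-like bundle being handed to a node function. RuntimeSketch and greet_node are hypothetical stand-ins, not the LangGraph classes: a plain dict plays the store and a list's append method plays the stream writer.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class RuntimeSketch:
    """Hypothetical stand-in for a runtime object; not the LangGraph class."""
    context: dict[str, Any]               # static, per-run dependencies
    store: dict[tuple, dict[str, Any]]    # long-term memory, keyed by namespace
    stream_writer: Callable[[Any], None]  # emits custom stream chunks


def greet_node(state: dict, runtime: RuntimeSketch) -> dict:
    """Read the context, write to the store, emit a chunk, return a state update."""
    user_id = runtime.context["user_id"]
    runtime.store.setdefault(("users",), {})[user_id] = {"seen": True}
    runtime.stream_writer(f"greeting {user_id}")
    return {"messages": [f"Hello, {user_id}!"]}


chunks: list = []
runtime = RuntimeSketch(context={"user_id": "u-1"}, store={}, stream_writer=chunks.append)
update = greet_node({"messages": []}, runtime)

print(update)         # {'messages': ['Hello, u-1!']}
print(chunks)         # ['greeting u-1']
print(runtime.store)  # {('users',): {'u-1': {'seen': True}}}
```

The node itself stays a pure function of its inputs: everything environmental arrives through the runtime argument instead of through globals.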

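1.3 RunnableConfig: the configuration data container

LangChain abstracts a component interface, Runnable, that is invocable, composable, and serializable, and uses it as its basic unit of work. It is defined in the langchain-core package as langchain_core.runnables.Runnable. LangChain also provides RunnableConfig, available as langchain_core.runnables.RunnableConfig, for configuring a Runnable's runtime parameters. RunnableConfig is a TypedDict with the following fields:

tags: list[str] Tags for this call and all of its sub-calls, used to filter and classify calls (for example by business module, user ID, or task type).
metadata: dict[str, Any] Metadata for this call and its sub-calls. Keys must be strings and values must be JSON-serializable (strings, numbers, lists, dicts).
callbacks: Callbacks The set of callbacks that listen to a Runnable's lifecycle events (start, success, failure, end, tool calls, and so on).
run_name: str The name of the run for this call. Defaults to the name of the Runnable class.
max_concurrency: int | None The maximum number of parallel sub-calls within this call. If not set, it falls back to the ThreadPoolExecutor default, which on Python 3.8 and later is min(32, CPU count + 4).
recursion_limit: int The maximum number of recursive steps for this call, which guards against infinite recursion. Defaults to 25.
configurable: dict[str, Any] Runtime values for fields that the Runnable has marked as configurable. Configurable fields are declared with configurable_fields or configurable_alternatives, and they let you override defaults at run time without changing code.
run_id: uuid.UUID | None The unique identifier (UUID) of this call, used by tracing systems to identify a single run. If not provided, LangChain generates a new UUID.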
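
As a simplified picture of what a step limit such as recursion_limit guards against, the sketch below runs a step function in a loop and gives up once the limit is exhausted. It is a standard-library illustration only: run_until_done and the step functions are hypothetical, and LangGraph's own step accounting and error type may differ.

```python
from typing import Callable


def run_until_done(step: Callable[[dict], dict], state: dict, recursion_limit: int = 25) -> dict:
    """Call `step` repeatedly; raise once the limit is used up without finishing."""
    for _ in range(recursion_limit):
        state = step(state)
        if state.get("done"):
            return state
    raise RecursionError(f"no stop condition reached after {recursion_limit} steps")


def count_to_three(state: dict) -> dict:
    n = state["n"] + 1
    return {"n": n, "done": n >= 3}


def never_done(state: dict) -> dict:
    return {"n": state["n"] + 1, "done": False}


final = run_until_done(count_to_three, {"n": 0})
print(final)  # {'n': 3, 'done': True}

try:
    run_until_done(never_done, {"n": 0}, recursion_limit=5)
    error_message = ""
except RecursionError as exc:
    error_message = str(exc)
print(error_message)  # no stop condition reached after 5 steps
```

A loop that legitimately needs more steps should raise the limit explicitly rather than rely on the default.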

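RunnableConfig is central to controlling how a Runnable executes. It covers observability (tags/metadata/callbacks), performance control (max_concurrency/recursion_limit), dynamic configuration (configurable), and tracing (run_name/run_id). RunnableConfig is propagated to all sub-calls: the configuration of the current call applies to every sub-call it makes (for example, a chain calling an LLM, or a graph node calling a tool), which gives uniform configuration across the whole call chain.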
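
The propagation idea can be pictured with a toy model in which every unit forwards the config it received to its children. The Step class and the events list below are hypothetical and written with the standard library only; they do not reproduce how langchain_core actually implements propagation.

```python
from typing import Any, Callable, Optional

# (step name, tags visible to that step) for every call that happened
events: list[tuple[str, tuple[str, ...]]] = []


class Step:
    """Hypothetical composable unit that forwards its config to its children."""

    def __init__(self, name: str, func: Callable[[Any], Any], children=()):
        self.name = name
        self.func = func
        self.children = list(children)

    def invoke(self, value: Any, config: Optional[dict] = None) -> Any:
        config = config or {}
        # Record which tags were visible to this call.
        events.append((self.name, tuple(config.get("tags", []))))
        value = self.func(value)
        for child in self.children:
            value = child.invoke(value, config)  # the same config flows down
        return value


llm = Step("llm", str.upper)
chain = Step("chain", str.strip, children=[llm])
result = chain.invoke("  hello ", config={"tags": ["user-123"]})

print(result)  # HELLO
print(events)  # [('chain', ('user-123',)), ('llm', ('user-123',))]
```

Because the caller sets the config once at the top, tags and metadata attached there can be used to filter every nested call in a trace.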

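LangGraph is a stateful graph execution framework built on langchain_core. Its core components, such as graphs and state machines, implement the Runnable interface, so LangGraph works seamlessly with RunnableConfig. The configuration is passed to nodes automatically and can be read and used inside them.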

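2. Passing in the data containers

2.1 Passing data containers in LangChain

In LangChain, the state_schema, context_schema, and store parameters of create_agent define the state, context, and store, respectively. The initial state, run configuration, and context data are passed in through the agent's invocation methods: invoke/ainvoke, stream/astream, batch/abatch, and so on.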

from typing import TypedDict
from langchain_core.runnables import RunnableConfig
from langchain_core.messages import ToolMessage
from langchain.agents import create_agent, AgentState
from langchain.tools import tool, ToolRuntime
from langgraph.types import Command
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.store.memory import InMemoryStore
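from model import model  # the author's local module; it must expose a chat model instance


# Custom state
class CustomState(AgentState):
    """Custom agent state schema."""
    counter: int  # TypedDict fields cannot declare defaults; the initial value is passed at invoke time
# Custom context
class CustomContext(TypedDict):
    """Custom runtime context schema."""
    user_id: str
# Tool
@tool
def counter(query: str, runtime: ToolRuntime) -> Command:
    """User message counter: call this tool every time a user message is received."""
    count = runtime.state.get("counter", 0)
    if query.strip() != "":
        # Compute the new value instead of mutating runtime.state in place
        count += 1
        store = runtime.store
        store.put(("users",), runtime.context["user_id"], {"counter": count})
        info = store.get(("users",), runtime.context["user_id"])
        print(info)
    # State changes are applied through the returned Command
    return Command(update={
        "messages": [ToolMessage("Message count recorded.", tool_call_id=runtime.tool_call_id)],
        "counter": count,
    })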

# Create the agent
agent = create_agent(
    model=model,
    tools=[counter],
    system_prompt="You are a personal assistant.",
    state_schema=CustomState,         # state schema
    context_schema=CustomContext,     # context schema
    checkpointer=InMemorySaver(),     # checkpointer
    store=InMemoryStore(),            # store (long-term memory)
)

# Build the run configuration
config = RunnableConfig(
    tags=["user-123"],
    metadata={"source": "user-input"},
    run_name="user-query",
    configurable={"thread_id": "1"},
)
# Initial state
init_state = CustomState(
    messages=[{"role": "user", "content": "Hello"}],
    counter=0,
)
# Context
context = CustomContext(user_id="123")

# Invoke the agent
result = agent.invoke(
    input=init_state,
    config=config,
    context=context,
)
# Print the result
for key, value in result.items():
    if key == "messages":
        for msg in value:
            msg.pretty_print()
    else:
        print(key, value)

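2.2 Passing data containers in LangGraph

In LangGraph, state_schema and context_schema are passed to StateGraph, and the store object is passed to the compile method. The initial state, run configuration, and context data are passed in through the graph's invocation methods: invoke/ainvoke, stream/astream, batch/abatch, and so on.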

from typing import TypedDict
from langchain_core.runnables import RunnableConfig
from langchain.messages import AIMessage, HumanMessage
from langgraph.graph import MessagesState, StateGraph, START, END
from langgraph.runtime import Runtime
from langgraph.store.memory import InMemoryStore
from langgraph.checkpoint.memory import InMemorySaver
from model import model  # the author's local module; it must expose a chat model instance


# Custom graph state
class MyGraphState(MessagesState):
    """MyGraphState"""
    counter: int  # TypedDict fields cannot declare defaults; the initial value is passed at invoke time
# Custom context
class MyContext(TypedDict):
    """MyContext"""
    user_id: str
# Custom output format
class MyStructuredOutput(TypedDict):
    """MyStructuredOutput"""
    response: str
    user_id: str

# Define the nodes
def node_1(state: MyGraphState, runtime: Runtime[MyContext], config: RunnableConfig) -> dict:
    """node_1"""
    # Return a partial update rather than mutating the incoming state in place
    counter = state["counter"] + 1
    user_id = runtime.context["user_id"]
    thread_id = config["configurable"]["thread_id"]
    store = runtime.store
    store.put(("users",), user_id, {"thread_id": thread_id, "counter": counter})
    print(store.get(("users",), user_id))
    # The add_messages reducer appends the new message to the existing list
    return {
        "counter": counter,
        "messages": [AIMessage(content=f"Message count recorded for user_id {user_id}.")],
    }
def node_2(state: MyGraphState, runtime: Runtime[MyContext], config: RunnableConfig) -> dict:
    """node_2"""
    user_id = runtime.context["user_id"]
    thread_id = config["configurable"]["thread_id"]
    store = runtime.store
    user_data = store.get(("users",), user_id)
    new_messages = []
    if user_data and user_data.value["thread_id"] == thread_id and state["counter"] < 10:
        structured_model = model.with_structured_output(MyStructuredOutput)
        prompt = " ".join([msg.content for msg in state["messages"]])
        response = structured_model.invoke(prompt)
        new_messages.append(AIMessage(content=response["response"]))
    return {"messages": new_messages}

# Build the graph
graph = StateGraph(state_schema=MyGraphState, context_schema=MyContext)
graph.add_node(node_1)
graph.add_node(node_2)
graph.add_edge(START, "node_1")
graph.add_edge("node_1", "node_2")
graph.add_edge("node_2", END)

# Compile the graph
agent = graph.compile(checkpointer=InMemorySaver(), store=InMemoryStore())

# Build the run configuration
config = RunnableConfig(
    tags=["user-123"],
    metadata={"source": "user-input"},
    run_name="user-query",
    configurable={"thread_id": "1"},
)

# Invoke
result = agent.invoke(
    input={"messages": [HumanMessage(content="Hello")], "counter": 0},
    config=config,
    context={"user_id": "user_123"},
)

# Print the result
for key, value in result.items():
    if key == "messages":
        for msg in value:
            msg.pretty_print()
    else:
        print(key, value)

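3. Accessing the data containers

3.1 Access from tools

In LangChain, tools access runtime information through langchain.tools.ToolRuntime. ToolRuntime is a single, unified tool parameter that gives a tool access to the state, configuration, tool call ID, context, store, and stream writer. Add a runtime: ToolRuntime parameter to the tool function's signature, and the function can then read runtime.state, runtime.context, runtime.config, and so on. The following information is available: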

context: Runtime context (from langgraph Runtime)
store: BaseStore instance for persistent storage (from langgraph Runtime)
stream_writer: StreamWriter for streaming output (from langgraph Runtime)
state: The current graph state
tool_call_id: The ID of the current tool call
config: RunnableConfig for the current execution

Note: the context, store, and stream_writer in ToolRuntime come from LangGraph's Runtime object.

See the example in Section 2.1 for sample code.
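
For intuition about how a runtime argument can appear in a tool's signature without the model ever supplying it, here is a small standard-library sketch. The helpers call_tool and model_visible_args are hypothetical and match the parameter by name only; LangChain's real detection logic may differ.

```python
import inspect
from typing import Any, Callable


def call_tool(tool: Callable[..., Any], model_args: dict, runtime: Any) -> Any:
    """Call `tool`, adding `runtime` only if its signature asks for it (hypothetical helper)."""
    params = inspect.signature(tool).parameters
    kwargs = dict(model_args)
    if "runtime" in params:
        kwargs["runtime"] = runtime  # injected by the framework, never by the model
    return tool(**kwargs)


def model_visible_args(tool: Callable[..., Any]) -> list[str]:
    """Arguments a model would be asked to supply: everything except `runtime`."""
    return [name for name in inspect.signature(tool).parameters if name != "runtime"]


def whoami(question: str, runtime: dict) -> str:
    return f"{runtime['context']['user_id']} asked: {question}"


visible = model_visible_args(whoami)
answer = call_tool(whoami, {"question": "hi"}, {"context": {"user_id": "u-1"}})

print(visible)  # ['question']
print(answer)   # u-1 asked: hi
```

The design benefit is that sensitive or environment-specific values, such as a user ID, never pass through the model's tool-call arguments.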

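3.2 Access from middleware

LangChain middleware can control and customize the agent's execution at every step. Middleware offers two kinds of hooks for plugging into the agent's execution:

Node-style hooks, such as before_agent, before_model, after_model, and after_agent;
Wrap-style hooks, such as wrap_model_call and wrap_tool_call.

Node-style middleware takes state and runtime as input and returns a state update or a Command. For wrap-style middleware, wrap_model_call takes a ModelRequest and a handler and returns a ModelResponse or an AIMessage, while wrap_tool_call takes a ToolCallRequest and a handler and returns a ToolMessage or a Command. Both ModelRequest and ToolCallRequest carry the state and runtime. Note in particular that middleware does not provide access to RunnableConfig.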

from dataclasses import dataclass
from collections.abc import Callable
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import dynamic_prompt, ModelRequest, ModelResponse, before_model, after_model, wrap_model_call, wrap_tool_call
from langchain.tools.tool_node import ToolCallRequest
from langchain.tools import tool
from langchain_core.messages import ToolMessage
from langgraph.runtime import Runtime
from langgraph.types import Command
from langchain_core.runnables import RunnableConfig
from model import model  # the author's local module; it must expose a chat model instance

# Context
@dataclass
class Context:
    user_name: str

# Tool
@tool
def think_tool(thought: str) -> str:
    """Think before answering the question."""
    return thought

# Dynamic prompt (dynamic_prompt is a convenience decorator built on wrap_model_call)
@dynamic_prompt
def dynamic_system_prompt(request: ModelRequest) -> str:
    user_name = request.runtime.context.user_name
    system_prompt = f"You are a helpful assistant. Address the user as {user_name}. You must use think_tool before you answer the questions."
    return system_prompt

@wrap_model_call
def modify_model_call(request: ModelRequest, handler: Callable) -> ModelResponse:
    # Read the user name from the runtime context
    user_name = request.runtime.context.user_name
    print(f"Modifying model call for user: {user_name}")
    result = handler(request)
    return result

@wrap_tool_call
def modify_tool_call(request: ToolCallRequest, handler: Callable) -> ToolMessage | Command:
    # Read the user name from the runtime context
    user_name = request.runtime.context.user_name
    print(f"Modifying tool call for user: {user_name}")
    result = handler(request)
    return result


# Before model hook
@before_model
def log_before_model(state: AgentState, runtime: Runtime) -> dict | None:  
    print(f"Processing request for user: {runtime.context.user_name}")  
    return None

# After model hook
@after_model
def log_after_model(state: AgentState, runtime: Runtime) -> dict | None:  
    print(f"Completed request for user: {runtime.context.user_name}")  
    return None

# Create the agent
agent = create_agent(
    model,
    tools=[think_tool],
    middleware=[dynamic_system_prompt, log_before_model, modify_model_call, modify_tool_call, log_after_model],
    context_schema=Context
)

# Build the run configuration
config = RunnableConfig(
    tags=["user-123"],
    metadata={"source": "user-input"},
    run_name="user-query",
    configurable={"thread_id": "1"},
)

# Invoke
result = agent.invoke(
    {"messages": [{"role": "user", "content": "What's my name?"}]},
    config,
    context=Context(user_name="John Smith")
)

# Print the result
print(result["messages"][-1].content)
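
The wrap-style hooks described in this section behave like nested function wrappers around a base handler. The standard-library sketch below shows that nesting under the assumption that the first wrapper in the list is the outermost one; compose, outer, inner, and fake_model are hypothetical names, and a plain function stands in for the model call.

```python
from functools import reduce
from typing import Callable

Handler = Callable[[dict], str]
Wrapper = Callable[[dict, Handler], str]


def compose(wrappers: list[Wrapper], base: Handler) -> Handler:
    """Nest wrappers around `base` so the first one in the list is outermost."""
    def wrap(inner_handler: Handler, wrapper: Wrapper) -> Handler:
        return lambda request: wrapper(request, inner_handler)
    return reduce(wrap, reversed(wrappers), base)


log: list[str] = []


def outer(request: dict, handler: Handler) -> str:
    log.append("outer:before")
    result = handler(request)
    log.append("outer:after")
    return result


def inner(request: dict, handler: Handler) -> str:
    log.append("inner:before")
    # A wrapper may modify the request before passing it on.
    result = handler({**request, "prompt": request["prompt"].strip()})
    log.append("inner:after")
    return result


def fake_model(request: dict) -> str:
    log.append("model")
    return request["prompt"].upper()


call = compose([outer, inner], fake_model)
answer = call({"prompt": " hi "})

print(answer)  # HI
print(log)     # ['outer:before', 'inner:before', 'model', 'inner:after', 'outer:after']
```

Each wrapper decides whether, when, and with what request to call the handler, which is what makes retries, fallbacks, and request rewriting possible in this style.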

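3.3 Access from nodes

A node is essentially a Python function that takes state, runtime, config, and similar parameters as input, so it can readily access the state, runtime information, and configuration inside its body. A node function typically returns the updated state, or a partial update to it. See the example in Section 2.2 for sample code.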

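4. Summary

4.1 Role and nature

The essential identity and core purpose of each of the three containers is the basis for understanding how they differ: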

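| Concept | Core role | Underlying type | Analogy |
| --- | --- | --- | --- |
| State | The graph's data carrier: it stores all state data produced while the graph runs and is the only medium through which nodes communicate. | Usually a TypedDict (a dictionary); in LangGraph it can also be a `dataclass` or a Pydantic model. Supports structured data definitions. | A program's "variables/database": it holds the data |
| RunnableConfig | The configuration carrier for graphs and nodes: it passes static and run-time configuration and is the common configuration structure across LangChain and LangGraph. | A Python dictionary (typed as `RunnableConfig`) that holds callbacks, tags, the thread ID, and other settings. | A program's "config file/command-line arguments" |
| Runtime | The graph's execution environment information: it manages run-time resources and execution flow and is the "underlying support" for a graph run. | An instance of LangGraph's built-in Runtime class that holds the context, the store, the stream writer, and similar resources. | A program's "operating system/runtime environment" |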

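4.2 Detailed comparison along key dimensions

| Dimension | State | RunnableConfig | Runtime |
| --- | --- | --- | --- |
| Lifecycle | Bound to a single execution session of the graph: created when the graph starts and destroyed (or persisted) when execution ends. Every node call receives and updates the State. | Bound to a single node or graph call: it can be passed globally (shared by the whole graph) or set per node or per request, and it stops applying once the call ends. | Split between compile time and call time: the store is attached when the graph is compiled and is shared by every run of that graph, while the context is supplied with each invocation. The Runtime object a node receives bundles both for that run. |
| Core function | 1. Stores node inputs and outputs; 2. serves as the only medium for passing data between nodes; 3. determines how the graph branches (for example, as the basis for conditional edges). | 1. Passes callbacks (to monitor execution); 2. passes metadata (thread ID, user ID, tags); 3. sets concurrency and recursion limits. | 1. Manages run-time resources (store, stream writer); 2. provides the context; 3. supports low-level capabilities such as persistence, parallel execution, and event streaming. |
| Data flow | Flows in both directions between nodes: a node reads data from the State, and its update produces the State seen by the next node. | Passed top-down: from the graph's caller to the nodes. Nodes only read the configuration and generally do not modify it (dynamic updates are possible in special cases). | Static and global for the run: nodes can read the resources in the Runtime (such as the store and the context) but cannot change the Runtime's own core attributes. |
| Mutability | Highly mutable: updating the State (adding or changing fields) is the core job of a node. It is the graph's "mutable data layer". | Almost immutable: nodes only read configuration items, and changing them has no practical effect, because the configuration is a static parameter of a single call. | Immutable: the Runtime is the execution environment. Nodes use the resources it provides and do not modify the Runtime instance. |
| Typical attributes/fields | Business-specific custom fields (such as `user_query`, `tool_results`, `agent_answer`). | `callbacks`, `tags`, `metadata`, and `configurable` (which carries `thread_id`). | `store` (key-value store), `stream_writer` (stream writer), `context` (runtime context). |
| How it is defined | A structured type defined by the user (for example, a TypedDict that declares field names and types). | A dictionary assembled by the user according to the framework's conventions (optionally using LangChain's `RunnableConfig` type hints). | Implemented by LangGraph itself. The user only supplies resources when compiling the graph (such as the store) and can define a custom context. |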