The Complete Guide to LangGraph: Building Production-Grade AI Workflows

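II. Core Concepts and Architecture

1. State Machine Model

LangGraph models an application as a state machine in which each node is a state-transition function: it receives the current state and returns an update that produces the new state.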

┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│    State    │─────▶│    Node     │─────▶│  New State  │
│  (current)  │      │ (function)  │      │  (updated)  │
└─────────────┘      └─────────────┘      └─────────────┘
        ▲                                          │
        └──────────────────────────────────────────┘
                    (loop edge)
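
The loop in the diagram can be sketched in plain Python without LangGraph. This is a toy illustration of the idea, not the library's implementation; the names `run_state_machine`, `increment`, and `max_steps` are hypothetical.

```python
from typing import Callable, Dict

State = Dict[str, int]

def increment(state: State) -> State:
    """Node: a state-transition function that returns a partial update."""
    return {"count": state["count"] + 1}

def run_state_machine(state: State, node: Callable[[State], State], max_steps: int) -> State:
    """Feed the state through the node repeatedly (the loop edge)."""
    for _ in range(max_steps):
        update = node(state)
        state = {**state, **update}  # new state = old state merged with the update
        if state["count"] >= 3:      # stop condition, analogous to routing to the end
            break
    return state

final_state = run_state_machine({"count": 0}, increment, max_steps=10)
print(final_state)
```

The node never mutates its input; it only returns the fields that change, which is the same contract LangGraph nodes follow.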

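2. The Three Core Components

State

State defines the application's data structure, typically a TypedDict whose fields can be wrapped in Annotated to attach a reducer: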

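from typing import TypedDict, Annotated, List
import operator

class AgentState(TypedDict):
    # Annotated attaches a reducer that defines how updates are merged
    # (operator.add concatenates the new list onto the existing one)
    messages: Annotated[List[dict], operator.add]
    next_node: str
    # Explicit "replace" reducer; fields without a reducer are overwritten anyway
    iteration_count: Annotated[int, lambda x, y: y]
    user_feedback: str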

Common reducer functions:

operator.add: concatenates lists (appends new messages)
lambda x, y: y: replaces the old value (overwrite)
Custom function: arbitrary merge logic
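
To make the reducer idea concrete, here is a minimal stdlib-only sketch of how a partial update could be merged into the state key by key. It is an illustration under assumptions, not LangGraph's internals: `REDUCERS`, `apply_update`, `merge_unique`, and the `sources` key are hypothetical names.

```python
import operator
from typing import Any, Callable, Dict

def merge_unique(old: list, new: list) -> list:
    """Custom reducer: append only items that are not already present."""
    return old + [item for item in new if item not in old]

# Hypothetical reducer table: one merge function per state key
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {
    "messages": operator.add,            # list concatenation
    "iteration_count": lambda x, y: y,   # overwrite
    "sources": merge_unique,             # custom merge logic
}

def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into the state, key by key."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value  # keys without a reducer are overwritten
    return merged

state = {"messages": ["hi"], "iteration_count": 1, "sources": ["a"]}
state = apply_update(state, {"messages": ["hello"], "iteration_count": 2, "sources": ["a", "b"]})
print(state)
```

A custom reducer such as `merge_unique` is useful when parallel branches may report the same item more than once.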

Node

A node is a Python function that performs a concrete task and returns a partial state update:

# Assumes `llm` is a chat model and `execute_tools` is a helper defined elsewhere

def agent_node(state: AgentState):
    """LLM reasoning node"""
    messages = state["messages"]
    response = llm.invoke(messages)
    return {
        "messages": [response],  # appended to messages by the reducer
        "next_node": "action"    # decides the next step
    }

def tool_node(state: AgentState):
    """Tool execution node"""
    tool_calls = state["messages"][-1].tool_calls
    results = execute_tools(tool_calls)
    return {"messages": results}

Edge

Edges are the routing rules that connect nodes:

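from langgraph.graph import END

# Normal edge: fixed flow (here `graph` is the StateGraph builder)
graph.add_edge("agent", "action")

# Conditional edge: dynamic routing
def router(state: AgentState) -> str:
    if state["next_node"] == "action":
        return "action"
    return END  # built-in constant that marks the end of the graph

graph.add_conditional_edges("agent", router, {
    "action": "action",
    END: END
})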

III. Basic Usage Patterns

Pattern 1: Simple Sequential Flow (Getting Started)

from langgraph.graph import StateGraph, END
from typing import TypedDict

class SimpleState(TypedDict):
    input: str
    output: str

# Define the nodes
def node_a(state: SimpleState):
    return {"output": f"Processed by A: {state['input']}"}

def node_b(state: SimpleState):
    return {"output": f"Processed by B: {state['output']}"}

# Build the graph
builder = StateGraph(SimpleState)
builder.add_node("node_a", node_a)
builder.add_node("node_b", node_b)

# Set the entry point and the edges
builder.set_entry_point("node_a")
builder.add_edge("node_a", "node_b")
builder.add_edge("node_b", END)

# Compile and run
graph = builder.compile()
result = graph.invoke({"input": "Hello"})
print(result)  # {'input': 'Hello', 'output': 'Processed by B: Processed by A: Hello'}

Pattern 2: Conditional Branching (Routing)

from typing import Literal, TypedDict
from langgraph.graph import StateGraph, END

# Assumes `llm` is a chat model initialized elsewhere

class RouterState(TypedDict):
    query: str
    query_type: str
    answer: str

def classify_node(state: RouterState):
    """Classification node"""
    prompt = f"Classify the following query (technical/sales/other): {state['query']}"
    classification = llm.invoke(prompt).content
    return {"query_type": classification}

def tech_support(state: RouterState):
    return {"answer": "Technical support answer..."}

def sales_support(state: RouterState):
    return {"answer": "Sales support answer..."}

def general_support(state: RouterState):
    return {"answer": "General inquiry answer..."}

def route_query(state: RouterState) -> Literal["tech", "sales", "general"]:
    """Routing logic"""
    qt = state["query_type"].lower()
    if "technical" in qt:
        return "tech"
    elif "sales" in qt:
        return "sales"
    return "general"

# Build the graph
builder = StateGraph(RouterState)
builder.add_node("classify", classify_node)
builder.add_node("tech", tech_support)
builder.add_node("sales", sales_support)
builder.add_node("general", general_support)

builder.set_entry_point("classify")
builder.add_conditional_edges(
    "classify",
    route_query,
    {"tech": "tech", "sales": "sales", "general": "general"}
)
builder.add_edge("tech", END)
builder.add_edge("sales", END)
builder.add_edge("general", END)

graph = builder.compile()

Pattern 3: Loops and Iteration (ReAct Agent)

import operator
from typing import TypedDict, Annotated, List

from langchain_core.messages import ToolMessage
from langchain_core.tools import tool
from langgraph.graph import StateGraph, END

class ReActState(TypedDict):
    messages: Annotated[List, operator.add]
    iteration: int

@tool
def search(query: str) -> str:
    """Search tool"""
    return f"Search results: information related to {query}"

tools = [search]
# Assumes `llm` is a tool-calling chat model initialized elsewhere
llm_with_tools = llm.bind_tools(tools)

def agent_node(state: ReActState):
    """Agent reasoning node"""
    response = llm_with_tools.invoke(state["messages"])
    return {
        "messages": [response],
        "iteration": state.get("iteration", 0) + 1
    }

def action_node(state: ReActState):
    """Tool execution node"""
    tool_calls = state["messages"][-1].tool_calls
    results = []
    for call in tool_calls:
        # Look up the requested tool by name (avoids shadowing the `tool` decorator)
        selected = next(t for t in tools if t.name == call["name"])
        result = selected.invoke(call["args"])
        results.append(ToolMessage(content=str(result), tool_call_id=call["id"]))
    return {"messages": results}

def should_continue(state: ReActState) -> str:
    """Loop condition"""
    last_msg = state["messages"][-1]
    # Continue only if there are tool calls and the iteration cap has not been reached
    if hasattr(last_msg, "tool_calls") and last_msg.tool_calls:
        if state["iteration"] < 5:
            return "continue"
    return "end"

# Build the ReAct loop graph
builder = StateGraph(ReActState)
builder.add_node("agent", agent_node)
builder.add_node("action", action_node)

builder.set_entry_point("agent")
builder.add_conditional_edges(
    "agent",
    should_continue,
    {"continue": "action", "end": END}
)
builder.add_edge("action", "agent")  # loop back to the agent

react_graph = builder.compile()

IV. Advanced Features in Detail

1. Persistence and Checkpoints

Core capability: pause, resume, and replay a workflow at any point.

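from langgraph.checkpoint.memory import MemorySaver

# In-memory checkpointer (development and testing)
memory = MemorySaver()

# Or persist to a database (production). In recent releases this requires the
# separate langgraph-checkpoint-sqlite package, and from_conn_string is a
# context manager that takes a file path rather than a "sqlite:///" URL:
# from langgraph.checkpoint.sqlite import SqliteSaver
# with SqliteSaver.from_conn_string("checkpoints.db") as memory:
#     graph = builder.compile(checkpointer=memory)

graph = builder.compile(checkpointer=memory)

# Run under a caller-chosen thread ID
config = {"configurable": {"thread_id": "conversation-123"}}
result = graph.invoke({"input": "Hello"}, config)

# A later call with the same thread ID picks up the saved state
result2 = graph.invoke({"input": "Continue"}, config)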

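Where checkpoints are useful:

Keeping context across multi-turn conversations
Resuming long-running tasks from the last checkpoint
Resuming execution after human review
A/B testing different branches from the same checkpoint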
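
The idea behind thread-scoped checkpoints can be sketched with the standard library alone. This is a toy model, not how LangGraph's checkpointers are implemented; `ToyCheckpointer` and `chat_turn` are hypothetical names.

```python
import copy
from collections import defaultdict
from typing import Any, Dict, List

class ToyCheckpointer:
    """Keeps every saved state per thread ID (in memory only)."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Dict[str, Any]]] = defaultdict(list)

    def save(self, thread_id: str, state: Dict[str, Any]) -> None:
        self._history[thread_id].append(copy.deepcopy(state))

    def latest(self, thread_id: str) -> Dict[str, Any]:
        history = self._history[thread_id]
        return copy.deepcopy(history[-1]) if history else {"messages": []}

    def history(self, thread_id: str) -> List[Dict[str, Any]]:
        return copy.deepcopy(self._history[thread_id])

def chat_turn(saver: ToyCheckpointer, thread_id: str, user_input: str) -> Dict[str, Any]:
    """Load the thread's last state, add one turn, and save a new checkpoint."""
    state = saver.latest(thread_id)
    state["messages"].append(user_input)
    saver.save(thread_id, state)
    return state

saver = ToyCheckpointer()
chat_turn(saver, "conversation-123", "Hello")
second = chat_turn(saver, "conversation-123", "Continue")
other = chat_turn(saver, "conversation-456", "New thread")
print(second["messages"])
print(len(saver.history("conversation-123")))
```

Because every checkpoint is kept, an earlier entry in the history can be reloaded to replay a run or to try a different branch from the same starting point.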

2. Human-in-the-loop

Two intervention patterns are shown here:

Pattern A: Interrupt and wait for human input

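def human_review_node(state: AgentState):
    """Human review node: runs only after a person has resumed the graph"""
    # The feedback was already written into the state before resuming
    return {"next_node": "action"}

builder.add_node("human_review", human_review_node)

# Route through the review node
builder.add_edge("agent", "human_review")
builder.add_edge("human_review", "action")

# Declare the interrupt at compile time (a checkpointer is required)
graph = builder.compile(
    checkpointer=memory,
    interrupt_before=["human_review"]  # pause before this node runs
)

# Run until the interrupt
result = graph.invoke(state, config)
snapshot = graph.get_state(config)
if snapshot.next:  # non-empty when the run is paused before a node
    # Wait for human input
    human_input = input("Please review and enter feedback: ")
    # Write the feedback into the saved state, then resume by passing None
    graph.update_state(config, {"user_feedback": human_input})
    result = graph.invoke(None, config)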

Pattern B: Dynamic approval

def should_proceed(state: AgentState) -> str:
    if state.get("human_approved"):
        return "continue"
    return "review"

builder.add_conditional_edges(
    "risky_action",
    should_proceed,
    {"continue": "next_step", "review": "human_review"}
)
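
The pause-and-resume flow of Pattern A can be pictured with a small stdlib-only runner. This is a simplified sketch, not LangGraph's mechanism: `run_until_interrupt`, `resume_from`, and the lambda nodes are hypothetical, and a real graph would persist the paused state through a checkpointer.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Node = Callable[[Dict[str, Any]], Dict[str, Any]]

def run_until_interrupt(
    nodes: List[Tuple[str, Node]],
    state: Dict[str, Any],
    interrupt_before: List[str],
    resume_from: Optional[int] = None,
) -> Tuple[Dict[str, Any], Optional[int]]:
    """Run nodes in order; pause before any node named in interrupt_before.

    Returns the state and the index to resume from (None when finished).
    """
    start = 0 if resume_from is None else resume_from
    for index in range(start, len(nodes)):
        name, node = nodes[index]
        resuming_here = resume_from is not None and index == resume_from
        if name in interrupt_before and not resuming_here:
            return state, index  # paused: the caller may inspect or edit the state
        state = {**state, **node(state)}
    return state, None

pipeline = [
    ("agent", lambda s: {"draft": "delete 3 files"}),
    ("human_review", lambda s: {"approved": s.get("user_feedback") == "ok"}),
    ("action", lambda s: {"done": s["approved"]}),
]

state, paused_at = run_until_interrupt(pipeline, {}, interrupt_before=["human_review"])
first_pause = paused_at
state = {**state, "user_feedback": "ok"}  # the human edits the paused state
state, paused_at = run_until_interrupt(pipeline, state, ["human_review"], resume_from=paused_at)
print(first_pause, state["done"], paused_at)
```

The important detail is that resuming does not restart the run: the saved state is edited first, and execution continues from the node where it stopped.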

3. Parallel Execution (Fan-out/Fan-in)

import operator
from typing import TypedDict, Annotated, List

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send  # dynamic fan-out (older releases: langgraph.constants)

# Assumes `llm` is a chat model initialized elsewhere

class ParallelState(TypedDict):
    topics: List[str]
    summaries: Annotated[List[str], operator.add]
    final_report: str

def generate_summary(state: dict):
    """Generate a summary for a single topic"""
    topic = state["topic"]
    summary = llm.invoke(f"Summarize: {topic}")
    return {"summaries": [summary.content]}

def aggregator(state: ParallelState):
    """Merge all summaries"""
    return {"final_report": "\n\n".join(state["summaries"])}

# Build the graph
builder = StateGraph(ParallelState)
builder.add_node("generate", generate_summary)
builder.add_node("aggregate", aggregator)

# Routing function: emits one Send per topic so the tasks run in parallel
def fan_out(state: ParallelState):
    return [Send("generate", {"topic": t}) for t in state["topics"]]

builder.add_conditional_edges(START, fan_out, ["generate"])
builder.add_edge("generate", "aggregate")
builder.add_edge("aggregate", END)
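
The fan-out/fan-in shape itself can be shown without LangGraph. The sketch below uses a thread pool and a stand-in `summarize` function in place of the LLM call; `fan_out_fan_in` is a hypothetical helper, and this is not how LangGraph schedules `Send` tasks.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Dict, List

def summarize(topic: str) -> Dict[str, List[str]]:
    """Stand-in for the LLM call: returns a partial update for one topic."""
    return {"summaries": [f"Summary of {topic}"]}

def fan_out_fan_in(topics: List[str]) -> str:
    # Fan-out: one task per topic, run concurrently
    with ThreadPoolExecutor(max_workers=4) as pool:
        updates = list(pool.map(summarize, topics))  # map preserves input order
    # Fan-in: merge the partial updates with the same reducer as the state (operator.add)
    summaries = reduce(operator.add, (u["summaries"] for u in updates), [])
    return "\n\n".join(summaries)

report = fan_out_fan_in(["graphs", "agents", "checkpoints"])
print(report)
```

Each worker returns a one-element list, which is why a list-concatenating reducer is the natural way to collect the results.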

4. Subgraphs

Subgraphs modularize complex workflows:

# Define the subgraph (a self-contained workflow)
def create_research_subgraph():
    builder = StateGraph(ResearchState)
    builder.add_node("search", search_node)
    builder.add_node("analyze", analyze_node)
    builder.set_entry_point("search")
    builder.add_edge("search", "analyze")
    return builder.compile()

# Use it in the main graph
main_builder = StateGraph(MainState)
research_subgraph = create_research_subgraph()

main_builder.add_node("research", research_subgraph)  # embedded as a node
# Assumes a "write_report" node is added to main_builder elsewhere
main_builder.add_edge("research", "write_report")
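
Why a compiled subgraph can be dropped in as a node is easiest to see in miniature: anything that maps a state to an update can serve as a node, including a whole pipeline. The sketch below is a plain-Python analogy with hypothetical names (`compile_pipeline`, `research`, `main`), not LangGraph code.

```python
from typing import Any, Callable, Dict, List

Node = Callable[[Dict[str, Any]], Dict[str, Any]]

def compile_pipeline(nodes: List[Node]) -> Node:
    """Return a function that runs the nodes in order; the result is itself a node."""
    def run(state: Dict[str, Any]) -> Dict[str, Any]:
        for node in nodes:
            state = {**state, **node(state)}
        return state
    return run

# Subgraph: search, then analyze
research = compile_pipeline([
    lambda s: {"hits": [f"result for {s['topic']}"]},
    lambda s: {"analysis": f"{len(s['hits'])} source(s) analyzed"},
])

# Main graph: the compiled subgraph is used like any other node
main = compile_pipeline([
    research,
    lambda s: {"report": f"Report on {s['topic']}: {s['analysis']}"},
])

result = main({"topic": "LangGraph"})
print(result["report"])
```

The outer pipeline only depends on the keys the subgraph writes, so the subgraph's internal steps can change without touching the main flow.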

V. Multi-Agent Systems

Architecture pattern: Supervisor and workers

import operator
from typing import TypedDict, Annotated, List

from langgraph.graph import StateGraph, END

class MultiAgentState(TypedDict):
    messages: Annotated[List, operator.add]
    next_agent: str
    task_results: dict

# Define the specialist agents (assumes `llm` is a chat model initialized elsewhere)
def research_agent(state: MultiAgentState):
    """Research agent"""
    prompt = f"Research the following topic: {state['messages'][-1].content}"
    result = llm.invoke(prompt)
    return {
        "messages": [result],
        "task_results": {**state.get("task_results", {}), "research": result.content}
    }

def writer_agent(state: MultiAgentState):
    """Writer agent"""
    results = state["task_results"]
    prompt = f"Write a report based on this research: {results.get('research', '')}"
    result = llm.invoke(prompt)
    # Drop any stale review so the new draft is reviewed again
    kept = {k: v for k, v in results.items() if k != "review"}
    return {
        "messages": [result],
        "task_results": {**kept, "draft": result.content}
    }

def reviewer_agent(state: MultiAgentState):
    """Reviewer agent"""
    draft = state["task_results"].get("draft", "")
    prompt = f"Review the following report and suggest revisions: {draft}"
    result = llm.invoke(prompt)
    return {
        "messages": [result],
        "task_results": {**state["task_results"], "review": result.content}
    }

# Supervisor node: records which agent should run next
def supervisor(state: MultiAgentState):
    """Decide which agent runs next"""
    results = state.get("task_results", {})

    if "research" not in results:
        next_agent = "research"
    elif "draft" not in results:
        next_agent = "writer"
    elif "review" not in results:
        next_agent = "reviewer"
    elif "needs revision" in results["review"].lower():
        next_agent = "writer"  # loop back to the writer
    else:
        next_agent = END
    return {"next_agent": next_agent}

# Routing function: a node must return a state update, so routing is a separate function
def route_next(state: MultiAgentState) -> str:
    return state["next_agent"]

# Build the multi-agent graph
builder = StateGraph(MultiAgentState)
builder.add_node("research", research_agent)
builder.add_node("writer", writer_agent)
builder.add_node("reviewer", reviewer_agent)
builder.add_node("supervisor", supervisor)

builder.set_entry_point("supervisor")
builder.add_conditional_edges(
    "supervisor",
    route_next,
    {"research": "research", "writer": "writer", "reviewer": "reviewer", END: END},
)

# Every agent returns to the supervisor when it finishes
builder.add_edge("research", "supervisor")
builder.add_edge("writer", "supervisor")
builder.add_edge("reviewer", "supervisor")

multi_agent_graph = builder.compile()
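
The supervisor's decision order can be traced without an LLM by feeding it canned reviewer outputs. The sketch below is a hypothetical simulation (`choose_next`, `simulate`, and the canned strings are assumptions); it also shows one way to keep the revision loop from spinning, namely discarding the old review whenever a new draft is written.

```python
from typing import Dict, List

def choose_next(results: Dict[str, str]) -> str:
    """Decision order: research, draft, review, then finish or revise."""
    if "research" not in results:
        return "research"
    if "draft" not in results:
        return "writer"
    if "review" not in results:
        return "reviewer"
    if "needs revision" in results["review"]:
        return "writer"
    return "END"

def simulate(reviews: List[str]) -> List[str]:
    """Run the loop with canned reviewer outputs and record the agent order."""
    results: Dict[str, str] = {}
    pending_reviews = list(reviews)
    trace: List[str] = []
    while True:
        agent = choose_next(results)
        trace.append(agent)
        if agent == "END":
            return trace
        if agent == "research":
            results["research"] = "notes"
        elif agent == "writer":
            results["draft"] = "draft text"
            results.pop("review", None)  # a new draft invalidates the old review
        else:  # reviewer
            results["review"] = pending_reviews.pop(0)

trace = simulate(["needs revision", "approved"])
print(trace)
```

Without the `pop`, the stale "needs revision" review would send every new draft straight back to the writer and the loop would never reach the reviewer again.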

Architecture pattern: Decentralized (Network)

from langchain_core.tools import BaseTool

# Agents can talk to each other directly, without going through a central node
class AgentConfig(TypedDict):
    name: str
    system_prompt: str
    tools: List[BaseTool]

def create_agent_node(config: AgentConfig):
    """Factory function: creates an agent node"""
    def agent(state: MultiAgentState):
        messages = [{"role": "system", "content": config["system_prompt"]}] + state["messages"]
        response = llm.bind_tools(config["tools"]).invoke(messages)
        return {"messages": [response]}
    return agent

# Define several agents
# (search_docs, run_tests, and draw_diagram are assumed to be tools defined elsewhere)
agents = {
    "coder": create_agent_node({
        "name": "coder",
        "system_prompt": "You are a programming expert...",
        "tools": [search_docs, run_tests]
    }),
    "designer": create_agent_node({
        "name": "designer",
        "system_prompt": "You are a software architect...",
        "tools": [draw_diagram]
    })
}

# Build a fully connected graph (example)
builder = StateGraph(MultiAgentState)
for name, node in agents.items():
    builder.add_node(name, node)

# Each agent picks the next agent based on the message content
def route_message(state: MultiAgentState) -> str:
    last_msg = state["messages"][-1]
    if "code" in last_msg.content:
        return "coder"
    elif "design" in last_msg.content:
        return "designer"
    return END

for name in agents:
    builder.add_conditional_edges(name, route_message)

# An entry point still has to be set before this graph can be compiled

VI. Integration with the LangChain Ecosystem

Combining with LCEL

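from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Use an LCEL chain as a LangGraph node
# (the prompt template is a placeholder; `llm` is assumed to be initialized elsewhere)
prompt = ChatPromptTemplate.from_template("{query}")
lcel_chain = prompt | llm | StrOutputParser()

def lcel_node(state: AgentState):
    # Assumes the state schema defines "input" and "output" fields
    result = lcel_chain.invoke({"query": state["input"]})
    return {"output": result}

builder.add_node("lcel_step", lcel_node)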

LangSmith Integration (Automatic Tracing)

import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls-xxxx"

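# All LangGraph runs are then traced automatically
# In the LangSmith platform you can inspect:
# - The full sequence of node executions (trace)
# - The inputs and outputs of each node
# - The history of state changes
# - Latency and token usage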

VII. Complete Example: An Intelligent Research Assistant

from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
from typing import TypedDict, Annotated, List
import operator

# State definition
class ResearchState(TypedDict):
    topic: str
    search_queries: List[str]
    search_results: Annotated[List[str], operator.add]
    analysis: str
    report: str
    iteration: int

# Initialization
llm = ChatOpenAI(model="gpt-4o")
search = DuckDuckGoSearchRun()

# Node implementations
def planner_node(state: ResearchState):
    """Plan the search queries"""
    prompt = f"Generate 3 search queries for the topic '{state['topic']}', one per line"
    response = llm.invoke(prompt)
    queries = [q.strip() for q in response.content.split("\n") if q.strip()]
    return {"search_queries": queries, "iteration": 0}

def search_node(state: ResearchState):
    """Run the searches"""
    results = []
    for query in state["search_queries"]:
        result = search.run(query)
        results.append(f"Query: {query}\nResult: {result}")
    return {"search_results": results, "iteration": state["iteration"] + 1}

def analysis_node(state: ResearchState):
    """Analyze the results"""
    context = "\n\n".join(state["search_results"])
    prompt = (
        f"Analyze the following information:\n{context}\n"
        "If it is insufficient, include the phrase 'more information needed'."
    )
    analysis = llm.invoke(prompt).content
    return {"analysis": analysis}

def writer_node(state: ResearchState):
    """Write the report"""
    prompt = f"Write a report based on this analysis:\n{state['analysis']}"
    report = llm.invoke(prompt).content
    return {"report": report}

def should_continue(state: ResearchState) -> str:
    """Check whether deeper research is needed"""
    if state["iteration"] < 2 and "more information needed" in state["analysis"].lower():
        # Loop back to search; the queries stay the same unless a node updates them
        return "continue"
    return "finish"

# Build the graph
builder = StateGraph(ResearchState)
builder.add_node("planner", planner_node)
builder.add_node("search", search_node)
builder.add_node("analysis", analysis_node)
builder.add_node("writer", writer_node)

builder.set_entry_point("planner")
builder.add_edge("planner", "search")
builder.add_edge("search", "analysis")

# Conditional branch: may loop back to search
builder.add_conditional_edges(
    "analysis",
    should_continue,
    {"continue": "search", "finish": "writer"}
)

builder.add_edge("writer", END)

# Add persistence
from langgraph.checkpoint.memory import MemorySaver
graph = builder.compile(checkpointer=MemorySaver())

# Run
config = {"configurable": {"thread_id": "research-001"}}
result = graph.invoke(
    {"topic": "Latest advances in quantum computing"},
    config
)
print(result["report"])

VIII. Performance Optimization and Best Practices

1. State Design Principles

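from typing import TypedDict, Annotated, List

from langchain_core.documents import Document
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

# ✅ Good practice: fine-grained state without redundancy
class GoodState(TypedDict):
    messages: Annotated[List[BaseMessage], add_messages]  # appends new messages
    metadata: dict  # lightweight metadata

# ❌ Avoid: storing large objects repeatedly
class BadState(TypedDict):
    full_history: str  # a large string is re-serialized at every checkpoint
    large_documents: List[Document]  # store references instead of full text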
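
The cost difference between embedding documents and storing references is easy to see by serializing both shapes. The sketch below is an illustration under assumptions: `DOCUMENT_STORE` stands in for an external store such as a database or object storage, and `load_documents` is a hypothetical helper.

```python
import json
from typing import Dict, List

# Hypothetical external store: document ID -> full text
DOCUMENT_STORE: Dict[str, str] = {
    "doc-1": "x" * 10_000,
    "doc-2": "y" * 10_000,
}

# State that embeds the full text
heavy_state = {"documents": list(DOCUMENT_STORE.values())}

# State that stores only references
light_state = {"document_ids": list(DOCUMENT_STORE.keys())}

def load_documents(state: Dict[str, List[str]]) -> List[str]:
    """Resolve references on demand inside the node that needs the text."""
    return [DOCUMENT_STORE[doc_id] for doc_id in state["document_ids"]]

heavy_size = len(json.dumps(heavy_state))
light_size = len(json.dumps(light_state))
print(heavy_size, light_size)
```

Since a checkpointer serializes the state after each step, the smaller shape is written many times over the life of a run, which is where the saving adds up.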

2. Asynchronous Execution

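import asyncio

# Async node (assumes `async_llm` is a chat model and the state has "input"/"output" fields)
async def async_node(state: AgentState):
    result = await async_llm.ainvoke(state["input"])
    return {"output": result}

# Compilation is the same for sync and async graphs
async_graph = builder.compile()

# Run asynchronously; `await` must be used inside a coroutine
async def main():
    return await async_graph.ainvoke(state, config)

result = asyncio.run(main())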
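
The benefit of async nodes shows up when a node issues several I/O-bound calls. In this stdlib-only sketch, `fake_llm_call` is a hypothetical stand-in that sleeps instead of calling a model, so the example runs without any API key.

```python
import asyncio
import time
from typing import Dict, List

async def fake_llm_call(prompt: str) -> str:
    """Stand-in for an async LLM call; sleeps instead of doing network I/O."""
    await asyncio.sleep(0.1)
    return f"answer to {prompt}"

async def async_node(state: Dict[str, List[str]]) -> Dict[str, List[str]]:
    # The calls are awaited together, so their waiting time overlaps
    answers = await asyncio.gather(*(fake_llm_call(p) for p in state["prompts"]))
    return {"answers": list(answers)}

start = time.perf_counter()
update = asyncio.run(async_node({"prompts": ["a", "b", "c"]}))
elapsed = time.perf_counter() - start
print(update["answers"], f"{elapsed:.2f}s")
```

`asyncio.gather` returns results in the order the awaitables were passed, so the answers line up with the prompts even though the calls run concurrently.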

3. Streaming Output

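# Stream intermediate steps (by default, one update per node)
for event in graph.stream(state, config):
    print(f"Node: {list(event.keys())}")
    print(f"Data: {list(event.values())}")

# Stream LLM tokens as they are generated; each item is a (message chunk, metadata) pair
for chunk, metadata in graph.stream(state, config, stream_mode="messages"):
    print(chunk.content, end="")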
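
The shape of per-node streaming can be mimicked with a generator that yields one update per step instead of returning only the final state. This is a plain-Python analogy, not LangGraph's streaming implementation; `stream_updates` and the two lambda nodes are hypothetical.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

Node = Callable[[Dict[str, Any]], Dict[str, Any]]

def stream_updates(
    nodes: List[Tuple[str, Node]], state: Dict[str, Any]
) -> Iterator[Dict[str, Dict[str, Any]]]:
    """Yield {node_name: update} after each node runs."""
    for name, node in nodes:
        update = node(state)
        state = {**state, **update}
        yield {name: update}

pipeline = [
    ("plan", lambda s: {"steps": ["search", "write"]}),
    ("write", lambda s: {"report": f"{len(s['steps'])} steps done"}),
]

events = list(stream_updates(pipeline, {}))
for event in events:
    print(event)
```

A consumer can act on each event as it arrives, for example to update a progress indicator, rather than waiting for the whole run to finish.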

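IX. Choosing the Right Tool: Summary

Scenario	Recommended approach	Reason
Simple API call chain	LangChain LCEL	Concise, low learning curve
Multi-step RAG	LangGraph	Needs verification loops and source tracking
Tool-using agent	LangGraph ReAct	Think-act loop
Multi-agent collaboration	LangGraph Multi-Agent	Shared state, flexible routing
Human review workflow	LangGraph + Interrupt	Native support for interrupt and resume
Long-running tasks	LangGraph + Checkpoint	Resumable execution, fault tolerance
Real-time conversational system	LangGraph + async	Streaming output, concurrent processing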

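X. Learning Resources

Official documentation: langchain-ai.github.io/langgraph/
Conceptual guide: langchain-ai.github.io/langgraph/c...
Example repository: github.com/langchain-a...
LangGraph Studio: a visual editing and debugging tool (desktop app)

Through graph-based orchestration and state persistence, LangGraph addresses the core pain points of complex AI applications: loops, fault-tolerant recovery, and human intervention. That makes it a strong choice for building production-grade agent systems.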