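LangGraph in Practice: Building State-Driven LLM Application Architectures

Preface

As large language model (LLM) applications mature, more and more scenarios call for applications that are stateful, multi-step, and able to "remember" context. The traditional Chain pattern struggles with complex business logic and state management.

LangGraph was created to fill this gap. As the framework in the LangChain ecosystem for building stateful, multi-actor applications, it defines the application flow as a graph and provides state management, loops and branching, and human-in-the-loop collaboration.

This article covers LangGraph's core concepts, architecture, and hands-on examples to help you get a solid grip on the framework.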

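1. LangGraph Core Concepts

1.1 What Is LangGraph

LangGraph is an open-source framework from the LangChain team, built for stateful, LLM-driven applications. It uses a directed graph to define the application's execution flow.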

[Flowchart: Start → Agent thinks → Needs a tool? If yes: Run tool, then loop back to Agent thinks. If no: Generate reply → End]

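1.2 Why LangGraph

Traditional Chain	LangGraph
Linear execution flow	Supports loops and branches
Stateless or simple state	Full state management
Single path	Multiple paths, run in parallel
Hard to handle errors	Built-in error handling and retries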

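1.3 Core Concepts

State: the data structure shared across the application at run time
Nodes: units of execution (LLM calls, tool execution, and so on)
Edges: the transition logic between nodes
Graph: the complete definition of the application workflow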
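To make these four terms concrete, here is a toy, dependency-free sketch of the idea. It is not LangGraph's implementation: state is a plain dict, nodes are functions that return partial updates, edges are functions that pick the next node, and the graph is those two tables plus a run loop. All names (`run_graph`, `think`, `use_tool`, `reply`) are made up for illustration.

```python
from typing import Callable, Dict

State = Dict[str, object]
END = "__end__"  # sentinel meaning "stop"; the name is chosen for this sketch

# Nodes: functions that read the state and return a partial update.
def think(state: State) -> State:
    # Pretend the agent needs a tool until a tool result exists.
    return {"needs_tool": "tool_result" not in state}

def use_tool(state: State) -> State:
    return {"tool_result": f"looked up: {state['question']}"}

def reply(state: State) -> State:
    return {"answer": f"Based on '{state['tool_result']}', here is my reply."}

NODES: Dict[str, Callable[[State], State]] = {
    "think": think,
    "use_tool": use_tool,
    "reply": reply,
}

# Edges: functions that pick the next node name from the current state.
EDGES: Dict[str, Callable[[State], str]] = {
    "think": lambda s: "use_tool" if s["needs_tool"] else "reply",
    "use_tool": lambda s: "think",  # loop back after the tool runs
    "reply": lambda s: END,
}

def run_graph(entry: str, state: State, max_steps: int = 20) -> State:
    """Graph: run nodes, merge their updates into the state, follow edges."""
    current, trace = entry, []
    for _ in range(max_steps):
        if current == END:
            break
        trace.append(current)
        state = {**state, **NODES[current](state)}
        current = EDGES[current](state)
    return {**state, "trace": trace}

result = run_graph("think", {"question": "What is LangGraph?"})
print(result["trace"])
```

The `max_steps` guard plays the role of a recursion limit: a graph with loops needs some bound so a routing bug cannot run forever.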

2. Quick Start

2.1 Installing LangGraph

# Python
pip install langgraph

# JavaScript
npm install @langchain/langgraph

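2.2 A First LangGraph Application

from typing import Annotated, TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

# Define the state schema
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]  # conversation history; new messages are appended
    user_input: str
    agent_response: str

def respond(state: AgentState) -> dict:
    return {"agent_response": f"Echo: {state['user_input']}"}  # placeholder for an LLM call

graph = StateGraph(AgentState)
graph.add_node("respond", respond)
graph.add_edge(START, "respond")
graph.add_edge("respond", END)
app = graph.compile()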

3. Core Components in Detail

3.1 State Management

LangGraph state is typically defined with a TypedDict:

from typing import TypedDict
from langgraph.graph import MessagesState

class GraphState(MessagesState):
    """Inherit from MessagesState to get a managed `messages` field."""
    pass

# Or define a custom, richer state
class CustomerSupportState(TypedDict):
    customer_id: str
    question: str
    category: str
    context: dict
    history: list
    resolved: bool
    agent_response: str
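
A state field can also carry a reducer, for example `Annotated[list, operator.add]`, that controls how a node's update is merged into the existing value. `MessagesState` does this for its `messages` field. The sketch below imitates that merge rule in plain Python so the behavior is easy to see; `DemoState` and `apply_update` are hypothetical names for this illustration, not LangGraph APIs.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class DemoState(TypedDict):
    history: Annotated[list, operator.add]  # reducer: concatenate new items
    category: str                           # no reducer: last write wins

def apply_update(schema: type, state: dict, update: dict) -> dict:
    """Merge a node's partial update into state, honoring reducers."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] exposes its extra arguments as __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = next((m for m in metadata if callable(m)), None)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value
    return merged

state = {"history": ["hi"], "category": ""}
state = apply_update(
    DemoState, state, {"history": ["how can I help?"], "category": "billing"}
)
state = apply_update(DemoState, state, {"category": "technical"})
print(state)
```

The practical consequence: a field without a reducer is overwritten by each node that returns it, so list-like fields such as a conversation history usually want one.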

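3.2 Defining Nodes

A node can be any Python function that takes the state and returns a state update:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")

def analyze_intent(state: CustomerSupportState) -> dict:
    """Classify the customer's question."""
    prompt = (
        "Classify the following customer question as technical, billing, or complaint. "
        f"Reply with the category only: {state['question']}"
    )

    response = llm.invoke(prompt)
    category = response.content.strip().lower()

    return {"category": category}

def generate_response(state: CustomerSupportState) -> dict:
    """Generate a reply."""
    prompt = f"As a customer service agent, answer this question: {state['question']}"

    response = llm.invoke(prompt)

    return {"agent_response": response.content, "resolved": True}

3.3 Conditional Edges

Conditional edges provide dynamic routing:

from langgraph.graph import StateGraph, START, END

def route_question(state: CustomerSupportState) -> str:
    """Route to a handling flow based on the question category."""
    category = state.get("category", "")

    if category == "technical":
        return "technical_support"
    elif category == "billing":
        return "billing_support"
    elif category == "complaint":
        return "complaint_handling"
    else:
        return "general_support"

# Build the graph
workflow = StateGraph(CustomerSupportState)
workflow.add_node("analyze", analyze_intent)
workflow.add_node("respond", generate_response)
workflow.add_edge(START, "analyze")

# Map the router's return values to node names. For brevity every branch
# shares the "respond" node here; a real system would register one node per branch.
workflow.add_conditional_edges(
    "analyze",
    route_question,
    {
        "technical_support": "respond",
        "billing_support": "respond",
        "complaint_handling": "respond",
        "general_support": "respond",
    }
)
workflow.add_edge("respond", END)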
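
Because the category comes from free-form model output, an exact string comparison in the router is brittle: replies such as "Technical." or "billing issue" would fall through to the default branch. One possible safeguard is to normalize the label before routing. This is a minimal sketch; `normalize_category` and the `ROUTES` table are assumptions for illustration and are not part of the original example.

```python
ROUTES = {
    "technical": "technical_support",
    "billing": "billing_support",
    "complaint": "complaint_handling",
}

def normalize_category(raw: str) -> str:
    """Map free-form model output to a known category, or 'other'."""
    text = raw.strip().lower()
    for category in ROUTES:
        # Substring match tolerates punctuation and extra words.
        if category in text:
            return category
    return "other"

def route_question(state: dict) -> str:
    """Return the branch name for the state's (possibly messy) category."""
    category = normalize_category(state.get("category", ""))
    return ROUTES.get(category, "general_support")

for raw in ["Technical.", "  billing issue ", "COMPLAINT", "hello"]:
    print(repr(raw), "->", route_question({"category": raw}))
```

Substring matching is deliberately loose; if the labels could overlap, a stricter approach such as asking the model for structured output would be safer.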

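4. Case Study: An Intelligent Customer Service System

4.1 System Design

We will build a customer service system that:

Automatically identifies the type of customer issue
Routes each type to a dedicated handling flow
Supports human intervention
Records the complete conversation history

4.2 Full Implementation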

from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool

# ============ State definition ============
class CustomerServiceState(MessagesState):
    """State for the customer service system."""
    customer_id: str
    category: str
    needs_human: bool
    resolution: str

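# ============ Tool definitions ============
@tool
def search_knowledge_base(query: str) -> str:
    """Search the knowledge base."""
    kb = {
        "refund policy": "Refunds are available within 7 days, no questions asked...",
        "technical issue": "Please try restarting the device...",
        "billing inquiry": "Please check under My Account..."
    }
    return kb.get(query, "No relevant information found")

@tool
def create_ticket(issue: str, customer_id: str) -> str:
    """Create a support ticket."""
    return f"Ticket created: {customer_id} - {issue}"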

# ============ Node definitions ============
llm = ChatOpenAI(model="gpt-4")

def classify_issue(state: CustomerServiceState) -> dict:
    """Classify the issue."""
    messages = state["messages"]
    last_message = messages[-1].content if messages else ""

    prompt = f"""Classify the following customer issue as one of: technical, billing, complaint, other.
    {last_message}

    Reply with the category name only."""

    category = llm.invoke(prompt).content.strip().lower()

    # Complaints are escalated to a human agent
    needs_human = category == "complaint"

    return {
        "category": category,
        "needs_human": needs_human
    }

def handle_technical(state: CustomerServiceState) -> dict:
    """Handle technical issues."""
    tools = [search_knowledge_base]
    llm_with_tools = llm.bind_tools(tools)

    # Note: if the model answers with a tool call, response.content may be empty;
    # executing the tool call is left out of this example.
    response = llm_with_tools.invoke(state["messages"])

    return {
        "resolution": response.content,
        "needs_human": False
    }

def handle_billing(state: CustomerServiceState) -> dict:
    """Handle billing issues."""
    tools = [search_knowledge_base]
    llm_with_tools = llm.bind_tools(tools)

    response = llm_with_tools.invoke(state["messages"])

    return {
        "resolution": response.content,
        "needs_human": False
    }

def escalate_to_human(state: CustomerServiceState) -> dict:
    """Escalate to a human agent."""
    customer_id = state.get("customer_id", "unknown")
    resolution = f"Your issue has been transferred to a human agent (customer ID: {customer_id})"
    return {"resolution": resolution}

def should_escalate(state: CustomerServiceState) -> str:
    """Decide whether to escalate, and otherwise which branch to take."""
    if state.get("needs_human"):
        return "escalate"

    category = state.get("category", "")
    if category in ["technical", "billing"]:
        return category
    return "general"

def continue_chat(state: CustomerServiceState) -> dict:
    """Continue the conversation."""
    if not state.get("resolution"):
        response = llm.invoke(state["messages"])
        # Record the reply as the resolution so the loop can terminate
        return {"messages": [response], "resolution": response.content}
    return {}

def check_resolved(state: CustomerServiceState) -> bool:
    """Check whether the issue has been resolved."""
    return bool(state.get("resolution"))

# ============ Building the graph ============
def build_customer_service_graph():
    workflow = StateGraph(CustomerServiceState)

    # Add nodes
    workflow.add_node("classify", classify_issue)
    workflow.add_node("technical", handle_technical)
    workflow.add_node("billing", handle_billing)
    workflow.add_node("escalate", escalate_to_human)
    workflow.add_node("continue", continue_chat)

    # Set the entry point
    workflow.set_entry_point("classify")

    # Add conditional edges
    workflow.add_conditional_edges(
        "classify",
        should_escalate,
        {
            "escalate": "escalate",
            "technical": "technical",
            "billing": "billing",
            "general": "continue"
        }
    )

    # Add a loop edge (keep chatting until the issue is resolved)
    workflow.add_conditional_edges(
        "continue",
        check_resolved,
        {
            True: END,
            False: "continue"
        }
    )

    # Every handling flow ends the run
    workflow.add_edge("technical", END)
    workflow.add_edge("billing", END)
    workflow.add_edge("escalate", END)

    # Compile so the caller gets a runnable graph
    return workflow.compile()

# ============ Using the system ============
if __name__ == "__main__":
    # Build the graph
    app = build_customer_service_graph()

    # Visualize the graph as Mermaid source
    print(app.get_graph().draw_mermaid())

    # Invoke; customer_id is part of the state, so it goes in the input
    result = app.invoke(
        {
            "messages": [
                {"role": "user", "content": "My device cannot connect to the network"}
            ],
            "customer_id": "C12345"
        }
    )

    print(f"\nFinal reply: {result.get('resolution', '')}")

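5. Advanced Features

5.1 Memory

LangGraph persists conversation state through checkpointers. Compile the graph with a checkpointer, then use a thread ID to keep each conversation's state separate:

from langgraph.checkpoint.memory import MemorySaver

# In-memory checkpointer; use a database-backed saver (for example SQLite) to survive restarts
checkpointer = MemorySaver()

# Attach the checkpointer at compile time (`workflow` is an uncompiled StateGraph)
app = workflow.compile(checkpointer=checkpointer)

# Converse within one thread by reusing its thread ID
thread_id = "user_123"
config = {"configurable": {"thread_id": thread_id}}

messages = [
    {"role": "user", "content": "My device cannot connect to the network"},
    {"role": "user", "content": "I already tried restarting it"},
]

for message in messages:
    result = app.invoke(
        {"messages": [message]},
        config=config
    )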
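
Conceptually, a checkpointer stores a snapshot of the state per thread after each run, so the next call on the same thread resumes where the last one stopped while other threads stay isolated. The toy class below illustrates only that idea; it is an in-memory stand-in written for this article, not LangGraph's checkpointer, and `ToyCheckpointer` and `chat_turn` are hypothetical names.

```python
from collections import defaultdict
from copy import deepcopy

class ToyCheckpointer:
    """Keeps every saved state per thread_id, oldest first."""

    def __init__(self) -> None:
        self._history = defaultdict(list)

    def latest(self, thread_id: str) -> dict:
        snapshots = self._history[thread_id]
        return deepcopy(snapshots[-1]) if snapshots else {"messages": []}

    def save(self, thread_id: str, state: dict) -> None:
        self._history[thread_id].append(deepcopy(state))

    def history(self, thread_id: str) -> list:
        return deepcopy(self._history[thread_id])

def chat_turn(cp: ToyCheckpointer, thread_id: str, user_text: str) -> dict:
    """Load the thread's state, add one exchange, and checkpoint it."""
    state = cp.latest(thread_id)
    state["messages"].append(("user", user_text))
    turn = len(state["messages"]) // 2 + 1
    state["messages"].append(("assistant", f"reply #{turn}"))
    cp.save(thread_id, state)
    return state

cp = ToyCheckpointer()
chat_turn(cp, "user_123", "hello")
second = chat_turn(cp, "user_123", "my device is offline")
other = chat_turn(cp, "user_456", "hi")
print(len(second["messages"]), len(other["messages"]))
```

Keeping every snapshot rather than only the latest one is what makes inspecting or resuming from an earlier state possible.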

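5.2 Time-Travel Debugging

Because every step is checkpointed, LangGraph supports "time travel":

# Inspect past states (most recent first)
history = list(app.get_state_history(config))
for snapshot in history:
    print(snapshot.next, snapshot.values)

# Roll back: resume execution from an earlier checkpoint
if len(history) > 1:
    previous = history[1]  # the checkpoint just before the latest one
    app.invoke(None, config=previous.config)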

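5.3 Multi-Agent Collaboration

Building a multi-agent collaboration system:

import re
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

# search_tool, wikipedia_tool, and file_tool stand for tools defined elsewhere.
# The prompt strings are illustrative. Newer LangGraph releases rename
# `state_modifier` to `prompt`.

class ContentCreationState(TypedDict):
    topic: str
    research_data: str
    draft: str
    final_output: str
    approved: bool

# Researcher agent
researcher = create_react_agent(
    ChatOpenAI(model="gpt-4"),
    tools=[search_tool, wikipedia_tool],
    state_modifier="You are a researcher. Gather key facts about the topic."
)

# Writer agent
writer = create_react_agent(
    ChatOpenAI(model="gpt-4"),
    tools=[file_tool],
    state_modifier="You are a writer. Turn research notes into a clear draft."
)

# Reviewer agent
reviewer = create_react_agent(
    ChatOpenAI(model="gpt-4"),
    tools=[],
    state_modifier="You are a reviewer. Reply with a score from 1 to 10 on the first line, then your comments."
)

def ask(agent, text: str) -> str:
    """Send one user message to a prebuilt agent and return its final reply."""
    result = agent.invoke({"messages": [("user", text)]})
    return result["messages"][-1].content

# Nodes of the collaboration graph
def research_node(state):
    return {"research_data": ask(researcher, f"Research this topic: {state['topic']}")}

def write_node(state):
    return {"draft": ask(writer, f"Topic: {state['topic']}\nResearch: {state['research_data']}")}

def review_node(state):
    review = ask(reviewer, f"Review this draft:\n{state['draft']}")
    match = re.search(r"\d+", review)  # first number in the reply is the score
    score = int(match.group()) if match else 0
    return {"final_output": review, "approved": score > 8}

# Build the workflow
workflow = StateGraph(ContentCreationState)
workflow.add_node("research", research_node)
workflow.add_node("write", write_node)
workflow.add_node("review", review_node)

workflow.add_edge(START, "research")
workflow.add_edge("research", "write")
workflow.add_edge("write", "review")

# Approved drafts end the run; rejected drafts go back to the writer
workflow.add_conditional_edges(
    "review",
    lambda x: "approved" if x["approved"] else "revision",
    {"approved": END, "revision": "write"}
)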
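
A review loop like this can cycle for a long time if the reviewer never approves. One common safeguard is to cap the number of revisions in the routing function. The sketch below shows that routing logic on its own, without any agents; `revision_count`, `MAX_REVISIONS`, and the `give_up` branch are assumptions added for illustration and do not appear in the original state.

```python
MAX_REVISIONS = 3  # assumed limit for this sketch

def route_after_review(state: dict) -> str:
    """Pick the next step after the reviewer has scored the draft."""
    if state.get("approved"):
        return "approved"
    if state.get("revision_count", 0) >= MAX_REVISIONS:
        return "give_up"
    return "revision"

def simulate(scores: list) -> list:
    """Replay reviewer scores and record the route taken after each review."""
    state = {"revision_count": 0}
    routes = []
    for score in scores:
        state["approved"] = score > 8  # same threshold as the review node
        route = route_after_review(state)
        routes.append(route)
        if route != "revision":
            break
        state["revision_count"] += 1
    return routes

print(simulate([5, 7, 9]))
print(simulate([1, 2, 3, 4, 5]))
```

In a graph, each returned label would be mapped to a node or to the end of the run, and the node that rewrites the draft would increment `revision_count`.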

6. Production Deployment

6.1 Deploying with LangServe

from fastapi import FastAPI
from langserve import add_routes

app = FastAPI()

# Compile the graph (`workflow` is the StateGraph built earlier)
compiled_graph = workflow.compile()

# Add the LangServe routes
add_routes(
    app,
    compiled_graph,
    path="/customer-service"
)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)

6.2 Monitoring and Debugging

import os

# Enable LangSmith tracing through environment variables
os.environ["LANGCHAIN_API_KEY"] = "your-key"
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# With tracing enabled, every graph invocation is traced automatically
app = workflow.compile()

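7. Best Practices

7.1 State Design Principles

Principle	Description
Minimize state	Store only the data you need
Explicit types	Define types with TypedDict
Avoid deep nesting	Keep the state structure flat
Versioning	Keep state compatible when fields change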

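7.2 Error Handling

from typing import Annotated, TypedDict

MAX_RETRIES = 3  # TypedDict fields cannot carry default values, so the limit lives in a constant

class AgentState(TypedDict):
    error: Annotated[str, "error message"]
    retry_count: Annotated[int, "number of retries so far"]

def error_handler(state: AgentState) -> dict:
    if state["retry_count"] < MAX_RETRIES:
        return {
            "retry_count": state["retry_count"] + 1,
            "error": ""  # clear the error and retry
        }

    # Retry limit exceeded
    return {
        "error": "Processing failed, please contact the administrator"
    }

# Assumes `workflow` has a "process" node; "retry" and "success" must be
# node names registered on the workflow (or keys of a path map)
workflow.add_conditional_edges(
    "process",
    lambda x: "retry" if x["error"] else "success"
)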
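
The retry pattern can be exercised without a graph, which makes it easy to test. The sketch below wraps a step so that exceptions become state updates, then uses a router to decide between retrying, succeeding, and giving up. It is a plain-Python illustration under stated assumptions: `with_error_capture`, `run_with_retries`, and `flaky_process` are hypothetical names, and the small driver loop only mimics what the graph's edges would do.

```python
MAX_RETRIES = 3

def with_error_capture(step):
    """Wrap a node so exceptions become state updates instead of crashes."""
    def wrapped(state: dict) -> dict:
        try:
            return {**step(state), "error": ""}
        except Exception as exc:
            return {
                "error": str(exc),
                "retry_count": state.get("retry_count", 0) + 1,
            }
    return wrapped

def route_after_process(state: dict) -> str:
    """Router: success, retry, or failed once the limit is reached."""
    if not state.get("error"):
        return "success"
    return "retry" if state.get("retry_count", 0) < MAX_RETRIES else "failed"

def run_with_retries(step, state: dict) -> dict:
    """Tiny driver that mimics the process -> router -> process loop."""
    node = with_error_capture(step)
    while True:
        state = {**state, **node(state)}
        route = route_after_process(state)
        if route != "retry":
            return {**state, "route": route}

calls = {"n": 0}

def flaky_process(state: dict) -> dict:
    """Fails twice, then succeeds, to simulate a transient error."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return {"result": "ok"}

final = run_with_retries(flaky_process, {"retry_count": 0})
print(final["route"], final["retry_count"], calls["n"])
```

Counting failures inside the wrapper keeps the router a pure function of the state, which is the property that makes it straightforward to unit test.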

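7.3 Performance Optimization

Strategy	How to apply it
Parallel execution	Fan out from one node to several nodes so they run in the same step, or use RunnableParallel inside a node
Streaming output	Stream the response as it is generated inside a node
Caching	Cache the results of computations that do not change
Async calls	Use ainvoke for asynchronous execution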
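
To illustrate why parallel and asynchronous execution help, the sketch below compares two independent I/O-bound calls run one after the other with the same calls run concurrently. It uses only the standard library; `fake_llm_call` is a hypothetical stand-in for a model or tool call, with `asyncio.sleep` simulating network latency.

```python
import asyncio
import time

async def fake_llm_call(name: str, delay: float) -> str:
    """Stand-in for an I/O-bound model or tool call."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def run_sequential() -> list:
    first = await fake_llm_call("kb_search", 0.2)
    second = await fake_llm_call("sentiment", 0.2)
    return [first, second]

async def run_parallel() -> list:
    # gather() runs both coroutines concurrently and preserves result order.
    results = await asyncio.gather(
        fake_llm_call("kb_search", 0.2),
        fake_llm_call("sentiment", 0.2),
    )
    return list(results)

def timed(coro_fn):
    """Run an async function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

seq_result, seq_time = timed(run_sequential)
par_result, par_time = timed(run_parallel)
print(f"sequential: {seq_time:.2f}s, parallel: {par_time:.2f}s")
```

The gain only applies to steps that do not depend on each other's output; dependent steps still have to run in order.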

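8. Summary and Outlook

Key Takeaways

Graph-based modeling: define complex application flows as a directed graph
State-driven: full state management and time-travel debugging
Multiple node types: nodes can be LLM calls, tools, or human steps
Production-ready: built-in checkpointing, persistence, and monitoring
Ecosystem integration: works with LangChain and LangSmith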

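Use Cases

Intelligent customer service: classification, routing, escalation to a human
Content creation: multi-agent collaboration across research, writing, and review
Data analysis: multi-step data processing pipelines
Workflow automation: orchestration of complex business processes
Game AI: stateful behavior for game NPCs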

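Future Outlook

Visual designer: a drag-and-drop interface for building graphs
Stronger scheduling: support for priorities and resource allocation
Distributed execution: running a graph across multiple machines
Version compatibility: automatic migration when the graph structure changes

For developers, LangGraph is more than a tool for building complex LLM applications. It is also a step toward understanding how to make AI systems run as reliably as ordinary software. As AI applications continue to evolve, LangGraph is well placed to become foundational infrastructure for enterprise-grade AI applications.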

References

This article is based on LangGraph 0.1.0 and covers the Python API.