LangChain v1.0 | Core modules - core_component_06_structured_output

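Structured output

Structured output lets an agent return data in a specific, predictable format. Instead of parsing natural-language responses, you get structured data, such as JSON objects, Pydantic models, or dataclasses, that your application can use directly.

LangChain's create_agent handles structured output automatically. You set the structured output schema you want; when the model generates structured data, it is captured, validated, and returned under the 'structured_response' key of the agent state.

python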

def create_agent(
    ...
    response_format: Union[
        ToolStrategy[StructuredResponseT],
        ProviderStrategy[StructuredResponseT],
        type[StructuredResponseT],
        None,
    ]

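Response format

Use response_format to control how the agent returns structured data:

ToolStrategy[StructuredResponseT]: uses tool calling to obtain structured output
ProviderStrategy[StructuredResponseT]: uses the provider's native structured output
type[StructuredResponseT]: a schema type; the best strategy is chosen automatically based on the model's capabilities
None: structured output is not explicitly requested

When a schema type is provided directly, LangChain chooses automatically:

ProviderStrategy if the selected model and provider support native structured output (for example, OpenAI, Anthropic (Claude), or xAI (Grok))
ToolStrategy for all other models.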
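The selection rule above can be pictured with a small sketch. The helper `choose_strategy` is hypothetical and only mirrors the rule as described here; it is not LangChain's implementation. The `"structured_output"` key is taken from the profile example in this article, and treating missing profile data as "no native support" is an assumption.

```python
from typing import Any, Mapping, Optional


def choose_strategy(profile: Optional[Mapping[str, Any]]) -> str:
    """Return the name of the strategy a bare schema type would map to.

    Hypothetical helper for illustration only.
    """
    # Assumption: missing profile data is treated as "no native support".
    if profile and profile.get("structured_output"):
        return "ProviderStrategy"
    return "ToolStrategy"


print(choose_strategy({"structured_output": True}))
print(choose_strategy({"structured_output": False}))
print(choose_strategy(None))
```

In practice you rarely need such a check yourself: passing the schema type to response_format delegates the decision to LangChain.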

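With langchain>=1.1, support for native structured output is read dynamically from the model's profile data. If that data is unavailable, use another condition or specify it manually:

python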

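from langchain.chat_models import init_chat_model

# Declare native structured output support manually in the model profile.
custom_profile = {
    "structured_output": True,
    # ...
}
model = init_chat_model("...", profile=custom_profile)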

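If tools are specified, the model must support using tools and structured output at the same time.

The structured response is returned under the structured_response key of the agent's final state.

Provider strategy

Some model providers support structured output natively through their APIs (for example, OpenAI, xAI (Grok), Gemini, and Anthropic (Claude)). When available, this is the most reliable approach.

To use this strategy, configure ProviderStrategy:

python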

class ProviderStrategy(Generic[SchemaT]):
    schema: type[SchemaT]
    strict: bool | None = None

The strict parameter requires langchain>=1.2.

schema

required

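The schema that defines the structured output format. Supports:

Pydantic models: BaseModel subclasses with field validation. Returns a validated Pydantic instance.
Dataclasses: Python dataclasses with type annotations. Returns a dict.
TypedDict: typed dictionary classes. Returns a dict.
JSON Schema: a dictionary containing a JSON schema specification. Returns a dict.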

strict

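Optional boolean that enables strict mode. Some providers (for example, OpenAI and xAI) support it. Defaults to None (disabled).

LangChain automatically uses ProviderStrategy when you pass a schema type directly to create_agent.response_format and the model supports native structured output.

Provider-based structured output offers high reliability and strict validation because the model provider enforces the schema. Use it when available.

If the provider natively supports structured output for your chosen model, writing response_format=ProductReview is functionally equivalent to response_format=ProviderStrategy(ProductReview).

In either case, if structured output is not supported, the agent falls back to the tool calling strategy.

python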

from langchain.agents import create_agent
from pydantic import BaseModel, Field

# `model` comes from the author's local module (not part of LangChain);
# substitute your own chat model instance.
from langchain_01.models.scii_deepseekv3 import model


class ContactInfo(BaseModel):
    name: str = Field(description="The name of the person")
    email: str = Field(description="The email address of the person")
    phone: str = Field(description="The phone number of the person")


agent = create_agent(
    model,
    # Passing a schema type directly lets LangChain pick ProviderStrategy
    # automatically when the model supports native structured output.
    response_format=ContactInfo,
)

result = agent.invoke(
    {"messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]}
)
print(result["structured_response"])

Output:

name='John Doe' email='john@example.com' phone='(555) 123-4567'

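Tool calling strategy

For models that do not support native structured output, LangChain uses tool calling to achieve the same result. This works with any model that supports tool calling (most modern models).

To use this strategy, configure ToolStrategy:

python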

class ToolStrategy(Generic[SchemaT]):
    schema: type[SchemaT]
    tool_message_content: str | None
    handle_errors: Union[
        bool,
        str,
        type[Exception],
        tuple[type[Exception], ...],
        Callable[[Exception], str],
    ]

Parameters

schema

required

The schema defining the structured output format. Supports:

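Pydantic models: BaseModel subclasses with field validation. Returns a validated Pydantic instance.
JSON schemas: a dictionary containing a JSON schema specification. Returns a dict.
Dataclasses: Python dataclasses with type annotations. Returns a dict.
TypedDict: typed dictionary classes. Returns a dict.
Union types: multiple schema options. The model chooses the most appropriate schema based on context.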

tool_message_content

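Custom content for the tool message returned when structured output is generated. If not provided, it defaults to a message showing the structured response data.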

handle_errors

Error handling strategy for structured output validation failures. Defaults to True.

True: Catch all errors with the default error template.
str: Catch all errors with this custom message.
type[Exception]: Only catch this exception type, with the default message.
tuple[type[Exception], ...]: Only catch these exception types, with the default message.
Callable[[Exception], str]: Custom function that returns the error message.
False: No retry; let exceptions propagate.
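To make the six cases concrete, here is a minimal sketch of how a handle_errors value could be turned into a retry message. The function `retry_message` and the text of `DEFAULT_TEMPLATE` are hypothetical; they illustrate the semantics listed above and are not LangChain's internal code or its actual default template.

```python
from typing import Callable, Optional, Tuple, Type, Union

HandleErrors = Union[
    bool,
    str,
    Type[Exception],
    Tuple[Type[Exception], ...],
    Callable[[Exception], str],
]

# Placeholder text (assumption); the real default template may differ.
DEFAULT_TEMPLATE = "Error: {error}\nPlease fix your mistakes."


def retry_message(handle_errors: HandleErrors, error: Exception) -> Optional[str]:
    """Return the feedback to send to the model, or None to let `error` propagate."""
    if handle_errors is False:
        return None  # no retry
    if handle_errors is True:
        return DEFAULT_TEMPLATE.format(error=error)
    if isinstance(handle_errors, str):
        return handle_errors  # fixed custom message
    # Exception classes are callable too, so check them before `callable`.
    if isinstance(handle_errors, type) and issubclass(handle_errors, Exception):
        matched = isinstance(error, handle_errors)
        return DEFAULT_TEMPLATE.format(error=error) if matched else None
    if isinstance(handle_errors, tuple):
        matched = isinstance(error, handle_errors)
        return DEFAULT_TEMPLATE.format(error=error) if matched else None
    if callable(handle_errors):
        return handle_errors(error)  # custom handler builds the message
    return None


print(retry_message("Please provide a valid rating.", ValueError("rating=6")))
print(retry_message(ValueError, TypeError("wrong type")))
```

Returning None here stands for "do not retry"; in the real agent the original exception would be raised instead.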

python

from dataclasses import dataclass
from typing import Annotated, Literal, TypedDict, Union

from langchain.agents import create_agent
from langchain.agents.structured_output import ProviderStrategy, ToolStrategy
from pydantic import BaseModel, Field

# `model` is the chat model imported earlier from the author's local module.


# Product review as a Pydantic model
class ProductReviewPydantic(BaseModel):
    """Analysis of a product review."""
    rating: int | None = Field(description="The rating of the product (1-5)")  # The rating of the product (1-5)
    sentiment: Literal["positive", "negative"] = Field(description="The sentiment of the review")  # The sentiment of the review
    key_points: list[str] = Field(description="The key points of the review")  # The key points of the review

# Product review as a TypedDict
class ProductReviewTd(TypedDict):
    """Analysis of a product review."""
    rating: Annotated[int | None, Field(description="The rating of the product (1-5)")]  # The rating of the product (1-5)
    sentiment: Annotated[Literal["positive", "negative"], Field(description="The sentiment of the review")]  # The sentiment of the review
    key_points: Annotated[list[str], Field(description="The key points of the review")]  # The key points of the review

# Product review as a dataclass
@dataclass
class ProductReview:
    """Analysis of a product review."""
    rating: int | None  # The rating of the product (1-5)
    sentiment: Literal["positive", "negative"]  # The sentiment of the review
    key_points: list[str]  # The key points of the review




agent = create_agent(
    model,
    tools=[],
    response_format=ToolStrategy(ProductReviewTd)
)
result = agent.invoke({
    "messages": [{"role": "user", "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"}]
})
result["structured_response"]

Output:

{'rating': 5,
 'sentiment': 'positive',
 'key_points': ['Great product', 'Fast shipping', 'Expensive']}

Custom tool message content

The tool_message_content parameter lets you customize the message that appears in the conversation history when structured output is generated:

python

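from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy
from pydantic import BaseModel, Field


class Meeting(BaseModel):
    """会议信息"""  # "Meeting information"
    task: str = Field(description="会议任务")  # "Meeting task"
    priority: str = Field(description="会议优先级")  # "Meeting priority"
    assignee: str = Field(description="会议负责人")  # "Meeting owner"


# The custom tool message below is the Chinese rendering of
# "Action item captured and added to meeting notes!"
agent = create_agent(
    model,
    tools=[],
    response_format=ToolStrategy(
        schema=Meeting,
        tool_message_content="会议任务已捕获并添加到会议笔记中！",
    ),
)

# Prompt: "Meeting task: update the project timeline. Priority: high. Owner: Sarah"
agent.invoke({
    "messages": [{"role": "user", "content": "会议任务：更新项目时间线 会议优先级：高 会议负责人：Sarah"}]
})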

Output:

{'messages': [HumanMessage(content='会议任务：更新项目时间线 会议优先级：高 会议负责人：Sarah', additional_kwargs={}, response_metadata={}, id='6dbfa4b9-b453-4c58-9e12-5d1f1f1d2766'),
  AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 33, 'prompt_tokens': 360, 'total_tokens': 393, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': None, 'reasoning_tokens': 0, 'rejected_prediction_tokens': None}, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'Qwen/Qwen3-Coder-30B-A3B-Instruct', 'system_fingerprint': '', 'id': '019be14d2522971de8b260f9411771ee', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019be14d-3561-79e3-8de0-7bf6129e20f9-0', tool_calls=[{'name': 'Meeting', 'args': {'task': '更新项目时间线', 'priority': '高', 'assignee': 'Sarah'}, 'id': '019be14d2d0a3b1234140a6f416291ca', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 360, 'output_tokens': 33, 'total_tokens': 393, 'input_token_details': {}, 'output_token_details': {'reasoning': 0}}),
  ToolMessage(content='会议任务已捕获并添加到会议笔记中！', name='Meeting', id='367fcbb3-9ab9-425f-b434-22715e7980d1', tool_call_id='019be14d2d0a3b1234140a6f416291ca')],
 'structured_response': Meeting(task='更新项目时间线', priority='高', assignee='Sarah')}

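Error handling

Models can make mistakes when generating structured output through tool calling. LangChain provides an intelligent retry mechanism to handle these errors automatically.

Multiple structured outputs error

When the model incorrectly calls more than one structured output tool, the agent provides error feedback in a ToolMessage and prompts the model to retry.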
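The check described above can be sketched as follows. The function `feedback_for_tool_calls` and the wording of its error message are hypothetical; only the tool-call dictionary shape (`name`, `args`, `id`, `type`) is taken from the agent output shown earlier in this article.

```python
from typing import Dict, List, Set


def feedback_for_tool_calls(tool_calls: List[dict], schema_names: Set[str]) -> Dict[str, str]:
    """Map tool_call id -> error feedback when several structured output tools are called."""
    structured = [call for call in tool_calls if call["name"] in schema_names]
    if len(structured) <= 1:
        return {}  # zero or one structured output: nothing to correct
    names = ", ".join(call["name"] for call in structured)
    message = (
        f"Error: multiple structured outputs returned ({names}); "
        "expected exactly one. Please retry with a single one."
    )
    # One feedback message per offending call, keyed by its tool_call id.
    return {call["id"]: message for call in structured}


calls = [
    {"name": "ContactInfo", "args": {"name": "John Doe"}, "id": "call_1", "type": "tool_call"},
    {"name": "EventDetails", "args": {"event_name": "Tech Conference"}, "id": "call_2", "type": "tool_call"},
]
for call_id, text in feedback_for_tool_calls(calls, {"ContactInfo", "EventDetails"}).items():
    print(call_id, "->", text)
```

Each feedback entry would correspond to one ToolMessage sent back to the model before it tries again.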

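Schema validation error

When the structured output does not match the expected schema, the agent provides specific error feedback.

I could not reproduce the two errors above in testing; I will deal with them when they come up in production.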
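Since the errors are hard to trigger with a real model, the validate-and-retry loop can be simulated offline. This is a sketch under stated assumptions: `fake_model`, `run_with_retries`, the `max_attempts` limit, and the feedback text are all hypothetical, and a dataclass with `__post_init__` stands in for the Pydantic `ge=1, le=5` constraint used later in this article.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProductRating:
    rating: Optional[int]
    comment: str

    def __post_init__(self) -> None:
        # Stand-in for the ge=1, le=5 constraint on the Pydantic field.
        if self.rating is not None and not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be between 1 and 5, got {self.rating}")


def run_with_retries(
    model_fn: Callable[[List[str]], dict], max_attempts: int = 3
) -> ProductRating:
    """Call the model, validate its output, and feed errors back until it validates."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        args = model_fn(feedback)
        try:
            return ProductRating(**args)
        except (TypeError, ValueError) as exc:
            # Plays the role of the ToolMessage carrying the error feedback.
            feedback.append(f"Error: {exc}. Please fix your mistakes.")
    raise RuntimeError("structured output still invalid after retries")


def fake_model(feedback: List[str]) -> dict:
    # First attempt is out of range; after receiving feedback it corrects itself.
    if not feedback:
        return {"rating": 6, "comment": "Amazing product"}
    return {"rating": 5, "comment": "Amazing product"}


result = run_with_retries(fake_model)
print(result)
```

Setting handle_errors=False corresponds to skipping the except branch, so the first validation error would propagate to the caller.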

Error handling strategies:

You can customize error handling with the handle_errors parameter:

1. Custom error message:

python

ToolStrategy(
    schema=ProductRating,
    handle_errors="Please provide a valid rating between 1-5 and include a comment."
)

If handle_errors is a string, the agent always prompts the model to retry with that fixed tool message.

2. Handle specific exceptions only:

python

ToolStrategy(
    schema=ProductRating,
    handle_errors=ValueError  # Only retry on ValueError, raise others
)

If handle_errors is an exception type, the agent retries (using the default error message) only if the raised exception is of that type. In all other cases, the exception is raised.

3. Handle multiple exceptions:

python

ToolStrategy(
    schema=ProductRating,
    handle_errors=(ValueError, TypeError)  # Retry on ValueError or TypeError, raise others
)

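If handle_errors is a tuple of exception types, the agent retries (using the default error message) only if the raised exception is one of the specified types. In all other cases, the exception is raised.

4. Custom error handler function:

python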

from langchain.agents.structured_output import StructuredOutputValidationError
from langchain.agents.structured_output import MultipleStructuredOutputsError


def custom_error_handler(error: Exception) -> str:
    if isinstance(error, StructuredOutputValidationError):
        return "There was an issue with the format. Try again."
    elif isinstance(error, MultipleStructuredOutputsError):
        return "Multiple structured outputs were returned. Pick the most relevant one."
    else:
        return f"Error: {str(error)}"

5. No error handling:

python

ToolStrategy(
    schema=ProductRating,
    handle_errors=False  # All errors raised
)

python

# Multiple structured outputs error
from typing import Union

from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy
from pydantic import BaseModel, Field

# Note: despite its name, qwen_model is configured to use MiniMax-M2.
qwen_model = model.with_config(
    {"configurable": {"model": "MiniMaxAI/MiniMax-M2"}}
)


class ContactInfo(BaseModel):
    name: str = Field(description="Person's name")
    email: str = Field(description="Email address")

class EventDetails(BaseModel):
    event_name: str = Field(description="Name of the event")
    date: str = Field(description="Event date")

agent = create_agent(
    model,
    tools=[],
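    response_format=ToolStrategy(
        Union[EventDetails, ContactInfo],
        # Custom retry message; the Chinese text means "This is an error, please ignore it".
        # If handle_errors is omitted, it defaults to True.
        handle_errors="这是一个错误，请忽略",
    ),
)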

agent.invoke({
    "messages": [{"role": "user", "content": "Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th"}]
})

# The call kept blocking without producing a result; this test did not succeed.

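python

# Schema validation error

from pydantic import BaseModel, Field
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

# Note: deepseek_model is defined here but not used; the agent below runs on
# qwen_model (MiniMax-M2) from the previous block, as the output shows.
deepseek_model = model.with_config(
    {"configurable": {"model": "deepseek-ai/DeepSeek-V3.2"}}
)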


class ProductRating(BaseModel):
    rating: int | None = Field(description="Rating", ge=1, le=5)
    comment: str = Field(description="Review comment")

agent = create_agent(
    model=qwen_model,
    tools=[],
    # The Chinese retry message means "This is an error, please ignore it".
    # If handle_errors is omitted, it defaults to True.
    response_format=ToolStrategy(ProductRating, handle_errors="这是一个错误，请忽略"),
    system_prompt="You are a helpful assistant that parses product reviews. Do not make any field or value up."
)

agent.invoke({
    "messages": [{"role": "user", "content": "Parse this: Amazing product: im rate is  6 !"}]
})

# Could not reproduce the error: the model kept the rating within 5,
# and a rating of 10 was clamped to 5.

Output:

{'messages': [HumanMessage(content='Parse this: Amazing product: im rate is  6 !', additional_kwargs={}, response_metadata={}, id='d79e5bd5-b95f-4c91-a3ca-be8d2b841f1d'),
  AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 36, 'prompt_tokens': 243, 'total_tokens': 279, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': None, 'reasoning_tokens': 0, 'rejected_prediction_tokens': None}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}, 'prompt_cache_hit_tokens': 0, 'prompt_cache_miss_tokens': 243}, 'model_provider': 'openai', 'model_name': 'MiniMaxAI/MiniMax-M2', 'system_fingerprint': '', 'id': '019be157fab240858ec5c3a5c01a9444', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019be158-0b43-71f3-a5fa-58617e82ff79-0', tool_calls=[{'name': 'ProductRating', 'args': {'rating': 5, 'comment': 'Amazing product'}, 'id': '019be157fd4b8563de9914c2d4fd7249', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 243, 'output_tokens': 36, 'total_tokens': 279, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}}),
  ToolMessage(content="Returning structured response: rating=5 comment='Amazing product'", name='ProductRating', id='4e103521-0b27-4659-9b8a-64c5b528a9fd', tool_call_id='019be157fd4b8563de9914c2d4fd7249')],
 'structured_response': ProductRating(rating=5, comment='Amazing product')}

python

# Custom error handler function

from langchain.agents.structured_output import StructuredOutputValidationError
from langchain.agents.structured_output import MultipleStructuredOutputsError

def custom_error_handler(error: Exception) -> str:
    if isinstance(error, StructuredOutputValidationError):
        return "There was an issue with the format. Try again."
    elif isinstance(error, MultipleStructuredOutputsError):
        return "Multiple structured outputs were returned. Pick the most relevant one."
    else:
        return f"Error: {str(error)}"


agent = create_agent(
    model,
    tools=[],
    response_format=ToolStrategy(
        schema=Union[ContactInfo, EventDetails],
        handle_errors=custom_error_handler,  # if omitted, defaults to True
    ),
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th"}]
})

for msg in result['messages']:
    # If message is actually a ToolMessage object (not a dict), check its class name
    if type(msg).__name__ == "ToolMessage":
        print(msg.content)
    # If message is a dictionary or you want a fallback
    elif isinstance(msg, dict) and msg.get('tool_call_id'):
        print(msg['content'])