09 - LLM Agent Development Engineer: Structured Output and JSON Schema


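Lesson 09: Structured Output and JSON Schema

📝 Summary: This article explains why structured output is necessary in Agent development (tool calling, information extraction, multi-step planning, classification and routing, multi-Agent communication), compares three implementation methods (Prompt constraints → JSON Mode → Structured Outputs / JSON Schema), and covers the core JSON Schema syntax (description / required / additionalProperties / enum), defining schemas with Pydantic, structured output support across different models, and the engineering practices of output validation, retries, and fallback strategies.
An Agent that calls tools or interacts with other systems must output structured data. Getting a model to output JSON reliably is a fundamental skill in Agent development.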

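1. Why Structured Output Is Needed

1.1 The Problem

In one sentence: an LLM outputs free text by default, but your code needs structured data. Structured output means making the model respond in a format you define (usually JSON).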

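Ask a model to "list 3 cities" and it may output any of:
  - "Beijing, Shanghai, Guangzhou"                           ← comma-separated
  - "1. Beijing\n2. Shanghai\n3. Guangzhou"                  ← numbered list
  - "The three cities are Beijing, Shanghai and Guangzhou"   ← a sentence
  - {"cities": ["Beijing","Shanghai","Guangzhou"]}           ← JSON

If you want to use these results in code, only the last form is reliable.

Why? Because code needs determinism:
  result = model_output["cities"][0]  # take the first city

  If the model outputs "Beijing, Shanghai, Guangzhou", how does your code get the first city?
  → Write a regular expression? Call split? What if the model changes format next time?
  → Unreliable!

  If the model outputs {"cities": ["Beijing","Shanghai","Guangzhou"]}:
  → json.loads(output)["cities"][0]  → "Beijing"
  → Stable and reliable!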
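
The contrast above can be made concrete with a few lines of standard-library Python. This is only a minimal sketch: the helper name `first_city` and the two sample strings are illustrative assumptions, not part of any API.

```python
import json

def first_city(output: str) -> str:
    """Return the first city from a JSON string shaped like {"cities": [...]}."""
    data = json.loads(output)  # raises json.JSONDecodeError on free text
    return data["cities"][0]

json_output = '{"cities": ["Beijing", "Shanghai", "Guangzhou"]}'
text_output = "1. Beijing\n2. Shanghai\n3. Guangzhou"

print(first_city(json_output))

try:
    first_city(text_output)
except json.JSONDecodeError as exc:
    # The numbered list is not JSON, so there is nothing to index into.
    print(f"free text cannot be parsed as JSON: {exc.msg}")
```

The JSON string yields a value that can be indexed directly, while the numbered list fails at the parsing step, before any field access is possible.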

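1.2 Key Scenarios for Structured Output in Agents

1. Function Calling: the model must output a tool call in a specific format
   Example: {"name": "get_weather", "arguments": {"city": "Beijing"}}
   → Only then can your code know which function to call and which arguments to pass

2. Information extraction: extract structured data from text
   Example: extract {"person": "Ma Huateng", "event": "announced an AI strategy", "date": "2026-05-20"} from a news story
   → Store it in a database or pass it to downstream processing

3. Multi-step planning: the model outputs an execution plan
   Example: {"steps": [{"id": 1, "action": "look up order"}, {"id": 2, "action": "create refund"}]}
   → Your code executes the steps in order

4. Classification and routing: the model outputs an intent classification that is used to route to different handling flows
   Example: {"intent": "return", "confidence": 0.95}
   → The code routes to the return-handling flow based on intent

5. Multi-Agent communication: Agents pass information to each other as structured data
   Example: Agent A outputs {"research_result": "...", "confidence": 0.8}
   → Agent B reads this structured result and continues the work

In essence: structured output = making LLM output something code can reliably parse and use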
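
To illustrate the Function Calling scenario, here is a minimal sketch of how code might dispatch a tool call once the model has produced it as JSON. The `get_weather` implementation, the `TOOLS` registry, and `dispatch_tool_call` are hypothetical names chosen for this example.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"Sunny in {city}"

# Registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(raw: str) -> str:
    """Parse a {"name": ..., "arguments": {...}} payload and run the tool."""
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

model_output = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'
print(dispatch_tool_call(model_output))
```

The same pattern works for intent routing: look up the `intent` value in a dictionary of handlers instead of a dictionary of tools.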

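2. Three Methods for Structured Output

2.1 Prompt Constraints (the most basic)

prompt = """
Extract the person names, locations, and events from the text below.
Output strictly in the following JSON format and do not output anything else:

{
  "persons": ["name 1", "name 2"],
  "locations": ["location 1"],
  "events": ["event 1"]
}

Text: Zhang San attended an AI summit in Beijing, and Li Si launched a new product in Shanghai.
"""

Problems:

- The model may wrap the JSON in a Markdown code block (```json ... ```)
- The model may add explanatory text ("Here is the extraction result:\n{...}")
- The format is unstable and needs complex post-processing
- Reliability is only around 70-80% as a rough estimate (it varies by model and prompt), which is not enough for production

Use it for: quick prototypes and scenarios where reliability is not critical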
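
The post-processing that prompt-only output needs can be sketched as below. This is one possible best-effort approach using only the standard library, not a definitive implementation; the function name `extract_json_from_text` and the sample string are assumptions made for the example.

```python
import json
import re

FENCE = "`" * 3  # the Markdown code-fence marker

def extract_json_from_text(text: str) -> dict:
    """Best-effort extraction of the first JSON object embedded in model output."""
    # Drop Markdown fence markers (with or without a "json" label) if present.
    cleaned = re.sub(re.escape(FENCE) + r"(?:json)?", "", text)
    decoder = json.JSONDecoder()
    # Try every "{" as a possible start; raw_decode tolerates trailing text.
    for match in re.finditer(r"\{", cleaned):
        try:
            obj, _end = decoder.raw_decode(cleaned, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in model output")

messy = (
    "Here is the extraction result:\n"
    + FENCE + "json\n"
    + '{"persons": ["Zhang San", "Li Si"], "locations": ["Beijing", "Shanghai"]}\n'
    + FENCE
)
print(extract_json_from_text(messy))
```

This handles the two problems listed above (fences and explanatory text), but it cannot repair JSON that is itself malformed, which is why the next two methods exist.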

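2.2 JSON Mode (moderately reliable)

from openai import OpenAI

client = OpenAI()

# OpenAI API
response = client.chat.completions.create(
    model="gpt-4o",
    # JSON mode requires the word "JSON" to appear somewhere in the messages
    messages=[{"role": "user", "content": "List 3 Chinese cities. Respond in JSON."}],
    response_format={"type": "json_object"}  # force JSON output
)

# The output will be valid JSON (unless it is cut off by the token limit),
# but the field names and structure may not match what you expect

Note: JSON Mode only guarantees that the output is valid JSON; it does not guarantee that the structure matches your schema.

You expect:  {"cities": ["Beijing", "Shanghai", "Guangzhou"], "country": "China"}
You may get: {"result": "Beijing, Shanghai, Guangzhou"}  ← valid JSON, but the wrong structure!

You still need to describe the expected structure in the prompt. Reliability is around 90% as a rough estimate.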
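
Because JSON Mode leaves the structure unchecked, the caller has to verify it. The sketch below does this by hand with the standard library; `EXPECTED_FIELDS` and `check_structure` are illustrative names, and a schema library such as Pydantic would normally replace this manual check.

```python
import json

# Expected top-level fields and their Python types (illustrative).
EXPECTED_FIELDS = {"cities": list, "country": str}

def check_structure(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the shape is as expected."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    for field in data:
        if field not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {field}")
    return problems

good = '{"cities": ["Beijing", "Shanghai", "Guangzhou"], "country": "China"}'
bad = '{"result": "Beijing, Shanghai, Guangzhou"}'
print(check_structure(good))
print(check_structure(bad))
```

Both strings are valid JSON, yet only the first passes the structural check, which is exactly the gap between JSON Mode and schema-constrained output.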

2.3 Structured Outputs / JSON Schema (the most reliable) ⭐

from openai import OpenAI
from pydantic import BaseModel

class CityList(BaseModel):
    cities: list[str]
    country: str

client = OpenAI()

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": "List 3 Chinese cities"}],
    response_format=CityList  # pass the Pydantic model directly
)

# parsed is None when the model refuses; the reason is then in message.refusal
result = response.choices[0].message.parsed
print(result.cities)   # e.g. ["Beijing", "Shanghai", "Guangzhou"]
print(result.country)  # e.g. "China"

Or define it with a JSON Schema:

import json

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "List 3 Chinese cities"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "city_list",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "cities": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "List of city names"
                    },
                    "country": {
                        "type": "string",
                        "description": "Country name"
                    }
                },
                "required": ["cities", "country"],
                "additionalProperties": False
            }
        }
    }
)

result = json.loads(response.choices[0].message.content)

Key point: strict: True makes the model follow the schema exactly, so a completed response always matches the structure. The exceptions are a refusal or an output truncated by the token limit, so your code should still check for those.

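Comparison of the three methods:

┌────────────────────┬──────────────────┬───────────────────┬──────────────────────────┐
│ Method             │ Reliability      │ Effort            │ Best suited for          │
├────────────────────┼──────────────────┼───────────────────┼──────────────────────────┤
│ Prompt constraints │ ~70-80% (rough)  │ Lowest            │ Quick prototypes         │
│ JSON Mode          │ ~90% (rough)     │ Low               │ Simple structures        │
│ Structured Outputs │ ~100% (strict)   │ Requires a schema │ Production (recommended) │
└────────────────────┴──────────────────┴───────────────────┴──────────────────────────┘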

3. Core JSON Schema Syntax

As an Agent developer, you must be able to write JSON Schema:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "title": "OrderExtraction",
  "description": "Extract order information from text",
  "properties": {
    "order_id": {
      "type": "string",
      "description": "Order ID"
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string", "description": "Product name"},
          "quantity": {"type": "integer", "description": "Quantity"},
          "price": {"type": "number", "description": "Unit price"}
        },
        "required": ["name", "quantity", "price"],
        "additionalProperties": false
      }
    },
    "total_amount": {
      "type": "number",
      "description": "Total amount"
    },
    "status": {
      "type": "string",
      "enum": ["pending", "shipped", "delivered", "cancelled"],
      "description": "Order status"
    }
  },
  "required": ["order_id", "items", "total_amount", "status"],
  "additionalProperties": false
}

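Key rules

1. description matters! The model relies on it to understand what each field means
2. required must be listed explicitly, otherwise the model may omit the field (OpenAI strict mode goes further: every property must appear in required, and an optional field is expressed as a type that also allows null)
3. `additionalProperties: false` stops the model from inventing fields (that is, no extra fields beyond those defined in the schema may be output)
4. `enum` restricts the allowed values (an enumerated type: the value must come from the given list) and is more reliable than free text (an arbitrary string)
5. Nested structures need required and additionalProperties set at every level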
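
Rules 2, 3, and 5 are easy to miss in a nested schema, so it can help to check them mechanically. The following is a minimal sketch with a hypothetical helper, `find_strict_violations`; it only walks `properties` and array `items` and deliberately ignores `$ref`, `anyOf`, and other keywords.

```python
def find_strict_violations(schema: dict, path: str = "$") -> list[str]:
    """Recursively collect places where an object schema is looser than the rules above."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not in required: {missing}")
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties is not false")
        for name, sub in props.items():
            problems.extend(find_strict_violations(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(find_strict_violations(schema["items"], f"{path}[]"))
    return problems

schema = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"name": {"type": "string"}, "quantity": {"type": "integer"}},
                "required": ["name"],  # "quantity" is deliberately left out
            },
        },
    },
    "required": ["order_id", "items"],
    "additionalProperties": False,
}
for problem in find_strict_violations(schema):
    print(problem)
```

The top level passes, but the nested item object is flagged twice: once for the property missing from `required` and once for the missing `additionalProperties: false`.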

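4. Structured Output Support Across Models

┌────────────────────────────┬──────────────────────────────────┬──────────────────────────┐
│ Model                      │ Method                           │ Reliability              │
├────────────────────────────┼──────────────────────────────────┼──────────────────────────┤
│ GPT-4o / GPT-4.1           │ Structured Outputs (JSON Schema) │ ★★★★★ (schema-conformant) │
│ Claude 3.5+                │ Tool Use (JSON Schema)           │ ★★★★★                    │
│ DeepSeek-V3                │ JSON Mode + Prompt               │ ★★★★☆                    │
│ Qwen2.5                    │ JSON Mode + Prompt               │ ★★★★☆                    │
│ Open-source models (local) │ Prompt constraints only          │ ★★★☆☆                    │
└────────────────────────────┴──────────────────────────────────┴──────────────────────────┘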
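
For the Tool Use route listed for Claude, the JSON Schema is supplied as a tool's `input_schema` and the model is asked to call that tool. The sketch below only builds the request arguments so that it runs without network access; the helper `build_tool_use_request`, the tool name `record_cities`, and the model placeholder are assumptions for illustration.

```python
def build_tool_use_request(prompt: str, tool_name: str, schema: dict, model: str) -> dict:
    """Build keyword arguments for an Anthropic Messages API call that forces one tool."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
        # The JSON Schema goes into the tool's input_schema.
        "tools": [{
            "name": tool_name,
            "description": "Record the extracted result in structured form.",
            "input_schema": schema,
        }],
        # Forcing this tool makes the reply a tool_use block shaped by the schema.
        "tool_choice": {"type": "tool", "name": tool_name},
    }

city_schema = {
    "type": "object",
    "properties": {
        "cities": {"type": "array", "items": {"type": "string"}},
        "country": {"type": "string"},
    },
    "required": ["cities", "country"],
}
request = build_tool_use_request(
    "List 3 Chinese cities", "record_cities", city_schema, model="your-claude-model"
)
print(request["tool_choice"])

# With the anthropic SDK installed and an API key configured, the call would look like:
#   import anthropic
#   response = anthropic.Anthropic().messages.create(**request)
#   data = next(block.input for block in response.content if block.type == "tool_use")
```

Validating `data` afterwards is still worthwhile, since the tool input is produced by the model like any other output.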

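Structured output with local models

from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint; the SDK requires an api_key, but Ollama ignores it
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Ollama also supports structured output (for some models)
response = client.chat.completions.create(
    model="qwen2.5:7b",
    messages=[{"role": "user", "content": "List 3 Chinese cities. Respond in JSON."}],
    # JSON mode; Ollama's own "format": "json" option belongs to its native /api/chat endpoint
    response_format={"type": "json_object"},
)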

5. Engineering Practices for Structured Output

5.1 Defining the Schema with Pydantic

from pydantic import BaseModel, Field
from typing import Optional
from enum import Enum

class OrderStatus(str, Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"

class OrderItem(BaseModel):
    name: str = Field(description="Product name")
    quantity: int = Field(description="Quantity", ge=1)
    price: float = Field(description="Unit price", ge=0)

class Order(BaseModel):
    order_id: str = Field(description="Order ID")
    items: list[OrderItem] = Field(description="List of items")
    total_amount: float = Field(description="Total amount")
    status: OrderStatus = Field(description="Order status")
    notes: Optional[str] = Field(default=None, description="Notes")

# Generate the JSON Schema automatically
schema = Order.model_json_schema()
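
To see what the generated schema contains and how the same model doubles as a validator, here is a small self-contained sketch. It assumes Pydantic v2 and uses a trimmed-down version of the order models; the sample payload is made up for the example.

```python
from enum import Enum
from pydantic import BaseModel, Field, ValidationError

class OrderStatus(str, Enum):
    PENDING = "pending"
    SHIPPED = "shipped"

class OrderItem(BaseModel):
    name: str = Field(description="Product name")
    quantity: int = Field(description="Quantity", ge=1)

class Order(BaseModel):
    order_id: str = Field(description="Order ID")
    items: list[OrderItem] = Field(description="List of items")
    status: OrderStatus = Field(description="Order status")

schema = Order.model_json_schema()
print(schema["required"])                      # names of fields without defaults
print(schema["$defs"]["OrderStatus"]["enum"])  # allowed status values

# The same model validates model output; quantity=0 violates ge=1.
bad_output = '{"order_id": "A1", "items": [{"name": "pen", "quantity": 0}], "status": "pending"}'
try:
    Order.model_validate_json(bad_output)
except ValidationError as exc:
    print(exc.error_count(), "validation error(s)")
```

Keeping one Pydantic model as the single source of truth means the schema sent to the model and the validation applied to its reply cannot drift apart.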

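5.2 Output Validation and Retry

from openai import OpenAI
from pydantic import BaseModel, ValidationError

client = OpenAI()

def call_llm_with_schema(prompt: str, schema: type[BaseModel], max_retries=3):
    """Call the LLM and validate its structured output"""
    original_prompt = prompt  # keep the original so retry prompts do not nest

    for attempt in range(max_retries):
        try:
            # Method 1: use Structured Outputs (if the model supports it)
            response = client.beta.chat.completions.parse(
                model="gpt-4o",
                messages=[{"role": "user", "content": prompt}],
                response_format=schema
            )
            return response.choices[0].message.parsed

        except ValidationError as e:
            if attempt < max_retries - 1:
                # Feed the error back to the model so it can correct itself
                prompt = f"""The previous output failed validation:
{str(e)}

Please correct it and output again. Original request: {original_prompt}"""
            else:
                raise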
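
The validate-and-retry loop can be exercised without any API by swapping the model for a stub. The sketch below is a simplified, standard-library variant of the idea; `retry_with_feedback`, `fake_llm`, and `require_country` are hypothetical names, and the stub simply replays two canned replies.

```python
import json
from typing import Callable

def retry_with_feedback(
    generate: Callable[[str], str],
    validate: Callable[[dict], None],
    prompt: str,
    max_retries: int = 3,
) -> dict:
    """Call generate(), validate the parsed JSON, and retry with the error appended."""
    current_prompt = prompt
    last_error = None
    for _ in range(max_retries):
        raw = generate(current_prompt)
        try:
            data = json.loads(raw)
            validate(data)         # expected to raise ValueError on bad data
            return data
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            last_error = exc
            # Rebuild from the original prompt so the text does not grow on each retry.
            current_prompt = f"{prompt}\n\nThe previous output failed validation: {exc}"
    raise ValueError(f"no valid output after {max_retries} attempts: {last_error}")

# Stand-in for a model: the first reply omits "country", the second is correct.
replies = iter(['{"cities": ["Beijing"]}', '{"cities": ["Beijing"], "country": "China"}'])
seen_prompts = []

def fake_llm(p: str) -> str:
    seen_prompts.append(p)
    return next(replies)

def require_country(data: dict) -> None:
    if "country" not in data:
        raise ValueError("missing field: country")

result = retry_with_feedback(fake_llm, require_country, "List Chinese cities as JSON")
print(result)
```

The second prompt the stub receives contains the validation error from the first attempt, which is the feedback signal a real model would use to correct itself.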

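5.3 Fallback Strategy

from pydantic import BaseModel

def robust_extract(text: str, schema: type[BaseModel]):
    """Structured extraction with multi-level fallback"""
    # call_with_structured_output, call_with_json_mode, call_with_prompt and
    # extract_json_from_text are placeholders to implement for your own stack

    # Level 1: Structured Outputs (most reliable)
    try:
        return call_with_structured_output(text, schema)
    except Exception:
        pass

    # Level 2: JSON Mode + Prompt
    try:
        result = call_with_json_mode(text, schema)
        return schema.model_validate_json(result)
    except Exception:
        pass

    # Level 3: plain Prompt + regex extraction
    try:
        result = call_with_prompt(text, schema)
        # Try to extract the JSON from the output
        json_str = extract_json_from_text(result)
        return schema.model_validate_json(json_str)
    except Exception:
        pass

    # Level 4: every level failed; return None so the caller can handle it explicitly
    return None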
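
The fallback chain itself is independent of any particular model API, so it can be written once as a generic helper. This is a minimal sketch with stubbed levels; `first_success` and the three stand-in functions are hypothetical and only simulate failures.

```python
from typing import Callable, Optional

def first_success(
    strategies: list[tuple[str, Callable[[], dict]]],
) -> tuple[Optional[str], Optional[dict]]:
    """Run strategies in order and return (name, result) of the first that does not raise."""
    for name, strategy in strategies:
        try:
            return name, strategy()
        except Exception as exc:
            # Report why this level failed before falling through to the next one.
            print(f"{name} failed: {exc}")
    return None, None

# Stand-ins for the real levels; the first two fail on purpose.
def structured_outputs() -> dict:
    raise RuntimeError("model does not support Structured Outputs")

def json_mode() -> dict:
    raise ValueError("valid JSON but wrong structure")

def prompt_plus_regex() -> dict:
    return {"cities": ["Beijing", "Shanghai", "Guangzhou"]}

level, result = first_success([
    ("structured_outputs", structured_outputs),
    ("json_mode", json_mode),
    ("prompt_plus_regex", prompt_plus_regex),
])
print(level, result)
```

Reporting each failure, rather than silently passing, makes it visible in production when the system is running on a lower, less reliable level.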

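📝 Homework

Homework 1: Design a structured output for information extraction

Requirement: extract key information from news text, including the title, publication date, category (technology / finance / sports / entertainment), key persons, and a summary of the key event.

Reference answer:

from enum import Enum
from typing import Optional

from openai import OpenAI
from pydantic import BaseModel, Field

class NewsCategory(str, Enum):
    TECH = "technology"
    FINANCE = "finance"
    SPORTS = "sports"
    ENTERTAINMENT = "entertainment"
    OTHER = "other"

class KeyPerson(BaseModel):
    name: str = Field(description="Person's name")
    role: Optional[str] = Field(default=None, description="Person's role or title")

class NewsExtraction(BaseModel):
    title: str = Field(description="News title")
    publish_date: Optional[str] = Field(default=None, description="Publication date in YYYY-MM-DD format")
    category: NewsCategory = Field(description="News category")
    key_persons: list[KeyPerson] = Field(default_factory=list, description="List of key persons")
    summary: str = Field(description="Summary of the key event, within 50 words")

# Test
news_text = """
On May 20, 2026, Tencent CEO Ma Huateng announced at the Digital China Summit
that the company will invest 10 billion yuan in AI large-model R&D. The news triggered a broad rally in tech stocks,
with Tencent's share price rising more than 5% that day. Analyst Li Ming believes the move will accelerate the development of China's AI industry.
"""

# Extract with Structured Outputs
client = OpenAI()

response = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": f"Extract the key information from the following news story:\n{news_text}"
    }],
    response_format=NewsExtraction
)

result = response.choices[0].message.parsed
print(result.model_dump_json(indent=2))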

Homework 2: Implement a general-purpose structured output utility function

Write a function that accepts a Pydantic model and a prompt and returns the parsed model instance, including fallback and retry logic.

Reference answer:

import json
import re
from typing import Literal

from pydantic import BaseModel, Field, ValidationError
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def structured_extract(
    prompt: str,
    schema: type[BaseModel],
    model: str = "qwen2.5:7b",
    max_retries: int = 2
) -> BaseModel:
    """
    General-purpose structured extraction function

    Args:
        prompt: the prompt
        schema: the Pydantic model class
        model: the model name
        max_retries: maximum number of retries
    """
    schema_json = json.dumps(schema.model_json_schema(), indent=2, ensure_ascii=False)

    system_prompt = f"""You are an information extraction expert. Output the result strictly according to the following JSON Schema:
{schema_json}

Important rules:
1. Output JSON only; do not output anything else
2. Do not wrap the output in a Markdown code block
3. Make sure the output is valid JSON
4. Every required field must be present"""

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt}
    ]

    for attempt in range(max_retries + 1):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                temperature=0.0,
                max_tokens=2000
            )

            # content can be None, so fall back to an empty string
            content = (response.choices[0].message.content or "").strip()

            # Strip a Markdown code block if the model added one anyway
            content = re.sub(r'^```(?:json)?\s*\n?', '', content)
            content = re.sub(r'\n?```\s*$', '', content)
            content = content.strip()

            # Parse and validate
            return schema.model_validate_json(content)

        except (json.JSONDecodeError, ValidationError) as e:
            if attempt < max_retries:
                # Feed the error back so the model can correct itself
                messages.append({
                    "role": "assistant",
                    "content": content
                })
                messages.append({
                    "role": "user",
                    "content": f"Output validation failed: {str(e)}\nPlease correct it and output again."
                })
            else:
                raise ValueError(f"Structured output failed after {max_retries} retries: {e}") from e

# Test
class SentimentResult(BaseModel):
    text: str = Field(description="Summary of the original text")
    # An enumerated type is more reliable than a free-text label
    sentiment: Literal["positive", "negative", "neutral"] = Field(description="Sentiment")
    confidence: float = Field(description="Confidence between 0 and 1", ge=0, le=1)

result = structured_extract(
    "Analyze the sentiment: This product is fantastic, it exceeded my expectations!",
    SentimentResult
)
print(result)
