[LangChain Study Notes] Output Parsers

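Core Definition

The output parser is one of LangChain's core components. It has three main jobs:

Supply standardized format instructions for the prompt, steering the model toward a specified output format;
Check that the model's output matches the expected format;
Parse the model's output and convert it into a usable form (a string, a structured object, and so on).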
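
The three jobs above can be sketched in plain Python. This is a toy stand-in, not LangChain's implementation: the class name ToyJsonParser and its required_keys parameter are hypothetical and exist only to show where each job lives.

```python
import json


class ToyJsonParser:
    """Hypothetical, simplified output parser (not a LangChain class)."""

    def __init__(self, required_keys):
        self.required_keys = list(required_keys)

    def get_format_instructions(self) -> str:
        # Job 1: text to embed in the prompt so the model knows the target format.
        keys = ", ".join(self.required_keys)
        return f"Reply with a single JSON object containing the keys: {keys}."

    def parse(self, text: str) -> dict:
        # Job 3: convert the raw model text into a Python object.
        try:
            data = json.loads(text)
        except json.JSONDecodeError as exc:
            raise ValueError(f"not valid JSON: {text!r}") from exc
        # Job 2: validate that the output has the expected shape.
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        missing = [key for key in self.required_keys if key not in data]
        if missing:
            raise ValueError(f"missing keys: {missing}")
        return data


parser = ToyJsonParser(["num1", "num2", "sum"])
print(parser.get_format_instructions())
result = parser.parse('{"num1": 10, "num2": 20, "sum": 30}')
print(result)
```

The real parsers covered below follow the same split: one method that produces prompt text, and one that turns model text into an object or raises an error.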

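Common Output Parsers and How to Use Them

StrOutputParser (string parser)

Purpose

Converts the model's output into a plain Python string. A chat model returns a message object (for example an AIMessage); StrOutputParser extracts its text content and returns it as a str. It does not trim leading or trailing whitespace or newlines (\n), so call .strip() on the result if you need that.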

Complete Example

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

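# Initialize the model (parameter details are covered in a separate note)
llm = ChatOpenAI(
    # Model name
    model_name="qwen-plus",
    # Base URL of the OpenAI-compatible endpoint
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    # Replace with your own valid API key
    api_key="[your API key]"
)

# Define the prompt template
prompt_template = PromptTemplate.from_template("You are a professional math assistant. {question}")

# Initialize the string parser
str_output_parser = StrOutputParser()

# Build the chain: prompt -> model -> parser
chain = prompt_template | llm | str_output_parser

# Invoke the chain
response = chain.invoke(input={"question": "Compute the sum of 10 and 20"})

# Print the parsed result (a plain str)
print(response)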

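Note: StrOutputParser does not provide format instructions, so there is nothing to add to the prompt. In current langchain_core releases the get_format_instructions method is inherited from the base parser class and raises NotImplementedError when called.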

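PydanticOutputParser (structured object parser)

Purpose

Parses the model's output into a structured object that conforms to a Pydantic model definition, which makes later field access and validation straightforward. The return value is an instance of that Pydantic model.

Key Method

get_format_instructions(): returns a string of format instructions derived from the Pydantic model's JSON schema. Pass it into the prompt template to steer the model toward the expected format.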

Complete Example

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

# Initialize the model (parameter details are covered in a separate note)
llm = ChatOpenAI(
    # Model name
    model_name="qwen-plus",
    # Base URL of the OpenAI-compatible endpoint
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    # Replace with your own valid API key
    api_key="[your API key]"
)

# Define the Pydantic data model
class SumArgs(BaseModel):
    # First number (integer)
    num1: int = Field(description="first number")
    # Second number (integer)
    num2: int = Field(description="second number")
    # Sum of the two numbers (integer)
    sum: int = Field(description="sum of the two numbers")

# Initialize the Pydantic output parser
pydantic_output_parser = PydanticOutputParser(pydantic_object=SumArgs)

# Define the prompt template
prompt_template = PromptTemplate.from_template(
    """
    You are a professional math assistant.
    Question: {question}
    Output format: {format_instructions}
    """
)

# Pre-fill the format instructions
prompt_template = prompt_template.partial(
    format_instructions=pydantic_output_parser.get_format_instructions()
)

# Build the chain: prompt -> model -> parser
chain = prompt_template | llm | pydantic_output_parser

# Invoke the chain
response = chain.invoke(input={"question": "Compute the sum of 10 and 20"})

# Print the parsed result (a SumArgs instance)
print(response)

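JsonOutputParser (JSON parser)

Purpose

Parses the model's output as JSON and returns the corresponding Python object, typically a dict. Usage is similar to PydanticOutputParser. The key difference is the return value: a plain dict rather than a Pydantic model instance. The optional pydantic_object argument supplies the schema used in the format instructions.

Key Method

get_format_instructions(): returns a string that instructs the model to reply with valid JSON, including the schema when pydantic_object is supplied.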

Complete Example

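from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

# Initialize the model (same configuration as in the earlier examples)
llm = ChatOpenAI(
    model_name="qwen-plus",
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    # Replace with your own valid API key
    api_key="[your API key]"
)

# Define the Pydantic data model (describes the expected JSON fields)
class SumArgs(BaseModel):
    # First number (integer)
    num1: int = Field(description="first number")
    # Second number (integer)
    num2: int = Field(description="second number")
    # Sum of the two numbers (integer)
    sum: int = Field(description="sum of the two numbers")

# Initialize the JSON output parser
json_output_parser = JsonOutputParser(pydantic_object=SumArgs)

# Define the prompt template
prompt_template = PromptTemplate.from_template(
    """
    You are a professional math assistant.
    Question: {question}
    Output format: {format_instructions}
    """
)

# Pre-fill the format instructions
prompt_template = prompt_template.partial(
    format_instructions=json_output_parser.get_format_instructions()
)

# Build the chain: prompt -> model -> parser
chain = prompt_template | llm | json_output_parser

# Invoke the chain
response = chain.invoke(input={"question": "Compute the sum of 10 and 20"})

# Print the parsed result (a Python dict)
print(response)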

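Key Takeaways

General flow: the parsers share one basic workflow: define the prompt template → initialize the parser → pre-fill the format instructions (only for parsers that provide them, so not StrOutputParser) → build the chain → invoke it and use the parsed result;
Format instructions: for PydanticOutputParser and JsonOutputParser, get_format_instructions() is what steers the model toward the expected format. If the instructions are left out of the prompt, the model will usually answer in free text and parsing is likely to fail;
Return types:
- StrOutputParser: returns a plain string;
- PydanticOutputParser: returns a Pydantic model instance;
- JsonOutputParser: returns a Python dict parsed from the JSON output;
Error handling: if the model's output does not match what the parser expects, the parser raises a parse error (OutputParserException in langchain_core) instead of returning a partial result. Improve the prompt template or adjust the model parameters, or catch the exception and retry.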
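
One common way to handle a parse failure is to ask again with the error message appended to the prompt. The sketch below shows the idea in plain Python and is not a LangChain API: parse_sum, ask_with_retry, and fake_model are hypothetical names, and the fake model stands in for a real chain. With LangChain parsers, the exception to catch would be OutputParserException rather than ValueError.

```python
import json


def parse_sum(text: str) -> dict:
    """Hypothetical parser: expects a JSON object with an integer 'sum' key."""
    data = json.loads(text)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict) or not isinstance(data.get("sum"), int):
        raise ValueError("expected a JSON object with an integer 'sum'")
    return data


def ask_with_retry(call_model, question: str, max_attempts: int = 3) -> dict:
    """Call the model; on a parse failure, re-ask with the error appended."""
    prompt = question
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            return parse_sum(raw)
        except ValueError as exc:
            last_error = exc
            prompt = (
                f"{question}\n"
                f"Your previous reply could not be parsed ({exc}). "
                "Reply with JSON only."
            )
    raise RuntimeError(f"no parseable reply after {max_attempts} attempts") from last_error


# Stand-in for a real model: the first reply is free text, the second is valid JSON.
replies = iter(["The sum is thirty.", '{"sum": 30}'])


def fake_model(prompt: str) -> str:
    return next(replies)


result = ask_with_retry(fake_model, "Compute the sum of 10 and 20")
print(result)
```

Capping the number of attempts keeps a persistently malformed reply from looping forever, and the final RuntimeError keeps the last parse error attached for debugging.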