Building LLM Applications with LangChain: From Model Calls to Output Parsing

Contents

- 1. Basics: Calling the Model
  - 1.1 The Simplest Model Call
- 2. Prompt Templates: Making Prompts Reusable
  - 2.1 Defining the System and Human Templates Separately
  - 2.2 A More Concise Way: ChatPromptTemplate
  - 2.3 Batch-Processing Multiple Translation Tasks
- 3. Few-shot Prompts: Showing the Model Examples
  - 3.1 Defining the Example Template and Example Data
  - 3.2 Building the Few-shot Template
- 4. Output Parsers: Structuring the Output
  - 4.1 List Parser
  - 4.2 JSON Parser (Pydantic-Based)
- 5. Chains: Linking Components Together
  - 5.1 Manual Calls vs. Chain Calls
- 6. Appendix: Calling the Native OpenAI SDK
  - 6.1 Basic Call
  - 6.2 Call with Parameters
  - 6.3 Few-shot Example
  - 6.4 Classification Task
  - 6.5 Chain-of-Thought Reasoning

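LangChain is one of the most popular frameworks for building LLM applications. It abstracts the common patterns of LLM development (prompt management, output parsing, chained calls, and so on) into standardized components, so that complex AI applications can be assembled quickly, like building blocks.

This article walks through LangChain's core usage step by step, using hands-on examples drawn from six Jupyter notebooks. All code uses the Tongyi Qianwen model (qwen-plus), called through its OpenAI-compatible mode.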

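1. Basics: Calling the Model

1.1 The Simplest Model Call

The first step is to call the model. LangChain provides a unified ChatOpenAI interface, and setting openai_api_base lets it connect to any service that is compatible with the OpenAI protocol.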

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

model = ChatOpenAI(
    model="qwen-plus",
    openai_api_key="sk-xxx",  # replace with your own API key
    openai_api_base="https://dashscope.aliyuncs.com/compatible-mode/v1",
    temperature=1.2,
    max_tokens=300,
)

messages = [
    # System prompt: "Act as my math teaching assistant and explain math principles in plain, direct language."
    SystemMessage(content="请你作为我的数学课助教，用通俗易懂且直接的语言帮我解释数学原理。"),
    # User question: "What is the Pythagorean theorem?"
    HumanMessage(content="什么是勾股定理？"),
]

response = model.invoke(messages)
print(response.content)

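Here, SystemMessage sets the AI's role and behavior, while HumanMessage carries the user's input. Running the code returns a lively, plain-language explanation of the Pythagorean theorem (in Chinese, because the prompt is in Chinese).

Output: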

勾股定理，一句话说就是：

✅ **直角三角形中，两条直角边的平方和，等于斜边的平方。**

用公式写出来就是：  
**a² + b² = c²**  

其中：  
- a 和 b 是直角三角形的两条**直角边**（也就是夹着直角的那两条边），  
- c 是**斜边**（直角对面、最长的那条边）。

🌰 举个最经典的例子：  
一个三角形，两条直角边分别是 3 和 4，那么斜边就是：  
3² + 4² = 9 + 16 = 25 → √25 = **5**  
...
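
To make the two roles concrete, the sketch below shows how a system/human message pair maps onto the role-tagged list that OpenAI-compatible chat endpoints accept, the same shape section 6 passes as messages=. ChatMessage and to_openai_payload are hypothetical names used only for illustration. They are not LangChain classes, and LangChain's real conversion handles more cases than this.

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    """Illustrative stand-in for a chat message (not a LangChain class)."""
    role: str      # "system", "user", or "assistant"
    content: str


def to_openai_payload(messages):
    """Convert messages to the list-of-dicts shape used by OpenAI-style chat APIs."""
    return [{"role": m.role, "content": m.content} for m in messages]


messages = [
    ChatMessage("system", "You are my math teaching assistant."),
    ChatMessage("user", "What is the Pythagorean theorem?"),
]
payload = to_openai_payload(messages)
print(payload)
```

The system entry comes first, so it frames how every later user turn is answered.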

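2. Prompt Templates: Making Prompts Reusable

Assembling a prompt by hand for every call is tedious, so LangChain provides ChatPromptTemplate to manage prompt templates.

2.1 Defining the System and Human Templates Separately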

from langchain_core.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

system_template_text = """你是一位专业的翻译，能够将{input_language}翻译成{output_language}，并且输出文本会根据用户要求的任何语言风格进行调整。请只输出翻译后的文本，不要有任何其它内容。"""
system_prompt_template = SystemMessagePromptTemplate.from_template(system_template_text)

human_template_text = "文本：{text}\n语言风格：{style}"
human_prompt_template = HumanMessagePromptTemplate.from_template(human_template_text)

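# Fill in the template variables. Both languages are Chinese here:
# the example rewrites a classical Chinese sentence in modern vernacular (白话文).
system_prompt = system_prompt_template.format(input_language="汉语", output_language="汉语")
human_prompt = human_prompt_template.format(text="勿以善小而不为，勿以恶小而为之", style="白话文")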

response = model.invoke([system_prompt, human_prompt])
print(response.content)

Output:

不要因为善事很小就不去做，也不要因为恶事很小就去做。

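2.2 A More Concise Way: ChatPromptTemplate

The approach above is somewhat verbose. ChatPromptTemplate.from_messages lets you declare both messages as (role, template) tuples in one place: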

from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "你是一位专业的翻译，能够将{input_language}翻译成{output_language}，并且输出文本会根据用户要求的任何语言风格进行调整。请只输出翻译后的文本，不要有任何其它内容。"),
        ("human", "文本：{text}\n语言风格：{style}"),
    ]
)

prompt_value = prompt_template.invoke({
    "input_language": "汉语", 
    "output_language": "汉语",
    "text": "勿以善小而不为，勿以恶小而为之。", 
    "style": "白话文"
})

response = model.invoke(prompt_value)
print(response.content)
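
Conceptually, invoking a chat prompt template fills each {placeholder} in every (role, template) pair. The sketch below imitates that with plain str.format. render_messages is a hypothetical helper written for this illustration, and the English templates are stand-ins. LangChain's own implementation also validates variables and returns message objects rather than tuples.

```python
def render_messages(message_templates, variables):
    """Fill {placeholders} in each (role, template) pair with str.format."""
    return [(role, template.format(**variables)) for role, template in message_templates]


message_templates = [
    ("system", "You are a professional translator from {input_language} to {output_language}."),
    ("human", "Text: {text}\nStyle: {style}"),
]
rendered = render_messages(message_templates, {
    "input_language": "French",
    "output_language": "English",
    "text": "Bonjour",
    "style": "formal",
})
for role, content in rendered:
    print(f"{role}: {content}")
```

Because the templates are separate from the variables, the same pair of templates can serve any number of inputs.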

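2.3 Batch-Processing Multiple Translation Tasks

Because the template is reusable, a batch of tasks is just a list of variable dictionaries passed through the same template in a loop.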

input_variables = [
    {"input_language": "汉语", "output_language": "汉语", "text": "勿以善小而不为，勿以恶小而为之。", "style": "白话文"},
    {"input_language": "法语", "output_language": "英语", "text": "Je suis désolé pour ce que tu as fait", "style": "古英语"},
    {"input_language": "俄语", "output_language": "意大利语", "text": "Сегодня отличная погода", "style": "网络用语"},
    {"input_language": "韩语", "output_language": "日语", "text": "너 정말 짜증나", "style": "口语"},
]

for variables in input_variables:  # named "variables" to avoid shadowing the built-in input()
    response = model.invoke(prompt_template.invoke(variables))
    print(response.content)

Output:

不要因为善事很小就不去做，也不要因为恶事很小就去做。
I am sorry for that which thou hast done.
Oggi il tempo è fantastico! 🌞
あんた、本当にイライラするわ！

As the output shows, the model adapts each translation to the requested language style.

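3. Few-shot Prompts: Showing the Model Examples

When the model misreads your intent, or you want the output to follow a stricter format, you can give it a few worked examples. This is the few-shot approach.

3.1 Defining the Example Template and Example Data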

from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "格式化以下客户信息：\n姓名 -> {customer_name}\n年龄 -> {customer_age}\n 城市 -> {customer_city}"),
        ("ai", "##客户信息\n- 客户姓名：{formatted_name}\n- 客户年龄：{formatted_age}\n- 客户所在地：{formatted_city}")
    ]
)

examples = [
    {
        "customer_name": "张三", 
        "customer_age": "27",
        "customer_city": "长沙",
        "formatted_name": "张三",
        "formatted_age": "27岁",
        "formatted_city": "湖南省长沙市"
    },
    {
        "customer_name": "李四", 
        "customer_age": "42",
        "customer_city": "广州",
        "formatted_name": "李四",
        "formatted_age": "42岁",
        "formatted_city": "广东省广州市"
    },
]

3.2 Building the Few-shot Template

few_shot_template = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

final_prompt_template = ChatPromptTemplate.from_messages(
    [
        few_shot_template,
        ("human", "{input}"),
    ]
)

final_prompt = final_prompt_template.invoke({
    "input": "格式化以下客户信息：\n姓名 -> 王五\n年龄 -> 31\n 城市 -> 郑州"
})

response = model.invoke(final_prompt)
print(response.content)

Output:

##客户信息  
- 客户姓名：王五  
- 客户年龄：31岁  
- 客户所在地：河南省郑州市

Following the format of the two examples, the model expanded "郑州" (Zhengzhou) into "河南省郑州市" (Zhengzhou, Henan Province) on its own.
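
The few-shot mechanism can be pictured as expanding each example into a human turn followed by an ai turn, then appending the real question at the end. The sketch below is an assumed simplification with a hypothetical expand_few_shot helper and English stand-in data. It is not the internals of FewShotChatMessagePromptTemplate.

```python
def expand_few_shot(examples, human_template, ai_template, new_input):
    """Turn example dicts into alternating human/ai turns, then append the real input."""
    messages = []
    for example in examples:
        # str.format ignores keys a template does not use, so one dict serves both templates
        messages.append(("human", human_template.format(**example)))
        messages.append(("ai", ai_template.format(**example)))
    messages.append(("human", new_input))
    return messages


examples = [
    {"name": "Zhang San", "age": "27", "formatted_age": "27 years old"},
    {"name": "Li Si", "age": "42", "formatted_age": "42 years old"},
]
messages = expand_few_shot(
    examples,
    human_template="Name -> {name}\nAge -> {age}",
    ai_template="- Name: {name}\n- Age: {formatted_age}",
    new_input="Name -> Wang Wu\nAge -> 31",
)
for role, content in messages:
    print(role, repr(content))
```

The model sees two completed exchanges before the new request, which is why it tends to copy their format.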

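4. Output Parsers: Structuring the Output

A model returns a string, but applications often need structured data such as JSON or a list. LangChain provides a family of OutputParser classes that parse the model's output.

4.1 List Parser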

from langchain_core.output_parsers import CommaSeparatedListOutputParser
from langchain_core.prompts import ChatPromptTemplate

output_parser = CommaSeparatedListOutputParser()
parser_instructions = output_parser.get_format_instructions()
print(parser_instructions)

prompt = ChatPromptTemplate.from_messages([
    ("system", "{parser_instructions}"),
    ("human", "列出5个{subject}国家的汽车品牌。")
])

final_prompt = prompt.invoke({
    "subject": "中国", 
    "parser_instructions": parser_instructions
})

response = model.invoke(final_prompt)
print(response.content)

# Parse the reply into a Python list
result = output_parser.invoke(response)
print(result)

Output:

Your response should be a list of comma separated values, eg: `foo, bar, baz`

比亚迪, 吉利, 长城, 奇瑞, 红旗

['比亚迪', '吉利', '长城', '奇瑞', '红旗']
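
The parsing step itself is simple in spirit. The sketch below splits the reply shown above on commas and strips whitespace. parse_comma_separated is a hypothetical helper for illustration, and the real CommaSeparatedListOutputParser may treat quoting and edge cases differently.

```python
def parse_comma_separated(text):
    """Split on ASCII commas and strip whitespace, dropping empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]


reply = "比亚迪, 吉利, 长城, 奇瑞, 红旗"
brands = parse_comma_separated(reply)
print(brands)  # ['比亚迪', '吉利', '长城', '奇瑞', '红旗']
```

Note that this sketch only splits on the ASCII comma. A reply that used the full-width Chinese comma (，) would come back as a single item, which is one reason the format instructions are sent to the model in the first place.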

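4.2 JSON Parser (Pydantic-Based)

For more complex cases, use PydanticOutputParser. It generates a JSON Schema from the Pydantic model you define, exposes that schema through get_format_instructions(), and parses the JSON the model returns into an instance of your model.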

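from typing import List

# Assumes langchain-core 0.3 or later, which works with Pydantic v2 models directly
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class FilmInfo(BaseModel):
    # examples= is the Pydantic v2 keyword; each value is a sample shown in the schema
    film_name: str = Field(description="电影的名字", examples=["拯救大兵瑞恩"])
    author_name: str = Field(description="电影的导演", examples=["斯皮尔伯格"])  # holds the director's name
    genres: List[str] = Field(description="电影的题材", examples=[["历史", "战争"]])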

output_parser = PydanticOutputParser(pydantic_object=FilmInfo)
print(output_parser.get_format_instructions())

prompt = ChatPromptTemplate.from_messages([
    ("system", "{parser_instructions} 你输出的结果请使用中文。"),
    ("human", "请你帮我从电影概述中，提取电影名、导演，以及电影的体裁。电影概述会被三个#符号包围。\n###{film_introduction}###")
])

film_introduction = """
《唐人街探案》是由万达影视传媒有限公司、上海骋亚影视文化传媒有限公司出品，陈思诚执导，陈思诚、程佳客、刘凯、白鹤编剧，王宝强、刘昊然领衔主演...的喜剧电影。
该片讲述了唐仁、秦风必须在躲避警察追捕、匪帮追杀、黑帮围剿的同时，在短短七天内，完成找到"失落的黄金"、查明"真凶"、为自己"洗清罪名"这些"逆天"任务的故事。
"""

final_prompt = prompt.invoke({
    "film_introduction": film_introduction,
    "parser_instructions": output_parser.get_format_instructions()
})

response = model.invoke(final_prompt)
print(response.content)

result = output_parser.invoke(response)
print(result)
print(result.film_name)
print(result.genres)

Output:

{
  "film_name": "唐人街探案",
  "author_name": "陈思诚",
  "genres": ["喜剧"]
}


FilmInfo(film_name='唐人街探案', author_name='陈思诚', genres=['喜剧'])
唐人街探案
['喜剧']
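
To show what "parse and validate" means without any dependencies, the sketch below loads the JSON reply shown above into a dataclass and checks the field types by hand. parse_film_info is a hypothetical helper, and this stdlib version is only an approximation of what PydanticOutputParser and Pydantic do for you.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class FilmInfo:
    film_name: str
    author_name: str
    genres: List[str]


def parse_film_info(text):
    """Parse a JSON object and check that the expected fields have the expected types."""
    data = json.loads(text)
    film = FilmInfo(**data)  # raises TypeError on missing or unexpected keys
    if not isinstance(film.film_name, str) or not isinstance(film.author_name, str):
        raise ValueError("film_name and author_name must be strings")
    if not isinstance(film.genres, list) or not all(isinstance(g, str) for g in film.genres):
        raise ValueError("genres must be a list of strings")
    return film


reply = '{"film_name": "唐人街探案", "author_name": "陈思诚", "genres": ["喜剧"]}'
film = parse_film_info(reply)
print(film.film_name, film.genres)
```

A reply that omits a field or returns the wrong type fails loudly here instead of passing bad data downstream, which is the main benefit of a schema-backed parser.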

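5. Chains: Linking Components Together

So far every step has been invoked by hand. LangChain's | operator links the prompt, the model, and the output parser into a single chain, which makes the code more concise.

5.1 Manual Calls vs. Chain Calls

Manual calls: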

result = output_parser.invoke(model.invoke(prompt.invoke({"subject": "中国", "parser_instructions": parser_instructions})))

Chained with the | operator:

chat_model_chain = prompt | model | output_parser
result = chat_model_chain.invoke({"subject": "中国", "parser_instructions": parser_instructions})
print(result)

Output:

['比亚迪', '吉利', '长城', '蔚来', '小鹏']

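The expression prompt | model | output_parser reads like a pipeline: input data → build the prompt → call the model → parse the output. Each component's output becomes the next component's input. The brands differ from those in section 4.1, which is expected when the model samples at temperature=1.2.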
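
In Python, the | syntax comes from overloading the __or__ method. The sketch below builds a tiny pipeline this way. Step is a hypothetical class and a deliberately simplified stand-in, not LangChain's Runnable, and fake_model returns a canned string so nothing is sent over the network.

```python
class Step:
    """Illustrative pipeline step; a simplified stand-in, not LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt = Step(lambda variables: f"List 3 car brands from {variables['subject']}.")
fake_model = Step(lambda text: "BYD, Geely, Great Wall")  # canned reply, no API call
output_parser = Step(lambda reply: [item.strip() for item in reply.split(",")])

chain = prompt | fake_model | output_parser
result = chain.invoke({"subject": "China"})
print(result)  # ['BYD', 'Geely', 'Great Wall']
```

Since a composed chain is itself a Step, chains can be nested or extended with further | stages without changing the calling code.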

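6. Appendix: Calling the Native OpenAI SDK

If you would rather not use LangChain, you can call the service directly with the OpenAI Python library. LangChain is essentially a wrapper around these native calls.

6.1 Basic Call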

from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",  # replace with your own API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

completion = client.chat.completions.create(
    model="qwen-plus",
    messages=[{'role': 'user', 'content': '你是谁？'}]
)
print(completion.choices[0].message.content)

6.2 Call with Parameters

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "生成一个豆瓣高分电影清单..."}],
    max_tokens=300,
    frequency_penalty=-2,  # OpenAI documents a range of -2.0 to 2.0; negative values make repetition more likely
)
print(response.choices[0].message.content)
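
To see why the sign of frequency_penalty matters, the sketch below applies the adjustment that OpenAI's API documentation describes: each token's logit is lowered by its count so far multiplied by the penalty. Whether the DashScope endpoint applies exactly the same formula is an assumption. apply_frequency_penalty and the toy logits are hypothetical and exist only for illustration.

```python
from collections import Counter


def apply_frequency_penalty(logits, generated_tokens, frequency_penalty):
    """Subtract count * penalty from the logit of every token already generated."""
    counts = Counter(generated_tokens)
    return {
        token: logit - counts[token] * frequency_penalty
        for token, logit in logits.items()
    }


logits = {"movie": 1.0, "film": 1.0}
generated = ["movie", "movie"]  # "movie" has already appeared twice

positive = apply_frequency_penalty(logits, generated, 2.0)
negative = apply_frequency_penalty(logits, generated, -2.0)
print(positive)  # {'movie': -3.0, 'film': 1.0}
print(negative)  # {'movie': 5.0, 'film': 1.0}
```

With a positive penalty the repeated token becomes less likely, while the -2 used above boosts it, so the reply may repeat itself.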

6.3 Few-shot Example

response = client.chat.completions.create(
    model="qwen-plus",
    messages=[
        {"role": "user", "content": "格式化以下信息：\n姓名 -> 张三\n年龄 -> 17\n学号 -> 001"},
        {"role": "assistant", "content": "##学生信息\n- 学生姓名：张三\n- 学生年龄：17岁\n- 学号：001"},
        {"role": "user", "content": "格式化以下信息：\n姓名 -> 王五\n年龄 -> 13\n学号 -> 003"}
    ]
)
print(response.choices[0].message.content)

6.4 Classification Task

category_list = ["产品规格", "使用咨询", "功能比较", "用户反馈", "价格查询", "故障问题", "其它"]

classify_prompt_template = """
你的任务是为用户对产品的疑问进行分类。
请仔细阅读用户的问题内容，给出所属类别。类别应该是这些里面的其中一个：{categories}。
直接输出所属类别，不要有任何额外的描述或补充内容。
用户的问题内容会以三个#符号进行包围。

###
{question}
###
"""

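def get_openai_response(client, prompt, model_name="qwen-plus"):
    """Send one user message and return the reply text.
    The original helper is not shown; this is an assumed minimal version."""
    response = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Hypothetical sample questions; the original q_list is not shown.
q_list = [
    "这款手机的电池容量是多少？",
    "蓝牙耳机连接不上怎么办？",
]

for q in q_list:
    formatted_prompt = classify_prompt_template.format(
        categories="，".join(category_list),
        question=q
    )
    response = get_openai_response(client, formatted_prompt)
    print(response)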
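
The prompt asks for exactly one category, but a model can still add punctuation or invent a label. A small guard like the sketch below, which is an optional addition and not part of the original notebook, normalizes the reply and falls back to "其它" (other) when it is not in the list. normalize_category is a hypothetical helper.

```python
category_list = ["产品规格", "使用咨询", "功能比较", "用户反馈", "价格查询", "故障问题", "其它"]


def normalize_category(reply, categories, fallback="其它"):
    """Return the reply if it is a known category, else the fallback label."""
    # Trim whitespace, then common trailing punctuation and quotes
    cleaned = reply.strip().strip("。.\"'“”")
    return cleaned if cleaned in categories else fallback


print(normalize_category(" 故障问题。\n", category_list))      # 故障问题
print(normalize_category("我认为属于售后服务", category_list))  # 其它
```

Downstream code can then rely on the label always being one of the known categories.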

6.5 Chain-of-Thought Reasoning

response = client.chat.completions.create(
    model="qwen-plus",
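    messages=[
        # "The odd numbers in this group add up to an even number: 15, 12, 5, 3, 72, 17, 1. Is that right? Let's think step by step."
        {"role": "user", "content": "该组中的奇数加起来为偶数：15、12、5、3、72、17、1，对吗？让我们来分步骤思考。"}
    ]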
)
print(response.choices[0].message.content)
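
For reference when reading the model's step-by-step answer, the arithmetic in the prompt can be checked directly. The short calculation below picks out the odd numbers and sums them.

```python
numbers = [15, 12, 5, 3, 72, 17, 1]

odd_numbers = [n for n in numbers if n % 2 == 1]
odd_sum = sum(odd_numbers)
claim_holds = odd_sum % 2 == 0  # the prompt claims the sum is even

print(odd_numbers)  # [15, 5, 3, 17, 1]
print(odd_sum)      # 41
print(claim_holds)  # False
```

The odd numbers sum to 41, which is odd, so a correct chain of reasoning should conclude that the statement is false.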