LangChain链的调试与故障排查深度解析
一、LangChain链的基本架构与运行原理
1.1 LangChain的核心组件概述
LangChain作为构建语言模型驱动应用的框架,其核心由一系列可组合的组件构成。这些组件包括LLM
(大语言模型)、PromptTemplate
(提示模板)、Chain
(链)、Agent
(智能体)等。LLM
负责生成文本,PromptTemplate
用于构建结构化提示,Chain
将多个组件串联实现复杂任务,Agent
则通过动态选择工具执行任务 。
从源码层面看,LangChain
的核心类位于langchain
包下。例如,LLM
接口定义了模型调用的基本方法:
python
from abc import ABC, abstractmethod
class LLM(ABC):
"""大语言模型接口"""
@abstractmethod
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
"""执行模型调用,返回生成的文本"""
pass
def __call__(self, prompt: str, stop: Optional[List[str]] = None) -> str:
return self._call(prompt, stop)
Chain
类则作为链条的基类,定义了运行和调用的基本逻辑:
python
class Chain(ABC):
"""链基类"""
@property
@abstractmethod
def input_keys(self) -> List[str]:
"""定义链的输入键"""
pass
@property
@abstractmethod
def output_keys(self) -> List[str]:
"""定义链的输出键"""
pass
@abstractmethod
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
"""执行链的核心逻辑"""
pass
def __call__(self, inputs: Union[Dict[str, Any], List[Dict[str, Any]]], return_only_outputs: bool = False) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
"""调用链,处理单个或多个输入"""
if isinstance(inputs, list):
return [self._call(i) for i in inputs]
return self._call(inputs)
1.2 链的运行流程剖析
当一个Chain
被调用时,其运行流程可分为以下几个关键步骤:
- 输入验证 :检查输入数据是否包含
input_keys
定义的所有键值对。 - 提示构建 :如果链包含
PromptTemplate
,则根据输入数据填充模板生成完整提示。 - 模型调用 :将构建好的提示传递给
LLM
,获取模型生成的输出。 - 输出处理 :对模型输出进行解析和处理,转换为符合
output_keys
定义的格式。 - 返回结果:将处理后的结果返回给调用方。
例如,LLMChain
的运行逻辑如下:
python
class LLMChain(Chain):
llm: LLM
prompt: PromptTemplate
@property
def input_keys(self) -> List[str]:
return self.prompt.input_variables
@property
def output_keys(self) -> List[str]:
return ["text"]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
# 1. 验证输入
if not all(key in inputs for key in self.input_keys):
raise ValueError(f"Missing some input keys: {set(self.input_keys) - set(inputs.keys())}")
# 2. 构建提示
prompt = self.prompt.format(**inputs)
# 3. 调用模型
text = self.llm(prompt)
# 4. 处理输出
return {"text": text}
1.3 链的组合与嵌套机制
LangChain支持链的组合与嵌套,通过SimpleSequentialChain
、SequentialChain
等类实现。这些组合链会按顺序执行子链,并将前一个链的输出作为后一个链的输入。
以SimpleSequentialChain
为例,其核心逻辑如下:
python
class SimpleSequentialChain(Chain):
chains: List[Chain]
@property
def input_keys(self) -> List[str]:
return self.chains[0].input_keys
@property
def output_keys(self) -> List[str]:
return self.chains[-1].output_keys
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
x = inputs
for chain in self.chains:
x = chain(x)
return x
这种机制使得开发者可以将复杂任务拆解为多个子任务,每个子任务由独立的链处理,最终组合成完整的解决方案 。
二、常见故障类型与表现形式
2.1 输入输出不匹配问题
输入输出不匹配是最常见的故障类型之一,通常表现为:
- 输入缺失:调用链时缺少必要的输入键,导致提示构建失败。
- 输出格式错误:模型生成的输出无法被正确解析或不符合预期格式。
从源码角度看,Chain
类在__call__
方法中对输入进行验证:
python
def __call__(self, inputs: Union[Dict[str, Any], List[Dict[str, Any]]], return_only_outputs: bool = False) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
if isinstance(inputs, list):
for inp in inputs:
if not all(key in inp for key in self.input_keys):
raise ValueError(f"Missing some input keys: {set(self.input_keys) - set(inp.keys())}")
else:
if not all(key in inputs for key in self.input_keys):
raise ValueError(f"Missing some input keys: {set(self.input_keys) - set(inputs.keys())}")
# 后续处理逻辑...
而输出处理部分,如果未按预期解析模型输出,也会导致错误。例如在LLMChain
中:
python
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
text = self.llm(self.prompt.format(**inputs))
try:
# 假设期望输出为JSON格式
parsed_output = json.loads(text)
except json.JSONDecodeError:
raise ValueError(f"Model output is not valid JSON: {text}")
return {"parsed_output": parsed_output}
2.2 模型调用失败
模型调用失败可能由多种原因导致:
- API密钥问题:访问外部模型(如OpenAI)时,API密钥无效或权限不足。
- 网络连接问题:无法连接到模型服务。
- 提示超限:提示长度超过模型支持的最大上下文长度。
以OpenAI
模型调用为例,OpenAI
类继承自LLM
,其_call
方法实现如下:
python
class OpenAI(LLM):
openai_api_key: str
max_tokens: int = 1024
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
try:
response = openai.Completion.create(
engine="text-davinci-003",
prompt=prompt,
max_tokens=self.max_tokens,
stop=stop,
api_key=self.openai_api_key
)
return response.choices[0].text.strip()
except openai.error.AuthenticationError:
raise ValueError("Invalid OpenAI API key")
except openai.error.APIConnectionError:
raise ConnectionError("Failed to connect to OpenAI API")
except openai.error.InvalidRequestError as e:
if "maximum context length" in str(e):
raise ValueError("Prompt exceeds maximum context length")
raise
2.3 链逻辑错误
链逻辑错误通常发生在链的组合与嵌套场景中,表现为:
- 子链顺序错误:导致数据传递不符合预期。
- 输出键不匹配:前一个链的输出无法作为后一个链的有效输入。
- 循环依赖:链之间形成循环调用,导致死循环。
例如,在SequentialChain
中,若子链的output_keys
与下一个子链的input_keys
不匹配,会引发错误:
python
class SequentialChain(Chain):
chains: List[Chain]
input_variables: List[str]
output_variables: List[str]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
x = {k: v for k, v in inputs.items() if k in self.input_variables}
for i, chain in enumerate(self.chains):
if i > 0:
prev_output = self.chains[i - 1](x)
x.update({k: v for k, v in prev_output.items() if k in chain.input_keys})
x = chain(x)
return {k: v for k, v in x.items() if k in self.output_variables}
若chain.input_keys
与前一个链的输出不匹配,x.update
操作将无法正确执行。
2.4 外部工具调用异常
当链中集成外部工具(如搜索引擎、文件读取器)时,可能出现以下问题:
- 工具配置错误:如文件路径错误、API端点错误。
- 权限不足:无法访问所需资源。
- 工具返回格式异常:工具返回的数据无法被链正确处理。
以文件读取工具为例,若路径错误会引发异常:
python
class FileReaderTool:
def run(self, file_path: str) -> str:
try:
with open(file_path, 'r') as f:
return f.read()
except FileNotFoundError:
raise ValueError(f"File not found: {file_path}")
当该工具被集成到链中时,任何文件读取异常都会中断链的执行。
三、调试工具与环境配置
3.1 日志记录与监控
LangChain内置了日志记录功能,通过Python的logging
模块实现。开发者可以配置日志级别,记录链的运行细节。
在LangChain
的基类中,通常会初始化日志记录器:
python
import logging
class Chain(ABC):
def __init__(self):
self.logger = logging.getLogger(self.__class__.__name__)
self.logger.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
ch = logging.StreamHandler()
ch.setFormatter(formatter)
self.logger.addHandler(ch)
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
self.logger.info(f"Received inputs: {inputs}")
# 后续逻辑...
通过调整日志级别,开发者可以获取更详细的信息,例如:
python
import langchain
langchain.logger.setLevel(logging.DEBUG)
3.2 断点调试与交互式环境
使用Python调试工具(如pdb
、ipdb
)可以在链的关键节点设置断点,逐步追踪代码执行过程。
例如,在LLMChain
中设置断点:
python
import ipdb
class LLMChain(Chain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
prompt = self.prompt.format(**inputs)
ipdb.set_trace() # 设置断点
text = self.llm(prompt)
return {"text": text}
当代码执行到断点处时,会进入交互式调试环境,开发者可以检查变量值、执行语句,定位问题。
3.3 环境变量与配置文件
LangChain支持通过环境变量和配置文件管理敏感信息(如API密钥)和运行参数。
例如,OpenAI
模型的API密钥可以通过环境变量设置:
python
import os
import openai
openai.api_key = os.environ.get("OPENAI_API_KEY")
也可以使用配置文件(如.env
)管理环境变量,并通过python-dotenv
库加载:
python
from dotenv import load_dotenv
load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
配置文件还可用于设置链的参数,如最大令牌数、超时时间等。
3.4 可视化工具
可视化工具可以帮助开发者直观地理解链的结构和运行流程。graphviz
库常用于绘制链的拓扑图:
python
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenAI
import graphviz
llm = OpenAI()
prompt = PromptTemplate(input_variables=["question"], template="Answer: {question}")
chain = LLMChain(llm=llm, prompt=prompt)
def visualize_chain(chain):
dot = graphviz.Digraph()
dot.node(str(id(chain)), chain.__class__.__name__)
if hasattr(chain, 'llm'):
dot.node(str(id(chain.llm)), chain.llm.__class__.__name__)
dot.edge(str(id(chain)), str(id(chain.llm)), label="调用LLM")
if hasattr(chain, 'prompt'):
dot.node(str(id(chain.prompt)), "PromptTemplate")
dot.edge(str(id(chain)), str(id(chain.prompt)), label="使用提示模板")
return dot
visualize_chain(chain).render('llm_chain', view=True)
通过可视化,可以快速定位链中组件的连接关系和数据流向。
四、源码级故障定位方法
4.1 调用栈分析
当链运行出错时,首先查看调用栈信息,确定错误发生的具体位置。Python的异常处理机制会自动打印调用栈:
python
try:
# 调用链
result = my_chain({"input_key": "value"})
except Exception as e:
import traceback
traceback.print_exc()
输出的调用栈信息会显示从顶层调用到出错位置的完整路径,例如:
arduino
Traceback (most recent call last):
File "main.py", line 10, in <module>
result = my_chain({"input_key": "value"})
File "/path/to/langchain/chain.py", line 50, in __call__
return self._call(inputs)
File "/path/to/langchain/llm_chain.py", line 30, in _call
text = self.llm(prompt)
File "/path/to/langchain/openai.py", line 20, in __call__
raise ValueError("Invalid OpenAI API key")
ValueError: Invalid OpenAI API key
通过分析调用栈,可以快速定位到openai.py
中API密钥无效的问题。
4.2 变量状态追踪
在链的执行过程中,追踪关键变量的状态有助于发现问题。可以在代码中添加打印语句或使用调试工具检查变量值。
例如,在LLMChain
中检查提示生成:
python
class LLMChain(Chain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
prompt = self.prompt.format(**inputs)
print(f"Generated prompt: {prompt}") # 打印提示
text = self.llm(prompt)
return {"text": text}
如果模型输出不符合预期,检查生成的提示是否正确,可能发现输入数据错误或提示模板配置问题。
4.3 单元测试与断言
编写单元测试可以验证链的各个组件是否按预期工作。使用unittest
或pytest
框架,对链的输入输出进行断言:
python
import unittest
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenAI
class TestLLMChain(unittest.TestCase):
def test_llm_chain(self):
llm = OpenAI()
prompt = PromptTemplate(input_variables=["question"], template="Answer: {question}")
chain = LLMChain(llm=llm, prompt=prompt)
inputs = {"question": "What is the capital of France?"}
result = chain(inputs)
self.assertEqual(set(result.keys()), {"text"}) # 断言输出键
self.assertTrue(isinstance(result["text"], str)) # 断言输出类型
if __name__ == '__main__':
unittest.main()
通过单元测试,可以在开发阶段及时发现组件的逻辑错误。
4.4 对比调试
对比调试是指通过比较正常运行和异常运行的链,找出差异。可以使用相同的输入,在不同环境或配置下运行链,观察输出变化。
例如,对比两个不同版本的Chain
实现:
python
chain_v1 = OldVersionChain()
chain_v2 = NewVersionChain()
input_data = {"key": "value"}
output_v1 = chain_v1(input_data)
output_v2 = chain_v2(input_data)
import json
print(json.dumps(output_v1, indent=4))
print(json.dumps(output_v2, indent=4))
通过对比输出,可以发现新版本链在处理逻辑上的变化是否引入了错误。
五、输入输出相关故障排查
5.1 输入数据验证与清洗
输入数据错误是导致链运行失败的常见原因。在链的入口处进行严格的输入验证和清洗:
python
class MyChain(Chain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
# 验证输入键
required_keys =
输入数据错误是导致链运行失败的常见原因。在链的入口处进行严格的输入验证和清洗:
python
class MyChain(Chain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
# 验证输入键
required_keys = {"input_text", "parameters"}
missing_keys = required_keys - set(inputs.keys())
if missing_keys:
raise ValueError(f"Missing required inputs: {missing_keys}")
# 清洗和转换输入数据
if "input_text" in inputs:
inputs["input_text"] = inputs["input_text"].strip()
# 类型验证
if not isinstance(inputs["parameters"], dict):
try:
inputs["parameters"] = json.loads(inputs["parameters"])
except (TypeError, json.JSONDecodeError):
raise ValueError("'parameters' must be a dictionary or JSON string")
# 其他验证逻辑...
return self._process(inputs) # 进入核心处理逻辑
在LangChain中,PromptTemplate
也会进行输入变量验证,但其错误信息可能不够具体。通过在链层添加自定义验证,可以提供更友好的错误反馈。
5.2 提示构建故障排查
提示构建错误通常表现为模板变量未被正确替换,或生成的提示格式不符合预期。可以通过以下方法排查:
- 打印中间提示 :在
LLMChain
中添加日志,输出构建后的完整提示:
python
class DebugLLMChain(LLMChain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
prompt_text = self.prompt.format(**inputs)
self.logger.debug(f"Generated prompt: \n{prompt_text}")
return super()._call(inputs)
- 验证模板变量 :检查
PromptTemplate
的input_variables
属性,确保与输入数据匹配:
python
template = PromptTemplate(
input_variables=["question", "context"],
template="Answer: {question} based on: {context}"
)
# 验证
input_keys = {"question", "additional_info"} # 缺少"context"
missing_vars = set(template.input_variables) - input_keys
if missing_vars:
raise ValueError(f"Prompt template requires variables: {missing_vars}")
- 处理特殊字符:如果提示中包含引号、括号等特殊字符,可能导致模型解析异常。可以使用转义或预处理:
python
def escape_special_chars(text: str) -> str:
replacements = {
'"': '\\"',
'\\': '\\\\',
'\n': '\\n',
'\t': '\\t'
}
return ''.join(replacements.get(c, c) for c in text)
# 在构建提示前处理输入
inputs["question"] = escape_special_chars(inputs["question"])
5.3 输出解析与格式化
模型输出可能不符合预期格式,需要进行健壮的解析和处理:
- 结构化输出解析:如果期望JSON格式输出,但模型返回非标准格式:
python
class JSONOutputParser:
def parse(self, text: str) -> Dict[str, Any]:
# 尝试直接解析
try:
return json.loads(text)
except json.JSONDecodeError:
# 处理常见的非标准格式
# 例如,提取JSON块
import re
match = re.search(r'\{.*\}', text, re.DOTALL)
if match:
try:
return json.loads(match.group(0))
except json.JSONDecodeError:
pass
# 作为最后手段,返回原始文本的包装
return {"raw_output": text}
- 使用示例引导输出格式:在提示中提供格式化示例,引导模型生成预期结构:
python
prompt_template = """
Generate a JSON object describing the person:
{input}
Example output:
{{
"name": "John Doe",
"age": 30,
"hobbies": ["reading", "swimming"]
}}
Response:
"""
- 错误恢复机制:在链中添加恢复逻辑,当解析失败时采取备选方案:
python
class RobustLLMChain(LLMChain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
output = super()._call(inputs)
try:
parsed_output = self.output_parser.parse(output["text"])
except Exception as e:
self.logger.warning(f"Output parsing failed: {e}. Using fallback.")
parsed_output = {"fallback": output["text"]}
return parsed_output
5.4 输入输出长度管理
模型对输入和输出的长度有限制,超出限制会导致错误。需要进行长度检查和处理:
- 计算输入长度:使用与模型相同的分词器计算提示长度:
python
from transformers import GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
def count_tokens(text: str) -> int:
return len(tokenizer.encode(text))
# 在构建提示后检查长度
prompt_text = self.prompt.format(**inputs)
token_count = count_tokens(prompt_text)
if token_count > MAX_TOKENS:
raise ValueError(f"Prompt exceeds maximum token length: {token_count}/{MAX_TOKENS}")
- 截断或压缩输入:当输入过长时,可选择性截断或提取关键部分:
python
def truncate_text(text: str, max_tokens: int) -> str:
tokens = tokenizer.encode(text)
if len(tokens) <= max_tokens:
return text
truncated_tokens = tokens[:max_tokens]
return tokenizer.decode(truncated_tokens)
# 应用截断
inputs["long_text"] = truncate_text(inputs["long_text"], MAX_TOKENS - 100) # 预留空间
- 处理输出长度 :设置合理的
max_tokens
参数,避免输出过长:
python
llm = OpenAI(max_tokens=500) # 限制最大输出长度
或在生成后截断输出:
python
output = llm(prompt)
if len(output) > MAX_OUTPUT_LENGTH:
output = output[:MAX_OUTPUT_LENGTH]
六、模型调用故障排查
6.1 API认证与权限问题
调用外部模型API时,认证和权限问题常见于:
- API密钥无效或缺失:检查环境变量或配置文件中密钥是否正确:
python
# 验证OpenAI密钥
import openai
openai.api_key = os.environ.get("OPENAI_API_KEY")
try:
openai.Completion.create(
engine="text-davinci-003",
prompt="Test",
max_tokens=1
)
except openai.error.AuthenticationError as e:
print(f"认证失败: {e}")
# 检查密钥是否存在且正确
if not openai.api_key:
print("API密钥未设置")
else:
print("API密钥无效")
- 权限不足:某些模型需要特定权限或订阅级别。例如,调用GPT-4可能需要申请访问权限:
python
try:
response = openai.Completion.create(
engine="gpt-4", # 需要权限
prompt="Hello",
max_tokens=10
)
except openai.error.PermissionError as e:
print(f"权限错误: {e}. 可能需要升级API访问权限。")
- 组织ID问题:如果使用组织特定的API密钥,需要设置正确的组织ID:
python
openai.organization = os.environ.get("OPENAI_ORG_ID")
6.2 网络连接与超时
网络问题可能导致API调用失败或超时:
- 连接超时设置 :调整
timeout
参数,避免长时间等待无响应的请求:
python
llm = OpenAI(timeout=30) # 设置30秒超时
- 错误重试机制 :使用重试库(如
tenacity
)处理临时网络故障:
python
from tenacity import retry, stop_after_attempt, wait_exponential
class RetryOpenAI(OpenAI):
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=4, max=10))
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
return super()._call(prompt, stop)
- 代理设置:在需要通过代理访问API的环境中,配置代理:
python
os.environ["HTTP_PROXY"] = "http://proxy.example.com:8080"
os.environ["HTTPS_PROXY"] = "http://proxy.example.com:8080"
6.3 模型响应异常
模型可能返回意外格式的响应或错误:
- 处理API错误:捕获并解析API返回的错误信息:
python
try:
response = openai.Completion.create(...)
except openai.error.OpenAIError as e:
if isinstance(e, openai.error.InvalidRequestError):
print(f"请求格式错误: {e}")
elif isinstance(e, openai.error.RateLimitError):
print(f"速率限制错误: {e}. 考虑降低请求频率或升级API计划。")
elif isinstance(e, openai.error.ServiceUnavailableError):
print(f"服务不可用: {e}")
else:
print(f"未知错误: {e}")
- 验证响应结构:确保模型返回的数据符合预期结构:
python
def validate_response(response: Dict[str, Any]) -> bool:
if "choices" not in response or not isinstance(response["choices"], list):
return False
if len(response["choices"]) == 0:
return False
if "text" not in response["choices"][0]:
return False
return True
# 在使用响应前验证
if not validate_response(response):
raise ValueError(f"Invalid API response: {response}")
- 处理空响应:模型可能返回空结果,需要添加检查:
python
if not response["choices"][0]["text"].strip():
raise ValueError("Received empty response from model")
6.4 模型配置参数优化
不正确的模型参数设置可能导致性能问题或意外输出:
- 温度参数(temperature):控制输出的随机性。过高的值可能导致不稳定输出,过低的值可能导致回答僵化:
python
# 创造性任务使用较高温度
llm_creative = OpenAI(temperature=0.8)
# 确定性任务使用较低温度
llm_factual = OpenAI(temperature=0.2)
- 最大令牌数(max_tokens):设置过小会导致回答截断,过大会增加响应时间和成本:
python
# 根据任务预估合理的max_tokens
llm = OpenAI(max_tokens=1000) # 适合较长回答
- 停止序列(stop):指定模型停止生成的序列:
python
llm = OpenAI(stop=["\n\n", "###"]) # 遇到空行或###时停止
七、链组合与执行流程调试
7.1 顺序链(SequentialChain)调试
顺序链将多个子链按顺序连接,调试时需关注数据传递和输出键匹配:
- 检查输入输出键映射:确保前一个链的输出键与后一个链的输入键匹配:
python
# 定义子链
chain1 = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["question"],
output_key="intermediate_answer"
)
)
chain2 = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["intermediate_answer"], # 必须匹配chain1的输出键
output_key="final_answer"
)
)
# 创建顺序链
seq_chain = SequentialChain(
chains=[chain1, chain2],
input_variables=["question"],
output_variables=["final_answer"]
)
- 中间结果记录:在顺序链中添加日志记录中间结果:
python
class DebugSequentialChain(SequentialChain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
x = inputs.copy()
for i, chain in enumerate(self.chains):
chain_name = chain.__class__.__name__
self.logger.info(f"Executing chain {i+1}/{len(self.chains)}: {chain_name}")
self.logger.debug(f"Input to {chain_name}: {x}")
x = chain(x)
self.logger.debug(f"Output from {chain_name}: {x}")
return x
7.2 并行链(ParallelChain)调试
并行链同时执行多个子链,调试时需注意:
- 线程安全问题:如果子链共享资源,可能出现线程安全问题。确保子链是独立的:
python
# 错误示例:共享LLM实例可能在线程中冲突
llm = OpenAI()
chain1 = LLMChain(llm=llm, ...)
chain2 = LLMChain(llm=llm, ...) # 共享同一个LLM实例
# 正确示例:为每个链创建独立的LLM实例
chain1 = LLMChain(llm=OpenAI(), ...)
chain2 = LLMChain(llm=OpenAI(), ...)
- 并行执行监控:记录每个并行链的执行时间和结果:
python
class DebugParallelChain(ParallelChain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
results = {}
start_times = {}
end_times = {}
# 启动所有链
for name, chain in self.chains.items():
start_times[name] = time.time()
results[name] = chain.arun(inputs) # 异步运行
# 等待所有链完成
for name, future in results.items():
try:
results[name] = future.result(timeout=self.timeout)
end_times[name] = time.time()
self.logger.info(f"Chain {name} completed in {end_times[name] - start_times[name]:.2f}s")
except TimeoutError:
self.logger.error(f"Chain {name} timed out after {self.timeout}s")
results[name] = None
return results
7.3 条件链(ConditionalChain)调试
条件链根据条件选择执行不同的子链,调试时需验证条件逻辑:
- 条件函数验证:确保条件函数正确评估:
python
def condition_function(inputs: Dict[str, Any]) -> bool:
# 示例条件:检查输入是否包含特定键
return "special_key" in inputs
# 测试条件函数
test_input = {"special_key": "value"}
print(f"条件评估结果: {condition_function(test_input)}") # 应输出True
- 条件链执行跟踪:在条件链中添加日志,记录条件评估和选择的路径:
python
class DebugConditionalChain(ConditionalChain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
condition_result = self.condition(inputs)
self.logger.info(f"条件评估结果: {condition_result}")
if condition_result:
self.logger.info(f"选择链: {self.true_chain.__class__.__name__}")
return self.true_chain(inputs)
else:
self.logger.info(f"选择链: {self.false_chain.__class__.__name__}")
return self.false_chain(inputs)
7.4 递归链调试
递归链在处理复杂结构(如树或图)时可能出现栈溢出或无限循环,调试时需注意:
- 终止条件验证:确保递归有明确的终止条件:
python
def recursive_process(data: List[Any], depth: int = 0) -> List[Any]:
if not data or depth > MAX_DEPTH: # 终止条件
return []
result = []
for item in data:
if isinstance(item, list):
# 递归处理子列表
result.append(recursive_process(item, depth + 1))
else:
result.append(process_item(item))
return result
- 深度跟踪:记录递归深度,避免过深的调用:
python
class DebugRecursiveChain(Chain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
depth = inputs.get("recursion_depth", 0)
self.logger.info(f"递归深度: {depth}")
if depth > MAX_DEPTH:
self.logger.warning(f"达到最大递归深度: {MAX_DEPTH}")
return {"result": "Max depth reached"}
# 递归调用
new_inputs = {**inputs, "recursion_depth": depth + 1}
return self.recursive_chain(new_inputs)
八、智能体(Agent)调试技术
8.1 智能体决策过程追踪
智能体通过推理决定使用哪些工具,调试时需追踪决策过程:
- 日志记录工具选择 :在
Agent
类中添加日志,记录工具选择过程:
python
class DebugAgent(Agent):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
tools = self.get_tools(inputs)
self.logger.info(f"可用工具: {[tool.name for tool in tools]}")
# 获取代理的思考过程
thoughts = self.plan(inputs, tools)
self.logger.debug(f"思考过程: {thoughts}")
# 执行工具选择
action = self._get_next_action(thoughts)
self.logger.info(f"选择工具: {action.tool} with parameters: {action.tool_input}")
# 执行工具
output = self.execute_action(action)
return {"output": output}
- 使用追踪回调:LangChain提供了回调机制,可记录智能体的每一步操作:
python
from langchain.callbacks import get_openai_callback
with get_openai_callback() as cb:
result = agent({"input": "What's the capital of France?"})
print(f"总令牌消耗: {cb.total_tokens}")
print(f"思考步骤: {cb.steps}")
8.2 工具调用异常处理
智能体调用工具时可能出现异常,需要健壮的错误处理:
- 工具包装器:创建工具包装器,捕获并处理工具异常:
python
class SafeToolWrapper:
def __init__(self, tool):
self.tool = tool
def run(self, *args, **kwargs):
try:
return self.tool.run(*args, **kwargs)
except Exception as e:
error_msg = f"工具执行失败: {str(e)}"
self.tool.logger.error(error_msg)
return error_msg
# 包装所有工具
safe_tools = [SafeToolWrapper(tool) for tool in agent.tools]
- 错误恢复策略:在智能体中添加错误恢复逻辑:
python
class FaultTolerantAgent(Agent):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
max_retries = 3
retries = 0
while retries < max_retries:
try:
return super()._call(inputs)
except Exception as e:
retries += 1
self.logger.warning(f"尝试 {retries}/{max_retries}: {str(e)}")
# 可以添加额外的恢复逻辑,如清除缓存、重置状态等
return {"error": f"经过 {max_retries} 次尝试后仍然失败"}
8.3 工具注册与发现问题
智能体需要正确注册和发现工具,调试时需检查:
- 工具注册验证:确保所有工具都已正确注册到智能体:
python
# 验证工具注册
agent_tools = {tool.name for tool in agent.tools}
expected_tools = {"search", "calculator", "database_lookup"}
missing_tools = expected_tools - agent_tools
if missing_tools:
raise ValueError(f"缺少工具: {missing_tools}")
- 工具描述检查:工具描述影响智能体选择,需确保描述准确:
python
# 检查工具描述
for tool in agent.tools:
print(f"工具: {tool.name}")
print(f"描述: {tool.description}")
print("-" * 40)
8.4 智能体陷入循环
智能体可能因错误的工具选择策略而陷入循环,需要检测和预防:
- 循环检测:记录智能体的历史操作,检测重复模式:
python
class LoopDetectionAgent(Agent):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.history = []
self.max_history_length = 10
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
# 记录当前操作
action = self._get_next_action(inputs)
self.history.append((action.tool, action.tool_input))
# 保持历史记录在合理长度
if len(self.history) > self.max_history_length:
self.history.pop(0)
# 检测循环
if len(self.history) >= 3: # 至少需要3步才能形成循环
last_three = self.history[-3:]
if last_three.count(last_three[0]) > 1:
self.logger.warning("检测到可能的循环操作")
# 可以采取措施,如强制选择不同工具
return super()._call(inputs)
- 设置最大步数:限制智能体的最大操作步数,防止无限循环:
python
class BoundedStepsAgent(Agent):
def __init__(self, max_steps: int = 10, *args, **kwargs):
super().__init__(*args, **kwargs)
self.max_steps = max_steps
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
for step in range(self.max_steps):
# 执行一步
result = super()._call(inputs)
# 检查是否完成
if self._is_done(result):
return result
self.logger.info(f"步骤 {step+1}/{self.max_steps} 完成")
self.logger.warning(f"达到最大步数: {self.max_steps}")
return {"output": "Reached maximum steps without completion"}
九、性能优化与调试
9.1 执行时间分析
识别链执行中的性能瓶颈:
- 时间记录装饰器:为关键函数添加时间记录:
python
import time
def timeit(func):
def wrapper(*args, **kwargs):
start_time = time.time()
result = func(*args, **kwargs)
end_time = time.time()
print(f"{func.__name__} 执行时间: {end_time - start_time:.2f}s")
return result
return wrapper
# 应用于链的关键方法
class LLMChain(Chain):
@timeit
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
return super()._call(inputs)
- 使用cProfile进行性能分析:
python
import cProfile
import pstats
profiler = cProfile.Profile()
profiler.enable()
# 执行链
result = my_chain({"input": "test"})
profiler.disable()
stats = pstats.Stats(profiler)
stats.strip_dirs().sort_stats('cumulative').print_stats(20) # 显示前20个耗时最多的函数
9.2 缓存机制应用
使用缓存减少重复计算:
- 内存缓存 :使用
lru_cache
或自定义缓存:
python
from functools import lru_cache
class CachedLLMChain(LLMChain):
@lru_cache(maxsize=128)
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
return super()._call(inputs)
- 持久化缓存:使用磁盘或数据库存储缓存结果:
python
from langchain.cache import SQLiteCache
from langchain.llms import OpenAI
# 配置缓存
langchain.llm_cache = SQLiteCache(database_path=".langchain_cache.db")
# 使用缓存的LLM
llm = OpenAI(cache=True)
9.3 并行与异步执行
优化链的并行执行能力:
- 异步链执行 :使用
async
方法执行链:
python
class AsyncLLMChain(LLMChain):
async def acall(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
# 异步调用LLM
response = await self.llm.agenerate([self.prompt.format(**inputs)])
return {"text": response.generations[0][0].text}
- 并行多链执行 :使用
asyncio
并行执行多个链:
python
import asyncio
async def run_chains_parallel(chains, inputs):
tasks = [chain.acall(inputs) for chain in chains]
return await asyncio.gather(*tasks)
# 使用示例
results = asyncio.run(run_chains_parallel([chain1, chain2, chain3], {"input": "data"}))
9.4 资源使用监控
监控链执行过程中的资源使用情况:
- 内存监控 :使用
memory_profiler
监控内存使用:
python
from memory_profiler import profile
@profile
def run_chain_with_memory_profile():
return my_chain({"input": "large_data"})
# 运行并查看内存使用报告
run_chain_with_memory_profile()
- GPU资源监控:如果使用GPU加速,监控GPU利用率:
python
import torch
def get_gpu_usage():
if torch.cuda.is_available():
return {
"memory_allocated": torch.cuda.memory_allocated() / 1024**2, # MB
"memory_cached": torch.cuda.memory_cached() / 1024**2, # MB
"gpu_count": torch.cuda.device_count(),
"current_device": torch.cuda.current_device()
}
return {"status": "GPU not available"}
十、高级调试技术与最佳实践
10.1 自定义回调与钩子
LangChain提供回调机制,可在链执行的不同阶段插入自定义逻辑:
- 实现自定义回调处理器:
python
from langchain.callbacks.base import BaseCallbackHandler
class DebugCallbackHandler(BaseCallbackHandler):
def on_llm_start(self, serialized: Dict[str, Any], prompts: List[str], **kwargs) -> None:
print(f"LLM开始执行,提示数: {len(prompts)}")
print(f"第一个提示: {prompts[0][:100]}...")
def on_llm_end(self, response: LLMResult, **kwargs) -> None:
print(f"LLM执行完成,生成了 {len(response.generations)} 个结果")
def on_chain_start(self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs) -> None:
print(f"链开始执行: {serialized.get('name', 'Unknown')}")
print(f"输入: {json.dumps(inputs, indent=2)}")
def on_chain_end(self, outputs: Dict[str, Any], **kwargs) -> None:
print(f"链执行完成")
print(f"输出: {json.dumps(outputs, indent=2)}")
- 使用自定义回调:
python
callback = DebugCallbackHandler()
result = my_chain({"input": "data"}, callbacks=[callback])
10.2 测试框架与断言
编写单元测试和集成测试确保链的正确性:
- 单元测试示例:
python
import unittest
from langchain import LLMChain, PromptTemplate
from langchain.llms import OpenAI
class TestLLMChain(unittest.TestCase):
def setUp(self):
self.llm = OpenAI(temperature=0)
self.prompt = PromptTemplate(
input_variables=["question"],
template="Answer this question: {question}"
)
self.chain = LLMChain(llm=self.llm, prompt=self.prompt)
def test_basic_functionality(self):
result = self.chain.run("What's 2+2?")
self.assertIsInstance(result, str)
self.assertIn("4", result) # 简单断言,实际可能需要更复杂的验证
def test_input_validation(self):
with self.assertRaises(ValueError):
self.chain.run({"wrong_key": "value"})
- 集成测试示例:
python
class TestSequentialChain(unittest.TestCase):
def test_sequential_chain(self):
# 设置第一个链:问题生成
prompt1 = PromptTemplate(
input_variables=["topic"],
template="Generate a question about {topic}."
)
chain1 = LLMChain(llm=self.llm, prompt=prompt1)
# 设置第二个链:回答问题
prompt2 = PromptTemplate(
input_variables=["question"],
template="Answer: {question}"
)
chain2 = LLMChain(llm=self.llm, prompt=prompt2)
# 创建顺序链
seq_chain = SequentialChain(
chains=[chain1, chain2],
input_variables=["topic"],
output_variables=["text"]
)
# 执行测试
result = seq_chain.run("Python programming")
self.assertIsInstance(result, str)
self.assertTrue(len(result) > 10) # 简单长度检查
10.3 版本控制与环境隔离
确保链在不同环境中的一致性:
- 依赖管理 :使用
requirements.txt
或poetry
管理依赖:
python
# requirements.txt
langchain==0.0.152
openai==0.27.7
python-dotenv==1.0.0
- 环境变量管理 :使用
.env
文件存储敏感信息:
env
OPENAI_API_KEY=your_api_key_here
HUGGINGFACE_API_KEY=your_huggingface_key
- Docker化部署:使用Docker容器化链,确保环境一致性:
dockerfile
# Dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]
10.4 文档与注释最佳实践
为链组件添加清晰的文档和注释:
- 类和方法文档字符串:
python
class MyCustomChain(Chain):
"""
自定义链,执行特定领域的问答任务
参数:
llm (LLM): 用于生成回答的语言模型
knowledge_base (KnowledgeBase): 用于检索相关信息的知识库
max_retries (int, optional): 检索失败时的最大重试次数,默认为3
"""
def __init__(self, llm: LLM, knowledge_base: KnowledgeBase, max_retries: int = 3):
self.llm = llm
self.knowledge_base = knowledge_base
self.max_retries = max_retries
@property
def input_keys(self) -> List[str]:
"""返回链的输入键,包括'question'和可选的'context'"""
return ["question", "context"]
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
"""
执行链的核心逻辑
参数:
inputs (Dict[str, Any]): 包含'question'和可选'context'的输入字典
返回:
Dict[str, Any]: 包含'answer'和'references'的输出字典
"""
question = inputs["question"]
context = inputs.get("context", "")
# 从知识库检索相关信息
retries = 0
while retries < self.max_retries:
try:
relevant_info = self.knowledge_base.retrieve(question, context)
break
except Exception as e:
retries += 1
if retries >= self.max_retries:
raise ValueError("Failed to retrieve relevant information after retries")
# 构建提示并生成回答
prompt = self._construct_prompt(question, relevant_info)
answer = self.llm(prompt)
return {"answer": answer, "references": relevant_info}
- 复杂逻辑注释:
python
def _construct_prompt(self, question: str, context: str) -> str:
# 构建三部分提示:指令、上下文和问题
# 指令部分指导模型如何处理信息
instruction = "You are an expert in this domain. Use the provided context to answer the question concisely and accurately."
# 上下文部分提供相关背景信息
context_section = f"Context: {context}" if context else "No additional context provided."
# 问题部分是用户的具体问题
question_section = f"Question: {question}"
# 组合各部分,使用明确的分隔符
return f"{instruction}\n\n{context_section}\n\n{question_section}\n\nAnswer:"
10.5 错误处理与恢复策略
构建健壮的错误处理机制,确保链在遇到异常时能优雅降级或恢复:
- 全局异常处理器:捕获未处理的异常并记录详细信息:
python
import sys
import traceback
def global_exception_handler(exc_type, exc_value, exc_traceback):
print(f"未处理异常: {exc_type.__name__}: {exc_value}")
traceback.print_exception(exc_type, exc_value, exc_traceback)
# 可以添加额外的恢复逻辑,如保存当前状态、通知管理员等
# 设置全局异常处理器
sys.excepthook = global_exception_handler
- 重试机制:对临时性错误(如网络波动)实现重试:
python
from tenacity import retry, stop_after_attempt, wait_exponential
@retry(stop=stop_after_attempt(3), wait=wait_exponential(multiplier=1, min=2, max=10))
def call_flaky_api(url, data):
response = requests.post(url, json=data)
response.raise_for_status()
return response.json()
- 降级策略:当关键组件不可用时,提供替代方案:
python
class FallbackLLM(LLM):
def __init__(self, primary_llm, fallback_llm):
self.primary_llm = primary_llm
self.fallback_llm = fallback_llm
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
try:
return self.primary_llm(prompt, stop=stop)
except Exception as e:
print(f"主LLM失败,切换到备用LLM: {e}")
return self.fallback_llm(prompt, stop=stop)
10.6 安全与隐私保护
在调试和生产环境中确保数据安全和隐私:
- 敏感信息过滤:在日志和错误信息中过滤敏感数据:
python
def filter_sensitive_data(data: Dict[str, Any]) -> Dict[str, Any]:
SENSITIVE_KEYS = {"api_key", "password", "credit_card"}
return {
k: "[FILTERED]" if k.lower() in SENSITIVE_KEYS else v
for k, v in data.items()
}
# 在日志记录前过滤敏感信息
log_data = filter_sensitive_data(user_input)
self.logger.info(f"处理用户输入: {log_data}")
- 数据加密:对存储的敏感数据进行加密:
python
from cryptography.fernet import Fernet
# 生成密钥(仅执行一次,安全存储)
key = Fernet.generate_key()
cipher_suite = Fernet(key)
# 加密数据
encrypted_data = cipher_suite.encrypt(b"sensitive_data")
# 解密数据
decrypted_data = cipher_suite.decrypt(encrypted_data)
- 合规性检查:确保符合隐私法规(如GDPR):
python
def ensure_compliance(user_data):
# 检查是否有必要的用户同意
if not user_data.get("consent_given", False):
raise ValueError("Missing user consent for data processing")
# 检查数据保留期限
if "timestamp" in user_data:
age = (datetime.now() - user_data["timestamp"]).days
if age > MAX_DATA_RETENTION_DAYS:
raise ValueError("Data retention period exceeded")
return True
10.7 持续集成与自动化测试
建立自动化测试流程,确保代码变更不会引入新问题:
- GitHub Actions配置示例:
yaml
# .github/workflows/tests.yml
name: Run Tests
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.9
uses: actions/setup-python@v4
with:
python-version: 3.9
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
- name: Run tests
run: |
pytest tests/ --cov=langchain_project --cov-report=xml
- name: Upload coverage report
uses: codecov/codecov-action@v3
with:
token: ${{ secrets.CODECOV_TOKEN }}
file: ./coverage.xml
- 测试覆盖率要求:
python
# pytest.ini
[pytest]
addopts = --cov=my_chain --cov-report=term-missing --cov-fail-under=80
10.8 监控与告警
建立生产环境的监控系统,及时发现和响应问题:
- Prometheus指标收集:
python
from prometheus_client import Counter, Histogram, start_http_server
# 定义指标
REQUEST_COUNTER = Counter(
'chain_requests_total',
'Total number of requests to the chain',
['chain_name', 'status']
)
REQUEST_DURATION = Histogram(
'chain_request_duration_seconds',
'Duration of requests to the chain',
['chain_name']
)
# 在链执行前后记录指标
class InstrumentedChain(Chain):
def __init__(self, name, *args, **kwargs):
super().__init__(*args, **kwargs)
self.name = name
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
start_time = time.time()
try:
result = super()._call(inputs)
REQUEST_COUNTER.labels(self.name, 'success').inc()
return result
except Exception as e:
REQUEST_COUNTER.labels(self.name, 'failure').inc()
raise
finally:
duration = time.time() - start_time
REQUEST_DURATION.labels(self.name).observe(duration)
# 启动指标服务器
start_http_server(8000)
- 告警规则配置:
yaml
# alert.rules
groups:
- name: chain-alerts
rules:
- alert: HighErrorRate
expr: rate(chain_requests_total{status="failure"}[5m]) / rate(chain_requests_total[5m]) > 0.1
for: 5m
labels:
severity: critical
annotations:
summary: "链 {{ $labels.chain_name }} 错误率过高 (当前值: {{ $value }})"
description: "链 {{ $labels.chain_name }} 的错误率在过去5分钟内超过10%"
10.9 调试工作流优化
建立高效的调试工作流,减少定位问题的时间:
- 问题报告模板:
markdown
### 问题描述
简要描述遇到的问题
### 复现步骤
1. 执行链...
2. 输入参数...
3. 观察到错误...
### 预期行为
描述你期望发生的事情
### 实际行为
描述实际发生的事情
### 环境信息
- LangChain版本:
- Python版本:
- 操作系统:
- 使用的LLM:
### 错误日志
```python
# 粘贴完整的错误堆栈跟踪
python
2. **调试工具包**:
```python
# debug_utils.py
import inspect
import json
from datetime import datetime
def dump_object(obj):
"""递归地将对象转换为可序列化的字典"""
if isinstance(obj, (int, float, str, bool, type(None))):
return obj
elif isinstance(obj, dict):
return {k: dump_object(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [dump_object(item) for item in obj]
elif hasattr(obj, '__dict__'):
return {k: dump_object(v) for k, v in obj.__dict__.items() if not callable(v)}
else:
return str(obj)
def log_step(step_name, data):
"""记录调试步骤和相关数据"""
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"[{timestamp}] Step: {step_name}")
print(json.dumps(dump_object(data), indent=2))
print("-" * 50)
def get_current_function_name():
"""获取当前执行的函数名"""
return inspect.currentframe().f_back.f_code.co_name
10.10 性能调优最佳实践
优化链的性能,减少响应时间和资源消耗:
- 批量处理:将多个请求合并为一个批次处理:
python
def batch_process(items, batch_size=10):
batches = [items[i:i+batch_size] for i in range(0, len(items), batch_size)]
results = []
for batch in batches:
# 构建批量提示
batch_prompt = "\n\n".join([f"Item {i+1}: {item}" for i, item in enumerate(batch)])
batch_response = llm(batch_prompt)
# 解析批量响应
batch_results = parse_batch_response(batch_response, len(batch))
results.extend(batch_results)
return results
- 模型量化:使用量化技术减少模型大小和推理时间:
python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# 加载并量化模型
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.float16, # 使用半精度
low_cpu_mem_usage=True
)
# 使用量化模型的LLM包装器
class QuantizedLLM(LLM):
def __init__(self, model, tokenizer):
self.model = model
self.tokenizer = tokenizer
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
outputs = self.model.generate(
**inputs,
max_length=100,
temperature=0.7,
pad_token_id=self.tokenizer.eos_token_id
)
return self.tokenizer.decode(outputs[0], skip_special_tokens=True)
- 预热与缓存:在服务启动时预热模型和缓存常用结果:
python
def warmup_model():
"""在服务启动时执行一次推理,预热模型"""
print("预热模型中...")
llm("Hello, this is a warm-up request.")
print("模型预热完成")
# 在应用启动时调用
warmup_model()
# 预缓存常用结果
COMMON_PROMPTS = {
"welcome_message": "Generate a welcome message for new users.",
"faq_answer": "Answer the most frequently asked question."
}
PRE_CACHED_RESULTS = {
prompt: llm(prompt) for prompt in COMMON_PROMPTS.values()
}
十一、特定领域链的调试技巧
11.1 知识库问答链调试
知识库问答链将检索与生成结合,调试时需关注:
- 检索结果验证:
python
def debug_retrieval(question, retriever):
"""调试检索过程,返回检索结果和相似度分数"""
docs = retriever.get_relevant_documents(question)
print(f"问题: {question}")
print(f"检索到 {len(docs)} 个文档")
for i, doc in enumerate(docs):
print(f"\n文档 {i+1}:")
print(f" 相似度分数: {doc.metadata.get('score', 'N/A')}")
print(f" 内容: {doc.page_content[:200]}...")
return docs
- 提示构建与验证:
python
def build_qa_prompt(question, context_docs):
"""构建知识库问答提示"""
context = "\n\n".join([f"Document {i+1}:\n{doc.page_content}"
for i, doc in enumerate(context_docs)])
prompt = f"""
Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
{context}
Question: {question}
Answer:
"""
return prompt.strip()
# 调试提示构建
question = "What is LangChain?"
context_docs = debug_retrieval(question, retriever)
prompt = build_qa_prompt(question, context_docs)
print("\n构建的提示:")
print(prompt)
11.2 多模态链调试
多模态链处理文本和其他模态数据(如图像、音频),调试时需注意:
- 模态转换验证:
python
def debug_image_to_text(image_path, image_processor, model):
"""调试图像到文本的转换过程"""
try:
# 加载图像
image = Image.open(image_path).convert("RGB")
# 处理图像
inputs = image_processor(images=image, return_tensors="pt")
# 生成描述
outputs = model.generate(**inputs)
caption = image_processor.decode(outputs[0], skip_special_tokens=True)
print(f"图像描述: {caption}")
return caption
except Exception as e:
print(f"图像到文本转换失败: {e}")
return None
- 多模态提示构建:
python
def build_multimodal_prompt(image_caption, text_input):
"""构建多模态提示"""
return f"""
Image description: {image_caption}
Based on the image and the following text, answer the question:
{text_input}
"""
11.3 数学推理链调试
数学推理链需要精确的步骤验证和错误定位:
- 中间步骤验证:
python
class MathChain(LLMChain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
question = inputs["question"]
# 第一步:分解问题
decomposition = self._decompose_question(question)
print(f"问题分解: {decomposition}")
# 第二步:解决每个子问题
sub_answers = {}
for i, sub_question in enumerate(decomposition):
sub_answer = self.llm(sub_question)
sub_answers[f"sub_{i}"] = sub_answer
print(f"子问题 {i+1} 答案: {sub_answer}")
# 第三步:综合答案
final_answer = self._synthesize_answers(decomposition, sub_answers)
print(f"最终答案: {final_answer}")
return {"answer": final_answer}
- 符号验证:
python
def validate_math_answer(question, answer):
"""使用sympy验证数学答案"""
try:
from sympy.parsing import mathematica
from sympy import simplify
# 尝试解析问题和答案
expr_question = mathematica.parse(question)
expr_answer = mathematica.parse(answer)
# 验证答案是否正确
is_correct = simplify(expr_question - expr_answer) == 0
return is_correct
except Exception as e:
print(f"无法验证数学答案: {e}")
return False
11.4 代码生成链调试
代码生成链需要验证生成代码的语法正确性和功能完整性:
- 语法检查:
python
import ast
def check_code_syntax(code):
"""检查生成的代码是否符合Python语法"""
try:
ast.parse(code)
return True, "语法正确"
except SyntaxError as e:
return False, f"语法错误: {e}"
# 在链中使用
generated_code = my_chain({"input": "Write a Python function to add two numbers"})["output"]
is_valid, message = check_code_syntax(generated_code)
print(f"代码语法检查: {message}")
- 单元测试生成与执行:
python
def generate_unit_tests(function_code):
"""为生成的函数代码生成单元测试"""
prompt = f"""
Generate pytest unit tests for the following Python function:
{function_code}
Return only the test code:
"""
test_code = llm(prompt)
return test_code
def run_unit_tests(test_code, function_code):
"""执行单元测试并返回结果"""
import tempfile
import subprocess
# 创建临时文件
with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
f.write(f"{function_code}\n\n{test_code}")
temp_file = f.name
# 执行测试
result = subprocess.run(
["pytest", "-v", temp_file],
capture_output=True,
text=True
)
# 删除临时文件
os.unlink(temp_file)
return result.stdout, result.stderr, result.returncode
十二、与外部系统集成的调试方法
12.1 数据库集成调试
当链与数据库集成时,需验证连接、查询和数据处理:
- 连接测试:
python
def test_database_connection(connection_string):
"""测试数据库连接"""
try:
conn = psycopg2.connect(connection_string)
cursor = conn.cursor()
cursor.execute("SELECT 1")
result = cursor.fetchone()
conn.close()
if result[0] == 1:
return True, "连接成功"
else:
return False, "连接测试失败"
except Exception as e:
return False, f"连接错误: {e}"
# 使用示例
connection_string = "postgresql://user:password@localhost:5432/mydb"
success, message = test_database_connection(connection_string)
print(f"数据库连接测试: {message}")
- 查询验证:
python
def debug_database_query(query, connection_string):
"""调试数据库查询,返回执行结果和性能指标"""
try:
conn = psycopg2.connect(connection_string)
cursor = conn.cursor()
# 记录执行时间
start_time = time.time()
cursor.execute(query)
# 获取查询结果
if query.strip().lower().startswith("select"):
results = cursor.fetchall()
columns = [desc[0] for desc in cursor.description]
result_data = [dict(zip(columns, row)) for row in results]
else:
result_data = {"rows_affected": cursor.rowcount}
execution_time = time.time() - start_time
conn.close()
return {
"success": True,
"data": result_data,
"execution_time": execution_time
}
except Exception as e:
return {
"success": False,
"error": str(e)
}
12.2 API集成调试
调试与外部API的集成时,需验证请求、响应和错误处理:
- 请求构建与发送:
python
def debug_api_request(url, method="GET", headers=None, params=None, json_data=None):
"""调试API请求,返回详细的请求和响应信息"""
import requests
try:
# 准备请求
request_info = {
"url": url,
"method": method,
"headers": headers,
"params": params,
"json": json_data
}
print(f"发送API请求: {method} {url}")
# 发送请求
response = requests.request(
method,
url,
headers=headers,
params=params,
json=json_data,
timeout=30
)
# 解析响应
response_info = {
"status_code": response.status_code,
"headers": dict(response.headers),
"text": response.text,
"elapsed_time": response.elapsed.total_seconds()
}
# 尝试解析JSON
try:
response_info["json"] = response.json()
except json.JSONDecodeError:
response_info["json"] = None
return {
"request": request_info,
"response": response_info,
"success": response.ok
}
except Exception as e:
return {
"error": str(e),
"success": False
}
- API响应处理:
python
class APITool:
def run(self, endpoint: str, params: Dict[str, Any] = None) -> str:
base_url = "https://api.example.com/v1"
full_url = f"{base_url}/{endpoint}"
result = debug_api_request(full_url, params=params)
if not result["success"]:
error_msg = result.get("error", "Unknown error")
return f"API调用失败: {error_msg}"
response = result["response"]
if response["status_code"] != 200:
return f"API返回非成功状态码: {response['status_code']}"
# 提取相关数据
data = response.get("json", {})
return json.dumps(data, indent=2)
12.3 文件系统集成调试
当链与文件系统交互时,需验证文件操作和权限:
- 文件读写测试:
python
def test_file_operations(directory: str):
"""测试文件读写操作"""
try:
# 检查目录是否存在
if not os.path.exists(directory):
try:
os.makedirs(directory)
print(f"创建目录: {directory}")
except Exception as e:
return False, f"无法创建目录: {e}"
# 写入测试文件
test_file = os.path.join(directory, "test.txt")
with open(test_file, 'w') as f:
f.write("This is a test file.")
# 读取测试文件
with open(test_file, 'r') as f:
content = f.read()
# 删除测试文件
os.remove(test_file)
if content == "This is a test file.":
return True, "文件操作测试成功"
else:
return False, "读取内容不匹配"
except Exception as e:
return False, f"文件操作错误: {e}"
- 文件路径处理:
python
def resolve_file_path(relative_path: str, base_dir: str = None):
"""解析文件路径,处理相对路径和绝对路径"""
if base_dir is None:
base_dir = os.getcwd()
# 如果是绝对路径,直接返回
if os.path.isabs(relative_path):
return relative_path
# 构建完整路径
full_path = os.path.join(base_dir, relative_path)
# 规范化路径
normalized_path = os.path.normpath(full_path)
# 验证路径是否在基目录内(防止路径遍历攻击)
if not normalized_path.startswith(base_dir):
raise ValueError(f"不安全的路径: {relative_path}")
return normalized_path
12.4 第三方服务集成调试
调试与第三方服务(如搜索引擎、翻译API)的集成时,需验证认证和功能:
- 搜索引擎集成:
python
class SearchTool:
def __init__(self, api_key: str):
self.api_key = api_key
self.search_engine = GoogleSearchAPI(api_key)
def run(self, query: str) -> str:
try:
# 验证API密钥
if not self.api_key:
return "错误: 缺少API密钥"
# 执行搜索
results = self.search_engine.search(query, num_results=3)
if not results:
return "未找到相关结果"
# 格式化结果
formatted_results = []
for i, result in enumerate(results):
formatted_results.append(f"结果 {i+1}: {result['title']}\n{result['snippet']}\n{result['link']}\n")
return "\n\n".join(formatted_results)
except Exception as e:
return f"搜索失败: {str(e)}"
- 翻译服务集成:
python
def debug_translation(text: str, target_lang: str, translator):
"""调试翻译服务,返回原始文本、翻译结果和中间信息"""
try:
print(f"翻译文本: '{text[:50]}...' 到 {target_lang}")
# 执行翻译
translation = translator.translate(text, target_lang=target_lang)
print(f"翻译结果: '{translation[:50]}...'")
return {
"original_text": text,
"translated_text": translation,
"target_language": target_lang,
"success": True
}
except Exception as e:
print(f"翻译错误: {e}")
return {
"error": str(e),
"success": False
}
十三、分布式与并行链调试
13.1 分布式链架构调试
分布式链架构涉及多个节点,调试时需关注节点间通信和协调:
- 节点通信测试:
python
def test_node_communication(node_addresses):
"""测试分布式节点间的通信"""
results = {}
for source in node_addresses:
for target in node_addresses:
if source == target:
continue
test_id = f"{source} -> {target}"
try:
# 模拟节点间通信
response = requests.get(f"{target}/health", timeout=5)
if response.status_code == 200:
results[test_id] = {"status": "success", "latency": response.elapsed.total_seconds()}
else:
results[test_id] = {"status": "failure", "code": response.status_code, "reason": response.text}
except Exception as e:
results[test_id] = {"status": "error", "message": str(e)}
return results
- 分布式追踪:
python
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
# 配置追踪器
resource = Resource(attributes={SERVICE_NAME: "langchain_distributed"})
jaeger_exporter = JaegerExporter(
agent_host_name="localhost",
agent_port=6831,
)
provider = TracerProvider(resource=resource)
processor = BatchSpanProcessor(jaeger_exporter)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)
# 在链执行中添加追踪
class TracedChain(Chain):
def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
with tracer.start_as_current_span("chain_execution") as span:
span.set_attribute("chain_name", self.__class__.__name__)
span.set_attribute("inputs", str(inputs))
try:
result = super()._call(inputs)
span.set_attribute("success", True)
span.set_attribute("outputs", str(result))
return result
except Exception as e:
span.set_attribute("success", False)
span.set_attribute("error", str(e))
raise
13.2 并行链性能调试
优化并行链性能时,需分析资源利用和任务分配:
- 并行任务分配分析:
python
def analyze_parallel_tasks(tasks, workers=4):
"""分析并行任务分配和执行时间"""
from concurrent.futures import ThreadPoolExecutor
import time
def run_task(task):
start_time = time.time()
# 模拟任务执行
time.sleep(task["duration"])
end_time = time.time()
return {
"task_id": task["id"],
"start_time": start_time,
"end_time": end_time,
"duration": end_time - start_time
}
# 执行并行任务
with ThreadPoolExecutor(max_workers=workers) as executor:
results = list(executor.map(run_task, tasks))
# 分析结果
total_time = max(r["end_time"] for r in results) - min(r["start_time"] for r in results)
total_task_time = sum(r["duration"] for r in results)
efficiency = total_task_time / (total_time * workers)
print(f"总执行时间: {total_time:.2f}s")
print(f"总任务时间: {total_task_time:.2f}s")
print(f"效率: {efficiency:.2%}")
return {
"results": results,
"total_time": total_time,
"efficiency": efficiency
}
- 资源竞争检测:
python
def detect_resource_contention(chain, inputs, iterations=10):
"""检测链执行中的资源竞争"""
import threading
from queue import Queue
error_queue = Queue()
def run_chain():
try:
chain(inputs)
except Exception as e:
error_queue.put(e)
# 并行执行链
threads = [threading.Thread(target=run_chain) for _ in range(iterations)]
for t in threads:
t.start()
for t in threads:
t.join()
# 收集错误
errors = []
while not error_queue.empty():
errors.append(error_queue.get())
if errors:
print(f"检测到 {len(errors)} 个错误")
for i, e in enumerate(errors):
print(f"错误 {i+1}: {e}")
return False
else:
print("未检测到资源竞争")
return True
13.3 负载均衡调试
调试负载均衡时,需验证请求分配和节点健康状态:
- 负载均衡验证:
python
def verify_load_balancing(load_balancer, requests=100):
"""验证负载均衡器的请求分配"""
node_counts = {}
for _ in range(requests):
node = load_balancer.select_node()
node_counts[node] = node_counts.get(node, 0) + 1
# 计算分配比例
total = sum(node_counts.values())
distribution = {node: count/total for node, count in node_counts.items()}
print(f"请求分布 ({requests} 次请求):")
for node, ratio in distribution.items():
print(f" {node}: {ratio:.2%}")
# 计算标准差,评估分配均匀性
mean = 1 / len(node_counts)
variance = sum((ratio - mean) ** 2 for ratio in distribution.values()) / len(distribution)
std_dev = variance ** 0.5
print(f"分配标准差: {std_dev:.4f}")
return {
"distribution": distribution,
"standard_deviation": std_dev
}
- 节点健康检查:
python
def check_node_health(nodes):
"""检查所有节点的健康状态"""
results = {}
for node in nodes:
try:
response = requests.get(f"{node}/health", timeout=3)
if response.status_code == 200:
results[node] = {"status": "healthy", "details": response.json()}
else:
results[node] = {"status": "unhealthy", "code": response.status_code, "reason": response.text}
except Exception as e:
results[node] = {"status": "down", "error": str(e)}
return results
13.4 分布式缓存调试
调试分布式缓存时,需验证缓存命中率和数据一致性:
- 缓存命中率分析:
python
def analyze_cache_hit_rate(cache, test_keys, populate_cache=True):
"""分析缓存命中率"""
hits = 0
misses = 0
# 预热缓存
if populate_cache:
for key in test_keys:
cache.set(key, f"value_{key}")
# 测试缓存
for key in test_keys:
if cache.get(key) is not None:
hits += 1
else:
misses += 1
total = hits + misses
hit_rate = hits / total if total > 0 else 0
print(f"缓存命中率: {hit_rate:.2%} ({hits}/{total})")
return hit_rate
- 缓存一致性检查:
python
def check_cache_consistency(cache_nodes, key, value):
"""检查分布式缓存节点间的数据一致性"""
results = {}
# 在所有节点设置值
for node in cache_nodes:
node.set(key, value)
# 从所有节点获取值
for node in cache_nodes:
retrieved_value = node.get(key)
results[node.name] = retrieved_value
# 检查一致性
unique_values = set(results.values())
if len(unique_values) == 1:
print(f"缓存一致: {value}")
return True, results
else:
print(f"缓存不一致: {results}")
return False, results
十四、LangChain与大模型交互的调试
14.1 模型响应分析
深入分析大模型的响应模式和特点:
- 温度参数对响应的影响:
python
def analyze_temperature_effect(prompt, temperatures=[0.1, 0.5, 0.8, 1.0]):
"""分析不同温度参数对模型响应的影响"""
results = {}
for temp in temperatures:
llm = OpenAI(temperature=temp)
responses = []
# 多次生成以观察变化
for _ in range(5):
response = llm(prompt)
responses.append(response)
results[temp] = responses
# 打印分析结果
for temp, responses in results.items():
print(f"\n温度 = {temp}:")
for i, response in enumerate(responses):
print(f" 生成 {i+1}: {response[:100]}...")
return results
- 模型响应的多样性评估:
python
def evaluate_response_diversity(responses):
"""评估模型响应的多样性"""
from difflib import SequenceMatcher
# 计算相似度矩阵
n = len(responses)
similarity_matrix = [[0.0 for _ in range(n)] for _ in range(n)]
for i in range(n):
for j in range(i+1, n):
similarity = SequenceMatcher(None, responses[i], responses[j]).ratio()
similarity_matrix[i][j] = similarity
similarity_matrix[j][i] = similarity
# 计算平均相似度
avg_similarity = sum(sum(row) for row in similarity_matrix) / (n * (n-1)) if n > 1 else 0
# 计算多样性得分 (1 - 平均相似度)
diversity = 1.0 - avg_similarity
print(f"响应多样性得分: {diversity:.2f} (0表示完全相同,1表示完全不同)")
return diversity
14.2 提示工程调试
优化提示设计以获得更好的模型响应:
- 提示迭代实验:
python
def run_prompt_experiment(prompt_templates, test_inputs, llm):
"""运行提示模板实验,比较不同提示的效果"""
results = {}
for name, template in prompt_templates.items():
prompt_results = []
for input_data in test_inputs:
# 格式化提示
prompt = template.format(**input_data)
# 获取模型响应
response = llm(prompt)
# 记录结果
prompt_results.append({
"input": input_data,
"prompt": prompt,
"response": response
})
results[name] = prompt_results
return results
- 少样本学习提示调试:
python
def debug_few_shot_prompt(example_pairs, test_input, llm):
"""调试少样本学习提示"""
# 构建少样本提示
few_shot_prompt = "\n\n".join([
f"Q: {pair['question']}\nA: {pair['answer']}"
for pair in example_pairs
])
# 添加测试问题
full_prompt = f"{few_shot_prompt}\n\nQ: {test_input}\nA:"
# 获取模型响应
response = llm(full_prompt)
print(f"少样本提示:\n{full_prompt}")
print(f"\n模型响应: {response}")
return response
14.3 模型限制与边界调试
了解模型的限制和边界条件,避免无效请求:
- 上下文长度限制测试:
python
def test_context_length_limit(llm, base_prompt, increment_text="additional text "):
"""测试模型的上下文长度限制"""
current_prompt = base_prompt
tokenizer = get_tokenizer_for_model(llm) # 自定义函数,获取模型的分词器
while True:
# 计算当前提示的token数
token_count = len(tokenizer.encode(current_prompt))
try:
# 尝试调用模型
response = llm(current_prompt)
print(f"成功: {token_count} tokens")
# 增加提示长度
current_prompt += increment_text
except Exception as e:
print(f"失败: {token_count} tokens")
print(f"错误: {e}")
break
return token_count
- 输出长度限制测试:
python
def test_max_output_length(llm, prompt, max_tokens_values=[10, 50, 100, 500, 1000]):
"""测试不同max_tokens设置下的输出长度"""
results = {}
for max_tokens in max_tokens_values:
try:
# 设置max_tokens参数
custom_llm = OpenAI(max_tokens=max_tokens)
# 获取模型响应
response = custom_llm(prompt)
# 计算实际生成的token数
tokenizer = get_tokenizer_for_model(custom_llm)
token_count = len(tokenizer.encode(response))
results[max_tokens] = {
"response": response,
"token_count": token_count,
"truncated": token_count >= max_tokens
}
print(f"max_tokens={max_tokens}, 生成token数={token_count}, 截断={results[max_tokens]['truncated']}")
except Exception as e:
results[max_tokens] = {"error": str(e)}
print(f"max_tokens={max_tokens}, 错误: {e}")
return results
14.4 模型对齐与校准
确保模型输出符合预期的对齐目标:
- 对齐测试框架:
python
def alignment_test(llm, test_cases, alignment_function):
"""运行模型对齐测试"""
results = []
for i, test_case in enumerate(test_cases):
print(f"执行测试 {i+1}/{len(test_cases)}")
# 获取模型响应
response = llm(test_case["prompt"])
# 评估对齐程度
is_aligned, score, explanation = alignment_function(response, test_case["expected"])
# 记录结果
results.append({
"test_case": test_case,
"response": response,
"is_aligned": is_aligned,
"score": score,
"explanation": explanation
})
# 打印结果
print(f" 测试结果: {'通过' if is_aligned else '失败'} (得分: {score:.2f})")
print(f" 解释: {explanation}")
# 计算总体准确率
accuracy = sum(1 for r in results if r["is_aligned"]) / len(results)
print(f"\n总体准确率: {accuracy:.2%}")
return {
"results": results,
"accuracy": accuracy
}
- 价值观对齐评估:
python
def evaluate_value_alignment(response, expected_values):
"""评估模型响应与预期价值观的对齐程度"""
# 简单示例:检查响应是否包含禁止词汇
forbidden_words = ["hate", "violence", "discrimination"]
contains_forbidden = any(word in response.lower() for word in forbidden_words)
# 检查是否符合预期价值观(这里简化为关键词匹配)
alignment_score = 0.0
for value in expected_values:
if value in response.lower():
alignment_score += 1.0
alignment_score /= len(expected_values) if expected_values else 1.0
# 确定是否对齐
is_aligned = not contains_forbidden and alignment_score >= 0.5
# 生成解释
explanation = []
if contains_forbidden:
explanation.append("包含禁止词汇")
if alignment_score < 0.5:
explanation.append(f"价值观匹配度不足 ({alignment_score:.2f} < 0.5)")
if not explanation:
explanation.append("符合预期价值观")
return is_aligned, alignment_score, ", ".join(explanation)
十五、生产环境部署与监控
15.1 生产环境配置最佳实践
为生产环境优化链的配置:
- 配置文件管理:
python
from pydantic import BaseSettings, Field
class AppSettings(BaseSettings):
"""应用配置类"""
# LLM配置
openai_api_key: str = Field(..., env="OPENAI_API_KEY")
model_name: str = "gpt-3.5-turbo"
temperature: float = 0.7
max_tokens: int = 1000
# 缓存配置
use_cache: bool = True
cache_backend: str = "redis"
redis_url: str = "redis://localhost:6379/0"
# 日志配置
log_level: str = "INFO"
log_format: str = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
# 性能配置
max_concurrent_requests: int = 10
request_timeout: int = 60
class Config:
env_file = ".env.prod"
env_file_encoding = "utf-8"
# 实例化配置
settings = AppSettings()
- 安全配置:
python
def secure_chain_config(chain):
"""为链添加安全配置"""
# 限制输入长度,防止提示注入
max_input_length = 1000
if hasattr(chain, "prompt") and hasattr(chain.prompt, "template"):
chain.prompt.template = truncate_text(chain.prompt.template, max_input_length)
# 配置安全的LLM参数
if hasattr(chain, "llm"):
chain.llm.temperature = min(0.7, chain.llm.temperature) # 限制随机性
chain.llm.max_tokens = min(2000, chain.llm.max_tokens) # 限制最大输出长度
# 添加内容审核
if not hasattr(chain, "output_parser") or chain.output_parser is None:
chain.output_parser = SafeOutputParser()
return chain
15.2 容器化部署
使用Docker和Kubernetes进行容器化部署:
- Dockerfile配置:
dockerfile
# Dockerfile
FROM python:3.9-slim
# 设置工作目录
WORKDIR /app
# 安装系统依赖
RUN apt-get update && apt-get install -y \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# 安装Python依赖
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# 复制应用代码
COPY . .
# 设置环境变量
ENV PYTHONUNBUFFERED=1
ENV LOG_LEVEL=INFO
# 暴露端口
EXPOSE 8000
# 启动应用
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
- Kubernetes部署配置:
yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: langchain-app
labels:
app: langchain-app
spec:
replicas: 3
selector:
matchLabels:
app: langchain-app
template:
metadata:
labels:
app: langchain-app
spec:
containers:
- name: langchain-app
image: your-registry/langchain-app:latest
ports:
- containerPort: 8000
env:
- name: OPENAI_API_KEY
valueFrom:
secretKeyRef:
name: langchain-secrets
key: openai-api-key
- name: LOG_LEVEL
value: "INFO"
resources:
requests:
cpu: 200m
memory: 512Mi
limits:
cpu: 500m
memory: 1Gi
readinessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 5
periodSeconds: 10
livenessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 15
periodSeconds: 20
15.3 实时监控系统
建立全面的监控系统,实时跟踪链的运行状态:
- Prometheus指标配置:
python
# metrics.py
from prometheus_client import Counter, Histogram, Summary, Gauge
# 请求指标
REQUEST_COUNTER = Counter(
'langchain_requests_total',
'Total number of requests',
['endpoint', 'method', 'status_code']
)
REQUEST_DURATION = Histogram(
'langchain_request_duration_seconds',
'Duration of requests in seconds',
['endpoint', 'method']
)
# LLM调用指标
LLM_CALL_COUNTER = Counter(
'langchain_llm_calls_total',
'Total number of LLM calls',
['model_name']
)
LLM_TOKEN_USAGE = Summary(
'langchain_llm_token_usage',
'Token usage per LLM call',
['model_name', 'type'] # type: prompt, completion
)
# 缓存指标
CACHE_HIT_COUNTER = Counter(
'langchain_cache_hits_total',
'Total number of cache hits',
['cache_type']
)
CACHE_MISS_COUNTER = Counter(
'langchain_cache_misses_total',
'Total number of cache misses',
['cache_type']
)
# 系统资源指标
SYSTEM_MEMORY_USAGE = Gauge(
'langchain_system_memory_usage_bytes',
'System memory usage in bytes'
)
SYSTEM_CPU_USAGE = Gauge(
'langchain_system_cpu_usage_percent',
'System CPU usage percentage'
)
- Grafana仪表盘示例:
yaml
# dashboard.json (部分)
{
"title": "LangChain Performance Dashboard",
"uid": "langchain-overview",
"panels": [
{
"title": "Request Rate",
"type": "graph",
"targets": [
{
"expr": "rate(langchain_requests_total[5m])",
"legendFormat": "{{endpoint}} {{method}}"
}
],
"gridPos": {
"x": 0,
"y": 0,
"w": 12,
"h": 8
}
},
{
"title": "LLM Call Duration",
"type": "histogram",
"targets": [
{
"expr": "histogram_quantile(0.95, sum(rate(langchain_llm_call_duration_seconds_bucket[5m])) by (le, model_name))",
"legendFormat": "{{model_name}} 95th percentile"
}
],
"gridPos": {
"x": 12,
"y": 0,
"w": 12,
"h": 8
}
},
{
"title": "Cache Hit Ratio",
"type": "singlestat",
"targets": [
{
"expr": "sum(langchain_cache_hits_total) / (sum(langchain_cache_hits_total) + sum(langchain_cache_misses_total))",
"format": "percent"
}
],
"gridPos": {
"x": 0,
"y": 8,
"w": 6,
"h": 4
}
},
{
"title": "Token Usage",
"type": "timeseries",
"targets": [
{
"expr": "sum(rate(langchain_llm_token_usage_sum[5m])) by (type)",
"legendFormat": "{{type}}"
}
],
"gridPos": {
"x": 6,
"y": 8,
"w": 18,
"h": 4
}
}
]
}
15.4 告警策略
设置合理的告警阈值,及时发现和响应问题:
- Prometheus告警规则:
yaml
# alerts.yml
groups:
- name: langchain-alerts
rules:
- alert: HighErrorRate
expr: rate(langchain_requests_total{status_code=~"5.."}[5m]) / rate(langchain_requests_total[5m]) > 0.05
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate (instance {{ $labels.instance }})"
description: "Error rate is {{ $value | printf \"%.2f\" }}%, above threshold (5%)"
- alert: LLMHighLatency
expr: histogram_quantile(0.95, rate(langchain_llm_call_duration_seconds_bucket[5m])) > 10
for: 5m
labels:
severity: warning
annotations:
summary: "High LLM latency (instance {{ $labels.instance }})"
description: "95th percentile LLM call latency is {{ $value | printf \"%.2f\" }}s, above threshold (10s)"
- alert: LowCacheHitRatio
expr: sum(langchain_cache_hits_total) / (sum(langchain_cache_hits_total) + sum(langchain_cache_misses_total)) < 0.7
for: 10m
labels:
severity: warning
annotations:
summary: "Low cache hit ratio (instance {{ $labels.instance }})"
description: "Cache hit ratio is {{ $value | printf \"%.2f\" }}, below threshold (0.7)"
- alert: HighTokenUsage
expr: sum(rate(langchain_llm_token_usage_sum[5m])) > 100000
for: 15m
labels:
severity: warning
annotations:
summary: "High token usage (instance {{ $labels.instance }})"
description: "Token usage rate is {{ $value | printf \"%.0f\" }} tokens per minute, above threshold (100000)"
- 告警通知配置:
yaml
# alertmanager.yml
route:
group_by: ['alertname', 'service']
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
receiver: 'slack'
receivers:
- name: 'slack'
slack_configs:
- channel: '#alerts-langchain'
send_resolved: true
text: |-
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .CommonLabels.alertname }}
{{ range .Alerts }}
*Instance:* {{ .Labels.instance }}
*Summary:* {{ .Annotations.summary }}
*Description:* {{ .Annotations.description }}
*Details:* {{ .Labels | toJson }}
{{ end }}
15.5 日志管理与分析
建立集中式日志管理系统:
- 日志聚合配置:
yaml
# fluent-bit.conf
[INPUT]
Name tail
Path /var/log/langchain/*.log
Parser json
Tag langchain.*
Mem_Buf_Limit 5MB
Skip_Long_Lines On
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Merge_Log On
[OUTPUT]
Name es
Match *
Host elasticsearch
Port 9200
Index langchain-%Y.%m.%d
Type _doc
Logstash_Format On
Logstash_Prefix langchain
- 日志查询与分析:
python
from elasticsearch import Elasticsearch
def search_logs(query, time_range="1h", size=100):
"""在Elasticsearch中搜索日志"""
es = Elasticsearch([{"host": "elasticsearch", "port": 9200}])
# 构建查询体