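🚀 Hand-Rolling a ReAct Agent: Ditch the Frameworks, Back to Basics
- 🧠 Core Idea: The Essence of the ReAct Loop
- 🛠️ Step 1: Building the Tool System (`tools.py`)
- 🧠 Step 2: Implementing the ReAct Agent (`agent.py`)
- 🚀 Step 3: Assembly and Running (`main.py`)
- Conclusion

"Real control comes from deeply understanding every line of code, not from blind reliance on black-box frameworks."
In an era crowded with "Agent frameworks," it is easy to forget why we build intelligent systems in the first place: to solve problems with the reasoning ability of LLMs. Frameworks such as LangChain and AutoGPT offer convenience, but they also add considerable complexity and make behavior harder to control.
This article walks through implementing a production-usable ReAct (Reasoning + Acting) agent in pure Python. Once the layers of abstraction are stripped away, the core logic turns out to be clean and easy to follow.
🧠 Core Idea: The Essence of the ReAct Loop
At its core, the ReAct (Reasoning + Acting) pattern is a simple feedback loop:
- Reasoning (Thought): the model thinks based on the current context.
- Acting (Action): the model decides to call a tool.
- Observing (Observation): the tool runs, and its result is fed back to the model.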
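The three steps above can be sketched as a plain loop. This is a minimal illustration, not the article's implementation: `fake_llm` is a hypothetical stand-in for a real model, and the line-based parsing is deliberately naive.

```python
import json
from typing import Callable, Dict, List


def fake_llm(messages: List[Dict[str, str]]) -> str:
    """Hypothetical model stub: requests one tool call, then answers."""
    if any(m["content"].startswith("Observation:") for m in messages):
        return "Thought: I now know the final answer\nFinal Answer: It is sunny."
    return 'Thought: I need the weather\nAction: get_weather\nAction Input: {"city": "Beijing"}'


def run_loop(llm: Callable, tools: Dict[str, Callable], question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": f"Question: {question}"}]
    steps = 0
    while steps < max_steps:
        steps += 1
        output = llm(messages)  # Reasoning
        messages.append({"role": "assistant", "content": output})
        if "Final Answer:" in output:
            return output.split("Final Answer:")[-1].strip()
        # Naive parse: one "Key: value" pair per line
        fields = dict(line.split(": ", 1) for line in output.splitlines() if ": " in line)
        if "Action" in fields and "Action Input" in fields:
            args = json.loads(fields["Action Input"])
            observation = str(tools[fields["Action"]](**args))  # Acting
            messages.append({"role": "user", "content": f"Observation: {observation}"})  # Observing
        else:
            messages.append({"role": "user", "content": "Please continue."})
    return "Stopped: step limit reached."


answer = run_loop(fake_llm, {"get_weather": lambda city: "Sunny, 25°C"}, "Weather in Beijing?")
print(answer)
```

Everything the agent does later (prompting, tolerant parsing, error feedback) is a refinement of one of the branches in this loop.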
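[Flowchart: ReAct loop. Start → Initialize History & Tools → Steps < Max Steps? If No: Max Steps Reached / Stop. If Yes: LLM Inference → Parse Output, which branches four ways. Final Answer: Success: Return Answer. Action: Execute Tool → Get Observation → Append to History. Error: Log Error → Append to History. No Action: Prompt: Please Continue → Append to History. After Append to History, control returns to the step check.]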
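In our implementation, this loop is written out explicitly as a `while` loop rather than buried deep inside some `AgentExecutor`.
🛠️ Step 1: Building the Tool System (`tools.py`)
First, we need a mechanism that tells the LLM which tools are available and lets it call them. Instead of a complex `BaseTool` class hierarchy, we use native Python decorators and type hints.
The `ToolRegistry` class is responsible for:
- Using the `inspect` module to generate a JSON Schema automatically from a Python function signature.
- Managing tool registration and lookup.
- Executing tools through a single entry point and catching exceptions.
- (Advanced) Integrating tools exposed over MCP (Model Context Protocol).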
[Diagram: Tool Registration Process. Python Function → @registry.register → inspect.signature (Extract Types & Docstring) → JSON Schema → ToolRegistry (Store Callable).]
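To make the signature-to-schema step concrete, here is a small standalone sketch of it. `build_schema` and the `search` function are hypothetical names used only for illustration; the type mapping mirrors the one used in the registry.

```python
import inspect
from typing import Any, Callable, Dict

# Python annotation -> JSON Schema type; unknown or missing annotations become "string"
TYPE_MAP = {int: "integer", bool: "boolean", float: "number", dict: "object", list: "array"}


def build_schema(func: Callable) -> Dict[str, Any]:
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is mandatory
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "No description provided.",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def search(query: str, limit: int = 5):
    """Search a hypothetical index."""
    return []


schema = build_schema(search)
print(schema)
```

Here `query` ends up in `required` because it has no default, while `limit` is optional and typed as `integer`.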
Full code: tools.py
```python
import functools
import inspect
from typing import Any, Callable, Dict, List, Optional


class ToolRegistry:
    """
    A lightweight tool registry that depends on no third-party framework.
    It generates schemas automatically from native Python type hints.
    """

    # Python annotation -> JSON Schema type; anything else defaults to "string"
    _TYPE_MAP = {int: "integer", bool: "boolean", float: "number", dict: "object", list: "array"}

    def __init__(self):
        self._tools: Dict[str, Callable] = {}
        self._schemas: List[Dict[str, Any]] = []

    def register(self, func: Optional[Callable] = None, *,
                 name: Optional[str] = None, description: Optional[str] = None):
        """
        Decorator: register a function as a tool.

        Usage:
            @registry.register
            def my_tool(arg: int): ...

            @registry.register(name="custom_name")
            def my_tool(arg: int): ...
        """
        # Support both the @register and @register(...) forms
        if func is None:
            return functools.partial(self.register, name=name, description=description)

        tool_name = name or func.__name__
        tool_description = description or inspect.getdoc(func) or "No description provided."

        # === Core logic: generate a JSON Schema with the inspect module ===
        sig = inspect.signature(func)
        parameters = {"type": "object", "properties": {}, "required": []}
        for param_name, param in sig.parameters.items():
            parameters["properties"][param_name] = {
                "type": self._TYPE_MAP.get(param.annotation, "string"),
                # Ideally the docstring would be parsed for parameter descriptions; simplified here
                "description": f"Parameter {param_name}",
            }
            # Parameters without a default value are required
            if param.default is inspect.Parameter.empty:
                parameters["required"].append(param_name)

        self._tools[tool_name] = func
        self._schemas.append({
            "name": tool_name,
            "description": tool_description,
            "parameters": parameters,
        })
        # Return the function unchanged so its metadata and direct calls keep working
        return func

    def get_tools_schema(self) -> List[Dict[str, Any]]:
        """Return the JSON Schemas of all registered tools, to be passed to the LLM."""
        return self._schemas

    def get_tool(self, name: str) -> Optional[Callable]:
        return self._tools.get(name)

    def execute(self, name: str, arguments: Dict[str, Any]) -> str:
        """
        Single entry point for running tools.
        Handles argument passing and exception capture so that a failing tool
        cannot crash the agent loop.
        """
        func = self.get_tool(name)
        if func is None:
            return f"Error: Tool '{name}' not found."
        try:
            # Unpack the arguments dynamically and call the function
            result = func(**arguments)
            return str(result)
        except Exception as e:
            return f"Error executing tool '{name}': {e}"

    def register_mcp_server(self, client):
        """
        Connect an MCP (Model Context Protocol) client and register its tools.
        This shows how the registry can be extended to external protocols.
        """
        try:
            tools = client.list_tools()
            for tool_info in tools:
                self.register_mcp_tool(client, tool_info)
            print(f"[System] Registered {len(tools)} tools from MCP server.")
        except Exception as e:
            print(f"[System] Failed to register tools from MCP server: {e}")

    def register_mcp_tool(self, client, tool_info: Dict[str, Any]):
        """Adapt an MCP tool into the local registry."""
        tool_name = tool_info["name"]
        tool_description = tool_info.get("description", "No description")
        tool_schema = tool_info.get("inputSchema", {})

        # Closure that captures client and tool_name
        def mcp_wrapper(**kwargs):
            result = client.call_tool(tool_name, kwargs)
            # Format the MCP result
            if result.get("isError"):
                return f"Error from MCP tool: {result}"
            output_parts = []
            for item in result.get("content", []):
                if item.get("type") == "text":
                    output_parts.append(item.get("text", ""))
                elif item.get("type") == "image":
                    output_parts.append("[Image Content]")
                elif item.get("type") == "resource":
                    output_parts.append(f"[Resource: {item.get('resource', {}).get('uri')}]")
                else:
                    output_parts.append(str(item))
            return "\n".join(output_parts)

        self._tools[tool_name] = mcp_wrapper
        self._schemas.append({
            "name": tool_name,
            "description": tool_description,
            "parameters": tool_schema,
        })


# Global registry instance, convenient to import elsewhere
registry = ToolRegistry()
tool = registry.register
```
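The registry fills every parameter description with a placeholder and notes that the docstring would ideally be parsed instead. One possible way to do that, assuming tools are documented with a Google-style `Args:` section (an assumption; the article does not prescribe a docstring format), is sketched below. `param_descriptions` is a hypothetical helper, and it only handles single-line descriptions.

```python
import inspect
import re
from typing import Callable, Dict


def param_descriptions(func: Callable) -> Dict[str, str]:
    """Extract 'name: description' pairs from a Google-style 'Args:' section."""
    doc = inspect.getdoc(func) or ""
    descriptions: Dict[str, str] = {}
    in_args = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped == "Args:":
            in_args = True
            continue
        if in_args:
            if not stripped or not line.startswith(" "):
                break  # a blank line or a dedent ends the Args section
            # Matches "name: text" and "name (type): text"
            match = re.match(r"(\w+)(?:\s*\([^)]*\))?\s*:\s*(.+)", stripped)
            if match:
                descriptions[match.group(1)] = match.group(2)
    return descriptions


def get_weather(city: str, unit: str = "C"):
    """Get the current weather.

    Args:
        city: Name of the city.
        unit (str): Temperature unit, C or F.

    Returns:
        A short weather summary.
    """
    return "Sunny"


found = param_descriptions(get_weather)
print(found)
```

Inside `register`, the result could replace the `f"Parameter {param_name}"` placeholder, falling back to the placeholder when a parameter is not documented.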
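🧠 Step 2: Implementing the ReAct Agent (`agent.py`)
This is the brain of the whole system. It uses no agent library; the ReAct loop is implemented by hand.
Key highlights:
- System prompt construction: tool descriptions are injected dynamically.
- Hand-written parser (`_parse_output`): rather than relying on a library that converts output to JSON automatically, it uses regular expressions and fallback logic to handle the imperfect JSON an LLM may emit. This is what "protocol-level" control looks like in practice.
- Explicit state machine (the `run` method): a `while` loop lays out the Reasoning -> Acting -> Observing process clearly.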
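The fallback idea behind the parser can be shown in isolation. The sketch below is a simplified, standalone version of that step, with a hypothetical function name `parse_action_input`: strip stray Markdown fences, try strict JSON, then fall back to `ast.literal_eval` for Python-style literals such as single-quoted dicts.

```python
import ast
import json
from typing import Any, Dict, Optional

FENCE = "`" * 3  # three backticks, built this way to keep the example fence-safe


def parse_action_input(raw: str) -> Optional[Dict[str, Any]]:
    """Return the parsed arguments as a dict, or None if they cannot be recovered."""
    cleaned = raw.replace(FENCE + "json", "").replace(FENCE, "").strip()
    try:
        value = json.loads(cleaned)
    except json.JSONDecodeError:
        try:
            # Accepts Python literals only; it never executes code
            value = ast.literal_eval(cleaned)
        except (ValueError, SyntaxError, TypeError):
            return None
    return value if isinstance(value, dict) else None


print(parse_action_input('{"city": "Beijing"}'))
print(parse_action_input("{'city': 'Beijing'}"))
print(parse_action_input("get the weather"))
```

Returning `None` rather than raising lets the caller turn the failure into a message for the model, which is how the agent recovers from malformed output.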
Full code: agent.py
```python
import ast
import json
import re
from typing import Any, Optional, Tuple

from openai import OpenAI

from tools import ToolRegistry


class ReActAgent:
    def __init__(self, model: str = "gpt-4o", tools_registry: Optional[ToolRegistry] = None,
                 api_key: Optional[str] = None, base_url: Optional[str] = None):
        # Initialize the OpenAI client
        self.client = OpenAI(api_key=api_key, base_url=base_url)
        self.model = model
        # Fall back to an empty registry so the prompt can still be built
        self.registry = tools_registry if tools_registry is not None else ToolRegistry()
        self.max_steps = 10  # Guard against infinite loops
        self.history = []
        # Build the tool description string dynamically
        self.tool_descriptions = self._build_tool_descriptions()
        self.tool_names = ", ".join([s["name"] for s in self.registry.get_tools_schema()])
        # === Core prompt ===
        # Force the model to follow the ReAct format and emit strict JSON as the Action Input
        self.system_prompt = f"""
Answer the following questions as best you can. You have access to the following tools:
{self.tool_descriptions}

Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{self.tool_names}]
Action Input: the input to the action, MUST be a valid JSON string, e.g., {{"arg1": "value1"}}
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

IMPORTANT RULES:
1. You MUST use "Action Input:" to specify arguments.
2. The "Action Input" content MUST be a strict JSON object (use double quotes for keys and string values).
3. Do NOT include markdown code blocks (```json) in your Action Input.
4. Example of valid output:
Action: get_weather
Action Input: {{"city": "Beijing"}}

Begin!
"""

    def _build_tool_descriptions(self) -> str:
        """Turn the JSON Schemas into readable text for the prompt."""
        lines = []
        for s in self.registry.get_tools_schema():
            lines.append(f"{s['name']}: {s['description']}")
            lines.append(f"  Args: {json.dumps(s['parameters'])}")
        return "\n".join(lines)

    def _parse_output(self, text: str) -> Tuple[str, Any]:
        """
        Hand-written parser: extract structured instructions from the LLM's
        unstructured text output. This is the most fragile and most critical
        part of building an agent.
        """
        # 1. Check whether the model is done
        if "Final Answer:" in text:
            return "finish", text.split("Final Answer:")[-1].strip()

        # 2. Extract Action and Action Input with regular expressions
        action_match = re.search(r"Action:\s*(.*?)\n", text)
        action_input_match = re.search(r"Action Input:\s*(.*)", text, re.DOTALL)
        if action_match and action_input_match:
            action = action_match.group(1).strip()
            action_input_str = action_input_match.group(1).strip()
            # Strip Markdown fences (fault tolerance)
            action_input_str = action_input_str.replace("```json", "").replace("```", "").strip()
            try:
                # Try strict JSON first
                action_input = json.loads(action_input_str)
                return "action", (action, action_input)
            except json.JSONDecodeError:
                # Fall back to ast.literal_eval to repair non-standard JSON such as single quotes
                try:
                    action_input = ast.literal_eval(action_input_str)
                    if isinstance(action_input, dict):
                        return "action", (action, action_input)
                except (ValueError, SyntaxError, TypeError):
                    pass
                return "error", f"Failed to parse Action Input as JSON: {action_input_str}"

        # 3. Edge case: some models omit the "Action Input" prefix
        if "Action:" in text and "Action Input:" not in text:
            # Look for variants such as "Input Params"
            alternate_input_match = re.search(
                r"(?:Input Params|Arguments|Parameters):\s*(.*)", text, re.DOTALL | re.IGNORECASE
            )
            if alternate_input_match:
                # ... (repeated parsing logic omitted here; see the source)
                pass
            # Otherwise try the last JSON object found in the text
            try:
                json_matches = re.findall(r"\{.*?\}", text, re.DOTALL)
                if json_matches:
                    action_input = json.loads(json_matches[-1])
                    action_match = re.search(r"Action:\s*(.*?)(?:\n|$)", text)
                    if action_match:
                        return "action", (action_match.group(1).strip(), action_input)
            except json.JSONDecodeError:
                pass
            return "error", "Found Action but missing Action Input."

        return "continue", None

    def run(self, question: str):
        """Run the agent's main loop."""
        print(f"\n🚀 Starting Agent Task: {question}\n")
        # Initialize the message history
        messages = [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": f"Question: {question}"},
        ]
        steps = 0
        # === Core loop ===
        while steps < self.max_steps:
            steps += 1
            # 1. Reasoning: call the LLM
            # stop=["Observation:"] is essential: it keeps the model from inventing tool results
            try:
                response = self.client.chat.completions.create(
                    model=self.model,
                    messages=messages,
                    stop=["Observation:"],
                )
            except Exception as e:
                print(f"❌ API Call Failed: {e}")
                return f"Error: API Call Failed: {e}"
            # content can be None for an empty completion
            content = response.choices[0].message.content or ""
            print(f"🤖 Step {steps} LLM Output:\n{content}\n")
            messages.append({"role": "assistant", "content": content})

            # 2. Parsing: decide what to do next
            status, payload = self._parse_output(content)
            if status == "finish":
                print(f"✅ Final Answer: {payload}")
                return payload
            if status == "action":
                tool_name, tool_args = payload
                print(f"🛠️ Executing Tool: {tool_name} with args {tool_args}")
                # 3. Acting: run the tool
                observation = self.registry.execute(tool_name, tool_args)
                print(f"👀 Observation: {observation}")
                # 4. Observing: feed the result back
                # The observation goes back as a user message, which triggers the next round of thinking
                messages.append({"role": "user", "content": f"Observation: {observation}"})
            elif status == "error":
                # Error correction: feed the error back so the model can retry
                print(f"⚠️ Parse Error: {payload}")
                messages.append({"role": "user", "content": f"System Error: {payload}. Please correct your format."})
            else:
                # The model may still be thinking (Thought) without producing an Action
                if "Thought:" in content:
                    messages.append({"role": "user", "content": "Please continue. If you need to use a tool, specify Action and Action Input."})
                else:
                    messages.append({"role": "user", "content": "Invalid format. Please follow the ReAct format."})

        print("🛑 Max steps reached.")
        return "Agent stopped due to iteration limit."
```
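The agent relies on the `stop=["Observation:"]` parameter to keep the model from writing its own tool results. As a defensive extra, assuming a backend might return text that still contains a fabricated observation, the same cut can be applied client-side. `truncate_at_observation` is a hypothetical helper, not part of the article's code.

```python
def truncate_at_observation(text: str, marker: str = "Observation:") -> str:
    """Drop everything from the first marker onward, mimicking a stop sequence."""
    index = text.find(marker)
    return text if index == -1 else text[:index].rstrip()


raw = 'Action: get_weather\nAction Input: {"city": "Beijing"}\nObservation: Sunny (made up)'
print(truncate_at_observation(raw))
```

It would be called on `content` right after the completion returns, before the text is appended to `messages` and parsed.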
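🚀 Step 3: Assembly and Running (`main.py`)
Finally, we put all the components together. This file shows how to define concrete business tools (such as `get_weather` and `calculate`) and how to start the agent.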
Full code: main.py
```python
import os
import sys

from dotenv import load_dotenv

from tools import registry
from agent import ReActAgent
from mcp_client import MCPClient

# Load environment variables (OPENAI_API_KEY, etc.)
load_dotenv(override=True)


# === Define local tools ===
# Registering with the decorator is all it takes
@registry.register(name="get_weather", description="Get the current weather for a given city.")
def get_weather(city: str):
    """
    Mock weather function.
    """
    print(f"[System] Querying weather for {city}...")
    # Mock data
    mock_data = {
        "Beijing": "Sunny, 25°C",
        "Shanghai": "Rainy, 22°C",
        "New York": "Cloudy, 15°C",
        "Tokyo": "Sunny, 20°C",
    }
    return mock_data.get(city, "Unknown city, assuming Sunny, 20°C")


@registry.register(name="calculate", description="Calculate a mathematical expression.")
def calculate(expression: str):
    """
    Calculator restricted to a character allowlist (demo only).
    """
    print(f"[System] Calculating: {expression}")
    try:
        # Safety check: only simple arithmetic characters are allowed
        allowed_chars = "0123456789+-*/(). "
        if not all(c in allowed_chars for c in expression):
            return "Error: Invalid characters in expression."
        # Empty builtins as a second line of defense; note that "**" still passes the allowlist
        return eval(expression, {"__builtins__": {}}, {})
    except Exception as e:
        return f"Error: {e}"


def main():
    print("=== Native ReAct Agent Demo ===")
    # Configuration
    api_key = os.getenv("OPENAI_API_KEY")
    base_url = os.getenv("BASE_URL", "https://api.moonshot.cn/v1")
    model_name = os.getenv("MODEL_NAME", "moonshot-v1-8k")
    if not api_key:
        print("Error: OPENAI_API_KEY environment variable is not set.")
        return
    print(f"Configuration loaded: Model={model_name}")

    # === Optional: MCP integration ===
    # Launch simple_mcp_server.py from the same directory
    mcp_script = os.path.join(os.path.dirname(os.path.abspath(__file__)), "simple_mcp_server.py")
    mcp_client = MCPClient(
        command=sys.executable,
        args=[mcp_script],
    )
    try:
        mcp_client.start()
        # Register the MCP server's tools in our registry
        registry.register_mcp_server(mcp_client)
    except Exception as e:
        print(f"⚠️ Failed to start MCP client: {e}")

    # === Initialize the agent ===
    agent = ReActAgent(
        model=model_name,
        tools_registry=registry,
        api_key=api_key,
        base_url=base_url,
    )

    # === Run the task ===
    question = "What is the weather in Beijing? Also use add_numbers tool to calculate 123 + 456."
    print(f"User Question: {question}")
    try:
        agent.run(question)
    finally:
        mcp_client.close()


if __name__ == "__main__":
    main()
```
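The `calculate` tool guards `eval` with a character allowlist. An alternative that avoids `eval` altogether is to walk the expression's syntax tree and permit only arithmetic nodes. The sketch below is one possible approach rather than the article's code; `safe_calculate` is a hypothetical name, and it deliberately rejects `**` so that an expression cannot request an enormous power.

```python
import ast
import operator

_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        # Only plain int and float literals are accepted (bool is excluded)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    try:
        return str(evaluate(ast.parse(expression.strip(), mode="eval")))
    except (ValueError, SyntaxError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


print(safe_calculate("123 + 456"))
print(safe_calculate("2 * (3 + 4)"))
print(safe_calculate("2 ** 10"))
```

Names, calls, and attribute access never reach an operator table, so they fall through to the `ValueError` branch and come back to the model as an error string.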
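Conclusion
With these three files, we have a fully functional agent system in a few hundred lines of code.
- Debuggable: there are no call stacks dozens of layers deep. When something breaks, you can see right away whether the prompt is wrong or the JSON parsing failed.
- Customizable: want to add a "Human Approval" step? Add a few lines directly inside the `while` loop in `agent.py`.
- Performance: there is no framework-imposed serialization/deserialization overhead and no redundant token consumption.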
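As an illustration of the "Human Approval" idea, the tool call can be routed through a small gate. This is a sketch under assumptions: `execute_with_approval` and `ask_on_console` are hypothetical helpers, and `execute` stands for any callable with the same shape as `registry.execute`.

```python
from typing import Any, Callable, Dict


def execute_with_approval(
    execute: Callable[[str, Dict[str, Any]], str],
    tool_name: str,
    tool_args: Dict[str, Any],
    approve: Callable[[str, Dict[str, Any]], bool],
) -> str:
    """Run the tool only if the approval callback agrees."""
    if not approve(tool_name, tool_args):
        # Returned as an observation so the model can choose another path
        return f"Error: the user declined to run tool '{tool_name}'."
    return execute(tool_name, tool_args)


def ask_on_console(tool_name: str, tool_args: Dict[str, Any]) -> bool:
    answer = input(f"Run {tool_name} with {tool_args}? [y/N] ")
    return answer.strip().lower() == "y"


# Demo with stand-in callables instead of a real registry or console prompt
demo_execute = lambda name, args: f"ran {name}"
print(execute_with_approval(demo_execute, "get_weather", {"city": "Beijing"}, lambda n, a: True))
print(execute_with_approval(demo_execute, "get_weather", {"city": "Beijing"}, lambda n, a: False))
```

In the agent's `run` method, the call `self.registry.execute(tool_name, tool_args)` would become `execute_with_approval(self.registry.execute, tool_name, tool_args, ask_on_console)`.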