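Hand-Writing an AI Agent: From Function Calling to Automated Task Chains

The word "agent" is everywhere right now, but many tutorials either stop at concepts or jump straight to the LangChain framework, skipping the most important part: the underlying mechanism.

This article builds an agent from scratch, with no framework, using only the model's native Function Calling. Once you understand this, a framework is just syntactic sugar.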

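What Is Function Calling

Ordinary chat: you ask the model a question and it answers in text.

Function Calling: you tell the model "these are the tools I have". During inference the model may reply "I want to call tool XX with arguments YY". The model never runs anything itself: your code executes the call and sends the result back, and the model continues reasoning.

This is the core of an agent: the model drives the tool calls, forming a "reason → act → observe" loop.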
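To make that exchange concrete, here is a rough sketch of the two messages that make up one round, written as plain Python dicts. The field names follow the OpenAI Chat Completions format that the rest of this article uses; the id `call_abc123`, the query, and the result text are made-up placeholders.

```python
import json

# What the model sends back when it wants a tool: no answer text, just a request.
# (The id and the arguments are illustrative placeholders.)
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "search_web",
                # Arguments arrive as a JSON *string*, not as a dict
                "arguments": json.dumps({"query": "Apple market cap"}),
            },
        }
    ],
}

# Your code runs the tool, then replies with a "tool" message that echoes the id.
call = assistant_message["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
tool_message = {
    "role": "tool",
    "tool_call_id": call["id"],
    "content": f"(result of {call['function']['name']} for {args['query']!r})",
}

print(tool_message["tool_call_id"], args)
```

Both messages go into the conversation history, which is what lets the model "observe" the result on its next turn.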

Implementation Steps

1. Define the tools

import json
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the internet for up-to-date information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search keywords"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform a mathematical calculation",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "Mathematical expression"}
                },
                "required": ["expression"]
            }
        }
    }
]

2. Implement the tool functions

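def search_web(query: str) -> str:
    # Stub: in practice, plug in Tavily, SerpAPI, or a similar search API
    return f"Search results for '{query}': related information..."

def calculate(expression: str) -> str:
    try:
        # Character whitelist: digits, arithmetic operators, parentheses, dot, comma, space
        allowed = set('0123456789+-*/()., ')
        if not all(c in allowed for c in expression):
            return "Error: expression contains disallowed characters"
        # Empty builtins as a second layer of defense around eval.
        # Note: '**' still passes the whitelist, so very large powers are possible.
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Calculation error: {e}"

def execute_tool(name: str, args: dict) -> str:
    # Dispatch a tool name chosen by the model to its implementation
    if name == "search_web": return search_web(**args)
    if name == "calculate": return calculate(**args)
    return f"Unknown tool: {name}"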
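One caveat about `calculate`: the character whitelist still allows `**` (it is just two `*` characters), so an input like `9**9**9` can tie up the process for a very long time. A possible alternative, sketched below, is to parse the expression with Python's `ast` module and evaluate only the node types you explicitly allow. The names `safe_calculate` and `_eval_node` are introduced here for illustration and are not part of the original code.

```python
import ast
import operator

# Only these operators are evaluated; everything else (including **) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _eval_node(node: ast.AST):
    """Recursively evaluate a parsed expression, allowing numbers and + - * / only."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
        raise ValueError("only numeric literals are allowed")
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError(f"unsupported syntax: {type(node).__name__}")

def safe_calculate(expression: str) -> str:
    """Drop-in replacement for calculate() that never calls eval()."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except (ValueError, SyntaxError, ZeroDivisionError, RecursionError) as e:
        return f"Calculation error: {e}"

print(safe_calculate("1 + 2 * 3"))   # prints 7
print(safe_calculate("2 ** 10"))     # rejected: Pow is not in the allowed operators
```

Errors come back as strings rather than exceptions, matching the original `calculate`, so the model sees the failure as a tool result and can try again.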

3. The agent main loop

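def run_agent(user_input: str, max_steps: int = 10) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful assistant that can use tools to answer questions."},
        {"role": "user", "content": user_input}
    ]
    for step in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        message = response.choices[0].message
        if not message.tool_calls:
            print(f"Step {step + 1}: final answer: {message.content}")
            return message.content   # no tool calls = final answer
        messages.append(message)     # keep the model's decision in the history
        for tc in message.tool_calls:
            try:
                # Arguments arrive as a JSON string and may be malformed
                args = json.loads(tc.function.arguments)
                print(f"Step {step + 1}: calling {tc.function.name}({args})")
                result = execute_tool(tc.function.name, args)
            except (json.JSONDecodeError, TypeError) as e:
                # Bad JSON or wrong argument names: report back so the model can retry
                result = f"Argument error: {e}"
            messages.append({
                "role": "tool",
                "tool_call_id": tc.id,   # must match the id of the tool call
                "content": result
            })
    return "Reached the maximum number of steps"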
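To watch the loop run without an API key, one option is to swap the real client for a scripted stand-in. The sketch below is not the article's `run_agent`: `run_loop`, `scripted_model`, `looping_model`, and the `add` tool are hypothetical names, and messages are plain dicts rather than SDK objects. The control flow, though, is meant to mirror the loop above.

```python
import json
from typing import Callable

def run_loop(call_model: Callable[[list], dict], tools: dict,
             user_input: str, max_steps: int = 10) -> str:
    """Reason -> act -> observe loop with the model passed in as a function."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        message = call_model(messages)
        if not message.get("tool_calls"):
            return message["content"]      # no tool calls = final answer
        messages.append(message)           # record the model's decision
        for tc in message["tool_calls"]:
            args = json.loads(tc["function"]["arguments"])
            result = tools[tc["function"]["name"]](**args)
            messages.append({"role": "tool", "tool_call_id": tc["id"], "content": result})
    return "Reached the maximum number of steps"

def _add_call() -> dict:
    """A canned assistant message that asks for add(a=2, b=3)."""
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "add", "arguments": json.dumps({"a": 2, "b": 3})},
        }],
    }

def scripted_model(messages: list) -> dict:
    """Stand-in for the LLM: request one calculation, then answer with its result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"The answer is {last['content']}."}
    return _add_call()

def looping_model(messages: list) -> dict:
    """A misbehaving stand-in that asks for a tool on every turn."""
    return _add_call()

def add(a: int, b: int) -> str:
    return str(a + b)

print(run_loop(scripted_model, {"add": add}, "What is 2 + 3?"))
print(run_loop(looping_model, {"add": add}, "What is 2 + 3?", max_steps=3))
```

The first call finishes in two model turns; the second never produces a final answer and is cut off by `max_steps`.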

Sample Run

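Step 1: calling search_web({'query': 'Apple market cap'})
Step 2: calling calculate({'expression': '3.5 * 7.2'})
Step 3: final answer: Apple's market cap is about 3.5 trillion US dollars, or roughly 25.2 trillion RMB.

The model decided on its own to search first and then calculate. That is an autonomous reasoning chain.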

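Key Details

Don't lose the message history: after every tool call, both the model's decision (the assistant message carrying tool_calls) and the tool's result must be appended to messages, because the next reasoning step depends on the full context.

tool_call_id must match: the id in each tool-role message must correspond one-to-one with an entry in message.tool_calls. Pay particular attention when the model requests several tools in the same turn.

max_steps prevents infinite loops: a cap on the number of steps is a necessary safety measure.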
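The tool_call_id rule above is easy to check mechanically. The helper below is a small sketch, assuming the history is a list of plain dicts (SDK message objects would need converting first); `unanswered_tool_calls` is a name made up for this example. It reports every tool call that never received a matching tool-role reply.

```python
def unanswered_tool_calls(messages: list[dict]) -> set[str]:
    """Return the ids of tool calls that have no matching tool-role message."""
    requested: set[str] = set()
    answered: set[str] = set()
    for m in messages:
        if m.get("role") == "assistant":
            for tc in m.get("tool_calls") or []:
                requested.add(tc["id"])
        elif m.get("role") == "tool":
            answered.add(m["tool_call_id"])
    return requested - answered

# Illustrative history: the model asked for two tools, but only one was answered.
history = [
    {"role": "user", "content": "Look it up and do the math"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_a", "type": "function",
         "function": {"name": "search_web", "arguments": "{\"query\": \"x\"}"}},
        {"id": "call_b", "type": "function",
         "function": {"name": "calculate", "arguments": "{\"expression\": \"1+1\"}"}},
    ]},
    {"role": "tool", "tool_call_id": "call_a", "content": "..."},
]

print(unanswered_tool_calls(history))   # prints {'call_b'}
```

Running a check like this before each model request is one way to catch a dropped or mismatched id while debugging.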

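Directions for Extension

Parallel tool calls: newer OpenAI models (GPT-4 Turbo and later) can return several tool_calls in a single response, and these can be executed concurrently
Persistent memory: store the history in a database to keep context across sessions
Multi-agent collaboration: one agent calls another agent as a tool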
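As a rough illustration of the first direction, the tool calls from a single assistant turn could be run in a thread pool. This sketch assumes dict-shaped tool calls and an `execute_tool(name, args)` function like the one defined earlier; `run_tool_calls_concurrently` and `slow_tool` are names introduced here for the example.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def run_tool_calls_concurrently(tool_calls: list[dict],
                                execute_tool: Callable[[str, dict], str]) -> list[dict]:
    """Run all tool calls of one assistant turn in parallel threads.

    Returns tool-role messages in the same order as tool_calls, each tagged
    with the id of the call it answers.
    """
    def run_one(tc: dict) -> dict:
        args = json.loads(tc["function"]["arguments"])
        result = execute_tool(tc["function"]["name"], args)
        return {"role": "tool", "tool_call_id": tc["id"], "content": result}

    if not tool_calls:
        return []
    with ThreadPoolExecutor(max_workers=min(8, len(tool_calls))) as pool:
        # Executor.map yields results in input order, so ids stay aligned
        return list(pool.map(run_one, tool_calls))

def slow_tool(name: str, args: dict) -> str:
    """Hypothetical tool that simulates network latency."""
    time.sleep(0.2)
    return f"{name} done with {args}"

calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "search_web", "arguments": json.dumps({"query": "a"})}},
    {"id": "call_2", "type": "function",
     "function": {"name": "search_web", "arguments": json.dumps({"query": "b"})}},
]

start = time.perf_counter()
replies = run_tool_calls_concurrently(calls, slow_tool)
print([r["tool_call_id"] for r in replies], f"{time.perf_counter() - start:.2f}s")
```

Threads suit tools that mostly wait on I/O, such as web search. Each reply still carries its own tool_call_id, so the one-to-one matching rule holds no matter which call finishes first.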

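For a more systematic treatment of agent development, the AI Agent tutorial on gufacode.com covers everything from the basics to multi-agent collaboration in a hands-on, engineering-oriented style, and is worth a look.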

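Once you understand Function Calling, the agent part of LangChain becomes transparent: it just wraps this loop. Writing the loop by hand once before reaching for a framework saves a lot of debugging time.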