Building a Small ReAct Agent in Python: Chain of Thought + Tool Calling + Environment Interaction

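When building applications on large language models (LLMs), a key question is how to let the model call external tools. We want more than a model that just "generates an answer": it should behave like an agent, following its own chain of reasoning to decide when to call search, a calculator, or a database query, and then combining the results into a final answer.

This article walks through a short piece of Python code that implements a mini ReAct (Reasoning + Acting) agent. Given a user question, the agent chooses among a Wikipedia lookup, a calculator, and a blog search API to support its reasoning, and works toward a final answer step by step.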

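1. Background: The ReAct Pattern and Tool Calling

ReAct (Reason + Act) is an interaction pattern for LLMs. The flow is roughly:

Thought: the model reasons about the question and decides on the next step.
Action: the model picks a tool and supplies its input.
Observation: the external environment returns the result.
Loop: the model keeps reasoning and acting until it can give the final answer directly.

This pattern turns the LLM from a pure text generator into something that interacts with its environment, which makes it much easier to extend with new capabilities.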

2. Core Code Structure

Let's start with a simplified implementation:

import re
import httpx
from langchain_openai import ChatOpenAI

client = ChatOpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="<your_secret_key>",
    model="qwen2.5-72b-instruct"
)

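Here ChatOpenAI wraps the LLM client. The example points it at Qwen through DashScope's compatible-mode endpoint, but any model exposed through an OpenAI-compatible interface can be substituted, such as Qwen, GPT-4, or Claude.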
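The client snippet above carries the API key as a placeholder string in source. One possible way to keep the secret out of the code is to read the settings from environment variables. The sketch below is only an illustration: the helper name load_client_settings and the variable names DASHSCOPE_API_KEY, LLM_BASE_URL, and LLM_MODEL are assumptions, not part of the original code.

```python
import os


def load_client_settings(env=os.environ):
    """Collect keyword arguments for ChatOpenAI from environment variables."""
    api_key = env.get("DASHSCOPE_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("Set DASHSCOPE_API_KEY before starting the agent")
    return {
        # LLM_BASE_URL and LLM_MODEL are hypothetical override variables;
        # the defaults are the values used in the article's client snippet.
        "base_url": env.get(
            "LLM_BASE_URL",
            "https://dashscope.aliyuncs.com/compatible-mode/v1",
        ),
        "api_key": api_key,
        "model": env.get("LLM_MODEL", "qwen2.5-72b-instruct"),
    }


# Usage (needs langchain_openai, as in the snippet above):
# client = ChatOpenAI(**load_client_settings())
```

Passing the environment in as a parameter keeps the helper easy to check with a plain dictionary.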

Next we define a ChatBot class that manages the message context:

class ChatBot:
    def __init__(self, system=""):
        self.system = system
        self.messages = []
        if self.system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content": result})
        return result

    def execute(self):
        completion = client.invoke(input=self.messages)
        return completion.content

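Key points:

self.messages holds the full conversation history (system prompt, user messages, and assistant responses).
__call__ lets a ChatBot instance be called like a function, which keeps the loop code compact.
Every call goes through client.invoke() and sends the entire context to the model.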
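To make the history growth concrete, here is a minimal offline sketch. It differs from the class above in one way: the client is passed in as a constructor argument so that a stand-in can replace the real model. EchoClient is a hypothetical stub, not part of the original code; it only reports how many messages it received.

```python
from types import SimpleNamespace


class EchoClient:
    """Hypothetical offline stand-in for the real LLM client."""

    def invoke(self, input):
        # Report how many messages arrived so the growing context is visible.
        return SimpleNamespace(content=f"saw {len(input)} messages")


class ChatBot:
    def __init__(self, system="", client=None):
        self.client = client or EchoClient()
        self.messages = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.client.invoke(input=self.messages).content
        self.messages.append({"role": "assistant", "content": result})
        return result


bot = ChatBot("You are terse.")
first = bot("hello")   # the stub sees system + user
second = bot("again")  # the stub sees the whole earlier exchange plus the new user turn
roles = [m["role"] for m in bot.messages]
print(first, "|", second)
print(roles)
```

Because the whole list is resent on every turn, the request grows with each Thought/Action/Observation round, which is worth keeping in mind when choosing max_turns.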

3. Prompt Design: Steering the LLM into the ReAct Format

prompt = """
You run in a loop of Thought, Action, PAUSE, Observation.
At the end of the loop you output an Answer
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Observation will be the result of running those actions.

Your available actions are:

calculate:
e.g. calculate: 4 * 7 / 3
Runs a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary

wikipedia:
e.g. wikipedia: Django
Returns a summary from searching Wikipedia

simon_blog_search:
e.g. simon_blog_search: Django
Search Simon's blog for that term

Always look things up on Wikipedia if you have the opportunity to do so.

Example session:

Question: What is the capital of France?
Thought: I should look up France on Wikipedia
Action: wikipedia: France
PAUSE

You will be called again with this:

Observation: France is a country. The capital is Paris.

You then output:

Answer: The capital of France is Paris
""".strip()

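This system prompt spells out the interaction format:

The model must write a Thought first.
If it needs a tool, it writes Action: tool_name: input and then returns PAUSE.
The tool's result is fed back to the model as Observation: ....
Only at the end does the model output Answer:.

These strict constraints put the model into a reason, act, observe loop.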

4. Parsing and Executing Actions

A regular expression picks the Action line out of the model's output:

# Raw string so that \w reaches the regex engine instead of being read as a string escape
action_re = re.compile(r'^Action: (\w+): (.*)$')
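To see what this pattern captures, here is a quick check against a sample response in the format the prompt asks for. The sample text is taken from the example session in the prompt; the pattern is written as a raw string here.

```python
import re

action_re = re.compile(r'^Action: (\w+): (.*)$')

sample = """Thought: I should look up France on Wikipedia
Action: wikipedia: France
PAUSE"""

# Match line by line, keeping only the lines that look like an Action.
matches = [m for line in sample.split("\n") if (m := action_re.match(line))]
tool, tool_input = matches[0].groups()
print(tool, tool_input)

# The pattern is anchored with ^, so an indented Action line does not match.
indented = action_re.match("  Action: wikipedia: France")
print(indented)
```

The second check shows a limitation: if the model indents or otherwise decorates the Action line, the loop will treat the response as having no action.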

In the main loop, query():

def query(question, max_turns=5):
    i = 0
    bot = ChatBot(prompt)
    next_prompt = question
    while i < max_turns:
        i += 1
        result = bot(next_prompt)
        print(result)
        actions = [action_re.match(a) for a in result.split('\n') if action_re.match(a)]
        if actions:
            # There is an action to run
            action, action_input = actions[0].groups()
            if action not in known_actions:
                raise Exception("Unknown action: {}: {}".format(action, action_input))
            print(" -- running {} {}".format(action, action_input))
            observation = known_actions[action](action_input)
            print("Observation:", observation)
            next_prompt = "Observation: {}".format(observation)
        else:
            return

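The logic is:

Send the user's question to the model and read its output.
If the output contains an Action, call the corresponding tool function.
Send the tool's result back to the model as an Observation.
If the model outputs an Answer directly (no Action line), the loop ends. It also stops after max_turns rounds.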
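As written, query() prints each response but returns nothing. The sketch below keeps the same control flow while returning the final answer text, and it runs offline: ScriptedBot is a hypothetical stand-in that replays canned replies, and the calculate tool is a fixed fake. The names run_loop, answer_re, and ScriptedBot are illustrative only.

```python
import re

action_re = re.compile(r"^Action: (\w+): (.*)$")
answer_re = re.compile(r"^Answer: (.*)$", re.MULTILINE)


def run_loop(bot, question, tools, max_turns=5):
    """Same control flow as query(), but returns the final answer text."""
    next_prompt = question
    for _ in range(max_turns):
        result = bot(next_prompt)
        actions = [m for line in result.split("\n") if (m := action_re.match(line))]
        if not actions:
            # No Action line: treat the response as final.
            found = answer_re.search(result)
            return found.group(1) if found else result
        name, arg = actions[0].groups()
        if name not in tools:
            raise ValueError(f"Unknown action: {name}: {arg}")
        next_prompt = f"Observation: {tools[name](arg)}"
    return None  # turn budget used up without a final answer


class ScriptedBot:
    """Hypothetical stand-in that replays canned model replies."""

    def __init__(self, replies):
        self.replies = iter(replies)
        self.seen = []

    def __call__(self, message):
        self.seen.append(message)
        return next(self.replies)


bot = ScriptedBot([
    "Thought: I need arithmetic\nAction: calculate: 4 * 7\nPAUSE",
    "Answer: 4 * 7 is 28",
])
tools = {"calculate": lambda expr: 28}  # fixed fake tool, no eval involved
answer = run_loop(bot, "What is 4 * 7?", tools)
print(answer)
print(bot.seen)
```

Returning None when the turn budget runs out lets the caller tell "no answer" apart from a real answer.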

5. Tool Implementations

Three tools are implemented so far:

def wikipedia(q):
    return httpx.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query",
        "list": "search",
        "srsearch": q,
        "format": "json"
    }).json()["query"]["search"][0]["snippet"]


def simon_blog_search(q):
    results = httpx.get("https://datasette.simonwillison.net/simonwillisonblog.json", params={
        "sql": """
        select
          blog_entry.title || ': ' || substr(html_strip_tags(blog_entry.body), 0, 1000) as text,
          blog_entry.created
        from
          blog_entry join blog_entry_fts on blog_entry.rowid = blog_entry_fts.rowid
        where
          blog_entry_fts match escape_fts(:q)
        order by
          blog_entry_fts.rank
        limit
          1""".strip(),
        "_shape": "array",
        "q": q,
    }).json()
    return results[0]["text"]


def calculate(what):
    return eval(what)

known_actions = {
    "wikipedia": wikipedia,
    "calculate": calculate,
    "simon_blog_search": simon_blog_search
}

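Wikipedia: queries the official Wikipedia API and returns the snippet of the first search result.
Simon Blog Search: runs a full-text search over Simon Willison's blog through its JSON API.
Calculate: evaluates the expression directly with Python's eval() (for demonstration only; production code needs real safeguards).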
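Since eval() will run any Python the model emits, one common alternative is to parse the expression with the standard ast module and evaluate only arithmetic nodes. The following is a minimal sketch under that assumption, not a hardened implementation; safe_calculate is a hypothetical replacement name. Exponentiation is deliberately left out because very large powers can exhaust CPU and memory.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression):
    """Evaluate +, -, *, /, //, % over int and float literals only."""

    def walk(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, and everything else are rejected.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return walk(ast.parse(expression.strip(), mode="eval").body)


print(safe_calculate("4 * 7 / 3"))
```

Malformed input still raises SyntaxError from ast.parse, and division by zero raises ZeroDivisionError, so the caller should be prepared to catch those as well as ValueError.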

6. Demo

Run:

query("What does India share borders with?")

This produces the following reasoning trace:

Thought: To answer this question, I need to look up information about India's geography and its neighboring countries on Wikipedia.
Action: wikipedia: India
PAUSE
 -- running wikipedia India
Observation: <span class="searchmatch">India</span>, officially the Republic of <span class="searchmatch">India</span>, is a country in South Asia. It is the seventh-largest country by area; the most populous country since 2023;
Thought: The provided excerpt does not contain the specific information about the countries that share borders with India. I need to refine my search to get more detailed geographical information.
Action: wikipedia: Geography of India
PAUSE
 -- running wikipedia Geography of India
Observation:  <span class="searchmatch">Of</span> <span class="searchmatch">India</span>. 2007. ISBN 978-81-230-1423-4. Wikimedia Commons has media related to <span class="searchmatch">Geography</span> <span class="searchmatch">of</span> <span class="searchmatch">India</span>. Singh, R.L. (1971). <span class="searchmatch">India</span> A Regional <span class="searchmatch">Geography</span>. National
Thought: The current observation still does not provide the specific information about the countries that share borders with India. I will try a more direct search to find this information.
Action: wikipedia: Borders of India
PAUSE
 -- running wikipedia Borders of India
Observation: The Republic <span class="searchmatch">of</span> <span class="searchmatch">India</span> shares <span class="searchmatch">borders</span> with several sovereign countries; it shares land <span class="searchmatch">borders</span> with China, Bhutan, Nepal, Pakistan, Bangladesh, and Myanmar
Answer: India shares borders with China, Bhutan, Nepal, Pakistan, Bangladesh, and Myanmar.

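As the trace shows, the model thinks first and then calls the Wikipedia API. The first two snippets do not contain the answer, so it refines the search term each time, and it produces the final answer once the third result lists the neighboring countries.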
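The observations in the trace still carry the search API's <span class="searchmatch"> highlighting markup, which costs tokens and adds noise. A small cleanup step before the snippet is fed back to the model could look like the sketch below. clean_snippet is a hypothetical helper, and the regex-based tag removal is only meant for short snippets like these, not for general HTML.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def clean_snippet(snippet):
    """Drop HTML tags, decode entities, and collapse runs of whitespace."""
    text = html.unescape(TAG_RE.sub("", snippet))
    return " ".join(text.split())


# Excerpt of the first observation from the trace above.
raw = (
    '<span class="searchmatch">India</span>, officially the Republic of '
    '<span class="searchmatch">India</span>, is a country in South Asia.'
)
cleaned = clean_snippet(raw)
print(cleaned)
```

The wikipedia tool could apply this to the snippet before returning it.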

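7. Directions for Extension

This simple demo shows the core loop of a ReAct agent. In a real application you can extend it further:

More tools: database queries, the file system, search engines, third-party APIs, and so on.
Error handling: add exception handling and safety limits around eval() and the network requests.
Parallel tool calls: let the model call several tools in one step and merge the results before it continues reasoning.
LangChain/LangGraph integration: combine with a fuller agent framework for task planning, subtask decomposition, and state management.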
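For the error-handling item, one option is to turn tool failures into observations instead of letting them end the loop, so the model gets a chance to retry with a different input. The sketch below assumes that design. run_tool and flaky_lookup are hypothetical names, and the tools dictionary stands in for known_actions.

```python
def run_tool(known_actions, name, arg):
    """Run a tool and always return an observation string instead of raising."""
    tool = known_actions.get(name)
    if tool is None:
        available = ", ".join(sorted(known_actions))
        return f"Error: unknown action '{name}'. Available actions: {available}"
    try:
        return str(tool(arg))
    except Exception as exc:  # broad on purpose: any failure becomes feedback
        return f"Error: {name} failed with {type(exc).__name__}: {exc}"


def flaky_lookup(q):
    # Hypothetical tool that mimics indexing into an empty search result.
    return [][0]


tools = {"lookup": flaky_lookup, "double": lambda x: int(x) * 2}
ok = run_tool(tools, "double", "21")
failed = run_tool(tools, "lookup", "India")
unknown = run_tool(tools, "search", "India")
print(ok)
print(failed)
print(unknown)
```

The main loop would then call run_tool(known_actions, action, action_input) in place of the direct dictionary lookup. For network tools, a request timeout is worth adding as well.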

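8. Summary

In fewer than 200 lines of code we built a compact ReAct-style agent. It demonstrates these key points:

A system prompt constrains the LLM's output format.
An Action/Observation loop lets the model interact with the external environment.
Tool calls are abstracted as functions, which keeps them easy to extend and maintain.

This pattern is the core idea behind LLM agents, and it can serve as the base for more capable multi-tool, multi-task agents.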