Loop Engineering in Depth: 9 Current Practices for 2026, with Complete Code

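In June 2026 the concept of Loop Engineering took off in Silicon Valley. It is not another short-lived buzzword but a fundamental shift in the AI engineering paradigm, from "humans prompt the AI" to "humans design the system, and the system drives the AI". This article skips the introductory concepts and goes straight to the nine practice areas currently most active in industry and research, each with code examples and architecture decision guidance.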


Contents

  1. Minimal code loop vs. graph state-machine framework: two competing routes
  2. Plan-execute dual-layer loop with dynamic re-planning
  3. Event-driven and streaming loops
  4. Multi-agent topology loops
  5. Long-running persistence and durable execution
  6. Self-optimizing loops: DSPy-driven loop engineering
  7. Declarative loop configuration
  8. Observability-driven loop breakpoints and human-in-the-loop collaboration
  9. Safety guardrail sub-loops

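Background: what problem does Loop Engineering solve?

Before getting into specific practices, one underlying problem needs to be stated:

Human attention is limited, while an AI can keep executing for as long as it is given compute. Under traditional Prompt Engineering, a person has to sit at the terminal, press Enter for every turn, and review every output. Once the model produces output far faster than a person can process it, the person becomes the bottleneck of the whole workflow.

The core of Loop Engineering is to turn the developer from a participant in every interaction into the designer of a loop system: define the goal, set guardrails, design the verification mechanism, then let the system run on its own. This is not only a quantitative gain in efficiency but a qualitative change in how the work is organized.

Current Loop Engineering practice has moved well beyond the simple ReAct loops of 2025 and has split into several technical routes. The following are the nine directions most worth watching in 2026.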


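1. Minimal code loop vs. graph state-machine framework: two competing routes

This is the most fundamental split in Loop Engineering philosophy. Each route has its advocates, and the latest trend is that the two are converging.

Route A: the minimal while loop

This is the route reportedly favored inside Anthropic. The agent is controlled directly by a Python/TypeScript while loop, which in sequence calls the LLM, parses tool calls, executes them, and appends the results.

Core idea: the logic is fully transparent, with no black magic, and easy to debug. What happens at each step is obvious at a glance, which suits agents with clear rules and short step sequences.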

python
"""
Minimal while-loop agent: fully transparent execution logic
"""

import json
from openai import OpenAI
from typing import Any

client = OpenAI()

# Tool definitions
tools: list[dict[str, Any]] = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read file content",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "File path"}
                },
                "required": ["path"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "run_tests",
            "description": "Run the test suite and return results",
            "parameters": {
                "type": "object",
                "properties": {
                    "filter": {"type": "string", "description": "Test filter pattern"}
                },
            },
        },
    },
]

def execute_tool(tool_call) -> str:
    """Execute one tool call; every branch is visible in plain code."""
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments or "{}")

    if name == "read_file":
        with open(args["path"], encoding="utf-8") as f:
            return f.read()
    elif name == "run_tests":
        import subprocess
        cmd = ["pytest"]
        # Only pass the filter when one was given; an empty-string
        # argument would be treated as a (nonexistent) test path.
        if args.get("filter"):
            cmd.append(args["filter"])
        cmd.append("-q")
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.stdout + result.stderr
    return f"Unknown tool: {name}"

def agent_loop(task: str, max_iterations: int = 10) -> str:
    """
    Core loop: a plain bounded Python loop, no framework dependency.
    Each step: LLM reasoning → parse tool calls → execute → append results → next round.
    """
    messages: list[dict[str, Any]] = [
        {
            "role": "system",
            "content": (
                "You are a coding agent. "
                "Use tools to understand the codebase and make changes. "
                "When you believe the task is done, respond with 'DONE'."
            ),
        },
        {"role": "user", "content": task},
    ]

    for iteration in range(max_iterations):
        # Step 1: LLM reasoning
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
        )
        msg = response.choices[0].message

        # Step 2: the model signals completion in plain text
        if msg.content and "DONE" in msg.content:
            return msg.content

        # Step 3: execute tool calls
        if msg.tool_calls:
            tool_results = []
            for tc in msg.tool_calls:
                result = execute_tool(tc)
                tool_results.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": result,
                })

            # Step 4: append the assistant turn and tool results, then loop
            messages.append(msg)
            messages.extend(tool_results)
            continue

        # Plain text without DONE and without tool calls: return it as-is
        return msg.content or "No response"

    return f"Stopped after {max_iterations} iterations"

# Usage
result = agent_loop("Read README.md and run all tests. Fix any failures.")

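Pros

  • Every step is explicit; setting breakpoints requires no extra learning
  • All state lives in the messages list; if it gets polluted, clear it
  • The code usually stays under 150 lines, so review cost is low

Cons

  • Complex branching (5+ conditional jumps) bloats quickly
  • No native way to express parallel execution
  • No built-in visualization or tracing support

Route B: graph state-machine frameworks

LangGraph, OpenAI Agents SDK, and Google ADK have all, to varying degrees, moved toward the graph state-machine route. A directed graph defines the nodes (LLM calls, tools, conditional branches) and edges, with native support for parallelism, dynamic routing, and nested subgraphs.

Core idea: describe control flow as a structured graph. The logic is more involved, but it can be traced and visualized.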

python
"""
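LangGraph state-machine agent.
A directed graph defines nodes and edges explicitly, with conditional routing.
"""

import operator
from typing import Annotated, Literal, TypedDict

from langgraph.graph import StateGraph, END
from openai import OpenAI

client = OpenAI()


# State definition
class AgentState(TypedDict, total=False):
    messages: Annotated[list, operator.add]  # updates are appended, not overwritten
    task_status: str
    tool_results: dict
    retry_count: int


# Node definitions: each node returns a partial state update
def planner(state: AgentState) -> dict:
    """Planning node: analyze the task and decide the next step."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Plan the next step. Return a JSON action."},
            *state["messages"],
        ],
    )
    content = response.choices[0].message.content or ""
    # A fresh plan starts with a fresh retry budget
    return {
        "messages": [{"role": "assistant", "content": content}],
        "retry_count": 0,
    }


def executor(state: AgentState) -> dict:
    """Execution node: run tool calls (placeholder)."""
    return {"task_status": "executing"}


def evaluator(state: AgentState) -> dict:
    """Evaluation node: judge result quality (placeholder) and count the attempt."""
    # A real evaluator would set task_status to "success" when the result passes.
    return {
        "task_status": "evaluated",
        "retry_count": state.get("retry_count", 0) + 1,
    }


def should_retry(state: AgentState) -> Literal["executor", "planner", "__end__"]:
    """Conditional routing: retry up to 3 times, then go back to the planner."""
    if state.get("task_status") == "success":
        return END
    elif state.get("retry_count", 0) < 3:
        return "executor"
    else:
        return "planner"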


# Build the graph
graph = StateGraph(AgentState)

graph.add_node("planner", planner)
graph.add_node("executor", executor)
graph.add_node("evaluator", evaluator)

graph.set_entry_point("planner")
graph.add_edge("planner", "executor")
graph.add_edge("executor", "evaluator")
graph.add_conditional_edges("evaluator", should_retry)

app = graph.compile()

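Pros

  • The whole loop topology can be visualized, so decision branches are visible at a glance
  • Native support for parallel nodes (several executors running at once)
  • Subgraph nesting: a node can itself be another StateGraph
  • Mature community ecosystem (LangSmith tracing, Weave integration)

Cons

  • Debugging gets much harder: it requires a graph compiler plus state tracing
  • Simple tasks pick up an unnecessary abstraction layer
  • Performance overhead: the graph scheduling layer is estimated here at about 15-30% slower than a bare while loop (no benchmark is cited)

Convergence: lightweight graph + code nodes

Best practice in 2026 is converging on a middle route: declare the graph topology in JSON/YAML and implement the node logic directly in code.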

python
"""
Lightweight graph loop: JSON defines the topology, code implements the nodes
"""

loop_config = {
    "entry": "planner",
    "nodes": {
        "planner": {
            "type": "llm",
            "model": "gpt-4o",
            "system_prompt": "Plan the next step as JSON.",
            "next": "executor",
        },
        "executor": {
            "type": "code",
            "handler": "my_project.loop_nodes.execute_step",
            "next": "evaluator",
        },
        "evaluator": {
            "type": "llm",
            "model": "gpt-4o-mini",
            "system_prompt": "Evaluate the result. Return PASS or FAIL with reasons.",
            "routes": {
                "PASS": "END",
                "FAIL": "executor",
                "REPLAN": "planner",
            },
        },
    },
    "max_iterations": 20,
    "parallel_nodes": [],
}


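class LightweightLoop:
    """Lightweight loop engine: reads the JSON config and calls code nodes."""

    def __init__(self, config: dict):
        self.config = config
        self.state = {"messages": [], "iteration": 0}

    def run(self, task: str) -> dict:
        # Seed the shared state with the task so nodes can read it
        self.state["messages"].append({"role": "user", "content": task})
        current = self.config["entry"]
        while current != "END" and self.state["iteration"] < self.config.get(
            "max_iterations", 50
        ):
            node = self.config["nodes"][current]
            if node["type"] == "llm":
                self._run_llm_node(node)
            elif node["type"] == "code":
                self._run_code_node(node)

            current = self._resolve_route(node)
            self.state["iteration"] += 1
        return self.state

    # _run_llm_node, _run_code_node and _resolve_route are left
    # unimplemented in this skeleton.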

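Advantages of this hybrid route

  • The graph topology is obvious at a glance (change the config to change the flow, without touching code)
  • Node logic is plain code, so it debugs like an ordinary function
  • Switching loop strategy (ReAct → Plan-Execute → Maker-Checker) only means swapping the config file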
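As a minimal sketch of "swap the config, not the code", the engine below fills in the routing helpers that the skeleton above leaves undefined and runs two different topologies through the same loop. Everything in it is an assumption made for illustration: the `HANDLERS` registry, the `make_fake_llm` stub that stands in for a real model call, and the convention that a node's output is used as a route label.

```python
from typing import Callable


def make_fake_llm() -> Callable[[dict, dict], str]:
    """Hypothetical stand-in for a model call: routing nodes FAIL once, then PASS."""
    calls = {"n": 0}

    def fake_llm(node: dict, state: dict) -> str:
        if "routes" not in node:
            return "plan"  # planner-style node: no routing decision to make
        calls["n"] += 1
        return "FAIL" if calls["n"] == 1 else "PASS"

    return fake_llm


def execute_step(state: dict) -> str:
    """Code node: a plain function, debuggable like any other."""
    state["executed"] = state.get("executed", 0) + 1
    return "ok"


HANDLERS = {"execute_step": execute_step}  # assumed handler registry


def run_loop(config: dict, task: str, llm: Callable[[dict, dict], str]) -> dict:
    state = {"task": task, "trace": [], "iteration": 0}
    current = config["entry"]
    while current != "END" and state["iteration"] < config.get("max_iterations", 50):
        node = config["nodes"][current]
        state["trace"].append(current)
        if node["type"] == "llm":
            output = llm(node, state)
        else:
            output = HANDLERS[node["handler"]](state)
        # Route by label if the node declares routes, else follow the fixed "next"
        current = node["routes"][output] if "routes" in node else node["next"]
        state["iteration"] += 1
    state["finished"] = current == "END"
    return state


plan_execute = {
    "entry": "planner",
    "max_iterations": 20,
    "nodes": {
        "planner": {"type": "llm", "next": "executor"},
        "executor": {"type": "code", "handler": "execute_step", "next": "evaluator"},
        "evaluator": {"type": "llm", "routes": {"PASS": "END", "FAIL": "executor"}},
    },
}

# Same engine, different topology: a maker-checker loop without a planner
maker_checker = {
    "entry": "executor",
    "max_iterations": 20,
    "nodes": {
        "executor": {"type": "code", "handler": "execute_step", "next": "checker"},
        "checker": {"type": "llm", "routes": {"PASS": "END", "FAIL": "executor"}},
    },
}

state_a = run_loop(plan_execute, "demo task", make_fake_llm())
state_b = run_loop(maker_checker, "demo task", make_fake_llm())
print(state_a["trace"])
print(state_b["trace"])
```

The `trace` list records which node ran at each iteration, which makes it easy to compare how the two configs drive the same engine.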

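2. Plan-execute dual-layer loop with dynamic re-planning

A flat ReAct loop shows a fundamental weakness on complex tasks: each step is a locally optimal decision made without a global view, which leads to many wasted tool calls and drift from the intended path.

Dual-layer architecture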

text
┌──────────────────────────────────────────────────┐
│               Outer Loop: Planner                 │
│  ┌────────────────────────────────────────────┐  │
│  │  Generate high-level plan with steps        │  │
│  │  Step 1 → Step 2 → Step 3 → Step 4         │  │
│  └──────────────────┬─────────────────────────┘  │
│          │   │   │   │                             │
│          ▼   ▼   ▼   ▼                             │
│  ┌────────────────────────────────────────────┐  │
│  │        Inner Loop: Executor (per step)      │  │
│  │   Perceive → Reason → Act → Observe         │  │
│  │   On failure: signal to Outer Loop          │  │
│  └──────────────────┬─────────────────────────┘  │
│                     │                              │
│                     ▼                              │
│   Dynamic Re-plan based on executor result        │
└──────────────────────────────────────────────────┘

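Outer loop (Planner): the LLM generates a high-level plan (a list of steps with dependencies). The Planner does not take part in execution; it is responsible only for strategy.

Inner loop (Executor): executes step by step and feeds each result back to the outer loop. If a step fails, the outer loop revises the remaining plan dynamically based on the new state.

This is essentially an enhanced Plan-and-Solve, combined with the core idea of ReWOO: decouple observation from reasoning so the context is not polluted by intermediate results.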

python
"""
Dual-loop agent: Plan-and-Execute + Dynamic Re-plan
"""

import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanStep:
    id: str
    description: str
    depends_on: list[str] = field(default_factory=list)
    status: str = "pending"  # pending | running | success | failed


@dataclass
class ExecutionPlan:
    steps: list[PlanStep]
    current_step_index: int = 0


class DualLoopAgent:
    """
    Dual-layer loop:
    - Outer loop: generate the plan, and re-plan dynamically when execution fails
    - Inner loop: execute step by step and report results back to the outer loop
    """

    def __init__(self):
        self.context: list[dict] = []
        self.plan: Optional[ExecutionPlan] = None

    def outer_loop(self, task: str) -> str:
        """Outer loop: strategic layer, planning and re-planning."""

        # Phase 1: generate the high-level plan
        plan_response = self._call_llm(
            system=(
                "You are a planner. Break the task into a JSON array of steps. "
                'Each step: {"id": "step_N", "description": "...", '
                '"depends_on": ["step_X"]}. '
                "Steps must be ordered. Only output JSON, no explanation."
            ),
            user=task,
        )
        steps_data = json.loads(plan_response)

        self.plan = ExecutionPlan(
            steps=[PlanStep(**s) for s in steps_data]
        )

        # Phase 2: execute step by step (the outer loop calls the inner loop)
        while self.plan.current_step_index < len(self.plan.steps):
            step = self.plan.steps[self.plan.current_step_index]
            step.status = "running"

            result = self.inner_loop(step)

            if result["success"]:
                step.status = "success"
                self.plan.current_step_index += 1
                self.context.append({
                    "step": step.id,
                    "result": result["output"],
                })
            else:
                step.status = "failed"
                # Dynamic re-planning: revise the remaining plan based on the failure
                if not self._replan(step, result["error"]):
                    return f"Task failed at step {step.id}: {result['error']}"

        return "All steps completed successfully"

    def inner_loop(self, step: PlanStep) -> dict:
        """Inner loop: tactical layer, single-step execution with iterative correction."""

        # Build context: only the current step plus the last few completed results
        inner_context = [
            {"role": "system", "content": f"Execute step: {step.description}"},
            {
                "role": "user",
                "content": f"Previous results: {json.dumps(self.context[-3:])}",
            },
        ]

        for attempt in range(5):
            result = self._execute_with_tools(inner_context)

            if result["success"]:
                return result

            # Feed the error back into the next attempt
            inner_context.append({
                "role": "user",
                "content": f"Previous attempt failed: {result['error']}. "
                "Analyze the error and try a different approach.",
            })

        return {"success": False, "error": "Max inner loop iterations reached"}

    MAX_REPLANS = 3  # bound re-planning so a persistently failing task cannot loop forever

    def _replan(self, failed_step: PlanStep, error: str) -> bool:
        """Dynamic re-planning: revise the failed step and everything after it."""
        self._replans = getattr(self, "_replans", 0) + 1
        if self._replans > self.MAX_REPLANS:
            return False

        idx = self.plan.current_step_index
        remaining_steps = self.plan.steps[idx + 1 :]

        replan_prompt = (
            f"Step '{failed_step.description}' failed with: {error}\n"
            f"Remaining steps to complete the task:\n"
            + "\n".join(f"- {s.description}" for s in remaining_steps)
            + "\n\nReturn a revised list that replaces the failed step and the "
            "remaining steps. "
            "If the task is no longer achievable, return empty list."
        )

        response = self._call_llm(
            system=(
                "Revise the plan based on the failure. Output JSON array of steps. "
                'Each step: {"id": "step_N", "description": "..."}.'
            ),
            user=replan_prompt,
        )

        new_steps = json.loads(response)
        if not new_steps:
            return False

        # Keep completed steps, swap in the revised tail, and stay at the same
        # index so the replacement for the failed step actually runs.
        self.plan.steps = self.plan.steps[:idx] + [PlanStep(**s) for s in new_steps]
        return True

    def _call_llm(self, system: str, user: str) -> str:
        # Simplified LLM call interface
        import openai
        client = openai.OpenAI()
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system},
                {"role": "user", "content": user},
            ],
        )
        return response.choices[0].message.content

    def _execute_with_tools(self, context: list[dict]) -> dict:
        # Simplified tool execution interface (stub that always succeeds)
        return {"success": True, "output": "done"}

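Comparison with ReWOO

The core insight of ReWOO (Reasoning WithOut Observation) is that observation data (tool output) should not be mixed into the reasoning process, because it pollutes the reasoning chain into "I saw X, so I do X" instead of "given the goal, I need X".

The dual-layer loop supports this pattern naturally: the Inner Loop handles the "dirty work" (tool calls and observations), while the Outer Loop receives only summarized results and makes strategic decisions. Both context windows stay clean.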
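To make the decoupling concrete, here is a small ReWOO-style sketch under stated assumptions: the plan is hard-coded where a planner model would normally emit it in a single call, the tool names are hypothetical stubs, and the `#E<n>` evidence-variable syntax is used only for illustration. The point is that workers fill in evidence without the planner ever seeing raw tool output, and the solver sees plan plus evidence once at the end.

```python
import re

# A plan as a planner model might emit it in one shot (hard-coded here).
PLAN = [
    {"var": "#E1", "tool": "search", "input": "capital of France"},
    {"var": "#E2", "tool": "lookup_population", "input": "#E1"},
]

# Stub tools standing in for real calls
TOOLS = {
    "search": lambda query: "Paris",
    "lookup_population": lambda city: f"population({city})",
}


def run_workers(plan: list[dict]) -> dict[str, str]:
    """Execute every step in order without calling the planner again."""
    evidence: dict[str, str] = {}
    for step in plan:
        # Substitute earlier evidence variables into this step's input
        resolved = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], step["input"])
        evidence[step["var"]] = TOOLS[step["tool"]](resolved)
    return evidence


def build_solver_prompt(task: str, plan: list[dict], evidence: dict[str, str]) -> str:
    """The solver receives the plan and the collected evidence in one prompt."""
    lines = [f"Task: {task}"]
    for step in plan:
        lines.append(
            f"{step['var']} = {step['tool']}({step['input']}) -> {evidence[step['var']]}"
        )
    lines.append("Answer the task using the evidence above.")
    return "\n".join(lines)


evidence = run_workers(PLAN)
prompt = build_solver_prompt(
    "What is the population of the capital of France?", PLAN, evidence
)
print(prompt)
```

In the dual-layer design, `run_workers` corresponds to the Inner Loop and the prompt builder to the summarized hand-off that the Outer Loop receives.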

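Effect (community-reported figures; no source is cited):

  • About 40-60% fewer wasted tool calls
  • About 25% higher success rate on complex multi-step tasks
  • Complex task chains of 30+ steps become tractable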

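3. Event-driven and streaming loops

A traditional Agent Loop is strictly serial: "request → full response → parse → execute → next request". Each stage waits for the previous one to finish, so end-to-end latency is the sum of all stages.

An event-driven loop inverts this: as soon as the LLM has emitted enough of a structured action, the tool call is triggered, and the tool runs while the LLM keeps generating.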

text
Traditional Loop:
LLM generate ──→ parse tool calls ──→ execute tools ──→ concat ──→ LLM generate ──→ ...
[───── wait ─────][── wait ──][── wait ──]

Event-driven Loop:
LLM streaming ───────────────────────────────────────────→
 ├─→ tool call fragment arrives → async execute tool ──→
 │      ├─→ result event pushed
 ├─→ continue generating ───────────────────────────────→
 │      ├─→ state update triggered
 └─→ more tool calls → async execute ───────────────────→
        └─→ next round decision

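Parsing streamed tool calls

By 2025 the OpenAI and Anthropic APIs both supported streaming tool calls. When the model decides to call a tool, the API pushes the function name and argument fragments as incremental chunks, so the client can start preparing before the arguments have fully arrived.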

python
"""
Streaming loop engine: execute while generating
"""

import asyncio
import json
from typing import AsyncIterator


class StreamingLoopEngine:
    """
    Core mechanism:
    1. The LLM streams output; tool-call fragments are parsed as they arrive
    2. A tool call is dispatched asynchronously as soon as its arguments parse
    3. Tool results are collected and fed into the next round
    """

    def __init__(self, event_bus=None):
        # Optional external bus; unused in this example (the original
        # defaulted to an AsyncEventBus class that is never defined).
        self.event_bus = event_bus
        self.pending_tools: dict[str, asyncio.Task] = {}
        self.tool_results: dict[str, str] = {}

    async def run(self, task: str, max_iterations: int = 20):
        """Streaming loop entry point (an async generator of text events)."""

        messages = [{"role": "user", "content": task}]

        for iteration in range(max_iterations):
            accumulated_tool_calls: dict[int, dict] = {}
            text_parts: list[str] = []
            stream = await self._stream_llm(messages)

            async for chunk in stream:
                if not chunk.choices:
                    continue
                delta = chunk.choices[0].delta

                # Case 1: plain text delta
                if delta.content:
                    text_parts.append(delta.content)
                    yield {"type": "text", "content": delta.content}

                # Case 2: tool-call delta
                if delta.tool_calls:
                    for tc_delta in delta.tool_calls:
                        call = accumulated_tool_calls.setdefault(
                            tc_delta.index,
                            {
                                "id": "",
                                "type": "function",
                                "function": {"name": "", "arguments": ""},
                            },
                        )
                        if tc_delta.id:
                            call["id"] = tc_delta.id
                        if tc_delta.function:
                            if tc_delta.function.name:
                                call["function"]["name"] += tc_delta.function.name
                            if tc_delta.function.arguments:
                                call["function"]["arguments"] += (
                                    tc_delta.function.arguments
                                )

                        # Key step: dispatch as soon as the arguments parse as JSON
                        if self._args_complete(call):
                            await self._dispatch_tool_async(call)

            # No tool calls this round: the model answered in text, so stop
            # before appending an empty assistant turn.
            if not accumulated_tool_calls:
                break

            # Wait for every pending tool call to finish
            results = await self._gather_tool_results()

            # Append the assistant turn and the tool results for the next round
            messages.append({
                "role": "assistant",
                "content": "".join(text_parts) or None,
                "tool_calls": list(accumulated_tool_calls.values()),
            })
            for tc_id, result in results.items():
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc_id,
                    "content": result,
                })

    def _args_complete(self, tool_call: dict) -> bool:
        """Return True once the accumulated arguments parse as complete JSON."""
        try:
            json.loads(tool_call["function"].get("arguments", ""))
            return True
        except json.JSONDecodeError:
            return False

    async def _dispatch_tool_async(self, tool_call: dict):
        """Dispatch a tool call as a background task and return without blocking."""
        tc_id = tool_call["id"]
        if tc_id in self.pending_tools:
            return

        task = asyncio.create_task(self._execute_tool(tool_call))
        self.pending_tools[tc_id] = task

    async def _execute_tool(self, tool_call: dict) -> str:
        """Execute a tool (may involve network I/O)."""
        name = tool_call["function"]["name"]
        args = json.loads(tool_call["function"]["arguments"])
        # Real tool execution logic goes here...
        result = f"Executed {name} with {args}"
        self.tool_results[tool_call["id"]] = result
        return result

    async def _gather_tool_results(self) -> dict:
        """Wait for all background tool calls to finish and return their results."""
        results = {}
        for tc_id, task in list(self.pending_tools.items()):
            try:
                await task
                results[tc_id] = task.result()
            except Exception as e:
                results[tc_id] = f"Error: {e}"
        self.pending_tools.clear()
        return results

    async def _stream_llm(self, messages) -> AsyncIterator:
        """Call the LLM API in streaming mode."""
        import openai
        client = openai.AsyncOpenAI()
        return await client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=[...],
            stream=True,
        )

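Event bus architecture (production-grade)

In production, multiple agents may run in parallel and tool calls may span several microservices. The event bus pattern turns the loop controller from a central caller into an event consumer: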

python
"""
Production-style streaming loop built on asyncio and event queues
"""

import asyncio


class EventDrivenLoop:
    """
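    Event-driven loop engine:
    - llm_events: stream of LLM output events
    - tool_events: stream of tool result events
    - The loop controller subscribes to both streams and drives state changes by event type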
    """

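    def __init__(self):
        self.llm_events: asyncio.Queue = asyncio.Queue()
        self.tool_events: asyncio.Queue = asyncio.Queue()
        self.state = {"step": 0, "status": "running"}
        self._pending_tools = 0                      # dispatched, result not yet handled
        self._tool_tasks: set[asyncio.Task] = set()  # keeps background tasks referenced

    async def controller(self, task: str):
        """Main controller: consumes events and dispatches work."""
        # Keep a reference so the producer task is not garbage-collected
        producer = asyncio.create_task(self._llm_producer(task))

        # One long-lived getter per queue. Re-creating and cancelling getters
        # every iteration can drop an event that arrives while a handler runs.
        llm_get = asyncio.create_task(self.llm_events.get())
        tool_get = asyncio.create_task(self.tool_events.get())
        try:
            # Run until the LLM is done and every dispatched tool has reported back
            while self.state["status"] == "running" or self._pending_tools > 0:
                done, _ = await asyncio.wait(
                    {llm_get, tool_get},
                    return_when=asyncio.FIRST_COMPLETED,
                )
                if llm_get in done:
                    await self._handle_llm_event(llm_get.result())
                    llm_get = asyncio.create_task(self.llm_events.get())
                if tool_get in done:
                    await self._handle_tool_event(tool_get.result())
                    tool_get = asyncio.create_task(self.tool_events.get())
        finally:
            llm_get.cancel()
            tool_get.cancel()
            producer.cancel()  # no-op if the producer already finished

    async def _llm_producer(self, task: str):
        """LLM producer: streamed output is pushed as events."""
        stream = await self._stream_llm(task)
        async for chunk in stream:
            await self.llm_events.put({
                "source": "llm",
                "type": "chunk",
                "data": chunk,
            })
        await self.llm_events.put({
            "source": "llm",
            "type": "done",
        })

    async def _handle_llm_event(self, event):
        """Handle an LLM event: parse tool calls and dispatch them asynchronously."""
        if event["type"] == "done":
            self.state["status"] = "evaluating"
            return

        tool_call = self._parse_tool_call(event["data"])
        if tool_call:
            # Runs in the background; the result comes back through tool_events
            self._pending_tools += 1
            t = asyncio.create_task(self._execute_and_emit(tool_call))
            self._tool_tasks.add(t)
            t.add_done_callback(self._tool_tasks.discard)

    async def _execute_and_emit(self, tool_call):
        """Execute a tool and push the result onto the event queue."""
        try:
            result = await self._execute_tool(tool_call)
        except Exception as e:
            # Always emit, so the controller never waits for a lost result
            result = f"Error: {e}"
        await self.tool_events.put({
            "source": "tool",
            "tool_call_id": tool_call["id"],
            "result": result,
        })

    async def _handle_tool_event(self, event):
        """Handle a tool result event: update state, then decide the next step."""
        self._pending_tools -= 1
        self.state["tool_results"] = self.state.get("tool_results", {})
        self.state["tool_results"][event["tool_call_id"]] = event["result"]

        # The next LLM call can be triggered here, or the loop can end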

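End-to-end latency comparison

| Mode | Latency for 3 tool calls | Notes |
| --- | --- | --- |
| Traditional serial | LLM (2 s) + Tool1 (1 s) + LLM (2 s) + Tool2 (1 s) + LLM (2 s) + Tool3 (1 s) = 9 s | Strictly serial |
| Streaming loop | max(LLM generation, Tool1 + Tool2 + Tool3 in parallel) ≈ 3-4 s | Tools execute while generation continues; assumes the three calls do not depend on each other |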
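The arithmetic behind the comparison above can be checked with a few lines of code. This is a back-of-the-envelope model, not a benchmark: the `dispatch_at` offsets (when each tool call's arguments finish streaming) are invented for illustration, and the model ignores any follow-up LLM turn that consumes the tool results.

```python
def serial_latency(llm_s: list[float], tool_s: list[float]) -> float:
    """Serial loop: every LLM turn and every tool call waits for the previous one."""
    return sum(llm_s) + sum(tool_s)


def overlapped_latency(
    llm_s: float, tool_s: list[float], dispatch_at: list[float]
) -> float:
    """
    Streaming loop: one streamed LLM turn of length llm_s. Tool i starts at
    dispatch_at[i] and runs concurrently with generation and the other tools.
    """
    tool_finish = [start + dur for start, dur in zip(dispatch_at, tool_s)]
    return max([llm_s, *tool_finish])


serial = serial_latency([2.0, 2.0, 2.0], [1.0, 1.0, 1.0])
overlapped = overlapped_latency(2.0, [1.0, 1.0, 1.0], dispatch_at=[0.5, 1.0, 2.0])
print(serial, overlapped)
```

The overlapped figure is bounded below by the slowest single path (generation, or the last tool to be dispatched plus its runtime), which is why the gain shrinks when tool calls depend on each other and cannot start early.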

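4. Multi-agent topology loops

A single agent's loop has a ceiling. In reasoning depth, context window, and perspective, a single agent has limits it cannot get past. The mainstream practice in 2026 is to nest multiple agents inside a larger loop, forming a "collaboration loop".

4.1 Manager-Worker Loop

The most classic and most robust multi-agent topology. The main agent (Manager) assigns subtasks, starts sub-agent loops and waits for their results, while running its own supervision loop so it can interrupt or reassign at any time.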

python
"""
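Manager-Worker loop: the main agent coordinates several sub-agents
"""

import asyncio
import json


class ManagerWorkerLoop:
    """
    The Manager runs its own supervision loop:
    1. Analyze the task and break it into subtasks
    2. Assign them to Workers (creating sub-loops)
    3. Collect results and evaluate quality
    4. Reassign when the result is not good enough
    """

    def __init__(self):
        self.workers: dict[str, "WorkerAgent"] = {}
        self.task_queue: asyncio.Queue = asyncio.Queue()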
    async def manage(self, main_task: str) -> str:
        """Manager's main loop."""

        # Step 1: decompose the task
        subtasks = await self._decompose(main_task)
        for st in subtasks:
            await self.task_queue.put(st)

        results = {}
        failed_tasks = []

        # Step 2: assignment loop (supports reassignment)
        while not self.task_queue.empty() or failed_tasks:
            # Retry failed tasks first
            if failed_tasks:
                task = failed_tasks.pop(0)
                task["retry_count"] = task.get("retry_count", 0) + 1
            else:
                task = await self.task_queue.get()

            # Pick a Worker and run its sub-loop
            worker = await self._select_worker(task)
            result = await worker.execute(task)
            results[task["id"]] = result

            # Step 3: quality evaluation
            quality = await self._evaluate(task, result)
            if quality["pass"]:
                continue
            elif task.get("retry_count", 0) < 3:
                # Reassign (possibly to a different Worker)
                task["feedback"] = quality["feedback"]
                failed_tasks.append(task)
            else:
                # Retry limit exceeded: escalate to a human
                results[task["id"]]["escalated"] = True

        return self._summarize(results)

    async def _decompose(self, task: str) -> list[dict]:
        """Use the LLM to break the main task into a list of subtasks."""
        response = await self._call_llm(
            system=(
                "Decompose the task into subtasks. Each subtask must be "
                "independently executable. Output JSON array with id and "
                "description. Specify dependencies if any."
            ),
            user=task,
        )
        return json.loads(response)

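    async def _select_worker(self, task: dict) -> "WorkerAgent":
        """Pick the most suitable Worker for the subtask."""
        # Could match on skills, current load, or historical success rate
        raise NotImplementedError

    async def _evaluate(self, task: dict, result: dict) -> dict:
        """Evaluate a Worker result; returns a dict with "pass" and "feedback"."""
        raise NotImplementedError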
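The Manager loop above awaits one Worker at a time. When subtasks are independent, they can be dispatched concurrently instead. The following is a minimal sketch of that variant using `asyncio.gather`, with a per-task retry budget and escalation. `run_worker` and `evaluate` are hypothetical stubs (the "flaky" and "always_bad" task ids exist only to exercise the retry path); a real system would call Worker agents and a critic here.

```python
import asyncio

MAX_RETRIES = 3  # retries after the first attempt, matching the limit used above


async def run_worker(task: dict) -> dict:
    """Hypothetical worker: 'flaky' fails once, 'always_bad' never succeeds."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if task["id"] == "always_bad":
        ok = False
    elif task["id"] == "flaky":
        ok = task["attempt"] > 0
    else:
        ok = True
    return {"task_id": task["id"], "ok": ok, "output": f"result of {task['id']}"}


def evaluate(result: dict) -> bool:
    """Hypothetical quality gate; a real one might call a critic model."""
    return result["ok"]


async def run_with_retry(task: dict) -> dict:
    for attempt in range(MAX_RETRIES + 1):
        result = await run_worker({**task, "attempt": attempt})
        if evaluate(result):
            return {**result, "attempts": attempt + 1, "escalated": False}
    # Retry budget exhausted: flag for human handling instead of looping on
    return {**result, "attempts": MAX_RETRIES + 1, "escalated": True}


async def manage(subtasks: list[dict]) -> dict[str, dict]:
    # Independent subtasks run concurrently; gather preserves input order
    results = await asyncio.gather(*(run_with_retry(t) for t in subtasks))
    return {r["task_id"]: r for r in results}


results = asyncio.run(manage([{"id": "a"}, {"id": "flaky"}, {"id": "always_bad"}]))
for task_id, r in results.items():
    print(task_id, r["attempts"], r["escalated"])
```

Each subtask carries its own retry loop, so one slow or failing Worker does not block the others. Subtasks with dependencies would still need to be scheduled in order.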

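4.2 Generator-Critic-Reviser Loop

The generator agent produces output → the critic agent points out problems → the generator agent revises. This forms an inner iteration loop that runs until the quality bar is met.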

python
"""
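Generate-critique-revise loop
The Generator (which also performs the revisions) and the Critic each keep to their own role, forming an inner quality-iteration loop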
"""

class GeneratorCriticLoop:
    """
    Inner loop structure:
    Generator → Critic → [PASS] → output
                       → [FAIL] → Generator (with the critique) → ...
    """

    def generate(self, task: str, max_rounds: int = 5) -> str:
        output = self._generator(task)

        for round_num in range(max_rounds):
            # The Critic evaluates independently (different model, different temperature)
            critique = self._critic(output, task)

            if critique["score"] >= 0.9:
                return output

            # Feed the critique into the Generator's next round
            output = self._generator(
                task,
                previous_output=output,
                feedback=critique["feedback"],
            )

        return output  # round limit reached: return the latest revision

    def _generator(
        self,
        task: str,
        previous_output: str = None,
        feedback: str = None,
    ) -> str:
        """Generator: high temperature (0.7-0.9) to encourage creativity."""
        prompt = f"Task: {task}"
        if previous_output:
            prompt += (
                f"\n\nYour previous output:\n{previous_output}"
                f"\n\nCritique: {feedback}"
                "\n\nRevise based on the critique."
            )
        return self._call_llm(prompt, temperature=0.8)

    def _critic(self, output: str, task: str) -> dict:
        """Critic: low temperature (0.0-0.1) for strict evaluation."""
        response = self._call_llm(
            system=(
                "You are a strict code reviewer. Evaluate the output "
                "against the original task. Return JSON: "
                '{"score": 0.0-1.0, "feedback": "specific issues"}'
            ),
            user=f"Task: {task}\n\nOutput to evaluate:\n{output}",
            temperature=0.0,
        )
        return json.loads(response)

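4.3 ChatEval-style multi-agent debate loop

Several agents answer independently in parallel, then debate each other for multiple rounds, and finally a judge agent summarizes. The whole process runs as one large loop.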

python
"""
Multi-agent debate loop
"""

class DebateLoop:
    """
    辩论流程:
    1. 多个 Expert Agent 并行独立回答
    2. 互相审阅对方的回答,提出反驳
    3. 多轮辩论后,Judge Agent 汇总裁决
    """

    async def debate(self, question: str, num_experts: int = 3, rounds: int = 3):
        # Round 1: independent answers
        answers = await asyncio.gather(*[
            self._expert_answer(question, expert_id=i)
            for i in range(num_experts)
        ])

        # Rounds 2..N: mutual debate
        for round_num in range(1, rounds):
            # Each Expert sees every other Expert's answer
            responses = await asyncio.gather(*[
                self._expert_rebut(
                    question=question,
                    my_answer=answers[i],
                    other_answers=[a for j, a in enumerate(answers) if j != i],
                    expert_id=i,
                )
                for i in range(num_experts)
            ])
            answers = responses

        # Final verdict
        verdict = await self._judge(question, answers)
        return verdict

    async def _expert_answer(self, question: str, expert_id: int) -> str:
        return await self._call_llm(
            system=f"You are Expert {expert_id}. Give your best answer.",
            user=question,
            temperature=0.7,
        )

    async def _expert_rebut(
        self,
        question: str,
        my_answer: str,
        other_answers: list[str],
        expert_id: int,
    ) -> str:
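        # Label each answer with its real expert id: the caller removed our own
        # answer, so a plain enumerate() would mislabel every later expert.
        other_ids = [j for j in range(len(other_answers) + 1) if j != expert_id]
        others_text = "\n---\n".join(
            f"Expert {j}: {a}" for j, a in zip(other_ids, other_answers)
        )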
        return await self._call_llm(
            system=(
                f"You are Expert {expert_id}. Review other experts' answers "
                "and refine your own. Point out flaws in others' reasoning."
            ),
            user=(
                f"Question: {question}\n"
                f"Your answer: {my_answer}\n"
                f"Other answers:\n{others_text}\n"
                "Provide your refined answer."
            ),
            temperature=0.6,
        )

    async def _judge(self, question: str, answers: list[str]) -> dict:
        return await self._call_llm(
            system=(
                "You are a judge. Synthesize all expert opinions and produce "
                "a final verdict. Output JSON with 'verdict' and 'reasoning'."
            ),
            user=f"Question: {question}\n\n"
            + "\n---\n".join(f"Expert {i}: {a}" for i, a in enumerate(answers)),
            temperature=0.0,
        )

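Topology selection guide

| Topology | Token cost | Quality gain | Best for |
| --- | --- | --- | --- |
| Manager-Worker | Medium (subtasks in parallel) | (not given) | Large multi-file tasks with a clear division of labor |
| Generator-Critic | Low (2x calls) | (not given) | Code generation, documentation writing |
| Multi-agent debate | High (N×M rounds) | Highest | Security audits, design reviews, key decisions |
| Hierarchical delegation | Medium-high | Medium-high | Large systems that need recursive decomposition |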
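The token-cost column can be made more concrete by counting LLM calls. The sketch below counts calls for the generator-critic loop and the debate loop as they are written in the code earlier in this section. It counts calls only, not tokens, so treat it as a rough lower bound on relative cost; the function names are introduced here for illustration.

```python
def generator_critic_calls(rounds_used: int, passed: bool) -> int:
    """
    Calls made by the generate/critique loop: one initial generation, one
    critique per round, and one regeneration after every failed critique.
    """
    failed_rounds = rounds_used - 1 if passed else rounds_used
    return 1 + rounds_used + failed_rounds


def debate_calls(num_experts: int, rounds: int) -> int:
    """N independent answers, N rebuttals in each later round, plus one judge call."""
    return num_experts + num_experts * (rounds - 1) + 1


print(generator_critic_calls(rounds_used=1, passed=True))   # passes on first critique
print(generator_critic_calls(rounds_used=5, passed=False))  # hits the round limit
print(debate_calls(num_experts=3, rounds=3))
```

A generator-critic run that passes on the first critique costs two calls, which matches the "2x calls" entry in the table, while debate cost grows with the product of experts and rounds.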

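5. Long-running persistence and durable execution

Most agent loops live no longer than a single session. A server restart, a process crash, a network outage: any interruption means starting over. Durable Execution gives an agent loop the ability to "continue after the system restarts".

Core idea

The state of every step is written to durable storage (a database or object store), so the workflow engine can wake the loop up and continue from the interrupted node instead of rerunning the whole task.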
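Before bringing in a workflow engine, the idea can be shown with nothing but a JSON file. This is a deliberately small sketch, assuming a single process and a local checkpoint file; `run_loop` and the `crash_at` parameter are invented here to simulate an interruption, and a real engine adds much more (timers, retries, distributed workers).

```python
import json
import os
import tempfile
from typing import Optional


def run_loop(
    checkpoint_path: str, total_steps: int, crash_at: Optional[int] = None
) -> list[str]:
    """Run steps 0..total_steps-1, persisting state after every step."""
    # Resume from the checkpoint if one exists, otherwise start fresh
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "results": []}

    while state["next_step"] < total_steps:
        step = state["next_step"]
        if step == crash_at:
            raise RuntimeError(f"simulated crash before step {step}")
        state["results"].append(f"step-{step}-done")  # stand-in for real work
        state["next_step"] = step + 1
        # Write to a temp file, then rename over the old checkpoint, so a
        # crash mid-write cannot leave a half-written file behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["results"]


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "loop_state.json")
    try:
        run_loop(path, total_steps=5, crash_at=3)  # first run dies at step 3
    except RuntimeError:
        pass
    resumed = run_loop(path, total_steps=5)        # second run resumes at step 3

print(resumed)
```

Steps 0 to 2 are not repeated on the second run because their results were already in the checkpoint. Note that work done inside a step but not yet checkpointed is repeated after a crash, which is why step idempotency matters.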

Implementation with Temporal.io

Temporal is one of the most widely adopted durable execution engines in industry. Each step is an Activity, and the whole loop is a Workflow.

python
"""
Durable-execution Agent Loop built on Temporal.io
After a server restart it resumes from the interrupted step instead of starting over
"""

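from dataclasses import dataclass, field
from datetime import timedelta

from temporalio import workflow, activity
from temporalio.common import RetryPolicy


@dataclass
class LoopState:
    """Loop state passed to activities; activity inputs and results are recorded in Temporal's Event History."""
    task: str
    current_step: int = 0
    max_steps: int = 20
    messages: list[dict] = field(default_factory=list)
    step_results: list[dict] = field(default_factory=list)


@activity.defn
async def llm_reason(state: LoopState) -> dict:
    """Activity: LLM reasoning. Activities may be retried (at-least-once); a completed result is recorded and not re-run during replay."""
    from openai import AsyncOpenAI  # imported here to keep it out of workflow code

    openai_client = AsyncOpenAI()
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=state.messages,
        tools=[...],
    )
    return response.choices[0].message.model_dump(exclude_none=True)


@activity.defn
async def execute_tool(tool_call: dict) -> str:
    """Activity: tool execution; the caller supplies the retry policy."""
    raise NotImplementedError("tool execution is left out of this example")


@activity.defn
async def evaluate_result(results: list[dict]) -> dict:
    """Activity: result evaluation; should return {"complete": bool, "summary": str}."""
    raise NotImplementedError("evaluation is left out of this example")


@workflow.defn
class DurableAgentLoop:
    """
    Agent Loop at the Workflow level.
    Temporal records every completed Activity automatically.
    After a crash, execution resumes from the last recorded step.
    """

    @workflow.run
    async def run(self, task: str) -> str:
        state = LoopState(task=task)
        state.messages = [{"role": "user", "content": task}]

        # Retry policy for failed tool calls (intervals must be timedelta values)
        tool_retry = RetryPolicy(
            initial_interval=timedelta(seconds=1),
            maximum_interval=timedelta(seconds=60),
            maximum_attempts=3,
        )

        for step in range(state.max_steps):
            state.current_step = step
            # Every Activity call is recorded in the Event History
            response = await workflow.execute_activity(
                llm_reason,
                state,
                start_to_close_timeout=timedelta(seconds=30),
            )
            # Keep the assistant turn so later tool messages have a matching
            # tool_calls entry in the conversation
            state.messages.append(response)

            # Check whether the task is complete
            eval_result = await workflow.execute_activity(
                evaluate_result,
                [{"response": response}],
                start_to_close_timeout=timedelta(seconds=10),
            )

            if eval_result.get("complete"):
                return eval_result["summary"]

            # Execute tool calls (with retries)
            if response.get("tool_calls"):
                for tc in response["tool_calls"]:
                    tool_result = await workflow.execute_activity(
                        execute_tool,
                        tc,
                        retry_policy=tool_retry,
                        start_to_close_timeout=timedelta(seconds=60),
                    )
                    state.messages.append({
                        "role": "tool",
                        "tool_call_id": tc["id"],
                        "content": tool_result,
                    })

        return f"Stopped after {state.max_steps} steps without completing"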


# 部署后,即使 Worker 进程崩溃,
# Temporal Server 会在另一个 Worker 上从中断点恢复执行

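Key design principles

  • State is the source of truth: loop state (messages, step_results) must be serialized to persistent storage rather than kept only in memory.
  • Deterministic replay: Workflow code must be deterministic (no direct calls to random, datetime.now(), and the like), because Temporal rebuilds state by replaying the Event History.
  • Activity idempotency: Temporal may retry an Activity after a failure or timeout, so each Activity should be safe to execute more than once. On replay, completed Activities are not rerun; their recorded results are read from the Event History.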
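
One common way to get the idempotency described above is an idempotency key: derive a stable key from the workflow, step, and tool call, and skip the side effect when that key has already completed. The sketch below assumes an in-memory dict in place of a durable table, and `execute_tool_idempotent` and `idempotency_key` are hypothetical helper names, not part of the Temporal SDK.

```python
import hashlib
import json

# Stand-in for a durable table keyed by idempotency key (assumption for this sketch)
_completed: dict[str, str] = {}
# Records every real execution so the demo can show the side effect ran once
side_effects: list[str] = []


def idempotency_key(workflow_id: str, step: int, tool_call: dict) -> str:
    """Build a stable key from the workflow, the step, and the call payload."""
    payload = json.dumps(
        {"wf": workflow_id, "step": step, "call": tool_call}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def execute_tool_idempotent(workflow_id: str, step: int, tool_call: dict) -> str:
    """Run the tool once per key; a retry returns the stored result."""
    key = idempotency_key(workflow_id, step, tool_call)
    if key in _completed:
        return _completed[key]
    side_effects.append(tool_call["name"])  # the real side effect would go here
    result = f"ran {tool_call['name']}"
    _completed[key] = result
    return result


call = {"name": "git_commit", "args": {"message": "fix lint"}}
first = execute_tool_idempotent("wf-1", 4, call)
retry = execute_tool_idempotent("wf-1", 4, call)  # simulated retry of the same step
```

In a real deployment the lookup and the write would need to be atomic in the backing store, which this in-memory version does not attempt.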

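Comparison of alternatives

| Engine | Best suited for | Complexity | Highlights |
| --- | --- | --- | --- |
| Temporal.io | Production-grade long-running tasks (hours to days) | Medium-high | Replay mechanism, multi-language SDKs, visual management UI |
| Prefect | Data pipelines + Agent Loop | | Python-native, strong observability |
| AWS Step Functions | Teams already on AWS infrastructure | | No operations burden, deep integration with the AWS ecosystem |
| Celery + Redis | Lightweight setups, rapid prototyping | | Simple deployment, mature Python ecosystem |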

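6. Self-Optimizing Loops: DSPy-Driven Loop Engineering

This technique pushes Loop Engineering to the meta level: the loop itself becomes the object being optimized. Each node in the loop defines learnable prompts and parameters, and a framework such as DSPy uses each node's success/failure logs to optimize prompts and few-shot examples automatically.

Applying DSPy to loop optimization

text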
Traditional Loop Engineering:
Human design prompt → Loop run → Human observe failure → Human fix prompt → Loop run → ...

DSPy-driven Loop Engineering:
Human define metrics → Loop run → DSPy auto-collect success/failure logs
                                       ↓
                       Auto-optimize prompts, few-shot, model selection
                                       ↓
                       Loop run (new config) → success rate improved

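DSPy's core mechanism: LLM calls are abstracted as Signatures and Modules, and a compiler (optimizer) tunes the module parameters automatically against training data.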

python
"""
DSPy-driven self-optimizing loop: every decision node inside the loop can be tuned automatically
"""

import json

import dspy
from dspy.teleprompt import BootstrapFewShot

# Assumes DSPy >= 2.5 and an OPENAI_API_KEY in the environment
dspy.configure(lm=dspy.LM("openai/gpt-4o"))

# ============================================
# Optimizable modules inside the loop
# ============================================

class PlanGenerator(dspy.Signature):
    """Turn a task description into an execution plan."""
    task = dspy.InputField(desc="The high-level task to complete")
    tools = dspy.InputField(desc="Available tools with descriptions")
    plan = dspy.OutputField(desc="JSON array of steps with dependencies")


class ToolSelector(dspy.Signature):
    """Select the next tool based on the current state."""
    current_state = dspy.InputField(desc="Current state and partial results")
    available_tools = dspy.InputField(desc="Available tools")
    next_tool = dspy.OutputField(desc="Selected tool name")
    tool_args = dspy.OutputField(desc="Arguments for the selected tool")


class ResultEvaluator(dspy.Signature):
    """Judge whether the tool output satisfies the current step's goal."""
    step_goal = dspy.InputField(desc="What this step should achieve")
    tool_output = dspy.InputField(desc="Output from tool execution")
    passed = dspy.OutputField(desc="True/False")
    reason = dspy.OutputField(desc="Reason for pass/fail")


# ============================================
# The optimizable DSPy Module
# ============================================

class SelfOptimizingLoop(dspy.Module):
    """
    An agent loop wrapped as a DSPy Module.
    Every LLM call node can be optimized.
    """

    def __init__(self):
        super().__init__()
        # Each node can be tuned by a DSPy optimizer
        self.planner = dspy.ChainOfThought(PlanGenerator)
        self.selector = dspy.ChainOfThought(ToolSelector)
        self.evaluator = dspy.ChainOfThought(ResultEvaluator)

    def forward(self, task: str, tools: list[str]) -> dspy.Prediction:
        """Run one full pass of the loop."""

        # Node 1: planning
        plan = self.planner(task=task, tools=str(tools))

        results = []
        for step in json.loads(plan.plan):
            # Node 2: tool selection
            selection = self.selector(
                current_state=str(results),
                available_tools=str(tools),
            )

            # Tool execution (not an LLM call, so not optimizable)
            tool_result = self._execute(selection.next_tool, selection.tool_args)

            # Node 3: result evaluation
            eval_result = self.evaluator(
                step_goal=step["description"],
                tool_output=tool_result,
            )
            # Output fields are strings, so parse "True"/"False" explicitly
            passed = str(eval_result.passed).strip().lower() == "true"

            results.append({
                "step": step,
                "result": tool_result,
                "passed": passed,
                "reason": eval_result.reason,
            })

            if not passed:
                break

        succeeded = bool(results) and all(r["passed"] for r in results)
        return dspy.Prediction(
            results=results,
            answer="success" if succeeded else "failure",
        )

    def _execute(self, tool: str, args: str) -> str:
        """Actual tool execution: not an LLM call, nothing to optimize."""
        # ... tool execution logic
        return "tool_result"


# ============================================
# Automatic optimization
# ============================================

def loop_metric(example, pred, trace=None) -> bool:
    """A run counts as successful when the loop outcome matches the expected answer."""
    return pred.answer == example.answer


def train_loop_with_examples():
    """Train the loop on historical success/failure data."""

    # Training data: task trajectories with their expected outcome
    trainset = [
        dspy.Example(
            task="Fix all ESLint errors in src/",
            tools='["read_file", "write_file", "run_lint"]',
            answer="success",  # expected outcome
        ).with_inputs("task", "tools"),
        # ... more training examples
    ]

    # BootstrapFewShot: extract few-shot demos from runs that pass the metric
    optimizer = BootstrapFewShot(
        metric=loop_metric,
        max_bootstrapped_demos=4,
    )

    # Compile (optimize)
    optimized_loop = optimizer.compile(
        SelfOptimizingLoop(),
        trainset=trainset,
    )

    return optimized_loop

# Use the optimized loop
optimized = train_loop_with_examples()
result = optimized(task="Fix all ESLint errors", tools=["read_file", "write_file", "run_lint"])

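Results in practice

Community-reported figures (2026 Q2):

  • Task success rate rose by 15-35% automatically, with no manual prompt edits
  • Tasks that used to take 5-8 iterations dropped to 2-4
  • Especially suited to scenarios that run the same kind of task repeatedly (such as daily CI fixes)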

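DSPy optimization dimensions

The dimensions that matter when optimizing a loop with DSPy, and how each one is handled:

| Dimension | What it covers | How it is handled |
| --- | --- | --- |
| Prompt wording | How the instructions are phrased | Instruction optimizers such as COPRO or MIPROv2 propose and score candidate instructions |
| Few-shot selection | Which examples go into the context | BootstrapFewShot bootstraps demos from runs that pass the metric; KNNFewShot retrieves demos by similarity to the input |
| Chain-of-Thought | Whether a reasoning step is needed | dspy.ChainOfThought adds a reasoning field before the outputs; choosing it over dspy.Predict is a decision made in the module code |
| Model selection | Which model performs best at this node | Not tuned by MIPROv2 itself, which optimizes instructions and demos; compare models by evaluating the compiled loop with each candidate |
| Context length | How much message history enters the next round | Handled in the loop code, for example by trimming irrelevant intermediate steps |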

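Limitations

  • Cost of training data: enough success/failure trajectories have to be accumulated first
  • Optimizer overhead: BootstrapFewShot needs many LLM calls to optimize, so it fits stable scenarios
  • Risk of over-optimization: the optimized prompt may overfit patterns in the training data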
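
A simple guard against the overfitting risk above is to score the optimized loop on held-out examples that the optimizer never saw, and keep it only if it beats the baseline there. The sketch below is library-free and makes several assumptions: programs are plain callables that take a task string, examples are dicts with `task` and `answer` keys, and `split_examples`, `mean_score`, and `accept_optimized` are hypothetical helpers rather than DSPy APIs.

```python
import random
from typing import Callable


def split_examples(examples: list[dict], holdout_ratio: float = 0.3, seed: int = 0):
    """Shuffle deterministically and split into (train, holdout)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = max(1, int(len(shuffled) * holdout_ratio))
    return shuffled[cut:], shuffled[:cut]


def mean_score(program: Callable[[str], str], examples: list[dict], metric: Callable) -> float:
    """Average metric value of a program over a set of examples."""
    if not examples:
        return 0.0
    return sum(metric(ex, program(ex["task"])) for ex in examples) / len(examples)


def accept_optimized(baseline, optimized, holdout, metric, min_gain: float = 0.0) -> bool:
    """Keep the optimized program only if it beats the baseline on held-out data."""
    gain = mean_score(optimized, holdout, metric) - mean_score(baseline, holdout, metric)
    return gain > min_gain


def exact_match(example: dict, prediction: str) -> float:
    return 1.0 if prediction == example["answer"] else 0.0


examples = [{"task": f"task-{i}", "answer": "success"} for i in range(10)]
train, holdout = split_examples(examples)

# A toy "optimized" program that has memorized the training tasks only
memorized = {ex["task"] for ex in train}
overfit = lambda task: "success" if task in memorized else "failure"
baseline = lambda task: "success"

train_score = mean_score(overfit, train, exact_match)      # perfect on seen tasks
holdout_score = mean_score(overfit, holdout, exact_match)  # fails on unseen tasks
keep = accept_optimized(baseline, overfit, holdout, exact_match)
```

The memorizing program looks perfect on its training set and is still rejected, which is the signal a holdout split is meant to give.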

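7. Declarative Loop Configuration

Declarative configuration is sharply lowering the engineering barrier to Loop Engineering. The core idea: describe the loop's structure and policies in YAML/JSON, and let an engine parse the config and generate the loop runtime directly.

Benefits of config-driven loops

  • Strategy as config: switching loop strategy means swapping a file, not changing code
  • Version-controlled: loop configs live in Git and can be diffed, reviewed, and rolled back
  • Fast experiments: when A/B testing loop strategies, deployment time drops from hours to minutes
  • Lower barrier: non-engineering roles can also understand and adjust loop behavior

Complete config example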

yaml
# loop.yaml: declarative loop config; switch strategies without changing a line of code

version: "2.0"
name: "daily-bug-fixer"

# ============ Loop basics ============
loop:
  type: plan-execute  # react | plan-execute | maker-checker | ralph
  max_iterations: 30
  stop_conditions:
    - type: test_pass
      value: "all tests passing"
    - type: max_cost
      value: 5.0           # at most $5
    - type: no_progress
      consecutive_rounds: 3

# ============ Tools ============
tools:
  - name: read_file
    source: builtin
  - name: write_file
    source: builtin
    require_approval: true   # file writes need human approval
  - name: run_tests
    source: mcp
    endpoint: "http://ci-server/mcp/test-runner"
  - name: git_commit
    source: mcp
    endpoint: "http://git-server/mcp"

# ============ Models ============
models:
  planner:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.3
    max_tokens: 4096
  executor:
    provider: openai
    model: gpt-4o
    temperature: 0.5
  checker:
    provider: anthropic
    model: claude-3-5-haiku-20241022
    temperature: 0.0

# ============ Sub-agents ============
sub_agents:
  - name: maker
    role: generator
    model: executor
    tools: [read_file, write_file, run_tests]
  - name: checker
    role: evaluator
    model: checker
    tools: [read_file, run_tests]
    independent: true       # separate context window

# ============ Safety guardrails ============
guardrails:
  pre_action:
    - rule: "block destructive ops on src/"
      action: reject_and_replan
    - rule: "max file write per iteration: 3"
      action: reject_and_replan
  post_action:
    - rule: "check for secrets in output"
      action: redact_and_alert

# ============ Human in the loop ============
human_in_the_loop:
  triggers:
    - on: write_file
      paths: ["*.env", "*.key", "config/*"]
      action: await_approval
    - on: git_push
      action: await_approval
    - on: cost_exceeded
      threshold: 3.0
      action: notify_and_pause

# ============ Persistence ============
persistence:
  engine: temporal           # temporal | prefect | redis
  state_file: ".loop/state.json"
  checkpoint_every: 5        # checkpoint every 5 steps

# ============ Observability ============
observability:
  tracing: langsmith
  metrics:
    - loop_iteration_count
    - tool_call_success_rate
    - cost_per_iteration
  alerts:
    - condition: "cost > $10/day"
      channel: "#ai-ops"

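Config engine implementation

python
"""
Declarative loop config engine
Parses YAML and builds the loop runtime
"""

import yaml

# ReactLoop, PlanExecuteLoop, MakerCheckerLoop, RalphLoop, BaseLoop,
# BUILTIN_TOOLS and MCPTool are assumed to be defined elsewhere in the project.


class LoopConfigEngine:
    """Builds a loop runtime dynamically from a YAML config."""

    STRATEGIES = {
        "react": ReactLoop,
        "plan-execute": PlanExecuteLoop,
        "maker-checker": MakerCheckerLoop,
        "ralph": RalphLoop,
    }

    @classmethod
    def from_yaml(cls, config_path: str) -> "BaseLoop":
        with open(config_path, encoding="utf-8") as f:
            config = yaml.safe_load(f)

        loop_type = config["loop"]["type"]
        if loop_type not in cls.STRATEGIES:
            raise ValueError(
                f"Unknown loop type {loop_type!r}; expected one of {sorted(cls.STRATEGIES)}"
            )
        strategy_cls = cls.STRATEGIES[loop_type]

        # Assemble the loop from the config; optional sections fall back to empty values
        loop = strategy_cls(
            max_iterations=config["loop"]["max_iterations"],
            stop_conditions=config["loop"].get("stop_conditions", []),
            tools=cls._init_tools(config["tools"]),
            models=cls._init_models(config["models"]),
            guardrails=config.get("guardrails", {}),
            human_triggers=config.get("human_in_the_loop", {}).get("triggers", []),
        )

        # Register sub-agents if the config defines any
        for sa_config in config.get("sub_agents", []):
            loop.register_sub_agent(
                name=sa_config["name"],
                role=sa_config["role"],
                model=sa_config["model"],
                tools=sa_config["tools"],
                independent=sa_config.get("independent", False),
            )

        return loop

    @classmethod
    def _init_tools(cls, tool_configs: list[dict]) -> dict:
        """Initialize tools: supports builtin tools and the MCP protocol."""
        tools = {}
        for tc in tool_configs:
            if tc["source"] == "builtin":
                tools[tc["name"]] = BUILTIN_TOOLS[tc["name"]]
            elif tc["source"] == "mcp":
                tools[tc["name"]] = MCPTool(tc["endpoint"])
            else:
                raise ValueError(
                    f"Unknown tool source {tc['source']!r} for tool {tc['name']!r}"
                )
        return tools

    @classmethod
    def _init_models(cls, model_configs: dict) -> dict:
        """Placeholder (not in the original): pass model settings through, keyed by role."""
        return dict(model_configs)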

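Config-driven experiment workflow

bash
# Switch loop strategies quickly for A/B testing
# without modifying any Python code.
# The `claude loop` commands are illustrative; substitute whichever runner loads these configs.

# Strategy A: ReAct
claude loop --config loops/react-config.yaml task.md

# Strategy B: Plan-Execute
claude loop --config loops/plan-execute-config.yaml task.md

# Strategy C: Maker-Checker
claude loop --config loops/maker-checker-config.yaml task.md

# Compare success rate, duration, and cost across the three
claude loop compare --configs loops/*.yaml --task task.md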

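8. Observability-Driven Loop Breakpoints and Human Collaboration

A loop in production cannot be a black box. With end-to-end tracing that monitors loop state in real time, operators can inject breakpoints and change policies at runtime without taking the agent offline.

Core capabilities

  • Dynamic breakpoint injection: when an agent is about to perform a high-risk operation, attach a human approval through the dashboard
  • Hot policy changes: adjust loop parameters without taking the agent offline
  • Live status panel: visualize execution time, token consumption, and success rate per node

An observable loop with LangSmith + Weave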

python
"""
Observable loop: end-to-end tracing + dynamic breakpoints + hot config changes
"""

import json
import logging
import uuid

import weave
from langsmith import traceable

logger = logging.getLogger("observable_loop")

# HotConfigSource, redis_client and ws_manager are placeholders for your own
# config-center client, async Redis client and WebSocket manager.


class ObservableLoop:
    """
    The loop is traced by LangSmith, and single steps also by Weave.
    Operators can watch live state in the dashboard and inject breakpoints.
    """

    def __init__(self, config: dict):
        self.config = config
        self.loop_id = uuid.uuid4().hex
        # Hot-reload support: pull parameters from the config center at runtime
        self.hot_config = HotConfigSource()

    @traceable(run_type="chain", name="agent_loop")
    async def run(self, task: str):
        """Main loop. Each iteration shows up as a child span of this trace."""

        state = {"task": task, "iteration": 0, "total_cost": 0.0}

        while state["iteration"] < self.config["max_iterations"]:
            # Check for breakpoints injected by operators
            bp = await self._check_breakpoint(state)
            if bp:
                approval = await self._request_human_approval(bp)
                if not approval["approved"]:
                    state["status"] = "paused_by_human"
                    return state

            # Execute one step
            step_result = await self._execute_step(state)

            # Record metrics in state; they are captured with the trace output
            state["total_cost"] += step_result["cost"]

            # Check for hot config changes
            new_config = await self.hot_config.get()
            if new_config != self.config:
                logger.info("Hot-reloading config: %s", new_config)
                self.config = new_config

            state["iteration"] += 1

        return state

    @weave.op()
    @traceable(run_type="tool", name="execute_step")
    async def _execute_step(self, state: dict) -> dict:
        """Single step: its own trace span. Expected to return at least {"cost": float}."""
        # LLM call + tool execution
        raise NotImplementedError

    async def _check_breakpoint(self, state: dict) -> dict | None:
        """Check whether the dashboard has injected a breakpoint."""
        # Breakpoints come from Redis / the config center as JSON, so conditions
        # must be data, not callables. Assumed shape:
        # [{"at_iteration": 3, "step": ..., "action": ..., "reason": ...}]
        raw = await redis_client.get("loop:breakpoints")
        for bp in json.loads(raw) if raw else []:
            if bp.get("at_iteration") == state["iteration"]:
                return bp
        return None

    async def _request_human_approval(self, bp: dict) -> dict:
        """Send a human-approval request to the dashboard."""
        # Notify the frontend over WebSocket
        await ws_manager.send(
            channel=f"loop:{self.loop_id}",
            message={
                "type": "breakpoint",
                "step": bp["step"],
                "action": bp["action"],
                "reason": bp["reason"],
            },
        )
        # Wait for the human response (with a timeout)
        return await ws_manager.wait_for_response(timeout=300)

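Dashboard capability checklist

| Capability | Description | Implementation |
| --- | --- | --- |
| Live flame graph | Share of time spent in each node | LangSmith Trace View |
| Dynamic breakpoints | Suspend before the next step and wait for human confirmation | WebSocket + Redis breakpoint config |
| Hot parameter changes | Change max_iterations / temperature | Config center (etcd / Consul) with real-time push |
| Cost dashboard | Live token consumption + budget alerts | Weave / LangSmith dashboards |
| Replay and debugging | Step through historical trajectories | Temporal Replay / LangSmith Playground |
| Strategy A/B | Run several strategy versions side by side | Traffic split by trace tag |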

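9. Safety Guard Sub-Loop

In heavily regulated industries such as finance and healthcare, every agent action has to pass a safety review. The safety guard sub-loop embeds a lightweight check loop before and after each action, forming two lines of defense: pre-action review, then post-action audit.

Architecture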

text
┌──────────────────────────────────────────────┐
│               Main Agent Loop                │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │        Pre-Action Safety Loop          │  │
│  │  Check: permission? budget?            │  │
│  │         compliance?                    │  │
│  │              ↓                         │  │
│  │    PASS → Execute      FAIL            │  │
│  │              ↓           ↓             │  │
│  │          Action    Request replan      │  │
│  └────────────────────────────────────────┘  │
│                      ↓                       │
│  ┌────────────────────────────────────────┐  │
│  │       Post-Action Audit Loop           │  │
│  │  Check: sensitive info leak?           │  │
│  │         compliance?                    │  │
│  │              ↓                         │  │
│  │    PASS → Continue      FAIL           │  │
│  │              ↓            ↓            │  │
│  │        Next Step    Block + Alert      │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘

Implementation

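python
"""
Safety guard sub-loop: two layers of checks, before and after every action
"""

import time
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable


class SafetyVerdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_REPLAN = "require_replan"
    REDACT = "redact"


@dataclass
class SafetyRule:
    name: str
    description: str
    severity: str  # critical | high | medium
    check_fn: Callable[..., Awaitable[SafetyVerdict]]


class SafetyGuardLoop:
    """
    Safety guard sub-loop:
    - before the action: permission, budget, and compliance checks
    - after the action: sensitive-data and output review
    """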

    def __init__(self):
        self.pre_rules: list[SafetyRule] = [
            SafetyRule(
                name="block_destructive",
                description="Block destructive operations on critical paths",
                severity="critical",
                check_fn=self._check_destructive,
            ),
            SafetyRule(
                name="budget_limit",
                description="Check if action would exceed budget",
                severity="high",
                check_fn=self._check_budget,
            ),
            SafetyRule(
                name="permission_check",
                description="Verify agent has permission for this action",
                severity="critical",
                check_fn=self._check_permission,
            ),
            SafetyRule(
                name="rate_limit",
                description="Check API rate limits",
                severity="high",
                check_fn=self._check_rate_limit,
            ),
        ]
        self.post_rules: list[SafetyRule] = [
            SafetyRule(
                name="no_secrets",
                description="Scan output for secrets (API keys, tokens, passwords)",
                severity="critical",
                check_fn=self._scan_for_secrets,
            ),
            SafetyRule(
                name="pii_detection",
                description="Check for personally identifiable information",
                severity="critical",
                check_fn=self._check_pii,
            ),
            SafetyRule(
                name="output_size",
                description="Verify output size is within limits",
                severity="medium",
                check_fn=self._check_output_size,
            ),
            SafetyRule(
                name="compliance",
                description="Verify output against compliance policies",
                severity="high",
                check_fn=self._check_compliance,
            ),
        ]

        # Safety event log
        self.safety_log: list[dict] = []

    async def pre_action_check(
        self,
        action: dict,
        agent_state: dict,
    ) -> SafetyVerdict:
        """Pre-action safety check loop."""

        for rule in self.pre_rules:
            verdict = await rule.check_fn(action, agent_state)

            self.safety_log.append({
                "phase": "pre",
                "action": action["name"],
                "rule": rule.name,
                "verdict": verdict.value,
                "timestamp": time.time(),
            })

            if verdict == SafetyVerdict.BLOCK:
                # Block and alert
                await self._alert(
                    f"BLOCKED: {action['name']} by rule {rule.name}",
                    severity=rule.severity,
                )
                return SafetyVerdict.BLOCK

            if verdict == SafetyVerdict.REQUIRE_REPLAN:
                # Ask the agent to replan
                return SafetyVerdict.REQUIRE_REPLAN

        return SafetyVerdict.ALLOW

    async def post_action_audit(
        self,
        action: dict,
        output: str,
    ) -> SafetyVerdict:
        """Post-action audit loop."""

        for rule in self.post_rules:
            verdict = await rule.check_fn(output)

            self.safety_log.append({
                "phase": "post",
                "action": action["name"],
                "rule": rule.name,
                "verdict": verdict.value,
                "timestamp": time.time(),
            })

            if verdict == SafetyVerdict.BLOCK:
                await self._alert(
                    f"POST-BLOCKED: output of {action['name']} "
                    f"flagged by {rule.name}",
                    severity=rule.severity,
                )
                return SafetyVerdict.BLOCK

            if verdict == SafetyVerdict.REDACT:
                # Redact automatically, then continue with the remaining rules
                output = await self._redact(output, rule.name)

        # Expose the (possibly redacted) text so the caller can use it
        # instead of the raw tool output
        self.last_audited_output = output
        return SafetyVerdict.ALLOW

    # --- Rule implementations ---

    async def _check_destructive(
        self, action: dict, state: dict
    ) -> SafetyVerdict:
        """Block destructive operations on critical paths."""
        CRITICAL_PATHS = [
            "/etc/",
            "C:\\Windows\\",
            ".git/",
            "production/",
        ]
        if action["name"] in ("delete", "rm", "drop"):
            target = action.get("args", {}).get("path", "")
            for cp in CRITICAL_PATHS:
                if cp in target:
                    return SafetyVerdict.BLOCK
        return SafetyVerdict.ALLOW

    async def _check_budget(
        self, action: dict, state: dict
    ) -> SafetyVerdict:
        """Budget check."""
        estimated_cost = self._estimate_cost(action)
        current_spend = state.get("total_cost", 0)
        budget = state.get("budget", float("inf"))

        if current_spend + estimated_cost > budget:
            return SafetyVerdict.BLOCK
        return SafetyVerdict.ALLOW

    async def _scan_for_secrets(self, output: str) -> SafetyVerdict:
        """Scan the output for secrets."""
        import re
        patterns = [
            r"[A-Za-z0-9_]{20,}==",       # Base64 token
            r"sk-[A-Za-z0-9]{32,}",        # OpenAI API key
            r"ghp_[A-Za-z0-9]{36}",        # GitHub token
            r"AKIA[0-9A-Z]{16}",           # AWS Access Key
            r"password\s*[:=]\s*\S+",      # Password in plaintext
        ]
        for pattern in patterns:
            if re.search(pattern, output):
                return SafetyVerdict.REDACT
        return SafetyVerdict.ALLOW

    async def _check_pii(self, output: str) -> SafetyVerdict:
        """PII detection."""
        import re
        patterns = [
            r"\b\d{3}-\d{2}-\d{4}\b",                              # SSN
            r"\b\d{16}\b",                                          # Credit card
            r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",  # Email
        ]
        for pattern in patterns:
            if re.search(pattern, output):
                return SafetyVerdict.REDACT
        return SafetyVerdict.ALLOW

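    async def _alert(self, message: str, severity: str = "high"):
        """Send a safety alert."""
        # Push to Slack / PagerDuty / the security operations center
        pass

    async def _redact(self, output: str, rule_name: str) -> str:
        """Automatic redaction."""
        # Replace sensitive content with [REDACTED]
        return output

    # --- Placeholders (not in the original) for the rules referenced in __init__ ---
    # They allow everything so the class can be instantiated; replace them before real use.

    async def _check_permission(self, action: dict, state: dict) -> SafetyVerdict:
        return SafetyVerdict.ALLOW

    async def _check_rate_limit(self, action: dict, state: dict) -> SafetyVerdict:
        return SafetyVerdict.ALLOW

    async def _check_output_size(self, output: str) -> SafetyVerdict:
        return SafetyVerdict.ALLOW

    async def _check_compliance(self, output: str) -> SafetyVerdict:
        return SafetyVerdict.ALLOW

    def _estimate_cost(self, action: dict) -> float:
        return 0.0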
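
The `_redact` method in the listing is left as a stub that returns its input unchanged. One way it could be filled in is with `re.sub`-style replacement over the same kinds of patterns the scanner uses. The sketch below is a standalone function under stated assumptions: the pattern set is a subset copied from `_scan_for_secrets`, and `redact` and `SECRET_PATTERNS` are hypothetical names introduced here.

```python
import re

# (pattern, replacement) pairs; the password rule keeps the "password:" prefix
SECRET_PATTERNS: dict[str, tuple[re.Pattern, str]] = {
    "openai_key": (re.compile(r"sk-[A-Za-z0-9]{32,}"), "[REDACTED]"),
    "github_token": (re.compile(r"ghp_[A-Za-z0-9]{36}"), "[REDACTED]"),
    "aws_access_key": (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED]"),
    "password": (
        re.compile(r"(password\s*[:=]\s*)\S+", re.IGNORECASE),
        r"\1[REDACTED]",
    ),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace every match with [REDACTED] and report which rules fired."""
    fired: list[str] = []
    for name, (pattern, replacement) in SECRET_PATTERNS.items():
        text, count = pattern.subn(replacement, text)
        if count:
            fired.append(name)
    return text, fired


sample = "key=AKIAABCDEFGHIJKLMNOP password: hunter2 ok"
clean, rules = redact(sample)
```

Returning the list of rules that fired gives the audit log something concrete to record alongside the redacted text. Regex scanning of this kind catches only well-known token formats, so it complements rather than replaces a dedicated secret scanner.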

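Integration into the main loop

python
class SafeAgentLoop:
    """Agent loop with the safety guard sub-loop embedded."""

    def __init__(self):
        self.guard = SafetyGuardLoop()

    async def run(self, task: str) -> dict:
        state = {
            "task": task,
            "iteration": 0,
            "total_cost": 0,
            "messages": [{"role": "user", "content": task}],
        }

        for state["iteration"] in range(50):
            # Agent reasoning + planning
            action = await self._plan_next_action(state)

            # Pre-action safety check
            pre_verdict = await self.guard.pre_action_check(action, state)

            if pre_verdict == SafetyVerdict.BLOCK:
                # Blocked: record the event and tell the planner, then replan
                state["blocked"] = True
                state["messages"].append({
                    "role": "user",
                    "content": f"Action {action['name']} was blocked by a safety rule. "
                    "Do not retry it.",
                })
                continue

            if pre_verdict == SafetyVerdict.REQUIRE_REPLAN:
                # Replan required: inject the rejection into the context
                state["messages"].append({
                    "role": "user",
                    "content": f"Action {action['name']} rejected. "
                    "Plan a different approach.",
                })
                continue

            # Execute the action
            output = await self._execute_action(action)

            # Post-action audit
            post_verdict = await self.guard.post_action_audit(action, output)

            if post_verdict == SafetyVerdict.BLOCK:
                # Output blocked: the guard has already alerted, so stop
                state["status"] = "blocked_by_audit"
                break

            # Prefer the redacted text when the guard exposes one
            output = getattr(self.guard, "last_audited_output", output)

            # Update state
            state["messages"].append({
                "role": "tool",
                "content": output,
            })

        return state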

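Industry rollout checklist

| Industry | Required guardrails | Optional guardrails |
| --- | --- | --- |
| Finance | Operation permission matrix, transaction amount caps, compliance audit logs | PII detection, anti-money-laundering rules |
| Healthcare | HIPAA compliance, PHI de-identification, audit trail | Prescription dose validation, drug interaction checks |
| SaaS | API key scanning, user data isolation, rate limiting | Content safety review, GDPR compliance |
| Infrastructure | Destructive-operation confirmation, change-window checks | Capacity checks, rollback plan verification |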

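Summary: the 2026 Loop Engineering technology selection matrix

| Dimension | Recommended approach | Why |
| --- | --- | --- |
| Control-flow complexity | Lightweight graph + code nodes | Visible topology, debuggable logic, switchable strategies |
| Task complexity | Two-level loop + dynamic replanning | Success rate on complex tasks up 25%, wasted calls down 50% |
| Latency sensitivity | Streaming loop + async event bus | End-to-end latency drops from 9s to 3s |
| Quality requirements | Generator-Critic or multi-agent debate | Significant quality gains at a controllable token cost |
| Reliability requirements | Temporal.io durable execution | Automatic recovery after a server crash, no lost state |
| Continuous improvement | DSPy self-optimizing loop | No manual prompt edits, automatic 15-35% improvement |
| Engineering barrier | Declarative config | Strategy as config: versioned in Git, diffable, easy to experiment with |
| Operational control | Observable loop + dynamic breakpoints | Inject approvals at runtime, adjust parameters without going offline |
| Security and compliance | Safety guard sub-loop | Two layers of defense around every action, mandatory in regulated industries |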

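By 2026, Loop Engineering is no longer a conceptual discussion but a field of practice with clear engineering solutions. Which route to choose depends on your task type, team capability, and compliance requirements. The nine practices can be combined: for example, use Temporal as the durable-execution foundation, build the two-level loop on top of it, drive execution with streaming events, and wrap every action in the safety guard.

This does not make the work easier; it makes it deeper. The judgment to design loops is your remedy, and the inertia of avoiding thought is your trap. Same action, opposite outcomes.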
