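In June 2026, the idea of Loop Engineering took off in Silicon Valley. It is not another short-lived buzzword; it marks a shift in AI engineering from "a human prompts the AI" to "a human designs the system, and the system drives the AI". This article skips the introductory material and goes straight to the nine practice areas that are currently most active in industry and research. Each comes with code examples and notes on the architectural trade-offs.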
Contents
- 1. Minimal code loops vs. graph state-machine frameworks: two competing routes
- 2. Plan-execute dual-layer loops with dynamic re-planning
- 3. Event-driven and streaming loops
- 4. Multi-agent topology loops
- 5. Long-running persistence and durable execution
- 6. Self-optimizing loops: DSPy-driven Loop Engineering
- 7. Declarative loop configuration
- 8. Observability-driven breakpoints and human collaboration
- 9. Safety guardrail sub-loops
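Background: what problem does Loop Engineering solve?
Before getting into specific techniques, one basic question needs a clear answer:
Human attention is limited; an AI's capacity to execute is, for practical purposes, not. Under traditional Prompt Engineering, a person sits at the terminal, presses Enter for every turn, and reviews every output. Once the model produces output far faster than a person can process it, the person becomes the bottleneck of the whole workflow.
The core of Loop Engineering is to turn the developer from "a participant in every interaction" into "the designer of the loop system": define the goal, set guardrails, design the verification mechanism, and then let the system run on its own. This is more than a quantitative gain in efficiency; it changes how the work itself is organized.
Loop Engineering practice has moved well beyond the simple ReAct loops of 2025 and has split into several technical routes. The nine directions below are the ones most worth watching in 2026.
1. Minimal code loops vs. graph state-machine frameworks: two competing routes
This is the most fundamental split in Loop Engineering philosophy. Each route has its advocates, and the latest trend is that the two are converging.
Route A: the minimal while loop
This is the route reportedly favored inside Anthropic. A plain Python/TypeScript `while` loop (or a bounded `for` loop) controls the agent, and its body calls the LLM, parses tool calls, executes them, and appends the results, in that order.
Core idea: the logic is fully transparent, with no hidden machinery, and is easy to debug. What happens at each step is obvious, which suits agents with clear rules and short step sequences.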
python
"""
极简 while 循环 Agent ------ 完全透明的执行逻辑
"""
import json
from openai import OpenAI
from typing import Any
client = OpenAI()
# 工具定义
tools: list[dict[str, Any]] = [
{
"type": "function",
"function": {
"name": "read_file",
"description": "Read file content",
"parameters": {
"type": "object",
"properties": {
"path": {"type": "string", "description": "File path"}
},
"required": ["path"],
},
},
},
{
"type": "function",
"function": {
"name": "run_tests",
"description": "Run the test suite and return results",
"parameters": {
"type": "object",
"properties": {
"filter": {"type": "string", "description": "Test filter pattern"}
},
},
},
},
]
def execute_tool(tool_call) -> str:
"""执行工具调用 ------ 逻辑完全可见"""
name = tool_call.function.name
args = json.loads(tool_call.function.arguments)
if name == "read_file":
with open(args["path"]) as f:
return f.read()
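    elif name == "run_tests":
        import subprocess
        # Pass the filter through -k only when one is given; otherwise an empty
        # string would be handed to pytest as a positional argument.
        cmd = ["pytest", "-q"]
        if args.get("filter"):
            cmd += ["-k", args["filter"]]
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.stdout + result.stderr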
return "Unknown tool"
def agent_loop(task: str, max_iterations: int = 10) -> str:
"""
核心循环:完全使用标准 Python while,无框架依赖。
每一步:LLM 推理 → 解析工具调用 → 执行 → 拼接结果 → 下一轮。
"""
messages: list[dict[str, Any]] = [
{
"role": "system",
"content": (
"You are a coding agent. "
"Use tools to understand the codebase and make changes. "
"When you believe the task is done, respond with 'DONE'."
),
},
{"role": "user", "content": task},
]
for iteration in range(max_iterations):
# Step 1: LLM 推理
response = client.chat.completions.create(
model="gpt-4o",
messages=messages,
tools=tools,
)
msg = response.choices[0].message
# Step 2: 如果模型直接返回文本(无工具调用)
if msg.content and "DONE" in msg.content:
return msg.content
# Step 3: 执行工具调用
if msg.tool_calls:
tool_results = []
for tc in msg.tool_calls:
result = execute_tool(tc)
tool_results.append({
"role": "tool",
"tool_call_id": tc.id,
"content": result,
})
# Step 4: 拼接结果,进入下一轮
messages.append(msg)
messages.extend(tool_results)
continue
return msg.content or "No response"
return f"Stopped after {max_iterations} iterations"
# 使用
result = agent_loop("Read README.md and run all tests. Fix any failures.")
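Pros:
- Every step is explicit, so stepping through with a debugger needs no framework knowledge
- All state lives in the `messages` list; if it gets polluted, reset it
- The code usually stays under 150 lines, which keeps review cheap
Cons:
- Complex branching (five or more conditional jumps) bloats quickly
- No native way to express parallel execution
- No built-in visualization or tracing
Route B: graph state-machine frameworks
LangGraph, OpenAI Agents SDK, and Google ADK have all moved, to varying degrees, toward graph- or state-machine-style orchestration. A directed graph defines the nodes (LLM calls, tools, conditional branches) and edges, with native support for parallelism, dynamic routing, and nested subgraphs.
Core idea: describe control flow as a structured graph. The logic is more complex, but it is traceable and can be visualized.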
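python
"""
LangGraph state-machine agent.
Nodes and edges are declared explicitly; routing after evaluation is conditional.
"""
import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, END
from openai import OpenAI

client = OpenAI()

# Shared state passed between nodes
class AgentState(TypedDict):
    messages: Annotated[list, operator.add]  # reducer: returned lists are appended
    task_status: str
    retry_count: int  # read by should_retry, so it must be part of the state

# Node definitions: each node returns a partial state update
def planner(state: AgentState) -> dict:
    """Planning node: analyze the task and decide the next step."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Plan the next step. Return a JSON action."},
            *state["messages"],
        ],
    )
    plan = response.choices[0].message.content or ""
    # Store a plain dict so the state stays serializable; a new plan resets the retry budget
    return {"messages": [{"role": "assistant", "content": plan}], "retry_count": 0}

def executor(state: AgentState) -> dict:
    """Execution node: run the planned tool calls (placeholder)."""
    return {"task_status": "executing"}

def evaluator(state: AgentState) -> dict:
    """Evaluation node: judge the result (placeholder) and count the attempt."""
    return {
        "task_status": "evaluated",
        "retry_count": state.get("retry_count", 0) + 1,
    }

def should_retry(state: AgentState) -> str:
    """Conditional routing based on the evaluation result."""
    if state.get("task_status") == "success":
        return END
    if state.get("retry_count", 0) < 3:
        return "executor"
    return "planner"

# Build the graph
graph = StateGraph(AgentState)
graph.add_node("planner", planner)
graph.add_node("executor", executor)
graph.add_node("evaluator", evaluator)
graph.set_entry_point("planner")
graph.add_edge("planner", "executor")
graph.add_edge("executor", "evaluator")
graph.add_conditional_edges("evaluator", should_retry)
app = graph.compile()
# Note: the placeholder evaluator never sets task_status to "success", so a real
# run would only stop at LangGraph's recursion limit until real evaluation is added.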
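Pros:
- The whole loop topology can be visualized, so decision branches are visible at a glance
- Native support for parallel nodes (several executors running at once)
- Subgraph nesting: a node can itself be another StateGraph
- A mature ecosystem (LangSmith tracing, Weave integration)
Cons:
- Debugging gets harder: you need to understand both the graph compiler and the state tracking
- Simple tasks pick up an unnecessary abstraction layer
- Runtime overhead: the source puts the graph scheduling layer at roughly 15-30% slower than a bare while loop, without citing a benchmark; LLM latency usually dominates either way
Convergence: lightweight graph plus code nodes
Best practice in 2026 is converging on a middle route: declare the graph topology in JSON/YAML and implement node logic directly in code.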
python
"""
轻量图 Loop:JSON 定义拓扑 + 代码实现节点
"""
loop_config = {
"entry": "planner",
"nodes": {
"planner": {
"type": "llm",
"model": "gpt-4o",
"system_prompt": "Plan the next step as JSON.",
"next": "executor",
},
"executor": {
"type": "code",
"handler": "my_project.loop_nodes.execute_step",
"next": "evaluator",
},
"evaluator": {
"type": "llm",
"model": "gpt-4o-mini",
"system_prompt": "Evaluate the result. Return PASS or FAIL with reasons.",
"routes": {
"PASS": "END",
"FAIL": "executor",
"REPLAN": "planner",
},
},
},
"max_iterations": 20,
"parallel_nodes": [],
}
class LightweightLoop:
"""轻量循环引擎:解析 JSON 配置,调用代码节点"""
def __init__(self, config: dict):
self.config = config
self.state = {"messages": [], "iteration": 0}
def run(self, task: str):
current = self.config["entry"]
while current != "END" and self.state["iteration"] < self.config.get(
"max_iterations", 50
):
node = self.config["nodes"][current]
if node["type"] == "llm":
self._run_llm_node(node)
elif node["type"] == "code":
self._run_code_node(node)
current = self._resolve_route(node)
self.state["iteration"] += 1
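Advantages of this hybrid route:
- The graph topology is visible at a glance (changing the config changes the flow, with no code edits)
- Node logic is plain code, so it debugs like any ordinary function
- Switching loop strategy (ReAct → Plan-Execute → Maker-Checker) only requires swapping the config file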
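The `LightweightLoop` class above leaves its node runners and route resolution undefined. A minimal runnable sketch of the same idea follows, under stated assumptions: the handlers (`plan`, `execute`, `evaluate`), the `HANDLERS` registry, and the `routes` layout are hypothetical names chosen for illustration, and the LLM nodes are replaced by plain functions so the control flow can be run offline.

```python
from typing import Callable

# Hypothetical node handlers: each mutates the shared state and returns a route label.
def plan(state: dict) -> str:
    state["plan"] = f"handle: {state['task']}"
    return "next"

def execute(state: dict) -> str:
    state["attempts"] = state.get("attempts", 0) + 1
    return "next"

def evaluate(state: dict) -> str:
    # Pretend the first attempt fails and the second one passes.
    return "PASS" if state["attempts"] >= 2 else "FAIL"

HANDLERS: dict[str, Callable[[dict], str]] = {
    "plan": plan, "execute": execute, "evaluate": evaluate,
}

# Topology lives in data; changing it does not touch the handlers.
CONFIG = {
    "entry": "planner",
    "max_iterations": 20,
    "nodes": {
        "planner": {"handler": "plan", "routes": {"next": "executor"}},
        "executor": {"handler": "execute", "routes": {"next": "evaluator"}},
        "evaluator": {"handler": "evaluate",
                      "routes": {"PASS": "END", "FAIL": "executor"}},
    },
}

def run_loop(config: dict, task: str) -> dict:
    state: dict = {"task": task, "trace": []}
    current = config["entry"]
    for _ in range(config["max_iterations"]):  # hard cap on iterations
        if current == "END":
            break
        node = config["nodes"][current]
        label = HANDLERS[node["handler"]](state)  # run the node
        state["trace"].append(current)
        current = node["routes"][label]           # resolve the next node
    state["finished"] = current == "END"
    return state

final = run_loop(CONFIG, "fix failing tests")
print(final["trace"], final["finished"])
```

Swapping the evaluator's `FAIL` route from `executor` to `planner` turns a retry loop into a re-plan loop without touching any handler, which is the point of keeping topology in configuration.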
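2. Plan-execute dual-layer loops with dynamic re-planning
A flat ReAct loop shows a basic weakness on complex tasks: every step is a locally optimal decision made without a global view, which leads to many wasted tool calls and to drifting off course.
Dual-layer architecture
text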
┌──────────────────────────────────────────────────┐
│ Outer Loop: Planner │
│ ┌────────────────────────────────────────────┐ │
│ │ Generate high-level plan with steps │ │
│ │ Step 1 → Step 2 → Step 3 → Step 4 │ │
│ └──────────────────┬─────────────────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌────────────────────────────────────────────┐ │
│ │ Inner Loop: Executor (per step) │ │
│ │ Perceive → Reason → Act → Observe │ │
│ │ On failure: signal to Outer Loop │ │
│ └──────────────────┬─────────────────────────┘ │
│ │ │
│ ▼ │
│ Dynamic Re-plan based on executor result │
└──────────────────────────────────────────────────┘
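Outer loop (Planner): the LLM produces a high-level plan, a list of steps with dependencies. The Planner does not execute anything; it only handles strategy.
Inner loop (Executor): executes one step at a time and reports each result back to the outer loop. If a step fails, the outer loop revises the remaining plan based on the new state.
This is essentially an enhanced Plan-and-Solve, combined with the core idea of ReWOO: decouple observation from reasoning so that intermediate results do not pollute the planning context.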
python
"""
双层循环 Agent:Plan-and-Execute + Dynamic Re-plan
"""
import json
from dataclasses import dataclass, field
from typing import Optional
@dataclass
class PlanStep:
id: str
description: str
depends_on: list[str] = field(default_factory=list)
status: str = "pending" # pending | running | success | failed
@dataclass
class ExecutionPlan:
steps: list[PlanStep]
current_step_index: int = 0
class DualLoopAgent:
"""
双层循环:
- Outer loop: 生成计划 + 当执行失败时动态重规划
- Inner loop: 逐步执行 + 将结果反馈给 Outer loop
"""
def __init__(self):
self.context: list[dict] = []
self.plan: Optional[ExecutionPlan] = None
def outer_loop(self, task: str) -> str:
"""Outer Loop:战略层 ------ 规划与重规划"""
# Phase 1: 生成高层计划
plan_response = self._call_llm(
system=(
"You are a planner. Break the task into a JSON array of steps. "
'Each step: {"id": "step_N", "description": "...", '
'"depends_on": ["step_X"]}. '
"Steps must be ordered. Only output JSON, no explanation."
),
user=task,
)
steps_data = json.loads(plan_response)
self.plan = ExecutionPlan(
steps=[PlanStep(**s) for s in steps_data]
)
# Phase 2: 逐步执行(Inner Loop 调 outer)
while self.plan.current_step_index < len(self.plan.steps):
step = self.plan.steps[self.plan.current_step_index]
step.status = "running"
result = self.inner_loop(step)
if result["success"]:
step.status = "success"
self.plan.current_step_index += 1
self.context.append({
"step": step.id,
"result": result["output"],
})
else:
step.status = "failed"
# 动态重规划:根据失败原因修订剩余计划
if not self._replan(step, result["error"]):
return f"Task failed at step {step.id}: {result['error']}"
return "All steps completed successfully"
def inner_loop(self, step: PlanStep) -> dict:
"""Inner Loop:战术层 ------ 单步执行 + 迭代修正"""
# 准备上下文:只包含当前步骤和已完成步骤的结果摘要
inner_context = [
{"role": "system", "content": f"Execute step: {step.description}"},
{
"role": "user",
"content": f"Previous results: {json.dumps(self.context[-3:])}",
},
]
for attempt in range(5):
result = self._execute_with_tools(inner_context)
if result["success"]:
return result
# 把错误反馈注入下一轮
inner_context.append({
"role": "user",
"content": f"Previous attempt failed: {result['error']}. "
"Analyze the error and try a different approach.",
})
return {"success": False, "error": "Max inner loop iterations reached"}
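    def _replan(self, failed_step: PlanStep, error: str, max_replans: int = 3) -> bool:
        """Dynamic re-planning: revise the failed step and everything after it."""
        # Cap the number of re-plans so a persistently failing task cannot loop forever
        self._replans = getattr(self, "_replans", 0) + 1
        if self._replans > max_replans:
            return False
        idx = self.plan.current_step_index
        remaining_steps = self.plan.steps[idx + 1 :]
        replan_prompt = (
            f"Step '{failed_step.description}' failed with: {error}\n"
            "Remaining steps to complete the task:\n"
            + "\n".join(f"- {s.description}" for s in remaining_steps)
            + "\n\nReturn a revised list that replaces the failed step and the "
            "remaining steps. If the task is no longer achievable, return an empty list."
        )
        response = self._call_llm(
            system="Revise the plan based on the failure. Output JSON array of steps.",
            user=replan_prompt,
        )
        try:
            new_steps = json.loads(response)
        except json.JSONDecodeError:
            return False
        if not new_steps:
            return False
        # Keep the completed steps. The revised plan starts at the current index,
        # so the failed step is replaced rather than silently skipped, and a
        # failure on the last step can still be re-planned.
        self.plan.steps = self.plan.steps[:idx] + [PlanStep(**s) for s in new_steps]
        return True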
def _call_llm(self, system: str, user: str) -> str:
# 简化的 LLM 调用接口
import openai
client = openai.OpenAI()
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": system},
{"role": "user", "content": user},
],
)
return response.choices[0].message.content
def _execute_with_tools(self, context: list[dict]) -> dict:
# 简化的工具执行接口
return {"success": True, "output": "done"}
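Comparison with ReWOO
The core insight of ReWOO (Reasoning WithOut Observation) is that observations (tool outputs) should not be mixed into the reasoning process. Otherwise the reasoning chain degrades into "I saw X, so I do X" instead of "given the goal, I need X".
The dual-layer loop supports this pattern naturally: the inner loop does the dirty work (tool calls and observations), and the outer loop receives only summarized results for its strategic decisions. Both context windows stay clean.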
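The decoupling described above can be sketched with ReWOO-style evidence placeholders. In this hedged illustration the plan is fixed up front and refers to earlier tool outputs only through `#E<n>` variables, so the planner never has to read raw observations. The `TOOLS` table, the `PLAN` contents, and `run_plan` are assumptions made for the example; a real planner would generate the plan with an LLM.

```python
import re
from typing import Callable

# Hypothetical tools; real ones would call a search API, a test runner, etc.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda q: {"capital of France": "Paris"}.get(q, "unknown"),
    "upper": lambda s: s.upper(),
}

# The plan is written before any tool runs. Later steps reference earlier
# evidence through #E<n> placeholders instead of embedding tool output.
PLAN: list[tuple[str, str, str]] = [
    ("#E1", "lookup", "capital of France"),
    ("#E2", "upper", "#E1"),
]

def run_plan(plan: list[tuple[str, str, str]]) -> dict[str, str]:
    evidence: dict[str, str] = {}
    for var, tool, arg in plan:
        # Substitute placeholders with the evidence gathered so far.
        resolved = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], arg)
        evidence[var] = TOOLS[tool](resolved)
    return evidence

evidence = run_plan(PLAN)
print(evidence)
```

Only the worker touches tool output; a final solver step (not shown) would receive the plan plus the evidence table.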
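Results (community-reported; the source cites no benchmark):
- Roughly 40-60% fewer wasted tool calls
- Roughly 25% higher success rate on complex multi-step tasks
- Task chains of 30+ steps become tractable
3. Event-driven and streaming loops
A traditional agent loop is strictly serial: request → full response → parse → execute → next request. Each stage waits for the previous one, so end-to-end latency is the sum of all stages.
An event-driven loop inverts this: as soon as the LLM has streamed enough of a structured action, the tool call is dispatched, and the tool runs while the LLM keeps generating.
text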
Traditional Loop:
LLM generate ──→ parse tool calls ──→ execute tools ──→ concat ──→ LLM generate ──→ ...
[───── wait ─────][── wait ──][── wait ──]
Event-driven Loop:
LLM streaming ───────────────────────────────────────────→
├─→ tool call fragment arrives → async execute tool ──→
│ ├─→ result event pushed
├─→ continue generating ───────────────────────────────→
│ ├─→ state update triggered
└─→ more tool calls → async execute ───────────────────→
└─→ next round decision
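Parsing streamed tool calls
By 2025 both the OpenAI and Anthropic APIs supported streaming tool calls. When the model decides to call a tool, the API pushes the function name and argument fragments as incremental chunks, so the client can start preparing before the arguments have fully arrived.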
python
"""
流式循环引擎:边生成边执行
"""
import asyncio
import json
from typing import AsyncIterator
class StreamingLoopEngine:
"""
核心机制:
1. LLM 流式输出 → 实时解析工具调用片段
2. 工具调用立即异步执行
3. 工具结果以事件形式推回决策循环
"""
def __init__(self, event_bus=None):
self.event_bus = event_bus or AsyncEventBus()
self.pending_tools: dict[str, asyncio.Task] = {}
self.tool_results: dict[str, str] = {}
async def run(self, task: str, max_iterations: int = 20):
"""流式循环入口"""
messages = [{"role": "user", "content": task}]
accumulated_tool_calls: dict[int, dict] = {}
for iteration in range(max_iterations):
stream = await self._stream_llm(messages)
async for chunk in stream:
delta = chunk.choices[0].delta
# 情况 1:纯文本增量
if delta.content:
yield {"type": "text", "content": delta.content}
# 情况 2:工具调用增量
if delta.tool_calls:
for tc_delta in delta.tool_calls:
idx = tc_delta.index
                        if idx not in accumulated_tool_calls:
                            accumulated_tool_calls[idx] = {
                                "id": tc_delta.id or "",
                                # Each tool call echoed back in the assistant
                                # message needs its type field
                                "type": "function",
                                "function": {"name": "", "arguments": ""},
                            }
if tc_delta.id:
accumulated_tool_calls[idx]["id"] = tc_delta.id
if tc_delta.function:
if tc_delta.function.name:
accumulated_tool_calls[idx]["function"][
"name"
] += tc_delta.function.name
if tc_delta.function.arguments:
accumulated_tool_calls[idx]["function"][
"arguments"
] += tc_delta.function.arguments
# 关键:参数够完整就立即发起异步执行
if self._args_complete(
accumulated_tool_calls[idx]
):
await self._dispatch_tool_async(
accumulated_tool_calls[idx]
)
# 等待所有待处理的工具调用完成
results = await self._gather_tool_results()
# 拼接结果进入下一轮
messages.append({
"role": "assistant",
"tool_calls": list(accumulated_tool_calls.values()),
})
for tc_id, result in results.items():
messages.append({
"role": "tool",
"tool_call_id": tc_id,
"content": result,
})
accumulated_tool_calls.clear()
# 如果模型只返回文本且无工具调用,结束
if not results:
break
def _args_complete(self, tool_call: dict) -> bool:
"""判断工具调用参数是否足够完整以开始执行"""
try:
json.loads(tool_call["function"].get("arguments", ""))
return True
except json.JSONDecodeError:
return False
async def _dispatch_tool_async(self, tool_call: dict):
"""异步派发工具调用,立即返回不阻塞"""
tc_id = tool_call["id"]
if tc_id in self.pending_tools:
return
task = asyncio.create_task(self._execute_tool(tool_call))
self.pending_tools[tc_id] = task
async def _execute_tool(self, tool_call: dict) -> str:
"""执行工具(可能涉及网络 IO)"""
name = tool_call["function"]["name"]
args = json.loads(tool_call["function"]["arguments"])
# 实际工具执行逻辑...
result = f"Executed {name} with {args}"
self.tool_results[tool_call["id"]] = result
return result
async def _gather_tool_results(self) -> dict:
"""等待所有异步工具调用完成,返回结果"""
results = {}
for tc_id, task in list(self.pending_tools.items()):
try:
await task
results[tc_id] = task.result()
except Exception as e:
results[tc_id] = f"Error: {e}"
self.pending_tools.clear()
return results
async def _stream_llm(self, messages) -> AsyncIterator:
"""流式调用 LLM API"""
import openai
client = openai.AsyncOpenAI()
return await client.chat.completions.create(
model="gpt-4o",
messages=messages,
tools=[...],
stream=True,
)
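Event bus architecture (production-grade)
In production, several agents may run in parallel and tool calls may span multiple microservices. The event bus pattern turns the loop controller from a central caller into an event consumer: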
python
"""
基于 AsyncIO + 事件队列的生产级流式循环
"""
class EventDrivenLoop:
"""
事件驱动循环引擎:
- llm_events: LLM 输出事件流
- tool_events: 工具执行结果事件流
- 循环控制器订阅两个事件流,根据事件类型驱动状态变更
"""
def __init__(self):
self.llm_events: asyncio.Queue = asyncio.Queue()
self.tool_events: asyncio.Queue = asyncio.Queue()
self.state = {"step": 0, "status": "running"}
async def controller(self, task: str):
"""主控制器:事件消费者 + 分发者"""
# 启动 LLM 生产者
asyncio.create_task(self._llm_producer(task))
while self.state["status"] == "running":
# 同时监听两个事件源
done, pending = await asyncio.wait(
[
asyncio.create_task(self.llm_events.get()),
asyncio.create_task(self.tool_events.get()),
],
return_when=asyncio.FIRST_COMPLETED,
)
for task_done in done:
event = task_done.result()
if event["source"] == "llm":
await self._handle_llm_event(event)
elif event["source"] == "tool":
await self._handle_tool_event(event)
# 取消未完成的等待
for p in pending:
p.cancel()
async def _llm_producer(self, task: str):
"""LLM 生产者:流式输出 → 推送事件"""
stream = await self._stream_llm(task)
async for chunk in stream:
await self.llm_events.put({
"source": "llm",
"type": "chunk",
"data": chunk,
})
await self.llm_events.put({
"source": "llm",
"type": "done",
})
async def _handle_llm_event(self, event):
"""处理 LLM 事件:解析工具调用 → 异步分发"""
if event["type"] == "done":
self.state["status"] = "evaluating"
return
tool_call = self._parse_tool_call(event["data"])
if tool_call:
# 异步执行工具,结果会通过 tool_events 回来
asyncio.create_task(self._execute_and_emit(tool_call))
async def _execute_and_emit(self, tool_call):
"""执行工具并将结果推入事件队列"""
result = await self._execute_tool(tool_call)
await self.tool_events.put({
"source": "tool",
"tool_call_id": tool_call["id"],
"result": result,
})
async def _handle_tool_event(self, event):
"""处理工具结果事件:更新状态 → 决定下一步"""
self.state["tool_results"] = self.state.get("tool_results", {})
self.state["tool_results"][event["tool_call_id"]] = event["result"]
# 可在此触发下一轮 LLM 调用或直接结束
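One detail is easy to get wrong in a controller like the one above: if the loop exits as soon as the LLM stream ends, results from tools that are still running are lost. A minimal sketch of one way to handle this follows, assuming a single shared queue and a counter of outstanding tool calls. `fake_llm_stream` and `run_tool` are stand-ins invented for the example so that it runs without any API.

```python
import asyncio

async def fake_llm_stream(events: asyncio.Queue) -> None:
    """Stand-in for a streaming LLM: emits two tool calls, then 'done'."""
    for i in range(2):
        await asyncio.sleep(0.01)
        await events.put({"source": "llm", "type": "tool_call",
                          "id": f"call_{i}", "args": {"n": i}})
    await events.put({"source": "llm", "type": "done"})

async def run_tool(call: dict, events: asyncio.Queue) -> None:
    """Stand-in for a tool: always reports exactly one result event."""
    try:
        await asyncio.sleep(0.02)
        result = f"echo {call['args']['n']}"
    except Exception as exc:  # report failures as events too
        result = f"error: {exc}"
    await events.put({"source": "tool", "id": call["id"], "result": result})

async def controller() -> dict[str, str]:
    events: asyncio.Queue = asyncio.Queue()  # one queue for every event source
    results: dict[str, str] = {}
    tasks = [asyncio.create_task(fake_llm_stream(events))]
    llm_done, pending = False, 0

    # Stop only when the LLM has finished AND every dispatched tool has reported.
    while not (llm_done and pending == 0):
        event = await events.get()
        if event["source"] == "tool":
            results[event["id"]] = event["result"]
            pending -= 1
        elif event["type"] == "tool_call":
            pending += 1
            tasks.append(asyncio.create_task(run_tool(event, events)))
        else:  # the LLM "done" event
            llm_done = True
    await asyncio.gather(*tasks)
    return results

results = asyncio.run(controller())
print(results)
```

Using one queue also avoids creating and cancelling a pair of `get()` tasks on every iteration; the event's `source` field carries the distinction instead.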
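End-to-end latency comparison
| Mode | Latency for 3 tool calls | Notes |
|---|---|---|
| Traditional serial | LLM (2 s) + Tool1 (1 s) + LLM (2 s) + Tool2 (1 s) + LLM (2 s) + Tool3 (1 s) = 9 s | Strictly sequential |
| Streaming loop | max(LLM generation, Tool1, Tool2, Tool3 running in parallel) ≈ 3-4 s | Tools run while the model is still generating; assumes the three calls are independent and issued in a single turn |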
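The arithmetic behind the table can be made explicit. The sketch below uses the table's own figures (2 s per LLM turn, 1 s per tool); the emission offsets `0.5`, `1.0`, `1.5` for when each tool call finishes streaming are assumptions made for illustration, as are the function names.

```python
def serial_latency(llm_s: float, tool_s: list[float]) -> float:
    """One LLM turn before each tool call; nothing overlaps."""
    return sum(llm_s + t for t in tool_s)

def overlapped_latency(llm_s: float, emit_at: list[float], tool_s: list[float]) -> float:
    """One LLM turn; tool i starts at emit_at[i] and runs concurrently."""
    last_tool_done = max(start + dur for start, dur in zip(emit_at, tool_s))
    return max(llm_s, last_tool_done)

serial = serial_latency(2.0, [1.0, 1.0, 1.0])
overlapped = overlapped_latency(2.0, [0.5, 1.0, 1.5], [1.0, 1.0, 1.0])
print(serial, overlapped)  # 9.0 2.5
```

The overlapped figure covers a single generation turn only. A follow-up LLM turn that reads the tool results adds its own latency, and calls that depend on each other's output cannot be overlapped this way at all.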
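4. Multi-agent topology loops
A single agent's loop has a ceiling. Whether in reasoning depth, context window, or having only one perspective, a single agent runs into hard limits. The mainstream practice in 2026 is to nest several agents inside a larger loop, forming a "collaboration loop".
4.1 Manager-Worker Loop
The most classic and most robust multi-agent topology. The main agent (Manager) assigns subtasks, starts sub-agent loops and waits for their results, and keeps its own supervision loop running so that it can interrupt or reassign at any time.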
python
"""
管理者-工人循环:主 Agent 协调多个子 Agent
"""
class ManagerWorkerLoop:
"""
Manager 自身运行一个监督循环:
1. 分析任务,拆解为子任务
2. 分配给 Worker(创建子循环)
3. 收集结果,评估质量
4. 不满意则重新分配
"""
def __init__(self):
self.workers: dict[str, "WorkerAgent"] = {}
self.task_queue: asyncio.Queue = asyncio.Queue()
async def manage(self, main_task: str) -> str:
"""Manager 的主循环"""
# Step 1: 拆解任务
subtasks = await self._decompose(main_task)
for st in subtasks:
await self.task_queue.put(st)
results = {}
failed_tasks = []
# Step 2: 分配循环(支持重分配)
while not self.task_queue.empty() or failed_tasks:
# 优先处理失败重试
if failed_tasks:
task = failed_tasks.pop(0)
task["retry_count"] = task.get("retry_count", 0) + 1
else:
task = await self.task_queue.get()
# 分配 Worker
worker = await self._select_worker(task)
worker_id = worker.id
# 启动 Worker 子循环
result = await worker.execute(task)
results[task["id"]] = result
# Step 3: 质量评估
quality = await self._evaluate(task, result)
if quality["pass"]:
continue
elif task.get("retry_count", 0) < 3:
# 重新分配(可能换 Worker)
task["feedback"] = quality["feedback"]
failed_tasks.append(task)
else:
# 超过重试上限,升级为人工处理
results[task["id"]]["escalated"] = True
return self._summarize(results)
async def _decompose(self, task: str) -> list[dict]:
"""LLM 拆解主任务为子任务列表"""
response = await self._call_llm(
system=(
"Decompose the task into subtasks. Each subtask must be "
"independently executable. Output JSON array with id and "
"description. Specify dependencies if any."
),
user=task,
)
return json.loads(response)
async def _select_worker(self, task: dict) -> "WorkerAgent":
"""根据子任务特征选择最合适的 Worker"""
# 可以基于 skills 匹配、当前负载、历史成功率
pass
async def _evaluate(self, task: dict, result: dict) -> dict:
"""评估 Worker 结果,返回 pass + feedback"""
pass
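The `manage` loop above awaits one worker at a time, so independent subtasks do not actually run in parallel. A hedged sketch of concurrent dispatch with bounded concurrency and per-task retries follows. `fake_worker` is a stand-in that fails task `b` once, and `run_with_retry` and the concurrency limit of 2 are assumptions made for the example.

```python
import asyncio

async def fake_worker(task: dict) -> dict:
    """Stand-in for a worker agent: task 'b' fails on its first attempt."""
    await asyncio.sleep(0.01)
    ok = not (task["id"] == "b" and task["attempt"] == 1)
    return {"id": task["id"], "ok": ok, "attempt": task["attempt"]}

async def run_with_retry(task_id: str, limit: asyncio.Semaphore,
                         max_attempts: int = 3) -> dict:
    result: dict = {}
    for attempt in range(1, max_attempts + 1):
        async with limit:  # cap how many workers run at once
            result = await fake_worker({"id": task_id, "attempt": attempt})
        if result["ok"]:
            return result
    return {**result, "escalated": True}  # out of retries: hand off to a human

async def manage(task_ids: list[str], concurrency: int = 2) -> dict[str, dict]:
    limit = asyncio.Semaphore(concurrency)
    # Independent subtasks are dispatched together; gather preserves input order.
    results = await asyncio.gather(*(run_with_retry(t, limit) for t in task_ids))
    return {r["id"]: r for r in results}

results = asyncio.run(manage(["a", "b", "c"]))
print(results)
```

Subtasks with dependencies would need to be grouped into waves (or scheduled from a dependency graph) before being gathered; this sketch assumes they are independent.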
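4.2 Generator-Critic-Reviser Loop
A generator agent produces output, a critic agent points out problems, and the generator revises (acting as the reviser). This inner loop iterates until the quality bar is met.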
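python
"""
Generator-Critic loop.
The generator doubles as the reviser: it rewrites its own output using the critique.
"""
import json
from typing import Optional

class GeneratorCriticLoop:
    """
    Inner loop:
    Generator → Critic → [PASS] → return output
                       → [FAIL] → Generator (with the critique) → ...
    """
    def generate(self, task: str, max_rounds: int = 5) -> str:
        output = self._generator(task)
        best_output, best_score = output, float("-inf")
        for _ in range(max_rounds):
            # The critic evaluates independently (different model, low temperature)
            critique = self._critic(output, task)
            if critique["score"] > best_score:
                best_output, best_score = output, critique["score"]
            if critique["score"] >= 0.9:
                return output
            # Feed the critique into the generator's next round
            output = self._generator(
                task, previous_output=output, feedback=critique["feedback"]
            )
        # Round limit reached: return the best-scoring output that was evaluated,
        # not simply the last one produced
        return best_output

    def _generator(
        self,
        task: str,
        previous_output: Optional[str] = None,
        feedback: Optional[str] = None,
    ) -> str:
        """Generator: high temperature (0.7-0.9) to encourage variety."""
        prompt = f"Task: {task}"
        if previous_output:
            prompt += (
                f"\n\nYour previous output:\n{previous_output}"
                f"\n\nCritique: {feedback}"
                "\n\nRevise based on the critique."
            )
        return self._call_llm(user=prompt, temperature=0.8)

    def _critic(self, output: str, task: str) -> dict:
        """Critic: low temperature (0.0-0.1) for strict evaluation."""
        response = self._call_llm(
            system=(
                "You are a strict code reviewer. Evaluate the output "
                "against the original task. Return JSON: "
                '{"score": 0.0-1.0, "feedback": "specific issues"}'
            ),
            user=f"Task: {task}\n\nOutput to evaluate:\n{output}",
            temperature=0.0,
        )
        return json.loads(response)

    def _call_llm(self, user: str, system: str = "", temperature: float = 0.0) -> str:
        """Single LLM entry point shared by both roles; the client is left to the reader."""
        raise NotImplementedError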
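4.3 ChatEval-style multi-agent debate loop
Several agents answer independently in parallel, then debate each other over multiple rounds, and finally a judge agent synthesizes the result. The whole process runs as one large loop.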
python
"""
多 Agent 辩论循环
"""
class DebateLoop:
"""
辩论流程:
1. 多个 Expert Agent 并行独立回答
2. 互相审阅对方的回答,提出反驳
3. 多轮辩论后,Judge Agent 汇总裁决
"""
async def debate(self, question: str, num_experts: int = 3, rounds: int = 3):
# Round 1: 独立回答
answers = await asyncio.gather(*[
self._expert_answer(question, expert_id=i)
for i in range(num_experts)
])
# Round 2-N: 互相辩论
for round_num in range(1, rounds):
# 每个 Expert 看到所有其他人的回答
responses = await asyncio.gather(*[
self._expert_rebut(
question=question,
my_answer=answers[i],
other_answers=[a for j, a in enumerate(answers) if j != i],
expert_id=i,
)
for i in range(num_experts)
])
answers = responses
# 最终裁决
verdict = await self._judge(question, answers)
return verdict
async def _expert_answer(self, question: str, expert_id: int) -> str:
return await self._call_llm(
system=f"You are Expert {expert_id}. Give your best answer.",
user=question,
temperature=0.7,
)
async def _expert_rebut(
self,
question: str,
my_answer: str,
other_answers: list[str],
expert_id: int,
) -> str:
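        # other_answers no longer carries the original expert indices, so label
        # peers neutrally instead of reusing "Expert 0..N-1", which could mislabel them
        others_text = "\n---\n".join(
            f"Peer answer {j + 1}: {a}" for j, a in enumerate(other_answers)
        )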
return await self._call_llm(
system=(
f"You are Expert {expert_id}. Review other experts' answers "
"and refine your own. Point out flaws in others' reasoning."
),
user=(
f"Question: {question}\n"
f"Your answer: {my_answer}\n"
f"Other answers:\n{others_text}\n"
"Provide your refined answer."
),
temperature=0.6,
)
async def _judge(self, question: str, answers: list[str]) -> dict:
return await self._call_llm(
system=(
"You are a judge. Synthesize all expert opinions and produce "
"a final verdict. Output JSON with 'verdict' and 'reasoning'."
),
user=f"Question: {question}\n\n"
+ "\n---\n".join(f"Expert {i}: {a}" for i, a in enumerate(answers)),
temperature=0.0,
)
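Topology selection guide
| Topology | Token cost | Quality gain | Best suited for |
|---|---|---|---|
| Manager-Worker | Medium (subtasks can run in parallel) | Medium | Large multi-file tasks with a clear division of labor |
| Generator-Critic | Low (about 2x calls) | High | Code generation, documentation |
| Multi-agent debate | High (N experts × M rounds) | Highest | Security audits, design reviews, key decisions |
| Hierarchical delegation | Medium to high | Medium to high | Large systems that need recursive decomposition |
5. Long-running persistence and durable execution
Most agent loops live no longer than a single session. A server restart, a process crash, a network drop: any interruption means starting over. Durable execution gives the loop the ability to continue after a system restart.
Core idea
Each step's state is written to persistent storage (a database or object store), and the workflow engine can wake the loop up and continue from the interrupted node rather than rerunning the whole task.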
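Before bringing in a workflow engine, the core idea can be shown with nothing but a JSON file. This is a minimal sketch under stated assumptions: `run_steps`, the `crash_after` switch, and the checkpoint layout are invented for the example, and a production system would write atomically or use a database.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

def run_steps(steps: list[str], checkpoint: Path,
              crash_after: Optional[int] = None) -> list[str]:
    """Run steps in order, persisting progress after each one."""
    state = {"next": 0, "done": []}
    if checkpoint.exists():  # resume from the last persisted step
        state = json.loads(checkpoint.read_text())
    for i in range(state["next"], len(steps)):
        if crash_after is not None and i == crash_after:
            raise RuntimeError("simulated crash")
        state["done"].append(f"{steps[i]}:ok")    # the "work" of the step
        state["next"] = i + 1
        checkpoint.write_text(json.dumps(state))  # persist before moving on
    return state["done"]

ckpt = Path(tempfile.mkdtemp()) / "loop_state.json"
steps = ["plan", "edit", "test"]
try:
    run_steps(steps, ckpt, crash_after=2)  # dies before the third step
except RuntimeError:
    pass
resumed = run_steps(steps, ckpt)           # picks up at "test"
print(resumed)
```

The second call does not repeat `plan` or `edit`; it reads the checkpoint and continues at index 2. Workflow engines generalize this by recording every step result in a history and replaying it.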
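Implementation on Temporal.io
Temporal is widely used for durable execution in industry (the source calls it the de facto standard). Each step is an Activity, and the whole loop is a Workflow.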
python
"""
Durable agent loop on Temporal.
After a worker restart, the workflow resumes from its recorded history
instead of starting over.
"""
from dataclasses import dataclass, field
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.common import RetryPolicy

@dataclass
class LoopState:
    """Loop state; it is rebuilt after a restart by replaying the Event History."""
    task: str
    max_steps: int = 20
    messages: list[dict] = field(default_factory=list)  # no shared mutable default

@activity.defn
async def llm_reason(messages: list[dict]) -> dict:
    """Activity: LLM reasoning. Activities can be retried, so expect at-least-once execution."""
    # openai_client is assumed to be an AsyncOpenAI instance created in the worker
    # process; tool schemas are omitted here.
    response = await openai_client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
    )
    # Fields the API does not accept on input may need to be stripped before reuse
    return response.choices[0].message.model_dump(exclude_none=True)

@activity.defn
async def execute_tool(tool_call: dict) -> str:
    """Activity: tool execution, retried according to the retry policy."""
    raise NotImplementedError("tool dispatch is left to the reader")

@activity.defn
async def evaluate_result(results: list[dict]) -> dict:
    """Activity: decide whether the goal is reached. Expected keys: complete, summary."""
    raise NotImplementedError("evaluation is left to the reader")

@workflow.defn
class DurableAgentLoop:
    """
    The agent loop at the Workflow level.
    Temporal records each activity result; after a crash the workflow
    resumes from the last recorded step.
    """
    @workflow.run
    async def run(self, task: str) -> str:
        state = LoopState(task=task, messages=[{"role": "user", "content": task}])
        # Retry policy for failed tool calls; intervals and timeouts are timedeltas
        tool_retry = RetryPolicy(
            initial_interval=timedelta(seconds=1),
            maximum_interval=timedelta(seconds=60),
            maximum_attempts=3,
        )
        for _ in range(state.max_steps):
            # Each activity result is recorded in the Event History
            response = await workflow.execute_activity(
                llm_reason,
                state.messages,
                start_to_close_timeout=timedelta(seconds=30),
            )
            # Record the assistant turn before any tool results that answer it
            state.messages.append(response)
            # Check whether the task is complete
            eval_result = await workflow.execute_activity(
                evaluate_result,
                [{"response": response}],
                start_to_close_timeout=timedelta(seconds=10),
            )
            if eval_result.get("complete"):
                return eval_result["summary"]
            # Execute tool calls (with retries)
            for tc in response.get("tool_calls") or []:
                tool_result = await workflow.execute_activity(
                    execute_tool,
                    tc,
                    retry_policy=tool_retry,
                    start_to_close_timeout=timedelta(seconds=60),
                )
                state.messages.append({
                    "role": "tool",
                    "tool_call_id": tc["id"],
                    "content": tool_result,
                })
        return f"Stopped after {state.max_steps} steps without completing"

# Once deployed, if a Worker process crashes, Temporal Server resumes the
# workflow on another Worker from the point of interruption.
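Key design principles
- State is the source of truth: loop state (messages, step results) must be serialized to persistent storage rather than living only in memory
- Deterministic replay: Workflow code must be deterministic (no direct calls to `random`, `datetime.now()`, and so on), because Temporal restores state by replaying the Event History
- Activity idempotency: an Activity can be retried and may therefore run more than once, so each one should be idempotent. Completed Activities are not re-executed during replay; their recorded results are reused
Alternatives compared
| Engine | Best suited for | Complexity | Highlights |
|---|---|---|---|
| Temporal.io | Production long-running tasks (hours to days) | Medium-high | Replay mechanism, multi-language SDKs, visual management UI |
| Prefect | Data pipelines plus agent loops | Medium | Python-native, strong observability |
| AWS Step Functions | Teams already on AWS infrastructure | Medium | No servers to operate, deep AWS integration |
| Celery + Redis | Lightweight setups, quick prototypes | Low | Simple to deploy, mature Python ecosystem |
6. Self-optimizing loops: DSPy-driven Loop Engineering
This technique pushes Loop Engineering to the meta level: the loop itself becomes the thing being optimized. Each node in the loop exposes learnable prompts and parameters, and a framework such as DSPy optimizes the prompts and few-shot examples automatically from each node's success and failure logs.
DSPy applied to loop optimization
text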
Traditional Loop Engineering:
Human design prompt → Loop run → Human observe failure → Human fix prompt → Loop run → ...
DSPy-driven Loop Engineering:
Human define metrics → Loop run → DSPy auto-collect success/failure logs
↓
Auto-optimize prompts, few-shot, model selection
↓
Loop run (new config) → success rate improved
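DSPy's core mechanism: LLM calls are abstracted as Signatures and Modules, and an optimizer (called a "compiler" in this article) tunes module parameters such as instructions and demonstrations against training data.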
python
"""
DSPy 驱动的自优化 Loop:让循环内的每个决策节点自动进化
"""
import json  # needed by forward(), which parses the plan with json.loads

import dspy
from dspy.teleprompt import BootstrapFewShot
# ============================================
# 定义循环内的可优化模块
# ============================================
class PlanGenerator(dspy.Signature):
"""签名:将任务描述转化为执行计划"""
task = dspy.InputField(desc="The high-level task to complete")
tools = dspy.InputField(desc="Available tools with descriptions")
plan = dspy.OutputField(desc="JSON array of steps with dependencies")
class ToolSelector(dspy.Signature):
"""签名:根据当前状态选择下一步工具"""
current_state = dspy.InputField(desc="Current state and partial results")
available_tools = dspy.InputField(desc="Available tools")
next_tool = dspy.OutputField(desc="Selected tool name")
tool_args = dspy.OutputField(desc="Arguments for the selected tool")
class ResultEvaluator(dspy.Signature):
"""签名:评估工具输出是否满足当前步骤要求"""
step_goal = dspy.InputField(desc="What this step should achieve")
tool_output = dspy.InputField(desc="Output from tool execution")
passed = dspy.OutputField(desc="True/False")
reason = dspy.OutputField(desc="Reason for pass/fail")
# ============================================
# 构建可优化的 DSPy Module
# ============================================
class SelfOptimizingLoop(dspy.Module):
"""
DSPy Module 封装的 Agent Loop。
每个 LLM 调用节点都是可优化的。
"""
def __init__(self):
super().__init__()
# 每个节点都可以被 DSPy 编译器优化
self.planner = dspy.ChainOfThought(PlanGenerator)
self.selector = dspy.ChainOfThought(ToolSelector)
self.evaluator = dspy.ChainOfThought(ResultEvaluator)
def forward(self, task: str, tools: list[str]) -> str:
"""执行一次完整的循环"""
# 节点 1:规划
plan = self.planner(task=task, tools=str(tools))
results = []
for step in json.loads(plan.plan):
# 节点 2:工具选择
selection = self.selector(
current_state=str(results),
available_tools=str(tools),
)
# 执行工具(非 LLM 部分,不可优化)
tool_result = self._execute(selection.next_tool, selection.tool_args)
# 节点 3:结果评估
eval_result = self.evaluator(
step_goal=step["description"],
tool_output=tool_result,
)
results.append({
"step": step,
"result": tool_result,
"passed": eval_result.passed,
"reason": eval_result.reason,
})
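            # `passed` is a text field, and the string "False" is truthy,
            # so compare its content explicitly
            if str(eval_result.passed).strip().lower() != "true":
                break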
return str(results)
def _execute(self, tool: str, args: str) -> str:
"""实际工具执行 ------ 非 LLM 部分,不需要优化"""
# ... 工具执行逻辑
return "tool_result"
# ============================================
# 自动优化流程
# ============================================
def train_loop_with_examples():
"""用历史成功/失败数据训练 Loop"""
# 准备训练数据:成功的任务执行轨迹
trainset = [
dspy.Example(
task="Fix all ESLint errors in src/",
tools='["read_file", "write_file", "run_lint"]',
answer="success", # 期望结果
).with_inputs("task", "tools"),
# ... 更多训练样本
]
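    # BootstrapFewShot: extracts few-shot demos from traces that the metric accepts
    def all_steps_passed(example, pred, trace=None) -> float:
        # The original check (`"passed" in pred`) is always true, because the key
        # name appears in every serialized result; inspect the flag values instead
        import re
        flags = re.findall(r"'passed':\s*'?(true|false)", str(pred).lower())
        return 1.0 if flags and all(f == "true" for f in flags) else 0.0

    optimizer = BootstrapFewShot(
        metric=all_steps_passed,
        max_bootstrapped_demos=4,
    )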
# 编译优化
optimized_loop = optimizer.compile(
SelfOptimizingLoop(),
trainset=trainset,
)
return optimized_loop
# 使用优化后的 Loop
optimized = train_loop_with_examples()
result = optimized(task="Fix all ESLint errors", tools=["read_file", "write_file", "run_lint"])
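To make the "bootstrap" step less abstract, here is a toy version of the idea in plain Python: run the program over the training set and keep only the traces that the metric accepts as few-shot demonstrations. This mimics the concept only and is not DSPy's implementation; `bootstrap_demos`, `toy_program`, and the example data are all hypothetical.

```python
from typing import Callable

def bootstrap_demos(
    program: Callable[[str], str],
    trainset: list[dict],
    metric: Callable[[dict, str], float],
    max_demos: int = 4,
) -> list[dict]:
    """Keep (input, output) pairs from runs that the metric accepts."""
    demos: list[dict] = []
    for example in trainset:
        prediction = program(example["task"])
        if metric(example, prediction) >= 1.0:
            demos.append({"task": example["task"], "output": prediction})
        if len(demos) >= max_demos:
            break
    return demos

# Toy stand-ins: the "program" only succeeds on lint-related tasks.
def toy_program(task: str) -> str:
    return "passed" if "lint" in task else "failed"

def exact_match(example: dict, prediction: str) -> float:
    return 1.0 if prediction == example["expected"] else 0.0

trainset = [
    {"task": "fix lint errors", "expected": "passed"},
    {"task": "migrate database", "expected": "passed"},
    {"task": "lint docs", "expected": "passed"},
]
demos = bootstrap_demos(toy_program, trainset, exact_match)
print([d["task"] for d in demos])
```

The quality of the selected demonstrations is bounded by the metric: a metric that accepts everything turns every trace, including failures, into a demonstration.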
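Observed results
Community-reported figures (2026 Q2; the source cites no benchmark):
- Task success rate rises 15-35% without anyone hand-editing prompts
- Tasks that used to take 5-8 iterations drop to 2-4
- Best suited to workloads that repeat the same kind of task (for example, daily CI repair)
What DSPy can optimize
The dimensions below differ in how much DSPy automates them:
| Dimension | Description | How it is handled |
|---|---|---|
| Instruction wording | How the prompt instructions are phrased | Instruction optimizers such as COPRO and MIPROv2 propose and score candidate instructions |
| Few-shot selection | Which examples go into the context | BootstrapFewShot bootstraps demonstrations from traces that pass the metric; KNNFewShot picks demonstrations by similarity to the input |
| Chain-of-Thought | Whether a reasoning step is included | Chosen by the developer through the module type (`dspy.ChainOfThought` vs `dspy.Predict`), not by the optimizer |
| Model selection | Which model performs best at a given node | Not something the built-in optimizers decide; compare models by rerunning the evaluation with a different LM configured |
| Context length | How much history enters the next round | Not a built-in optimization; trimming has to be implemented in the loop code |
Limitations
- Cost of training data: you need to accumulate enough success and failure traces
- Optimizer overhead: BootstrapFewShot makes many LLM calls while optimizing, so it pays off mainly in stable, recurring scenarios
- Over-optimization risk: optimized prompts can overfit patterns in the training data
7. Declarative loop configuration
Declarative configuration is lowering the engineering barrier of Loop Engineering considerably. The core idea: describe the loop structure and policies in YAML/JSON and let an engine build the loop runtime from that description.
Benefits of configuration-driven loops
- Strategy as configuration: switching loop strategy means swapping a file, not changing code
- Version control: loop configs live in Git and can be diffed, reviewed, and rolled back
- Fast experiments: when A/B testing loop strategies, deployment time drops from hours to minutes
- Lower barrier: non-engineering roles can read and adjust loop behavior
Full configuration example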
yaml
# loop.yaml: declarative loop configuration; switching strategy needs no code change
version: "2.0"
name: "daily-bug-fixer"
# ============ Loop basics ============
loop:
  type: plan-execute          # react | plan-execute | maker-checker | ralph
  max_iterations: 30
  stop_conditions:
    - type: test_pass
      value: "all tests passing"
    - type: max_cost
      value: 5.0              # at most $5
    - type: no_progress
      consecutive_rounds: 3
# ============ Tools ============
tools:
  - name: read_file
    source: builtin
  - name: write_file
    source: builtin
    require_approval: true    # file writes need human approval
  - name: run_tests
    source: mcp
    endpoint: "http://ci-server/mcp/test-runner"
  - name: git_commit
    source: mcp
    endpoint: "http://git-server/mcp"
# ============ Models ============
# Model IDs are examples; check each provider's current model list.
models:
  planner:
    provider: anthropic
    model: claude-sonnet-4-20250514
    temperature: 0.3
    max_tokens: 4096
  executor:
    provider: openai
    model: gpt-4o
    temperature: 0.5
  checker:
    provider: anthropic
    model: claude-haiku-4-20250514
    temperature: 0.0
# ============ Sub-agents ============
sub_agents:
  - name: maker
    role: generator
    model: executor
    tools: [read_file, write_file, run_tests]
  - name: checker
    role: evaluator
    model: checker
    tools: [read_file, run_tests]
    independent: true         # separate context window
# ============ Safety guardrails ============
guardrails:
  pre_action:
    - rule: "block destructive ops on src/"
      action: reject_and_replan
    - rule: "max file write per iteration: 3"
      action: reject_and_replan
  post_action:
    - rule: "check for secrets in output"
      action: redact_and_alert
# ============ Human in the loop ============
# The "on" key is quoted because YAML 1.1 parsers such as PyYAML
# read a bare `on` as the boolean true.
human_in_the_loop:
  triggers:
    - "on": write_file
      paths: ["*.env", "*.key", "config/*"]
      action: await_approval
    - "on": git_push
      action: await_approval
    - "on": cost_exceeded
      threshold: 3.0
      action: notify_and_pause
# ============ Persistence ============
persistence:
  engine: temporal            # temporal | prefect | redis
  state_file: ".loop/state.json"
  checkpoint_every: 5         # checkpoint every 5 steps
# ============ Observability ============
observability:
  tracing: langsmith
  metrics:
    - loop_iteration_count
    - tool_call_success_rate
    - cost_per_iteration
  alerts:
    - condition: "cost > $10/day"
      channel: "#ai-ops"
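Configuration engine implementation
python
"""
Declarative loop configuration engine.
Parses the YAML file and assembles a loop runtime from it.
"""
import yaml

# Assumed to be defined elsewhere in the project (not shown in this section):
#   BaseLoop, ReactLoop, PlanExecuteLoop, MakerCheckerLoop, RalphLoop,
#   BUILTIN_TOOLS (tool name -> callable) and MCPTool (MCP client wrapper).


class LoopConfigEngine:
    """Builds a loop runtime dynamically from a YAML config."""

    STRATEGIES = {
        "react": ReactLoop,
        "plan-execute": PlanExecuteLoop,
        "maker-checker": MakerCheckerLoop,
        "ralph": RalphLoop,
    }

    @classmethod
    def from_yaml(cls, config_path: str) -> "BaseLoop":
        with open(config_path, encoding="utf-8") as f:
            config = yaml.safe_load(f)

        loop_type = config["loop"]["type"]
        if loop_type not in cls.STRATEGIES:
            raise ValueError(f"Unknown loop type: {loop_type!r}")
        strategy_cls = cls.STRATEGIES[loop_type]

        # Assemble the loop from the config
        loop = strategy_cls(
            max_iterations=config["loop"]["max_iterations"],
            stop_conditions=config["loop"]["stop_conditions"],
            tools=cls._init_tools(config["tools"]),
            models=cls._init_models(config["models"]),
            guardrails=config.get("guardrails", {}),
            human_triggers=config.get("human_in_the_loop", {}).get("triggers", []),
        )

        # Register sub-agents, if any are configured
        for sa_config in config.get("sub_agents", []):
            loop.register_sub_agent(
                name=sa_config["name"],
                role=sa_config["role"],
                model=sa_config["model"],
                tools=sa_config["tools"],
                independent=sa_config.get("independent", False),
            )
        return loop

    @classmethod
    def _init_tools(cls, tool_configs: list[dict]) -> dict:
        """Initialize tools: supports builtin tools and MCP endpoints."""
        tools = {}
        for tc in tool_configs:
            if tc["source"] == "builtin":
                tools[tc["name"]] = BUILTIN_TOOLS[tc["name"]]
            elif tc["source"] == "mcp":
                tools[tc["name"]] = MCPTool(tc["endpoint"])
            else:
                raise ValueError(f"Unknown tool source: {tc['source']!r}")
        return tools

    @classmethod
    def _init_models(cls, model_configs: dict[str, dict]) -> dict[str, dict]:
        """Placeholder: pass per-role settings through; build real clients here."""
        return {role: dict(settings) for role, settings in model_configs.items()}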
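
Because sub-agents refer to models and tools by name, a typo in the YAML only surfaces when the loop is already running. A small cross-reference check before assembling the runtime can catch that earlier. The sketch below is an illustrative helper (`validate_loop_config` is a hypothetical name) that assumes the config has the same shape as the `loop.yaml` shown earlier.

```python
def validate_loop_config(config: dict) -> list[str]:
    """Return human-readable problems; an empty list means the references resolve."""
    problems: list[str] = []

    known_types = {"react", "plan-execute", "maker-checker", "ralph"}
    loop_type = config.get("loop", {}).get("type")
    if loop_type not in known_types:
        problems.append(f"unknown loop type: {loop_type!r}")

    tool_names = {tool["name"] for tool in config.get("tools", [])}
    model_roles = set(config.get("models", {}))

    # Every sub-agent must point at a declared model role and declared tools
    for sa in config.get("sub_agents", []):
        if sa["model"] not in model_roles:
            problems.append(
                f"sub-agent {sa['name']!r} uses undefined model {sa['model']!r}"
            )
        for tool in sa.get("tools", []):
            if tool not in tool_names:
                problems.append(
                    f"sub-agent {sa['name']!r} uses undeclared tool {tool!r}"
                )
    return problems


# Example: `write_file` is referenced by the maker but never declared
config = {
    "loop": {"type": "plan-execute"},
    "tools": [{"name": "read_file"}, {"name": "run_tests"}],
    "models": {"executor": {}, "checker": {}},
    "sub_agents": [
        {"name": "maker", "model": "executor", "tools": ["read_file", "write_file"]},
        {"name": "checker", "model": "checker", "tools": ["read_file", "run_tests"]},
    ],
}
for problem in validate_loop_config(config):
    print(problem)
```

Collecting all problems instead of raising on the first one makes the check more useful in CI, where a single run should report every broken reference.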
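Config-driven experiment workflow
bash
# Switch loop strategies quickly for A/B testing
# No Python code changes required
# (the `claude loop` subcommand and its flags are shown as an illustrative runner interface)
# Strategy A: ReAct
claude loop --config loops/react-config.yaml task.md
# Strategy B: Plan-Execute
claude loop --config loops/plan-execute-config.yaml task.md
# Strategy C: Maker-Checker
claude loop --config loops/maker-checker-config.yaml task.md
# Compare success rate, duration, and cost across the three
claude loop compare --configs loops/*.yaml --task task.md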
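
Whatever runner produces the results, the comparison step itself is a small aggregation. The sketch below assumes each run is exported as a record with `strategy`, `success`, `seconds`, and `cost_usd` fields; that record format and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_runs(runs: list[dict]) -> dict[str, dict[str, float]]:
    """Group run records by strategy and compute success rate, time, and cost."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for run in runs:
        grouped[run["strategy"]].append(run)

    summary = {}
    for strategy, items in grouped.items():
        summary[strategy] = {
            "success_rate": mean(1.0 if r["success"] else 0.0 for r in items),
            "avg_seconds": mean(r["seconds"] for r in items),
            "avg_cost_usd": mean(r["cost_usd"] for r in items),
        }
    return summary


# Made-up sample records, only to show the shape of the output
runs = [
    {"strategy": "react", "success": True, "seconds": 120.0, "cost_usd": 0.5},
    {"strategy": "react", "success": False, "seconds": 180.0, "cost_usd": 1.0},
    {"strategy": "plan-execute", "success": True, "seconds": 200.0, "cost_usd": 1.5},
]
for strategy, stats in summarize_runs(runs).items():
    print(strategy, stats)
```

A handful of runs per strategy is rarely enough to separate them; in practice the same task set should be repeated several times per config before drawing conclusions.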
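8. Observability-driven loop breakpoints and human collaboration
A production loop cannot be a black box. With end-to-end tracing that monitors loop state in real time, operators can inject breakpoints and change strategy at runtime without taking the agent offline.
Core capabilities
- Dynamic breakpoint injection: when an agent is about to perform a high-risk operation, attach a human approval step from the admin panel
- Hot strategy changes: adjust loop parameters without taking the agent offline
- Live status panel: visualize execution time, token consumption, and success rate for each node
An observable loop built on LangSmith + Weave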
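python
"""
Observable loop: end-to-end tracing, dynamic breakpoints, and hot config reload.
"""
import json
import logging

import weave
from langsmith import traceable

logger = logging.getLogger("observable_loop")

# Assumed to be provided by the surrounding application (not shown here):
#   HotConfigSource - async client for the config center, exposing `await get()`
#   redis_client    - an async Redis client
#   ws_manager      - WebSocket manager that talks to the admin panel
# Call weave.init("<project>") once at startup so that @weave.op calls are logged.


class ObservableLoop:
    """
    The loop and each step are traced with LangSmith; per-step metrics are
    logged as Weave op calls. Operators can watch live state in the admin
    panel and inject breakpoints.
    """

    def __init__(self, config: dict, loop_id: str):
        self.config = config
        self.loop_id = loop_id
        # Hot-reload support: pull parameters from the config center at runtime
        self.hot_config = HotConfigSource()

    @traceable(run_type="chain", name="agent_loop")
    async def run(self, task: str) -> dict:
        """Main loop; each step appears as a child span of this trace."""
        state = {"task": task, "iteration": 0, "total_cost": 0.0}
        while state["iteration"] < self.config["max_iterations"]:
            # Check for a breakpoint injected by an operator
            bp = await self._check_breakpoint(state)
            if bp:
                approval = await self._request_human_approval(bp)
                if not approval["approved"]:
                    state["status"] = "paused_by_human"
                    return state

            # Execute one step
            step_result = await self._execute_step(state)

            # Record metrics
            state["total_cost"] += step_result["cost"]
            self._record_step(state["iteration"], step_result["cost"])

            # Pick up hot config changes
            new_config = await self.hot_config.get()
            if new_config != self.config:
                logger.info("Hot-reloading config: %s", new_config)
                self.config = new_config

            state["iteration"] += 1
        state["status"] = "max_iterations_reached"
        return state

    @traceable(run_type="tool", name="execute_step")
    async def _execute_step(self, state: dict) -> dict:
        """Single step (LLM call + tool execution) as its own trace span.

        Must return a dict that contains at least a numeric "cost".
        """
        raise NotImplementedError

    @weave.op()
    def _record_step(self, iteration: int, cost: float) -> dict:
        """Logged by Weave as one call per step, with iteration and cost as inputs."""
        return {"iteration": iteration, "cost": cost}

    async def _check_breakpoint(self, state: dict) -> dict | None:
        """Check whether the admin panel has injected a breakpoint for this step."""
        # Breakpoints are assumed to be stored as a JSON list under one Redis key,
        # each entry shaped like {"step": 5, "action": "...", "reason": "..."}.
        raw = await redis_client.get("loop:breakpoints")
        breakpoints = json.loads(raw) if raw else []
        for bp in breakpoints:
            if bp["step"] == state["iteration"]:
                return bp
        return None

    async def _request_human_approval(self, bp: dict) -> dict:
        """Send a human approval request to the admin panel."""
        # Notify the frontend over WebSocket
        await ws_manager.send(
            channel=f"loop:{self.loop_id}",
            message={
                "type": "breakpoint",
                "step": bp["step"],
                "action": bp["action"],
                "reason": bp["reason"],
            },
        )
        # Wait for the human response (with a timeout, in seconds); the manager
        # is assumed to return {"approved": False} when the timeout expires.
        return await ws_manager.wait_for_response(timeout=300)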
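
The `HotConfigSource` used above is not defined in this section. For local testing, a minimal in-memory stand-in with the same `await get()` shape could look like the sketch below; the class name, the `update` method, and the demo loop are all assumptions made for illustration, not the API of any config center.

```python
import asyncio
import copy


class InMemoryHotConfigSource:
    """Hypothetical stand-in for a config-center client (etcd, Consul, ...)."""

    def __init__(self, initial: dict):
        self._config = copy.deepcopy(initial)
        self._lock = asyncio.Lock()

    async def get(self) -> dict:
        # Return a copy so callers cannot mutate the shared config by accident
        async with self._lock:
            return copy.deepcopy(self._config)

    async def update(self, **changes) -> None:
        async with self._lock:
            self._config.update(changes)


async def demo() -> list[int]:
    """Run a toy loop whose iteration limit is raised while it is running."""
    source = InMemoryHotConfigSource({"max_iterations": 3})
    config = await source.get()
    seen = []
    iteration = 0
    while iteration < config["max_iterations"]:
        seen.append(config["max_iterations"])
        if iteration == 1:
            # Simulates an operator raising the limit from the admin panel
            await source.update(max_iterations=5)
        new_config = await source.get()
        if new_config != config:
            config = new_config
        iteration += 1
    return seen


print(asyncio.run(demo()))  # [3, 3, 5, 5, 5]
```

The loop re-reads the limit on every pass, which is what lets a change made mid-run take effect without a restart.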
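Admin panel capability checklist
| Capability | Description | Implementation |
|---|---|---|
| Live flame graph | Share of time spent in each node | LangSmith Trace View |
| Dynamic breakpoints | Pause at the next step and wait for human confirmation | WebSocket + breakpoint config in Redis |
| Hot parameter changes | Change max_iterations / temperature | Config center (etcd / Consul) with live push |
| Cost dashboard | Live token consumption + budget alerts | Weave / LangSmith dashboards |
| Replay and debugging | Step through historical trajectories | Temporal Replay / LangSmith Playground |
| Strategy A/B | Run several strategy versions side by side | Traffic split based on trace tags |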
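
For the tag-based A/B split in the last row, each run needs a stable variant assignment so that retries and resumed runs stay in the same bucket. One common way to get that is to hash the run ID. The sketch below is an assumption about how such a split might be done; `assign_variant` is a hypothetical helper, and the returned name is what would be attached to the trace as a tag.

```python
import hashlib


def assign_variant(run_id: str, variants: dict[str, float]) -> str:
    """Deterministically map a run ID to a variant according to traffic weights."""
    total = sum(variants.values())
    if not variants or total <= 0:
        raise ValueError("variants must contain at least one positive weight")

    digest = hashlib.sha256(run_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1)
    point = int.from_bytes(digest[:8], "big") / 2**64

    cumulative = 0.0
    for name, weight in variants.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    # Only reachable through floating-point rounding on the last bucket
    return list(variants)[-1]


weights = {"react": 0.5, "plan-execute": 0.5}
variant = assign_variant("run-2026-001", weights)
print(variant)  # use as a trace tag, e.g. f"strategy:{variant}"
```

Hashing instead of calling a random number generator means the assignment needs no shared state: any worker that sees the same run ID picks the same variant.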
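9. Safety guardrail sub-loop
In heavily regulated industries such as finance and healthcare, every agent action must pass a safety review. The safety guardrail sub-loop embeds a lightweight check before and after each action, forming a two-layer defense: pre-action review, then post-action audit.
Architecture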
text
┌──────────────────────────────────────────────┐
│               Main Agent Loop                │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │         Pre-Action Safety Loop         │  │
│  │  Check: permission? budget?            │  │
│  │         compliance?                    │  │
│  │                  ↓                     │  │
│  │  PASS → Execute          FAIL          │  │
│  │           ↓                ↓           │  │
│  │         Action       Request replan    │  │
│  └────────────────────────────────────────┘  │
│                      ↓                       │
│  ┌────────────────────────────────────────┐  │
│  │         Post-Action Audit Loop         │  │
│  │  Check: sensitive info leak?           │  │
│  │         compliance?                    │  │
│  │                  ↓                     │  │
│  │  PASS → Continue         FAIL          │  │
│  │           ↓                ↓           │  │
│  │       Next Step       Block + Alert    │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘
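Implementation
python
"""
Safety guardrail sub-loop: two layers of safety checks around every action.
"""
import re
import time
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from enum import Enum


class SafetyVerdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REQUIRE_REPLAN = "require_replan"
    REDACT = "redact"


@dataclass
class SafetyRule:
    name: str
    description: str
    severity: str  # critical | high | medium
    check_fn: Callable[..., Awaitable[SafetyVerdict]]


# Patterns shared by the scanners and the redactor
SECRET_PATTERNS = [
    r"[A-Za-z0-9+/]{20,}={1,2}",  # Base64-looking token with padding
    r"sk-[A-Za-z0-9]{32,}",       # OpenAI-style API key
    r"ghp_[A-Za-z0-9]{36}",       # GitHub token
    r"AKIA[0-9A-Z]{16}",          # AWS access key ID
    r"password\s*[:=]\s*\S+",     # password in plaintext
]
PII_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",     # SSN
    r"\b\d{16}\b",                # credit card
    r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",  # email
]


class SafetyGuardLoop:
    """
    Safety guardrail sub-loop:
    - before the action: permission, budget, and compliance checks
    - after the action: sensitive-data and output review
    """

    CRITICAL_PATHS = ["/etc/", "C:\\Windows\\", ".git/", "production/"]
    MAX_OUTPUT_CHARS = 100_000  # illustrative limit

    def __init__(self):
        self.pre_rules: list[SafetyRule] = [
            SafetyRule(
                name="block_destructive",
                description="Block destructive operations on critical paths",
                severity="critical",
                check_fn=self._check_destructive,
            ),
            SafetyRule(
                name="budget_limit",
                description="Check if action would exceed budget",
                severity="high",
                check_fn=self._check_budget,
            ),
            SafetyRule(
                name="permission_check",
                description="Verify agent has permission for this action",
                severity="critical",
                check_fn=self._check_permission,
            ),
            SafetyRule(
                name="rate_limit",
                description="Check API rate limits",
                severity="high",
                check_fn=self._check_rate_limit,
            ),
        ]
        self.post_rules: list[SafetyRule] = [
            SafetyRule(
                name="no_secrets",
                description="Scan output for secrets (API keys, tokens, passwords)",
                severity="critical",
                check_fn=self._scan_for_secrets,
            ),
            SafetyRule(
                name="pii_detection",
                description="Check for personally identifiable information",
                severity="critical",
                check_fn=self._check_pii,
            ),
            SafetyRule(
                name="output_size",
                description="Verify output size is within limits",
                severity="medium",
                check_fn=self._check_output_size,
            ),
            SafetyRule(
                name="compliance",
                description="Verify output against compliance policies",
                severity="high",
                check_fn=self._check_compliance,
            ),
        ]
        # Safety event log
        self.safety_log: list[dict] = []

    async def pre_action_check(
        self,
        action: dict,
        agent_state: dict,
    ) -> SafetyVerdict:
        """Pre-action safety check loop."""
        for rule in self.pre_rules:
            verdict = await rule.check_fn(action, agent_state)
            self.safety_log.append({
                "phase": "pre",
                "action": action["name"],
                "rule": rule.name,
                "verdict": verdict.value,
                "timestamp": time.time(),
            })
            if verdict == SafetyVerdict.BLOCK:
                # Block and alert
                await self._alert(
                    f"BLOCKED: {action['name']} by rule {rule.name}",
                    severity=rule.severity,
                )
                return SafetyVerdict.BLOCK
            if verdict == SafetyVerdict.REQUIRE_REPLAN:
                # Ask the agent to replan
                return SafetyVerdict.REQUIRE_REPLAN
        return SafetyVerdict.ALLOW

    async def post_action_audit(
        self,
        action: dict,
        output: str,
    ) -> tuple[SafetyVerdict, str]:
        """Post-action audit loop.

        Returns the verdict together with the output that is safe to use:
        the redacted text on REDACT, an empty string on BLOCK.
        """
        redacted = False
        for rule in self.post_rules:
            verdict = await rule.check_fn(output)
            self.safety_log.append({
                "phase": "post",
                "action": action["name"],
                "rule": rule.name,
                "verdict": verdict.value,
                "timestamp": time.time(),
            })
            if verdict == SafetyVerdict.BLOCK:
                await self._alert(
                    f"POST-BLOCKED: output of {action['name']} "
                    f"flagged by {rule.name}",
                    severity=rule.severity,
                )
                return SafetyVerdict.BLOCK, ""
            if verdict == SafetyVerdict.REDACT:
                # Redact automatically, then keep auditing the redacted text
                output = await self._redact(output, rule.name)
                redacted = True
        return (SafetyVerdict.REDACT if redacted else SafetyVerdict.ALLOW), output

    # --- Rule implementations ---

    async def _check_destructive(
        self, action: dict, state: dict
    ) -> SafetyVerdict:
        """Block destructive operations on critical paths."""
        if action["name"] in ("delete", "rm", "drop"):
            target = action.get("args", {}).get("path", "")
            for cp in self.CRITICAL_PATHS:
                if cp in target:
                    return SafetyVerdict.BLOCK
        return SafetyVerdict.ALLOW

    async def _check_budget(
        self, action: dict, state: dict
    ) -> SafetyVerdict:
        """Budget check."""
        estimated_cost = self._estimate_cost(action)
        current_spend = state.get("total_cost", 0)
        budget = state.get("budget", float("inf"))
        if current_spend + estimated_cost > budget:
            return SafetyVerdict.BLOCK
        return SafetyVerdict.ALLOW

    def _estimate_cost(self, action: dict) -> float:
        """Placeholder: read a cost estimate attached to the action, if any."""
        return float(action.get("estimated_cost", 0.0))

    async def _check_permission(
        self, action: dict, state: dict
    ) -> SafetyVerdict:
        """Placeholder: enforce an allow-list when the state carries one."""
        allowed = state.get("allowed_actions")
        if allowed is not None and action["name"] not in allowed:
            return SafetyVerdict.REQUIRE_REPLAN
        return SafetyVerdict.ALLOW

    async def _check_rate_limit(
        self, action: dict, state: dict
    ) -> SafetyVerdict:
        """Placeholder: plug in a real rate limiter here."""
        return SafetyVerdict.ALLOW

    async def _scan_for_secrets(self, output: str) -> SafetyVerdict:
        """Scan the output for secrets."""
        for pattern in SECRET_PATTERNS:
            if re.search(pattern, output):
                return SafetyVerdict.REDACT
        return SafetyVerdict.ALLOW

    async def _check_pii(self, output: str) -> SafetyVerdict:
        """PII detection."""
        for pattern in PII_PATTERNS:
            if re.search(pattern, output):
                return SafetyVerdict.REDACT
        return SafetyVerdict.ALLOW

    async def _check_output_size(self, output: str) -> SafetyVerdict:
        """Block outputs larger than the configured limit."""
        if len(output) > self.MAX_OUTPUT_CHARS:
            return SafetyVerdict.BLOCK
        return SafetyVerdict.ALLOW

    async def _check_compliance(self, output: str) -> SafetyVerdict:
        """Placeholder: plug in organization-specific compliance policies here."""
        return SafetyVerdict.ALLOW

    async def _alert(self, message: str, severity: str = "high"):
        """Send a safety alert."""
        # Push to Slack / PagerDuty / the security operations center
        pass

    async def _redact(self, output: str, rule_name: str) -> str:
        """Automatic redaction: replace sensitive matches with [REDACTED]."""
        patterns = PII_PATTERNS if rule_name == "pii_detection" else SECRET_PATTERNS
        for pattern in patterns:
            output = re.sub(pattern, "[REDACTED]", output)
        return output
Integrating into the main loop
python
class SafeAgentLoop:
    """Agent loop with the safety guardrail sub-loop embedded."""

    MAX_ITERATIONS = 50

    def __init__(self):
        self.guard = SafetyGuardLoop()

    async def run(self, task: str) -> dict:
        state = {"task": task, "iteration": 0, "total_cost": 0.0, "messages": []}
        for iteration in range(self.MAX_ITERATIONS):
            state["iteration"] = iteration
            # Agent reasoning + planning
            action = await self._plan_next_action(state)

            # Pre-action safety check
            pre_verdict = await self.guard.pre_action_check(action, state)
            if pre_verdict in (SafetyVerdict.BLOCK, SafetyVerdict.REQUIRE_REPLAN):
                if pre_verdict == SafetyVerdict.BLOCK:
                    # The guard has already logged the event and sent an alert
                    state["blocked"] = True
                # Inject the rejection into the context so the next plan differs
                state["messages"].append({
                    "role": "user",
                    "content": f"Action {action['name']} rejected. "
                               "Plan a different approach.",
                })
                continue

            # Execute the action
            output = await self._execute_action(action)
            state["total_cost"] += float(action.get("estimated_cost", 0.0))

            # Post-action audit
            post_verdict, safe_output = await self.guard.post_action_audit(
                action, output
            )
            if post_verdict == SafetyVerdict.BLOCK:
                # Output blocked: the guard has alerted; stop the loop
                state["status"] = "blocked_by_audit"
                break

            # Update state with the (possibly redacted) output
            state["messages"].append({
                "role": "tool",
                "content": safe_output,
            })
        return state

    async def _plan_next_action(self, state: dict) -> dict:
        """Ask the model for the next action; left to the concrete agent."""
        raise NotImplementedError

    async def _execute_action(self, action: dict) -> str:
        """Run the tool call and return its output; left to the concrete agent."""
        raise NotImplementedError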
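
The credit-card pattern in `_check_pii` (`\b\d{16}\b`) flags any 16-digit number, including order IDs and timestamps. A common way to cut those false positives is to keep only candidates that pass the Luhn checksum used by payment card numbers. The helper names below are hypothetical, and the sketch still covers only contiguous 16-digit numbers, as the original pattern does.

```python
import re

CARD_CANDIDATE = re.compile(r"\b\d{16}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    checksum = 0
    # Walk from the rightmost digit; double every second one
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        checksum += digit
    return checksum % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return 16-digit substrings that also pass the Luhn check."""
    return [m for m in CARD_CANDIDATE.findall(text) if luhn_valid(m)]


sample = "order 1234567812345678 paid with 4111111111111111"
print(find_card_numbers(sample))  # ['4111111111111111']
```

`4111111111111111` is a widely used test card number, which is why it passes the checksum here while the order ID does not.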
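Industry rollout checklist
| Industry | Required guardrails | Optional guardrails |
|---|---|---|
| Finance | Operation permission matrix, transaction amount caps, compliance audit logs | PII detection, anti-money-laundering rules |
| Healthcare | HIPAA compliance, PHI de-identification, audit trail | Prescription dosage validation, drug interaction checks |
| SaaS | API key scanning, user data isolation, rate limiting | Content safety review, GDPR compliance |
| Infrastructure | Confirmation for destructive operations, change-window checks | Capacity checks, rollback plan verification |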
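
As a concrete reading of the finance row, an operation permission matrix combined with a transaction amount cap can be expressed as a small lookup that runs before the action. The roles, action names, and limits below are invented for illustration; a real deployment would load them from its own policy source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    allowed_actions: frozenset[str]
    max_amount: float


# Hypothetical role -> policy matrix
PERMISSION_MATRIX = {
    "analyst": Policy(frozenset({"read_ledger"}), 0.0),
    "trader": Policy(frozenset({"read_ledger", "place_order"}), 10_000.0),
}


def check_action(role: str, action: str, amount: float = 0.0) -> tuple[bool, str]:
    """Return (allowed, reason) for a role attempting an action of a given amount."""
    policy = PERMISSION_MATRIX.get(role)
    if policy is None:
        return False, f"unknown role: {role}"
    if action not in policy.allowed_actions:
        return False, f"{role} may not perform {action}"
    if amount > policy.max_amount:
        return False, f"amount {amount} exceeds limit {policy.max_amount}"
    return True, "ok"


print(check_action("trader", "place_order", 5_000.0))
print(check_action("analyst", "place_order", 100.0))
print(check_action("trader", "place_order", 20_000.0))
```

Returning the reason alongside the decision gives the audit log something to record and gives the agent a concrete message to replan against.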
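Summary: Loop Engineering technology selection matrix for 2026
| Dimension | Recommended approach | Why |
|---|---|---|
| Control-flow complexity | Lightweight graph + code nodes | Visible topology, debuggable logic, switchable strategies |
| Task complexity | Two-level loop + dynamic replanning | Success rate on complex tasks up 25%, wasted calls down 50% |
| Latency sensitivity | Streaming loop + async event bus | End-to-end latency drops from 9 s to 3 s |
| Quality requirements | Generator-Critic or multi-agent debate | Marked quality gains with controllable token cost |
| Reliability requirements | Temporal.io durable execution | Automatic recovery after a server crash, no lost state |
| Continuous improvement | DSPy self-optimizing loop | No manual prompt edits; automatic gains of 15-35% |
| Engineering barrier | Declarative configuration | Strategy as config: versionable in Git, diffable, experiment-friendly |
| Operational control | Observable loop + dynamic breakpoints | Inject approvals at runtime, tune parameters without downtime |
| Security and compliance | Safety guardrail sub-loop | Two layers of defense around every action; mandatory in regulated industries |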
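By 2026, Loop Engineering is no longer a conceptual discussion but a field of practice with clear engineering solutions. Which route to take depends on your task type, team capabilities, and compliance requirements. The nine practices can be combined: for example, use Temporal as the durable-execution foundation, build a two-level loop on top of it, drive execution with streaming events, and wrap every action in safety guardrails.
This does not make the work simpler; it makes it deeper. The judgment to design loops is your antidote; the inertia of avoiding thought is your trap. The same action, opposite outcomes.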