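Especially once your Agent can "execute code" (the Tester sandbox), and may later gain permission to "modify files" or "operate on databases", we must never let it run completely unchecked. Before any dangerous action, it has to hand control back to a human.
In LangGraph this is elegant to implement. We do not need to tear down the existing graph structure; we only need two core tools: interrupt_before (breakpoint interception) and update_state (state editing).
Let's perform a bit of "minimally invasive surgery" on your backend and frontend, in two steps!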
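Step 1: Backend tweak (hit the brakes before the sandbox)
Open your test6.py file and scroll to the very bottom, where the graph is compiled (compile).
We only need to add one line that tells the graph engine: "Every time you are about to enter the tester node, you must stop first!"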
# test6.py, around line 220: change the graph compile arguments
memory = MemorySaver()
graph = builder.compile(
    checkpointer=memory,
    interrupt_before=["tester"],  # ✨ [Key addition]: suspend execution before entering tester
)
print(">>> Skills Creator agent engine started!\n")
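That's all there is to it! The backend change is complete. From now on, whenever execution is about to flow along the coder -> tester edge, the graph engine suspends itself automatically. The paused state is saved in the checkpointer, which is why the checkpointer argument is required for this to work.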
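Step 2: Frontend overhaul (render the approval console)
When the graph is suspended, the Streamlit frontend has to detect this "paused state" and render the approval buttons.
This requires restructuring app.py. Because Streamlit reruns the whole script on every interaction, we must hide the requirement input while the graph is paused, so the user has to handle the approval first.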
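The app.py below is organized around a phase value kept in session state (idle, running, awaiting_approval, resuming, done). As a reading aid, the transitions between those phases can be sketched as a small lookup table. The event names (submit, approve, and so on) and the helpers next_phase and input_locked are hypothetical labels for this sketch; app.py itself sets the phase directly inside each branch.

```python
# (current phase, event) -> next phase.
# Event names are made-up labels for what happens in each branch of app.py.
TRANSITIONS = {
    ("idle", "submit"): "running",
    ("running", "paused_before_tester"): "awaiting_approval",
    ("running", "finished"): "done",
    ("running", "error"): "idle",
    ("awaiting_approval", "approve"): "resuming",
    ("awaiting_approval", "abandon"): "idle",
    ("resuming", "paused_before_tester"): "awaiting_approval",
    ("resuming", "finished"): "done",
    ("resuming", "error"): "idle",
    ("done", "new_task"): "idle",
}


def next_phase(phase, event):
    """Return the phase that follows `phase` when `event` happens."""
    try:
        return TRANSITIONS[(phase, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in phase {phase!r}") from None


def input_locked(phase):
    """The requirement form is only shown in the idle phase."""
    return phase != "idle"


# Walk through one task where the first review is approved, the test fails,
# the AI repairs the code, and the second review is approved as well.
phase = "idle"
trail = [phase]
for event in ["submit", "paused_before_tester", "approve",
              "paused_before_tester", "approve", "finished", "new_task"]:
    phase = next_phase(phase, event)
    trail.append(phase)
print(" -> ".join(trail))
```

Note that resuming can lead back to awaiting_approval: every repaired version of the code passes through human review again before it reaches the sandbox.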
Replace your app.py with the following structure:
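import sys
import os
import uuid

import streamlit as st

# Make sure modules that live next to this file (the graph module) can be imported
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

st.set_page_config(page_title="Skills Creator", page_icon="🤖", layout="wide")
st.title("🤖 Skills Creator: AI Code Generation Engine")
st.caption("Enter your requirement. The AI will plan and write the code, then run sandbox tests once you have reviewed it.")


@st.cache_resource(show_spinner="Starting the engine, please wait...")
def load_graph():
    # Import from the module that compiles the graph with interrupt_before=["tester"].
    # Step 1 edits test6.py; change the module name here if your file is named differently.
    from test7 import graph
    return graph


graph = load_graph()
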
# ──────────────────────────────────────────
# Session state initialization
# phase: idle | running | awaiting_approval | resuming | done
# ──────────────────────────────────────────
defaults = {
    "phase": "idle",
    "thread_config": None,
    "logs": [],
    "pending_code": "",
    "pending_test_code": "",
    "final_code": "",
    "final_test_code": "",
    "final_iter": 0,
    "final_success": False,
    "history": [],
    "user_req": "",
    "review_round": 0,
}
for k, v in defaults.items():
    if k not in st.session_state:
        st.session_state[k] = v

# ──────────────────────────────────────────
# Sidebar: task history
# ──────────────────────────────────────────
with st.sidebar:
    st.header("Task history")
    if st.session_state.history:
        for i, item in enumerate(reversed(st.session_state.history)):
            label = item['req'][:28] + "..." if len(item['req']) > 28 else item['req']
            with st.expander(f"Task {len(st.session_state.history) - i}: {label}", expanded=False):
                st.markdown(f"**Status**: {'✅ Success' if item['success'] else '❌ Abandoned/failed'}")
                st.markdown(f"**Iterations**: {item.get('iterations', '-')}")
                if item.get("code"):
                    st.code(item["code"], language="python")
    else:
        st.info("No tasks yet")

phase = st.session_state.phase

NODE_LABELS = {
    "planner": "📐 Planning (Planner)",
    "coder": "💻 Writing/fixing code (Coder)",
    "tools": "🔍 Searching references (Tools)",
    "tester": "🧪 Sandbox testing (Tester)",
}


def render_logs():
    # Show everything logged so far for the current task
    if st.session_state.logs:
        with st.expander("Execution log", expanded=True):
            st.markdown("\n".join(st.session_state.logs))

# ══════════════════════════════════════════
# IDLE: show the input form
# ══════════════════════════════════════════
if phase == "idle":
    col_input, col_guide = st.columns([1, 1], gap="large")
    with col_input:
        st.subheader("Requirement")
        user_req = st.text_area(
            "Describe the Python functionality you want",
            height=180,
            placeholder="Example: write a function that checks whether a string is a palindrome, ignoring case and spaces, with at least 3 test cases.",
        )
        if st.button("🚀 Generate", type="primary", disabled=not user_req.strip(), use_container_width=True):
            st.session_state.phase = "running"
            st.session_state.user_req = user_req.strip()
            st.session_state.logs = []
            st.session_state.review_round = 0
            # A fresh thread_id gives every task its own checkpoint history
            st.session_state.thread_config = {"configurable": {"thread_id": f"task_{uuid.uuid4().hex[:8]}"}}
            st.rerun()
    with col_guide:
        st.subheader("Workflow")
        st.markdown("""
| Stage | Description |
|------|------|
| 📐 Planner | The AI breaks the requirement into development steps |
| 💻 Coder | The AI writes the business code and the test code |
| ⏸️ **Human review** | **You inspect the code and may edit it before approving** |
| 🧪 Tester | Runs the tests in the sandbox; on failure, loops back for a fix |
""")

# ══════════════════════════════════════════
# RUNNING: execute the graph up to the interrupt point
# ══════════════════════════════════════════
elif phase == "running":
    st.info("⏳ The AI is planning and writing code, please wait...")
    log_box = st.empty()
    initial_input = {
        "user_requirement": st.session_state.user_req,
        "iteration_count": 0,
        "execution_logs": [],
    }
    logs = st.session_state.logs
    try:
        for event in graph.stream(initial_input, config=st.session_state.thread_config, stream_mode="updates"):
            for node_name, node_output in event.items():
                if node_name.startswith("__"):
                    # Skip engine markers (some LangGraph versions emit "__interrupt__" when pausing)
                    continue
                label = NODE_LABELS.get(node_name, node_name)
                logs.append(f"**✓** {label}")
                if node_name == "planner" and "plan" in node_output:
                    for j, s in enumerate(node_output["plan"]):
                        logs.append(f"  - Step {j + 1}: {s}")
                elif node_name == "coder" and node_output.get("current_code"):
                    logs.append("  - Code generated, awaiting review")
            log_box.markdown("\n".join(logs))
        # Check whether execution was interrupted before tester
        snapshot = graph.get_state(st.session_state.thread_config)
        if snapshot.next and "tester" in snapshot.next:
            vals = snapshot.values
            st.session_state.pending_code = vals.get("current_code", "")
            st.session_state.pending_test_code = vals.get("current_test_code", "")
            st.session_state.review_round += 1
            st.session_state.phase = "awaiting_approval"
        else:
            # The graph unexpectedly ran to completion without pausing
            vals = snapshot.values
            st.session_state.final_code = vals.get("current_code", "")
            st.session_state.final_test_code = vals.get("current_test_code", "")
            st.session_state.final_iter = vals.get("iteration_count", 0)
            st.session_state.final_success = vals.get("error_message") is None
            st.session_state.phase = "done"
    except Exception as e:
        st.error(f"Execution failed: {e}")
        st.session_state.phase = "idle"
        # Stop here so the error stays on screen (an immediate rerun would wipe it);
        # clicking the button triggers a rerun, which lands in the idle phase.
        st.button("Back to the input form")
        st.stop()
    st.rerun()

# ══════════════════════════════════════════
# AWAITING_APPROVAL: human code review
# ══════════════════════════════════════════
elif phase == "awaiting_approval":
    round_num = st.session_state.review_round
    st.warning(f"⏸️ Review round {round_num}: the AI has finished writing the code. Inspect it, then decide whether to run the sandbox tests.")
    render_logs()
    st.markdown("---")
    st.subheader("Code review (editable)")
    col_code, col_test = st.columns([1, 1], gap="large")
    with col_code:
        st.markdown("**Business code**")
        edited_code = st.text_area(
            "Business code",
            value=st.session_state.pending_code,
            height=400,
            label_visibility="collapsed",
            key=f"code_editor_{round_num}",  # a new key per round resets the editor content
        )
    with col_test:
        st.markdown("**Test code**")
        edited_test = st.text_area(
            "Test code",
            value=st.session_state.pending_test_code,
            height=400,
            label_visibility="collapsed",
            key=f"test_editor_{round_num}",
        )
    st.markdown("---")
    col_approve, col_reject = st.columns([1, 1])
    with col_approve:
        if st.button("✅ Approve and run sandbox tests", type="primary", use_container_width=True):
            # Write the (possibly edited) code back into the graph state
            graph.update_state(st.session_state.thread_config, {
                "current_code": edited_code,
                "current_test_code": edited_test,
            })
            st.session_state.logs.append(f"\n**👤 Human review approved (round {round_num})**, starting sandbox tests...")
            st.session_state.phase = "resuming"
            st.rerun()
    with col_reject:
        if st.button("❌ Abandon this generation", use_container_width=True):
            st.session_state.history.append({
                "req": st.session_state.user_req,
                "code": st.session_state.pending_code,
                "test_code": st.session_state.pending_test_code,
                "iterations": round_num,
                "success": False,
            })
            st.session_state.phase = "idle"
            st.rerun()

# ══════════════════════════════════════════
# RESUMING: continue execution after approval
# ══════════════════════════════════════════
elif phase == "resuming":
    st.info("🧪 Running sandbox tests, please wait...")
    log_box = st.empty()
    logs = st.session_state.logs
    try:
        # Passing None as the input resumes the paused thread instead of starting a new run
        for event in graph.stream(None, config=st.session_state.thread_config, stream_mode="updates"):
            for node_name, node_output in event.items():
                if node_name.startswith("__"):
                    # Skip engine markers (some LangGraph versions emit "__interrupt__" when pausing)
                    continue
                label = NODE_LABELS.get(node_name, node_name)
                logs.append(f"**✓** {label}")
                if node_name == "tester":
                    err = node_output.get("error_message")
                    if err is None:
                        logs.append("  - ✅ Tests passed!")
                    else:
                        logs.append("  - ❌ Tests failed, the AI is analyzing the error...")
            log_box.markdown("\n".join(logs))
        # Check the state after execution: did we pause before tester again (repair loop)?
        snapshot = graph.get_state(st.session_state.thread_config)
        if snapshot.next and "tester" in snapshot.next:
            # The AI repaired the code, so it needs another human review
            vals = snapshot.values
            st.session_state.pending_code = vals.get("current_code", "")
            st.session_state.pending_test_code = vals.get("current_test_code", "")
            st.session_state.review_round += 1
            st.session_state.logs.append(f"\n**🔄 The AI has repaired the code, entering review round {st.session_state.review_round}**")
            st.session_state.phase = "awaiting_approval"
        else:
            # The run has really finished
            vals = snapshot.values
            st.session_state.final_code = vals.get("current_code", "")
            st.session_state.final_test_code = vals.get("current_test_code", "")
            st.session_state.final_iter = vals.get("iteration_count", 0)
            st.session_state.final_success = vals.get("error_message") is None
            st.session_state.phase = "done"
    except Exception as e:
        st.error(f"Test execution failed: {e}")
        st.session_state.phase = "idle"
        # Stop here so the error stays on screen (an immediate rerun would wipe it);
        # clicking the button triggers a rerun, which lands in the idle phase.
        st.button("Back to the input form")
        st.stop()
    st.rerun()

# ══════════════════════════════════════════
# DONE: show the final result
# ══════════════════════════════════════════
elif phase == "done":
    if st.session_state.final_success:
        st.success("✅ All tests passed, the code is ready for delivery!")
    else:
        st.warning("⚠️ Maximum number of iterations reached. The latest version of the code is shown below.")
    render_logs()
    st.markdown("---")
    tab1, tab2 = st.tabs(["📄 Business code", "🧪 Test code"])
    with tab1:
        st.code(st.session_state.final_code or "(not generated)", language="python")
    with tab2:
        st.code(st.session_state.final_test_code or "(not generated)", language="python")
    st.markdown(f"**Total iterations**: {st.session_state.final_iter}")
    st.markdown("---")
    if st.button("🔄 Start a new task", type="primary"):
        # Archive the finished task in the sidebar history before resetting
        st.session_state.history.append({
            "req": st.session_state.user_req,
            "code": st.session_state.final_code,
            "test_code": st.session_state.final_test_code,
            "iterations": st.session_state.final_iter,
            "success": st.session_state.final_success,
        })
        st.session_state.phase = "idle"
        st.rerun()
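The running and resuming branches of app.py end with the same decision: look at the state snapshot, and either go to review (the graph paused before tester) or to the result page (the graph finished). One way to keep the two branches in sync is to factor that decision into a pure function. The sketch below is an optional refactoring, not part of the original listing; phase_after_run is a hypothetical helper name, while the state keys are the ones app.py already reads.

```python
def phase_after_run(next_nodes, values):
    """Decide the next UI phase from a snapshot taken after graph.stream() returns.

    next_nodes: the snapshot's `next` tuple (nodes waiting to run).
    values:     the snapshot's `values` dict (current graph state).
    Returns (phase, session_updates).
    """
    if next_nodes and "tester" in next_nodes:
        # Paused before tester: hand the code to the human reviewer
        return "awaiting_approval", {
            "pending_code": values.get("current_code", ""),
            "pending_test_code": values.get("current_test_code", ""),
        }
    # Nothing left to run: collect the final result
    return "done", {
        "final_code": values.get("current_code", ""),
        "final_test_code": values.get("current_test_code", ""),
        "final_iter": values.get("iteration_count", 0),
        "final_success": values.get("error_message") is None,
    }


# Example with plain data standing in for a real snapshot
paused_phase, paused_updates = phase_after_run(
    ("tester",),
    {"current_code": "def add(a, b): return a + b", "current_test_code": "assert add(1, 2) == 3"},
)
done_phase, done_updates = phase_after_run(
    (),
    {"current_code": "def add(a, b): return a + b", "iteration_count": 2, "error_message": None},
)
print(paused_phase, done_phase)
```

In app.py this would be called as phase_after_run(snapshot.next, snapshot.values), after which each returned key is copied into st.session_state. Incrementing review_round and appending the log line would remain in the calling branch, since those differ between running and resuming.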
🧠 Architecture notes (the tricky part explained)
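There is one more piece of LangGraph architecture-level "black magic" worth understanding. It lives in the logic of a **"Send back for rewrite"** button, which the app.py listing above does not include (it only offers approve and abandon):
- Pain point: our graph is wired Coder -> Tester. We now intercept between the two, so if the code is not acceptable, how do we send it back to Coder? Adding edges to the graph for this would make it bloated.
- Borrowing existing machinery (the as_node mechanism):
  - Recall the route_after_test branching logic you wrote in test6.py: once the Tester node has run and error_message is non-empty, control is sent back to Coder.
  - So the frontend calls graph.update_state(..., as_node="tester").
  - We never run the real tester function! Acting as a "human referee", we impersonate the tester node and write a hand-made error message into the State (for example, "[Human review rejected]: you forgot an import").
  - When the graph engine wakes up, it believes tester has just run and reported an error, so it follows the original error route unchanged and delivers the ticket, with your comments attached, back to the Coder node.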
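A minimal sketch of that send-back step follows. The call graph.update_state(config, values, as_node=...) is the one described above; everything else is an assumption made for illustration: the helper name send_back_to_coder, the message prefix, and RecordingGraph, which is only a stand-in so the snippet runs without LangGraph installed. It also assumes route_after_test looks only at error_message; if your router checks iteration_count as well, the update may need to account for that.

```python
def send_back_to_coder(graph, config, feedback):
    """Write a human-made 'test failure' into the paused thread, posing as the tester node."""
    update = {"error_message": f"[Human review rejected]: {feedback}"}
    # as_node="tester" makes the engine treat this write as tester's output,
    # so the routing that normally follows tester decides which node runs next.
    graph.update_state(config, update, as_node="tester")
    return update


class RecordingGraph:
    """Stand-in for the compiled graph; it only records the call (illustration only)."""

    def __init__(self):
        self.calls = []

    def update_state(self, config, values, as_node=None):
        self.calls.append((config, values, as_node))


demo_graph = RecordingGraph()
demo_config = {"configurable": {"thread_id": "task_demo"}}
sent = send_back_to_coder(demo_graph, demo_config, "you forgot an import")
print(sent["error_message"])
```

In app.py, a third button in the awaiting_approval branch could collect the reviewer's feedback, call send_back_to_coder(graph, st.session_state.thread_config, feedback), set the phase to "resuming", and rerun. The existing graph.stream(None, ...) call would then continue from the routing after tester, and the run would pause again before tester once Coder has produced a new version.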
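This is the greatest appeal of a state-machine engine: as long as you honor the state contract (Schema), humans and machines can swap roles on the pipeline at any time!
Now rerun streamlit run app.py in your terminal. Enter a requirement and you will see execution stop right after coder, with an "approval console" appearing on the page and waiting for your decision. Go and try it!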