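From Persistent Tasks to Multi-Agent Collaboration

This series covers three core capabilities of an Agent Harness:

Persistence (s07) → Asynchronous execution (s08) → Multi-agent collaboration (s09)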

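s07: Persistent Task System

Design Goals

In long conversations an agent runs into the context compaction problem: once the message history is truncated or summarized, the model loses track of its progress. The core idea of s07 is to externalize state to the file system so that it is unaffected by the shrinking of the LLM context window.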

Architecture Overview

┌──────────────┐   tool call    ┌──────────────────────┐
│  Agent Loop  │ ─────────────> │     TaskManager      │
│ (one thread) │                │  .tasks/task_*.json  │
│              │ <───────────── │  status: pending     │
└──────────────┘ JSON response  │          in_progress │
                                │          completed   │
                                │  deps: blockedBy[]   │
                                └──────────────────────┘

Core Flow Sequence

[Sequence diagram. Participants: Agent Loop, TaskManager, file system (.tasks/). Messages in order:]

Phase 1: create tasks and set dependencies
  task_create("Fix login", "button unresponsive")
  scan max_id → task_1.json
  task_1.json (status=pending)
  {"id":1, "subject":"Fix login", "blockedBy":[]}
  task_create("Write tests")
  task_2.json
  {"id":2, "blockedBy":[]}
  task_update(id=2, addBlockedBy=[1])
  update task_2.json → blockedBy: [1]
  task_2 is blocked by task_1

Phase 2: complete the prerequisite task → automatic unblocking
  task_update(id=1, status="completed")
  mark task_1 → completed
  walk all tasks and remove 1 from blockedBy
  task_2.blockedBy is now empty
  task_1 completed, task_2 unblocked

Phase 3: work on the unblocked task
  task_list()
  read all task_*.json
  [task_1(completed), task_2(pending)]
  [x] Fix login
  [ ] Write tests ← now runnable

Data Model

Each task is stored as its own JSON file at .tasks/task_{id}.json:

{
  "id": 1,
  "subject": "Fix login bug",
  "description": "Clicking the login button does nothing",
  "status": "pending",
  "blockedBy": [],
  "owner": ""
}

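Field descriptions:

Field	Type	Description
`id`	int	Auto-incrementing ID, obtained by scanning the directory for the current maximum, so numbering continues across restarts
`status`	enum	`"pending"` / `"in_progress"` / `"completed"`
`blockedBy`	int[]	IDs of the tasks blocking this one; the task is runnable only when the list is empty
`owner`	str	Task owner (reserved in s07, put to use in s11)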

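TaskManager Core Implementation

import json
from pathlib import Path

class TaskManager:
    def __init__(self, path=".tasks"):
        self.path = Path(path)
        self.path.mkdir(exist_ok=True)          # create the directory if it is missing

    def _max_id(self):
        """Scan all task_*.json files and return the largest ID (an O(n) scan)."""
        ...

    def create(self, subject, description=""):
        task_id = self._max_id() + 1
        # Fields follow the data model shown above
        task = {"id": task_id, "subject": subject, "description": description,
                "status": "pending", "blockedBy": [], "owner": ""}
        self._save(task)
        return task

    def update(self, task_id, **kwargs):
        task = self._load(task_id)
        if kwargs.get("status") == "completed":
            self._clear_dependency(task_id)     # unblock dependent tasks automatically
        ...

    def _clear_dependency(self, completed_id):
        """Walk all tasks and remove completed_id from each blockedBy list."""
        ...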

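Dependency Graph Resolution (Key Mechanism)

s07 records blockedBy (which tasks block this one) rather than the more conventional dependsOn (which tasks this one depends on). When task A completes, the system walks every task and removes A's ID from each task's blockedBy array.

Initial state:
  task_1 (completed)
  task_2 (in_progress)   blockedBy = []    ← cleared when task_1 completed
  task_3 (pending)       blockedBy = [2]

After task_2 completes:
  task_2 (completed)  →  blockers cleared automatically:
                           task_3.blockedBy = []   ← task_3 is now runnable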
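
The unblocking pass can be sketched as a standalone function. This is a minimal illustration rather than the project's actual code: the name clear_dependency, its return value (the IDs that became runnable), and the indent=2 formatting are assumptions made for the example; only the file layout and the blockedBy field come from the text above.

```python
import json
from pathlib import Path


def clear_dependency(tasks_dir: Path, completed_id: int) -> list[int]:
    """Remove completed_id from every task's blockedBy list.

    Returns the IDs of tasks whose blockedBy list became empty
    (the return value is an addition for illustration).
    """
    unblocked = []
    for path in sorted(tasks_dir.glob("task_*.json")):
        task = json.loads(path.read_text(encoding="utf-8"))
        blocked_by = task.get("blockedBy", [])
        if completed_id not in blocked_by:
            continue  # this task was not waiting on the completed one
        task["blockedBy"] = [i for i in blocked_by if i != completed_id]
        path.write_text(
            json.dumps(task, ensure_ascii=False, indent=2), encoding="utf-8"
        )
        if not task["blockedBy"]:
            unblocked.append(task["id"])
    return unblocked
```

Because every task file is rewritten only when it actually listed the completed ID, the pass stays cheap for the few dozen tasks the design expects.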

Tool interface:

Tool	Input	Output
`task_create`	subject (required), description (optional)	Full task JSON
`task_update`	task_id, status / addBlockedBy / removeBlockedBy	Updated task JSON
`task_list`	(none)	Formatted task list: `[ ] pending / [>] in_progress / [x] completed`
`task_get`	task_id	Full task JSON
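
As a rough illustration of the task_list output format, the sketch below renders task dicts with the three status markers. The function name format_task_list, the "#id" prefix, and the way blockers are appended are assumptions for this example; the source only specifies the markers themselves.

```python
# Status markers as listed in the tool table
MARKERS = {"pending": "[ ]", "in_progress": "[>]", "completed": "[x]"}


def format_task_list(tasks: list[dict]) -> str:
    """Render tasks one per line, ordered by ID, with any blockers shown."""
    lines = []
    for task in sorted(tasks, key=lambda t: t["id"]):
        line = f"{MARKERS[task['status']]} #{task['id']} {task['subject']}"
        if task.get("blockedBy"):
            # Hypothetical layout: show the blocking IDs after the subject
            line += f"  (blockedBy: {task['blockedBy']})"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_task_list([
        {"id": 1, "subject": "Fix login", "status": "completed", "blockedBy": []},
        {"id": 2, "subject": "Write API", "status": "in_progress", "blockedBy": []},
        {"id": 3, "subject": "Write tests", "status": "pending", "blockedBy": [2]},
    ]))
```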

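Design Notes

Simplicity first: no database and no ORM; the file system is the database
Safe without locks: the main loop is single-threaded, so file operations cannot race with each other
Durable across restarts: ID allocation scans the file system rather than relying on an in-memory counter
Unicode support: json.dump(..., ensure_ascii=False) keeps Chinese (and other non-ASCII) descriptions readable in the files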

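Tricky Points Explained

Point 1: How blockedBy relates to dependsOn

At the data-structure level the two are equivalent. Whether the file says {"blockedBy": [1]} or {"dependsOn": [1]}, the later task records information about the earlier one, and the stored structure is identical.

The difference is purely semantic:

dependsOn describes an active relationship: "I need task 1 before I can start"
blockedBy describes a passive relationship: "task 1 is in my way; I can start once it is done"

Both point at the same fact but hand the LLM a different mental model. dependsOn: [1] suggests "start as soon as task_1 finishes"; blockedBy: [1] suggests "stuck behind task_1".

blockedBy was chosen not for any technical difference but because it reads naturally in the task_list() output:

[x] Fix login     ← completed
[>] Write API     ← in progress
[ ] Write tests   ← blockedBy: [2]

"Tests are blocked by the API" matches the visual order of reading the list from the bottom up better than "tests depend on the API". This is a naming preference, not an architectural difference: globally replacing blockedBy with dependsOn would not change the system's behavior.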

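Point 2: How externalized state solves context compaction

"Context compaction" does not just make messages shorter; the compacted summary no longer contains the full state of each task. For example, the original conversation may hold the JSON reply to task_create("Fix login"), while after compaction only "the user created a task" remains. Reading that, the model knows neither the task ID nor whether the task is done.

.tasks/task_1.json on the file system is not subject to token limits: however the conversation is compacted, task_get(1) always returns the complete current state. That is what "externalizing" means: state moves out of the LLM's fragile memory into stable storage on the file system.

Point 3: Why ID allocation scans files instead of relying on memory

With an in-memory counter such as self._next_id += 1, the counter resets to 0 when the process restarts, so new task IDs start from 1 again and collide with the existing task_1.json.

Scanning the file system in _max_id() costs an O(n) pass on every create(), but the number of tasks is usually small (a few dozen), and this is much simpler than maintaining a persisted counter. It is the "simplicity first" philosophy in practice.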
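
One possible shape for the scan is shown below. It is a sketch under assumptions: the helper names max_task_id and next_task_id and the regular expression are invented for illustration, and the original _max_id body is not shown in the source.

```python
import re
from pathlib import Path

# Matches task_1.json, task_42.json, ...; anything else in the directory is ignored
TASK_FILE = re.compile(r"^task_(\d+)\.json$")


def max_task_id(tasks_dir: Path) -> int:
    """Return the largest task ID found on disk, or 0 if there are no tasks."""
    ids = [
        int(m.group(1))
        for p in tasks_dir.iterdir()
        if (m := TASK_FILE.match(p.name))
    ]
    return max(ids, default=0)


def next_task_id(tasks_dir: Path) -> int:
    """The next ID depends only on the directory contents, so it survives restarts."""
    return max_task_id(tasks_dir) + 1
```

Since nothing is held in memory, a freshly started process that sees task_1.json and task_7.json hands out 8 next. One consequence of this approach is that deleting the highest-numbered file would let its ID be reused.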

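s08: Background Task Framework

Design Goals

In s07 every tool (including bash) is synchronous and blocking: the agent must wait for a command to finish before it can continue. s08 introduces a fire-and-forget pattern: the agent starts a background thread and immediately gets a task_id back, the main loop keeps running, and results are injected asynchronously through a notification queue.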

Architecture Overview

┌──────────────────────┐    background_run    ┌──────────────────────┐
│      Agent Loop      │ ───────────────────> │  BackgroundManager   │
│    (main thread)     │                      │                      │
│                      │                      │  ┌────────────────┐  │
│                      │                      │  │ Daemon Thread  │  │
│                      │                      │  │ subprocess.run │  │
│ Before each LLM call:│                      │  │ (300s timeout) │  │
│  1. drain queue      │                      │  └───────┬────────┘  │
│  2. inject results   │ <────────────────────│──────────┘           │
│  3. call LLM         │  notification queue  │  (threading.Lock)    │
└──────────────────────┘                      └──────────────────────┘

Core Flow Sequence

[Sequence diagram. Participants: Agent Loop (main thread), BackgroundManager, notification queue (threading.Lock), Daemon Thread, subprocess, LLM. Messages in order:]

Phase 1: start a background task
  background_run("npm install")
  generate task_id = "a1b2c3d4"
  start daemon thread
  "Background task a1b2c3d4 started"

Phase 2: the main loop continues while the background task runs in parallel
  continue with other tool calls...
  [par: background execution]
  subprocess.run(timeout=300s)
  stdout+stderr (truncated to 50000 characters)
  with _lock: queue.append(notification)

Phase 3: result injection (before every LLM call)
  next loop iteration begins
  drain_notifications()
  [{"task_id":"a1b2c3d4","status":"completed",...}]
  wrap as <background-results> XML
  inject results + call LLM

Data Structures

Background task state (in memory):

self.tasks = {
    "a1b2c3d4": {
        "status": "running",       # or "completed" / "timeout" / "error"
        "result": None,
        "command": "npm install"
    }
}

Notification queue entry:

{
    "task_id": "a1b2c3d4",
    "status": "completed",
    "command": "npm install",
    "result": "... (truncated to 50000 characters) ..."
}

BackgroundManager Core Implementation

import subprocess
import threading
import uuid

class BackgroundManager:
    def __init__(self):
        self.tasks: dict[str, dict] = {}
        self._queue: list[dict] = []
        self._lock = threading.Lock()          # protects the notification queue only

    def run(self, command: str) -> str:
        task_id = str(uuid.uuid4())[:8]        # short 8-character ID
        self.tasks[task_id] = {"status": "running", "result": None, "command": command}
        thread = threading.Thread(
            target=self._execute,
            args=(task_id, command),
            daemon=True
        )
        thread.start()
        return f"Background task {task_id} started: {command}"

    def _execute(self, task_id, command):
        try:
            proc = subprocess.run(
                command, shell=True, capture_output=True, text=True,
                timeout=300                      # 5-minute timeout
            )
            status, result = "completed", (proc.stdout + proc.stderr)[:50000]
        except subprocess.TimeoutExpired:
            status, result = "timeout", None     # the original elides this value; None is a placeholder
        except Exception as e:
            status, result = "error", str(e)
        # Store a brand-new dict in one assignment instead of mutating the old one
        entry = {"status": status, "result": result, "command": command}
        self.tasks[task_id] = entry
        with self._lock:                        # thread-safe enqueue of the notification
            self._queue.append({"task_id": task_id, **entry})

    def drain_notifications(self):
        """Drain the notification queue and return every pending notification."""
        with self._lock:
            items, self._queue = list(self._queue), []
        return items

Key Design Pattern: Notification Injection

Injection timing: before every LLM call, the agent loop drains the notification queue and injects the results into the conversation wrapped in an XML tag:

# Core fragment of the agent main loop
while True:
    # Step 1: drain notifications
    notifs = BG.drain_notifications()
    if notifs:
        notif_text = "\n".join(
            f"[bg:{n['task_id']}] {n['status']}: {n['result']}"
            for n in notifs
        )
        messages.append({
            "role": "user",
            "content": f"<background-results>\n{notif_text}\n</background-results>"
        })

    # Step 2: call the LLM
    response = llm(messages, tools)

    # Step 3: handle tool calls
    ...

Dual Timeout Mechanism

Execution mode	Timeout	Use case
`bash` (synchronous)	120s	Quick commands (ls, grep, git status)
`background_run`	300s	Long-running tasks (npm install, test suites, model training)

New Tools

Tool	Input	Output
`background_run`	command (string)	`"Background task {id} started: {command}"`
`check_background`	task_id (optional)	Status of one task, or the list of all tasks

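Design Notes

Daemon threads: they are terminated automatically when the main process exits, so a background command never keeps the process alive; a child process that is still running at that moment can, however, be left behind as an orphan
Notification queue vs. writing results straight to a file: the queue decouples the two sides, so the agent layer controls when it sees results instead of receiving them passively
XML tag semantics: <background-results> helps the model tell "newly injected asynchronous results" apart from "the return value of an explicit query"
Truncation guard: cutting output at 50000 characters keeps results from blowing up the context (long outputs need a summarization tool alongside)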

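Tricky Points Explained

Point 1: Why only the notification queue is locked, not the tasks dict

This is the most debatable design decision in s08. The self.tasks dict is accessed concurrently by two threads:

Write: the daemon thread assigns self.tasks[task_id] = {...} at the end of _execute()
Read: the main thread reads it in check_background()

Yet self.tasks is not protected by a lock. The rationale: in CPython, a single dict item read or write is effectively atomic under the GIL (an implementation detail of CPython, not a language guarantee), and the daemon thread stores a brand-new dict value {"status": "completed", ...} rather than modifying the existing dict in place. The main thread therefore sees either the old value ("running") or the new one ("completed"), never half-written data.

The notification queue is different: drain_notifications() copies the list with list(self._queue) and then rebinds self._queue to a fresh empty list, a two-step operation on a shared mutable list. Without a lock this can lead to:

Lost notifications (an append that lands between the copy and the reset disappears with the old list)
Duplicate processing (if the drain ever runs in more than one thread, the same notification can be returned twice)

So the design principle is: lock only the mutable containers that are truly shared, and tolerate status that is stale for a brief moment.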
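
To make the locking rule concrete, here is a small self-contained sketch of a drain-safe queue with a stress helper. NotificationQueue, push, drain, and stress are names made up for this example; the pattern (append and swap under one lock) mirrors the drain_notifications idea described above.

```python
import threading


class NotificationQueue:
    """A list guarded by a lock so that append and drain never interleave."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def push(self, item):
        with self._lock:
            self._items.append(item)

    def drain(self):
        # Swap the list out while holding the lock: every item is returned
        # exactly once and nothing can be appended to a list being discarded.
        with self._lock:
            items, self._items = self._items, []
        return items


def stress(producers=4, per_producer=500):
    """Push from several threads while the caller drains concurrently."""
    queue = NotificationQueue()

    def produce(pid):
        for i in range(per_producer):
            queue.push((pid, i))

    threads = [threading.Thread(target=produce, args=(p,)) for p in range(producers)]
    for t in threads:
        t.start()
    seen = []
    while any(t.is_alive() for t in threads):
        seen.extend(queue.drain())
    for t in threads:
        t.join()
    seen.extend(queue.drain())  # pick up anything pushed after the last drain
    return seen
```

With the lock in place, the number of drained items always equals the number pushed, regardless of how the threads are scheduled.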

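Point 2: Injecting notifications "before the LLM call" rather than "after tool execution"

❌ Not recommended: inject after tool execution
   tool execution → drain notifications → inject → call LLM
   (notifications picked up in this round are not acted on until the next round)

✅ Actual approach: inject before the LLM call
   drain notifications → inject → call LLM → tool execution
   (the model has the latest information every time it makes a decision)

If the drain sat "after tool execution", a background task could finish while the LLM is thinking, but its result would not be seen until the following loop iteration. Draining right before the call means that whenever a background task finishes, its result is guaranteed to have been injected by the time the main loop makes its next LLM call.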

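Point 3: When daemon threads are destroyed, and the risk involved

daemon=True means that when the main thread (the only non-daemon thread) exits, all daemon threads are terminated forcibly. Consequently:

If a daemon thread is in the middle of subprocess.run(..., timeout=300) running npm install and the main thread exits abruptly (for example on Ctrl+C), the daemon thread is killed while the child process may keep running as an orphan.
The author's intent in choosing daemon threads over thread.join(): an agent framework should not block process exit because of one background command. The trade-off is "rather leave an orphan process behind than block the user from quitting".

In production, a more robust approach is subprocess.Popen + thread.join(timeout) + explicit process-tree cleanup (taskkill /T on Windows, or os.killpg on Unix).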
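
A minimal, POSIX-only sketch of the process-tree cleanup idea follows. It is not the project's code: run_with_cleanup is a hypothetical helper, it covers only the Popen and os.killpg part (not thread.join or the Windows path), and it assumes a shell is available.

```python
import os
import signal
import subprocess


def run_with_cleanup(command: str, timeout: float) -> dict:
    """Run a shell command; on timeout, kill its whole process group (POSIX only)."""
    proc = subprocess.Popen(
        command,
        shell=True,
        text=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # the child becomes leader of its own process group
    )
    try:
        output, _ = proc.communicate(timeout=timeout)
        return {"status": "completed", "result": output}
    except subprocess.TimeoutExpired:
        try:
            # For a session leader the process group ID equals its PID,
            # so this also kills anything the shell spawned.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited right at the deadline
        output, _ = proc.communicate()  # reap the child and collect partial output
        return {"status": "timeout", "result": output}
```

Putting the child in its own session is what makes the group kill safe: the signal reaches the command and its descendants without touching the agent process itself.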

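Point 4: Using synchronous bash and asynchronous background_run side by side

Not every command needs to be asynchronous. In a typical agent workflow:

git status, ls, grep: finish in under a second, so synchronous bash is simpler
npm install, pytest, pip install: take minutes, so background_run avoids blocking

The intent of this dual-timeout scheme (120s / 300s) is to leave the duration judgment to the LLM: quick commands go through bash, slow ones through background_run, and the model chooses on its own based on the nature of the command.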

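s09: Multi-Agent Team Collaboration

Design Goals

s08 introduced concurrent execution; s09 extends it to multiple agents. Each teammate agent has its own LLM instance, its own conversation history, and its own tool set, and communicates asynchronously through file-based JSONL inboxes. The key difference from s04 (subagents, used for one-off task delivery) is that s09 teammates are persistent: a teammate works, goes idle, receives new messages, and works again, with a lifecycle that spans the whole session.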

Architecture Overview

┌──────────────────────────────────────────────────────────┐
│                 Lead Agent (main thread)                 │
│  spawn_teammate / send_message / broadcast / list_all    │
└──────┬───────┬───────┬───────┬───────┬───────┬───────────┘
       │       │       │       │       │       │
       │   spawn    spawn    spawn    │       │
       ▼           ▼         ▼        │       │
┌───────────┐ ┌──────────┐ ┌──────────┐      │
│ Teammate  │ │ Teammate │ │ Teammate │      │
│ (alice)   │ │ (bob)    │ │ (carol)  │      │
│ daemon    │ │ daemon   │ │ daemon   │      │
│ thread    │ │ thread   │ │ thread   │      │
└─────┬─────┘ └────┬─────┘ └────┬─────┘      │
      │            │            │             │
      └────────────┴────────────┴─────────────┘
                    │
         ┌──────────▼──────────┐
         │  .team/inbox/*.jsonl │  ← file-based message bus
         │  .team/config.json   │  ← persistent registry
         └─────────────────────┘

Core Flow Sequence

[Sequence diagram. Participants: Lead Agent (main thread), TeammateManager, MessageBus (.team/inbox/), Teammate Alice (daemon), Teammate Bob (daemon), LLM. Messages in order:]

Phase 1: the lead creates the team
  spawn("alice", "coder", "You are responsible for writing code...")
  start daemon thread with its own LLM loop
  "Teammate alice (coder) has joined the team"
  spawn("bob", "reviewer", "You are responsible for review...")
  start daemon thread
  "Teammate bob (reviewer) has joined the team"

Phase 2: the lead assigns work → point-to-point messages
  send("lead", "alice", "Implement the user login endpoint")
  append JSONL → .team/inbox/alice.jsonl
  send("lead", "bob", "Get ready to do a code review")

Phase 3: teammates work on their own (each polls its inbox independently)
  read_inbox("alice") ← drain semantics
  [{"from":"lead", "content":"Implement the user login endpoint"}]
  LLM reasoning → calls tools such as bash/write_file
  send("alice", "lead", "Endpoint implemented, please review")
  read_inbox("bob")
  [{"from":"lead", "content":"Get ready to do a code review"}]
  send("bob", "lead", "Ready")

Phase 4: the lead polls its inbox and receives the reports
  read_inbox("lead") ← drained before every LLM call
  [{"from":"alice",...}, {"from":"bob",...}]
  wrap in <inbox> XML tag and inject
  receive teammates' messages, decide the next step

File System Layout

.team/
  config.json              # team config (member list + status)
  inbox/
    lead.jsonl             # lead's inbox (receives teammates' messages)
    alice.jsonl            # Alice's inbox
    bob.jsonl              # Bob's inbox

Message Protocol and Inbox Mechanism

JSONL format (one JSON object per line), with append-only writes and drain-on-read:

{"type": "message", "from": "lead", "content": "Fix login bug", "timestamp": 1712345678.9}
{"type": "broadcast", "from": "lead", "content": "Meeting at 3 pm", "timestamp": 1712345778.9}

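Five message types (s09 uses the first two; the rest are reserved for s10):

Message type	Description	Enabled in
`message`	Point-to-point communication	s09
`broadcast`	Broadcast to all teammates (except the sender)	s09
`shutdown_request`	Shutdown request	s10
`shutdown_response`	Shutdown response	s10
`plan_approval_response`	Plan approval response	s10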
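
The protocol above can be exercised with a small parser. This is an illustrative sketch: parse_inbox and KNOWN_TYPES are hypothetical names, and rejecting unknown types is an assumption added here; the source does not say whether the message bus validates types.

```python
import json

# The five message types listed in the table above
KNOWN_TYPES = {
    "message",
    "broadcast",
    "shutdown_request",
    "shutdown_response",
    "plan_approval_response",
}


def parse_inbox(text: str) -> list[dict]:
    """Parse JSONL inbox content: one JSON object per non-empty line."""
    messages = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines, e.g. after a manual edit
        msg = json.loads(line)
        if msg.get("type") not in KNOWN_TYPES:
            # Validation is an assumption of this sketch, not documented behavior
            raise ValueError(f"unknown message type: {msg.get('type')!r}")
        messages.append(msg)
    return messages
```

Because each line parses on its own, a malformed or unexpected line can be reported individually without discarding the rest of the file.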

MessageBus Core Implementation

import json
import time
from pathlib import Path

class MessageBus:
    def __init__(self, inbox_dir=".team/inbox"):
        self.inbox_dir = Path(inbox_dir)
        self.inbox_dir.mkdir(parents=True, exist_ok=True)

    def send(self, sender, to, content, msg_type="message", extra=None):
        """Append one message to the recipient's inbox."""
        message = {
            "type": msg_type, "from": sender,
            "content": content, "timestamp": time.time(),
            **(extra or {})
        }
        with open(self.inbox_dir / f"{to}.jsonl", "a", encoding="utf-8") as f:
            f.write(json.dumps(message, ensure_ascii=False) + "\n")

    def read_inbox(self, name):
        """Drain the inbox: read everything, then empty the file."""
        path = self.inbox_dir / f"{name}.jsonl"
        if not path.exists():
            return []
        with open(path, "r", encoding="utf-8") as f:
            messages = [json.loads(line) for line in f if line.strip()]
        # Truncate the file (drain semantics)
        with open(path, "w", encoding="utf-8") as f:
            f.write("")
        return messages

    def broadcast(self, sender, content, teammates):
        """Send to every teammate except the sender."""
        for name in teammates:
            if name != sender:
                self.send(sender, name, content, msg_type="broadcast")

TeammateManager and the Teammate Lifecycle

import threading
from pathlib import Path

class TeammateManager:
    def __init__(self, config_path=".team/config.json"):
        self.config_path = Path(config_path)
        self.config_path.parent.mkdir(parents=True, exist_ok=True)
        self._load_config()

    def spawn(self, name, role, prompt):
        """Create a teammate, or revive one that is idle or shut down."""
        if name in self.members and self.members[name]["status"] not in ("idle", "shutdown"):
            return f"Teammate {name} is already working..."  # guard against double start

        self.members[name] = {"name": name, "role": role, "status": "working"}
        self._save_config()

        thread = threading.Thread(
            target=self._teammate_loop,
            args=(name, role, prompt),
            daemon=True
        )
        thread.start()
        return f"Teammate {name} ({role}) has joined the team"

    def _teammate_loop(self, name, role, prompt):
        """Each teammate's own independent agent loop."""
        messages = [{"role": "user", "content": prompt}]
        tools = self._teammate_tools()           # 6 tools (no management tools)

        for _ in range(50):  # safety cap against infinite loops
            # 1. Read the inbox (drain)
            inbox = BUS.read_inbox(name)
            for msg in inbox:
                messages.append({
                    "role": "user",
                    "content": f"[from {msg['from']}] {msg['content']}"
                })

            # 2. Call the LLM
            response = llm_with_tools(messages, tools)

            # 3. Execute tool calls
            if response.stop_reason == "tool_use":
                for tc in response.content:
                    if tc.type == "tool_use":
                        result = self._exec(name, tc.name, tc.input)
                        messages.append(tc.result_message(result))
            else:
                break  # a reply without tool calls → go idle

        self.members[name]["status"] = "idle"
        self._save_config()

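Asymmetric Tool Permissions (Key Security Pattern)

The lead has 9 tools:

Category	Tools	Permission
Basic	bash, read_file, write_file, edit_file	Everyone
Team management	spawn_teammate, list_teammates, broadcast	Lead only
Communication	send_message, read_inbox	Everyone

Teammates have only 6 tools:

Category	Tools	Notes
Basic	bash, read_file, write_file, edit_file	(none)
Communication	send_message	Point-to-point only
Communication	read_inbox	Reads only its own inbox

Explicitly forbidden: teammates may not call spawn_teammate (cannot create other teammates), list_teammates (cannot see the whole team), or broadcast (cannot address everyone at once).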
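
One way to express this asymmetry is a role-based tool list with a check at dispatch time. The sketch below is hypothetical: tools_for, dispatch, the role strings, and the use of PermissionError are assumptions for illustration; only the tool names and the 9 versus 6 split come from the tables above.

```python
BASE_TOOLS = ["bash", "read_file", "write_file", "edit_file"]
COMM_TOOLS = ["send_message", "read_inbox"]
LEAD_ONLY_TOOLS = ["spawn_teammate", "list_teammates", "broadcast"]


def tools_for(role: str) -> list[str]:
    """Return the tool names exposed to a role ("lead" or anything else)."""
    tools = BASE_TOOLS + COMM_TOOLS
    if role == "lead":
        tools = tools + LEAD_ONLY_TOOLS
    return tools


def dispatch(role: str, tool_name: str) -> str:
    """Refuse any tool that is not in the caller's list."""
    if tool_name not in tools_for(role):
        raise PermissionError(f"{role} may not call {tool_name}")
    return tool_name  # a real dispatcher would execute the tool here
```

Checking at dispatch time, and not only when building the tool list sent to the model, means a teammate that hallucinates a call to broadcast is still refused.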

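Lead Inbox Polling (a Distinctive Pattern)

Before every LLM call, the lead's agent loop drains its own inbox and injects the teammates' messages wrapped in an <inbox> XML tag:

# Every iteration of the lead's loop
while True:
    inbox_msgs = BUS.read_inbox("lead")
    if inbox_msgs:
        inbox_text = "\n".join(f"[{m['from']}] {m['content']}" for m in inbox_msgs)
        messages.append({
            "role": "user",
            "content": f"<inbox>\n{inbox_text}\n</inbox>"
        })

    response = llm(messages, tools)
    ...

This lets teammates report to the lead on their own initiative ("task done", "hit a problem, need help", and so on), closing a two-way communication loop.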

REPL Debug Commands

/team   → print team status (member name, role, status)
/inbox  → view the lead's inbox (for debugging)

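Design Notes

Drain semantics: once a message has been read it is removed from the file, which naturally prevents duplicate processing
Safe without locks: each agent reads only its own inbox, so readers never conflict with one another (a message appended between the read and the truncation in read_inbox can still be lost, a race this simple design does not guard against)
Thread isolation: each teammate has its own message history and its own LLM calls, so teammates do not interfere with each other
Safety cap of 50 iterations: keeps a teammate from falling into an endless tool-call loop
File as IPC: no pipes, sockets, or message queues; plain JSONL files carry all communication between agents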

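Tricky Points Explained

Point 1: The trade-offs of choosing JSONL files for IPC

This is a deliberate choice for a teaching project, not a limit of technical capability.

Option	Pros	Cons (in a teaching setting)
JSONL files	Viewable with `cat`, editable by hand for debugging, no external services	Slow, unsuitable for high-frequency communication
Unix socket / pipe	Fast, low latency	Needs an extra abstraction layer, hard to debug (not visible)
Redis / RabbitMQ	Full-featured	Adds external dependencies and distracts from the lesson

The core trade-off: performance and feature completeness are sacrificed for observability you can "see and touch". Learners can run cat .team/inbox/alice.jsonl to see what Alice has received, or echo '{"from":"debug","content":"test"}' >> .team/inbox/lead.jsonl to send the lead a message by hand.

Choosing JSONL (rather than a single JSON document) is also deliberate: because every line is an independent JSON object, appending needs no lock. When two threads append to the same file opened with open("a"), each write lands at the current end of the file, so short lines written in a single write call do not interleave in practice. (PIPE_BUF, which the original argument cites, is an atomicity limit for pipes and FIFOs, not for regular files.)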

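Point 2: What isolation between independent LLM instances means

Each teammate's _teammate_loop has its own messages list:

# Alice's messages
[
    {"role": "user", "content": prompt},           # "You are responsible for writing code..."
    {"role": "user", "content": "[from lead] Implement the login endpoint"},
    {"role": "user", "content": "[from lead] Add unit tests"},
]

# Bob's messages
[
    {"role": "user", "content": prompt},           # "You are responsible for review..."
    {"role": "user", "content": "[from lead] Review the login endpoint"},
]

Alice does not know Bob exists (unless the lead says so in a message). The two agents share no context and cannot call each other. This is a security constraint, not a missing feature: if Alice could invoke Bob's tools, a prompt-injection attack on Alice would compromise the whole team.

Compared with s04 (subagents): an s04 subagent is a "disposable tool": call → return result → destroy. An s09 teammate is a persistent colleague: it keeps its own working directory and work products, and can retain state across tasks.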

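Point 3: The risk of drain semantics

read_inbox("alice") reads and then empties alice.jsonl. This creates a hazard:

Timeline:
  1. The lead sends a message to Alice
  2. Alice reads her inbox → the messages are cleared
  3. Alice crashes or exits abnormally while processing
  4. The message is gone, but the work was never finished

This is the classic "at-least-once" vs. "at-most-once" message delivery problem. The JSONL file scheme chooses at-most-once (nothing is replayed, even after a failure), which is an explicit trade-off. Moving from at-most-once to at-least-once requires a message acknowledgment mechanism (mark instead of delete), which adds considerable complexity.

The teaching project's position: for debugging and teaching demos, the simplified at-most-once model is reasonable. Only multi-agent communication in production systems needs retries, idempotency, dead-letter queues, and similar mechanisms.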
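
For comparison, an acknowledgment-based variant could look like the sketch below. It is not part of the project: claim, ack, and the ".processing" suffix are hypothetical, and the scheme only illustrates "mark instead of delete" by renaming the inbox on read and deleting it on acknowledgment.

```python
import json
import os
from pathlib import Path


def _pending_path(inbox: Path) -> Path:
    # Hypothetical convention: alice.jsonl → alice.jsonl.processing
    return inbox.with_name(inbox.name + ".processing")


def claim(inbox: Path) -> list[dict]:
    """Return the batch to process without deleting it.

    If an earlier batch was never acknowledged (for example after a crash),
    that same batch is returned again, so delivery is at-least-once.
    """
    pending = _pending_path(inbox)
    if not pending.exists():
        if not inbox.exists():
            return []
        os.replace(inbox, pending)  # atomic rename; later sends recreate the inbox
    with open(pending, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def ack(inbox: Path) -> None:
    """Acknowledge the claimed batch; only now are the messages discarded."""
    _pending_path(inbox).unlink(missing_ok=True)
```

The price of replay is that handlers must tolerate seeing the same message twice, which is the idempotency concern mentioned above.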

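Point 4: The design intent behind the lead's passive inbox polling

The lead checks its inbox before the LLM call:

while True:
    inbox_msgs = BUS.read_inbox("lead")   # ① glance at the inbox on every iteration
    if inbox_msgs:
        messages.append({"role": "user", "content": "<inbox>..."})
    response = llm(messages, tools)        # ② only then call the LLM

This means the lead is never interrupted by a message in the middle of doing something; a teammate's report is delayed by at most one LLM call cycle. The implicit premise is that the LLM call is the "heartbeat" of the agent workflow: mail is checked before every heartbeat, much as a person checks email each morning.

If the lead is executing a series of tool calls (bash → read_file → edit_file → ...), messages sent by teammates in the meantime queue up until the lead's next LLM call. This is not a bug but a deliberate non-preemptive semantics: the current workflow should not be interrupted by teammate messages.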

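Three-Stage Evolution Summary

Dimension	s07: Task system	s08: Background tasks	s09: Team collaboration
Core problem	State lost during context compaction	Synchronous blocking wastes LLM time	A single agent has limited capacity
Solution	File-system persistence	Background threads + notification queue	N+1 independent agent threads
Persistence	`.tasks/task_*.json`	In memory only	`.team/config.json` + JSONL
Concurrency model	Single thread	Main thread + daemon background threads	Main thread + N × daemon agents
Thread safety	Not needed	`threading.Lock` protects the queue	No locks (files provide natural isolation)
Number of agents	1	1	Lead + N teammates
Communication	None	Internal notification queue	JSONL file inboxes
New tools	4 task tools	2 background tools	5 team tools
Key abstraction	Dependency graph (blockedBy)	Asynchronous result notification	Message type system

The logic of the progression from s07 to s09 is "first keep the state, then make slow operations asynchronous, finally let multiple agents collaborate". Each step extends what the Agent Harness can do and lays the groundwork for the protocol-driven behavior, autonomy, and task isolation of s10-s12.

Next stage (s10-s12): on top of the team, introduce a structured communication protocol, autonomous task claiming, and directory-level task isolation based on git worktree.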