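OpenClaw Cron Module Deep Dive --- Part 3
7. Isolated Agent
📊 Isolated Agent execution flow diagram

Execution Engine Deep Dive
The Isolated Agent execution engine is the core runtime of the Cron system: it turns a scheduled trigger into one complete agent interaction session and manages the full chain of model selection, session lifecycle, skills snapshots, and delivery dispatch. The subsystem lives in the isolated-agent/ directory and consists of ~15 files that form a layered execution pipeline.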
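7.1 run.ts --- Main Entry Flow
File path: isolated-agent/run.ts
Lines of code: ~530
Core responsibility: acts as the top-level orchestrator of an isolated agent turn, coordinating a three-stage pipeline: prepare → execute → finalize.
7.1.1 Architecture Layers
run.ts uses the classic three-stage prepare-execute-finalize architecture: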
runCronIsolatedAgentTurn()
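├─ prepareCronRunContext() ← Stage 1: prepare the context
├─ executeCronRun() ← Stage 2: execute the agent turn (delegated to run-executor.ts)
└─ finalizeCronRun() ← Stage 3: finalize (telemetry, delivery, session persistence)
7.1.2 Lazy-Loaded Runtime Pattern
The top of the file declares 7 lazily loaded runtime Promise variables: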
typescript
let sessionStoreRuntimePromise: Promise<typeof import("../../config/sessions/store.runtime.js")> | undefined;
let cronExecutorRuntimePromise: ...;
let cronExternalContentRuntimePromise: ...;
let cronAuthProfileRuntimePromise: ...;
let cronContextRuntimePromise: ...;
let cronModelCatalogRuntimePromise: ...;
let cronDeliveryRuntimePromise: ...;
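Each runtime has a matching async loadXxxRuntime() function that uses ??= assignment to stay idempotent: the first call performs a dynamic import(), and later calls reuse the cached Promise. The pattern pays off in three ways:
- Faster startup: run.ts is imported by many modules, so eagerly importing every runtime would slow cold start.
- Circular-dependency decoupling: imports of heavyweight runtimes (such as run-executor.runtime.js and run-delivery.runtime.js) are deferred until they are actually called.
- Testability: unit tests can set OPENCLAW_TEST_FAST=1 to skip runtime loading.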
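The memoized-loader idea described above can be sketched as follows. This is a minimal illustration, not the actual run.ts code: `ExecutorRuntime`, `importExecutorRuntime`, and `loadCount` are hypothetical stand-ins, and the real loaders call a dynamic `import()` instead.

```typescript
// Hypothetical stand-in for a heavyweight runtime module.
type ExecutorRuntime = { name: string };

let loadCount = 0;
let executorRuntimePromise: Promise<ExecutorRuntime> | undefined;

// Stand-in for a dynamic import(); counts how often loading actually starts.
async function importExecutorRuntime(): Promise<ExecutorRuntime> {
  loadCount += 1;
  return { name: "executor" };
}

// `??=` assigns only while the variable is still undefined, so loading
// starts at most once and every caller shares the same Promise.
function loadExecutorRuntime(): Promise<ExecutorRuntime> {
  return (executorRuntimePromise ??= importExecutorRuntime());
}
```

Caching the Promise rather than the resolved module means concurrent callers that arrive before loading finishes still share one in-flight load.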
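7.1.3 prepareCronRunContext() --- Preparation Stage in Detail
This is the most complex single function in the system (~250 lines). Its core logic, step by step:
Step 1: Agent identity resolution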
typescript
const defaultAgentId = resolveDefaultAgentId(input.cfg);
const requestedAgentId = ... // priority: params.agentId > job.agentId
const agentId = normalizedRequested ?? defaultAgentId;
Agent ID resolution follows a three-level priority: explicit caller value > job config > global default. normalizeAgentId() normalizes the ID (lowercase, whitespace removed).
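A minimal sketch of that three-level fallback, under stated assumptions: the `normalizeAgentId` body below (trim plus lowercase, empty string treated as unset) is a guess at the behavior described, and `resolveAgentId` is a hypothetical helper, not a function from the source.

```typescript
// Assumed normalization: trim and lowercase; empty strings count as "not set".
function normalizeAgentId(raw: string | undefined): string | undefined {
  const normalized = raw?.trim().toLowerCase();
  return normalized ? normalized : undefined;
}

// Hypothetical helper: caller value first, then the job's value, then the default.
function resolveAgentId(
  paramsAgentId: string | undefined,
  jobAgentId: string | undefined,
  defaultAgentId: string,
): string {
  return normalizeAgentId(paramsAgentId) ?? normalizeAgentId(jobAgentId) ?? defaultAgentId;
}
```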
Step 2: Agent config merge
typescript
const agentCfg = buildCronAgentDefaultsConfig({
defaults: input.cfg.agents?.defaults,
agentConfigOverride,
});
const cfgWithAgentDefaults = { ...input.cfg, agents: { ...input.cfg.agents, defaults: agentCfg } };
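buildCronAgentDefaultsConfig (from run-config.ts) merges agent-level config overrides on top of the global defaults, producing an augmented config, cfgWithAgentDefaults, that most later steps use in place of the raw config (a few calls shown below, such as session resolution, still receive input.cfg). Note: the sandbox field is explicitly excluded from the merge (it is handled in run-config.ts) so that it is not applied twice.
Step 3: Session key construction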
typescript
const baseSessionKey = (input.sessionKey?.trim() || `cron:${input.job.id}`).trim();
const agentSessionKey = resolveCronAgentSessionKey({
sessionKey: baseSessionKey,
agentId,
mainKey: input.cfg.session?.mainKey,
cfg: input.cfg,
});
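The session key goes through two transformations:
- If sessionKey is empty, it falls back to the cron:<jobId> pattern.
- resolveCronAgentSessionKey converts the raw key to the agent store format (toAgentStoreSessionKey), and canonicalizeMainSessionAlias then maps agent:<id>:main to the configured mainKey alias. This fixes the orphaned cron session problem from issue #29683.
Step 4: Session resolution/reuse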
typescript
const cronSession = resolveCronSession({
cfg: input.cfg,
sessionKey: agentSessionKey,
agentId,
nowMs: now,
forceNew: input.job.sessionTarget === "isolated",
});
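resolveCronSession (from session.ts) makes the session reuse decision:
- If sessionTarget === "isolated" (that is, forceNew=true), a new session is always created.
- Otherwise it evaluates session freshness (evaluateSessionFreshness) and reuses or rolls the session according to the reset policy.
- A new session clears lastChannel/lastTo/lastAccountId/lastThreadId/deliveryContext/sessionFile so that stale delivery routing does not leak into it.
Step 5: Model selection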
typescript
const resolvedModelSelection = await resolveCronModelSelection({
cfg: input.cfg,
cfgWithAgentDefaults,
agentConfigOverride,
sessionEntry: cronSession.sessionEntry,
payload: input.job.payload,
isGmailHook,
});
If model selection fails, prepareCronRunContext returns { ok: false, result } and exits early. This fail-fast strategy avoids spending later resources when no model is available.
Step 6: Thinking level resolution
typescript
let thinkLevel = jobThink ?? hooksGmailThinking;
if (!thinkLevel) {
thinkLevel = resolveThinkingDefault({ cfg, provider, model, catalog });
}
if (thinkLevel === "xhigh" && !supportsXHighThinking(provider, model)) {
thinkLevel = "high";
}
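Priority chain: job payload > Gmail hook config > the default resolved by resolveThinkingDefault. The xhigh level is automatically downgraded to high on models that do not support it, which avoids runtime errors.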
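The same chain can be written as a small pure function. This is a sketch under assumptions: the `ThinkLevel` union lists plausible level names (only "high" and "xhigh" appear in the source), and `resolveThinkLevel` with its options object is hypothetical.

```typescript
// Assumed set of levels; only "high" and "xhigh" are confirmed by the excerpt.
type ThinkLevel = "off" | "low" | "medium" | "high" | "xhigh";

function resolveThinkLevel(opts: {
  jobThink?: ThinkLevel;
  hooksGmailThinking?: ThinkLevel;
  defaultLevel: ThinkLevel; // stands in for resolveThinkingDefault(...)
  supportsXHigh: boolean;   // stands in for supportsXHighThinking(provider, model)
}): ThinkLevel {
  const level = opts.jobThink ?? opts.hooksGmailThinking ?? opts.defaultLevel;
  // Downgrade instead of failing when the model cannot do xhigh.
  return level === "xhigh" && !opts.supportsXHigh ? "high" : level;
}
```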
Step 7: Delivery context resolution
typescript
const { deliveryRequested, resolvedDelivery, toolPolicy } = await resolveCronDeliveryContext({
cfg: cfgWithAgentDefaults,
job: input.job,
agentId,
deliveryContract: input.deliveryContract ?? "cron-owned",
});
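deliveryContract distinguishes two execution modes:
- "cron-owned": the cron runner owns delivery, the agent's message tool is disabled (disableMessageTool: true), and all delivery goes through the cron delivery pipeline.
- "shared": shared mode; the agent may send messages itself, and cron can still attempt delivery.
Step 8: Command body construction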
typescript
if (shouldWrapExternal) {
// External hook content goes through the safety wrapper
commandBody = `${safeContent}\n\n${timeLine}`.trim();
} else {
commandBody = `${base}\n${timeLine}`.trim();
}
commandBody = appendCronDeliveryInstruction({ commandBody, deliveryRequested });
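External content (webhook/gmail) is checked by detectSuspiciousPatterns and then wrapped by buildSafeExternalPrompt to guard against injection attacks. If delivery is enabled, an instruction is appended telling the agent to return plain text and let the system deliver it.
Step 9: Skills snapshot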
typescript
const skillsSnapshot = await resolveCronSkillsSnapshot({
workspaceDir,
config: cfgWithAgentDefaults,
agentId,
existingSnapshot: cronSession.sessionEntry.skillsSnapshot,
isFastTestEnv,
});
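Incremental update strategy: the snapshot is rebuilt only when the snapshot version or the skill filter changes; otherwise the existing snapshot is reused.
Step 10: Auth profile resolution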
typescript
const authProfileId = !hasSessionAuthProfileOverride && !hasConfiguredAuthProfiles(...) && !hasAnyAuthProfileStoreSource(agentDir)
? undefined
: await resolveSessionAuthProfileOverride({...});
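Three-way short circuit: if the session has no override, no auth profiles are configured globally, and the agent directory has no store source, auth profile resolution is skipped entirely, avoiding unnecessary I/O.
7.1.4 finalizeCronRun() --- Finalization Stage in Detail
The finalization stage handles telemetry accounting and delivery dispatch:
Telemetry collection: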
typescript
const modelUsed = finalRunResult.meta?.agentMeta?.model ?? execution.fallbackModel ?? execution.liveSelection.model;
const providerUsed = finalRunResult.meta?.agentMeta?.provider ?? execution.fallbackProvider ?? execution.liveSelection.provider;
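Model and provider resolution follows a three-level fallback: runtime meta > fallback model > initial selection. If usage is non-zero, the token cost is computed and accumulated into sessionEntry.estimatedCostUsd.
Payload outcome resolution: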
typescript
const { summary, outputText, synthesizedText, deliveryPayloads, hasFatalErrorPayload, ... } =
resolveCronPayloadOutcome({ payloads, runLevelError, finalAssistantVisibleText, preferFinalAssistantVisibleText });
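resolveCronPayloadOutcome (from helpers.ts) runs the payload classification logic:
- If an error payload is not followed by a successful payload, the outcome is marked hasFatalErrorPayload.
- The Telegram channel prefers finalAssistantVisibleText.
- Structured content (media, interactive cards) is kept as-is rather than collapsed to plain text.
Heartbeat filtering: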
typescript
const skipHeartbeatDelivery = prepared.deliveryRequested &&
isHeartbeatOnlyResponse(payloads, resolveHeartbeatAckMaxChars(prepared.agentCfg));
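When the agent returns only a heartbeat acknowledgement (such as "OK" or "ack") and the text does not exceed ackMaxChars, delivery is skipped so that users are not disturbed by meaningless notifications.
Delivery dispatch: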
typescript
const deliveryResult = await dispatchCronDelivery({...});
Delegated to delivery-dispatch.ts; see Section 8.
7.1.5 runCronIsolatedAgentTurn() --- Top-Level Entry
typescript
export async function runCronIsolatedAgentTurn(params) {
const abortSignal = params.abortSignal ?? params.signal;
const isAborted = () => abortSignal?.aborted === true;
const abortReason = () => { ... };
const prepared = await prepareCronRunContext({ input: params, isFastTestEnv });
if (!prepared.ok) return prepared.result;
try {
const execution = await executeCronRun({...});
if (isAborted()) return prepared.context.withRunSession({ status: "error", error: abortReason() });
return await finalizeCronRun({ prepared: prepared.context, execution, ... });
} catch (err) {
return prepared.context.withRunSession({ status: "error", error: String(err) });
}
}
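Design notes:
- Dual signal support: the abortSignal and signal parameters coexist (for backward compatibility), with abortSignal taking precedence.
- isAborted/abortReason closures: keep reads of the signal state in one place; abortReason prefers the signal.reason string.
- withRunSession wrapper: ensures every return value carries sessionId and sessionKey, even when the prepare stage fails.
📊 Suggested diagram: run.ts execution pipeline
Nodes: [runCronIsolatedAgentTurn] → [prepareCronRunContext] → [executeCronRun] → [finalizeCronRun]
Edges: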
- prepare→execute: "prepared context (ok=true)"
- prepare→exit: "ok=false, early return"
- execute→finalize: "CronExecutionResult"
- execute→catch: "unhandled exception"
Child nodes (inside prepare):
[Agent ID resolution] → [Config merge] → [Session Key] → [Session resolution] → [Model selection] → [Thinking Level] → [Delivery context] → [Command Body construction] → [Skills Snapshot] → [Auth Profile] → [Session pre-run persistence]
Annotations: label each child node with its key decision point (e.g. "forceNew?", "model ok?", "deliveryRequested?")
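7.2 run-executor.ts --- Execution Orchestration
File path: isolated-agent/run-executor.ts
Lines of code: ~210
Core responsibility: creates the agent executor and manages model failover, LiveSession model-switch retries, and interim-ack detection and retry.
7.2.1 createCronPromptExecutor() --- Executor Factory
This is a closure factory that returns a { runPrompt, getState } object: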
typescript
export function createCronPromptExecutor(params) {
let runResult: CronPromptRunResult | undefined;
let fallbackProvider = params.liveSelection.provider;
let fallbackModel = params.liveSelection.model;
let runEndedAt = Date.now();
let bootstrapPromptWarningSignaturesSeen = ...;
const runPrompt = async (promptText: string) => {
const fallbackResult = await runWithModelFallback({
cfg: params.cfgWithAgentDefaults,
provider: params.liveSelection.provider,
model: params.liveSelection.model,
fallbacksOverride: cronFallbacksOverride,
run: async (providerOverride, modelOverride, runOptions) => {
// Actual execution logic
if (isCliProvider(providerOverride, ...)) {
return await runCliAgent({...});
}
return await runEmbeddedPiAgent({...});
},
});
runResult = fallbackResult.result;
fallbackProvider = fallbackResult.provider;
fallbackModel = fallbackResult.model;
runEndedAt = Date.now();
};
return { runPrompt, getState: () => ({ runResult, fallbackProvider, fallbackModel, runEndedAt, liveSelection }) };
}
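Key design points:
- Dual execution paths: the isCliProvider branch runs runCliAgent (an external CLI process); otherwise runEmbeddedPiAgent (the embedded engine) runs.
- Fallback delegation: runWithModelFallback handles the failover chain when the primary model fails; cronFallbacksOverride comes from resolveCronFallbacksOverride.
- Mutable closure state: runResult, fallbackProvider, and fallbackModel are mutable inside the closure, and getState() returns the current snapshot.
Embedded agent parameters in detail: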
typescript
await runEmbeddedPiAgent({
trigger: "cron",
allowGatewaySubagentBinding: true, // allow gateway subagent binding
senderIsOwner: false, // a cron trigger is not the owner
messageChannel: params.messageChannel,
agentAccountId: params.resolvedDelivery.accountId,
fastMode: resolveFastModeState({...}).enabled,
bootstrapContextMode: params.agentPayload?.lightContext ? "lightweight" : undefined,
toolsAllow: params.agentPayload?.toolsAllow, // tool allowlist
requireExplicitMessageTarget: params.toolPolicy.requireExplicitMessageTarget,
disableMessageTool: params.toolPolicy.disableMessageTool,
allowTransientCooldownProbe: runOptions?.allowTransientCooldownProbe,
abortSignal: params.abortSignal,
bootstrapPromptWarningSignaturesSeen, // tracks bootstrap warning signatures already seen
});
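lightContext mode reduces the bootstrap context that gets injected, toolsAllow restricts the available tool set, and disableMessageTool prevents the agent from sending messages itself in cron-owned mode.
7.2.2 executeCronRun() --- Main Execution Loop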
typescript
export async function executeCronRun(params): Promise<CronExecutionResult> {
// Register the agent run context (used for log correlation)
registerAgentRunContext(params.cronSession.sessionEntry.sessionId, {
sessionKey: params.agentSessionKey,
verboseLevel: resolvedVerboseLevel,
});
const executor = createCronPromptExecutor({...});
const runStartedAt = params.runStartedAt ?? Date.now();
const MAX_MODEL_SWITCH_RETRIES = 2;
let modelSwitchRetries = 0;
// LiveSession model-switch retry loop
while (true) {
try {
await executor.runPrompt(params.commandBody);
break;
} catch (err) {
if (!(err instanceof LiveSessionModelSwitchError)) throw err;
modelSwitchRetries += 1;
if (modelSwitchRetries > MAX_MODEL_SWITCH_RETRIES) throw err;
// Update the live selection and sync it to the session
params.liveSelection.provider = err.provider;
params.liveSelection.model = err.model;
params.liveSelection.authProfileId = err.authProfileId;
syncCronSessionLiveSelection({ entry: params.cronSession.sessionEntry, liveSelection });
await params.persistSessionEntry();
continue;
}
}
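LiveSessionModelSwitchError is a special non-fatal error: it signals that the session runtime decided to switch models (for example because the model is overloaded or quota is exhausted). The executor catches it, updates the selection, and retries, for at most 2 retries (3 attempts in total).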
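The retry bound can be checked with a simplified sketch. To keep it short and testable it is synchronous and tracks only the model name, whereas the real loop is async and also updates the provider, the auth profile, and the persisted session. `ModelSwitchError` and `runWithModelSwitchRetries` are hypothetical names.

```typescript
// Hypothetical stand-in for LiveSessionModelSwitchError.
class ModelSwitchError extends Error {
  readonly model: string;
  constructor(model: string) {
    super(`switch to ${model}`);
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.model = model;
  }
}

const MAX_MODEL_SWITCH_RETRIES = 2;

function runWithModelSwitchRetries<T>(
  initialModel: string,
  run: (model: string) => T,
): { result: T; model: string; attempts: number } {
  let model = initialModel;
  let retries = 0;
  while (true) {
    try {
      const result = run(model);
      return { result, model, attempts: retries + 1 };
    } catch (err) {
      // Anything other than a model-switch request is a real failure.
      if (!(err instanceof ModelSwitchError)) throw err;
      retries += 1;
      if (retries > MAX_MODEL_SWITCH_RETRIES) throw err;
      model = err.model; // adopt the requested model and try again
    }
  }
}
```

Because the counter is incremented before the comparison, the third switch request is the one that propagates, which gives exactly three attempts.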
7.2.3 Interim Ack Detection and Retry
This is one of the subtler parts of the cron execution engine:
typescript
const shouldRetryInterimAck =
!runResult.meta?.error && // no runtime error
!runResult.didSendViaMessagingTool && // the agent did not send a message itself
!interimPayloadHasStructuredContent && // no structured content
!interimPayloads.some((payload) => payload?.isError === true) && // no error payload
isLikelyInterimCronMessage(interimText); // text matches the interim-ack pattern
When the agent returns an interim acknowledgement such as "on it", "working on it", or "give me a few minutes", the system retries automatically by sending a continuation prompt:
typescript
const continuationPrompt = [
"Your previous response was only an acknowledgement and did not complete this cron task.",
"Complete the original task now.",
"Do not send a status update like 'on it'.",
"Use tools when needed, including sessions_spawn for parallel subtasks, wait for spawned subagents to finish, then return only the final summary.",
].join(" ");
await executor.runPrompt(continuationPrompt);
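Subagent activity check: before retrying, the executor also looks for subagents. If descendants were started during this run or are still active, the agent has already delegated the task and no retry is needed: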
typescript
if (shouldRetryInterimAck) {
hasFreshDescendants = listDescendantRunsForRequester(params.agentSessionKey).some(entry => {
const descendantStartedAt = ...;
return descendantStartedAt >= runStartedAt; // only count descendants started after this run began
});
hasActiveDescendants = countActiveDescendantRuns(params.agentSessionKey) > 0;
}
if (shouldRetryInterimAck && !hasFreshDescendants && !hasActiveDescendants) {
await executor.runPrompt(continuationPrompt);
}
📊 Suggested diagram: executeCronRun execution flow
Nodes:
[executeCronRun] → [createCronPromptExecutor] → [runPrompt Loop] → [Interim Ack detection] → [retry/exit]
Edges:
- executeCronRun→runPrompt: "commandBody"
- runPrompt→catch: "LiveSessionModelSwitchError" (at most 2 retries)
- runPrompt→success: "CronPromptRunResult"
- success→interim check: "shouldRetryInterimAck?"
- interim→retry: "continuationPrompt" (condition: !hasFreshDescendants && !hasActiveDescendants)
- interim→exit: "return CronExecutionResult"
Annotations:
- "MAX_MODEL_SWITCH_RETRIES=2"
- "isLikelyInterimCronMessage: at most 45 words + matches the hint list"
- "Subagent check: listDescendantRunsForRequester + countActiveDescendantRuns"
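7.3 model-selection.ts --- Model Selection and Failover
File path: isolated-agent/model-selection.ts
Lines of code: ~100
Core responsibility: resolves the AI model a cron job should use, implementing a five-level priority chain with access-control validation.
7.3.1 Five-Level Model Priority
resolveCronModelSelection() priority (highest to lowest):
1. payload.model --- explicit job-level setting (highest priority, with access-control validation)
2. hooksGmailModel --- model override specific to the Gmail hook
3. sessionEntry.modelOverride --- persisted session-level override (the model switched to during a previous run)
4. agentConfigOverride --- agent-level config (subagents.model > model)
5. Global default --- resolveConfiguredModelRef(cfg)
7.3.2 Level-by-Level Resolution
The code starts from the lowest-priority level and lets later levels overwrite the selection, so the walkthrough below follows evaluation order rather than priority order.
Level 5: Global default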
typescript
const resolvedDefault = resolveConfiguredModelRef({
cfg: params.cfgWithAgentDefaults,
defaultProvider: DEFAULT_PROVIDER,
defaultModel: DEFAULT_MODEL,
});
let provider = resolvedDefault.provider;
let model = resolvedDefault.model;
resolveConfiguredModelRef reads the agents.defaults.model config and resolves it to a provider/model pair.
Level 4: Agent config override
typescript
const subagentModelRaw =
normalizeModelSelection(params.agentConfigOverride?.subagents?.model) ??
normalizeModelSelection(params.agentConfigOverride?.model) ??
normalizeModelSelection(params.cfg.agents?.defaults?.subagents?.model);
if (subagentModelRaw) {
const resolvedSubagent = resolveAllowedModelRef({
cfg, catalog, raw: subagentModelRaw, defaultProvider, defaultModel
});
if (!("error" in resolvedSubagent)) {
provider = resolvedSubagent.ref.provider;
model = resolvedSubagent.ref.model;
}
}
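Note: subagents.model (the subagent-specific model) is preferred, falling back to model (the general model) and then to the global subagents.model. If resolveAllowedModelRef returns an error, this level is skipped silently: nothing is interrupted, the selection is simply not overridden.
Level 2: Gmail hook model (priority 2 in the list above)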
typescript
const hooksGmailModelRef = params.isGmailHook ? resolveHooksGmailModel({cfg, defaultProvider}) : null;
if (hooksGmailModelRef) {
const status = getModelRefStatus({cfg, catalog, ref: hooksGmailModelRef, defaultProvider, defaultModel});
if (status.allowed) {
provider = hooksGmailModelRef.provider;
model = hooksGmailModelRef.model;
hooksGmailModelApplied = true;
}
}
The Gmail hook model must pass the allowed check; if it does not, it is skipped silently.
Level 1: Explicit payload setting (priority 1 in the list above)
typescript
const modelOverride = typeof modelOverrideRaw === "string" ? modelOverrideRaw.trim() : undefined;
if (modelOverride !== undefined && modelOverride.length > 0) {
const resolvedOverride = resolveAllowedModelRef({cfg, catalog, raw: modelOverride, ...});
if ("error" in resolvedOverride) {
if (resolvedOverride.error.startsWith("model not allowed:")) {
// Model not allowed → fall back to the current selection and emit a warning
return { ok: true, provider, model, warning: `...` };
}
// Other errors (malformed value, etc.) → hard failure
return { ok: false, error: resolvedOverride.error };
}
provider = resolvedOverride.ref.provider;
model = resolvedOverride.ref.model;
}
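Key distinction:
- "model not allowed:" → soft degradation (ok: true plus a warning); the job still runs.
- Any other error → hard failure (ok: false); the job cannot run.
Level 3: Session-level override (priority 3 in the list above)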
typescript
if (!modelOverride && !hooksGmailModelApplied) {
const sessionModelOverride = params.sessionEntry.modelOverride?.trim();
if (sessionModelOverride) {
const resolvedSessionOverride = resolveAllowedModelRef({...});
if (!("error" in resolvedSessionOverride)) {
provider = resolvedSessionOverride.ref.provider;
model = resolvedSessionOverride.ref.model;
}
}
}
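The session override takes effect only when neither the payload nor the Gmail hook has overridden the model, which is why it ranks third in priority even though the code evaluates it last. Errors are swallowed silently: the session override is produced automatically at runtime, so a malformed value should not block execution.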
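Putting the five levels together, the flow can be approximated by one pure function. This sketch makes several simplifying assumptions: models are plain strings rather than provider/model refs, `isAllowed` stands in for the catalog-backed access check, and the hard-failure path for malformed payload values is left out. `ModelInputs` and `resolveModel` are hypothetical names.

```typescript
type ModelInputs = {
  payloadModel?: string;
  gmailHookModel?: string; // only set when the run comes from the Gmail hook
  sessionOverride?: string;
  agentModel?: string;
  defaultModel: string;
};

function resolveModel(
  input: ModelInputs,
  isAllowed: (model: string) => boolean,
): { model: string; warning?: string } {
  let model = input.defaultModel; // global default
  if (input.agentModel && isAllowed(input.agentModel)) model = input.agentModel;
  let gmailApplied = false;
  if (input.gmailHookModel && isAllowed(input.gmailHookModel)) {
    model = input.gmailHookModel;
    gmailApplied = true;
  }
  const payload = input.payloadModel?.trim();
  if (payload) {
    // A disallowed payload model degrades softly: keep the current pick and warn.
    if (!isAllowed(payload)) return { model, warning: `model not allowed: ${payload}` };
    return { model: payload };
  }
  // The session override is evaluated last but only fills the gap left by
  // the payload and the Gmail hook.
  const session = input.sessionOverride?.trim();
  if (!gmailApplied && session && isAllowed(session)) model = session;
  return { model };
}
```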
7.3.3 Lazy Loading of the Model Catalog
typescript
let catalog: Awaited<ReturnType<typeof loadModelCatalog>> | undefined;
const loadCatalogOnce = async () => {
if (!catalog) {
catalog = await loadModelCatalog({ config: params.cfgWithAgentDefaults });
}
return catalog;
};
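The catalog is loaded only the first time validation needs it and is reused afterwards.
📊 Suggested diagram: model selection priority waterfall
Nodes (top to bottom, in evaluation order):
[Global default: resolveConfiguredModelRef] → [Agent config: subagents.model / model] → [Gmail Hook: resolveHooksGmailModel] → [Payload: job.payload.model] → [Session override: sessionEntry.modelOverride]
Edges:
- each level→next level: "not overridden / validation failed, pass through"
- each level→exit: "validation passed, use this model"
- Payload→exit(warning): "model not allowed, degrade + warning"
- Payload→exit(error): "other error, hard failure"
Annotations:
- "resolveAllowedModelRef: access-control validation"
- "getModelRefStatus: allowed check for the Gmail model"
- "Silent skip: validation failures at the Agent/Gmail/Session levels raise no error"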
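7.4 subagent-followup.ts --- Subagent Delegation
File path: isolated-agent/subagent-followup.ts
Lines of code: ~120
Core responsibility: waits for subagents to finish and collects their output as the source of the cron agent's final delivered content.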
7.4.1 readDescendantSubagentFallbackReply()
typescript
export async function readDescendantSubagentFallbackReply(params: {
sessionKey: string;
runStartedAt: number;
}): Promise<string | undefined> {
const descendants = listDescendantRunsForRequester(params.sessionKey)
.filter(entry => typeof entry.endedAt === "number" && entry.endedAt >= params.runStartedAt)
.toSorted((a, b) => (a.endedAt ?? 0) - (b.endedAt ?? 0));
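Steps:
- List all finished descendant runs (endedAt >= runStartedAt, so only those from this trigger count).
- Deduplicate by child session, keeping only the latest run per child session.
- Take the latest replies from the 4 most recent child sessions.
- Filter out SILENT_REPLY_TOKEN and join the rest into multi-paragraph text.
Frozen-result fallback: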
typescript
if (!reply && typeof entry.frozenResultText === "string" && entry.frozenResultText.trim()) {
reply = entry.frozenResultText.trim();
}
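When the child session transcript has already been deleted (for example by announce cleanup), the function falls back to the frozen result text stored in the registry.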
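The filter, dedupe, cap, and join steps can be sketched over in-memory data. This is an illustration only: the `DescendantRun` shape, the `reply` field (standing in for a transcript read), the "NO_REPLY" token value, and `collectDescendantReplies` are assumptions, and the real function is async.

```typescript
type DescendantRun = {
  childSessionKey: string;
  endedAt?: number;
  reply?: string;            // latest assistant reply, if the transcript still exists
  frozenResultText?: string; // registry copy kept after transcript cleanup
};

const SILENT_REPLY_TOKEN = "NO_REPLY"; // assumed token value
const MAX_CHILD_SESSIONS = 4;

function collectDescendantReplies(runs: DescendantRun[], runStartedAt: number): string | undefined {
  const finished = runs
    .filter((r) => typeof r.endedAt === "number" && r.endedAt >= runStartedAt)
    .sort((a, b) => (a.endedAt ?? 0) - (b.endedAt ?? 0));
  // Delete before set so the Map's insertion order follows each child's latest run.
  const latestByChild = new Map<string, DescendantRun>();
  for (const run of finished) {
    latestByChild.delete(run.childSessionKey);
    latestByChild.set(run.childSessionKey, run);
  }
  const replies = Array.from(latestByChild.values())
    .slice(-MAX_CHILD_SESSIONS)
    .map((r) => r.reply?.trim() || r.frozenResultText?.trim() || "")
    .filter((text) => text !== "" && text !== SILENT_REPLY_TOKEN);
  return replies.length > 0 ? replies.join("\n\n") : undefined;
}
```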
7.4.2 waitForDescendantSubagentSummary()
typescript
export async function waitForDescendantSubagentSummary(params: {
sessionKey: string;
initialReply?: string;
timeoutMs: number;
observedActiveDescendants?: boolean;
}): Promise<string | undefined> {
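Execution flow:
- Fast path: if there are no active descendants and none were observed, return initialReply immediately.
- Drain wait: call waitForAgentRunsToDrain(), a push-based wait (gateway RPC agent.wait) rather than busy polling.
- Grace-period polling: once all descendants finish, wait for the cron agent to produce its synthesized message (finalReplyGraceMs, 5 seconds by default).
- Final read: one last read after the grace period ends.
Timing configuration: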
typescript
function resolveCronSubagentTimings() {
const fastTestMode = process.env.OPENCLAW_TEST_FAST === "1";
return {
waitMinMs: fastTestMode ? 10 : 30_000, // minimum wait time
finalReplyGraceMs: fastTestMode ? 50 : 5_000, // grace period
gracePollMs: fastTestMode ? 8 : 200, // grace-period polling interval
};
}
Synthesized message detection:
typescript
const resolveUsableLatestReply = async () => {
const latest = (await readLatestAssistantReply({ sessionKey }))?.trim();
if (latest && latest !== SILENT_REPLY_TOKEN &&
(latest !== initialReply || !isLikelyInterimCronMessage(latest))) {
return latest;
}
return undefined;
};
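The latest reply counts as a usable synthesized message when it is non-empty, is not SILENT_REPLY_TOKEN, and either differs from the initial reply or is not itself an interim acknowledgement.
📊 Suggested diagram: subagent wait flow
Nodes:
[waitForDescendantSubagentSummary] → [fast path: no active descendants?] → [waitForAgentRunsToDrain] → [grace-period polling] → [final read]
Edges:
- fast path→exit: "return initialReply"
- fast path→wait: "active descendants exist"
- wait→grace: "all descendants finished"
- grace→found: "latest reply ≠ initial interim ack"
- grace→final: "grace period elapsed"
- final→exit: "return the final reply or undefined"
Annotations:
- "push-based: gateway RPC agent.wait"
- "finalReplyGraceMs=5000"
- "gracePollMs=200"
- "isLikelyInterimCronMessage: filters interim acks"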
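7.5 subagent-followup-hints.ts --- Subagent Delegation Hints
File path: isolated-agent/subagent-followup-hints.ts
Lines of code: ~40
Core responsibility: defines the text-pattern matching rules for "interim acknowledgement" and "subagent delegation".
7.5.1 Two Hint Lists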
typescript
const SUBAGENT_FOLLOWUP_HINTS = [
"subagent spawned", "spawned a subagent", "auto-announce when done",
"both subagents are running", "wait for them to report back",
] as const;
const INTERIM_CRON_HINTS = [
"on it", "pulling everything together", "give me a few", "give me a few min",
"few minutes", "let me compile", "i'll gather", "i will gather",
"working on it", "retrying now", "should be about", "should have your summary",
"it'll auto-announce when done", "it will auto-announce when done",
...SUBAGENT_FOLLOWUP_HINTS,
] as const;
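- SUBAGENT_FOLLOWUP_HINTS (5 entries): strong signal; the agent explicitly states that it has delegated to subagents.
- INTERIM_CRON_HINTS (19 entries as listed above: 14 of its own plus the 5 spread in from SUBAGENT_FOLLOWUP_HINTS): weak signal; the agent says it is working on the task but has not finished.
7.5.2 Matching Algorithm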
typescript
export function isLikelyInterimCronMessage(value: string): boolean {
const normalized = normalizeHintText(value);
if (!normalized) return false; // empty text is not an interim ack (it may be NO_REPLY)
const words = normalized.split(" ").filter(Boolean).length;
return words <= 45 && INTERIM_CRON_HINTS.some(hint => normalized.includes(hint));
}
export function expectsSubagentFollowup(value: string): boolean {
const normalized = normalizeHintText(value);
return Boolean(normalized && SUBAGENT_FOLLOWUP_HINTS.some(hint => normalized.includes(hint)));
}
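Design notes:
- 45-word cap: a long reply is not an interim acknowledgement even if it contains "on it" (the phrase may be part of a detailed report).
- Substring matching: includes rather than strict equality, which tolerates surrounding words.
- Empty text special case: empty text returns false; it means the agent chose to stay silent (NO_REPLY) and should not be retried.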
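The matcher is easy to exercise in isolation. In the sketch below, the `normalizeHintText` body (lowercase, collapse whitespace, trim) is an assumption because the source does not show it, `INTERIM_HINTS_SAMPLE` is a shortened sample of the real list, and `isLikelyInterim` is a hypothetical name for the same algorithm.

```typescript
// Shortened sample of the hint list, for illustration only.
const INTERIM_HINTS_SAMPLE = ["on it", "working on it", "give me a few", "subagent spawned"];

// Assumed normalization: lowercase, collapse whitespace runs, trim.
function normalizeHintText(value: string): string {
  return value.toLowerCase().replace(/\s+/g, " ").trim();
}

function isLikelyInterim(value: string): boolean {
  const normalized = normalizeHintText(value);
  if (!normalized) return false; // silence is not an interim ack
  const words = normalized.split(" ").filter(Boolean).length;
  return words <= 45 && INTERIM_HINTS_SAMPLE.some((hint) => normalized.includes(hint));
}
```

One consequence of substring matching is worth knowing: a short reply such as "decision items" also contains the characters "on it" and therefore matches. The 45-word cap limits the impact for long reports, but short false positives remain possible.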
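📊 Suggested diagram: hint classification
Nodes:
[input text] → [normalizeHintText] → [word-count check] → [substring match]
Edges:
- word count>45→exit(false): "long text cannot be an interim ack"
- empty text→exit(false): "NO_REPLY, do not retry"
- INTERIM_CRON_HINTS match→exit(true): "interim ack, retry needed"
- SUBAGENT_FOLLOWUP_HINTS match→expectsSubagentFollowup(true): "must wait for subagents"
Annotations:
- "INTERIM_CRON_HINTS: 19 patterns as listed (14 own + 5 from SUBAGENT_FOLLOWUP_HINTS)"
- "SUBAGENT_FOLLOWUP_HINTS: 5 patterns (a subset of INTERIM)"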
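7.6 session.ts --- Session Management
File path: isolated-agent/session.ts
Lines of code: ~60
Core responsibility: resolves or creates the cron session and implements the session reuse/rollover policy.
7.6.1 resolveCronSession() Core Logic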
typescript
export function resolveCronSession(params) {
const storePath = resolveStorePath(sessionCfg?.store, { agentId });
const store = loadSessionStore(storePath);
const entry = store[params.sessionKey];
if (!params.forceNew && entry?.sessionId) {
const resetPolicy = resolveSessionResetPolicy({ sessionCfg, resetType: "direct" });
const freshness = evaluateSessionFreshness({
updatedAt: entry.updatedAt, now: params.nowMs, policy: resetPolicy,
});
if (freshness.fresh) {
// Reuse: sessionId = entry.sessionId, isNewSession = false
} else {
// Roll over: sessionId = crypto.randomUUID(), isNewSession = true
}
} else {
// Forced new: sessionId = crypto.randomUUID(), isNewSession = true
}
clearBootstrapSnapshotOnSessionRollover({
sessionKey, previousSessionId: isNewSession ? entry?.sessionId : undefined,
});
const sessionEntry = {
...entry, // keep existing per-session overrides
sessionId, updatedAt, systemSent,
...(isNewSession && { // a new session clears routing state
lastChannel: undefined, lastTo: undefined, lastAccountId: undefined,
lastThreadId: undefined, deliveryContext: undefined, sessionFile: undefined,
}),
};
}
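Key design decisions:
- resetType: "direct": cron sessions use the "direct chat" reset policy, the same freshness evaluation as a 1:1 user conversation.
- Routing state reset: a new session clears lastThreadId and deliveryContext so that old thread routing does not leak into it (for example, if the previous run replied in a Telegram thread, the new session should not automatically continue that thread).
- Spread preservation: ...entry keeps fields that persist across sessions, such as authProfileOverride and contextTokens, even when the sessionId rolls over.
Bootstrap snapshot cleanup: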
typescript
clearBootstrapSnapshotOnSessionRollover({
sessionKey, previousSessionId: isNewSession ? entry?.sessionId : undefined,
});
When the session rolls over, the cached bootstrap prompt snapshot is cleared, because the new session needs to regenerate its system prompt.
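The reuse, rollover, and forced-new branches can be sketched as a pure function. Assumptions are labeled in the code: freshness is modeled as a simple idle-time window (the real policy comes from resolveSessionResetPolicy), the ID generator is injected instead of calling crypto.randomUUID(), and only one routing field is shown. `CronSessionEntry` and `resolveSession` are hypothetical names.

```typescript
type CronSessionEntry = {
  sessionId: string;
  updatedAt: number;
  lastThreadId?: string | undefined;
  contextTokens?: number | undefined;
};

function resolveSession(opts: {
  entry?: CronSessionEntry;
  forceNew: boolean;
  nowMs: number;
  maxIdleMs: number;   // assumed freshness rule: idle-time window
  newId: () => string; // injected so the sketch stays deterministic
}): { entry: CronSessionEntry; isNewSession: boolean } {
  const { entry, forceNew, nowMs, maxIdleMs, newId } = opts;
  const fresh = entry !== undefined && nowMs - entry.updatedAt <= maxIdleMs;
  const isNewSession = forceNew || !fresh;
  const sessionId = !isNewSession && entry ? entry.sessionId : newId();
  return {
    isNewSession,
    entry: {
      ...entry, // per-session overrides survive a rollover
      sessionId,
      updatedAt: nowMs,
      // Routing state must not leak into a new session.
      ...(isNewSession ? { lastThreadId: undefined } : {}),
    },
  };
}
```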
📊 Suggested diagram: session reuse decision tree
Nodes:
[resolveCronSession] → [forceNew?] → [entry exists?] → [freshness.fresh?] → [return result]
Edges:
- forceNew=true→new session: "isolated sessionTarget"
- forceNew=false, no entry→new session: "first run"
- forceNew=false, entry exists, fresh=true→reuse: "keep sessionId"
- forceNew=false, entry exists, fresh=false→roll over: "new UUID + clear routing state"
Annotations:
- "resetType=direct: 1:1 conversation freshness policy"
- "clearBootstrapSnapshotOnSessionRollover: clear the cache on session rollover"
- "Kept: authProfileOverride, contextTokens, estimatedCostUsd"
- "Cleared: lastChannel, lastTo, lastAccountId, lastThreadId, deliveryContext, sessionFile"
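7.7 skills-snapshot.ts --- Skills Snapshot
File path: isolated-agent/skills-snapshot.ts
Lines of code: ~45
Core responsibility: resolves the skills snapshot available to the cron agent and implements the incremental update strategy.
7.7.1 resolveCronSkillsSnapshot() Core Logic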
typescript
export async function resolveCronSkillsSnapshot(params): Promise<SkillSnapshot> {
if (params.isFastTestEnv) {
return params.existingSnapshot ?? { prompt: "", skills: [] };
}
const snapshotVersion = runtime.getSkillsSnapshotVersion(params.workspaceDir);
const skillFilter = runtime.resolveAgentSkillsFilter(params.config, params.agentId);
const shouldRefresh =
!existingSnapshot ||
existingSnapshot.version !== snapshotVersion ||
!matchesSkillFilter(existingSnapshot.skillFilter, skillFilter);
if (!shouldRefresh) return existingSnapshot;
return runtime.buildWorkspaceSkillSnapshot(params.workspaceDir, {
config, agentId, skillFilter,
eligibility: {
remote: runtime.getRemoteSkillEligibility({
advertiseExecNode: runtime.canExecRequestNode({ cfg: config, agentId }),
}),
},
snapshotVersion,
});
}
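Three refresh conditions (any one of them triggers a rebuild):
- No existing snapshot (first run).
- snapshotVersion changed (workspace files were modified).
- skillFilter does not match (the agent's skill filter config changed).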
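The refresh decision reduces to a three-part boolean. In this sketch the snapshot shape is trimmed to two fields, and the `matchesSkillFilter` body (order-insensitive comparison, undefined meaning "no filter") is an assumption because the source does not show it. `SnapshotInfo` and `shouldRefreshSnapshot` are hypothetical names.

```typescript
type SnapshotInfo = { version: number; skillFilter?: string[] | undefined };

// Assumed comparison: same names regardless of order; undefined only matches undefined.
function matchesSkillFilter(a: string[] | undefined, b: string[] | undefined): boolean {
  if (a === undefined || b === undefined) return a === b;
  if (a.length !== b.length) return false;
  const sortedA = [...a].sort();
  const sortedB = [...b].sort();
  return sortedA.every((name, i) => name === sortedB[i]);
}

function shouldRefreshSnapshot(
  existing: SnapshotInfo | undefined,
  currentVersion: number,
  currentFilter: string[] | undefined,
): boolean {
  return (
    !existing ||
    existing.version !== currentVersion ||
    !matchesSkillFilter(existing.skillFilter, currentFilter)
  );
}
```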
Remote skill eligibility:
typescript
eligibility: {
remote: runtime.getRemoteSkillEligibility({
advertiseExecNode: runtime.canExecRequestNode({ cfg, agentId }),
}),
}
canExecRequestNode checks whether the agent is configured with exec-node capability, which decides whether remote skills are included in the snapshot.
📊 Suggested diagram: skills snapshot refresh decision
Nodes:
[resolveCronSkillsSnapshot] → [isFastTestEnv?] → [shouldRefresh?] → [buildWorkspaceSkillSnapshot]
Edges:
- fastTest→exit: "return existingSnapshot or an empty snapshot"
- !shouldRefresh→exit: "return existingSnapshot"
- shouldRefresh→build: "rebuild"
Annotations:
- "shouldRefresh = !existing || version changed || filter changed"
- "eligibility.remote: depends on canExecRequestNode"
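8. Delivery System Deep Dive
The delivery system is the bridge between the Cron module and the outside world (Telegram, Discord, Feishu, and so on): it delivers the agent's results to the configured channel and target. It has three layers: plan generation (delivery-plan), target resolution (delivery-target), and dispatch (delivery-dispatch).
8.1 delivery-plan.ts --- Delivery Plan Generation
File path: delivery-plan.ts
Lines of code: ~180
Core responsibility: builds a structured delivery plan from the job's delivery config and resolves the failure destination (failureDestination).
8.1.1 resolveCronDeliveryPlan() --- Primary Delivery Plan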
typescript
export function resolveCronDeliveryPlan(job: CronJob): CronDeliveryPlan {
const delivery = job.delivery;
const hasDelivery = delivery && typeof delivery === "object";
// Mode normalization
const mode = normalizedMode === "announce" ? "announce"
: normalizedMode === "webhook" ? "webhook"
: normalizedMode === "none" ? "none"
: normalizedMode === "deliver" ? "announce" // deliver → alias for announce
: undefined;
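Mode mapping rules:
| Input | Mapped to | Notes |
|---|---|---|
| announce | announce | Deliver directly to the channel |
| deliver | announce | Backward-compatible alias |
| webhook | webhook | HTTP POST callback |
| none | none | No delivery |
| Unspecified | Determined by payload kind | See below |
Implicit mode inference (when the job has no delivery config):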
typescript
const isIsolatedAgentTurn =
job.payload.kind === "agentTurn" &&
(job.sessionTarget === "isolated" || job.sessionTarget === "current" ||
job.sessionTarget.startsWith("session:"));
const resolvedMode = isIsolatedAgentTurn ? "announce" : "none";
Isolated agent turns default to announce mode: the agent in a standalone session cannot send messages itself (disableMessageTool=true), so the result has to be delivered by the system.
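The mapping table and the implicit inference combine into a short sketch. The job shape is flattened into a hypothetical `JobLike` type, the trim-and-lowercase normalization is an assumption, and `normalizeMode` and `inferDeliveryMode` are illustrative names rather than the source's functions.

```typescript
type DeliveryMode = "announce" | "webhook" | "none";

// Flattened, hypothetical view of the job fields that matter here.
type JobLike = { payloadKind: string; sessionTarget: string; deliveryMode?: unknown };

function normalizeMode(raw: unknown): DeliveryMode | undefined {
  const value = typeof raw === "string" ? raw.trim().toLowerCase() : undefined;
  if (value === "announce" || value === "deliver") return "announce"; // deliver is a legacy alias
  if (value === "webhook" || value === "none") return value;
  return undefined;
}

function inferDeliveryMode(job: JobLike): DeliveryMode {
  const explicit = normalizeMode(job.deliveryMode);
  if (explicit) return explicit;
  const isIsolatedAgentTurn =
    job.payloadKind === "agentTurn" &&
    (job.sessionTarget === "isolated" ||
      job.sessionTarget === "current" ||
      job.sessionTarget.startsWith("session:"));
  return isIsolatedAgentTurn ? "announce" : "none";
}
```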
8.1.2 resolveFailureDestination() --- Failure Destination
typescript
export function resolveFailureDestination(
job: CronJob, globalConfig?: CronFailureDestinationConfig,
): CronFailureDeliveryPlan | null {
Priority: job-level failureDestination > global config cron.failureDestination.
Deduplication logic:
typescript
if (delivery && isSameDeliveryTarget(delivery, result)) {
return null; // failure destination equals the normal delivery target: do not send twice
}
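isSameDeliveryTarget compares mode/channel/to/accountId. If the failure destination matches the normal delivery target, the failure notification is skipped (normal delivery already sends the error message).
Special validation for webhook mode: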
typescript
if (resolvedMode === "webhook" && !to) {
return null; // webhook requires a URL; without one the destination is invalid
}
Fine-grained job override logic:
typescript
if (hasJobChannelField) channel = jobChannel; // "channel" in jobFailureDest → override
if (hasJobToField) to = jobTo; // "to" in jobFailureDest → override
if (hasJobAccountIdField) accountId = jobAccountId; // "accountId" in jobFailureDest → override
The check uses "field" in obj rather than obj.field !== undefined, which distinguishes "explicitly set to empty" from "not set". Only fields that are explicitly present override the global config.
Mode switching and its effect on to:
typescript
if (jobMode !== undefined) {
const globalMode = globalConfig?.mode ?? "announce";
if (!jobToExplicitValue && globalMode !== jobMode) {
to = undefined; // mode differs from the global mode (e.g. announce→webhook): clear the inherited to (it may be a chat ID rather than a URL)
}
mode = jobMode;
}
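When the job's mode differs from the global mode (for example, switching from announce to webhook) and the job does not set to explicitly, the inherited global to is cleared: in announce mode to is a chat ID, while in webhook mode it is a URL, and the two are not interchangeable.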
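Both rules, presence-based overrides and clearing `to` on a mode switch, fit in one small merge function. This is a sketch only: `FailureDest` is a reduced, hypothetical shape (no accountId), and `mergeFailureDestination` is not the source's function.

```typescript
type FailureDest = {
  mode?: "announce" | "webhook" | undefined;
  channel?: string | undefined;
  to?: string | undefined;
};

function mergeFailureDestination(globalCfg: FailureDest, jobCfg: FailureDest): FailureDest | null {
  let { channel, to } = globalCfg;
  let mode = globalCfg.mode ?? "announce";
  // `in` is true for a key that is present even when its value is undefined,
  // so a job can explicitly blank out a field inherited from the global config.
  if ("channel" in jobCfg) channel = jobCfg.channel;
  if ("to" in jobCfg) to = jobCfg.to;
  if (jobCfg.mode !== undefined) {
    const jobSetsTo = typeof jobCfg.to === "string" && jobCfg.to.trim() !== "";
    // An inherited `to` has the wrong shape for a different mode (chat ID vs URL).
    if (!jobSetsTo && jobCfg.mode !== mode) to = undefined;
    mode = jobCfg.mode;
  }
  if (mode === "webhook" && !to) return null; // a webhook needs a URL
  return { mode, channel, to };
}
```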
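📊 Suggested diagram: delivery plan generation
Nodes:
[resolveCronDeliveryPlan] → [hasDelivery?] → [mode normalization] → [field extraction] → [return CronDeliveryPlan]
[resolveFailureDestination] → [global config] → [job override] → [dedup check] → [return CronFailureDeliveryPlan | null]
Edges:
- hasDelivery=false→implicit inference: "isolated→announce, others→none"
- mode=deliver→announce: "alias mapping"
- isSameDeliveryTarget→null: "failure destination equals the normal target, deduplicated"
- webhook && !to→null: "missing URL, invalid"
Annotations:
- "Job overrides take precedence over the global config"
- "hasJobChannelField: 'channel' in obj (distinguishes explicit empty from not set)"
- "to is cleared when the mode switches"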
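8.2 delivery-dispatch.ts --- Delivery Dispatch
File path: isolated-agent/delivery-dispatch.ts
Lines of code: ~430
Core responsibility: performs the actual dispatch of delivery results, covering announce/webhook delivery, subagent output collection, NO_REPLY suppression, the idempotency cache, transient retries, and staleness detection.
This is the most complex single file in the cron module; it interleaves delivery routing, subagent waiting, retry policy, and cache management.
8.2.1 dispatchCronDelivery() --- Dispatch Entry Point
The function signature takes 25+ parameters and returns DispatchCronDeliveryState: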
typescript
export async function dispatchCronDelivery(params): Promise<DispatchCronDeliveryState> {
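Early exit paths:
- Delivery not requested: return { delivered, deliveryAttempted, ... } immediately.
- Heartbeat-only: skip delivery (skipHeartbeatDelivery).
- Already sent via the messaging tool: when skipMessagingToolDelivery is set, mark the run as delivered.
- resolvedDelivery failed: without bestEffort → return an error; with bestEffort → log a warning and continue.
8.2.2 deliverViaDirect() --- Direct Delivery Path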
typescript
const deliverViaDirect = async (delivery, options?) => {
  // 1. NO_REPLY suppression
  const payloadsForDelivery = rawPayloads.filter(p => {
    const text = p.text ?? "";
    if (isSilentReplyText(text, SILENT_REPLY_TOKEN)) return false;
    const upper = text.toUpperCase();
    const stripped = stripSilentToken(upper, SILENT_REPLY_TOKEN);
    return stripped === upper.trim();
  });
  // 2. Nothing left after filtering → finishSilentReplyDelivery
  if (payloadsForDelivery.length === 0) return await finishSilentReplyDelivery();
  // 3. Staleness check
  if (isStaleCronDelivery({ job, runStartedAt })) { ... skip ... }
  // 4. Idempotency cache
  const cachedResults = getCompletedDirectCronDelivery(deliveryIdempotencyKey);
  if (cachedResults) { delivered = true; return null; }
  // 5. Actual delivery
  const deliveryResults = options?.retryTransient
    ? await retryTransientDirectCronDelivery({ jobId, signal, run: runDelivery })
    : await runDelivery();
  // 6. bestEffort partial-failure handling
  let hadPartialFailure = false;
  const onError = params.deliveryBestEffort
    ? (err, _payload) => { hadPartialFailure = true; logCronDeliveryErrorDeferred(...); }
    : undefined;
  // 7. On success: cache the result and send the awareness notification
  if (delivered && shouldQueueCronAwareness(job, deliveryBestEffort)) {
    await queueCronAwarenessSystemEvent({...});
  }
  if (delivered) rememberCompletedDirectCronDelivery(deliveryIdempotencyKey, deliveryResults);
};
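The intent behind the seven steps:
- NO_REPLY suppression: handles both an exact match and a NO_REPLY token appended at the end.
- Empty filter: once every payload has been suppressed, marks delivered=false but deliveryAttempted=true.
- Staleness check: if delivery is delayed by more than 3 hours (STALE_CRON_DELIVERY_MAX_START_DELAY_MS), delivery is skipped.
- Idempotency cache: the key is built from runSessionId + channel + accountId + to + threadId to prevent duplicate delivery.
- Transient retry: retryTransientDirectCronDelivery retries up to 3 times, at 5s/10s/20s intervals.
- bestEffort: on partial failure the error is logged without aborting, and delivered is set to false.
- Awareness notification: after an isolated run delivers successfully, a system event is sent to the main session so the user knows there is new output.
8.2.3 finalizeTextDelivery() --- Finalizing Text Delivery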
typescript
const finalizeTextDelivery = async (delivery) => {
  // 1. Collect subagent output
  const expectedSubagentFollowup = expectsSubagentFollowup(initialSynthesizedText);
  let activeSubagentRuns = countActiveDescendantRuns(agentSessionKey);
  // 2. Fall back to completed descendants
  const completedDescendantReply = shouldCheckCompletedDescendants
    ? await readDescendantSubagentFallbackReply({ sessionKey, runStartedAt })
    : undefined;
  // 3. Wait for active descendants
  if (activeSubagentRuns > 0 || expectedSubagentFollowup) {
    let finalReply = await waitForDescendantSubagentSummary({...});
    // ...
  }
  // 4. Descendants still active → return a partial result
  if (activeSubagentRuns > 0) {
    deliveryAttempted = true;
    return { status: "ok", deliveryAttempted, ... };
  }
  // 5. Suppress an interim ack that never improved
  if (hadDescendants && synthesizedText === initialSynthesizedText && isLikelyInterimCronMessage(...)) {
    deliveryAttempted = true;
    return { status: "ok", ... }; // suppress the stale "on it"
  }
  // 6. NO_REPLY detection
  // 7. Actual delivery
  return await deliverViaDirect(delivery, { retryTransient: true });
};
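Three-tier strategy for collecting subagent output:
| Condition | Behavior |
|---|---|
| Active descendants exist, or expectsSubagentFollowup | waitForDescendantSubagentSummary() --- push-based wait |
| No active descendants, but completed descendants exist and isLikelyInterimCronMessage | readDescendantSubagentFallbackReply() --- read the descendants' replies, with the frozen result as fallback |
| Neither applies | Deliver the initial text directly |
8.2.4 Transient Error Classification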
typescript
const TRANSIENT_DIRECT_CRON_DELIVERY_ERROR_PATTERNS: readonly RegExp[] = [
/\berrorcode=unavailable\b/i,
/\bUNAVAILABLE\b/,
/no active .* listener/i,
/gateway not connected/i,
/gateway closed \(1006/i,
/\b(econnreset|econnrefused|etimedout|enotfound|ehostunreach|network error)\b/i,
];
const PERMANENT_DIRECT_CRON_DELIVERY_ERROR_PATTERNS: readonly RegExp[] = [
/unsupported channel/i,
/chat not found/i,
/bot.*not.*member/i,
/bot was blocked by the user/i,
/forbidden: bot was kicked/i,
];
分类逻辑:
- 先检查是否匹配永久错误模式
- 若不匹配永久模式,再检查是否匹配瞬态模式
- 未匹配任何模式 → 不重试
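The ordering above (permanent first, then transient, otherwise no retry) can be sketched as follows. The pattern lists are shortened copies of the excerpt; classifyDeliveryError and shouldRetry are hypothetical names used only for illustration.

```typescript
type DeliveryErrorKind = "permanent" | "transient" | "unknown";

// Shortened subsets of the pattern lists quoted above.
const PERMANENT_PATTERNS: readonly RegExp[] = [
  /unsupported channel/i,
  /chat not found/i,
  /bot was blocked by the user/i,
];
const TRANSIENT_PATTERNS: readonly RegExp[] = [
  /\bUNAVAILABLE\b/,
  /gateway not connected/i,
  /\b(econnreset|econnrefused|etimedout)\b/i,
];

function classifyDeliveryError(message: string): DeliveryErrorKind {
  // Permanent patterns win, so a message matching both lists is never retried.
  if (PERMANENT_PATTERNS.some((re) => re.test(message))) return "permanent";
  if (TRANSIENT_PATTERNS.some((re) => re.test(message))) return "transient";
  return "unknown";
}

function shouldRetry(message: string): boolean {
  // Unknown errors are treated like permanent ones: no retry.
  return classifyDeliveryError(message) === "transient";
}
```

Checking the permanent list first means a compound message such as "chat not found: gateway not connected" is classified as permanent.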
8.2.5 幂等性缓存与裁剪
typescript
const COMPLETED_DIRECT_CRON_DELIVERIES = new Map<string, CompletedDirectCronDelivery>();
function pruneCompletedDirectCronDeliveries(now: number) {
const ttlMs = process.env.OPENCLAW_TEST_FAST === "1" ? 60_000 : 24 * 60 * 60 * 1000;
// TTL 过期清除
// 超过 2000 条时按时间排序删除最旧的
}
内存缓存,24 小时 TTL,上限 2000 条。每次读写时触发裁剪。防止进程长时间运行后内存泄漏。
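A minimal sketch of the two pruning passes (TTL expiry, then a size cap that evicts the oldest entries). The entry shape with a recordedAtMs field and the function name pruneCompleted are assumptions; the excerpt elides the real body.

```typescript
// Hypothetical entry shape; the real CompletedDirectCronDelivery type is not shown in the article.
type CompletedDelivery = { recordedAtMs: number };

function pruneCompleted(
  cache: Map<string, CompletedDelivery>,
  now: number,
  ttlMs: number,
  maxEntries: number,
): void {
  // Pass 1: drop entries whose TTL has expired.
  for (const [key, entry] of Array.from(cache.entries())) {
    if (now - entry.recordedAtMs >= ttlMs) cache.delete(key);
  }
  // Pass 2: if still over the cap, evict the oldest entries first.
  if (cache.size <= maxEntries) return;
  const oldestFirst = Array.from(cache.entries()).sort(
    (a, b) => a[1].recordedAtMs - b[1].recordedAtMs,
  );
  for (const [key] of oldestFirst.slice(0, cache.size - maxEntries)) {
    cache.delete(key);
  }
}
```

With the values from the article, this would be called with ttlMs = 24 * 60 * 60 * 1000 and maxEntries = 2000.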
📊 建议配图:交付调度流
节点:
[dispatchCronDelivery] → [交付请求?] → [resolveDelivery 成功?] → [结构化内容?] → [deliverViaDirect / finalizeTextDelivery]
子流程 (deliverViaDirect):
[NO_REPLY 过滤] → [陈旧性检测] → [幂等性缓存检查] → [实际投递] → [bestEffort 处理] → [awareness 通知] → [缓存记录]
子流程 (finalizeTextDelivery):
[子代理检测] → [等待/读取子代输出] → [临时确认抑制] → [NO_REPLY 检测] → [deliverViaDirect]
边:
- heartbeat-only→exit: "skipHeartbeatDelivery=true"
- messaging-tool→exit: "skipMessagingToolDelivery=true"
- 陈旧→exit: "STALE_CRON_DELIVERY_MAX_START_DELAY_MS=3h"
- 缓存命中→exit: "delivered=true, return null"
- 瞬态错误→重试: "3次, 5s/10s/20s 间隔"
- 永久错误→exit(error): "非 bestEffort 时返回错误"
标注:
- "幂等性键: runSessionId:channel:accountId:to:threadId"
- "bestEffort: hadPartialFailure → delivered=false"
- "子代理等待: push-based (gateway RPC)"
8.3 delivery-target.ts --- 目标解析
文件路径 : isolated-agent/delivery-target.ts
代码行数 : ~130 行
核心职责: 将 delivery 配置解析为具体的渠道/目标/账户/线程,处理"last"模式的回退逻辑。
8.3.1 resolveDeliveryTarget() 核心流程
typescript
export async function resolveDeliveryTarget(cfg, agentId, jobPayload): Promise<DeliveryTargetResolution> {
const requestedChannel = typeof jobPayload.channel === "string" ? jobPayload.channel : "last";
const explicitTo = typeof jobPayload.to === "string" ? jobPayload.to : undefined;
Step 1: 会话查找
typescript
const mainSessionKey = resolveAgentMainSessionKey({ cfg, agentId });
const store = loadSessionStore(storePath);
const threadEntry = threadSessionKey ? store[threadSessionKey] : undefined;
const main = threadEntry ?? store[mainSessionKey];
优先查找线程专用会话(如 agent:main:main:thread:1234),回退到主会话。
Step 2: 初步解析
typescript
const preliminary = resolveSessionDeliveryTarget({
entry: main, requestedChannel, explicitTo, explicitThreadId, allowMismatchedLastTo,
});
resolveSessionDeliveryTarget 从会话条目中提取 lastChannel/lastTo/lastAccountId/lastThreadId,与请求参数匹配。
Step 3: Channel 回退
typescript
if (!preliminary.channel) {
if (preliminary.lastChannel) {
fallbackChannel = preliminary.lastChannel;
} else {
const selection = await resolveMessageChannelSelection({ cfg });
fallbackChannel = selection.channel;
}
}
三级回退:请求 channel → 会话 lastChannel → 全局默认 channel 选择。
Step 4: AccountId 解析
typescript
const explicitAccountId = jobPayload.accountId?.trim();
let accountId = explicitAccountId ?? resolved.accountId;
if (!accountId && channel) {
accountId = resolveFirstBoundAccountId({ cfg, channelId: channel, agentId });
}
if (jobPayload.accountId) accountId = jobPayload.accountId; // 最高优先级
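AccountId priority: the accountId set on the job's delivery config > the session's lastAccountId > the first account bound to the agent. In the excerpt above, explicitAccountId is simply the trimmed form of jobPayload.accountId, and the final assignment re-applies the raw jobPayload.accountId, so both of the top two tiers come from the same field.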
Step 5: Target 解析(docking)
typescript
const docked = await resolveOutboundTargetWithRuntime({
channel, to: toCandidate, cfg, accountId, mode, allowFrom: effectiveAllowFrom,
});
const idLikeTarget = await maybeResolveIdLikeTarget({ cfg, channel, input: docked.to, accountId });
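Two-step resolution:
- resolveOutboundTargetWithRuntime: resolves the target identifier into the channel's internal format (for example, a Telegram chat ID).
- maybeResolveIdLikeTarget: resolves an ID-like string (for example, ou_xxxx) into the actual target.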
Step 6: allowFrom 安全检查
typescript
if (mode === "implicit") {
const configuredAllowFrom = channelPlugin?.config.resolveAllowFrom?.({ cfg, accountId });
const storeAllowFrom = readChannelAllowFromStoreEntriesSync(channel, env, accountId);
const allowFromOverride = [...new Set([...configuredAllowFrom, ...storeAllowFrom])];
if (toCandidate && allowFromOverride.length > 0) {
const currentTargetResolution = await resolveOutboundTargetWithRuntime({...});
if (!currentTargetResolution.ok) {
toCandidate = allowFromOverride[0]; // 回退到 allowFrom 第一个允许的目标
}
}
}
implicit 模式下,若目标不在 allowFrom 列表中,自动回退到列表中的第一个允许目标------确保不会向未授权的目标发送消息。
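The allowFrom fallback can be sketched as a pure function. This is a simplification under stated assumptions: the real code validates the candidate through resolveOutboundTargetWithRuntime, whereas this sketch uses a plain membership test; resolveImplicitTarget is a hypothetical name.

```typescript
function resolveImplicitTarget(
  candidate: string | undefined,
  configuredAllowFrom: readonly string[] | undefined,
  storeAllowFrom: readonly string[],
): string | undefined {
  // Union of configured and stored entries, de-duplicated, configured entries first.
  const allowFrom = Array.from(new Set([...(configuredAllowFrom ?? []), ...storeAllowFrom]));
  // With no candidate or no allow list there is nothing to enforce.
  if (!candidate || allowFrom.length === 0) return candidate;
  // Assumption: membership test stands in for the runtime target resolution.
  return allowFrom.includes(candidate) ? candidate : allowFrom[0];
}
```

Note the `?? []` guard: a channel plugin without resolveAllowFrom yields undefined, which cannot be spread directly.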
📊 建议配图:目标解析流
节点:
[resolveDeliveryTarget] → [会话查找] → [初步解析] → [Channel 回退] → [AccountId 解析] → [Target Docking] → [allowFrom 检查] → [返回结果]
边:
- 会话查找→thread优先→main回退: "先线程会话,后主会话"
- Channel 回退→lastChannel→全局默认: "requested→last→resolveMessageChannelSelection"
- AccountId→job.delivery.accountId: "最高优先级"
- allowFrom 不匹配→回退第一个: "implicit 模式安全保护"
- docking 失败→ok=false: "返回错误"
标注:
- "allowFrom = configured ∪ store, 去重"
- "maybeResolveIdLikeTarget: ID 格式解析"
8.4 webhook 与 announce 模式对比
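| Dimension | announce | webhook |
|---|---|---|
| Delivery method | Channel message (Telegram/Discord/Feishu, etc.) | HTTP POST request |
| Target resolution | resolveDeliveryTarget → channel/user/thread | delivery.to used as the URL |
| Subagent wait | ✅ finalizeTextDelivery waits for subagents to finish | ❌ Sent directly, no waiting |
| Transient retry | ✅ 3 retries with exponential backoff | Depends on the webhook implementation |
| Idempotency cache | ✅ COMPLETED_DIRECT_CRON_DELIVERIES | ❌ |
| Staleness check | ✅ 3-hour threshold | ❌ |
| NO_REPLY suppression | ✅ | ❌ |
| bestEffort | ✅ Tolerates partial failure | ❌ |
| Awareness notification | ✅ queueCronAwarenessSystemEvent | ❌ |
| Channel/AccountId | Required | Not needed |
| Thread support | ✅ threadId | ❌ |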
设计哲学差异:
- announce 是"人可读"交付------面向最终用户,需要完整的消息路由、格式转换、安全检查
- webhook 是"机器可读"交付------面向外部系统,关注可靠性和简单性
8.5 故障目标(failureDestination)处理
触发路径 :delivery.ts → sendFailureNotificationAnnounce()
typescript
export async function sendFailureNotificationAnnounce(
deps, cfg, agentId, jobId, target, message,
): Promise<void> {
const resolvedTarget = await resolveDeliveryTarget(cfg, agentId, {
channel: target.channel, to: target.to, accountId: target.accountId,
sessionKey: target.sessionKey,
});
// 投递失败消息(单条 text payload)
await deliverOutboundPayloads({
channel: resolvedTarget.channel, to: resolvedTarget.to, ...
payloads: [{ text: message }],
bestEffort: false, // 故障通知不允许部分失败
});
}
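Key constraints:
- 30-second timeout (FAILURE_NOTIFICATION_TIMEOUT_MS)
- bestEffort: false, so the failure notification must be delivered in full
- A delivery failure is only logged as a warning and does not affect what the main flow returns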
去重逻辑 (在 resolveFailureDestination 中):
typescript
if (delivery && isSameDeliveryTarget(delivery, result)) {
return null; // 与正常交付相同,避免双重通知
}
📊 建议配图:故障目标处理
节点:
[Cron 运行失败] → [resolveFailureDestination] → [sendFailureNotificationAnnounce] → [resolveDeliveryTarget] → [deliverOutboundPayloads]
边:
- resolveFailureDestination→null: "与正常交付相同 或 配置无效"
- resolveFailureDestination→plan: "独立故障目标"
- resolveDeliveryTarget 失败→warn+exit: "只记录警告"
- deliverOutboundPayloads 失败→warn: "bestEffort=false, 30s 超时"
标注:
- "30s 超时独立 AbortController"
- "去重: isSameDeliveryTarget"
九、辅助模块深度解析
9.1 normalize.ts --- 输入规范化
文件路径 : normalize.ts
代码行数 : ~300 行
核心职责: 将用户/CLI/API 的原始输入规范化为内部统一的 CronJob 格式。这是系统防御性编程的第一道防线。
9.1.1 整体架构
normalizeCronJobInput(raw, options)
├─ unwrapJob(raw) --- 解包 {data: ...} / {job: ...} 包装
├─ Agent ID / SessionKey / Enabled 规范化
├─ coerceSchedule(base.schedule)
├─ inferTopLevelPayload(next) --- 顶层字段推断 payload
├─ coercePayload(base.payload)
├─ coerceDelivery(base.delivery)
├─ copyTopLevelAgentTurnFields() --- 旧格式字段迁移到 payload
├─ stripLegacyTopLevelFields() --- 清除旧格式顶层字段
└─ applyDefaults (可选)
9.1.2 coerceSchedule() 逐行解析
typescript
function coerceSchedule(schedule: UnknownRecord) {
const next: UnknownRecord = { ...schedule };
Kind 推断:
typescript
const rawKind = normalizeLowercaseStringOrEmpty(schedule.kind);
const kind = rawKind === "at" || rawKind === "every" || rawKind === "cron" ? rawKind : undefined;
// 若 kind 未明确指定:
if (typeof schedule.atMs === "number" || typeof schedule.at === "string" || ...) {
next.kind = "at"; // 有 atMs/at → 一次性
} else if (typeof schedule.everyMs === "number") {
next.kind = "every"; // 有 everyMs → 重复
} else if (normalizedExpr) {
next.kind = "cron"; // 有 cron 表达式 → cron
}
三段式推断:atMs/at → everyMs → expr,覆盖最常见的用户输入模式。
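The inference order can be condensed into a small standalone function. This is a sketch of the rule as described, not the module's code: inferScheduleKind is a hypothetical name, and the elided conditions in the excerpt are left out.

```typescript
type ScheduleKind = "at" | "every" | "cron";

function inferScheduleKind(schedule: Record<string, unknown>): ScheduleKind | undefined {
  // An explicit, valid kind always wins (case-insensitive, whitespace-tolerant).
  const kind = schedule["kind"];
  const raw = typeof kind === "string" ? kind.trim().toLowerCase() : "";
  if (raw === "at" || raw === "every" || raw === "cron") return raw;

  // Otherwise infer from the fields that are present, in priority order.
  if (typeof schedule["atMs"] === "number" || typeof schedule["at"] === "string") return "at";
  if (typeof schedule["everyMs"] === "number") return "every";
  const expr = schedule["expr"];
  if (typeof expr === "string" && expr.trim() !== "") return "cron";
  return undefined;
}
```

Because the checks run in order, an input carrying both atMs and everyMs resolves to "at".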
atMs 规范化:
typescript
const parsedAtMs =
typeof atMsRaw === "number" ? atMsRaw // 直接数值
: typeof atMsRaw === "string" ? parseAbsoluteTimeMs(atMsRaw) // 字符串时间戳
: atString ? parseAbsoluteTimeMs(atString) // ISO 时间字符串
: null;
if (atString) {
next.at = parsedAtMs !== null ? new Date(parsedAtMs).toISOString() : atString;
}
delete next.atMs; // 统一为 ISO 字符串格式
Kind 清理(删除不相关的字段):
typescript
if (next.kind === "at") {
delete next.everyMs; delete next.anchorMs; delete next.expr;
delete next.tz; delete next.staggerMs;
} else if (next.kind === "every") {
delete next.at; delete next.expr; delete next.tz; delete next.staggerMs;
} else if (next.kind === "cron") {
delete next.at; delete next.everyMs; delete next.anchorMs;
}
每种 kind 只保留相关字段,减少存储体积和歧义。
Stagger 处理:
typescript
const staggerMs = normalizeCronStaggerMs(schedule.staggerMs);
if (staggerMs !== undefined) next.staggerMs = staggerMs;
else delete next.staggerMs;
normalizeCronStaggerMs 将输入转为非负整数,无效值删除。
9.1.3 coercePayload() 逐行解析
Kind 规范化:
typescript
const kindRaw = normalizeLowercaseStringOrEmpty(next.kind);
if (kindRaw === "agentturn") next.kind = "agentTurn"; // 驼峰化
else if (kindRaw === "systemevent") next.kind = "systemEvent";
隐式 Kind 推断:
typescript
if (!next.kind) {
if (hasMessage) next.kind = "agentTurn"; // 有 message → agent 交互
else if (hasText) next.kind = "systemEvent"; // 有 text → 系统事件
else if (hasAgentTurnPayloadHint(next)) next.kind = "agentTurn"; // 有 model/fallbacks/thinking 等 → agentTurn
}
hasAgentTurnPayloadHint 检测仅包含 agentTurn 专属字段的 patch:
typescript
function hasAgentTurnPayloadHint(payload) {
return hasTrimmedStringValue(payload.model) ||
normalizeTrimmedStringArray(payload.fallbacks) !== undefined ||
normalizeTrimmedStringArray(payload.toolsAllow, { allowNull: true }) !== undefined ||
hasTrimmedStringValue(payload.thinking) ||
typeof payload.timeoutSeconds === "number" ||
typeof payload.lightContext === "boolean" ||
typeof payload.allowUnsafeExternalContent === "boolean";
}
Kind 清理:
typescript
if (next.kind === "systemEvent") {
delete next.message; delete next.model; delete next.fallbacks;
delete next.thinking; delete next.timeoutSeconds; delete next.lightContext;
delete next.allowUnsafeExternalContent; delete next.toolsAllow;
} else if (next.kind === "agentTurn") {
delete next.text; // agentTurn 不使用 text
}
旧格式字段清理:
typescript
delete next.deliver; // 旧交付标志
delete next.channel; // 旧渠道字段
delete next.to; // 旧目标字段
delete next.threadId; // 旧线程字段
delete next.bestEffortDeliver; // 旧 bestEffort
delete next.provider; // 旧 provider
这些字段在旧版 API 中位于 payload 顶层,现已迁移到 delivery 子对象中。
9.1.4 coerceDelivery() 逐行解析
typescript
function coerceDelivery(delivery: UnknownRecord) {
const parsed = parseDeliveryInput(delivery); // 使用 Zod schema 校验
// 逐字段赋值或删除
}
委托 delivery-field-schemas.ts 中的 Zod schema 进行类型安全的解析:
typescript
export const DeliveryModeFieldSchema = z
.preprocess(trimLowercaseStringPreprocess, z.enum(["deliver", "announce", "none", "webhook"]))
.transform(value => value === "deliver" ? "announce" : value);
deliver → announce 的别名转换在 schema 层完成。
9.1.5 applyDefaults 阶段
typescript
if (options.applyDefaults) {
if (!next.wakeMode) next.wakeMode = "now";
if (typeof next.enabled !== "boolean") next.enabled = true;
if (!next.name) next.name = inferLegacyName({...});
// sessionTarget 默认值
if (!next.sessionTarget) {
if (kind === "systemEvent") next.sessionTarget = "main";
else if (kind === "agentTurn") next.sessionTarget = "isolated";
}
// "current" → 实际 sessionKey 解析
if (next.sessionTarget === "current") {
next.sessionTarget = `session:${assertSafeCronSessionTargetId(sessionKey)}`;
// 无 sessionContext 时回退到 isolated
}
// at 类型默认 deleteAfterRun=true
if (schedule.kind === "at" && !next.deleteAfterRun) next.deleteAfterRun = true;
// cron 类型默认 stagger
if (schedule.kind === "cron") { ... resolveDefaultCronStaggerMs ... }
// isolated agentTurn 默认 delivery: { mode: "announce" }
if (!hasDelivery && isIsolatedAgentTurn && payloadKind === "agentTurn") {
next.delivery = { mode: "announce" };
}
}
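Key default-value policies:
- agentTurn defaults to sessionTarget="isolated": avoids long-term token accumulation.
- systemEvent defaults to sessionTarget="main": system events are injected into the main session.
- at schedules default to deleteAfterRun=true: one-shot jobs are cleaned up automatically after they run.
- isolated agentTurn defaults to announce delivery: the agent cannot send messages on its own.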
📊 建议配图:规范化流水线
节点:
[normalizeCronJobInput] → [unwrapJob] → [Agent ID/SessionKey/Enabled] → [coerceSchedule] → [inferPayload] → [coercePayload] → [coerceDelivery] → [copyTopLevelFields] → [stripLegacyFields] → [applyDefaults]
边:
- 每个步骤→下一步: "规范化后的中间结果"
- applyDefaults 内部子边:
- sessionTarget=current→session:xxx: "有 sessionContext"
- sessionTarget=current→isolated: "无 sessionContext"
- kind=at→deleteAfterRun=true: "一次性任务"
- cron→staggerMs: "isRecurringTopOfHourCronExpr → 5min"
- isolated agentTurn→delivery.announce: "默认交付"
标注:
- "coerceSchedule: kind 推断 → atMs 规范化 → 字段清理"
- "coercePayload: kind 推断 → 字段清理 → 旧字段删除"
- "coerceDelivery: Zod schema 解析"
9.2 store.ts --- 持久化与备份策略
文件路径 : store.ts
代码行数 : ~160 行
核心职责 : 管理 jobs.json 的读写,实现安全写入、备份、缓存和运行时字段剥离。
9.2.1 loadCronStore() --- 加载
typescript
export async function loadCronStore(storePath: string): Promise<CronStoreFile> {
const raw = await fs.promises.readFile(storePath, "utf-8");
const parsed = parseJsonWithJson5Fallback(raw); // 容忍 JSON5 格式
const store = { version: 1, jobs: jobs.filter(Boolean) };
serializedStoreCache.set(storePath, JSON.stringify(store, null, 2)); // 缓存序列化结果
return store;
}
JSON5 容错 :parseJsonWithJson5Fallback 先尝试标准 JSON 解析,失败后回退到 JSON5------容忍注释、尾逗号等。
缓存策略 :加载后立即缓存 JSON.stringify(store, null, 2) 的结果,用于后续 save 时的短路比较。
9.2.2 saveCronStore() --- 保存
typescript
export async function saveCronStore(storePath, store, opts?) {
const json = JSON.stringify(store, null, 2);
const cached = serializedStoreCache.get(storePath);
// 短路 1: 与缓存相同
if (cached === json) return;
// 短路 2: 与文件相同
let previous = cached ?? (await fs.promises.readFile(storePath, "utf-8").catch(() => null)); // readFile rejects when the file is missing; map that to null
if (previous === json) { serializedStoreCache.set(storePath, json); return; }
// 备份检查: 仅运行时字段变化时跳过备份
const skipBackup = opts?.skipBackup || shouldSkipCronBackupForRuntimeOnlyChanges(previous, store);
// 安全写入: tmp → rename
const tmp = `${storePath}.${process.pid}.${randomBytes(8).toString("hex")}.tmp`;
await fs.promises.writeFile(tmp, json, { encoding: "utf-8", mode: 0o600 });
if (previous !== null && !skipBackup) {
await fs.promises.copyFile(storePath, `${storePath}.bak`);
}
await renameWithRetry(tmp, storePath);
serializedStoreCache.set(storePath, json);
}
三层短路:
- 内存缓存比较 --- 避免文件 I/O
- 文件内容比较 --- 避免写入
- 运行时字段差异 --- 避免不必要的备份
9.2.3 运行时字段剥离
typescript
function stripRuntimeOnlyCronFields(store: CronStoreFile): unknown {
return {
version: store.version,
jobs: store.jobs.map(job => {
const { state: _state, updatedAtMs: _updatedAtMs, ...rest } = job;
return rest;
}),
};
}
function shouldSkipCronBackupForRuntimeOnlyChanges(previousRaw, nextStore): boolean {
const previous = parseCronStoreForBackupComparison(previousRaw);
return JSON.stringify(strip(previous)) === JSON.stringify(strip(nextStore));
}
state 和 updatedAtMs 是运行时动态更新的字段(每次 tick 都会更新 nextRunAtMs),剥离后比较------若只有运行时字段变化,不创建 .bak 备份文件,减少磁盘 I/O 和备份噪音。
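A self-contained sketch of the runtime-only comparison. The Job and Store shapes are simplified assumptions, and isRuntimeOnlyChange is a hypothetical name standing in for shouldSkipCronBackupForRuntimeOnlyChanges.

```typescript
// Simplified shapes for illustration only.
type Job = { id: string; state?: unknown; updatedAtMs?: number; [key: string]: unknown };
type Store = { version: number; jobs: Job[] };

function stripRuntimeOnly(store: Store): unknown {
  return {
    version: store.version,
    // Drop the two fields that change on every tick; keep everything else.
    jobs: store.jobs.map(({ state: _state, updatedAtMs: _updatedAtMs, ...rest }) => rest),
  };
}

function isRuntimeOnlyChange(previous: Store, next: Store): boolean {
  // String comparison is order-sensitive: both sides must serialize keys in the same order.
  return JSON.stringify(stripRuntimeOnly(previous)) === JSON.stringify(stripRuntimeOnly(next));
}
```

One caveat of comparing serialized strings is that a reordering of keys counts as a change, which errs on the side of creating a backup.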
9.2.4 renameWithRetry() --- 原子写入重试
typescript
async function renameWithRetry(src, dest): Promise<void> {
for (let attempt = 0; attempt <= RENAME_MAX_RETRIES; attempt++) {
try {
await fs.promises.rename(src, dest);
return;
} catch (err) {
if (code === "EBUSY" && attempt < RENAME_MAX_RETRIES) {
await setTimeout(RENAME_BASE_DELAY_MS * 2 ** attempt); // 指数退避
continue;
}
// Windows 兼容: rename 无法替换已存在文件时回退到 copyFile + unlink
if (code === "EPERM" || code === "EEXIST") {
await fs.promises.copyFile(src, dest);
await fs.promises.unlink(src).catch(() => {});
return;
}
throw err;
}
}
}
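Cross-platform atomic write:
- Linux/macOS: rename() replaces the file atomically.
- Windows: on EPERM/EEXIST, falls back to copyFile + unlink (not atomic, but reliable).
- EBUSY: retried with exponential backoff (for example, when antivirus software holds a lock on the file).
File permissions: all files are 0o600 (owner read/write only); directories are 0o700 (owner access only).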
📊 建议配图:store 持久化流
节点:
[saveCronStore] → [缓存比较] → [文件比较] → [备份检查] → [安全写入] → [renameWithRetry]
边:
- 缓存命中→exit: "无变化"
- 文件相同→exit: "无变化"
- 仅运行时变化→skipBackup: "不创建 .bak"
- rename 成功→exit: "完成"
- EBUSY→重试: "指数退避, 最多 3 次"
- EPERM/EEXIST→copyFile: "Windows 兼容"
标注:
- "0o600 文件权限, 0o700 目录权限"
- "tmp 命名: storePath.pid.randomHex.tmp"
- "JSON5 容错解析"
9.3 run-log.ts --- JSONL 日志与裁剪
文件路径 : run-log.ts
代码行数 : ~300 行
核心职责: 记录每次 cron 运行的结果到 JSONL 文件,支持分页查询和自动裁剪。
9.3.1 appendCronRunLog() --- 追加日志
typescript
export async function appendCronRunLog(filePath, entry, opts?) {
const resolved = path.resolve(filePath);
const prev = writesByPath.get(resolved) ?? Promise.resolve();
const next = prev.catch(() => undefined).then(async () => {
await fs.mkdir(runDir, { recursive: true, mode: 0o700 });
await fs.appendFile(resolved, `${JSON.stringify(entry)}\n`, { encoding: "utf-8", mode: 0o600 });
await pruneIfNeeded(resolved, { maxBytes, keepLines });
});
writesByPath.set(resolved, next);
try { await next; }
finally { if (writesByPath.get(resolved) === next) writesByPath.delete(resolved); }
}
串行化写入 :writesByPath Map 确保同一文件的写入严格串行------每次写入等待前一次完成后再开始。防止并发写入导致 JSONL 行交错。
CronRunLogEntry 结构:
typescript
type CronRunLogEntry = {
ts: number; // 时间戳
jobId: string; // Job ID
action: "finished"; // 动作类型(目前只有 finished)
status?: "ok" | "error" | "skipped";
error?: string;
summary?: string;
delivered?: boolean;
deliveryStatus?: "delivered" | "not-delivered" | "unknown" | "not-requested";
deliveryError?: string;
sessionId?: string;
sessionKey?: string;
runAtMs?: number;
durationMs?: number;
nextRunAtMs?: number;
model?: string;
provider?: string;
usage?: { input_tokens, output_tokens, total_tokens, cache_read_tokens, cache_write_tokens };
};
9.3.2 pruneIfNeeded() --- 自动裁剪
typescript
async function pruneIfNeeded(filePath, opts) {
const stat = await fs.stat(filePath);
if (stat.size <= opts.maxBytes) return; // 未超限
const raw = await fs.readFile(filePath, "utf-8");
const lines = raw.split("\n").map(l => l.trim()).filter(Boolean);
const kept = lines.slice(Math.max(0, lines.length - opts.keepLines)); // 保留最新 N 行
// 原子替换
const tmp = `${filePath}.${process.pid}.${randomBytes(8).toString("hex")}.tmp`;
await fs.writeFile(tmp, `${kept.join("\n")}\n`, { mode: 0o600 });
await fs.rename(tmp, filePath);
}
Pruning policy:
- Trigger: the file size exceeds maxBytes (default 2MB).
- Lines kept: the newest keepLines lines (default 2000).
- Configurable via cron.runLog.maxBytes and cron.runLog.keepLines.
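The line-trimming part of the pruning step is a pure string transformation and can be shown in isolation. keepNewestLines is a hypothetical name; the body mirrors the split/trim/filter/slice sequence from the excerpt, without the file I/O.

```typescript
function keepNewestLines(raw: string, keepLines: number): string {
  // Blank lines are dropped so they do not count toward the limit.
  const lines = raw
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean);
  // JSONL is append-only, so the newest entries are at the end.
  const kept = lines.slice(Math.max(0, lines.length - keepLines));
  return `${kept.join("\n")}\n`;
}
```

For example, trimming a four-line input that contains one blank line down to two lines keeps only the last two non-empty entries.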
9.3.3 readCronRunLogEntriesPage() --- 分页查询
typescript
export async function readCronRunLogEntriesPage(filePath, opts?): Promise<CronRunLogPageResult> {
const all = parseAllRunLogEntries(raw, { jobId }); // 解析全部条目
const filtered = filterRunLogEntries(all, { statuses, deliveryStatuses, query, queryTextForEntry });
const sorted = sortDir === "asc" ? asc : desc;
return { entries: sorted.slice(offset, offset + limit), total, hasMore, nextOffset };
}
查询能力:
- 按 jobId 过滤
- 按 status 过滤(ok/error/skipped)
- 按 deliveryStatus 过滤(delivered/not-delivered/unknown/not-requested)
- 全文搜索(匹配 summary + error + jobId + jobName)
- 升序/降序排序
- 分页(offset + limit,上限 200)
跨 Job 查询 :readCronRunLogEntriesPageAll() 扫描 runs/ 目录下所有 .jsonl 文件,合并后分页。
安全校验:
typescript
function assertSafeCronRunLogJobId(jobId: string): string {
if (trimmed.includes("/") || trimmed.includes("\\") || trimmed.includes("\0")) {
throw new Error("invalid cron run log job id");
}
const resolvedPath = path.resolve(runsDir, `${safeJobId}.jsonl`);
if (!resolvedPath.startsWith(`${runsDir}${path.sep}`)) {
throw new Error("invalid cron run log job id"); // 路径遍历防护
}
}
双重防护:禁止路径分隔符 + 解析后路径必须在 runs/ 目录内。
📊 建议配图:日志系统架构
节点:
[appendCronRunLog] → [串行化队列] → [mkdir + appendFile] → [pruneIfNeeded]
[readCronRunLogEntriesPage] → [drainPendingWrite] → [parseAllRunLogEntries] → [filter] → [sort] → [slice]
[readCronRunLogEntriesPageAll] → [readdir runs/] → [并行 parse] → [flat + filter + sort + slice]
边:
- 串行化队列→前次写入: "writesByPath Map"
- pruneIfNeeded→skip: "size ≤ maxBytes"
- pruneIfNeeded→trim: "保留最新 keepLines 行"
- drainPendingWrite→read: "确保写入完成后再读"
标注:
- "maxBytes=2MB, keepLines=2000"
- "文件权限: 0o600"
- "路径遍历防护: assertSafeCronRunLogJobId"
9.4 session-reaper.ts --- 会话收割器
文件路径 : session-reaper.ts
代码行数 : ~100 行
核心职责: 定期清理过期的 cron 运行会话,防止会话存储无限膨胀。
9.4.1 sweepCronRunSessions() --- 清理扫描
typescript
export async function sweepCronRunSessions(params): Promise<ReaperResult> {
const now = params.nowMs ?? Date.now();
const lastSweepAtMs = lastSweepAtMsByStore.get(storePath) ?? 0;
// 节流: 5 分钟内不重复扫描
if (!params.force && now - lastSweepAtMs < MIN_SWEEP_INTERVAL_MS) {
return { swept: false, pruned: 0 };
}
const retentionMs = resolveRetentionMs(params.cronConfig);
if (retentionMs === null) return { swept: false, pruned: 0 }; // 禁用
await updateSessionStore(storePath, (store) => {
const cutoff = now - retentionMs;
for (const key of Object.keys(store)) {
if (!isCronRunSessionKey(key)) continue; // 只清理 cron 运行 key
if (entry.updatedAt < cutoff) {
prunedSessions.set(entry.sessionId, entry.sessionFile);
delete store[key];
pruned++;
}
}
});
// 归档 transcript 文件
await archiveRemovedSessionTranscripts({ removedSessionFiles, referencedSessionIds, ... });
await cleanupArchivedSessionTranscripts({ directories, olderThanMs: retentionMs, ... });
}
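Design notes:
- Self-throttling: MIN_SWEEP_INTERVAL_MS=5min, throttled independently per store path through the lastSweepAtMsByStore Map.
- Selective cleanup: isCronRunSessionKey() matches only keys of the form ...:cron:<jobId>:run:<uuid>, so base sessions are kept.
- Transcript archiving: transcripts are first moved to the archive/ directory, then cleaned up a second time inside the archive according to the retention period.
- Reference check: referencedSessionIds ensures that a transcript still referenced by another session is not removed.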
锁序约束(注释特别强调):
此函数通过 updateSessionStore 获取会话存储文件锁。
必须在 cron service 的 locked() 段之外调用,避免锁序反转。
9.4.2 retention 配置
typescript
export function resolveRetentionMs(cronConfig?): number | null {
if (cronConfig?.sessionRetention === false) return null; // 显式禁用
const raw = cronConfig?.sessionRetention;
if (typeof raw === "string" && raw.trim()) {
return parseDurationMs(raw.trim(), { defaultUnit: "h" }); // 支持字符串如 "12h", "2d"
}
return DEFAULT_RETENTION_MS; // 24 小时
}
📊 建议配图:会话收割器
节点:
[sweepCronRunSessions] → [节流检查] → [retention 检查] → [updateSessionStore 扫描] → [archiveRemovedSessionTranscripts] → [cleanupArchivedSessionTranscripts]
边:
- 节流跳过→exit: "5min 内已扫描"
- 禁用→exit: "sessionRetention=false"
- 扫描→删除: "updatedAt < cutoff"
- 归档→清理: "olderThanMs=retentionMs"
标注:
- "MIN_SWEEP_INTERVAL_MS=5min"
- "DEFAULT_RETENTION_MS=24h"
- "只清理 isCronRunSessionKey 匹配的 key"
- "锁序: 必须在 locked() 段外调用"
9.5 stagger.ts --- 防惊群散列
文件路径 : stagger.ts
代码行数 : ~35 行
核心职责: 计算整点 cron 任务的随机散列延迟,防止多个任务在同一秒触发导致系统过载。
9.5.1 核心逻辑
typescript
export const DEFAULT_TOP_OF_HOUR_STAGGER_MS = 5 * 60 * 1000; // 5 分钟
export function isRecurringTopOfHourCronExpr(expr: string): boolean {
const fields = parseCronFields(expr);
if (fields.length === 5) {
return fields[0] === "0" && fields[1].includes("*"); // 0 * * * *
}
if (fields.length === 6) {
return fields[0] === "0" && fields[1] === "0" && fields[2].includes("*"); // 0 0 * * * *
}
return false;
}
export function resolveDefaultCronStaggerMs(expr: string): number | undefined {
return isRecurringTopOfHourCronExpr(expr) ? DEFAULT_TOP_OF_HOUR_STAGGER_MS : undefined;
}
export function resolveCronStaggerMs(schedule): number {
const explicit = normalizeCronStaggerMs(schedule.staggerMs);
if (explicit !== undefined) return explicit;
return resolveDefaultCronStaggerMs(cronExpr) ?? 0;
}
设计意图:
当多个 cron job 都配置为 0 * * * *(每整点)时,若无散列,所有 job 会在整点 0 秒同时触发,造成:
- API 限流:同时发起大量 LLM 请求
- 资源争抢:CPU/内存瞬时峰值
- 交付拥塞:消息队列积压
staggerMs 的值(5 分钟)作为上限 ,实际延迟在运行时由调度器随机分配 [0, staggerMs) 范围内的值。
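Enabled automatically only for recurring top-of-hour cron: expressions such as 0 * * * * or 0 0 * * * *, where minute=0 and the hour field contains a wildcard. A fixed-hour expression such as 0 9 * * * does not qualify, because its hour field contains no *. Other expressions (such as 30 * * * *) are likewise not staggered automatically, because they are already naturally spread out.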
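The predicate is easy to check against concrete expressions. The sketch below assumes parseCronFields is a plain whitespace split, which the article does not confirm; isTopOfHour and defaultStaggerMs are hypothetical names.

```typescript
const TOP_OF_HOUR_STAGGER_MS = 5 * 60 * 1000;

function isTopOfHour(expr: string): boolean {
  // Assumption: fields are separated by whitespace.
  const fields = expr.trim().split(/\s+/);
  if (fields.length === 5) {
    // minute hour day month weekday
    return fields[0] === "0" && fields[1].includes("*");
  }
  if (fields.length === 6) {
    // second minute hour day month weekday
    return fields[0] === "0" && fields[1] === "0" && fields[2].includes("*");
  }
  return false;
}

function defaultStaggerMs(expr: string): number {
  return isTopOfHour(expr) ? TOP_OF_HOUR_STAGGER_MS : 0;
}
```

Under this rule 0 */2 * * * also qualifies (the hour field contains *), while 0 9 * * * and 30 * * * * do not.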
📊 建议配图:散列策略
节点:
[resolveCronStaggerMs] → [explicit staggerMs?] → [isRecurringTopOfHourCronExpr?] → [返回值]
边:
- explicit→exit: "用户指定值"
- 整点 cron→5min: "DEFAULT_TOP_OF_HOUR_STAGGER_MS"
- 非整点→0: "无需散列"
标注:
- "整点判定: minute=0 && hour 含 *"
- "6 字段格式: second=0 && minute=0 && hour 含 *"
- "实际延迟: random(0, staggerMs)"
9.6 validate-timestamp.ts --- 时间戳校验
文件路径 : validate-timestamp.ts
代码行数 : ~40 行
核心职责 : 校验 schedule.at 时间戳的合法性,防止过去时间和过远未来时间。
9.6.1 validateScheduleTimestamp()
typescript
export function validateScheduleTimestamp(schedule, nowMs = Date.now()): TimestampValidationResult {
if (schedule.kind !== "at") return { ok: true }; // 仅校验 at 类型
const atMs = parseAbsoluteTimeMs(atRaw);
if (atMs === null || !Number.isFinite(atMs)) {
return { ok: false, message: `Invalid schedule.at: expected ISO-8601 timestamp` };
}
const diffMs = atMs - nowMs;
// 过去时间(1 分钟宽限)
if (diffMs < -ONE_MINUTE_MS) {
return { ok: false, message: `schedule.at is in the past: ${atDate} (${minutesAgo} minutes ago)` };
}
// 过远未来(10 年上限)
if (diffMs > TEN_YEARS_MS) {
return { ok: false, message: `schedule.at is too far in the future: ${atDate} (${yearsAhead} years ahead)` };
}
return { ok: true };
}
Two bounds:
- Past: a 1-minute grace period (ONE_MINUTE_MS = 60000) tolerates clock skew and scheduling delay.
- Future: a 10-year cap (TEN_YEARS_MS) guards against a mistyped year.
仅校验 at 类型 :every 和 cron 类型天然是循环的,无需时间戳校验。
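The two bounds can be exercised with a reduced version that takes an already-parsed timestamp. The exact definition of TEN_YEARS_MS is not shown in the article, so the value below (10 × 365.25 days) is an assumption, as is the name validateAtMs.

```typescript
const ONE_MINUTE_MS = 60_000;
// Assumed definition; the article only names the constant.
const TEN_YEARS_MS = 10 * 365.25 * 24 * 60 * 60 * 1000;

type TimestampValidation = { ok: true } | { ok: false; message: string };

function validateAtMs(atMs: number | null, nowMs: number): TimestampValidation {
  if (atMs === null || !Number.isFinite(atMs)) {
    return { ok: false, message: "Invalid schedule.at: expected ISO-8601 timestamp" };
  }
  const diffMs = atMs - nowMs;
  // Up to one minute in the past is tolerated.
  if (diffMs < -ONE_MINUTE_MS) {
    return { ok: false, message: "schedule.at is in the past" };
  }
  if (diffMs > TEN_YEARS_MS) {
    return { ok: false, message: "schedule.at is too far in the future" };
  }
  return { ok: true };
}
```

Taking nowMs as a parameter, as the original signature does, makes the boundary cases deterministic in tests.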
📊 建议配图:时间戳校验
节点:
[validateScheduleTimestamp] → [kind=at?] → [parseAbsoluteTimeMs] → [过去检查] → [未来检查]
边:
- kind≠at→exit(ok): "仅校验 at 类型"
- parse 失败→exit(error): "Invalid timestamp"
- diff < -1min→exit(error): "过去时间"
- diff > 10yr→exit(error): "过远未来"
- 合法→exit(ok): "通过"
标注:
- "1 分钟宽限: 时钟偏移容忍"
- "10 年上限: 误输入防护"
9.7 active-jobs.ts --- 内存活跃追踪
文件路径 : active-jobs.ts
代码行数 : ~30 行
核心职责: 在进程内存中追踪当前正在执行的 cron job,防止同一 job 被并发执行。
9.7.1 实现
typescript
const CRON_ACTIVE_JOB_STATE_KEY = Symbol.for("openclaw.cron.activeJobs");
function getCronActiveJobState(): CronActiveJobState {
return resolveGlobalSingleton<CronActiveJobState>(CRON_ACTIVE_JOB_STATE_KEY, () => ({
activeJobIds: new Set<string>(),
}));
}
export function markCronJobActive(jobId: string) { getCronActiveJobState().activeJobIds.add(jobId); }
export function clearCronJobActive(jobId: string) { getCronActiveJobState().activeJobIds.delete(jobId); }
export function isCronJobActive(jobId: string) { return getCronActiveJobState().activeJobIds.has(jobId); }
Symbol.for 全局单例:
使用 Symbol.for("openclaw.cron.activeJobs") 而非模块级变量------确保即使模块被多次实例化(如测试中的不同 import 路径),状态仍然是全局共享的。resolveGlobalSingleton 按 Symbol 键存储单例,首次调用时初始化,后续复用。
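Usage:
- On each tick the scheduler checks isCronJobActive(jobId); if the job is still running, this trigger is skipped.
- markCronJobActive is called when execution starts.
- clearCronJobActive is called when execution ends (whether it succeeded or failed).
- resetCronActiveJobsForTests is used for test cleanup.
Note: this is process-level tracking; it does not deduplicate across processes or machines. A distributed deployment would need an external lock service.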
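The check/mark/clear sequence is typically wrapped in try/finally so the flag is released on both success and failure. The sketch below uses a module-level Set instead of the Symbol.for singleton, and runExclusive is a hypothetical wrapper, not a function from active-jobs.ts.

```typescript
// Simplification: a plain module-level Set stands in for the global singleton.
const activeJobIds = new Set<string>();

async function runExclusive<T>(jobId: string, run: () => Promise<T>): Promise<T | "skipped"> {
  // The check and the mark happen synchronously, so two calls in the same tick cannot both pass.
  if (activeJobIds.has(jobId)) return "skipped";
  activeJobIds.add(jobId);
  try {
    return await run();
  } finally {
    // Always clear, whether the run resolved or threw.
    activeJobIds.delete(jobId);
  }
}
```

Because JavaScript runs the has/add pair without yielding, no lock is needed inside a single process.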
📊 建议配图:活跃追踪
节点:
[调度器 tick] → [isCronJobActive?] → [markCronJobActive] → [执行 job] → [clearCronJobActive]
边:
- active=true→skip: "跳过本次触发"
- active=false→mark→execute: "开始执行"
- 执行完成→clear: "无论成功/失败"
标注:
- "Symbol.for 全局单例: 跨模块实例共享"
- "进程级: 不支持分布式去重"
- "Set<string> activeJobIds"
9.8 补充模块
9.8.1 normalize-job-identity.ts
typescript
export function normalizeCronJobIdentityFields(raw): { mutated: boolean; legacyJobIdIssue: boolean } {
const rawId = normalizeOptionalString(raw.id) ?? "";
const legacyJobId = normalizeOptionalString(raw.jobId) ?? "";
const hadJobIdKey = "jobId" in raw;
const normalizedId = rawId || legacyJobId;
const idChanged = Boolean(normalizedId && raw.id !== normalizedId);
if (idChanged) raw.id = normalizedId;
if (hadJobIdKey) delete raw.jobId;
return { mutated: idChanged || hadJobIdKey, legacyJobIdIssue: hadJobIdKey };
}
向后兼容:jobId 旧字段迁移到 id,删除旧键。legacyJobIdIssue 标志用于向用户发出弃用警告。
9.8.2 webhook-url.ts
typescript
export function normalizeHttpWebhookUrl(value: unknown): string | null {
const trimmed = value.trim();
try {
const parsed = new URL(trimmed);
if (!isAllowedWebhookProtocol(parsed.protocol)) return null; // 只允许 http/https
return trimmed;
} catch { return null; }
}
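Simple but strict: only the http: and https: protocols are allowed, which rejects non-HTTP schemes such as javascript:, data:, and file:. The check covers the scheme only; as shown, it does not restrict which host the URL points at.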
9.8.3 parse.ts --- 绝对时间解析
typescript
export function parseAbsoluteTimeMs(input: string): number | null {
const raw = input.trim();
if (/^\d+$/.test(raw)) {
const n = Number(raw);
if (Number.isFinite(n) && n > 0) return Math.floor(n); // 纯数字 → 毫秒时间戳
}
const parsed = Date.parse(normalizeUtcIso(raw));
return Number.isFinite(parsed) ? parsed : null;
}
function normalizeUtcIso(raw: string) {
if (ISO_TZ_RE.test(raw)) return raw; // 已有时区 → 直接解析
if (ISO_DATE_RE.test(raw)) return `${raw}T00:00:00Z`; // 2024-01-01 → 午夜 UTC
if (ISO_DATE_TIME_RE.test(raw)) return `${raw}Z`; // 2024-01-01T12:00 → 追加 Z
return raw;
}
Three input formats:
- Digits only (1703980800000) → used directly as a millisecond timestamp
- ISO date-time (2024-01-01T12:00:00Z) → Date.parse
- ISO date (2024-01-01) → T00:00:00Z is appended
无时区字符串默认 UTC :追加 Z 后缀而非使用本地时区,保证跨时区一致性。
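A runnable version of the parser helps confirm the UTC default. The three regular expressions are not shown in the article, so the definitions below are assumptions chosen to match the described behavior; the function bodies follow the excerpt.

```typescript
// Assumed patterns; the article names these constants but does not define them.
const ISO_TZ_RE = /(Z|[+-]\d{2}:?\d{2})$/i;
const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}$/;
const ISO_DATE_TIME_RE = /^\d{4}-\d{2}-\d{2}T/i;

function normalizeUtcIso(raw: string): string {
  if (ISO_TZ_RE.test(raw)) return raw; // already carries a zone
  if (ISO_DATE_RE.test(raw)) return `${raw}T00:00:00Z`; // date only → midnight UTC
  if (ISO_DATE_TIME_RE.test(raw)) return `${raw}Z`; // zone-less date-time → UTC
  return raw;
}

function parseAbsoluteTimeMs(input: string): number | null {
  const raw = input.trim();
  if (/^\d+$/.test(raw)) {
    const n = Number(raw);
    if (Number.isFinite(n) && n > 0) return Math.floor(n); // digits → epoch milliseconds
  }
  const parsed = Date.parse(normalizeUtcIso(raw));
  return Number.isFinite(parsed) ? parsed : null;
}
```

The same instant written with an explicit offset (2024-01-01T08:00:00+08:00) and as a bare date (2024-01-01) parses to the same millisecond value.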
9.8.4 delivery-field-schemas.ts --- Zod Schema
typescript
export const DeliveryModeFieldSchema = z
.preprocess(trimLowercaseStringPreprocess, z.enum(["deliver", "announce", "none", "webhook"]))
.transform(value => value === "deliver" ? "announce" : value);
export const DeliveryThreadIdFieldSchema = z.union([
TrimmedNonEmptyStringFieldSchema,
z.number().finite(),
]);
export function parseDeliveryInput(input): ParsedDeliveryInput {
return {
mode: parseOptionalField(DeliveryModeFieldSchema, input.mode),
channel: parseOptionalField(LowercaseNonEmptyStringFieldSchema, input.channel),
to: parseOptionalField(TrimmedNonEmptyStringFieldSchema, input.to),
threadId: parseOptionalField(DeliveryThreadIdFieldSchema, input.threadId),
accountId: parseOptionalField(TrimmedNonEmptyStringFieldSchema, input.accountId),
};
}
每个字段独立解析,任一字段校验失败不影响其他字段------parseOptionalField 使用 safeParse,失败返回 undefined 而非抛出异常。
threadId 双类型:支持字符串和数字------Telegram 的 thread ID 是数字,某些渠道是字符串。
9.8.5 helpers.ts --- Payload 工具函数
已在 7.1.4 节详述 resolveCronPayloadOutcome,此处补充其他工具函数:
pickSummaryFromOutput():
typescript
export function pickSummaryFromOutput(text: string | undefined) {
const clean = (text ?? "").trim();
if (!clean) return undefined;
const limit = 2000;
return clean.length > limit ? `${truncateUtf16Safe(clean, limit)}...` : clean;
}
截断上限 2000 字符,使用 truncateUtf16Safe 确保 UTF-16 代理对不被切断。
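The body of truncateUtf16Safe is not shown in the article. One possible implementation, given here purely as an assumption, backs off by one code unit when the cut would land between the two halves of a surrogate pair; pickSummary is a hypothetical variant of pickSummaryFromOutput with the limit exposed as a parameter.

```typescript
// Assumed implementation; the real helper's body is not shown.
function truncateUtf16Safe(text: string, limit: number): string {
  if (text.length <= limit) return text;
  let end = limit;
  const last = text.charCodeAt(end - 1);
  // A high surrogate (0xD800–0xDBFF) at the cut point would be left without its pair.
  if (end > 0 && last >= 0xd800 && last <= 0xdbff) end -= 1;
  return text.slice(0, end);
}

function pickSummary(text: string | undefined, limit = 2000): string | undefined {
  const clean = (text ?? "").trim();
  if (!clean) return undefined;
  return clean.length > limit ? `${truncateUtf16Safe(clean, limit)}...` : clean;
}
```

Without the surrogate check, slicing "ab😀cd" at 3 code units would leave a lone high surrogate that renders as a replacement character.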
pickDeliverablePayloads():
typescript
export function pickDeliverablePayloads(payloads): DeliveryPayload[] {
const successful = payloads.filter(p => p != null && p.isError !== true && isDeliverablePayload(p));
if (successful.length > 0) return successful;
const last = pickLastDeliverablePayload(payloads); // 无成功 payload 时回退到最后一个可交付的
return last ? [last] : [];
}
优先返回所有成功的可交付 payload,若无则回退到包含错误的最后一个------确保至少有内容交付。
isDeliverablePayload():
typescript
function isDeliverablePayload(payload): boolean {
const hasInteractive = (payload.interactive?.blocks?.length ?? 0) > 0;
const hasChannelData = Object.keys(payload.channelData ?? {}).length > 0;
return hasOutboundReplyContent(payload, { trimText: true }) || hasInteractive || hasChannelData;
}
三维度判定:文本内容 / 交互式卡片 / 渠道特定数据。
9.8.6 run-config.ts --- 运行配置构建
typescript
export function buildCronAgentDefaultsConfig(params) {
const { overrideModel, definedOverrides } = extractCronAgentDefaultsOverride(params.agentConfigOverride);
return mergeCronAgentModelOverride({
defaults: Object.assign({}, params.defaults, definedOverrides),
overrideModel,
});
}
sandbox 排除:
typescript
function extractCronAgentDefaultsOverride(agentConfigOverride?) {
const { model: overrideModel, sandbox: _agentSandboxOverride, ...agentOverrideRest } = agentConfigOverride ?? {};
return { overrideModel, definedOverrides: ...agentOverrideRest... };
}
sandbox 被解构后丢弃(_agentSandboxOverride 前缀 _ 表示未使用),因为 sandbox 解析已在独立路径中处理,不需要在 defaults 层合并。
9.8.7 session-key.ts --- 会话 Key 生成
typescript
export function resolveCronAgentSessionKey(params): string {
const raw = toAgentStoreSessionKey({
agentId: params.agentId,
requestKey: params.sessionKey.trim(),
mainKey: params.mainKey,
});
return canonicalizeMainSessionAlias({ cfg: params.cfg, agentId: params.agentId, sessionKey: raw });
}
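Two-step conversion:
- toAgentStoreSessionKey: converts the request key to the store format (adds the agent prefix).
- canonicalizeMainSessionAlias: maps agent:<id>:main to the configured mainKey alias.
This fixes issue #29683: when cfg.session.mainKey is not "main", cron sessions were orphaned on the read path.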
9.8.8 job-fixtures.ts --- 测试固件
typescript
export function makeIsolatedAgentJobFixture(overrides?) {
return {
id: "test-job",
name: "Test Job",
schedule: { kind: "cron", expr: "0 9 * * *", tz: "UTC" },
sessionTarget: "isolated",
payload: { kind: "agentTurn", message: "test" },
...overrides,
} as never;
}
export function makeIsolatedAgentParamsFixture(overrides?) {
return {
cfg: {},
deps: {} as never,
job: makeIsolatedAgentJobFixture(jobOverrides),
message: "test",
sessionKey: "cron:test",
...overrides,
};
}
as never 类型断言绕过 TypeScript 的严格类型检查,允许测试中只覆盖必要字段。
总结:模块间关系全景图
                       ┌─────────────────────────────┐
                       │    cron service (timer)     │
                       └──────────────┬──────────────┘
                                      │ trigger
                       ┌──────────────▼──────────────┐
                       │  runCronIsolatedAgentTurn   │ ← run.ts
                       │ (prepare→execute→finalize)  │
                       └──────────────┬──────────────┘
                                      │
           ┌──────────────────────────┼──────────────────────────┐
           │                          │                          │
┌──────────▼─────────┐     ┌──────────▼─────────┐     ┌──────────▼─────────┐
│  model-selection   │     │    run-executor    │     │ delivery-dispatch  │
│ (5-level priority) │     │ (execute + retry)  │     │ (delivery routing) │
└────────────────────┘     └──────────┬─────────┘     └──────────┬─────────┘
                                      │                          │
                           ┌──────────▼─────────┐     ┌──────────▼─────────┐
                           │ subagent-followup  │     │  delivery-target   │
                           │  (subagent wait)   │     │ (target resolution)│
                           └────────────────────┘     └────────────────────┘
Auxiliary layer:
┌────────────────┐ ┌───────────────┐ ┌───────────┐ ┌────────────────┐ ┌─────────────┐ ┌───────────────┐
│   normalize    │ │     store     │ │  run-log  │ │ session-reaper │ │   stagger   │ │  active-jobs  │
│ (input checks) │ │ (persistence) │ │  (JSONL)  │ │   (cleanup)    │ │ (anti-herd) │ │ (concurrency) │
└────────────────┘ └───────────────┘ └───────────┘ └────────────────┘ └─────────────┘ └───────────────┘
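Data flow:
- User input → normalize canonicalizes it → store persists it
- Timer fires → active-jobs checks for a concurrent run → run.ts prepares the context
- Prepare phase → model-selection picks the model → session resolves the session → skills-snapshot builds the snapshot
- Execute phase → run-executor runs the agent → subagent-followup waits for subagents
- Finalize phase → delivery-plan builds the plan → delivery-target resolves the target → delivery-dispatch delivers
- Recording → run-log appends the log entry → session-reaper cleans up periodically
Core design patterns:
- Lazy-loaded runtimes: address circular dependencies and startup performance
- Closure factory: createCronPromptExecutor encapsulates mutable state
- Idempotency cache: COMPLETED_DIRECT_CRON_DELIVERIES prevents duplicate delivery
- Serialized writes: writesByPath queues writes per file so that JSONL lines do not interleave
- Global singleton: Symbol.for + resolveGlobalSingleton shares state across module instances
- Three-level short-circuit: in-memory cache → file comparison → actual write
- Fail fast: errors in the prepare phase return immediately without spending later resources
- Graceful degradation: an unavailable model is downgraded rather than failing hard, and bestEffort tolerates delivery failures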