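Chapter 9 The Tool-Call Loop: Closing the Agent's Action Loop
Introduction
Picture a chef at work: he reads the order (the user's request), preps the ingredients (the context), starts cooking (calls tools), tastes the dish (gets the results), and adjusts the seasoning (continues the conversation) until the dish is done. The cycle repeats until the chef is satisfied or the guest asks him to stop.
In Claude Code, the query() function is that chef, and the tool-call loop is the core of the cooking process. This chapter dissects the design and implementation of this state machine. You will see:
- How a while(true) loop implements the state machine
- How the State object manages state across turns
- How StreamingToolExecutor executes tools as they stream in
- How concurrency control balances performance and correctness
- How Stop Hooks act as a per-turn guard
- How context compaction prevents token overflow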
Concepts
The Essence of a State Machine
At its core, query() is a state machine: it keeps looping until a termination condition is met:
typescript
while (true) {
// Handle the current state
// Decide the next action
// Update the state
// Produce output
}
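Benefits of this design:
- Clarity: all state-transition logic lives in one place
- Extensibility: new states and transitions are easy to add
- Testability: each state and transition can be tested in isolation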
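To make the skeleton concrete, here is a minimal, self-contained sketch of the same pattern. LoopState, Terminal and runLoop are hypothetical names invented for this illustration; the real State type and termination reasons are richer than what is shown.

```typescript
// Hypothetical, heavily reduced state; not the real State type.
type LoopState = { turnCount: number; pending: string[] };
type Terminal = { reason: 'completed' | 'max_turns' };

function runLoop(
  initial: LoopState,
  maxTurns: number,
): { log: string[]; terminal: Terminal } {
  const log: string[] = [];
  let state = initial;
  while (true) {
    // Decide the next action: stop when there is nothing left to do...
    if (state.pending.length === 0) {
      return { log, terminal: { reason: 'completed' } };
    }
    // ...or when the turn limit has been exceeded.
    if (state.turnCount > maxTurns) {
      return { log, terminal: { reason: 'max_turns' } };
    }
    // Handle the current state and produce output.
    const [current, ...rest] = state.pending;
    log.push(`turn ${state.turnCount}: ${current}`);
    // Update the state by replacing the object rather than mutating it.
    state = { turnCount: state.turnCount + 1, pending: rest };
  }
}
```

Because every exit is a return from inside the loop, all termination conditions can be read in one place, which is the clarity benefit listed above.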
Design of the State Object
The State object encapsulates everything that must survive from one turn to the next:
typescript
type State = {
messages: Message[]
toolUseContext: ToolUseContext
autoCompactTracking: AutoCompactTrackingState | undefined
maxOutputTokensRecoveryCount: number
hasAttemptedReactiveCompact: boolean
maxOutputTokensOverride: number | undefined
pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
stopHookActive: boolean | undefined
turnCount: number
transition: Continue | undefined
}
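Each field has a clear responsibility:
- messages: the conversation history
- toolUseContext: the context passed to tool calls
- autoCompactTracking: tracking data for automatic compaction
- maxOutputTokensRecoveryCount: how many times recovery from a max-output-tokens error has been attempted
- hasAttemptedReactiveCompact: whether reactive compaction has already been tried
- maxOutputTokensOverride: an override for the maximum output token limit
- pendingToolUseSummary: a tool-use summary that is still being produced
- stopHookActive: whether a stop hook is currently active
- turnCount: the turn counter
- transition: the reason for the most recent transition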
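Streaming Tool Execution
StreamingToolExecutor implements streaming tool execution:
- A tool starts executing as soon as it arrives, if conditions allow
- Concurrency-safe tools can run in parallel
- Results are emitted in the order the tools arrived
- Progress messages are emitted in real time
This works like an assembly line: raw materials keep coming in, finished products keep going out, and the processing steps in between can run in parallel.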
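How can results stay in arrival order when tools finish out of order? The excerpts in this chapter do not show that part, so the following is only one plausible approach, not the executor's actual code: buffer each tool's results and emit only the completed prefix of the queue. TrackedToolSketch and drainInOrder are hypothetical names, and the real executor may relax the rule for some tools.

```typescript
type ToolStatus = 'queued' | 'executing' | 'completed' | 'yielded';

// Hypothetical, reduced version of a tracked tool entry.
interface TrackedToolSketch {
  id: string;
  status: ToolStatus;
  results: string[];
}

// Emit buffered results strictly in arrival order.
function drainInOrder(tools: TrackedToolSketch[]): string[] {
  const out: string[] = [];
  for (const tool of tools) {
    if (tool.status === 'yielded') continue; // already emitted earlier
    if (tool.status !== 'completed') break; // an earlier tool is still pending: stop here
    out.push(...tool.results);
    tool.status = 'yielded';
  }
  return out;
}
```

With tools a (completed), b (executing) and c (completed), the first drain returns only a's results. Once b completes, the next drain returns b's results followed by c's, so the consumer never sees c before b.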
Source Code Analysis
The query Function Entry Point
Let's start at the entry point of query():
typescript
export async function* query(
params: QueryParams,
): AsyncGenerator<
| StreamEvent
| RequestStartEvent
| Message
| TombstoneMessage
| ToolUseSummaryMessage,
Terminal
> {
const consumedCommandUuids: string[] = []
const terminal = yield* queryLoop(params, consumedCommandUuids)
for (const uuid of consumedCommandUuids) {
notifyCommandLifecycle(uuid, 'completed')
}
return terminal
}
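Design points:
- Delegation: query() hands the core logic to queryLoop()
- Command lifecycle: once the loop returns, every consumed command UUID is reported as 'completed'
- Return type: the AsyncGenerator yields events and messages of several types incrementally
- Termination signal: the generator returns a Terminal object to indicate that the query has ended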
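The delegation relies on a property of yield* that is easy to miss: it forwards every value the inner generator yields, and the yield* expression itself evaluates to the inner generator's return value. The sketch below uses synchronous generators to stay short; async generators follow the same rule. innerLoop, outer and the event strings are hypothetical stand-ins.

```typescript
type Terminal = { reason: string };

// Inner generator: yields events and finally returns a Terminal value.
function* innerLoop(consumed: string[]): Generator<string, Terminal> {
  yield 'request_start';
  consumed.push('cmd-1'); // side channel filled in while the loop runs
  yield 'assistant_message';
  return { reason: 'completed' };
}

// Outer generator: forwards every event, then runs cleanup after the inner loop returns.
function* outer(notified: string[]): Generator<string, Terminal> {
  const consumed: string[] = [];
  const terminal = yield* innerLoop(consumed); // receives the inner return value
  for (const id of consumed) {
    notified.push(id); // stands in for notifying command completion
  }
  return terminal;
}
```

A caller that drives outer() to completion sees both events in order, after which notified contains 'cmd-1' and the final result carries the Terminal value.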
The queryLoop Function
queryLoop() is the core state-machine loop:
typescript
async function* queryLoop(
params: QueryParams,
consumedCommandUuids: string[],
): AsyncGenerator<
| StreamEvent
| RequestStartEvent
| Message
| TombstoneMessage
| ToolUseSummaryMessage,
Terminal
> {
const {
systemPrompt,
userContext,
systemContext,
canUseTool,
fallbackModel,
querySource,
maxTurns,
skipCacheWrite,
} = params
const deps = params.deps ?? productionDeps()
let state: State = {
messages: params.messages,
toolUseContext: params.toolUseContext,
maxOutputTokensOverride: params.maxOutputTokensOverride,
autoCompactTracking: undefined,
stopHookActive: undefined,
maxOutputTokensRecoveryCount: 0,
hasAttemptedReactiveCompact: false,
turnCount: 1,
pendingToolUseSummary: undefined,
transition: undefined,
}
const budgetTracker = feature('TOKEN_BUDGET') ? createBudgetTracker() : null
let taskBudgetRemaining: number | undefined = undefined
const config = buildQueryConfig()
using pendingMemoryPrefetch = startRelevantMemoryPrefetch(
state.messages,
state.toolUseContext,
)
while (true) {
// Loop body...
}
}
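Initialization:
- Parameter destructuring: pulls out the parameters that stay fixed for the whole loop
- Dependency injection: deps supplies all dependencies, falling back to productionDeps() when params.deps is not set
- State initialization: builds the initial State object
- Budget tracking: an optional token budget tracker, created only when the TOKEN_BUDGET feature is enabled
- Memory prefetch: declared with the using syntax so that the resource is released when the scope exits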
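The params.deps ?? productionDeps() line is what makes the loop testable: a test can pass deterministic fakes, while production code passes nothing. A minimal sketch of the idea follows; QueryDepsSketch, productionDepsSketch and startQuery are hypothetical names, and the real deps object has more members than the two shown.

```typescript
// Hypothetical, reduced dependency bundle.
interface QueryDepsSketch {
  uuid: () => string;
  now: () => number;
}

function productionDepsSketch(): QueryDepsSketch {
  let counter = 0;
  return {
    // Stand-in ID source; a real implementation would use a proper UUID generator.
    uuid: () => `id-${Date.now()}-${counter++}`,
    now: () => Date.now(),
  };
}

function startQuery(params: { deps?: QueryDepsSketch }): {
  chainId: string;
  startedAt: number;
} {
  // Use injected dependencies when provided, production defaults otherwise.
  const deps = params.deps ?? productionDepsSketch();
  return { chainId: deps.uuid(), startedAt: deps.now() };
}
```

In a test, startQuery({ deps: { uuid: () => 'fixed', now: () => 42 } }) produces fully predictable output without touching the clock.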
Structure of the Main Loop
The main loop is structured as follows:
typescript
while (true) {
let { toolUseContext } = state
const {
messages,
autoCompactTracking,
maxOutputTokensRecoveryCount,
hasAttemptedReactiveCompact,
maxOutputTokensOverride,
pendingToolUseSummary,
stopHookActive,
turnCount,
} = state
const pendingSkillPrefetch = skillPrefetch?.startSkillDiscoveryPrefetch(
null,
messages,
toolUseContext,
)
yield { type: 'stream_request_start' }
queryCheckpoint('query_fn_entry')
if (!toolUseContext.agentId) {
headlessProfilerCheckpoint('query_started')
}
const queryTracking = toolUseContext.queryTracking
? {
chainId: toolUseContext.queryTracking.chainId,
depth: toolUseContext.queryTracking.depth + 1,
}
: {
chainId: deps.uuid(),
depth: 0,
}
toolUseContext = {
...toolUseContext,
queryTracking,
}
let messagesForQuery = [...getMessagesAfterCompactBoundary(messages)]
// ... more processing logic
}
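Key steps in each iteration:
- State destructuring: unpacks the State object to read the current state
- Skill prefetch: kicks off the skill-discovery prefetch
- Request-start event: yields a stream_request_start event
- Performance checkpoints: records profiling checkpoints
- Query tracking: initializes or updates the query tracking data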
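The query-tracking step is small but worth isolating: a query with no existing tracking starts a new chain at depth 0, while a nested query keeps its parent's chainId and goes one level deeper. The helper below restates that ternary as a standalone function; nextQueryTracking is a hypothetical name, not a function from the source.

```typescript
interface QueryTracking {
  chainId: string;
  depth: number;
}

function nextQueryTracking(
  current: QueryTracking | undefined,
  newId: () => string,
): QueryTracking {
  return current
    ? { chainId: current.chainId, depth: current.depth + 1 } // nested: same chain, one level deeper
    : { chainId: newId(), depth: 0 }; // root: new chain
}
```

Calling it with undefined produces a root entry, and feeding that entry back in produces the same chainId with depth 1, which lets analytics group all nested queries under one chain.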
Message Preparation and Compaction
Before the API is called, the messages go through several processing steps:
typescript
let messagesForQuery = [...getMessagesAfterCompactBoundary(messages)]
let tracking = autoCompactTracking
// Apply the tool-result budget
messagesForQuery = await applyToolResultBudget(
messagesForQuery,
toolUseContext.contentReplacementState,
persistReplacements
? records =>
void recordContentReplacement(
records,
toolUseContext.agentId,
).catch(logError)
: undefined,
new Set(
toolUseContext.options.tools
.filter(t => !Number.isFinite(t.maxResultSizeChars))
.map(t => t.name),
),
)
// Apply snip compaction
let snipTokensFreed = 0
if (feature('HISTORY_SNIP')) {
queryCheckpoint('query_snip_start')
const snipResult = snipModule!.snipCompactIfNeeded(messagesForQuery)
messagesForQuery = snipResult.messages
snipTokensFreed = snipResult.tokensFreed
if (snipResult.boundaryMessage) {
yield snipResult.boundaryMessage
}
queryCheckpoint('query_snip_end')
}
// Apply microcompaction
queryCheckpoint('query_microcompact_start')
const microcompactResult = await deps.microcompact(
messagesForQuery,
toolUseContext,
querySource,
)
messagesForQuery = microcompactResult.messages
const pendingCacheEdits = feature('CACHED_MICROCOMPACT')
? microcompactResult.compactionInfo?.pendingCacheEdits
: undefined
queryCheckpoint('query_microcompact_end')
// Apply context collapse
if (feature('CONTEXT_COLLAPSE') && contextCollapse) {
const collapseResult = await contextCollapse.applyCollapsesIfNeeded(
messagesForQuery,
toolUseContext,
querySource,
)
messagesForQuery = collapseResult.messages
}
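Compaction layers, in the order they run:
- Tool-result budget: limits the size of individual tool results
- Snip compaction: removes unimportant messages
- Microcompaction: compacts repeated file content
- Context collapse: collapses similar messages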
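Structurally, the layers form a pipeline in which each step receives the previous step's output and reassigns messagesForQuery. The sketch below shows only that shape. The two stages, capResultSize and keepRecent, are deliberately simple placeholders invented for this example and do not reproduce the real compaction algorithms.

```typescript
interface Msg {
  role: 'user' | 'assistant';
  text: string;
}
type Stage = (messages: Msg[]) => Msg[];

// Placeholder stage: cap the size of any single message.
const capResultSize = (maxChars: number): Stage => messages =>
  messages.map(m =>
    m.text.length > maxChars
      ? { ...m, text: m.text.slice(0, maxChars) + '[truncated]' }
      : m,
  );

// Placeholder stage: keep only the most recent n messages (n >= 1).
const keepRecent = (n: number): Stage => messages => messages.slice(-n);

// Run the stages left to right, feeding each one the previous output.
function runStages(messages: Msg[], stages: Stage[]): Msg[] {
  return stages.reduce((acc, stage) => stage(acc), messages);
}
```

Every stage returns a new array instead of editing its input, so the original history stays intact while the request works on a reduced copy.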
Automatic Compaction
Automatic compaction is the key mechanism for preventing token overflow:
typescript
const fullSystemPrompt = asSystemPrompt(
appendSystemContext(systemPrompt, systemContext),
)
queryCheckpoint('query_autocompact_start')
const { compactionResult, consecutiveFailures } = await deps.autocompact(
messagesForQuery,
toolUseContext,
{
systemPrompt,
userContext,
systemContext,
toolUseContext,
forkContextMessages: messagesForQuery,
},
querySource,
tracking,
snipTokensFreed,
)
queryCheckpoint('query_autocompact_end')
if (compactionResult) {
const {
preCompactTokenCount,
postCompactTokenCount,
truePostCompactTokenCount,
compactionUsage,
} = compactionResult
logEvent('tengu_auto_compact_succeeded', {
originalMessageCount: messages.length,
compactedMessageCount:
compactionResult.summaryMessages.length +
compactionResult.attachments.length +
compactionResult.hookResults.length,
preCompactTokenCount,
postCompactTokenCount,
truePostCompactTokenCount,
compactionInputTokens: compactionUsage?.input_tokens,
compactionOutputTokens: compactionUsage?.output_tokens,
compactionCacheReadTokens:
compactionUsage?.cache_read_input_tokens ?? 0,
compactionCacheCreationTokens:
compactionUsage?.cache_creation_input_tokens ?? 0,
compactionTotalTokens: compactionUsage
? compactionUsage.input_tokens +
(compactionUsage.cache_creation_input_tokens ?? 0) +
(compactionUsage.cache_read_input_tokens ?? 0) +
compactionUsage.output_tokens
: 0,
queryChainId: queryChainIdForAnalytics,
queryDepth: queryTracking.depth,
})
if (params.taskBudget) {
const preCompactContext =
finalContextTokensFromLastResponse(messagesForQuery)
taskBudgetRemaining = Math.max(
0,
(taskBudgetRemaining ?? params.taskBudget.total) - preCompactContext,
)
}
tracking = {
compacted: true,
turnId: deps.uuid(),
turnCounter: 0,
consecutiveFailures: 0,
}
const postCompactMessages = buildPostCompactMessages(compactionResult)
for (const message of postCompactMessages) {
yield message
}
messagesForQuery = postCompactMessages
} else if (consecutiveFailures !== undefined) {
tracking = {
...(tracking ?? { compacted: false, turnId: '', turnCounter: 0 }),
consecutiveFailures,
}
}
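Key logic:
- Trigger: compaction runs when the messages exceed a threshold
- Statistics: the token counts before and after compaction are logged
- Task budget: the pre-compaction context size is subtracted from the remaining task budget, clamped at zero
- Tracking update: the compaction tracking state is reset
- Message replacement: the compacted messages replace the original ones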
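The task-budget update is plain arithmetic, and a worked example makes the clamping visible. The function below restates the expression from the excerpt as a standalone helper. The name remainingAfterCompaction and the sample numbers are illustrative only.

```typescript
// remaining is undefined until the first compaction, in which case the total is used.
function remainingAfterCompaction(
  total: number,
  remaining: number | undefined,
  preCompactContextTokens: number,
): number {
  return Math.max(0, (remaining ?? total) - preCompactContextTokens);
}
```

With a total of 100000 tokens, a first compaction at 30000 context tokens leaves 70000. A second at 50000 leaves 20000. A third at 50000 would go negative, so it is clamped to 0.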
Initializing the Tool Executor
Before the API call, the tool executor is initialized:
typescript
const assistantMessages: AssistantMessage[] = []
const toolResults: (UserMessage | AttachmentMessage)[] = []
const toolUseBlocks: ToolUseBlock[] = []
let needsFollowUp = false
queryCheckpoint('query_setup_start')
const useStreamingToolExecution = config.gates.streamingToolExecution
let streamingToolExecutor = useStreamingToolExecution
? new StreamingToolExecutor(
toolUseContext.options.tools,
canUseTool,
toolUseContext,
)
: null
const appState = toolUseContext.getAppState()
const permissionMode = appState.toolPermissionContext.mode
let currentModel = getRuntimeMainLoopModel({
permissionMode,
mainLoopModel: toolUseContext.options.mainLoopModel,
exceeds200kTokens:
permissionMode === 'plan' &&
doesMostRecentAssistantMessageExceed200k(messagesForQuery),
})
queryCheckpoint('query_setup_end')
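What gets initialized:
- Message containers: assistantMessages, toolResults, and toolUseBlocks
- Tool executor: a streaming tool executor is created if the configuration enables it
- Model selection: the model is chosen based on the permission mode and the token count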
The API Call Loop
The API call sits inside a nested while loop:
typescript
let attemptWithFallback = true
queryCheckpoint('query_api_loop_start')
try {
while (attemptWithFallback) {
attemptWithFallback = false
try {
let streamingFallbackOccured = false
queryCheckpoint('query_api_streaming_start')
for await (const message of deps.callModel({
messages: prependUserContext(messagesForQuery, userContext),
systemPrompt: fullSystemPrompt,
thinkingConfig: toolUseContext.options.thinkingConfig,
tools: toolUseContext.options.tools,
signal: toolUseContext.abortController.signal,
options: {
async getToolPermissionContext() {
const appState = toolUseContext.getAppState()
return appState.toolPermissionContext
},
model: currentModel,
...(config.gates.fastModeEnabled && {
fastMode: appState.fastMode,
}),
toolChoice: undefined,
isNonInteractiveSession:
toolUseContext.options.isNonInteractiveSession,
fallbackModel,
onStreamingFallback: () => {
streamingFallbackOccured = true
},
querySource,
agents: toolUseContext.options.agentDefinitions.activeAgents,
allowedAgentTypes:
toolUseContext.options.agentDefinitions.allowedAgentTypes,
hasAppendSystemPrompt:
!!toolUseContext.options.appendSystemPrompt,
maxOutputTokensOverride,
fetchOverride: dumpPromptsFetch,
mcpTools: appState.mcp.tools,
hasPendingMcpServers: appState.mcp.clients.some(
c => c.type === 'pending',
),
queryTracking,
effortValue: appState.effortValue,
advisorModel: appState.advisorModel,
skipCacheWrite,
agentId: toolUseContext.agentId,
addNotification: toolUseContext.addNotification,
...(params.taskBudget && {
taskBudget: {
total: params.taskBudget.total,
...(taskBudgetRemaining !== undefined && {
remaining: taskBudgetRemaining,
}),
},
}),
},
})) {
// Handle streamed messages...
}
} catch (innerError) {
if (innerError instanceof FallbackTriggeredError && fallbackModel) {
// Handle model fallback...
}
}
}
} catch (error) {
// Handle errors...
}
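Purpose of the nesting:
- Outer loop (while): retries the call after a model fallback
- Inner loop (for await): consumes the streaming response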
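The excerpt elides the body of the fallback handler, so the following is only a sketch of how the attemptWithFallback flag can drive a retry, not a reconstruction of the real handler. fakeCallModel and callWithFallback are hypothetical, the FallbackTriggeredError class is redefined locally for the example, and a synchronous generator stands in for the streaming call.

```typescript
class FallbackTriggeredError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof reliable when compiling to older targets.
    Object.setPrototypeOf(this, FallbackTriggeredError.prototype);
  }
}

// Hypothetical stand-in for a streaming model call.
function* fakeCallModel(model: string): Generator<string> {
  if (model === 'primary') {
    throw new FallbackTriggeredError('primary unavailable');
  }
  yield `${model}: hello`;
  yield `${model}: done`;
}

function callWithFallback(primary: string, fallback: string | undefined): string[] {
  const received: string[] = [];
  let currentModel = primary;
  let attemptWithFallback = true;
  while (attemptWithFallback) {
    attemptWithFallback = false; // by default, run exactly once
    try {
      for (const chunk of fakeCallModel(currentModel)) {
        received.push(chunk);
      }
    } catch (err) {
      if (err instanceof FallbackTriggeredError && fallback) {
        received.length = 0; // drop partial output from the failed attempt
        currentModel = fallback;
        attemptWithFallback = true; // go around the outer loop once more
        continue;
      }
      throw err; // anything else propagates to the caller
    }
  }
  return received;
}
```

Resetting the flag at the top of each iteration means the loop runs once by default and repeats only when the catch block explicitly asks for another attempt.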
Streaming Message Handling
Streaming message handling is the heart of the whole loop:
typescript
for await (const message of deps.callModel({...})) {
if (streamingFallbackOccured) {
for (const msg of assistantMessages) {
yield { type: 'tombstone' as const, message: msg }
}
logEvent('tengu_orphaned_messages_tombstoned', {
orphanedMessageCount: assistantMessages.length,
queryChainId: queryChainIdForAnalytics,
queryDepth: queryTracking.depth,
})
assistantMessages.length = 0
toolResults.length = 0
toolUseBlocks.length = 0
needsFollowUp = false
if (streamingToolExecutor) {
streamingToolExecutor.discard()
streamingToolExecutor = new StreamingToolExecutor(
toolUseContext.options.tools,
canUseTool,
toolUseContext,
)
}
}
let yieldMessage: typeof message = message
if (message.type === 'assistant') {
let clonedContent: typeof message.message.content | undefined
for (let i = 0; i < message.message.content.length; i++) {
const block = message.message.content[i]!
if (
block.type === 'tool_use' &&
typeof block.input === 'object' &&
block.input !== null
) {
const tool = findToolByName(
toolUseContext.options.tools,
block.name,
)
if (tool?.backfillObservableInput) {
const originalInput = block.input as Record<string, unknown>
const inputCopy = { ...originalInput }
tool.backfillObservableInput(inputCopy)
const addedFields = Object.keys(inputCopy).some(
k => !(k in originalInput),
)
if (addedFields) {
clonedContent ??= [...message.message.content]
clonedContent[i] = { ...block, input: inputCopy }
}
}
}
}
if (clonedContent) {
yieldMessage = {
...message,
message: { ...message.message, content: clonedContent },
}
}
}
let withheld = false
if (feature('CONTEXT_COLLAPSE')) {
if (
contextCollapse?.isWithheldPromptTooLong(
message,
isPromptTooLongMessage,
querySource,
)
) {
withheld = true
}
}
if (reactiveCompact?.isWithheldPromptTooLong(message)) {
withheld = true
}
if (
mediaRecoveryEnabled &&
reactiveCompact?.isWithheldMediaSizeError(message)
) {
withheld = true
}
if (isWithheldMaxOutputTokens(message)) {
withheld = true
}
if (!withheld) {
yield yieldMessage
}
if (message.type === 'assistant') {
assistantMessages.push(message)
const msgToolUseBlocks = message.message.content.filter(
content => content.type === 'tool_use',
) as ToolUseBlock[]
if (msgToolUseBlocks.length > 0) {
toolUseBlocks.push(...msgToolUseBlocks)
needsFollowUp = true
}
if (
streamingToolExecutor &&
!toolUseContext.abortController.signal.aborted
) {
for (const toolBlock of msgToolUseBlocks) {
streamingToolExecutor.addTool(toolBlock, message)
}
}
}
if (
streamingToolExecutor &&
!toolUseContext.abortController.signal.aborted
) {
for (const result of streamingToolExecutor.getCompletedResults()) {
if (result.message) {
yield result.message
toolResults.push(
...normalizeMessagesForAPI(
[result.message],
toolUseContext.options.tools,
).filter(_ => _.type === 'user'),
)
}
}
}
}
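Processing logic:
- Fallback handling: if a streaming fallback occurred, the stale messages are tombstoned and the tool executor is discarded and recreated
- Input backfill: observable fields are backfilled into tool inputs on a cloned copy of the message
- Error withholding: recoverable errors are withheld instead of yielded, so that a recovery mechanism can deal with them
- Assistant messages: assistant messages and their tool-use blocks are collected
- Tool execution: tool-use blocks are handed to the executor
- Tool results: completed tool results are fetched and yielded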
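The input-backfill step uses a copy-on-write pattern: the message is cloned only if a backfill actually added a field, and the original object is never modified. In the excerpt, the unmodified message is the one pushed to assistantMessages while the clone is yielded. The sketch below isolates that pattern with reduced types. backfill and its absolutePath field are hypothetical examples of what a tool might add.

```typescript
interface ToolUseBlockSketch {
  type: 'tool_use';
  name: string;
  input: Record<string, unknown>;
}
interface TextBlockSketch {
  type: 'text';
  text: string;
}
type Block = ToolUseBlockSketch | TextBlockSketch;
interface AssistantMsgSketch {
  content: Block[];
}

// Hypothetical backfill: derives an extra, display-only field.
function backfill(input: Record<string, unknown>): void {
  const path = input['path'];
  if (typeof path === 'string' && !('absolutePath' in input)) {
    input['absolutePath'] = '/workspace/' + path;
  }
}

function withBackfilledInputs(message: AssistantMsgSketch): AssistantMsgSketch {
  let cloned: Block[] | undefined;
  for (let i = 0; i < message.content.length; i++) {
    const block = message.content[i];
    if (block.type !== 'tool_use') continue;
    const original = block.input;
    const copy = { ...original };
    backfill(copy);
    const added = Object.keys(copy).some(k => !(k in original));
    if (added) {
      cloned ??= [...message.content]; // clone lazily, only when something changed
      cloned[i] = { ...block, input: copy };
    }
  }
  return cloned ? { ...message, content: cloned } : message;
}
```

A message without tool-use blocks comes back as the very same object, so the common case costs no allocation.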
The StreamingToolExecutor Implementation
Now let's look at how StreamingToolExecutor is implemented:
typescript
export class StreamingToolExecutor {
private tools: TrackedTool[] = []
private toolUseContext: ToolUseContext
private hasErrored = false
private erroredToolDescription = ''
private siblingAbortController: AbortController
private discarded = false
private progressAvailableResolve?: () => void
constructor(
private readonly toolDefinitions: Tools,
private readonly canUseTool: CanUseToolFn,
toolUseContext: ToolUseContext,
) {
this.toolUseContext = toolUseContext
this.siblingAbortController = createChildAbortController(
toolUseContext.abortController,
)
}
discard(): void {
this.discarded = true
}
addTool(block: ToolUseBlock, assistantMessage: AssistantMessage): void {
const toolDefinition = findToolByName(this.toolDefinitions, block.name)
if (!toolDefinition) {
this.tools.push({
id: block.id,
block,
assistantMessage,
status: 'completed',
isConcurrencySafe: true,
pendingProgress: [],
results: [
createUserMessage({
content: [
{
type: 'tool_result',
content: `<tool_use_error>Error: No such tool available: ${block.name}</tool_use_error>`,
is_error: true,
tool_use_id: block.id,
},
],
toolUseResult: `Error: No such tool available: ${block.name}`,
sourceToolAssistantUUID: assistantMessage.uuid,
}),
],
})
return
}
const parsedInput = toolDefinition.inputSchema.safeParse(block.input)
const isConcurrencySafe = parsedInput?.success
? (() => {
try {
return Boolean(toolDefinition.isConcurrencySafe(parsedInput.data))
} catch {
return false
}
})()
: false
this.tools.push({
id: block.id,
block,
assistantMessage,
status: 'queued',
isConcurrencySafe,
pendingProgress: [],
})
void this.processQueue()
}
private canExecuteTool(isConcurrencySafe: boolean): boolean {
const executingTools = this.tools.filter(t => t.status === 'executing')
return (
executingTools.length === 0 ||
(isConcurrencySafe && executingTools.every(t => t.isConcurrencySafe))
)
}
private async processQueue(): Promise<void> {
for (const tool of this.tools) {
if (tool.status !== 'queued') continue
if (this.canExecuteTool(tool.isConcurrencySafe)) {
await this.executeTool(tool)
} else {
if (!tool.isConcurrencySafe) break
}
}
}
}
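Design points:
- Concurrency control: governed by the isConcurrencySafe flag
- Queue processing: tools are queued in arrival order
- Error handling: an unknown tool produces an error result immediately
- Child abort controller: a child controller is created so that tool execution can be aborted independently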
Design Takeaways
1. The Power of the State Machine
The while(true) loop combined with the State object forms a clear state machine:
typescript
while (true) {
const { messages, toolUseContext /* , ...other fields */ } = state
// Handle the current state
// ...
// Decide whether to continue
if (shouldStop) {
return terminal
}
// Update the state
state = { ...state, ...updates }
}
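Advantages of this design:
- Clarity: all state-transition logic lives in one place
- Maintainability: new states and transitions are easy to add
- Testability: each state can be tested in isolation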
2. The Elegance of Streaming
Async generators make stream processing elegant:
typescript
for await (const message of deps.callModel({...})) {
// Process each message as it arrives
if (message.type === 'assistant') {
// Handle the assistant message
}
yield message
}
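Advantages:
- Low latency: each message is processed as soon as it arrives
- Memory efficiency: there is no need to buffer all messages
- Interruptibility: the stream can be aborted at any time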
3. The Wisdom of Concurrency Control
The concurrency control in StreamingToolExecutor is compact but carefully designed:
typescript
private canExecuteTool(isConcurrencySafe: boolean): boolean {
const executingTools = this.tools.filter(t => t.status === 'executing')
return (
executingTools.length === 0 ||
(isConcurrencySafe && executingTools.every(t => t.isConcurrencySafe))
)
}
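Rules:
- If no tool is executing, any tool may start
- If tools are executing, a new tool may join only if it is concurrency-safe and every running tool is concurrency-safe too
- A tool that is not concurrency-safe must run exclusively
Think of a highway: ordinary traffic flows side by side across several lanes, but some vehicles, such as oversize loads, need the road to themselves.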
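One way to see the effect of these rules, together with the in-order queue shown earlier, is to ask which tools could end up running side by side for a given arrival order. The sketch below is a static simplification that assumes each group finishes before the next one starts, which the real executor does not require. planBatches and the tool names are hypothetical.

```typescript
interface QueuedTool {
  name: string;
  isConcurrencySafe: boolean;
}

// Group an arrival-ordered queue into batches that may run in parallel.
function planBatches(queue: QueuedTool[]): string[][] {
  const batches: string[][] = [];
  let currentSafeBatch: string[] | null = null;
  for (const tool of queue) {
    if (tool.isConcurrencySafe) {
      // Safe tools join the open batch of safe tools, or start a new one.
      if (!currentSafeBatch) {
        currentSafeBatch = [];
        batches.push(currentSafeBatch);
      }
      currentSafeBatch.push(tool.name);
    } else {
      // An unsafe tool runs alone and closes the current batch.
      batches.push([tool.name]);
      currentSafeBatch = null;
    }
  }
  return batches;
}
```

For the arrival order Read, Grep, Edit, Read, where only Edit is unsafe, the plan is [Read, Grep], then [Edit], then [Read]. The second Read waits behind Edit even though it is safe, because the queue preserves arrival order.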
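4. Layered Error Recovery
The system implements error recovery at several levels:
- Streaming fallback: falls back automatically when API streaming fails
- Model fallback: switches to a backup model when the primary is unavailable
- Compaction recovery: compacts automatically when the token limit is exceeded
- Context collapse: collapses automatically when the context grows long
Each level handles its own class of errors, and together they form a complete recovery system.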
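5. Performance Trade-offs
The system makes performance trade-offs in several places:
- Prefetching: memory and skill prefetching reduce waiting time
- Compaction: compacting early reduces token consumption
- Caching: the file cache avoids repeated reads
- Concurrency: concurrency-safe tools run in parallel
Questions for Reflection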
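- Design: How would you add priority support to StreamingToolExecutor so that certain tools run first?
- Optimization: Under what circumstances should streaming tool execution be disabled? What are the pros and cons?
- Extension: How would you add a timeout mechanism to the state machine to prevent infinite loops?
- Testing: How would you write integration tests for query()? Which dependencies need to be mocked?
- Architecture: How does a state machine differ from an event-driven architecture? In which scenarios would you choose each?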
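Summary
The tool-call loop is the action loop of Claude Code. It:
- Manages state: the State object encapsulates everything that carries across turns
- Streams: async generators handle streaming responses cleanly
- Controls concurrency: the isConcurrencySafe flag governs which tools may run in parallel
- Recovers from errors: recovery mechanisms operate at several levels
- Optimizes performance: through prefetching, compaction, caching, and more
Like the chef's routine, the tool-call loop keeps cycling: prep the ingredients (context), cook (call tools), taste (get results), adjust (continue the conversation), until the dish is done (the task is complete). Its design reflects a number of software engineering best practices: state-machine design, stream processing, concurrency control, and error recovery.
Once you understand the design of the tool-call loop, you understand the core interaction mechanism and action logic of the entire Claude Code system.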