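Claude Code Source Walkthrough - ShellTool and Real Actions

Phase 6: ShellTool and Real Actions

Phase 6 in one sentence

Phase 5 was about:

The model proposes an action; the local runtime decides whether it may run.

Phase 6 is about:

Once execution is allowed, how does Claude Code actually run a Bash command,
and how does it constrain, observe, truncate, and background that command before feeding the result back into the agent loop?

In other words, the permission system answers "should this run?", while BashTool answers "how do we run a real action safely?".

The main flow of this phase looks like this: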

assistant/tool_use(name="Bash", input={ command, timeout, ... })
-> toolExecution.checkPermissionsAndCallTool()
-> schema 校验
-> validateInput()
-> PreToolUse hooks
-> BashTool.checkPermissions()
-> BashTool.call()
-> runShellCommand()
-> exec(command, signal, 'bash', options)
-> progress / timeout / background / output file
-> Out
-> mapToolResultToToolResultBlockParam()
-> user/tool_result
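-> the model continues with the next turn

src/tools/BashTool/BashTool.tsx: the Bash tool object

[Current source]

src/tools/BashTool/BashTool.tsx:227-264

[What problem this solves]

This part defines which parameters the model may pass when it calls the Bash tool.

Claude Code does not expose the shell tool as a bare function that takes an arbitrary string. It uses a Zod schema to pin down the input structure.

[How the source does it]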

const fullInputSchema = lazySchema(() => z.strictObject({
  command: z.string().describe('The command to execute'),
  timeout: semanticNumber(z.number().optional()).describe(...),
  description: z.string().optional().describe(...),
  run_in_background: semanticBoolean(z.boolean().optional()).describe(...),
  dangerouslyDisableSandbox: semanticBoolean(z.boolean().optional()).describe(...),
  _simulatedSedEdit: z.object({
    filePath: z.string(),
    newContent: z.string()
  }).optional().describe('Internal: pre-computed sed edit result from preview')
}))

const inputSchema = lazySchema(() =>
  isBackgroundTasksDisabled
    ? fullInputSchema().omit({
        run_in_background: true,
        _simulatedSedEdit: true,
      })
    : fullInputSchema().omit({
        _simulatedSedEdit: true,
      }),
)

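The most important fields are:

command: the shell command to execute.
timeout: the command timeout.
description: a short summary shown in the UI and to the user.
run_in_background: whether to run in the background.
dangerouslyDisableSandbox: whether to request a sandbox bypass.
_simulatedSedEdit: an internal field that must not be exposed to the model.

There is one key security decision here: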

// Always omit _simulatedSedEdit from the model-facing schema.

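_simulatedSedEdit is an internal field that the local permission system injects after the permission dialog has previewed a sed edit. The model cannot construct it on its own.

The source comment gives the reason: if the model could pass this field, it could pair a harmless-looking command with arbitrary file content in _simulatedSedEdit.newContent and bypass both the sed preview and the sandbox.

A handy way to remember it:

fullInputSchema: the complete internal input
inputSchema: the input the model actually sees

Internal fields may exist, but they must never enter the schema the model controls.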
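The split between the internal and the model-facing input can be sketched without Zod. This is a minimal, hypothetical illustration rather than Claude Code's implementation: `MODEL_FACING_KEYS` and `parseModelInput` are made-up names, and the strict key check only approximates what a strict object schema with the internal field omitted would enforce.

```ts
// Hypothetical sketch: these names are illustrative, not from the Claude Code source.
type ModelBashInput = {
  command: string
  timeout?: number
  description?: string
  run_in_background?: boolean
  dangerouslyDisableSandbox?: boolean
}

// Keys the model is allowed to send. `_simulatedSedEdit` is deliberately absent.
const MODEL_FACING_KEYS: ReadonlySet<string> = new Set([
  'command',
  'timeout',
  'description',
  'run_in_background',
  'dangerouslyDisableSandbox',
])

type ParseResult =
  | { ok: true; data: ModelBashInput }
  | { ok: false; error: string }

// Strict parsing: any key outside the model-facing set rejects the whole input,
// so the model cannot smuggle in a runtime-only field.
function parseModelInput(raw: Record<string, unknown>): ParseResult {
  for (const key of Object.keys(raw)) {
    if (!MODEL_FACING_KEYS.has(key)) {
      return { ok: false, error: `Unrecognized key: ${key}` }
    }
  }
  if (typeof raw['command'] !== 'string') {
    return { ok: false, error: 'command must be a string' }
  }
  // Only `command` is type-checked here; a real schema would check every field.
  return { ok: true, data: raw as unknown as ModelBashInput }
}
```

With this shape, an input such as `{ command: 'echo hi', _simulatedSedEdit: {...} }` is rejected before any tool logic runs, which is the property the article attributes to the model-facing schema.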

[C++ analogy]

You can think of it like this:

struct BashInputInternal {
    std::string command;
    std::optional<int> timeout;
    std::optional<std::string> description;
    std::optional<bool> run_in_background;
    std::optional<bool> dangerouslyDisableSandbox;

    // private runtime-only field
    std::optional<SimulatedSedEdit> simulatedSedEdit;
};

struct BashInputPublic {
    std::string command;
    std::optional<int> timeout;
    std::optional<std::string> description;
    std::optional<bool> run_in_background;
    std::optional<bool> dangerouslyDisableSandbox;
};

The model only sees BashInputPublic; only the local runtime can use BashInputInternal.

[Notes]

### BashTool input schema

The input to `BashTool` is not a bare string but a structured object:

```ts
{
  command: string
  timeout?: number
  description?: string
  run_in_background?: boolean
  dangerouslyDisableSandbox?: boolean
}
```

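The source also has an internal `_simulatedSedEdit` field, but `inputSchema()` removes it from the model-visible schema.

The takeaway: a tool's input schema is not just a type constraint, it is also a security boundary. The fields the model can see determine the behavior the model can control.

BashTool output structure

[Current source]

src/tools/BashTool/BashTool.tsx:279-296

[What problem this solves]

The result of a shell command is more than stdout.

Claude Code needs to tell the model:

What the command printed.
Whether there was stderr output.
Whether it was interrupted.
Whether it produced an image.
Whether it is running in the background.
Whether large output was persisted to a file.
Whether a non-zero exit code really counts as an error.

[How the source does it]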

const outputSchema = lazySchema(() => z.object({
  stdout: z.string().describe('The standard output of the command'),
  stderr: z.string().describe('The standard error output of the command'),
  rawOutputPath: z.string().optional().describe(...),
  interrupted: z.boolean().describe('Whether the command was interrupted'),
  isImage: z.boolean().optional().describe(...),
  backgroundTaskId: z.string().optional().describe(...),
  backgroundedByUser: z.boolean().optional().describe(...),
  assistantAutoBackgrounded: z.boolean().optional().describe(...),
  dangerouslyDisableSandbox: z.boolean().optional().describe(...),
  returnCodeInterpretation: z.string().optional().describe(...),
  noOutputExpected: z.boolean().optional().describe(...),
  structuredContent: z.array(z.any()).optional().describe(...),
  persistedOutputPath: z.string().optional().describe(...),
  persistedOutputSize: z.number().optional().describe(...),
}))

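Out here can be read as the Bash tool's internal result structure:

Out =
  stdout/stderr
  + execution status
  + background task info
  + sandbox info
  + large-output file info
  + special-content info

This differs from the output of an ordinary CLI.

An ordinary CLI only cares about:

stdout + stderr + exit code

An agent runtime also has to ask:

How should this result be shown to the user?
How can it be handed to the model safely?
What if it is too large?
How does the model learn about a background task?
How does an image become an image block?

[Notes]

### BashTool output is not a plain string

The `BashTool` output type `Out` contains at least:

- `stdout`
- `stderr`
- `interrupted`
- `backgroundTaskId`
- `returnCodeInterpretation`
- `persistedOutputPath`
- `persistedOutputSize`

This shows that the shell tool is a "real process executor", not a simple text function. It has to carry process lifecycle, output size, background tasks, and error semantics back into the agent loop.

buildTool: registering Bash as a real Tool

[Current source]

src/tools/BashTool/BashTool.tsx:420-545

[What problem this solves]

This part plugs BashTool into the unified Tool abstraction covered in Phase 3.

That gives it:

A tool name.
A prompt.
A schema.
A read-only check.
A permission check.
Input validation.
UI rendering.
tool_result conversion.
The actual execution logic.

[How the source does it]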

export const BashTool = buildTool({
  name: BASH_TOOL_NAME,
  searchHint: 'execute shell commands',
  maxResultSizeChars: 30_000,
  strict: true,

  async description({ description }) {
    return description || 'Run shell command'
  },

  async prompt() {
    return getSimplePrompt()
  },

  isConcurrencySafe(input) {
    return this.isReadOnly?.(input) ?? false
  },

  isReadOnly(input) {
    const compoundCommandHasCd = commandHasAnyCd(input.command)
    const result = checkReadOnlyConstraints(input, compoundCommandHasCd)
    return result.behavior === 'allow'
  },

  async validateInput(input) { ... },

  async checkPermissions(input, context) {
    return bashToolHasPermission(input, context)
  },

  async call(input, toolUseContext, ...) { ... },
})

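A few points deserve attention.

First, the tool's name is BASH_TOOL_NAME. The tool_use.name emitted by the model must match it, or the local runtime cannot find the tool.

Second, maxResultSizeChars: 30_000 sets a size threshold for Bash results. Large output cannot be poured into the context without limit.

Third, strict: true means the input must conform to the schema exactly.

Fourth, isConcurrencySafe() returns true only when the command is read-only: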

isConcurrencySafe(input) {
  return this.isReadOnly?.(input) ?? false
}

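So Claude Code is reluctant to run writable shell commands concurrently: two write commands running at once could trample each other's file state.

Fifth, isReadOnly() does not judge by the surface of the command string; it calls checkReadOnlyConstraints():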

const compoundCommandHasCd = commandHasAnyCd(input.command)
const result = checkReadOnlyConstraints(input, compoundCommandHasCd)
return result.behavior === 'allow'

So the read-only check also has to account for cd, compound commands, git sandbox escapes, and similar corner cases.

[C++ analogy]

struct BashTool : Tool<BashInput, BashOutput> {
    std::string name() { return "Bash"; }

    bool isConcurrencySafe(const BashInput& input) {
        return isReadOnly(input);
    }

    bool isReadOnly(const BashInput& input) {
        return checkReadOnlyConstraints(input.command).allow;
    }

    PermissionResult checkPermissions(const BashInput& input, Context& ctx) {
        return bashToolHasPermission(input, ctx);
    }

    ToolResult<BashOutput> call(const BashInput& input, Context& ctx) {
        return runShellCommand(input, ctx);
    }
};

[Notes]

### BashTool plugs into the Tool abstraction

`BashTool` is a tool object registered through `buildTool({...})`.

Its core fields are:

- `name: BASH_TOOL_NAME`
- `inputSchema`
- `outputSchema`
- `isReadOnly()`
- `isConcurrencySafe()`
- `validateInput()`
- `checkPermissions()`
- `call()`
- `mapToolResultToToolResultBlockParam()`

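`BashTool.isConcurrencySafe()` allows concurrency only when `isReadOnly()` is true, which ties the shell tool's scheduling policy to its safety properties.

validateInput: blocking inappropriate foreground sleep

[Current source]

src/tools/BashTool/BashTool.tsx:317-337

src/tools/BashTool/BashTool.tsx:524-538

[What problem this solves]

The model sometimes waits for a state change with something like sleep 30 && curl ....

In an interactive agent loop, however, a long sleep wastes the foreground session and makes the tool feel stuck to the user.

So before executing, BashTool steers long sleeps toward background tasks or the Monitor tool.

[How the source does it]

First, detect whether the command starts with a sleep of 2 seconds or more: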

export function detectBlockedSleepPattern(command: string): string | null {
  const parts = splitCommand_DEPRECATED(command)
  if (parts.length === 0) return null
  const first = parts[0]?.trim() ?? ''

  const m = /^sleep\s+(\d+)\s*$/.exec(first)
  if (!m) return null

  const secs = parseInt(m[1]!, 10)
  if (secs < 2) return null

  const rest = parts.slice(1).join(' ').trim()
  return rest ? `sleep ${secs} followed by: ${rest}` : `standalone sleep ${secs}`
}

Then intercept it in validateInput():

async validateInput(input: BashToolInput): Promise<ValidationResult> {
  if (feature('MONITOR_TOOL') && !isBackgroundTasksDisabled && !input.run_in_background) {
    const sleepPattern = detectBlockedSleepPattern(input.command)
    if (sleepPattern !== null) {
      return {
        result: false,
        message: `Blocked: ${sleepPattern}. Run blocking commands in the background ...`,
        errorCode: 10,
      }
    }
  }
  return { result: true }
}

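This is not a permission denial; it is an input validation failure.

The difference:

permission deny: this action is not allowed
validateInput false: this way of calling the tool is inappropriate

Both end up as a tool_result with is_error=true, so the model can adjust its approach on the next turn.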
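To see what the detector accepts and rejects, here is a self-contained approximation. It assumes a naive splitter (`naiveSplit`, a hypothetical stand-in) that breaks on `&&`, `||`, and `;` without honoring quotes; the real code relies on `splitCommand_DEPRECATED`, whose behavior is not shown in this article.

```ts
// Hypothetical stand-in for splitCommand_DEPRECATED: splits on &&, || and ;
// and does NOT understand quoting, so it is only good enough for a demo.
function naiveSplit(command: string): string[] {
  return command
    .split(/&&|\|\||;/)
    .map(part => part.trim())
    .filter(part => part.length > 0)
}

// Same shape as the detector quoted above: only the first sub-command is
// inspected, and only a bare `sleep <integer>` of 2 seconds or more is flagged.
function detectBlockedSleep(command: string): string | null {
  const parts = naiveSplit(command)
  const first = parts[0] ?? ''
  const digits = /^sleep\s+(\d+)\s*$/.exec(first)?.[1]
  if (digits === undefined) return null
  const secs = parseInt(digits, 10)
  if (secs < 2) return null
  const rest = parts.slice(1).join(' ').trim()
  return rest ? `sleep ${secs} followed by: ${rest}` : `standalone sleep ${secs}`
}

// A few probes showing which calls would be steered to the background.
const probes = ['sleep 30 && curl localhost', 'sleep 5', 'sleep 1', 'sleep 0.5', 'echo hi && sleep 30']
const verdicts = probes.map(p => detectBlockedSleep(p))
```

Under these assumptions, only the first two probes are flagged. `sleep 1` is below the threshold, `sleep 0.5` does not match the integer-only pattern, and a sleep that is not the first sub-command is ignored.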

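[Notes]

### validateInput blocks long sleeps

`BashTool.validateInput()` intercepts `sleep N` calls that are unsuitable for foreground execution before the tool runs.

This is not the permission system judging danger; it is the runtime judging that "this call would stall the agent loop". If the model really needs to wait, it should use a background task or the Monitor tool.

mapToolResultToToolResultBlockParam: feeding the process result back to the model

[Current source]

src/tools/BashTool/BashTool.tsx:555-623

[What problem this solves]

BashTool.call() returns the internal Out, but what the model needs to see on the next turn is an Anthropic API tool_result block.

This part performs the conversion.

[How the source does it]

The source handles structured content first: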

if (structuredContent && structuredContent.length > 0) {
  return {
    tool_use_id: toolUseID,
    type: 'tool_result',
    content: structuredContent,
  }
}

Then images:

if (isImage) {
  const block = buildImageToolResult(stdout, toolUseID)
  if (block) return block
}

For plain text, it cleans up stdout:

let processedStdout = stdout
if (stdout) {
  processedStdout = stdout.replace(/^(\s*\n)+/, '')
  processedStdout = processedStdout.trimEnd()
}

If the output was too large, it has already been persisted to the tool-results directory, and the model gets a readable file path plus a preview:

if (persistedOutputPath) {
  const preview = generatePreview(processedStdout, PREVIEW_SIZE_BYTES)
  processedStdout = buildLargeToolResultMessage({
    filepath: persistedOutputPath,
    originalSize: persistedOutputSize ?? 0,
    isJson: false,
    preview: preview.preview,
    hasMore: preview.hasMore,
  })
}

Next it assembles stderr, interrupted, and backgroundInfo:

let errorMessage = stderr.trim()
if (interrupted) {
  if (stderr) errorMessage += EOL
  errorMessage += '<error>Command was aborted before completion</error>'
}

let backgroundInfo = ''
if (backgroundTaskId) {
  const outputPath = getTaskOutputPath(backgroundTaskId)
  backgroundInfo = `Command running in background with ID: ${backgroundTaskId}. Output is being written to: ${outputPath}`
}

Finally it returns the actual tool_result:

return {
  tool_use_id: toolUseID,
  type: 'tool_result',
  content: [processedStdout, errorMessage, backgroundInfo].filter(Boolean).join('\n'),
  is_error: interrupted,
}

Note: a non-zero exit code does not necessarily set is_error here. Earlier, call() uses interpretCommandResult() to decide whether to throw a ShellError.
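The text-only path of this conversion can be sketched as one small function. The types `TextOut` and `ToolResultBlock` and the function `toToolResult` are hypothetical simplifications for illustration; structured content, images, and the persisted-output preview are left out, and the output path in the background message is omitted.

```ts
// Hypothetical, text-only subset of the Out type discussed above.
type TextOut = {
  stdout: string
  stderr: string
  interrupted: boolean
  backgroundTaskId?: string
}

type ToolResultBlock = {
  type: 'tool_result'
  tool_use_id: string
  content: string
  is_error: boolean
}

function toToolResult(out: TextOut, toolUseID: string): ToolResultBlock {
  // Drop leading blank lines and trailing whitespace, as the quoted source does.
  const stdout = out.stdout.replace(/^(\s*\n)+/, '').trimEnd()

  let errorMessage = out.stderr.trim()
  if (out.interrupted) {
    if (errorMessage) errorMessage += '\n'
    errorMessage += '<error>Command was aborted before completion</error>'
  }

  const backgroundInfo = out.backgroundTaskId
    ? `Command running in background with ID: ${out.backgroundTaskId}`
    : ''

  return {
    type: 'tool_result',
    tool_use_id: toolUseID,
    // Empty segments are filtered out before joining, so no stray blank lines appear.
    content: [stdout, errorMessage, backgroundInfo].filter(Boolean).join('\n'),
    // Only interruption marks the block as an error on this path.
    is_error: out.interrupted,
  }
}
```

Note how a command that wrote to stderr but was not interrupted still yields `is_error: false` here, which matches the point that error status is decided earlier, not by the presence of stderr.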

[C++ analogy]

ToolResultBlock toApiBlock(BashOutput out, std::string toolUseId) {
    if (out.isImage) {
        return makeImageToolResult(out.stdout, toolUseId);
    }

    std::string content = trim(out.stdout);

    if (out.persistedOutputPath) {
        content = makeLargeOutputMessage(out.persistedOutputPath, preview(out.stdout));
    }

    if (!out.stderr.empty()) {
        content += "\n" + out.stderr;
    }

    if (out.backgroundTaskId) {
        content += "\nCommand running in background with ID: " + *out.backgroundTaskId;
    }

    return ToolResultBlock{toolUseId, content, out.interrupted};
}

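[Notes]

### How Bash output gets back to the model

`mapToolResultToToolResultBlockParam()` converts the internal output `Out` of `BashTool.call()` into the `tool_result` the API expects.

It handles:

- structured content
- image output
- stdout cleanup
- previews of persisted large output
- stderr
- interrupted
- background task info

What finally reaches the model is: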

```ts
{
  type: "tool_result",
  tool_use_id,
  content,
  is_error
}
```

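This is the input with which the Phase 1 agent loop continues into its next round of reasoning.

BashTool.call: from tool call to real execution

[Current source]

src/tools/BashTool/BashTool.tsx:624-820

[What problem this solves]

This part is the real execution entry point of BashTool.

Once permission has been granted, the runtime calls: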

BashTool.call(input, toolUseContext, canUseTool, parentMessage, onProgress)

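It is responsible for:

Handling the internal sed edit.
Calling runShellCommand().
Consuming the progress generator.
Interpreting the exit code.
Handling errors.
Persisting large output.
Compressing image output.
Building Out.

[How the source does it]

Step 1, handle _simulatedSedEdit: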

if (input._simulatedSedEdit) {
  return applySedEdit(input._simulatedSedEdit, toolUseContext, parentMessage)
}

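This branch does not actually run sed. It writes the new content that was previewed during the permission check straight to the file, which guarantees:

the diff the user saw in the permission dialog
==
the content finally written to the file

Step 2, prepare the run context: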

const {
  abortController,
  getAppState,
  setAppState,
  setToolJSX,
} = toolUseContext

const stdoutAccumulator = new EndTruncatingAccumulator()
let stderrForShellReset = ''
let wasInterrupted = false
let result: ExecResult
const isMainThread = !toolUseContext.agentId
const preventCwdChanges = !isMainThread

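Note preventCwdChanges here:

The main agent may change the working directory.
Inside a subagent, cwd changes are blocked by default so that a subtask cannot pollute the main session's state.

Step 3, call runShellCommand():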

const commandGenerator = runShellCommand({
  input,
  abortController,
  setAppState: toolUseContext.setAppStateForTasks ?? setAppState,
  setToolJSX,
  preventCwdChanges,
  isMainThread,
  toolUseId: toolUseContext.toolUseId,
  agentId: toolUseContext.agentId,
})

runShellCommand() is an async generator. It keeps yielding progress and finally returns an ExecResult.

It is consumed like this:

let generatorResult
do {
  generatorResult = await commandGenerator.next()
  if (!generatorResult.done && onProgress) {
    const progress = generatorResult.value
    onProgress({
      toolUseID: `bash-progress-${progressCounter++}`,
      data: {
        type: 'bash_progress',
        output: progress.output,
        fullOutput: progress.fullOutput,
        elapsedTimeSeconds: progress.elapsedTimeSeconds,
        totalLines: progress.totalLines,
        totalBytes: progress.totalBytes,
        taskId: progress.taskId,
        timeoutMs: progress.timeoutMs,
      },
    })
  }
} while (!generatorResult.done)

result = generatorResult.value

This is why ShellTool can display output progress in real time.
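The consumer above calls next() by hand rather than using a for-await loop, and the reason is a general property of JavaScript generators: iteration loops discard the generator's return value. The sketch below shows this with a synchronous generator to keep it short; the async case differs only by the added `await`. The names `fakeCommand`, `collectWithForOf`, and `collectWithNext` are hypothetical.

```ts
type Progress = { output: string }
type Final = { code: number }

// A generator that yields progress and *returns* a final result,
// mirroring the AsyncGenerator<Progress, ExecResult, void> shape in miniature.
function* fakeCommand(): Generator<Progress, Final, void> {
  yield { output: 'line 1' }
  yield { output: 'line 2' }
  return { code: 0 }
}

// for...of only sees yielded values; the return value is never exposed to the loop body.
function collectWithForOf(): { progress: string[]; final: Final | undefined } {
  const progress: string[] = []
  for (const p of fakeCommand()) progress.push(p.output)
  return { progress, final: undefined }
}

// Manual next() keeps the last IteratorResult, whose value is the return value.
function collectWithNext(): { progress: string[]; final: Final } {
  const progress: string[] = []
  const gen = fakeCommand()
  for (;;) {
    const step = gen.next()
    if (step.done) return { progress, final: step.value }
    progress.push(step.value.output)
  }
}
```

Both collectors see the same two progress items, but only the manual loop can hand back the final result, which is what the do/while loop in call() needs in order to assign `result`.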

Step 4, interpret the command result:

interpretationResult = interpretCommandResult(
  input.command,
  result.code,
  result.stdout || '',
  '',
)

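Not every non-zero code is an error. For example, grep returning 1 means nothing matched, which is not necessarily a tool failure.

Step 5, if the result is semantically an error, throw a ShellError: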

if (interpretationResult.isError && !isInterrupt) {
  if (result.code !== 0) {
    stdoutAccumulator.append(`Exit code ${result.code}`)
  }
}

...

if (interpretationResult.isError && !isInterrupt) {
  throw new ShellError('', outputWithSbFailures, result.code, result.interrupted)
}

The outer tool execution layer catches this and converts it into an error tool_result.

Step 6, persist large output:

if (result.outputFilePath && result.outputTaskId) {
  const fileStat = await fsStat(result.outputFilePath)
  persistedOutputSize = fileStat.size
  await ensureToolResultsDir()
  const dest = getToolResultPath(result.outputTaskId, false)
  if (fileStat.size > MAX_PERSISTED_SIZE) {
    await fsTruncate(result.outputFilePath, MAX_PERSISTED_SIZE)
  }
  await link(result.outputFilePath, dest)
  persistedOutputPath = dest
}

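The strategy is:

Small output: goes directly into stdout
Large output: written to the tool-results directory; the model gets only a preview plus the file path
Over 64MB: the persisted file is truncated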
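The "preview plus file path" idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: `makePreview` and `describeOutput` are hypothetical names, the limit is counted in characters rather than bytes, and the message wording is invented. The real `generatePreview()` and `buildLargeToolResultMessage()` are not shown in this article.

```ts
// Hypothetical helper: cut the text to a limit and report whether more remains.
function makePreview(
  text: string,
  maxChars: number,
): { preview: string; hasMore: boolean } {
  if (text.length <= maxChars) {
    return { preview: text, hasMore: false }
  }
  const head = text.slice(0, maxChars)
  // Prefer to cut at the last complete line so the preview does not end mid-line.
  const lastNewline = head.lastIndexOf('\n')
  const preview = lastNewline > 0 ? head.slice(0, lastNewline) : head
  return { preview, hasMore: true }
}

// Decide what the model sees: the full text when nothing was persisted,
// or a pointer plus a preview when the output lives in a file.
function describeOutput(text: string, persistedPath: string | undefined): string {
  if (!persistedPath) return text
  const { preview, hasMore } = makePreview(text, 2000)
  const suffix = hasMore ? '\n...' : ''
  return `Output saved to: ${persistedPath}\nPreview:\n${preview}${suffix}`
}
```

The design choice worth noticing is that the model keeps a way to reach the full output (the path) while the context only pays for the preview.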

Step 7, handle Claude Code hints and image output:

const extracted = extractClaudeCodeHints(strippedStdout, input.command)
strippedStdout = extracted.stripped

let isImage = isImageOutput(strippedStdout)
if (isImage) {
  const resized = await resizeShellImageOutput(...)
  ...
}

<claude-code-hint /> is a hint tag that some CLIs/SDKs emit on stderr. The source records the hint but keeps the tag away from the model.

If stdout is an image data URI, the source tries to compress or resize it and then returns it as an image tool_result.

Finally, build Out:

const data: Out = {
  stdout: compressedStdout,
  stderr: stderrForShellReset,
  interrupted: wasInterrupted,
  isImage,
  returnCodeInterpretation: interpretationResult?.message,
  noOutputExpected: isSilentBashCommand(input.command),
  backgroundTaskId: result.backgroundTaskId,
  backgroundedByUser: result.backgroundedByUser,
  assistantAutoBackgrounded: result.assistantAutoBackgrounded,
  dangerouslyDisableSandbox: input.dangerouslyDisableSandbox,
  persistedOutputPath,
  persistedOutputSize,
}

return { data }

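[Notes]

### Responsibilities of BashTool.call

`BashTool.call()` is the execution entry point of the shell tool.

It is not a plain call to `exec()`; it wraps execution in a full runtime layer:

1. The internal sed edit is written directly, so preview and write stay identical.
2. It calls `runShellCommand()` to get progress plus the final result.
3. It interprets the exit code with `interpretCommandResult()`.
4. Semantic errors throw `ShellError`.
5. Large output is persisted to a tool-results file.
6. It handles Claude Code hints.
7. It detects and compresses image output.
8. It builds the structured `Out`.

So BashTool.call can be viewed as: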

```text
process execution wrapper
  + progress bridge
  + semantic error handling
  + output persistence
  + image handling
  + agent-loop result adapter
```

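runShellCommand: the lifecycle of a real shell process

[Current source]

src/tools/BashTool/BashTool.tsx:826-1143

[What problem this solves]

runShellCommand() is where the command is actually handed to the underlying shell.

It uses an async generator to express a process lifecycle:

Start the command
-> wait for an initial result or the progress threshold
-> poll the output file
-> yield progress
-> support backgrounding
-> return the final ExecResult

[How the source does it]

Function signature: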

async function* runShellCommand(...): AsyncGenerator<Progress, ExecResult, void>

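This means it is not an ordinary async function.

An ordinary async function:

await run() -> yields the final result in one shot

An async generator:

for await / next()
  -> receives progress several times along the way
  -> the final next() carries the ExecResult

Step 1, read the input and compute the timeout: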

const {
  command,
  description,
  timeout,
  run_in_background,
} = input

const timeoutMs = timeout || getDefaultTimeoutMs()

Step 2, decide whether auto-backgrounding is allowed:

const shouldAutoBackground =
  !isBackgroundTasksDisabled &&
  isAutobackgroundingAllowed(command)

The source states explicitly that sleep should not be auto-backgrounded:

const DISALLOWED_AUTO_BACKGROUND_COMMANDS = ['sleep']

A bare sleep is usually not a meaningful background task.

Step 3, actually execute the command:

const shellCommand = await exec(command, abortController.signal, 'bash', {
  timeout: timeoutMs,
  onProgress(lastLines, allLines, totalLines, totalBytes, isIncomplete) {
    lastProgressOutput = lastLines
    fullOutput = allLines
    lastTotalLines = totalLines
    lastTotalBytes = isIncomplete ? totalBytes : 0
    const resolve = resolveProgress
    if (resolve) {
      resolveProgress = null
      resolve()
    }
  },
  preventCwdChanges,
  shouldUseSandbox: shouldUseSandbox(input),
  shouldAutoBackground,
})

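This is the central real action of Phase 6:

exec(command, signal, 'bash', options)

Key parameters:

abortController.signal: lets a user interrupt or an upstream cancellation stop the command.
timeout: the limit on command run time.
onProgress: wakes the progress generator when output updates.
preventCwdChanges: whether to stop the command from changing cwd.
shouldUseSandbox: whether to run inside the sandbox.
shouldAutoBackground: whether to auto-background on timeout or long runs.

Step 4, if the model explicitly asks for background execution: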

if (run_in_background === true && !isBackgroundTasksDisabled) {
  const shellId = await spawnBackgroundTask()
  return {
    stdout: '',
    stderr: '',
    code: 0,
    interrupted: false,
    backgroundTaskId: shellId,
  }
}

The tool call then returns quickly with a backgroundTaskId, and the model and the user can follow the result later through the task output path.

Step 5, wait for a progress threshold first:

const initialResult = await Promise.race([
  resultPromise,
  new Promise<null>(resolve => {
    const t = setTimeout((r) => r(null), PROGRESS_THRESHOLD_MS, resolve)
    t.unref()
  }),
])

if (initialResult !== null) {
  shellCommand.cleanup()
  return initialResult
}

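Meaning:

If the command finishes quickly, the complex progress UI is skipped.
Only if it is still running past the threshold do polling and progress display begin.

Step 6, start output polling: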

TaskOutput.startPolling(shellCommand.taskOutput.taskId)

Then enter the progress loop:

while (true) {
  const progressSignal = createProgressSignal()
  const result = await Promise.race([resultPromise, progressSignal])

  if (result !== null) {
    shellCommand.cleanup()
    return result
  }

  if (backgroundShellId) {
    return {
      stdout: '',
      stderr: '',
      code: 0,
      interrupted: false,
      backgroundTaskId: backgroundShellId,
      assistantAutoBackgrounded,
    }
  }

  yield {
    type: 'progress',
    fullOutput,
    output: lastProgressOutput,
    elapsedTimeSeconds,
    totalLines: lastTotalLines,
    totalBytes: lastTotalBytes,
    taskId: shellCommand.taskOutput.taskId,
    timeoutMs,
  }
}

There are three possible outcomes:

Command finished: return ExecResult
Command was backgrounded: return backgroundTaskId
Command still running: yield progress

Finally, polling is stopped no matter what:

finally {
  TaskOutput.stopPolling(shellCommand.taskOutput.taskId)
}

[C++ analogy]

Think of it as a coroutine:

Generator<Progress, ExecResult> runShellCommand(BashInput input) {
    auto process = exec(input.command, options);

    while (process.running()) {
        if (process.hasNewOutput()) {
            co_yield Progress{process.lastLines()};
        }

        if (process.shouldBackground()) {
            co_return ExecResult{.backgroundTaskId = process.taskId()};
        }
    }

    co_return process.result();
}

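[Notes]

### runShellCommand is a coroutine over the shell process lifecycle

`runShellCommand()` is an `async function*`: it will `yield progress` along the way and finally `return ExecResult`.

Core chain:

```text
exec(command, signal, "bash", options)
-> resultPromise
-> progress threshold
-> TaskOutput.startPolling()
-> while true:
     resultPromise settles: return ExecResult
     backgroundShellId appears: return background task result
     otherwise yield bash_progress
-> finally stopPolling
```

This is why Claude Code can run a command, display output progress, and support backgrounding all at the same time.

shouldUseSandbox: whether a command enters the sandbox

[Current source]

src/tools/BashTool/shouldUseSandbox.ts:130-153

[What problem this solves]

The sandbox is not a replacement for the permission system; it is a layer of isolation applied when Bash executes.

This part decides whether a given command should run in the sandbox.

[How the source does it]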

export function shouldUseSandbox(input: Partial<SandboxInput>): boolean {
  if (!SandboxManager.isSandboxingEnabled()) {
    return false
  }

  if (
    input.dangerouslyDisableSandbox &&
    SandboxManager.areUnsandboxedCommandsAllowed()
  ) {
    return false
  }

  if (!input.command) {
    return false
  }

  if (containsExcludedCommand(input.command)) {
    return false
  }

  return true
}

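The checks run in this order:

Sandboxing is off globally: no sandbox.
The input requests dangerouslyDisableSandbox and policy allows unsandboxed commands: no sandbox.
No command: no sandbox.
The command matches excludedCommands: no sandbox.
Everything else: use the sandbox.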
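The ordering above can be exercised as a pure function once the global state is passed in explicitly. This is a sketch under stated assumptions: `SandboxPolicy` and `decideSandbox` are hypothetical, the policy flags stand in for the `SandboxManager` calls, and the excluded-command check matches only the first word, because the real `containsExcludedCommand()` is not shown in this article.

```ts
// Hypothetical policy object replacing the SandboxManager lookups.
type SandboxPolicy = {
  sandboxingEnabled: boolean
  unsandboxedCommandsAllowed: boolean
  excludedCommands: string[]
}

type SandboxInput = { command?: string; dangerouslyDisableSandbox?: boolean }

// Returns the decision plus the rule that produced it. Rules are checked in
// the same order as the list above; the first match wins.
function decideSandbox(
  input: SandboxInput,
  policy: SandboxPolicy,
): { useSandbox: boolean; reason: string } {
  if (!policy.sandboxingEnabled) {
    return { useSandbox: false, reason: 'sandboxing disabled globally' }
  }
  if (input.dangerouslyDisableSandbox && policy.unsandboxedCommandsAllowed) {
    return { useSandbox: false, reason: 'opt-out requested and allowed by policy' }
  }
  if (!input.command) {
    return { useSandbox: false, reason: 'no command' }
  }
  // Simplification: compare the first word only.
  const firstWord = input.command.trim().split(/\s+/)[0] ?? ''
  if (policy.excludedCommands.includes(firstWord)) {
    return { useSandbox: false, reason: 'excluded command' }
  }
  return { useSandbox: true, reason: 'default' }
}

const strictPolicy: SandboxPolicy = {
  sandboxingEnabled: true,
  unsandboxedCommandsAllowed: false,
  excludedCommands: ['docker'],
}
```

The case to notice is a model that sets `dangerouslyDisableSandbox: true` under `strictPolicy`: the request alone is not enough, because the second rule also requires the policy to allow unsandboxed commands.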

Note what the source comment stresses:

// excludedCommands is a user-facing convenience feature, not a security boundary.

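In other words:

excludedCommands is only a convenience mechanism configured by the user.
The real security boundary is still the permission system and the sandbox itself.

[Notes]

### shouldUseSandbox

`shouldUseSandbox()` decides whether a Bash command runs in the sandbox.

It considers:

- Whether sandboxing is enabled globally.
- Whether `dangerouslyDisableSandbox` was explicitly requested.
- Whether current policy allows unsandboxed commands.
- Whether the command matches excludedCommands.

Keep in mind: `excludedCommands` is not a security boundary, only a user-experience setting. Real security control still comes from permission checks and the sandbox mechanism.

commandSemantics: not every non-zero exit code is an error

[Current source]

src/tools/BashTool/commandSemantics.ts:1-140

[What problem this solves]

In a shell, exit codes are not always as simple as "0 means success, non-zero means failure".

For example:

grep pattern file

A return value of 1 may simply mean there were no matches, not that the command failed to run.

If the agent treated this as a tool error, it would mislead the model.

[How the source does it]

Default semantics: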

const DEFAULT_SEMANTIC: CommandSemantic = (exitCode, _stdout, _stderr) => ({
  isError: exitCode !== 0,
  message:
    exitCode !== 0 ? `Command failed with exit code ${exitCode}` : undefined,
})

Special-case semantics:

const COMMAND_SEMANTICS: Map<string, CommandSemantic> = new Map([
  [
    'grep',
    (exitCode, _stdout, _stderr) => ({
      isError: exitCode >= 2,
      message: exitCode === 1 ? 'No matches found' : undefined,
    }),
  ],
  [
    'rg',
    (exitCode, _stdout, _stderr) => ({
      isError: exitCode >= 2,
      message: exitCode === 1 ? 'No matches found' : undefined,
    }),
  ],
  [
    'diff',
    (exitCode, _stdout, _stderr) => ({
      isError: exitCode >= 2,
      message: exitCode === 1 ? 'Files differ' : undefined,
    }),
  ],
])

Entry function:

export function interpretCommandResult(
  command: string,
  exitCode: number,
  stdout: string,
  stderr: string,
): {
  isError: boolean
  message?: string
} {
  const semantic = getCommandSemantic(command)
  const result = semantic(exitCode, stdout, stderr)

  return {
    isError: result.isError,
    message: result.message,
  }
}

The design point matters:

process exit code
is not directly equal to
tool_result is_error

Command semantics sit in between.
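The article does not show how `getCommandSemantic()` picks an entry from the map, so the lookup below is an assumption made for illustration: it takes the first word of the last pipeline segment, on the reasoning that a pipeline's exit code normally comes from its last command. The splitting is deliberately naive (it ignores quoting and treats `||` like `|`), and `interpret` and `SEMANTICS` are hypothetical names.

```ts
type Interpretation = { isError: boolean; message: string | undefined }
type CommandSemantic = (exitCode: number) => Interpretation

const DEFAULT_SEMANTIC: CommandSemantic = exitCode => ({
  isError: exitCode !== 0,
  message: exitCode !== 0 ? `Command failed with exit code ${exitCode}` : undefined,
})

// Two of the special cases quoted above, reduced to the exit code only.
const SEMANTICS = new Map<string, CommandSemantic>([
  ['grep', code => ({ isError: code >= 2, message: code === 1 ? 'No matches found' : undefined })],
  ['diff', code => ({ isError: code >= 2, message: code === 1 ? 'Files differ' : undefined })],
])

// Assumed lookup rule: base command of the last pipeline segment.
function interpret(command: string, exitCode: number): Interpretation {
  const lastSegment = command.split('|').pop() ?? command
  const base = lastSegment.trim().split(/\s+/)[0] ?? ''
  const semantic = SEMANTICS.get(base) ?? DEFAULT_SEMANTIC
  return semantic(exitCode)
}
```

With this sketch, `interpret('grep foo file.txt', 1)` is a non-error with the message "No matches found", while `interpret('ls /nope', 2)` falls through to the default and counts as an error.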

[C++ analogy]

struct CommandResultInterpretation {
    bool isError;
    std::optional<std::string> message;
};

CommandResultInterpretation interpret(std::string cmd, int exitCode) {
    if (cmd == "grep" || cmd == "rg") {
        if (exitCode == 1) return {false, "No matches found"};
        return {exitCode >= 2, std::nullopt};
    }

    if (cmd == "diff") {
        if (exitCode == 1) return {false, "Files differ"};
        return {exitCode >= 2, std::nullopt};
    }

    return {exitCode != 0, exitCode != 0 ? "Command failed" : std::nullopt};
}

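[Notes]

### commandSemantics

`commandSemantics.ts` addresses the fact that different CLIs give exit codes different meanings.

For example:

- `grep` / `rg`: exit code 1 means no match, not a tool error.
- `diff`: exit code 1 means the files differ, not a tool error.
- `find`: exit code 1 may mean some directories were inaccessible.

So BashTool does not treat a non-zero exit code as failure outright; it calls:

```ts
interpretCommandResult(command, exitCode, stdout, stderr)
```

to decide whether this result should count as an error in the agent loop.

destructiveCommandWarning: a warning about dangerous commands is not a permission decision

[Current source]

src/tools/BashTool/destructiveCommandWarning.ts:1-102

[What problem this solves]

Some commands are not necessarily rejected by the permission system outright, but the user should see an explicit risk notice when confirming them.

For example:

git reset --hard
git push --force
git clean -f
rm -rf
kubectl delete
terraform destroy

[How the source does it]

The source defines a set of regex patterns: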

const DESTRUCTIVE_PATTERNS: DestructivePattern[] = [
  {
    pattern: /\bgit\s+reset\s+--hard\b/,
    warning: 'Note: may discard uncommitted changes',
  },
  {
    pattern: /\bgit\s+push\b[^;&|\n]*[ \t](--force|--force-with-lease|-f)\b/,
    warning: 'Note: may overwrite remote history',
  },
  {
    pattern:
      /(^|[;&|\n]\s*)rm\s+-[a-zA-Z]*[rR][a-zA-Z]*f|.../,
    warning: 'Note: may recursively force-remove files',
  },
]

A simple function then returns the warning:

export function getDestructiveCommandWarning(command: string): string | null {
  for (const { pattern, warning } of DESTRUCTIVE_PATTERNS) {
    if (pattern.test(command)) {
      return warning
    }
  }
  return null
}

The comment at the top of the source is especially important:

// this is purely informational --- it doesn't affect permission logic or auto-approval.

In other words:

destructiveCommandWarning is responsible only for the UI notice.
The actual allow / deny / ask decision still belongs to the permission system.
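The separation between "warn" and "decide" can be made concrete with the two patterns quoted in full above. The `renderPrompt` function is a hypothetical caller added for illustration: it takes the permission decision as an input and only decorates the text, so nothing the warning does can feed back into the decision.

```ts
type DestructivePattern = { pattern: RegExp; warning: string }

// Only the two patterns quoted in full above; the real list is longer.
const PATTERNS: DestructivePattern[] = [
  { pattern: /\bgit\s+reset\s+--hard\b/, warning: 'Note: may discard uncommitted changes' },
  {
    pattern: /\bgit\s+push\b[^;&|\n]*[ \t](--force|--force-with-lease|-f)\b/,
    warning: 'Note: may overwrite remote history',
  },
]

// First matching pattern wins; null means no warning applies.
function getWarning(command: string): string | null {
  for (const { pattern, warning } of PATTERNS) {
    if (pattern.test(command)) return warning
  }
  return null
}

// Hypothetical caller: the warning decorates the prompt text but never
// changes the decision that the permission system already made.
function renderPrompt(command: string, decision: 'allow' | 'ask' | 'deny'): string {
  const warning = getWarning(command)
  const header = `[${decision}] ${command}`
  return warning ? `${header}\n${warning}` : header
}
```

For example, `git push origin main --force` picks up the remote-history note, while `git push; echo -f` does not, because the pattern refuses to look past a `;`, `&`, or `|` when searching for the force flag.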

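[Notes]

### destructiveCommandWarning

`destructiveCommandWarning.ts` uses regexes to recognize some destructive commands and returns user-readable warning text.

It covers:

- Git commands that lose data
- Git commands that bypass safety checks
- File deletion commands
- Database deletion commands
- Infrastructure deletion commands

But it does not change the permission decision. The source comment says so explicitly: it is informational and does not affect permission logic or auto-approval.

toolExecution: a unified tool gate in front of Bash execution

[Current source]

src/services/tools/toolExecution.ts:599-733

src/services/tools/toolExecution.ts:916-1104

src/services/tools/toolExecution.ts:1128-1222

[What problem this solves]

BashTool is not called directly by the agent loop.

A unified tool execution layer sits in between:

checkPermissionsAndCallTool()

It turns every tool call into one consistent flow.

[How the source does it]

Step 1, schema validation: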

const parsedInput = tool.inputSchema.safeParse(input)
if (!parsedInput.success) {
  return [
    {
      message: createUserMessage({
        content: [
          {
            type: 'tool_result',
            content: `<tool_use_error>InputValidationError: ${errorContent}</tool_use_error>`,
            is_error: true,
            tool_use_id: toolUseID,
          },
        ],
      }),
    },
  ]
}

When this step fails, nothing crashes; an error tool_result is built and handed back to the model.

Step 2, call the tool's own validateInput():

const isValidCall = await tool.validateInput?.(
  parsedInput.data,
  toolUseContext,
)
if (isValidCall?.result === false) {
  return [
    {
      message: createUserMessage({
        content: [
          {
            type: 'tool_result',
            content: `<tool_use_error>${isValidCall.message}</tool_use_error>`,
            is_error: true,
            tool_use_id: toolUseID,
          },
        ],
      }),
    },
  ]
}

Step 3, for Bash, start the speculative classifier early:

if (tool.name === BASH_TOOL_NAME && 'command' in parsedInput.data) {
  startSpeculativeClassifierCheck(
    (parsedInput.data as BashToolInput).command,
    appState.toolPermissionContext,
    toolUseContext.abortController.signal,
    toolUseContext.options.isNonInteractiveSession,
  )
}

This optimization lets the permission classifier work in parallel with hooks and the permission UI.

Step 4, run the PreToolUse hooks:

for await (const result of runPreToolUseHooks(...)) {
  switch (result.type) {
    case 'hookPermissionResult':
      hookPermissionResult = result.hookPermissionResult
      break
    case 'hookUpdatedInput':
      processedInput = result.updatedInput
      break
    case 'stop':
      return resultingMessages
  }
}

A hook can:

Emit messages.
Modify the input.
Supply a permission result.
Stop further execution.

Step 5, resolve the permission:

const resolved = await resolveHookPermissionDecision(
  hookPermissionResult,
  tool,
  processedInput,
  toolUseContext,
  canUseTool,
  assistantMessage,
  toolUseID,
)

const permissionDecision = resolved.decision
processedInput = resolved.input

If the permission is not allow:

if (permissionDecision.behavior !== 'allow') {
  const messageContent: ContentBlockParam[] = [
    {
      type: 'tool_result',
      content: errorMessage,
      is_error: true,
      tool_use_id: toolUseID,
    },
  ]

  resultingMessages.push({
    message: createUserMessage({
      content: messageContent,
      toolUseResult: `Error: ${errorMessage}`,
      sourceToolAssistantUUID: assistantMessage.uuid,
    }),
  })

  return resultingMessages
}

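This is the most important closed loop from Phase 5: a permission denial is not a program exception but an ordinary tool_result with is_error=true.

Step 6, tool.call() is entered only after permission is granted: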

const result = await tool.call(
  callInput,
  {
    ...toolUseContext,
    toolUseId: toolUseID,
    userModified: permissionDecision.userModified ?? false,
  },
  canUseTool,
  assistantMessage,
  progress => {
    onToolProgress({
      toolUseID: progress.toolUseID,
      data: progress.data,
    })
  },
)

So the real execution of BashTool sits at the very end of the unified tool pipeline.
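The gate ordering can be condensed into a synchronous sketch. This is a deliberately reduced model, not the real `checkPermissionsAndCallTool()`: `MiniTool`, `runTool`, and `echoTool` are hypothetical, hooks and the speculative classifier are left out, and the permission result is collapsed to allow or deny.

```ts
type ToolResult = { type: 'tool_result'; tool_use_id: string; content: string; is_error: boolean }

// Hypothetical minimal tool shape; the real Tool interface has many more members.
type MiniTool<I> = {
  parse(raw: unknown): I | null
  validateInput(input: I): { result: true } | { result: false; message: string }
  checkPermissions(input: I): 'allow' | 'deny'
  call(input: I): string
}

function errorResult(id: string, message: string): ToolResult {
  return { type: 'tool_result', tool_use_id: id, content: message, is_error: true }
}

// Every failure path returns an error tool_result instead of throwing,
// so the model can read the message and adjust on its next turn.
function runTool<I>(tool: MiniTool<I>, raw: unknown, id: string): ToolResult {
  const input = tool.parse(raw)
  if (input === null) return errorResult(id, 'InputValidationError')

  const valid = tool.validateInput(input)
  if (valid.result === false) return errorResult(id, valid.message)

  if (tool.checkPermissions(input) !== 'allow') {
    return errorResult(id, 'Permission denied')
  }

  // Only reached when all gates passed.
  return { type: 'tool_result', tool_use_id: id, content: tool.call(input), is_error: false }
}

// A toy tool that never touches a real shell.
const echoTool: MiniTool<{ command: string }> = {
  parse: raw =>
    typeof raw === 'object' && raw !== null && typeof (raw as { command?: unknown }).command === 'string'
      ? { command: (raw as { command: string }).command }
      : null,
  validateInput: input =>
    input.command.startsWith('sleep ')
      ? { result: false, message: 'Blocked: run sleeps in the background' }
      : { result: true },
  checkPermissions: input => (input.command.startsWith('rm ') ? 'deny' : 'allow'),
  call: input => `ran: ${input.command}`,
}
```

The point of the shape is that `call` is unreachable unless parsing, validation, and permission all succeed, and that each rejection comes back as data the model can read rather than as an exception.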

[Notes]

### The unified execution gate in front of BashTool

`BashTool.call()` is not called directly by the agent loop; it first goes through:

```text
checkPermissionsAndCallTool()
```

This unified execution layer performs:

1. `inputSchema.safeParse()`
2. `tool.validateInput()`
3. Bash speculative classifier
4. PreToolUse hooks
5. `resolveHookPermissionDecision()`
6. Building an error `tool_result` when permission is not granted
7. Calling `tool.call()` once permission is granted

So tool execution in Claude Code is not:

```text
model -> tool.call()
```

but rather:

```text
model -> schema -> validation -> hooks -> permission -> call -> tool_result
```

Phase 6 summary: ShellTool is the most dangerous and the most complete tool

The full ShellTool call chain

assistant/tool_use Bash
  input:
    command
    timeout?
    description?
    run_in_background?
    dangerouslyDisableSandbox?

-> toolExecution.checkPermissionsAndCallTool()
  -> inputSchema.safeParse()
  -> BashTool.validateInput()
  -> startSpeculativeClassifierCheck()
  -> runPreToolUseHooks()
  -> resolveHookPermissionDecision()
  -> permissionDecision

-> if permission != allow:
  -> createUserMessage(tool_result is_error=true)
  -> return to model

-> if permission == allow:
  -> BashTool.call()
     -> maybe applySedEdit()
     -> runShellCommand()
        -> exec(command, signal, "bash", {
             timeout,
             preventCwdChanges,
             shouldUseSandbox,
             shouldAutoBackground,
             onProgress,
           })
        -> progress loop
        -> background task or final ExecResult
     -> interpretCommandResult()
     -> persist large output
     -> resize image output
     -> return Out

-> mapToolResultToToolResultBlockParam()
  -> structured content / image / text
  -> large output preview
  -> background task info
  -> user/tool_result

-> query loop appends tool_result
-> next model call

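How Phase 6 relates to Phase 5

Phase 5 asks:

Is Bash allowed to run?

Phase 6 asks:

Once allowed, how is Bash run?

Together they form the complete security model of the shell tool:

Permission system:
  keeps the model from doing what it should not do

Execution system:
  keeps real processes from running away, output from exploding, the foreground from blocking, and error semantics from being misread

A ShellTool abstraction for a Python mini-agent

To practice Phase 6 in a mini-agent, you can start with a simplified version: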

from dataclasses import dataclass
import subprocess

@dataclass
class ShellResult:
    stdout: str
    stderr: str
    exit_code: int
    timed_out: bool = False

class ShellTool:
    name = "shell"

    def __init__(self, permission_manager):
        self.permission_manager = permission_manager

    def call(self, command: str, timeout: int = 10) -> ShellResult:
        decision = self.permission_manager.check_shell(command)
        if not decision.allow:
            return ShellResult(
                stdout="",
                stderr=f"Permission denied: {decision.reason}",
                exit_code=126,
            )

        try:
            completed = subprocess.run(
                command,
                shell=True,
                text=True,
                capture_output=True,
                timeout=timeout,
            )
            return ShellResult(
                stdout=self._truncate(completed.stdout),
                stderr=self._truncate(completed.stderr),
                exit_code=completed.returncode,
            )
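        except subprocess.TimeoutExpired as e:
            # e.stdout may be None or bytes (even with text=True), so normalize it.
            partial = e.stdout or ""
            if isinstance(partial, bytes):
                partial = partial.decode(errors="replace")
            return ShellResult(
                stdout=self._truncate(partial),
                stderr="Command timed out",
                exit_code=124,
                timed_out=True,
            )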

    def _truncate(self, text: str, limit: int = 12000) -> str:
        if len(text) <= limit:
            return text
        return text[:limit] + "\n...[truncated]"

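From there, you can add:

Automatic approval for read-only commands.
Ask or deny for dangerous commands.
timeout.
stdout/stderr/exit code.
Truncation of large output.
Semantic interpretation of non-zero exit codes.
Background tasks.

A final way to remember it

BashTool =
  schema: how the model may request the shell
  validateInput: whether the call style is reasonable
  checkPermissions: whether this command may run
  call: turns the command into a real process
  runShellCommand: manages the process lifecycle
  commandSemantics: interprets exit codes
  output persistence: handles large output
  mapToolResult: feeds the result back to the model

ShellTool is the most dangerous tool in an AI coding agent because it can affect a real machine.

Precisely because it is dangerous, Claude Code wraps it in the most layers of protection:

schema
permission rules
safe parsing
sandbox
hooks
timeout
background tasks
output limits
error-semantics interpretation
tool_result feedback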