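7. Sensitive-Path Pre-check: Protected Paths

This part covers src/protectedPaths.ts: before a command runs, it checks whether the command is about to touch sensitive files (such as ~/.ssh or ~/.npmrc) or paths outside the workspace.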

1. Why do we need a path pre-check?

Suppose the AI assistant runs a command like this:

cat ~/.ssh/id_rsa

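cat merely reads a file (risk level L0), but reading an SSH private key must never be allowed. Risk grading looks only at the command itself, not at the paths the command operates on. The path pre-check closes that gap.

Another example: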

cp /etc/passwd /tmp/stolen

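cp is rated L2, but when it copies a file out of a system directory into a temp directory, it should be blocked as well.

2. Three kinds of rules

2.1 Sensitive directory protection (protected-secret)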

export const DEFAULT_PROTECTED_HOME_RELATIVE = [
  '.ssh',               // SSH keys
  '.npmrc',             // npm credentials (may contain a token)
  '.aws/credentials',   // AWS credentials
  '.config/gh/hosts.yml', // GitHub CLI credentials
  '.git-credentials',   // Git credentials
  '.netrc',             // network credentials
] as const;

These are the paths under the user's HOME directory that the AI must never touch.
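How a resolved path is compared against this list is not shown here. A minimal sketch, assuming POSIX paths and a simple "equal to or underneath" rule, could look like the following. Only the names resolveProtectedAbsPaths and matchProtectedSecret come from the source file; their bodies, the SecretViolation type, and the message text are assumptions made for illustration.

```typescript
// Hypothetical sketch: the real helpers in src/protectedPaths.ts may differ.
type SecretViolation = { kind: 'protected-secret'; path: string; message: string };

// Join each HOME-relative entry onto the real home directory (POSIX paths assumed).
function resolveProtectedAbsPaths(realHome: string, relatives: readonly string[]): string[] {
  const home = realHome.replace(/\/+$/, ''); // drop trailing slashes
  return relatives.map((rel) => `${home}/${rel}`);
}

// A path counts as protected if it equals a protected entry or lies underneath it.
function matchProtectedSecret(
  resolved: string,
  protectedAbs: readonly string[],
): SecretViolation | null {
  for (const root of protectedAbs) {
    if (resolved === root || resolved.startsWith(`${root}/`)) {
      return {
        kind: 'protected-secret',
        path: resolved,
        message: `Access to protected path is blocked: ${resolved}`, // assumed wording
      };
    }
  }
  return null;
}

const protectedAbs = resolveProtectedAbsPaths('/home/alice', ['.ssh', '.npmrc']);
console.log(matchProtectedSecret('/home/alice/.ssh/id_rsa', protectedAbs)?.kind); // protected-secret
console.log(matchProtectedSecret('/home/alice/.sshfoo', protectedAbs)); // null
```

Matching on `root + '/'` rather than a bare prefix keeps a sibling such as ~/.sshfoo from being mistaken for something inside ~/.ssh.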

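2.2 Outside-workspace protection (outside-workspace)

Even when a path is not on the sensitive list, it should still be blocked if it lies outside the workspace and is not on the system allowlist.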
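The workspace test itself is not shown in this part. One common way to write it, sketched here under the assumption that both paths are absolute, normalized POSIX paths, is to compute the relative path from the root and see whether it climbs out. The name isInsideWorkspace matches the call in the main function below, but this body is an assumption, not the project's code.

```typescript
import * as path from 'node:path';

// Hypothetical sketch of isInsideWorkspace; the real implementation may differ.
// Both arguments are assumed to be absolute, already-normalized POSIX paths.
function isInsideWorkspace(executionRoot: string, resolved: string): boolean {
  const rel = path.posix.relative(executionRoot, resolved);
  // '' is the root itself; '..' or a leading '../' means the target left the root.
  return rel === '' || (rel !== '..' && !rel.startsWith('../'));
}

console.log(isInsideWorkspace('/work/project', '/work/project/src/a.ts')); // true
console.log(isInsideWorkspace('/work/project', '/work/project-old/a.ts')); // false
console.log(isInsideWorkspace('/work/project', '/etc/passwd')); // false
```

Comparing via the relative path avoids the classic prefix bug where /work/project-old would be treated as part of /work/project.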

2.3 System path allowlist

export const SYSTEM_READ_ALLOWLIST = [
  '/usr', '/bin', '/sbin', '/System', '/Library',
  '/Applications', '/private/tmp', '/var/folders',
  '/dev', '/opt', '/etc',
] as const;

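These are system paths a command may legitimately need during normal execution (executables, runtime libraries, and so on), so they are let through.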
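The allowlist check can be sketched as a directory-prefix match. The body of isSystemReadAllowed below is an assumption (only its name appears in the main function), and the sketch again assumes normalized POSIX paths.

```typescript
const SYSTEM_READ_ALLOWLIST = [
  '/usr', '/bin', '/sbin', '/System', '/Library',
  '/Applications', '/private/tmp', '/var/folders',
  '/dev', '/opt', '/etc',
] as const;

// Hypothetical sketch: allowed if the path equals an allowlisted root or sits under it.
function isSystemReadAllowed(resolved: string, allowlist: readonly string[]): boolean {
  return allowlist.some((root) => resolved === root || resolved.startsWith(`${root}/`));
}

console.log(isSystemReadAllowed('/usr/bin/git', SYSTEM_READ_ALLOWLIST)); // true
console.log(isSystemReadAllowed('/etc/passwd', SYSTEM_READ_ALLOWLIST)); // true
console.log(isSystemReadAllowed('/usrlocal/tool', SYSTEM_READ_ALLOWLIST)); // false
console.log(isSystemReadAllowed('/tmp/stolen', SYSTEM_READ_ALLOWLIST)); // false
```

Under this prefix-matching assumption, the earlier cp /etc/passwd /tmp/stolen example would be flagged because of /tmp/stolen rather than /etc/passwd: /etc is allowlisted, while only /private/tmp (not /tmp) is listed, and path.normalize does not resolve symlinks.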

3. Core flow

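Command spec (SandboxCommandSpec)
    │
    ▼
Extract path candidates (extractPathCandidatesFrom*)
    │
    ▼
Resolve each candidate to an absolute path (resolvePathCandidate)
    │
    ▼
Check against the protected paths (matchProtectedSecret)
    │
    ├── Match → violation (protected-secret)
    │
    └── No match → check whether the path is outside the workspace
            │
            ├── Outside the workspace and not on the system allowlist → violation (outside-workspace)
            └── Inside the workspace or on the system allowlist → pass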

4. Extracting path candidates

4.1 Shell mode

function extractPathCandidatesFromShell(text: string): string[] {
  const candidates: string[] = [];
  const tokens = text.split(/[\s|;&><]+/).filter(Boolean);

  for (const rawToken of tokens) {
    const token = rawToken.replace(/^['"]|['"]$/g, '');  // strip a leading or trailing quote
    if (looksLikePath(token)) {
      candidates.push(token);
    }
  }

  return candidates;
}

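In plain terms: split the command string into tokens on whitespace, pipes, semicolons, ampersands, and redirection characters, then keep the tokens that look like paths.

4.2 Exec mode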

function extractPathCandidatesFromExec(executable: string, args: string[]): string[] {
  const candidates: string[] = [];

  for (let index = 0; index < args.length; index += 1) {
    const arg = args[index];
    // The value that follows flags such as -f / --file is a path
    if (PATH_VALUE_FLAGS.has(arg) && args[index + 1]) {
      candidates.push(args[index + 1]);
      index += 1;
      continue;
    }
    if (looksLikePath(arg)) {
      candidates.push(arg);
    }
  }

  if (executable.startsWith('/')) {
    candidates.push(executable);
  }

  return candidates;
}

PATH_VALUE_FLAGS includes flags such as -f, --file, -c, and --config, whose next argument is usually a path.
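The constant itself is not listed, so the sketch below assumes it is a Set holding just the four flags named above; the real set may contain more. The helper pathValuesAfterFlags is a hypothetical, stripped-down version of the flag branch in extractPathCandidatesFromExec, isolated to show why that branch matters.

```typescript
// Assumed contents: only these four flags are named explicitly in the text.
const PATH_VALUE_FLAGS: ReadonlySet<string> = new Set(['-f', '--file', '-c', '--config']);

// Hypothetical helper: collects only the values that follow a path-valued flag.
function pathValuesAfterFlags(args: readonly string[]): string[] {
  const values: string[] = [];
  for (let i = 0; i < args.length; i += 1) {
    if (PATH_VALUE_FLAGS.has(args[i]) && args[i + 1]) {
      values.push(args[i + 1]);
      i += 1; // skip the value that was just consumed
    }
  }
  return values;
}

console.log(pathValuesAfterFlags(['--config', 'settings.yml', '-v', '-f', '~/.npmrc']));
// [ 'settings.yml', '~/.npmrc' ]
```

A bare file name such as settings.yml has no slash or path prefix, so looksLikePath alone would skip it; the flag lookup is what turns it into a candidate. Going by the code shown, the combined form --file=settings.yml is not covered, since that argument starts with '-' and is not itself a member of the set.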

4.3 looksLikePath: deciding whether a token looks like a path

function looksLikePath(token: string): boolean {
  if (!token || token.startsWith('-')) return false;  // skip empty tokens and flags

  return (
    token.startsWith('/') ||           // absolute path
    token.startsWith('~/') ||          // relative to HOME
    token === '~' ||                   // HOME itself
    token.startsWith('$HOME') ||       // $HOME variable
    token.startsWith('./') ||          // current directory
    token.startsWith('../') ||         // parent directory
    (token.includes('/') && !/^[\d./]+$/.test(token))  // contains '/' but is not just digits, dots, and slashes (e.g. 1/2)
  );
  );
}
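To make the heuristic concrete, here is the same function (condensed, logic unchanged) run over a handful of sample tokens. The tokens are illustrative picks, not taken from the project's tests.

```typescript
function looksLikePath(token: string): boolean {
  if (!token || token.startsWith('-')) return false;
  return (
    token.startsWith('/') ||
    token.startsWith('~/') ||
    token === '~' ||
    token.startsWith('$HOME') ||
    token.startsWith('./') ||
    token.startsWith('../') ||
    (token.includes('/') && !/^[\d./]+$/.test(token))
  );
}

const sampleTokens = ['~/.ssh/id_rsa', 'src/index.ts', 'README.md', '--file', '1/2', '2024/01/05'];
for (const token of sampleTokens) {
  console.log(`${token} -> ${looksLikePath(token)}`);
}
// ~/.ssh/id_rsa -> true
// src/index.ts -> true
// README.md -> false    (no slash and no path prefix)
// --file -> false       (flag)
// 1/2 -> false          (digits, dots, and slashes only)
// 2024/01/05 -> false
```

Note that a bare file name such as README.md is not treated as a path. It would resolve inside the execution root anyway, so skipping it does not open a hole by itself.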

5. Path resolution

import * as path from 'node:path';

function resolvePathCandidate(
  candidate: string,
  ctx: { realHome: string; executionRoot: string },
): string {
  const trimmed = candidate.trim();

  if (trimmed === '~') return path.normalize(ctx.realHome);
  if (trimmed.startsWith('~/')) return path.normalize(path.join(ctx.realHome, trimmed.slice(2)));
  if (trimmed === '$HOME') return path.normalize(ctx.realHome);
  if (trimmed.startsWith('$HOME/')) return path.normalize(path.join(ctx.realHome, trimmed.slice(6)));
  if (path.isAbsolute(trimmed)) return path.normalize(trimmed);

  // Relative path: resolve against executionRoot
  return path.normalize(path.join(ctx.executionRoot, trimmed));
}

This turns the various path spellings into a single normalized absolute path.
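The join-then-normalize step is what catches '..' escapes. The snippet below replays the three most interesting branches with Node's path.posix API directly; the directory values are made-up examples, and path.posix is used only so the output is the same on every platform.

```typescript
import * as path from 'node:path';

const executionRoot = '/work/project'; // hypothetical workspace root
const realHome = '/home/alice';        // hypothetical home directory

// Relative candidates are joined onto the execution root, so '..' can climb out of it.
const escaped = path.posix.normalize(path.posix.join(executionRoot, '../../etc/passwd'));
console.log(escaped); // /etc/passwd

// '~/' candidates are joined onto the real home directory after dropping the '~/' prefix.
const homeKey = path.posix.normalize(path.posix.join(realHome, '~/.ssh/id_rsa'.slice(2)));
console.log(homeKey); // /home/alice/.ssh/id_rsa

// '${HOME}' is not expanded, so the candidate falls through to the relative branch.
const unexpanded = path.posix.normalize(path.posix.join(executionRoot, '${HOME}/.ssh/id_rsa'));
console.log(unexpanded); // /work/project/${HOME}/.ssh/id_rsa
```

The first result lands outside the workspace and can be judged on its real location; the last one looks like an ordinary workspace path, which is how an unexpanded variable slips past the check.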

6. findProtectedPathViolations: the main function

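/**
 * Scans a command spec and returns every protected-path violation;
 * returns an empty array when there are none.
 *
 * @param options - command spec, execution root, real HOME, and optional rule overrides
 */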
export function findProtectedPathViolations(
  options: ProtectedPathCheckOptions,
): ProtectedPathViolation[] {
  const protectedAbs = resolveProtectedAbsPaths(options.realHome, options.protectedHomeRelative);
  const systemAllowlist = options.systemReadAllowlist ?? SYSTEM_READ_ALLOWLIST;
  const candidates = options.commandSpec.kind === 'shell'
    ? extractPathCandidatesFromShell(options.commandSpec.shellCommand)
    : extractPathCandidatesFromExec(options.commandSpec.executable, options.commandSpec.args);

  const violations: ProtectedPathViolation[] = [];
  // Deduplicate with `seen`: each absolute path is reported at most once
  const seen = new Set<string>();

  for (const candidate of candidates) {
    const resolved = resolvePathCandidate(candidate, { realHome: options.realHome, executionRoot: options.executionRoot });
    if (seen.has(resolved)) continue;
    seen.add(resolved);

    const secretViolation = matchProtectedSecret(resolved, protectedAbs);
    if (secretViolation) {
      violations.push(secretViolation);
      continue;
    }

    if (!isInsideWorkspace(options.executionRoot, resolved) && !isSystemReadAllowed(resolved, systemAllowlist)) {
      violations.push({
        kind: 'outside-workspace',
        path: resolved,
        message: `Access outside execution workspace is blocked: ${resolved}`,
      });
    }
  }

  return violations;
}
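The option and result types are not shown. The shapes below are inferred purely from how the function reads and writes them, so treat them as a sketch: the 'exec' discriminant, the readonly modifiers, and the commandText helper are assumptions, and the real declarations may carry more fields.

```typescript
// Inferred from usage in findProtectedPathViolations; not the project's actual declarations.
type SandboxCommandSpec =
  | { kind: 'shell'; shellCommand: string }
  | { kind: 'exec'; executable: string; args: string[] }; // 'exec' is an assumed name

interface ProtectedPathCheckOptions {
  commandSpec: SandboxCommandSpec;
  executionRoot: string;
  realHome: string;
  protectedHomeRelative?: readonly string[]; // optional override of the default list
  systemReadAllowlist?: readonly string[];   // optional override of SYSTEM_READ_ALLOWLIST
}

interface ProtectedPathViolation {
  kind: 'protected-secret' | 'outside-workspace';
  path: string;
  message: string;
}

// Hypothetical helper showing how the discriminated union narrows on `kind`.
function commandText(spec: SandboxCommandSpec): string {
  return spec.kind === 'shell' ? spec.shellCommand : [spec.executable, ...spec.args].join(' ');
}

const exampleOptions: ProtectedPathCheckOptions = {
  commandSpec: { kind: 'shell', shellCommand: 'cat ~/.ssh/id_rsa' },
  executionRoot: '/work/project',
  realHome: '/home/alice',
};
const exampleViolation: ProtectedPathViolation = {
  kind: 'outside-workspace',
  path: '/home/alice/notes.txt',
  message: 'Access outside execution workspace is blocked: /home/alice/notes.txt',
};
console.log(commandText(exampleOptions.commandSpec)); // cat ~/.ssh/id_rsa
console.log(exampleViolation.kind); // outside-workspace
```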

7. Stripping secret environment variables

Beyond path protection, there is one related constant:

export const STRIPPED_SECRET_ENV_KEYS = [
  'SSH_AUTH_SOCK',        // SSH agent socket
  'SSH_AGENT_PID',        // SSH agent process ID
  'GIT_SSH_COMMAND',      // SSH command used by Git
  'GIT_SSH',              // SSH program used by Git
  'NPM_CONFIG_USERCONFIG', // npm user config file
  'NPM_CONFIG_GLOBALCONFIG', // npm global config file
  'AWS_SHARED_CREDENTIALS_FILE', // AWS credentials file location
  'AWS_CONFIG_FILE',      // AWS config file location
  'GITHUB_TOKEN',         // GitHub token
  'GH_TOKEN',             // GitHub CLI token
] as const;

These environment variables are stripped when the child process environment is built.
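The stripping step is not shown in this part. A minimal sketch of the idea, assuming a plain copy-and-filter over the parent environment, follows. The function name stripSecretEnv and the Env alias are hypothetical, and the key list is shortened for the example.

```typescript
// Shortened copy of the constant, for illustration only.
const STRIPPED_SECRET_ENV_KEYS = ['SSH_AUTH_SOCK', 'GITHUB_TOKEN', 'GH_TOKEN'] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns a copy of the parent environment without the secret keys.
function stripSecretEnv(
  parentEnv: Env,
  keys: readonly string[] = STRIPPED_SECRET_ENV_KEYS,
): Env {
  const blocked = new Set<string>(keys);
  const childEnv: Env = {};
  for (const [envName, value] of Object.entries(parentEnv)) {
    if (!blocked.has(envName)) childEnv[envName] = value;
  }
  return childEnv;
}

const childEnv = stripSecretEnv({ PATH: '/usr/bin', GITHUB_TOKEN: 'ghp_example' });
console.log(Object.keys(childEnv)); // [ 'PATH' ]
```

The result could then be handed to the child process as its env option, so the parent's own environment is never mutated.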

8. Limitations

The path pre-check is static analysis, and it has limitations of its own:

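No shell parsing: the quotes in cat ~/".ssh"/id_rsa are not handled correctly, so the path is not recognized as ~/.ssh.
No view of dynamic behavior: in node -e "require('fs').readFileSync('/etc/passwd')", the path is buried inside a script string, so /etc/passwd is never extracted as a candidate on its own.
Limited variable expansion: only ~ and $HOME are handled; ${HOME} and custom variables are not supported.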

So the path pre-check is only a first line of defense and cannot replace an OS-level sandbox.
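These gaps are easy to reproduce. The snippet below uses condensed copies of looksLikePath and extractPathCandidatesFromShell from section 4 (same logic, written more compactly) and feeds them the three commands from the list above.

```typescript
function looksLikePath(token: string): boolean {
  if (!token || token.startsWith('-')) return false;
  return (
    token.startsWith('/') || token.startsWith('~/') || token === '~' ||
    token.startsWith('$HOME') || token.startsWith('./') || token.startsWith('../') ||
    (token.includes('/') && !/^[\d./]+$/.test(token))
  );
}

function extractPathCandidatesFromShell(text: string): string[] {
  return text
    .split(/[\s|;&><]+/)
    .filter(Boolean)
    .map((raw) => raw.replace(/^['"]|['"]$/g, ''))
    .filter(looksLikePath);
}

const quoted = extractPathCandidatesFromShell('cat ~/".ssh"/id_rsa');
const scripted = extractPathCandidatesFromShell(
  "node -e \"require('fs').readFileSync('/etc/passwd')\"",
);
const braced = extractPathCandidatesFromShell('cat ${HOME}/.ssh/id_rsa');

console.log(quoted);   // one candidate that still contains the inner quotes
console.log(scripted); // one candidate: the whole script string, not /etc/passwd
console.log(braced);   // one candidate with ${HOME} left unexpanded
```

In the second and third cases the candidate does not start with '/', '~/', or '$HOME/', so resolvePathCandidate would treat it as relative to the execution root and it would pass as an ordinary workspace path.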

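9. Summary

Function/constant	Purpose
`DEFAULT_PROTECTED_HOME_RELATIVE`	Default list of protected paths under HOME
`SYSTEM_READ_ALLOWLIST`	Allowlist of system paths outside the workspace
`STRIPPED_SECRET_ENV_KEYS`	Names of the secret environment variables to strip
`resolveProtectedAbsPaths`	Expands the HOME-relative entries into absolute paths
`findProtectedPathViolations`	Main function: scans for violations and returns them as a list
`assertProtectedPathAccess`	Convenience function: throws if there is any violation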
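assertProtectedPathAccess is listed here but its body is not shown. A hedged sketch of the "throw on any violation" pattern follows. To stay self-contained it takes the violation list as input instead of calling findProtectedPathViolations, so the name assertNoViolations, the ProtectedPathError class, and the message format are all assumptions rather than the project's API.

```typescript
interface ProtectedPathViolation {
  kind: 'protected-secret' | 'outside-workspace';
  path: string;
  message: string;
}

// Hypothetical error type that keeps the full violation list for callers to inspect.
class ProtectedPathError extends Error {
  readonly violations: readonly ProtectedPathViolation[];

  constructor(violations: readonly ProtectedPathViolation[]) {
    super(violations.map((v) => v.message).join('; '));
    this.name = 'ProtectedPathError';
    this.violations = violations;
  }
}

// Hypothetical stand-in for assertProtectedPathAccess: throws when the list is non-empty.
function assertNoViolations(violations: readonly ProtectedPathViolation[]): void {
  if (violations.length > 0) throw new ProtectedPathError(violations);
}

assertNoViolations([]); // no violations, so nothing is thrown

try {
  assertNoViolations([
    {
      kind: 'outside-workspace',
      path: '/home/alice/notes.txt',
      message: 'Access outside execution workspace is blocked: /home/alice/notes.txt',
    },
  ]);
} catch (err) {
  console.log((err as Error).name); // ProtectedPathError
}
```

Carrying the whole list on the error lets a caller report every offending path at once instead of only the first.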

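The core idea: before a command runs, look at which files it wants to touch, and block it outright if it reaches for something it should not.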