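claw-code source analysis: the testing philosophy of a large port, or how a unittest gate protects a reputation for being "honestly unfinished"

Source files covered: tests/test_porting_workspace.py, src/setup.py, src/parity_audit.py, src/main.py, src/hooks/__init__.py, src/execution_registry.py; compared against the "return early when there is no fixture" test style in the Rust crate rust/crates/compat-harness.

1. What the gate looks like: a single discover entry point

The porting workspace's official test command is hard-coded in WorkspaceSetup, consistent with the "startup steps" narrative of setup-report: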

# 12:26:src/setup.py
@dataclass(frozen=True)
class WorkspaceSetup:
    python_version: str
    implementation: str
    platform_name: str
    test_command: str = 'python3 -m unittest discover -s tests -v'

    def startup_steps(self) -> tuple[str, ...]:
        return (
            'start top-level prefetch side effects',
            'build workspace context',
            'load mirrored command snapshot',
            'load mirrored tool snapshot',
            'prepare parity audit hooks',
            'apply trust-gated deferred init',
        )

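tests/ currently contains only test_porting_workspace.py, so unittest discover runs the whole directory in one pass. The philosophy: keep the gate narrow and repeatable. It uses only the standard library and depends on no pytest plugin matrix, which lowers the risk of the test framework becoming heavier than the product.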
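
As a minimal sketch of how a script might consume that hard-coded command, the helper below splits it with shlex and substitutes the running interpreter for python3. The names TEST_COMMAND and gate_argv are illustrative assumptions and are not taken from the repository.

```python
import shlex
import sys

# Copied from the test_command default in the WorkspaceSetup excerpt.
TEST_COMMAND = 'python3 -m unittest discover -s tests -v'


def gate_argv(test_command: str = TEST_COMMAND) -> list[str]:
    """Split the gate command and point it at the current interpreter.

    gate_argv is a hypothetical helper, not part of the repository.
    """
    argv = shlex.split(test_command)
    argv[0] = sys.executable  # avoid depending on a 'python3' binary on PATH
    return argv


# Usage idea (not executed here): subprocess.run(gate_argv(), check=True)
argv = gate_argv()
```

Because the command stays a plain string, the same value can be printed in a setup report and executed by a wrapper without the two drifting apart.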

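2. The core of honest incompleteness: the `archive_present` conditional gate

2.1 Parity: strict when the archive exists, no pretending when it does not

run_parity_audit() uses ARCHIVE_ROOT.exists() to decide whether a local TS snapshot directory is present; when it is missing, to_markdown() states explicitly that no comparison is possible: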

# 84:88:src/parity_audit.py
    def to_markdown(self) -> str:
        lines = ['# Parity Audit']
        if not self.archive_present:
            lines.append('Local archive unavailable; parity audit cannot compare against the original snapshot.')
            return '\n'.join(lines)
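
The excerpt stops at the early return. A self-contained sketch of the same degrade-explicitly pattern might look like the following; AuditSketch, the (covered, total) reading of root_file_coverage, and the "Root file coverage" line are assumptions made for illustration, not the real report format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditSketch:
    """Hypothetical stand-in for the audit result; not the real class."""
    archive_present: bool
    root_file_coverage: tuple[int, int] = (0, 0)  # assumed (covered, total)

    def to_markdown(self) -> str:
        lines = ['# Parity Audit']
        if not self.archive_present:
            # Say plainly that nothing was compared instead of printing 0/0.
            lines.append('Local archive unavailable; parity audit cannot compare against the original snapshot.')
            return '\n'.join(lines)
        covered, total = self.root_file_coverage
        lines.append(f'Root file coverage: {covered}/{total}')  # invented line
        return '\n'.join(lines)


missing = AuditSketch(archive_present=False).to_markdown()
present = AuditSketch(archive_present=True, root_file_coverage=(18, 18)).to_markdown()
```

The point of the early return is that a reader of the report never sees ratio lines that were computed against nothing.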

The CLI help text likewise says the comparison happens only "when available", which avoids implying that a fresh clone gives full parity:

# 26:26:src/main.py
    subparsers.add_parser('parity-audit', help='compare the Python workspace against the local ignored TypeScript archive when available')

The unit test does not treat "an archive must exist" as a default precondition. Instead:

# 45:51:tests/test_porting_workspace.py
    def test_root_file_coverage_is_complete_when_local_archive_exists(self) -> None:
        audit = run_parity_audit()
        if audit.archive_present:
            self.assertEqual(audit.root_file_coverage[0], audit.root_file_coverage[1])
            self.assertGreaterEqual(audit.directory_coverage[0], 28)
            self.assertGreaterEqual(audit.command_entry_ratio[0], 150)
            self.assertGreaterEqual(audit.tool_entry_ratio[0], 100)

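Semantics:

No archive: the test passes silently (it asserts nothing, so it does not fake a green parity result).
With an archive: the root-file mapping must reach full coverage, and the directory, command, and tool entry counts must not fall below the agreed thresholds.

This is how "honestly unfinished" is expressed in unittest: when the upstream reference is missing, parity is not claimed as proven; when it is present, the test tightens to quantifiable metrics.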
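
The two states, plus the failing case, can be reproduced in isolation. The sketch below is a hedged illustration: FakeAudit and run_gate are hypothetical names, and reading root_file_coverage as (covered, total) is an assumption based on the equality assertion in the test.

```python
import io
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeAudit:
    """Hypothetical stand-in for the object returned by run_parity_audit()."""
    archive_present: bool
    root_file_coverage: tuple[int, int]  # assumed (covered, total)


def run_gate(audit: FakeAudit) -> unittest.TestResult:
    """Run a one-test suite that only tightens when the archive exists."""

    class Gate(unittest.TestCase):
        def test_root_coverage_when_archive_exists(self) -> None:
            if audit.archive_present:
                covered, total = audit.root_file_coverage
                self.assertEqual(covered, total)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Gate)
    # Send runner output to a buffer so the sketch stays quiet.
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


no_archive = run_gate(FakeAudit(False, (0, 0)))  # nothing asserted
complete = run_gate(FakeAudit(True, (18, 18)))   # asserted and satisfied
partial = run_gate(FakeAudit(True, (17, 18)))    # asserted and violated
```

Only the third run reports a failure: the gate stays quiet without an archive and becomes strict as soon as one is present.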

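2.2 Drawing the line against the false victory of "always green"

If the test asserted the ratios even without an archive, CI would become a lie; if it asserted nothing even with an archive, the archive would exist in name only. The current form is a two-state gate: only when the environment provides the reference does it check structural coverage, and even then it does not check business equivalence (the unit tests largely leave that untested).

3. Black-box subprocesses: test "runnable + observable contract", not a clone of TS behavior

PortingWorkspaceTests makes heavy use of subprocess.run(..., check=True) to invoke python -m src.main <subcommand>. check=True enforces exit code 0, and the assertions look for stdout substrings (titles, keywords, pattern strings).

Effects:

Honest: passing means "the CLI surface runs on the data currently in the repo", not "it matches the original Ink/React".
Stable: substring contracts tolerate refactoring better than pixel-level diff snapshots.
Coverage: summary, parity-audit, route, bootstrap, exec, turn-loop, the remote/ssh/teleport modes, command-graph, tool-pool, and other porting workflows are strung into one regression net.

Example (the bootstrap and load-session chain):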

# 104:174:tests/test_porting_workspace.py
    def test_bootstrap_cli_runs(self) -> None:
        result = subprocess.run(
            [sys.executable, '-m', 'src.main', 'bootstrap', 'review MCP tool', '--limit', '5'],
            check=True,
            capture_output=True,
            text=True,
        )
        self.assertIn('Runtime Session', result.stdout)
        ...

    def test_load_session_cli_runs(self) -> None:
        from src.runtime import PortRuntime

        session = PortRuntime().bootstrap_session('review MCP tool', limit=5)
        session_id = Path(session.persisted_session_path).stem
        result = subprocess.run(
            [sys.executable, '-m', 'src.main', 'load-session', session_id],
            check=True,
            capture_output=True,
            text=True,
        )
        self.assertIn(session_id, result.stdout)
        self.assertIn('messages', result.stdout)

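What this means for reputation: externally one can say "the gate guarantees that the port scaffolding and the mirrored data are self-consistent", not "the product has been fully replicated".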
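
The subprocess contract used throughout this section can be tried without the repository. The sketch below runs throwaway python -c snippets in place of python -m src.main; run_cli is a hypothetical helper, and the printed text is invented for the demonstration.

```python
import subprocess
import sys


def run_cli(code: str) -> str:
    """Run a Python snippet in a child process and return its stdout.

    check=True raises CalledProcessError on a nonzero exit code, so a
    returned string already implies exit code 0.
    """
    result = subprocess.run(
        [sys.executable, '-c', code],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Substring contract: only the heading is pinned, not the full output.
stdout = run_cli("print('# Runtime Session'); print('details may change freely')")

# A failing child surfaces as an exception rather than a silent pass.
try:
    run_cli('raise SystemExit(3)')
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
else:
    failed_code = 0
```

Pinning a heading rather than the whole transcript is what lets the output format evolve without rewriting the tests.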

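4. Honest vocabulary: `Mirrored`, placeholder packages, and snapshot boundaries

4.1 Naming in the execution layer

ExecutionRegistry uses MirroredCommand / MirroredTool, and the tests assert that the output contains "Mirrored command" / "Mirrored tool":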

# 9:24:src/execution_registry.py
@dataclass(frozen=True)
class MirroredCommand:
    name: str
    source_hint: str

    def execute(self, prompt: str) -> str:
        return execute_command(self.name, prompt).message

# 123:137:tests/test_porting_workspace.py
    def test_exec_command_and_tool_cli_run(self) -> None:
        ...
        self.assertIn("Mirrored command 'review'", command_result.stdout)
        self.assertIn("Mirrored tool 'MCPTool'", tool_result.stdout)

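The tests reinforce a user-visible meaning: this is mirrored, demo-style execution, not something that quietly passes itself off as production behavior.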
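
To make the naming contract concrete, here is a self-contained sketch. MirroredCommand follows the excerpt, but ExecutionResult and the body of execute_command are assumptions: the article only shows that the real function returns something with a .message attribute, and the message wording is modeled on the asserted substring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionResult:
    """Hypothetical result type; only the .message attribute is known."""
    message: str


def execute_command(name: str, prompt: str) -> ExecutionResult:
    # Hypothetical body: the real implementation is not shown in the article.
    return ExecutionResult(f"Mirrored command '{name}' would handle prompt {prompt!r}.")


@dataclass(frozen=True)
class MirroredCommand:
    name: str
    source_hint: str

    def execute(self, prompt: str) -> str:
        return execute_command(self.name, prompt).message


# The source_hint value is a placeholder, not a real path.
message = MirroredCommand('review', 'hypothetical/source/hint').execute('review MCP tool')
```

Asserting on the word "Mirrored" means a future change that drops the qualifier from user-facing output would break the gate.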

4.2 Subsystem placeholders

The hooks package is explicitly a placeholder; its metadata comes from the reference_data JSON, and it exports PORTING_NOTE:

# 1:16:src/hooks/__init__.py
"""Python package placeholder for the archived `hooks` subsystem."""
...
PORTING_NOTE = f"Python placeholder package for '{ARCHIVE_NAME}' with {MODULE_COUNT} archived module references."

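The unit tests assert that MODULE_COUNT, SAMPLE_FILES, and similar values are greater than zero, which verifies that "the placeholder is wired to the snapshot", not that "hooks has been ported".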
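
A rough sketch of such a placeholder module follows. The JSON keys, the sample file names, and the helper placeholder_is_wired are all hypothetical, since the article does not show the reference_data schema; only the PORTING_NOTE format string comes from the excerpt.

```python
import json

# Hypothetical snapshot contents; the real JSON schema is not shown here.
SNAPSHOT_JSON = '''
{
  "archive_name": "hooks",
  "module_count": 3,
  "sample_files": ["hooks/a.ts", "hooks/b.ts"]
}
'''

_SNAPSHOT = json.loads(SNAPSHOT_JSON)
ARCHIVE_NAME = _SNAPSHOT['archive_name']
MODULE_COUNT = _SNAPSHOT['module_count']
SAMPLE_FILES = tuple(_SNAPSHOT['sample_files'])
PORTING_NOTE = f"Python placeholder package for '{ARCHIVE_NAME}' with {MODULE_COUNT} archived module references."


def placeholder_is_wired() -> bool:
    """True when the placeholder carries non-empty snapshot metadata."""
    return MODULE_COUNT > 0 and len(SAMPLE_FILES) > 0
```

A check like this fails if the snapshot is emptied or detached, yet it says nothing about whether any hook logic exists.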

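4.3 Snapshots and manifests: test "scale and structure", not semantics

test_command_and_tool_snapshots_are_nontrivial: lower bounds on the number of PORTED_COMMANDS / PORTED_TOOLS.
test_manifest_counts_python_files: coarse thresholds such as total_python_files >= 20.
test_execution_registry_runs: registry size, plus the execution result string containing the mirror keywords.

Together with reference_data/*.json and archive_surface_snapshot.json, these form a testable definition of "surface alignment"; unittest does not stuff an undefined "behavioral parity" into its assertions.

5. Inline API tests: thin-layer invariants

A few tests call the Python API directly (QueryEnginePort.render_summary(), PortRuntime.bootstrap_session()) and check structural fields (such as Prompt:, or usage showing consumption), complementing the CLI black-box tests: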

# 21:25:tests/test_porting_workspace.py
    def test_query_engine_summary_mentions_workspace(self) -> None:
        summary = QueryEnginePort.from_workspace().render_summary()
        self.assertIn('Python Porting Workspace Summary', summary)
        self.assertIn('Command surface:', summary)
        self.assertIn('Tool surface:', summary)

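Philosophy: inline tests suit pure functions and aggregate reports; anything with side effects or argv handling goes through a subprocess, which keeps the boundary clear.

6. The echo on the Rust side (optional comparison)

The compat-harness integration test returns immediately when the upstream fixture is missing, which mirrors the Python rule of "no archive, no tightening":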

// 311:316:rust/crates/compat-harness/src/lib.rs
    #[test]
    fn extracts_non_empty_manifests_from_upstream_repo() {
        let paths = fixture_paths();
        if !has_upstream_fixture(&paths) {
            return;
        }

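This shows that "honestly unfinished" is a repeatable testing habit across the whole monorepo: when the input is missing, do not claim a proof of success, and do not flood the output with skips or raise false failures either.

7. Summary: how unittest protects the reputation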

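| Technique | Promise it protects | What it explicitly does not promise |
| --- | --- | --- |
| `if audit.archive_present` conditional assertions | "With a snapshot, the root-file mapping and scale metrics meet the bar" | "TS alignment is complete even without a snapshot" |
| Parity wording + CLI help "when available" | Explicit degradation when the reference is missing | A silent perfect score |
| Subprocess + `check=True` + substrings | Workflows and data are self-consistent and demoable | Line-by-line agreement with upstream runtime behavior |
| `Mirrored*` and placeholder-package tests | User-visible output never claims to be the original | hooks/subsystems are fully implemented |
| Snapshot count thresholds | The surface (command/tool entries) does not shrink | The semantics of every entry have been verified |
| `discover` + standard library | A low-dependency gate that is easy to rerun in any environment | Complex property-based or fuzz testing |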

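In one sentence: this unittest suite writes "what we guarantee" as an executable contract (it runs, it carries the mirror wording, and its structure aligns when the archive exists) and leaves "what we do not yet guarantee" in conditional branches and docstrings. That is how a large port turns a reputation for honest incompleteness into engineering.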
