CoreCoder: A Distilled Version of Claude Code

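CoreCoder Framework Analysis

Project source: https://github.com/he-yufeng/CoreCoder

Author's blog post: "Writing a Claude Code in 1,400 Lines of Code" by He Yufeng (he-yufeng)

CoreCoder is a minimal AI coding agent written in roughly 1,400 lines of Python by the well-known blogger He Yufeng, inspired by the architecture of Claude Code. His blog post walks through Claude Code's core design and distills it into CoreCoder: it keeps the key capabilities (the Agent Loop, exact string editing, three-layer context compression, parallel tool execution, sub-agent delegation) while dropping heavyweight framework dependencies, so it runs on the OpenAI SDK plus 4 lightweight dependencies. This document takes the framework apart piece by piece, covering core features, the tool system, prompt design, memory and compression, safety interception, and scheduling. Each section ends with a "code linkage" walkthrough that ties the design back to the source, to help readers understand the intent behind the code.

Version: 0.2.0 | License: MIT | Language: Python ≥3.10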

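1. Overall Framework Structure

corecoder/
├── __init__.py        # Package entry point; exports the public API (Agent, LLM, Config, ALL_TOOLS)
├── __main__.py        # Entry point for python -m corecoder
├── agent.py           # Core agent loop (Agent Loop)
├── cli.py             # Interactive REPL command-line interface
├── config.py          # Configuration management (environment variables + .env loading)
├── context.py         # Three-layer context compression system
├── llm.py             # LLM provider abstraction layer (OpenAI-compatible API)
├── prompt.py          # System prompt construction
├── session.py         # Session persistence (save/restore)
└── tools/             # Tool system
    ├── __init__.py    # Tool registry (ALL_TOOLS + get_tool lookup)
    ├── base.py        # Tool base class (Tool ABC)
    ├── agent.py       # Sub-agent tool (sub-agent delegation)
    ├── bash.py        # Shell command execution (with safety interception)
    ├── edit.py        # Exact string-replacement editing
    ├── glob_tool.py   # File pattern-matching search
    ├── grep.py        # Regex content search
    ├── read.py        # File reading with line numbers
    └── write.py       # File creation/overwrite

Architecture pattern: user input → LLM (with tools) → tool call? → execute → loop, until the LLM returns plain text with no tool calls, which marks the task as complete.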

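2. Core Features

2.1 Agent Loop

File: corecoder/agent.py

This is the heart of CoreCoder. The core flow:

1. Receive the user message and append it to the conversation history
2. Trigger a context compression check (context.maybe_compress)
3. Call the LLM (with the system prompt + tool schemas)
4. No tool calls → return the text to the user
5. Tool calls present → execute the tools (a single call runs inline, multiple calls run in parallel) → append the results → go back to step 2
6. Loop for at most max_rounds (default 50) rounds

# Core signatures
class Agent:
    def __init__(self, llm, tools=None, max_context_tokens=128000, max_rounds=50): ...
    def chat(self, user_input, on_token=None, on_tool=None) -> str: ...
    def _exec_tool(self, tc) -> str: ...           # run a single tool
    def _exec_tools_parallel(self, tool_calls, on_tool=None) -> list[str]: ...  # run tools in parallel
    def reset(self): ...                           # clear the conversation

Key design points:

Multi-tool parallelism: when the LLM returns several tool calls at once, they run in parallel on a ThreadPoolExecutor(max_workers=8)
Streaming callbacks: on_token (streamed token output) and on_tool (tool-call notification)
Automatic sub-agent binding: at initialization, AgentTool's _parent_agent is pointed at the agent itself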
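
The loop described above can be sketched without any network access. This is a minimal illustration, not CoreCoder's code: `ScriptedLLM`, `Reply`, `run_loop`, and the `add` tool are hypothetical names, and the stub simply replays canned replies in place of a real model.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict


@dataclass
class Reply:
    content: str
    tool_calls: list = field(default_factory=list)


class ScriptedLLM:
    """Hypothetical stand-in for an LLM: replays canned replies in order."""

    def __init__(self, replies):
        self._replies = iter(replies)

    def chat(self, messages):
        return next(self._replies)


def run_loop(llm, tools, user_input, max_rounds=50):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_rounds):
        reply = llm.chat(messages)
        messages.append({"role": "assistant", "content": reply.content})
        if not reply.tool_calls:          # plain text means the task is done
            return reply.content, messages
        for tc in reply.tool_calls:       # run each requested tool, feed the result back
            result = tools[tc.name](**tc.arguments)
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
    return "(reached maximum tool-call rounds)", messages


llm = ScriptedLLM([
    Reply("", [ToolCall("call_1", "add", {"a": 2, "b": 3})]),  # round 1: asks for a tool
    Reply("The sum is 5."),                                    # round 2: plain text, loop ends
])
answer, history = run_loop(llm, {"add": lambda a, b: str(a + b)}, "What is 2 + 3?")
print(answer)
```

The history ends up as user, assistant, tool, assistant: the tool result is fed back as a message, and the second model reply terminates the loop because it carries no tool calls.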

🔗 Code linkage: how the Agent Loop is implemented

Initialization (agent.py:21-39):

class Agent:
    def __init__(self, llm, tools=None, max_context_tokens=128_000, max_rounds=50):
        self.llm = llm
        self.tools = tools if tools is not None else ALL_TOOLS  # ← the 7 tools registered at tools/__init__.py:11
        self.messages: list[dict] = []                           # ← working memory: the conversation history
        self.context = ContextManager(max_tokens=max_context_tokens)  # ← context.py:37, three-layer compression manager
        self.max_rounds = max_rounds
        self._system = system_prompt(self.tools)                 # ← prompt.py:7, builds the system prompt dynamically

        # wire up sub-agent capability
        for t in self.tools:
            if isinstance(t, AgentTool):
                t._parent_agent = self                         # ← inject self into AgentTool to enable sub-agent delegation

Implementation notes: ALL_TOOLS comes from the global registry in tools/__init__.py; system_prompt() is built once at initialization and embeds the tool list in the prompt; sub-agent binding works by scanning the tool list for an AgentTool and injecting a reference to the parent.

Main loop (agent.py:47-91):

def chat(self, user_input: str, on_token=None, on_tool=None) -> str:
    self.messages.append({"role": "user", "content": user_input})  # ① append the user message
    self.context.maybe_compress(self.messages, self.llm)           # ② trigger a compression check

    for _ in range(self.max_rounds):                               # ③ at most max_rounds (default 50) iterations
        resp = self.llm.chat(                                      # ④ call the LLM
            messages=self._full_messages(),                        #   ← system prompt + conversation history
            tools=self._tool_schemas(),                            #   ← OpenAI schemas for all tools
            on_token=on_token,                                     #   ← streaming token callback
        )

        if not resp.tool_calls:                                    # ⑤ no tool calls = done
            self.messages.append(resp.message)
            return resp.content

        self.messages.append(resp.message)                         # ⑥ append the assistant message (with tool_calls)

        if len(resp.tool_calls) == 1:                              # ⑦ single tool: run inline
            tc = resp.tool_calls[0]
            if on_tool: on_tool(tc.name, tc.arguments)
            result = self._exec_tool(tc)
            self.messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
        else:                                                      # ⑧ multiple tools: run in parallel
            results = self._exec_tools_parallel(resp.tool_calls, on_tool)
            for tc, result in zip(resp.tool_calls, results):
                self.messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})

        self.context.maybe_compress(self.messages, self.llm)       # ⑨ compress again after tool execution
    return "(reached maximum tool-call rounds)"

Implementation notes: _full_messages() prepends the system prompt to the conversation history; _tool_schemas() calls each tool's schema() method to produce the OpenAI function-calling format; tool results are appended as role="tool" messages whose tool_call_id matches the tc.id returned by the LLM, which is the standard OpenAI API protocol.

Single-tool execution (agent.py:93-103):

def _exec_tool(self, tc) -> str:
    tool = get_tool(tc.name)                          # ← tools/__init__.py:22, look up the tool by name
    if tool is None:
        return f"Error: unknown tool '{tc.name}'"
    try:
        return tool.execute(**tc.arguments)            # ← call the tool's execute method, unpacking arguments as **kwargs
    except TypeError as e:
        return f"Error: bad arguments for {tc.name}: {e}"
    except Exception as e:
        return f"Error executing {tc.name}: {e}"

Implementation notes: get_tool() scans the ALL_TOOLS list for a matching name; the TypeError branch catches argument mismatches (such as a missing required parameter); every exception is caught and converted into error text, so a failing tool never interrupts the main loop.

Parallel execution (agent.py:105-118):

def _exec_tools_parallel(self, tool_calls, on_tool=None) -> list[str]:
    for tc in tool_calls:
        if on_tool: on_tool(tc.name, tc.arguments)    # ← announce every tool call first (for UI display)
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(self._exec_tool, tc) for tc in tool_calls]
        return [f.result() for f in futures]           # ← collect all results in the same order as tool_calls

Implementation notes: ThreadPoolExecutor runs independent tool calls concurrently; max_workers=8 caps the concurrency; the Futures returned by pool.submit are collected in submission order, and each f.result() blocks until that particular result is ready.
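
The ordering guarantee is easy to check in isolation. The sketch below uses a hypothetical `slow_tool` whose delays make the calls finish out of order; collecting `f.result()` in submission order still yields results aligned with the original call list.

```python
import concurrent.futures
import time


def slow_tool(name: str, delay: float) -> str:
    """Hypothetical tool: sleeps to simulate work, then reports completion."""
    time.sleep(delay)
    return f"{name} done"


# "first" finishes last, "second" finishes first.
calls = [("first", 0.2), ("second", 0.05), ("third", 0.1)]

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(slow_tool, name, delay) for name, delay in calls]
    results = [f.result() for f in futures]  # submission order, not completion order

print(results)
```

This matters because each result has to be paired with the `tool_call_id` of the call that produced it when it is appended to the history.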

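2.2 LLM Provider Layer

File: corecoder/llm.py

A single wrapper around the OpenAI-compatible API that works with any model offering a compatible endpoint:

Provider	How to enable
OpenAI (GPT-4o/5.4, etc.)	Default
DeepSeek	`OPENAI_BASE_URL=https://api.deepseek.com`
Alibaba Tongyi (Qwen)	Point `OPENAI_BASE_URL` at the Alibaba endpoint
Moonshot (Kimi)	Point `OPENAI_BASE_URL` at the Moonshot endpoint
Anthropic Claude	Requires an OpenAI-compatible proxy
Ollama (local)	`OPENAI_BASE_URL=http://localhost:11434/v1`

Core capabilities:

Streaming responses: always uses stream=True and supports token-by-token output
Tool calls: accumulates tool_calls across chunks and parses the JSON arguments automatically
Retry: exponential-backoff retries (RateLimitError, APITimeoutError, APIConnectionError, 5xx server errors), up to 3 attempts
Cost estimation: a built-in pricing table (_PRICING) for 16 models, used to compute the USD cost on the fly
Usage tracking: accumulates total_prompt_tokens / total_completion_tokens

class LLM:
    def chat(self, messages, tools=None, on_token=None) -> LLMResponse: ...
    def _call_with_retry(self, params, max_retries=3): ...  # API call with retries

@dataclass
class LLMResponse:
    content: str
    tool_calls: list[ToolCall]
    prompt_tokens: int
    completion_tokens: int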
🔗 Code linkage: how the LLM layer is implemented

Streaming response + tool-call accumulation (llm.py:105-182):

def chat(self, messages, tools=None, on_token=None) -> LLMResponse:
    params = {"model": self.model, "messages": messages, "stream": True, **self.extra}
    if tools: params["tools"] = tools

    # Try to enable stream_options to receive usage information (not supported by every provider)
    try:
        params["stream_options"] = {"include_usage": True}
        stream = self._call_with_retry(params)
    except Exception:
        params.pop("stream_options", None)              # ← fall back if unsupported
        stream = self._call_with_retry(params)

    content_parts: list[str] = []                       # ← accumulated text tokens
    tc_map: dict[int, dict] = {}                        # ← accumulated tool calls (index → {id, name, args})
    for chunk in stream:
        if chunk.usage:                                 # ← the final chunk carries usage
            prompt_tok = chunk.usage.prompt_tokens
            completion_tok = chunk.usage.completion_tokens
        delta = chunk.choices[0].delta
        if delta.content:
            content_parts.append(delta.content)          # ← accumulate text
            if on_token: on_token(delta.content)         # ← call back so the CLI can display it
        if delta.tool_calls:
            for tc_delta in delta.tool_calls:
                idx = tc_delta.index
                if idx not in tc_map: tc_map[idx] = {"id": "", "name": "", "args": ""}
                if tc_delta.id: tc_map[idx]["id"] = tc_delta.id
                if tc_delta.function.name: tc_map[idx]["name"] = tc_delta.function.name
                if tc_delta.function.arguments: tc_map[idx]["args"] += tc_delta.function.arguments
                                                        # ↑ tool-call arguments arrive in pieces across chunks and must be concatenated

    # Parse the accumulated tool calls
    parsed: list[ToolCall] = []
    for idx in sorted(tc_map):
        raw = tc_map[idx]
        args = json.loads(raw["args"])                  # ← parse the accumulated JSON string into a dict
        parsed.append(ToolCall(id=raw["id"], name=raw["name"], arguments=args))
    return LLMResponse(content="".join(content_parts), tool_calls=parsed, ...)

Implementation notes: in an OpenAI streaming response, a tool call's function.arguments arrives as JSON fragments spread over multiple chunks, so the fragments must be concatenated per index in tc_map before json.loads can parse them; stream_options is an OpenAI extension that some providers do not support, hence the try/except fallback.
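
To make the per-index concatenation concrete, here is a small offline sketch. The `delta` helper and the simulated `stream` are hypothetical stand-ins built with `SimpleNamespace`; they only mimic the attribute shape (`index`, `id`, `function.name`, `function.arguments`) that the excerpt above reads, and no real API is called.

```python
import json
from types import SimpleNamespace as NS


def delta(index, id=None, name=None, arguments=None):
    """Hypothetical tool-call delta with the attributes the excerpt reads."""
    return NS(index=index, id=id, function=NS(name=name, arguments=arguments))


# Simulated stream: two tool calls whose JSON arguments are split across chunks.
stream = [
    [delta(0, id="call_a", name="read_file", arguments='{"file_')],
    [delta(0, arguments='path": "a.py"}'), delta(1, id="call_b", name="grep")],
    [delta(1, arguments='{"pattern": "TODO"}')],
]

tc_map = {}
for tool_call_deltas in stream:
    for d in tool_call_deltas:
        slot = tc_map.setdefault(d.index, {"id": "", "name": "", "args": ""})
        if d.id:
            slot["id"] = d.id
        if d.function.name:
            slot["name"] = d.function.name
        if d.function.arguments:
            slot["args"] += d.function.arguments  # fragments are only valid JSON once joined

calls = [(tc_map[i]["name"], json.loads(tc_map[i]["args"])) for i in sorted(tc_map)]
print(calls)
```

Parsing any single fragment (for example `'{"file_'`) would raise a `json.JSONDecodeError`, which is why parsing is deferred until the stream has ended.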

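Retry mechanism (llm.py:184-199):

def _call_with_retry(self, params, max_retries=3):
    for attempt in range(max_retries):
        try:
            return self.client.chat.completions.create(**params)
        except (RateLimitError, APITimeoutError, APIConnectionError) as e:
            if attempt == max_retries - 1: raise        # ← re-raise if the last attempt also fails
            time.sleep(2 ** attempt)                     # ← exponential backoff: 1s, then 2s (the third failure is raised)
        except APIError as e:
            if e.status_code >= 500 and attempt < max_retries - 1:
                time.sleep(2 ** attempt)                 # ← retry 5xx server errors
            else:
                raise                                    # ← 4xx client errors are not retried

Implementation notes: 4xx errors (such as 401 authentication failure or 400 bad request) are raised immediately because retrying cannot help; 5xx errors (such as 503 service overloaded) are retried with exponential backoff. With max_retries=3 there are at most three attempts, so the waits are 1s and 2s.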
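
The backoff schedule can be verified without sleeping by injecting the sleep function. This is a simplified sketch under stated assumptions: `TransientError` is a hypothetical stand-in for the SDK's retryable exceptions, and `call_with_retry` takes a `sleep` parameter that the original does not have.

```python
class TransientError(Exception):
    """Hypothetical stand-in for RateLimitError / APITimeoutError / APIConnectionError."""


def call_with_retry(fn, max_retries=3, sleep=lambda seconds: None):
    for attempt in range(max_retries):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries - 1:
                raise                     # last attempt: give up
            sleep(2 ** attempt)           # 1s after attempt 0, 2s after attempt 1


delays = []
attempts = {"n": 0}


def always_fails():
    attempts["n"] += 1
    raise TransientError("simulated outage")


try:
    call_with_retry(always_fails, sleep=delays.append)  # record delays instead of sleeping
    outcome = "succeeded"
except TransientError:
    outcome = "gave up"

print(attempts["n"], delays, outcome)
```

Three attempts produce only two waits, because the final failure is re-raised rather than followed by another sleep.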

Cost estimation (llm.py:93-103):

@property
def estimated_cost(self) -> float | None:
    pricing = _PRICING.get(self.model)                  # ← look up the 16-model pricing table
    if not pricing: return None                          # ← unknown model: return None
    input_rate, output_rate = pricing
    return (self.total_prompt_tokens * input_rate / 1_000_000
          + self.total_completion_tokens * output_rate / 1_000_000)

Implementation notes: the _PRICING dict stores (input_price, output_price) per million tokens; the cost in USD is computed from the two accumulated token counters. The /tokens command in the CLI reads this property to display the cost.
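
As a worked example of the arithmetic, the sketch below applies the same per-million-token formula to made-up numbers. The `PRICING` table, the model name `example-model`, and its rates are hypothetical and are not taken from CoreCoder's `_PRICING`.

```python
# Hypothetical pricing: (input USD, output USD) per 1M tokens.
PRICING = {"example-model": (2.50, 10.00)}


def estimated_cost(model, prompt_tokens, completion_tokens):
    pricing = PRICING.get(model)
    if not pricing:
        return None  # unknown model: no estimate
    input_rate, output_rate = pricing
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


# 120,000 prompt tokens * $2.50/M = $0.30; 8,000 completion tokens * $10/M = $0.08
cost = estimated_cost("example-model", 120_000, 8_000)
unknown = estimated_cost("not-in-table", 1_000, 1_000)
print(f"${cost:.2f}", unknown)
```

Returning `None` for an unknown model lets the caller distinguish "free" from "price not known".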

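3. Basic Tool System

3.1 Tool Base Class

File: corecoder/tools/base.py

class Tool(ABC):
    name: str           # tool name
    description: str    # tool description (shown to the LLM)
    parameters: dict    # JSON Schema parameter definition

    @abstractmethod
    def execute(self, **kwargs) -> str: ...  # run the tool and return a text result

    def schema(self) -> dict: ...  # produce the OpenAI function-calling schema

3.2 The Seven Built-in Tools

Tool name	Class	File	Function
`bash`	`BashTool`	`tools/bash.py`	Shell command execution with safety interception, timeout, output truncation, and working-directory tracking
`read_file`	`ReadFileTool`	`tools/read.py`	Reads files with line numbers; supports chunked reads via offset/limit
`write_file`	`WriteFileTool`	`tools/write.py`	Creates a new file or overwrites it completely; creates parent directories automatically
`edit_file`	`EditFileTool`	`tools/edit.py`	Core innovation: exact string search-and-replace with a uniqueness guarantee; outputs a unified diff
`glob`	`GlobTool`	`tools/glob_tool.py`	File pattern matching (supports recursive `**`), sorted by modification time
`grep`	`GrepTool`	`tools/grep.py`	Regex content search; skips `.git`/`node_modules` and similar; capped at 200 matches
`agent`	`AgentTool`	`tools/agent.py`	Sub-agent delegation; runs complex subtasks in an independent context

3.3 Tool Registry

File: corecoder/tools/__init__.py

ALL_TOOLS = [BashTool(), ReadFileTool(), WriteFileTool(), EditFileTool(),
             GlobTool(), GrepTool(), AgentTool()]

def get_tool(name: str) -> Tool | None: ...  # look up a tool by name

3.4 edit_file: the core innovation of precise editing

This is the key design distilled from Claude Code:

No line numbers: the LLM specifies old_string (an exact substring) and new_string (the replacement)
Uniqueness guarantee: old_string must occur exactly once in the file, otherwise the edit is refused
Safe and reviewable: every edit produces a unified diff, so both the user and the LLM can see exactly what changed
Change tracking: the _changed_files set records every file modified during the session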
🔗 Code linkage: how the tool system is implemented

Tool base class → OpenAI schema generation (tools/base.py:6-18):

class Tool(ABC):
    name: str           # tool name (used by the LLM when calling)
    description: str    # tool description (consulted by the LLM when choosing a tool)
    parameters: dict    # JSON Schema parameter definition

    @abstractmethod
    def execute(self, **kwargs) -> str: ...   # ← every tool must implement this method

    def schema(self) -> dict:                 # ← produce the OpenAI function-calling format
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }

Implementation notes: each tool only has to define 3 class attributes and 1 execute method; schema() then produces the JSON structure the OpenAI API expects. Agent._tool_schemas() calls this method to register all tools with the LLM.
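
To illustrate the "3 attributes + 1 method" contract, here is a self-contained sketch that restates the base class from the excerpt and adds a made-up tool. `WordCountTool` is hypothetical and is not part of CoreCoder.

```python
from abc import ABC, abstractmethod


class Tool(ABC):
    name: str
    description: str
    parameters: dict

    @abstractmethod
    def execute(self, **kwargs) -> str: ...

    def schema(self) -> dict:
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


class WordCountTool(Tool):  # hypothetical example tool
    name = "word_count"
    description = "Count the words in a piece of text."
    parameters = {
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Text to count"}},
        "required": ["text"],
    }

    def execute(self, text: str) -> str:
        return str(len(text.split()))  # tools always return text


tool = WordCountTool()
print(tool.schema()["function"]["name"], tool.execute(text="one two three"))
```

The keyword names in `execute` must match the property names in `parameters`, since the agent unpacks the model's arguments with `**`.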

Tool registry (tools/__init__.py:11-27):

ALL_TOOLS = [BashTool(), ReadFileTool(), WriteFileTool(), EditFileTool(),
             GlobTool(), GrepTool(), AgentTool()]   # ← instantiate every tool

def get_tool(name: str):                             # ← looked up by name in Agent._exec_tool
    for t in ALL_TOOLS:
        if t.name == name: return t
    return None

Implementation notes: a plain list with a linear scan is O(n), but with n=7 there is nothing to optimize; adding a tool only requires writing the class and appending it to ALL_TOOLS.

edit_file precise-edit implementation (tools/edit.py:44-73):

def execute(self, file_path: str, old_string: str, new_string: str) -> str:
    p = Path(file_path).expanduser().resolve()
    content = p.read_text()
    occurrences = content.count(old_string)           # ← count occurrences of old_string

    if occurrences == 0:                              # ← not found: return a file preview to help the LLM correct itself
        preview = content[:500] + ("..." if len(content) > 500 else "")
        return f"Error: old_string not found in {file_path}.\nFile starts with:\n{preview}"
    if occurrences > 1:                               # ← multiple matches: refuse and ask for more context
        return f"Error: old_string appears {occurrences} times in {file_path}. ..."

    new_content = content.replace(old_string, new_string, 1)  # ← replace exactly once
    p.write_text(new_content)
    _changed_files.add(str(p))                        # ← track the change (used by the /diff command)

    diff = _unified_diff(content, new_content, str(p)) # ← generate a diff for review
    return f"Edited {file_path}\n{diff}"

Implementation notes: content.count(old_string) performs the uniqueness check: 0 matches returns a preview to help the LLM locate the text, and more than 1 match is refused to prevent accidental edits; the third argument of content.replace(old_string, new_string, 1) limits the replacement to the first match; _unified_diff uses difflib.unified_diff from the Python standard library to produce a human-readable diff.
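
The same count-then-replace idea can be tried on in-memory strings. The sketch below is an illustration only: `replace_unique` is a hypothetical helper that raises instead of returning an error string, and the `a/example.py` labels are placeholder names for the diff header.

```python
import difflib


def replace_unique(content: str, old: str, new: str):
    """Replace `old` with `new` only if `old` occurs exactly once."""
    n = content.count(old)
    if n != 1:
        raise ValueError(f"old string appears {n} times; expected exactly 1")
    updated = content.replace(old, new, 1)
    diff = "".join(difflib.unified_diff(
        content.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile="a/example.py",   # placeholder labels for the diff header
        tofile="b/example.py",
    ))
    return updated, diff


source = "x = 1\ny = 2\nprint(x)\n"
updated, diff = replace_unique(source, "y = 2", "y = 3")
print(diff)

try:
    replace_unique(source, "x", "z")   # "x" occurs in both "x = 1" and "print(x)"
    ambiguous = None
except ValueError as exc:
    ambiguous = str(exc)
print(ambiguous)
```

The ambiguous case shows why the rule pushes the model to quote more surrounding context: a short `old_string` is likely to match in several places.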

BashTool safety-interception implementation (tools/bash.py:53-92):

def execute(self, command: str, timeout: int = 120) -> str:
    warning = _check_dangerous(command)                # ← safety check before execution
    if warning:
        return f"⚠ Blocked: {warning}\nCommand: {command}\nIf intentional, modify the command to be more specific."

    cwd = _cwd or os.getcwd()                          # ← use the tracked working directory
    proc = subprocess.run(command, shell=True, capture_output=True, text=True, timeout=timeout, cwd=cwd)

    if proc.returncode == 0:
        _update_cwd(command, cwd)                      # ← update working-directory tracking on success
    out = proc.stdout
    if proc.stderr: out += f"\n[stderr]\n{proc.stderr}"
    if proc.returncode != 0: out += f"\n[exit code: {proc.returncode}]"

    if len(out) > 15_000:                              # ← output truncation: keep the head and the tail
        out = out[:6000] + f"\n\n... truncated ({len(out)} chars total) ...\n\n" + out[-3000:]
    return out.strip() or "(no output)"

Implementation notes: the safety check runs before subprocess.run and matches 9 dangerous patterns by regex; _update_cwd parses cd commands and updates the global _cwd variable so later commands run in the right directory; truncation keeps the first 6000 and last 3000 characters and replaces the middle with an ellipsis marker.
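
The head-and-tail truncation can be isolated into a small pure function. This is a sketch with the same thresholds as the excerpt; the function name `truncate_output` and its parameters are hypothetical.

```python
def truncate_output(out: str, limit: int = 15_000, head: int = 6_000, tail: int = 3_000) -> str:
    """Keep the first `head` and last `tail` characters of overly long output."""
    if len(out) <= limit:
        return out
    marker = f"\n\n... truncated ({len(out)} chars total) ...\n\n"
    return out[:head] + marker + out[-tail:]


long_output = "A" * 10_000 + "B" * 10_000   # 20,000 characters
short = truncate_output(long_output)
print(len(long_output), "->", len(short))
```

Keeping both ends is a reasonable choice for command output: the beginning usually shows what ran, and the end usually holds the final error or summary.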

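AgentTool sub-agent implementation (tools/agent.py:36-57):

def execute(self, task: str) -> str:
    parent = self._parent_agent                        # ← injected by Agent.__init__
    sub = Agent(
        llm=parent.llm,                                # ← share the LLM instance
        tools=[t for t in parent.tools if t.name != "agent"],  # ← drop the agent tool to prevent recursion
        max_context_tokens=parent.context.max_tokens,  # ← same context window
        max_rounds=20,                                 # ← sub-agents get at most 20 rounds
    )
    try:
        result = sub.chat(task)                        # ← runs with its own conversation history
        if len(result) > 5000:                         # ← truncate overly long output
            result = result[:4500] + "\n... (sub-agent output truncated)"
        return f"[Sub-agent completed]\n{result}"
    except Exception as e:
        return f"Sub-agent error: {e}"

Implementation notes: the sub-agent is a fully independent Agent instance with its own messages list, so it does not pollute the parent's conversation history; recursion is prevented by filtering out the agent tool with t.name != "agent"; the sub-agent runs to completion and returns a text summary to the parent.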

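4. Prompt Design

File: corecoder/prompt.py

The system prompt is the key instruction that turns the LLM into a coding agent. It is built dynamically:

4.1 Structure

┌─────────────────────────────────────────┐
│  Role definition                        │
│  "You are CoreCoder, an AI coding       │
│   assistant running in the user's       │
│   terminal."                            │
├─────────────────────────────────────────┤
│  Environment info (runtime-injected)    │
│  - Working directory: {cwd}             │
│  - OS: {system} {release} ({machine})   │
│  - Python: {version}                    │
├─────────────────────────────────────────┤
│  Tool list (generated dynamically)      │
│  - **bash**: Execute a shell command... │
│  - **read_file**: Read a file's...      │
│  - ...                                  │
├─────────────────────────────────────────┤
│  Behavior rules (8 rules)               │
│  1. Read before edit                    │
│  2. edit_file for small changes         │
│  3. Verify your work                    │
│  4. Be concise                          │
│  5. One step at a time                  │
│  6. edit_file uniqueness                │
│  7. Respect existing style              │
│  8. Ask when unsure                     │
└─────────────────────────────────────────┘

4.2 Design Principles

Environment awareness: the current working directory, operating system, and Python version are injected dynamically so the LLM can produce commands that fit the environment
Self-describing tools: the tool list is generated from the tools argument, so newly added tools appear in the prompt automatically
Rule-driven: 8 core rules constrain the LLM's behavior and prevent common mistakes (such as editing without reading first, or rewriting a whole file for a small change)
Minimalism: the prompt is terse and contains no lengthy examples; it relies on the parameter descriptions in the tool schemas to guide the LLM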

🔗 Code linkage: how the prompt is built and injected

Dynamic builder function (prompt.py:7-33):

def system_prompt(tools) -> str:
    cwd = os.getcwd()                                          # ← current working directory, read at runtime
    tool_list = "\n".join(f"- **{t.name}**: {t.description}"   # ← one description line per tool
                          for t in tools)
    uname = platform.uname()                                   # ← operating system information

    # The f-string below injects {cwd}, the {uname...} fields and {tool_list} dynamically.
    # (Annotations are kept outside the string so they do not end up in the prompt.)
    return f"""\
You are CoreCoder, an AI coding assistant running in the user's terminal.
You help with software engineering: writing code, fixing bugs, refactoring, explaining code, running commands, and more.

# Environment
- Working directory: {cwd}
- OS: {uname.system} {uname.release} ({uname.machine})
- Python: {platform.python_version()}

# Tools
{tool_list}

# Rules
1. **Read before edit.** Always read a file before modifying it.
2. **edit_file for small changes.** Use edit_file for targeted edits; write_file only for new files or complete rewrites.
3. **Verify your work.** After making changes, run relevant tests or commands to confirm correctness.
4. **Be concise.** Show code over prose. Explain only what's necessary.
5. **One step at a time.** For multi-step tasks, execute them sequentially.
6. **edit_file uniqueness.** When using edit_file, include enough surrounding context in old_string to guarantee a unique match.
7. **Respect existing style.** Match the project's coding conventions.
8. **Ask when unsure.** If the request is ambiguous, ask for clarification rather than guessing.
"""

Injection path:

prompt.py:system_prompt(tools)
    ↓ called from Agent.__init__ (agent.py:34)
    ↓ self._system = system_prompt(self.tools)
    ↓ Agent._full_messages() (agent.py:41-42)
    ↓ [{"role": "system", "content": self._system}] + self.messages
    ↓ Agent.chat → LLM.chat(messages=self._full_messages(), ...)
    ↓ sent to the OpenAI API

Implementation notes: the prompt is built once in Agent.__init__ and cached in self._system; on every LLM call, _full_messages() prepends it to the messages list as a role="system" message; tool_list is generated from the tool instances, so adding a tool only requires changing ALL_TOOLS and the prompt picks it up automatically.

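5. Memory and Context Management

5.1 Conversation Memory

Files: corecoder/agent.py + corecoder/session.py

Level	Implementation	Description
Working memory	`Agent.messages: list[dict]`	All messages of the current conversation (user/assistant/tool), i.e. short-term memory
Persistent memory	`session.py` → `~/.corecoder/sessions/`	JSON files that save/restore the full conversation history, allowing a session to be resumed later

Session persistence API:

save_session(messages, model, session_id) → saves to disk and returns the session ID
load_session(session_id) → loads a past session and returns (messages, model)
list_sessions() → lists the 20 most recent sessions (with previews)

5.2 Three-Layer Context Compression

File: corecoder/context.py

Modeled on Claude Code's 4-layer compression strategy, simplified to 3 layers:

Token usage ───────────────────────────────────────►

  0%        50%        70%        90%       100%
  │          │          │          │          │
  │    Layer1:snip   Layer2:summarize  Layer3:collapse
  │          │          │          │          │
  │   snip tool output   LLM summary   hard collapse
  │          │          │          │          │

Layer	Threshold	Strategy	Details
Layer 1: tool_snip	50%	Tool output truncation	Tool results longer than 1500 characters keep only the first 3 and last 3 lines
Layer 2: summarize	70%	LLM summary	Keeps the most recent 8 messages (keep_recent=8) and uses the LLM to compress older conversation into a summary
Layer 3: hard_collapse	90%	Emergency hard collapse	Keeps only the most recent 4 messages plus the summary; everything else is discarded

Prompt used for the LLM summary:

"Compress this conversation into a brief summary. Preserve: file paths edited, key decisions made, errors encountered, current task state. Drop: verbose command output, code listings, redundant back-and-forth."

Fallback strategy: when the LLM is unavailable, _extract_key_info uses regexes to extract file paths and error lines as a fallback summary.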

🔗 Code linkage: how three-layer context compression is implemented

Compression scheduler (context.py:45-67):

class ContextManager:
    def __init__(self, max_tokens: int = 128_000):
        self.max_tokens = max_tokens
        self._snip_at = int(max_tokens * 0.50)      # ← 50% threshold: 64K tokens
        self._summarize_at = int(max_tokens * 0.70)  # ← 70% threshold: 89.6K tokens
        self._collapse_at = int(max_tokens * 0.90)   # ← 90% threshold: 115.2K tokens

    def maybe_compress(self, messages, llm=None) -> bool:
        current = estimate_tokens(messages)          # ← estimate the current token count
        compressed = False

        if current > self._snip_at:                  # ← Layer 1: snip tool outputs
            if self._snip_tool_outputs(messages):
                compressed = True
                current = estimate_tokens(messages)  # ← recompute the token count

        if current > self._summarize_at and len(messages) > 10:  # ← Layer 2: LLM summary
            if self._summarize_old(messages, llm, keep_recent=8):
                compressed = True
                current = estimate_tokens(messages)

        if current > self._collapse_at and len(messages) > 4:    # ← Layer 3: emergency collapse
            self._hard_collapse(messages, llm)
            compressed = True

        return compressed

Implementation notes: the three thresholds escalate (50% → 70% → 90%); the token count is recomputed after each layer, and the next layer is skipped if usage has already dropped below its threshold; maybe_compress is called at two points in Agent.chat: after the user message and after tool execution.
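
A quick way to see the thresholds in action is to ask which of them a given token count exceeds. The helper below is a hypothetical illustration: it checks the raw thresholds only, and ignores both the re-estimation after each layer and the message-count conditions (`len(messages) > 10`, `len(messages) > 4`) that the real scheduler applies.

```python
def thresholds_exceeded(current_tokens: int, max_tokens: int = 128_000) -> list[str]:
    """Return the names of the layers whose threshold `current_tokens` exceeds."""
    layers = [("tool_snip", 0.50), ("summarize", 0.70), ("hard_collapse", 0.90)]
    return [name for name, ratio in layers if current_tokens > int(max_tokens * ratio)]


for tokens in (60_000, 70_000, 100_000, 120_000):
    print(tokens, thresholds_exceeded(tokens))
```

In the real flow a count of 120,000 would not necessarily reach the hard collapse: if snipping and summarizing bring the estimate under 115,200, the third layer is skipped.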

Layer 1: tool output truncation (context.py:70-94):

@staticmethod
def _snip_tool_outputs(messages: list[dict]) -> bool:
    changed = False
    for m in messages:
        if m.get("role") != "tool": continue         # ← only process tool-role messages
        content = m.get("content", "")
        if len(content) <= 1500: continue             # ← leave short outputs alone
        lines = content.splitlines()
        if len(lines) <= 6: continue                  # ← leave outputs of 6 lines or fewer alone
        snipped = ("\n".join(lines[:3])               # ← keep the first 3 lines
                  + f"\n... ({len(lines)} lines, snipped to save context) ...\n"
                  + "\n".join(lines[-3:]))            # ← keep the last 3 lines
        m["content"] = snipped                        # ← modify the message in place
        changed = True
    return changed

Implementation notes: the content field of each message is modified in place, without creating a new list; keeping the first 3 and last 3 lines lets the LLM still see how the command output began and how it ended, with the middle omitted.
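
The snipping rule can be exercised on a single string. This sketch mirrors the two guards in the excerpt (length and line count); the standalone function `snip` and its parameters are hypothetical.

```python
def snip(content: str, max_chars: int = 1500, keep: int = 3) -> str:
    """Keep the first and last `keep` lines of long multi-line text."""
    if len(content) <= max_chars:
        return content
    lines = content.splitlines()
    if len(lines) <= 2 * keep:
        return content                     # too few lines to snip anything
    return ("\n".join(lines[:keep])
            + f"\n... ({len(lines)} lines, snipped to save context) ...\n"
            + "\n".join(lines[-keep:]))


output = "\n".join(f"line {i}: " + "x" * 40 for i in range(1, 101))  # 100 long lines
snipped = snip(output)
one_long_line = snip("y" * 5000)           # long, but a single line
print(snipped.splitlines()[3])
```

Note the second case: because the rule works on lines, one very long line (for example minified output) passes through unchanged at this layer.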

Layer 2: LLM summary (context.py:96-117):

def _summarize_old(self, messages, llm=None, keep_recent=8) -> bool:
    if len(messages) <= keep_recent: return False
    old = messages[:-keep_recent]                     # ← split: older messages vs. the most recent 8 messages
    tail = messages[-keep_recent:]
    summary = self._get_summary(old, llm)             # ← summarize with the LLM or the fallback method
    messages.clear()                                  # ← empty the list
    messages.append({                                 # ← insert the compressed summary
        "role": "user",
        "content": f"[Context compressed - conversation summary]\n{summary}",
    })
    messages.append({                                 # ← an assistant reply that acknowledges the summary
        "role": "assistant",
        "content": "Got it, I have the context from our earlier conversation.",
    })
    messages.extend(tail)                             # ← keep the most recent messages
    return True

Implementation notes: messages.clear() followed by messages.extend() mutates the list object that was passed in (Agent.messages), so the Agent keeps holding the same list; the user/assistant pair is there so the LLM accepts the summarized context naturally.
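
The difference between mutating the caller's list and rebinding a local name is worth seeing side by side. Both helper functions below are hypothetical simplifications (they omit the assistant acknowledgment message and take the summary as a plain string).

```python
def compress_in_place(messages: list, summary: str, keep_recent: int = 8) -> None:
    tail = messages[-keep_recent:]
    messages.clear()                              # same list object, now empty
    messages.append({"role": "user", "content": f"[summary] {summary}"})
    messages.extend(tail)


def compress_by_rebinding(messages: list, summary: str, keep_recent: int = 8) -> list:
    # Rebinding the local name builds a new list; the caller's list is untouched.
    messages = [{"role": "user", "content": f"[summary] {summary}"}] + messages[-keep_recent:]
    return messages


history = [{"role": "user", "content": f"msg {i}"} for i in range(20)]
alias = history                                   # plays the role of Agent.messages

compress_by_rebinding(history, "earlier work")    # return value ignored, as a caller might
length_after_rebinding = len(alias)

compress_in_place(history, "earlier work")
print(length_after_rebinding, len(alias))
```

Because `maybe_compress` returns only a bool, in-place mutation is what lets the compression take effect on the agent's history without the agent reassigning anything.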

Summary generation + fallback (context.py:135-196):

def _get_summary(self, messages, llm=None) -> str:
    flat = self._flatten(messages)                    # ← flatten the message list into text
    if llm:
        try:
            resp = llm.chat(messages=[
                {"role": "system", "content": "Compress this conversation into a brief summary. "
                 "Preserve: file paths edited, key decisions made, errors encountered, current task state. "
                 "Drop: verbose command output, code listings, redundant back-and-forth."},
                {"role": "user", "content": flat[:15000]},   # ← take the first 15000 characters
            ])
            return resp.content
        except Exception: pass                        # ← fall back if the LLM call fails
    return self._extract_key_info(messages)           # ← regex-based fallback

Implementation notes: the LLM summary uses a dedicated system prompt that tells the model which key information to keep (file paths, decisions, errors); the input is cut to 15000 characters so the summary request itself does not grow too large; the fallback _extract_key_info uses the regex [\w./\-]+\.\w{1,5} to extract file paths and picks out lines containing "error".
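
A rough sketch of what such a regex fallback could look like is shown below. Only the path regex and the "lines containing error" rule come from the description above; the function body, the return shape, and the deduplication are assumptions, not CoreCoder's actual `_extract_key_info`.

```python
import re

PATH_RE = re.compile(r"[\w./\-]+\.\w{1,5}")   # regex quoted in the notes above


def extract_key_info(messages: list[dict]) -> tuple[list[str], list[str]]:
    """Hypothetical fallback: collect file-like paths and error lines."""
    paths: list[str] = []
    errors: list[str] = []
    for m in messages:
        text = m.get("content") or ""
        for p in PATH_RE.findall(text):
            if p not in paths:                # keep first-seen order, no duplicates
                paths.append(p)
        errors += [ln.strip() for ln in text.splitlines() if "error" in ln.lower()]
    return paths, errors


paths, errors = extract_key_info([
    {"role": "tool", "content": "Edited src/app.py\nTypeError: bad operand"},
    {"role": "assistant", "content": "ran tests in tests/test_app.py, all passed"},
])
print(paths, errors)
```

The regex is deliberately loose, so it will also match things that merely look like file names (version strings such as `1.2`, for instance); that is an acceptable trade-off for a last-resort summary.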

Session persistence implementation (session.py:15-41):

def save_session(messages: list[dict], model: str, session_id=None) -> str:
    SESSIONS_DIR.mkdir(parents=True, exist_ok=True)   # ← ~/.corecoder/sessions/
    if not session_id:
        session_id = f"session_{int(time.time())}"    # ← timestamp as the ID
    data = {"id": session_id, "model": model, "saved_at": time.strftime("%Y-%m-%d %H:%M:%S"),
            "messages": messages}
    path = SESSIONS_DIR / f"{session_id}.json"
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2))  # ← non-ASCII text is not escaped
    return session_id

def load_session(session_id: str) -> tuple[list[dict], str] | None:
    path = SESSIONS_DIR / f"{session_id}.json"
    if not path.exists(): return None
    data = json.loads(path.read_text())
    return data["messages"], data["model"]            # ← restore the conversation history + model name

Implementation notes: ensure_ascii=False writes Chinese (and other non-ASCII) content as readable text instead of \u escapes; loading returns a (messages, model) tuple, and the CLI restores a session by assigning agent.messages = loaded[0], which replaces the Agent's conversation history directly.
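
The round trip can be tried safely in a temporary directory. This sketch adapts the two functions to take the directory as a parameter and to pass `encoding="utf-8"` explicitly; both changes are assumptions made for a portable demo, not the source's signatures.

```python
import json
import tempfile
import time
from pathlib import Path


def save_session(sessions_dir: Path, messages: list[dict], model: str, session_id=None) -> str:
    sessions_dir.mkdir(parents=True, exist_ok=True)
    session_id = session_id or f"session_{int(time.time())}"
    data = {"id": session_id, "model": model,
            "saved_at": time.strftime("%Y-%m-%d %H:%M:%S"), "messages": messages}
    (sessions_dir / f"{session_id}.json").write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return session_id


def load_session(sessions_dir: Path, session_id: str):
    path = sessions_dir / f"{session_id}.json"
    if not path.exists():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    return data["messages"], data["model"]


original = [{"role": "user", "content": "你好, please fix the bug"}]
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "sessions"
    sid = save_session(root, original, "gpt-4o", session_id="demo")
    raw = (root / "demo.json").read_text(encoding="utf-8")
    restored = load_session(root, sid)
    missing = load_session(root, "no-such-session")

print(restored, missing)
```

With `ensure_ascii=False` the Chinese greeting appears verbatim in the file, which keeps saved sessions readable in a text editor.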

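5.3 Token Estimation

def _approx_tokens(text: str) -> int:
    # The source comment cites about 3.5 characters per token for mixed Chinese/English text;
    # integer division by 3 therefore errs slightly on the high side.
    return len(text) // 3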

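6. Hook Interception and Safety System

6.1 Shell Command Safety Interception

File: corecoder/tools/bash.py

This is CoreCoder's main line of defense: BashTool.execute runs every command through _check_dangerous before executing it: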

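Blocked pattern	Regex	Reason
`rm -rf /`	`\brm\s+(-\w*)?-rf\s`	Forced recursive delete
`rm -r ~`	`\brm\s+(-\w*)?-r\w*\s+(/|~|\$HOME)`	Recursive delete on home/root
`mkfs`	`\bmkfs\b`	Formats a filesystem
`dd of=/dev/`	`\bdd\s+.*of=/dev/`	Raw write to a disk device
`> /dev/sda`	`>\s*/dev/sd[a-z]`	Overwrites a block device
`chmod 777 /`	`\bchmod\s+(-R\s+)?777\s+/`	Opens all permissions on the root directory
fork bomb	`:\(\)\s*\{.*:\|:.*\}`	Fork bomb
`curl | bash`	`\bcurl\b.*\|\s*(sudo\s+)?bash`	Pipes curl output into bash
`wget | bash`	`\bwget\b.*\|\s*(sudo\s+)?bash`	Pipes wget output into bash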

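Result of interception: the tool returns ⚠ Blocked: {reason} instead of executing the command.

6.2 Output Truncation Protection

Scenario	Threshold	Strategy
Bash output	15,000 characters	Keep the first 6000 + last 3000 characters
grep matches	200	Stop once the cap is reached
glob matches	100	Show the total count when exceeded
Sub-agent output	5,000 characters	Truncate to 4500 + notice
edit diff	3,000 characters	Truncate to 2500

6.3 edit_file Safety Mechanisms

Uniqueness check: old_string must occur exactly once in the file
- 0 occurrences → returns "not found" + a preview of the first 500 characters of the file
- More than 1 occurrence → returns "appears N times" + a request for more surrounding context
Change tracking: every edit/write is recorded in _changed_files and can be reviewed with the /diff command

6.4 Sub-agent Isolation

A sub-agent cannot recursively create sub-agents (tools=[t for t in parent.tools if t.name != "agent"])
A sub-agent has its own context window and does not pollute the parent agent's conversation history
Sub-agent output is truncated once it exceeds 5000 characters

6.5 Working-Directory Tracking

BashTool tracks cd commands via _update_cwd so that later commands run in the correct directory. cd inside && command chains is supported.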

🔗 Code linkage: how safety interception is implemented

Dangerous command detection (tools/bash.py:19-29,95-100):

_DANGEROUS_PATTERNS = [
    (r"\brm\s+(-\w*)?-r\w*\s+(/|~|\$HOME)", "recursive delete on home/root"),
    (r"\brm\s+(-\w*)?-rf\s", "force recursive delete"),
    (r"\bmkfs\b", "format filesystem"),
    (r"\bdd\s+.*of=/dev/", "raw disk write"),
    (r">\s*/dev/sd[a-z]", "overwrite block device"),
    (r"\bchmod\s+(-R\s+)?777\s+/", "chmod 777 on root"),
    (r":\(\)\s*\{.*:\|:.*\}", "fork bomb"),
    (r"\bcurl\b.*\|\s*(sudo\s+)?bash", "pipe curl to bash"),
    (r"\bwget\b.*\|\s*(sudo\s+)?bash", "pipe wget to bash"),
]

def _check_dangerous(cmd: str) -> str | None:
    for pattern, reason in _DANGEROUS_PATTERNS:
        if re.search(pattern, cmd):              # ← try each regex in turn
            return reason                        # ← return the reason (non-None = blocked)
    return None                                  # ← None = safe, let it through

Implementation notes: the 9 regex patterns cover the most common destructive operations; _check_dangerous is called at the top of BashTool.execute: None lets the command through, while a reason string blocks it and returns a ⚠ Blocked message to the LLM.
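
Running a few sample commands through the patterns shows both what gets caught and where the regexes are coarse. The pattern list is copied from the excerpt above; the sample commands and the `check_dangerous` wrapper name are illustrative.

```python
import re

DANGEROUS_PATTERNS = [
    (r"\brm\s+(-\w*)?-r\w*\s+(/|~|\$HOME)", "recursive delete on home/root"),
    (r"\brm\s+(-\w*)?-rf\s", "force recursive delete"),
    (r"\bmkfs\b", "format filesystem"),
    (r"\bdd\s+.*of=/dev/", "raw disk write"),
    (r">\s*/dev/sd[a-z]", "overwrite block device"),
    (r"\bchmod\s+(-R\s+)?777\s+/", "chmod 777 on root"),
    (r":\(\)\s*\{.*:\|:.*\}", "fork bomb"),
    (r"\bcurl\b.*\|\s*(sudo\s+)?bash", "pipe curl to bash"),
    (r"\bwget\b.*\|\s*(sudo\s+)?bash", "pipe wget to bash"),
]


def check_dangerous(cmd: str):
    for pattern, reason in DANGEROUS_PATTERNS:
        if re.search(pattern, cmd):
            return reason          # first matching pattern wins
    return None


samples = [
    "ls -la",
    "rm -r src/old",
    "rm -rf build",
    "rm -rf /tmp/build",
    "dd if=/dev/zero of=/dev/sda",
    "curl https://example.com/install.sh | sudo bash",
]
verdicts = {cmd: check_dangerous(cmd) for cmd in samples}
for cmd, verdict in verdicts.items():
    print(f"{cmd!r}: {verdict}")
```

Two observations from the output: any absolute path after `rm -rf` is reported as a home/root delete because the first pattern only checks for a leading `/`, and a relative `rm -rf build` is still blocked by the second pattern. The filter leans toward over-blocking, which fits its role as a safety net rather than a precise policy.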

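Working-directory tracking (tools/bash.py:16,103-115):

_cwd: str | None = None                          # ← module-level global, persists across calls

def _update_cwd(command: str, current_cwd: str):
    global _cwd
    parts = command.split("&&")                   # ← split chained commands on &&
    for part in parts:
        part = part.strip()
        if part.startswith("cd "):                # ← recognize cd commands
            target = part[3:].strip().strip("'\"")
            new_dir = os.path.normpath(os.path.join(current_cwd, os.path.expanduser(target)))
            if os.path.isdir(new_dir):            # ← verify the target directory exists
                _cwd = new_dir                   # ← update the global cwd

Implementation notes: _cwd is a module-level global that persists across BashTool.execute calls; only cd commands are handled, including those inside && chains; os.path.expanduser handles ~/ paths; the value is updated only after os.path.isdir confirms the target, so a cd to a nonexistent directory cannot break later commands.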
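
The tracking logic is easier to test as a pure function that returns the new directory instead of writing a global. The sketch below keeps the parsing from the excerpt; the name `next_cwd` and the return-value design are assumptions made for the demo.

```python
import os
import tempfile


def next_cwd(command: str, current_cwd: str) -> str:
    """Return the working directory after `command`, tracking `cd` in && chains."""
    new_cwd = current_cwd
    for part in command.split("&&"):
        part = part.strip()
        if part.startswith("cd "):
            target = part[3:].strip().strip("'\"")
            candidate = os.path.normpath(
                os.path.join(current_cwd, os.path.expanduser(target)))
            if os.path.isdir(candidate):   # ignore cd into directories that do not exist
                new_cwd = candidate
    return new_cwd


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "pkg"))
    expected = os.path.normpath(os.path.join(root, "pkg"))
    entered = next_cwd("cd pkg && ls", root)
    missing = next_cwd("cd does-not-exist", root)
    unrelated = next_cwd("echo hi", root)
    root_path = root

print(entered == expected, missing == root_path, unrelated == root_path)
```

As in the excerpt, only `&&` is used as a separator and every `cd` is resolved against the directory the command started in, so chains joined with `;` or consecutive relative `cd` steps are not followed.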

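edit_file uniqueness safety implementation (tools/edit.py:51-63):

occurrences = content.count(old_string)           # ← O(n) substring count

if occurrences == 0:
    preview = content[:500] + ("..." if len(content) > 500 else "")
    return f"Error: old_string not found in {file_path}.\nFile starts with:\n{preview}"
    # ↑ returns a preview of the first 500 characters to help the LLM find the right text
if occurrences > 1:
    return (f"Error: old_string appears {occurrences} times in {file_path}. "
            f"Include more surrounding lines to make it unique.")
    # ↑ refuses the edit and asks the LLM to add context until old_string is unique

Implementation notes: returning a file preview on 0 matches is the important part: the LLM may have written old_string from memory, and seeing the actual content lets it correct itself; with more than 1 match the edit is refused to avoid changing the wrong place.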

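7. Scheduling System

7.1 Tool Scheduling

File: corecoder/agent.py

LLM returns tool_calls
        │
        ├── single tool call → run inline via _exec_tool(tc)
        │
        └── multiple tool calls → run in parallel via _exec_tools_parallel(tool_calls)
                              │
                              ThreadPoolExecutor(max_workers=8)
                              │
                              pool.submit(_exec_tool, tc) × N
                              │
                              collect all Future results

Scheduling strategy:

The LLM returns 1 tool call → execute it directly (no thread-pool overhead)
The LLM returns N tool calls → execute them in parallel on a thread pool (at most 8 threads)
A context compression check is triggered after every round of tool execution

7.2 LLM Scheduling (Retry + Backoff)

File: corecoder/llm.py

API call
  │
  ├── success → return the streaming response
  │
  ├── RateLimitError → wait 2^attempt seconds → retry (at most 3 attempts in total)
  ├── APITimeoutError → wait 2^attempt seconds → retry
  ├── APIConnectionError → wait 2^attempt seconds → retry
  ├── APIError(5xx) → wait 2^attempt seconds → retry
  └── APIError(4xx) → raise immediately (client errors are not retried)

7.3 Context Compression Scheduling

The compression check is triggered at two points:

After a user message: in chat(), right after the user message is appended
After tool execution: once each round of tool calls has finished

Compression escalates layer by layer and stops as soon as usage after a layer falls below the next layer's threshold.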

🔗 代码联动：调度系统如何实现

工具调度完整链路：

复制代码

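LLM returns tool_calls (llm.py:165-172, parses tool calls accumulated across chunks)
    ↓
Agent.chat (agent.py:66-86, decides single vs. multiple)
    ↓ single
Agent._exec_tool (agent.py:93-103)
    → get_tool(tc.name) looks up the tool instance (tools/__init__.py:22-26)
    → tool.execute(**tc.arguments) runs it (each tool's execute method)
    → the result is appended as a role="tool" message
    ↓ multiple
Agent._exec_tools_parallel (agent.py:105-118)
    → ThreadPoolExecutor(max_workers=8) creates the thread pool
    → pool.submit(self._exec_tool, tc) submits N tasks
    → f.result() blocks until each result is ready
    → results are appended as role="tool" messages in order (matching the order of tool_calls)
    ↓
Agent.chat (agent.py:89)
    → context.maybe_compress() runs the compression check
    → control returns to the LLM call and the loop continues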

LLM retry scheduling (llm.py:184-199):

def _call_with_retry(self, params, max_retries=3):
    for attempt in range(max_retries):              # ← at most 3 attempts
        try:
            return self.client.chat.completions.create(**params)  # ← the actual API call
        except (RateLimitError, APITimeoutError, APIConnectionError) as e:
            if attempt == max_retries - 1: raise    # ← re-raise if the last attempt fails
            time.sleep(2 ** attempt)                 # ← exponential backoff: 1s→2s
        except APIError as e:
            if e.status_code >= 500 and attempt < max_retries - 1:
                time.sleep(2 ** attempt)             # ← retry on 5xx
            else:
                raise                                # ← no retry on 4xx (or on the last attempt)

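Implementation notes: 2 ** attempt gives exponential backoff: a 1s wait after the first failure (attempt 0) and a 2s wait after the second (attempt 1). With max_retries=3 the third failure (attempt 2) is re-raised without waiting, so a 4s wait never occurs. Transient errors (rate limit, timeout, connection) and 5xx server errors are retried; 4xx client errors are not.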
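The backoff schedule can be checked without any network access. This is a minimal sketch under assumptions: TransientError stands in for the SDK's rate-limit, timeout, and connection errors, and the sleep parameter is an invented hook so the delays can be recorded instead of actually waited.

```python
import time


class TransientError(Exception):
    """Stand-in for rate-limit, timeout, and connection errors."""


def call_with_retry(fn, max_retries=3, sleep=time.sleep):
    for attempt in range(max_retries):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # last attempt: give up without waiting
            sleep(2 ** attempt)  # waits of 1s, then 2s


delays = []
attempts = {"n": 0}


def flaky():
    # Fails twice, then succeeds on the third attempt
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("busy")
    return "ok"


print(call_with_retry(flaky, sleep=delays.append), delays)
```

Passing delays.append as the sleep function records the requested waits, which makes it easy to see that three attempts produce only two delays.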

Compression timing: two call sites in Agent.chat (agent.py:50,89):

def chat(self, user_input, on_token=None, on_tool=None) -> str:
    self.messages.append({"role": "user", "content": user_input})
    self.context.maybe_compress(self.messages, self.llm)   # ← (1) after the user message
    for _ in range(self.max_rounds):
        # ... LLM call + tool execution ...
        self.context.maybe_compress(self.messages, self.llm)  # ← (2) after tool execution

8. Configuration System

File: corecoder/config.py

8.1 Configuration options

Option	Environment variable	Default	Description
`model`	`CORECODER_MODEL`	`gpt-4o`	Model name
`api_key`	`CORECODER_API_KEY` / `OPENAI_API_KEY` / `DEEPSEEK_API_KEY`	empty	API key (the first non-empty source wins)
`base_url`	`OPENAI_BASE_URL` / `CORECODER_BASE_URL`	None	API endpoint URL
`max_tokens`	`CORECODER_MAX_TOKENS`	4096	Maximum tokens to generate
`temperature`	`CORECODER_TEMPERATURE`	0.0	Sampling temperature
`max_context_tokens`	`CORECODER_MAX_CONTEXT`	128000	Context window size in tokens

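8.2 .env loading

A .env file is loaded automatically with python-dotenv. The loader checks the current directory first and then walks up through parent directories, stopping when it reaches ~ (judging by the loop in config.py, the home directory itself is not searched unless it is the current directory). Environment variables that already exist are not overwritten.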

🔗 Code walkthrough: how the configuration system is implemented

Config dataclass + environment variable loading (config.py:28-55):

@dataclass
class Config:
    model: str = "gpt-4o"
    api_key: str = ""
    base_url: str | None = None
    max_tokens: int = 4096
    temperature: float = 0.0
    max_context_tokens: int = 128_000

    @classmethod
    def from_env(cls) -> "Config":
        _load_dotenv()                                # ← load the .env file
        api_key = (
            os.getenv("CORECODER_API_KEY")            # ← priority 1: project-specific variable
            or os.getenv("OPENAI_API_KEY")            # ← priority 2: standard OpenAI variable
            or os.getenv("DEEPSEEK_API_KEY")          # ← priority 3: DeepSeek variable
            or ""
        )
        return cls(
            model=os.getenv("CORECODER_MODEL", "gpt-4o"),       # ← defaults to gpt-4o
            api_key=api_key,
            base_url=os.getenv("OPENAI_BASE_URL") or os.getenv("CORECODER_BASE_URL"),
            max_tokens=int(os.getenv("CORECODER_MAX_TOKENS", "4096")),
            temperature=float(os.getenv("CORECODER_TEMPERATURE", "0")),
            max_context_tokens=int(os.getenv("CORECODER_MAX_CONTEXT", "128000")),
        )

.env file lookup (config.py:8-25):

def _load_dotenv():
    try:
        from dotenv import load_dotenv
        env_path = Path(".env")
        if not env_path.exists():
            cur = Path.cwd()
            home = Path.home()
            while cur != home and cur != cur.parent:  # ← walk up from cwd, stopping at the home directory
                candidate = cur / ".env"
                if candidate.exists():
                    env_path = candidate
                    break
                cur = cur.parent
        load_dotenv(env_path, override=False)         # ← override=False: keep existing environment variables
    except ImportError:
        pass                                          # ← no error if python-dotenv is not installed

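Implementation notes: the API key can come from three environment variables, and the or chain picks the first non-empty one in priority order. The .env lookup walks up from the current directory and stops when it reaches ~, which suits monorepos and similar layouts. override=False ensures that explicitly set environment variables are not replaced by values from .env. Silently ignoring ImportError means the loader still works when python-dotenv is not installed.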

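9. Interactive REPL

File: corecoder/cli.py

9.1 Two run modes

Mode	How to start	Description
Interactive	`corecoder`	Enters the REPL loop; supports multi-turn conversation
One-shot	`corecoder -p "prompt"`	Runs once and exits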

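9.2 Built-in commands

Command	Function
`/help`	Show help
`/reset`	Clear the conversation history
`/model`	Show the current model
`/model <name>`	Switch models (no restart needed)
`/tokens`	Show token usage and cost
`/compact`	Trigger context compression manually
`/diff`	Show files modified in this session
`/save`	Save the session to disk
`/sessions`	List saved sessions
`quit` / `exit`	Exit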

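9.3 Input experience

Enter: submit the message
Esc + Enter: insert a newline (handy for pasting code blocks)
Command history: persisted to ~/.corecoder_history
Streaming output: the LLM response is displayed token by token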

🔗 Code walkthrough: how the REPL is implemented

Entry point and initialization (cli.py:37-94):

def main():
    args = _parse_args()
    config = Config.from_env()                       # ← load configuration from environment variables
    # CLI arguments override environment variables
    if args.model: config.model = args.model
    if args.base_url: config.base_url = args.base_url
    if args.api_key: config.api_key = args.api_key

    llm = LLM(model=config.model, api_key=config.api_key, base_url=config.base_url,
              temperature=config.temperature, max_tokens=config.max_tokens)
    agent = Agent(llm=llm, max_context_tokens=config.max_context_tokens)

    if args.resume:                                  # ← resume a session
        loaded = load_session(args.resume)
        if loaded:
            agent.messages, loaded_model = loaded    # ← replace the Agent's message list directly
            if not args.model:
                agent.llm.model = loaded_model       # ← restore the model name

    if args.prompt:                                  # ← one-shot mode
        _run_once(agent, args.prompt); return
    _repl(agent, config)                             # ← interactive REPL

REPL main loop (cli.py:109-230):

def _repl(agent, config):
    # Key bindings
    kb = KeyBindings()
    @kb.add("enter")
    def _submit(event): event.current_buffer.validate_and_handle()    # ← Enter submits
    @kb.add("escape", "enter")
    def _newline(event): event.current_buffer.insert_text("\n")       # ← Esc+Enter inserts a newline

    while True:
        user_input = pt_prompt("You > ", history=history, multiline=True,
                               key_bindings=kb, prompt_continuation="...  ").strip()

        # Built-in command dispatch
        if user_input == "/reset": agent.reset(); continue
        if user_input == "/tokens":
            p = agent.llm.total_prompt_tokens; c = agent.llm.total_completion_tokens
            # ... show token usage + cost estimate ...
        if user_input == "/compact":
            before = estimate_tokens(agent.messages)
            compressed = agent.context.maybe_compress(agent.messages, agent.llm)  # ← manual compression
            after = estimate_tokens(agent.messages)
            # ... show the compression result ...
        if user_input == "/save":
            sid = save_session(agent.messages, config.model)         # ← save the session
        if user_input == "/diff":
            from .tools.edit import _changed_files                   # ← show modified files
            # ... list _changed_files ...

        # Call the Agent
        def on_token(tok): streamed.append(tok); print(tok, end="", flush=True)
        def on_tool(name, kwargs): console.print(f"\n[dim]> {name}({_brief(kwargs)})[/dim]")
        response = agent.chat(user_input, on_token=on_token, on_tool=on_tool)

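Implementation notes: prompt_toolkit provides multiline input, history, and key bindings. Built-in commands are handled by an if chain, one branch per command. The on_token callback implements streaming output by printing each token as it arrives, and the on_tool callback prints tool-call information (for example > bash(command="ls")). /compact triggers maybe_compress manually, and /diff reads the module-level _changed_files set in edit.py.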
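For comparison, the same slash-command handling can be written as a lookup table. This is a swapped-in technique, not what cli.py does (it uses an if chain); make_dispatcher, the handler names, and the state dict are all hypothetical.

```python
def make_dispatcher(handlers):
    """Return a function that runs the handler for a slash command, if one exists."""
    def dispatch(line: str) -> bool:
        name, _, arg = line.strip().partition(" ")
        handler = handlers.get(name)
        if handler is None:
            return False  # not a built-in command: the caller forwards it to the agent
        handler(arg.strip())
        return True
    return dispatch


# Hypothetical REPL state standing in for the agent and its LLM
state = {"model": "gpt-4o", "messages": ["hi"]}


def cmd_reset(arg):
    state["messages"].clear()


def cmd_model(arg):
    if arg:  # "/model <name>" switches; a bare "/model" leaves the model unchanged
        state["model"] = arg


dispatch = make_dispatcher({"/reset": cmd_reset, "/model": cmd_model})
dispatch("/model deepseek-chat")
print(state["model"], dispatch("fix the bug in foo.py"))
```

A table keeps each command in its own function, at the cost of one more indirection than the if chain in the excerpt.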

10. Session Persistence

File: corecoder/session.py

Storage path: ~/.corecoder/sessions/
Format: JSON containing id, model, saved_at, and messages
Resume: corecoder -r session_id
Listing limit: at most 20 sessions are shown, newest first
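A save/load/list round trip along these lines might look like the following sketch. The function names and the (messages, model) return shape follow the cli.py excerpt, but the directory parameter, the id format, and the use of a Unix timestamp for saved_at are assumptions made for illustration.

```python
import json
import time
import uuid
from pathlib import Path


def save_session(messages, model, directory: Path) -> str:
    directory.mkdir(parents=True, exist_ok=True)
    sid = uuid.uuid4().hex[:8]  # assumed id format
    payload = {"id": sid, "model": model, "saved_at": time.time(), "messages": messages}
    (directory / f"{sid}.json").write_text(
        json.dumps(payload, ensure_ascii=False), encoding="utf-8")
    return sid


def load_session(sid: str, directory: Path):
    path = directory / f"{sid}.json"
    if not path.exists():
        return None  # unknown session id
    data = json.loads(path.read_text(encoding="utf-8"))
    return data["messages"], data["model"]


def list_sessions(directory: Path, limit: int = 20):
    if not directory.exists():
        return []
    items = [json.loads(p.read_text(encoding="utf-8")) for p in directory.glob("*.json")]
    items.sort(key=lambda s: s["saved_at"], reverse=True)  # newest first
    return items[:limit]
```

Returning None for a missing session lets the caller fall back to a fresh conversation, which matches the `if loaded:` guard in the cli.py excerpt.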

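11. Testing

Files: tests/test_core.py + tests/test_tools.py

Coverage

Module	What is tested
Config	Environment variable loading, defaults
Context	Token estimation, Layer 1 truncation, overall compression
Session	Save / load / list / missing session
LLM	Cost estimation for known models; unknown models return None
edit_file	Uniqueness check, change tracking
Bash	Basic execution, exit code, timeout, dangerous-command blocking (rm -rf, fork bomb, curl pipe)
read_file	Read, missing file, offset/limit
write_file	Write, automatic directory creation
glob	Match, no match
grep	Match, invalid regex, nonexistent path
agent tool	Schema validation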

🔗 Code walkthrough: full startup chain and data flow

The full startup chain from the command line to the Agent:

$ corecoder -m gpt-4o
    ↓ pyproject.toml:45 defines the entry point
    ↓ [project.scripts] corecoder = "corecoder.cli:main"
    ↓
cli.py:main()
    ↓ _parse_args() parses command-line arguments
    ↓ Config.from_env() loads environment variables
    ↓ LLM(model, api_key, base_url, ...) creates the LLM instance
    ↓ Agent(llm=llm, ...) creates the agent
    ↓   ├── ALL_TOOLS registers the 7 tools (tools/__init__.py:11)
    ↓   ├── system_prompt(self.tools) builds the system prompt (prompt.py:7)
    ↓   └── AgentTool._parent_agent = self binds the sub-agent
    ↓
_repl(agent, config) enters the interactive loop
    ↓ while True:
    ↓   user_input = pt_prompt("You > ") reads user input
    ↓   agent.chat(user_input, on_token=..., on_tool=...) calls the agent
    ↓     ├── messages.append(user_msg)
    ↓     ├── context.maybe_compress() compression check
    ↓     ├── llm.chat(messages, tools) calls the LLM
    ↓     │     ├── _call_with_retry() API call with retry
    ↓     │     └── accumulates streamed content + tool_calls
    ↓     ├── tool.execute(**args) runs the tool
    ↓     └── returns the final text reply

Example data flow for one complete tool-calling exchange:

User: "Change return 42 to return 99 in foo.py"
    ↓
Agent.chat → LLM.chat → OpenAI API (stream)
    ↓ LLM returns: tool_calls=[{name:"read_file", args:{file_path:"foo.py"}}]
    ↓
Agent._exec_tool → get_tool("read_file") → ReadFileTool.execute(file_path="foo.py")
    ↓ returns: "1→def foo():\n2→    return 42\n"
    ↓
messages.append({role:"tool", content:"1→def foo():\n2→    return 42\n"})
    ↓
Agent.chat → LLM.chat → OpenAI API (stream)
    ↓ LLM returns: tool_calls=[{name:"edit_file", args:{file_path:"foo.py", old_string:"return 42", new_string:"return 99"}}]
    ↓
Agent._exec_tool → get_tool("edit_file") → EditFileTool.execute(...)
    ↓ content.count("return 42") == 1 ✓ uniqueness check passes
    ↓ content.replace("return 42", "return 99", 1) performs the replacement
    ↓ _changed_files.add("foo.py") records the change
    ↓ _unified_diff() generates the diff
    ↓ returns: "Edited foo.py\n--- a/foo.py\n+++ b/foo.py\n-return 42\n+return 99\n"
    ↓
Agent.chat → LLM.chat → OpenAI API (stream)
    ↓ LLM returns: content="Changed return 42 to return 99 in foo.py." (no tool_calls)
    ↓
Returned to the user
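The message history that this exchange builds up can be written out explicitly. The sketch below uses the OpenAI Chat Completions message shapes (an assistant message carrying tool_calls, followed by a role="tool" message with the matching tool_call_id); how CoreCoder stores these internally is an assumption, and tool_round plus the call ids are hypothetical.

```python
import json


def tool_round(call_id: str, name: str, args: dict, result: str) -> list[dict]:
    """Build the assistant/tool message pair that one tool call adds to the history."""
    return [
        {"role": "assistant", "content": None,
         "tool_calls": [{"id": call_id, "type": "function",
                         "function": {"name": name, "arguments": json.dumps(args)}}]},
        {"role": "tool", "tool_call_id": call_id, "content": result},
    ]


messages = [{"role": "user", "content": "Change return 42 to return 99 in foo.py"}]
messages += tool_round("call_1", "read_file", {"file_path": "foo.py"},
                       "1→def foo():\n2→    return 42\n")
messages += tool_round("call_2", "edit_file",
                       {"file_path": "foo.py", "old_string": "return 42",
                        "new_string": "return 99"},
                       "Edited foo.py\n--- a/foo.py\n+++ b/foo.py\n-return 42\n+return 99\n")
messages.append({"role": "assistant",
                 "content": "Changed return 42 to return 99 in foo.py."})

print([m["role"] for m in messages])
```

Each tool result is tied to its request through tool_call_id, which is why results have to be appended in a way that keeps the pairing intact.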

12. Dependency Graph

                  cli.py (entry point)
                 /    |    \   \
              agent  config  session  prompt
             /   \      |              |
           llm   tools  .env         (dynamic)
          /       |
    OpenAI SDK   base.py (ABC)
                 /  |  \  \  \  \
               bash read write edit glob grep agent
                           |
                     _changed_files (shared across tools)

External dependencies (only 4):

openai >= 1.0: LLM API client
rich >= 13.0: styled terminal output
prompt_toolkit >= 3.0: REPL interaction
python-dotenv >= 1.0: .env file loading

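13. Architectural Highlights and Design Philosophy

Feature	Description
Minimalism	A complete coding agent in ~1400 lines of code
OpenAI compatible	One codebase works with any compatible API (DeepSeek/Qwen/Kimi/Ollama, etc.)
Precise editing	Search-and-replace instead of line numbers or full rewrites; safe and reviewable
Three-layer compression	Progressive context management that balances efficiency and information retention
Parallel execution	Multiple tool calls run in parallel automatically, improving throughput
Safety interception	Blocks 9 dangerous command patterns, plus output truncation protection
Sub-agents	Delegates subtasks to agents with independent context; supports hierarchical task decomposition
Session resume	Conversations can be saved and restored, supporting long-running workflows
Zero framework dependencies	No LangChain/LlamaIndex or similar frameworks; plain SDK calls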