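Starting from curl: Hand-Building an Agent Runtime on the DeepSeek API, Without an SDK

Goal: no OpenAI SDK, Codex SDK, or DeepSeek SDK, and no Node/Python program up front. Use only curl, JSON files, and jq/grep to understand how an agent runs end to end.

0. Why this route fits your current goal

What you want to learn right now is not "how to call a model" but:

how an agent actually communicates with the model;
why the model never really executes a tool;
what the tool-calling loop actually is;
where memory and the knowledge base each belong;
what harness engineering is responsible for in practice;
how to later translate these curl requests into your own SDK / runtime / loop.

So we write no application code yet. We do one thing only: break every agent run into observable HTTP requests and JSON files.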

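1. Core facts about the DeepSeek API

The DeepSeek API exposes an OpenAI-compatible, Chat Completions style interface. The base URL is:

https://api.deepseek.com

The main endpoint is:

POST /chat/completions

Commonly used models:

deepseek-chat       # non-thinking mode; suited to ordinary chat, JSON output, and a first look at tool calling
deepseek-reasoner   # thinking mode; suited to complex reasoning, learn it later

For now, use:

deepseek-chat

Reasons:

its behavior is closer to plain Chat Completions;
the tool-calling structure is easier to follow;
you do not have to handle reasoning_content yet;
it is a better fit for building a first agent harness.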

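2. Prepare a local lab directory

All you need is a command-line environment.

Suggested tools:

curl    # send HTTP requests
jq      # inspect and extract JSON
grep    # simulate knowledge-base retrieval

Create the directories (the brace expansion needs bash or zsh):

mkdir -p deepseek-agent-lab/{requests,responses,state,kb,memory,traces,evals,tools}
cd deepseek-agent-lab

Set the API key:

export DEEPSEEK_API_KEY="your DeepSeek API key"

Do not put the API key in Markdown files, Git repositories, chat logs, or request JSON files.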

3. Layer 1: the minimal model call

3.1 Write the request JSON

Create the file:

cat > requests/001_basic.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a rigorous agent engineering mentor. Answer with a clear structure and do not make things up."
    },
    {
      "role": "user",
      "content": "Explain what an Agent Runtime is in three sentences."
    }
  ],
  "stream": false
}
JSON

Send the request:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/001_basic.json \
  | tee responses/001_basic.json

View the model's reply:

jq -r '.choices[0].message.content' responses/001_basic.json

View token usage:

jq '.usage' responses/001_basic.json

3.2 Concepts to understand

This step involves only three objects:

messages     the context you send to the model
model        the model you choose
response     the result the model returns

There is no agent yet. This is just an ordinary LLM call.

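4. Layer 2: messages and multi-turn conversation

DeepSeek's /chat/completions is a stateless API: the server does not remember the previous turn for you. If you want the model to "remember", you must send the earlier messages again.

4.1 The second-turn request

Suppose the model answered the first turn with:

An Agent Runtime is the deterministic shell that drives an agent. It manages model calls, tool calls, context, state, and safety policy. The model only decides what it wants to do; the runtime decides whether that is allowed and how it is executed.

Create the second-turn request:

cat > requests/002_multi_turn.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a rigorous agent engineering mentor. Answer with a clear structure and do not make things up."
    },
    {
      "role": "user",
      "content": "Explain what an Agent Runtime is in three sentences."
    },
    {
      "role": "assistant",
      "content": "An Agent Runtime is the deterministic shell that drives an agent. It manages model calls, tool calls, context, state, and safety policy. The model only decides what it wants to do; the runtime decides whether that is allowed and how it is executed."
    },
    {
      "role": "user",
      "content": "Then how is it different from the Agent itself?"
    }
  ],
  "stream": false
}
JSON

Send it: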

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/002_multi_turn.json \
  | tee responses/002_multi_turn.json

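4.2 Concepts to understand

Real "conversation memory" is not the model remembering on its own. It is just you passing the message history back in on the next request.

In later engineering work this becomes:

ConversationStore
ContextBuilder
MemoryManager
Compaction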
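
The replay of history can also be done with jq instead of hand-editing. The sketch below is only illustrative: `next_turn_request` is a hypothetical helper name, and it assumes the response file has the `choices[0].message.content` shape used in this article. It copies only `role` and `content` of the assistant reply, so any extra fields in the response are dropped.

```bash
# next_turn_request REQUEST_FILE RESPONSE_FILE USER_TEXT
# Prints a new request: the old messages, plus the assistant reply taken
# from RESPONSE_FILE, plus the next user message.
# (Hypothetical helper, not part of any DeepSeek tooling.)
next_turn_request() {
  jq --slurpfile resp "$2" --arg user "$3" '
    .messages += [
      {role: "assistant", content: $resp[0].choices[0].message.content},
      {role: "user", content: $user}
    ]
  ' "$1"
}
```

A possible use: `next_turn_request requests/001_basic.json responses/001_basic.json "Then how is it different from the Agent itself?" > requests/002_multi_turn.json`. This is the smallest form of what a ConversationStore does: the history lives in files you own, not on the server.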

5. Layer 3: structured JSON output

Agent engineering cannot rely on natural language alone. You need the model to emit structured results so the harness can decide what to do next.

DeepSeek supports:

"response_format": { "type": "json_object" }

Note two things:

the prompt must explicitly ask for JSON output;
max_tokens must be large enough, otherwise the JSON may be truncated.

5.1 JSON output request

cat > requests/003_json_output.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are an agent engineering analyzer. You must output valid json and no markdown. JSON format: {\"concept\": string, \"definition\": string, \"engineering_role\": string, \"common_mistake\": string}"
    },
    {
      "role": "user",
      "content": "Analyze the concept: Tool Calling. Please output json."
    }
  ],
  "response_format": {
    "type": "json_object"
  },
  "max_tokens": 800,
  "stream": false
}
JSON

Send it:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/003_json_output.json \
  | tee responses/003_json_output.json

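Extract the JSON content:

jq -r '.choices[0].message.content' responses/003_json_output.json | jq .

5.2 Concepts to understand

This step corresponds to these parts of a future codebase:

StructuredOutput
SchemaValidation
PlannerOutput
RouterDecision
EvalResult

You will start to see that agent engineering is not about letting the model "improvise". It is about making the model emit a verifiable structure at certain key points.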
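
To make "verifiable" concrete, here is a minimal validation sketch in shell. It is an assumption about what a harness check could look like, not a DeepSeek feature: `validate_structured_reply` is a hypothetical helper that rejects a truncated reply (`finish_reason` of `length`), rejects content that is not valid JSON, and checks that the keys you list are present.

```bash
# validate_structured_reply RESPONSE_FILE KEY...
# Returns 0 only if the reply is complete, parses as JSON, and has every KEY.
# (Hypothetical helper; key names are whatever your prompt asked for.)
validate_structured_reply() {
  resp=$1
  shift
  reason=$(jq -r '.choices[0].finish_reason' "$resp")
  if [ "$reason" = "length" ]; then
    echo "truncated: finish_reason=length, consider raising max_tokens" >&2
    return 1
  fi
  # fromjson makes jq exit non-zero when the content is not valid JSON.
  content=$(jq -e '.choices[0].message.content | fromjson' "$resp") || {
    echo "content is not valid JSON" >&2
    return 1
  }
  for key in "$@"; do
    printf '%s' "$content" | jq -e --arg k "$key" 'has($k)' >/dev/null || {
      echo "missing key: $key" >&2
      return 1
    }
  done
}
```

For example: `validate_structured_reply responses/003_json_output.json concept definition engineering_role common_mistake`. It checks only key presence, not value types; a real SchemaValidation layer would go further.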

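6. Layer 4: the real tool-calling loop

This is the layer you must fully absorb to become an agent engineer.

6.1 The single most important sentence

The model never actually executes a tool. It only returns:

which tool I want to call, and with what arguments.

What actually executes the tool is your harness.

6.2 First request: give the model the tool definitions

First define a mock tool:

kb_search

It means: search the local knowledge base.

Create a local knowledge-base file:

cat > kb/agent_runtime.md <<'MD'
# Agent Runtime

The Agent Runtime is the deterministic shell that runs an agent. It is responsible for model calls, tool execution, context assembly, state management, approval, safety boundaries, trace, and eval.

An agent should not hold execution rights directly. The model may propose a tool call, but the harness must validate, execute, and record it.
MD

Now create the request with tools: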

cat > requests/004_tool_call_step1.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
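    {
      "role": "system",
      "content": "You are an agent engineering assistant. When the user asks about the contents of the local knowledge base, call kb_search first. Do not pretend you have already read the knowledge base."
    },
    {
      "role": "user",
      "content": "Answer from the knowledge base: what is the Agent Runtime responsible for?"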
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "kb_search",
        "description": "Search the local knowledge base for relevant passages.",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {
              "type": "string",
              "description": "The search query."
            }
          },
          "required": ["query"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "stream": false
}
JSON

Send it:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/004_tool_call_step1.json \
  | tee responses/004_tool_call_step1.json

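Check whether a tool call was produced (when the model decides to call a tool, finish_reason should be "tool_calls"):

jq '.choices[0].finish_reason' responses/004_tool_call_step1.json
jq '.choices[0].message.tool_calls' responses/004_tool_call_step1.json

You may see a structure like the following. Note that arguments is a JSON-encoded string, not a nested object:

[
  {
    "id": "call_xxx",
    "type": "function",
    "function": {
      "name": "kb_search",
      "arguments": "{\"query\":\"Agent Runtime responsibilities\"}"
    }
  }
]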

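6.3 Play the harness by hand

Do not write a program yet. Execute the tool by hand.

For example, search with grep:

grep -R "Agent Runtime" kb/

With the tool result in hand, prepare the second request. Note: you must put the previous assistant message's tool_calls back into messages, then append a message with role: tool.

In the JSON below, replace call_xxx with tool_calls[0].id from your real response, and copy the arguments string exactly as the model returned it.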

cat > requests/005_tool_call_step2.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
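    {
      "role": "system",
      "content": "You are an agent engineering assistant. When the user asks about the contents of the local knowledge base, call kb_search first. Do not pretend you have already read the knowledge base."
    },
    {
      "role": "user",
      "content": "Answer from the knowledge base: what is the Agent Runtime responsible for?"
    },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_xxx",
          "type": "function",
          "function": {
            "name": "kb_search",
            "arguments": "{\"query\":\"Agent Runtime responsibilities\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_xxx",
      "content": "The Agent Runtime is the deterministic shell that runs an agent. It is responsible for model calls, tool execution, context assembly, state management, approval, safety boundaries, trace, and eval. An agent should not hold execution rights directly. The model may propose a tool call, but the harness must validate, execute, and record it."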
    }
  ],
  "stream": false
}
JSON
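
Hand-copying the id and the escaped arguments string is error-prone, so the same follow-up request can be assembled with jq. This is a sketch under assumptions: `build_tool_followup` is a hypothetical helper, it handles only the first tool call, and unlike the hand-written file it keeps the `tools` field from the first request unchanged.

```bash
# build_tool_followup REQUEST_FILE RESPONSE_FILE TOOL_RESULT_TEXT
# Prints the step-2 request: original messages, then the assistant message
# carrying its tool_calls, then a role=tool message with the matching id.
# (Hypothetical helper; only tool_calls[0] is answered.)
build_tool_followup() {
  jq --slurpfile resp "$2" --arg result "$3" '
    ($resp[0].choices[0].message) as $assistant
    | .messages += [
        {
          role: "assistant",
          content: $assistant.content,
          tool_calls: [$assistant.tool_calls[0]]
        },
        {
          role: "tool",
          tool_call_id: $assistant.tool_calls[0].id,
          content: $result
        }
      ]
  ' "$1"
}
```

A possible use: `build_tool_followup requests/004_tool_call_step1.json responses/004_tool_call_step1.json "$(grep -R -h "Agent Runtime" kb/)" > requests/005_tool_call_step2.json`. Because the id is read from the response, the tool result cannot end up attached to a mistyped call id.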

Send the second request:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/005_tool_call_step2.json \
  | tee responses/005_tool_call_step2.json

View the final answer:

jq -r '.choices[0].message.content' responses/005_tool_call_step2.json

6.4 This is the agent loop

The full loop is:

User Request
  -> Model Call with tools
  -> Model returns tool_calls
  -> Harness validates tool name and arguments
  -> Harness executes the tool
  -> Harness appends tool result to messages
  -> Model Call again
  -> Model returns final answer

You have now run the core agent loop with curl.
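
The two harness steps in the middle of that loop, validating the tool name and arguments and then executing the tool, can be sketched as one shell function. Everything here is illustrative: `execute_tool_call` is a hypothetical name, the allowlist contains only `kb_search`, and "execution" is a literal-string grep over a directory.

```bash
# execute_tool_call RESPONSE_FILE KB_DIR
# Validates the first tool call in RESPONSE_FILE and, if allowed, runs it.
# Prints matching KB lines, or NO_MATCH when nothing is found.
# (Hypothetical helper; only tool_calls[0] is considered.)
execute_tool_call() {
  name=$(jq -r '.choices[0].message.tool_calls[0].function.name' "$1")
  # Allowlist: refuse anything the harness did not register.
  if [ "$name" != "kb_search" ]; then
    echo "rejected: unknown tool '$name'" >&2
    return 1
  fi
  # arguments is a JSON-encoded string, so decode it before reading fields.
  query=$(jq -r '.choices[0].message.tool_calls[0].function.arguments
                 | fromjson | .query // empty' "$1") || return 1
  if [ -z "$query" ]; then
    echo "rejected: missing query argument" >&2
    return 1
  fi
  # -F treats the query as a literal string; -- stops option parsing,
  # so a query starting with "-" cannot be read as a grep flag.
  grep -R -F -h -- "$query" "$2" || echo "NO_MATCH"
}
```

The point of the design is that the model's output is treated as untrusted input: the tool name is checked against a fixed list and the query is never interpreted as a pattern or an option.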

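7. Layer 5: Memory versus Knowledge Base

Many agent projects go wrong because they mix memory and the knowledge base together.

7.1 What memory is

Memory is long-lived state tied to the user, the project, or the task.

Examples:

{"type":"user_preference","key":"learning_style","value":"Likes to start from curl and the protocol layer, then translate into code","confidence":0.95}
{"type":"project_fact","key":"current_goal","value":"Implement an Agent Runtime from scratch and understand harness, memory, KB, and eval","confidence":0.9}

Create a local memory file:

cat > memory/user_memory.jsonl <<'JSONL'
{"type":"user_preference","key":"learning_style","value":"Likes to start from curl and the protocol layer, then translate into code","confidence":0.95}
{"type":"project_fact","key":"current_goal","value":"Implement an Agent Runtime from scratch and understand harness, memory, KB, and eval","confidence":0.9}
JSONL

7.2 What the knowledge base is

The knowledge base is external knowledge: documents, specifications, project material.

Examples:

kb/agent_runtime.md
kb/deepseek_api_notes.md
kb/project_requirements.md
kb/harness_design.md

7.3 How the two work together

Before every request, your harness should:

1. Read the user's current question
2. Pick relevant facts from memory
3. Retrieve relevant passages from the KB
4. Assemble them into this turn's context
5. Send it to the model

For now we assemble by hand, without code.

Create the request: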

cat > requests/006_memory_kb_context.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
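    {
      "role": "system",
      "content": "You are an agent engineering mentor. Answer based on MEMORY and KNOWLEDGE_BASE, and do not make things up."
    },
    {
      "role": "user",
      "content": "MEMORY:\n- The user likes to start from curl and the protocol layer, then translate into code.\n- Current goal: implement an Agent Runtime from scratch and understand harness, memory, KB, and eval.\n\nKNOWLEDGE_BASE:\nThe Agent Runtime is the deterministic shell that runs an agent. It is responsible for model calls, tool execution, context assembly, state management, approval, safety boundaries, trace, and eval.\n\nQUESTION:\nHow should I understand the way memory and the knowledge base work together?"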
    }
  ],
  "stream": false
}
JSON

Send it:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/006_memory_kb_context.json \
  | tee responses/006_memory_kb_context.json

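7.4 Concepts to understand

Neither memory nor the KB enters the model automatically. Your harness has to select, compress, order, and assemble them into the context.

In later engineering work this layer is called:

Context Engineering
Retrieval
Memory Selection
Prompt Assembly
Compaction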
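
As a rough illustration of selection and assembly, the sketch below builds the user message from a memory JSONL file and a KB directory. The helper name `build_context_message`, the confidence threshold as the selection rule, and the single keyword used for retrieval are all assumptions made for this example; real selection and retrieval would be more careful.

```bash
# build_context_message MEMORY_JSONL KB_DIR QUESTION KEYWORD [MIN_CONFIDENCE]
# Prints one user message whose content has MEMORY, KNOWLEDGE_BASE and
# QUESTION sections, in that fixed order.
# (Hypothetical helper; "selection" here is just a confidence cutoff.)
build_context_message() {
  min=${5:-0.5}
  # Memory selection: keep only entries at or above the confidence cutoff.
  memory=$(jq -r --arg min "$min" \
    'select(.confidence >= ($min | tonumber)) | "- " + .value' "$1")
  # Retrieval: literal keyword match over the KB files.
  kb=$(grep -R -F -h -- "$4" "$2" || true)
  # Prompt assembly: jq handles all JSON escaping of the combined text.
  jq -n --arg memory "$memory" --arg kb "$kb" --arg q "$3" '
    {
      role: "user",
      content: ("MEMORY:\n" + $memory
                + "\n\nKNOWLEDGE_BASE:\n" + $kb
                + "\n\nQUESTION:\n" + $q)
    }'
}
```

Letting jq build the message avoids the manual `\n` and quote escaping needed when the same content is typed into a JSON file by hand.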

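8. Layer 6: using Context Caching correctly

DeepSeek's Context Caching is enabled by default, but it only helps with a repeated prefix.

So build a habit for how you lay out context:

Stable prefix: system prompt, role rules, tool-usage rules, output format
Dynamic content: memory, retrieved KB passages, the user's current question

Recommended structure:

SYSTEM:
You are xxx. The rules are fixed.

AGENT_POLICY:
Fixed tool policy, approval policy, and output conventions.

DYNAMIC_CONTEXT:
This turn's memory / KB / current task.

Do not randomly change the system prompt on every call. Do not put timestamps, trace IDs, or transient state at the very front. These break the reusable prefix.

Check cache hits: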

jq '.usage.prompt_cache_hit_tokens, .usage.prompt_cache_miss_tokens' responses/006_memory_kb_context.json
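
The two counters are easier to compare as a single ratio. The helper below is a small convenience sketch, not an official metric: `cache_hit_ratio` is a hypothetical name, and it assumes the two usage fields shown above are present (missing fields are treated as 0).

```bash
# cache_hit_ratio RESPONSE_FILE
# Prints the share of prompt tokens served from cache, rounded down,
# or "n/a" when both counters are absent or zero.
# (Hypothetical helper built on the usage fields queried above.)
cache_hit_ratio() {
  jq -r '.usage
    | (.prompt_cache_hit_tokens // 0) as $hit
    | (.prompt_cache_miss_tokens // 0) as $miss
    | if ($hit + $miss) == 0 then "n/a"
      else ($hit / ($hit + $miss) * 100 | floor | tostring) + "%"
      end' "$1"
}
```

Since caching only helps a repeated prefix, a meaningful number is expected only after the same prefix has been sent at least once before.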

9. Layer 7: Trace. Record every run, even without code

Agent engineering is not "send one request and look at the answer". You must be able to replay every run.

For each run, save at least:

traces/<trace_id>/request_1.json
traces/<trace_id>/response_1.json
traces/<trace_id>/tool_call.json
traces/<trace_id>/tool_result.json
traces/<trace_id>/request_2.json
traces/<trace_id>/response_2.json
traces/<trace_id>/notes.md

Create one trace by hand:

TRACE_ID="trace_001_tool_call"
mkdir -p "traces/$TRACE_ID"
cp requests/004_tool_call_step1.json "traces/$TRACE_ID/request_1.json"
cp responses/004_tool_call_step1.json "traces/$TRACE_ID/response_1.json"
cp requests/005_tool_call_step2.json "traces/$TRACE_ID/request_2.json"
cp responses/005_tool_call_step2.json "traces/$TRACE_ID/response_2.json"
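
The copy commands above cover the requests and responses but not the `tool_call.json` and `tool_result.json` entries from the list. One way to fill them in is sketched below. `save_tool_trace` is a hypothetical helper, and the `{"content": ...}` shape of `tool_result.json` is an assumption chosen for this example, not a required format.

```bash
# save_tool_trace TRACE_DIR RESPONSE_FILE TOOL_RESULT_TEXT
# Writes tool_call.json (the tool_calls array from the model response)
# and tool_result.json (the text the harness fed back) into TRACE_DIR.
# (Hypothetical helper; the tool_result.json layout is an assumption.)
save_tool_trace() {
  mkdir -p "$1"
  jq '.choices[0].message.tool_calls' "$2" > "$1/tool_call.json"
  jq -n --arg content "$3" '{content: $content}' > "$1/tool_result.json"
}
```

A possible use: `save_tool_trace "traces/$TRACE_ID" responses/004_tool_call_step1.json "$(grep -R -h "Agent Runtime" kb/)"`.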

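Write observation notes:

cat > "traces/$TRACE_ID/notes.md" <<'MD'
# Trace Notes

## Goal
Understand the two-step loop of DeepSeek tool calling.

## Observations
- In the first round the model did not answer directly; it returned tool_calls.
- The harness must execute the tool and append a role=tool message.
- In the second round the model produced the final answer from the tool result.

## Risks
- If tool_call_id does not match, the model cannot associate the tool result with its call.
- If the tool result contains untrusted text, it must not be allowed to override the system rules.
MD

This is what observability means in professional agent engineering later on.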

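10. Layer 8: Approval. Never let the model run dangerous actions directly

When a tool has side effects, for example:

writing files
deleting files
committing code
sending email
modifying a database
calling a production API

the model may at most propose an action. The harness must approve it first.

You can simulate this with JSON output first: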

cat > requests/007_approval_plan.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are a safe Agent Planner. You cannot execute actions directly; you can only output a json plan. JSON format: {\"intent\": string, \"requires_approval\": boolean, \"risk\": string, \"proposed_actions\": array}"
    },
    {
      "role": "user",
      "content": "Please delete all temporary files in the project. Output json."
    }
  ],
  "response_format": {
    "type": "json_object"
  },
  "max_tokens": 800,
  "stream": false
}
JSON

Send it:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/007_approval_plan.json \
  | tee responses/007_approval_plan.json

What to check:

jq -r '.choices[0].message.content' responses/007_approval_plan.json | jq .

The core idea of this step:

LLM proposes.
Harness disposes.

The model proposes a plan. The harness decides whether it may run.
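
A minimal approval gate over such a plan could look like the sketch below. It assumes the plan has the `intent`, `requires_approval`, and `risk` fields requested in the prompt above; `approval_gate` is a hypothetical helper. It deliberately fails closed: anything other than an explicit `requires_approval: false` needs a human to type `yes`.

```bash
# approval_gate RESPONSE_FILE
# Returns 0 only when execution may proceed.
# (Hypothetical helper; reads the human decision from standard input.)
approval_gate() {
  plan=$(jq -e '.choices[0].message.content | fromjson' "$1") || {
    echo "deny: plan is not valid JSON" >&2
    return 1
  }
  # Fail closed: only an explicit false skips the human check.
  needs=$(printf '%s' "$plan" |
    jq -r 'if .requires_approval == false then "no" else "yes" end')
  if [ "$needs" = "no" ]; then
    return 0
  fi
  printf '%s' "$plan" | jq -r '"intent: \(.intent)\nrisk: \(.risk)"' >&2
  printf 'Approve? Type yes to continue: ' >&2
  read -r answer || answer=""
  [ "$answer" = "yes" ]
}
```

Typical use would be `approval_gate responses/007_approval_plan.json && echo "approved"`. Note that the model's own `requires_approval` flag is itself untrusted; a stricter policy would also decide from the proposed actions, independent of what the model claims.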

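11. Layer 9: Eval. Minimal verification with curl

Eval is not about "training the model". Eval is how you judge whether your agent's behavior got better or worse.

11.1 Write a test case

cat > evals/case_001.json <<'JSON'
{
  "case_id": "case_001",
  "input": "What is the difference between an Agent Runtime and an Agent?",
  "expected_points": [
    "The Agent is the task-executing entity or model-driven unit",
    "The Runtime is the deterministic execution shell",
    "The Runtime manages tools, state, safety, trace, and eval"
  ]
}
JSON

11.2 Run the request under test by hand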

cat > requests/008_eval_candidate.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
    {
      "role": "system",
      "content": "You are an agent engineering mentor. Answers must be accurate and concise."
    },
    {
      "role": "user",
      "content": "What is the difference between an Agent Runtime and an Agent?"
    }
  ],
  "stream": false
}
JSON

Send it:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/008_eval_candidate.json \
  | tee responses/008_eval_candidate.json

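11.3 Use another model call as the grader

Copy the candidate answer into CANDIDATE_ANSWER below. When pasting, escape any double quotes and newlines so the file stays valid JSON: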

cat > requests/009_eval_grader.json <<'JSON'
{
  "model": "deepseek-chat",
  "messages": [
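    {
      "role": "system",
      "content": "You are a strict agent engineering grader. You must output json. JSON format: {\"score\": number, \"passed\": boolean, \"missing_points\": array, \"notes\": string}"
    },
    {
      "role": "user",
      "content": "Evaluate whether the candidate answer covers the expected points.\n\nEXPECTED_POINTS:\n1. The Agent is the task-executing entity or model-driven unit\n2. The Runtime is the deterministic execution shell\n3. The Runtime manages tools, state, safety, trace, and eval\n\nCANDIDATE_ANSWER:\nReplace this with the answer from responses/008_eval_candidate.json\n\nPlease output json."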
    }
  ],
  "response_format": {
    "type": "json_object"
  },
  "max_tokens": 800,
  "stream": false
}
JSON
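
Pasting a model answer into a JSON string by hand tends to break on quotes and newlines, so the grader request can also be generated. The sketch below is one possible way: `build_grader_request` is a hypothetical helper that reads the expected points from an eval case file and the candidate answer from a response file, and lets jq do the escaping. The prompt wording inside it is an example, not a fixed format.

```bash
# build_grader_request CASE_FILE CANDIDATE_RESPONSE_FILE
# Prints a grader request whose user message lists the expected points
# and embeds the candidate answer verbatim.
# (Hypothetical helper; prompt wording is illustrative.)
build_grader_request() {
  answer=$(jq -r '.choices[0].message.content' "$2")
  # Number the expected points as "1. ...", "2. ...", one per line.
  points=$(jq -r '.expected_points | to_entries
                  | map("\(.key + 1). \(.value)") | join("\n")' "$1")
  jq -n --arg points "$points" --arg answer "$answer" '{
    model: "deepseek-chat",
    messages: [
      {
        role: "system",
        content: "You are a strict agent engineering grader. You must output json. JSON format: {\"score\": number, \"passed\": boolean, \"missing_points\": array, \"notes\": string}"
      },
      {
        role: "user",
        content: ("Evaluate whether the candidate answer covers the expected points.\n\nEXPECTED_POINTS:\n"
                  + $points + "\n\nCANDIDATE_ANSWER:\n" + $answer
                  + "\n\nPlease output json.")
      }
    ],
    response_format: {type: "json_object"},
    max_tokens: 800,
    stream: false
  }'
}
```

A possible use: `build_grader_request evals/case_001.json responses/008_eval_candidate.json > requests/009_eval_grader.json`.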

Send it:

curl -s https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
  -d @requests/009_eval_grader.json \
  | tee responses/009_eval_grader.json

View the score:

jq -r '.choices[0].message.content' responses/009_eval_grader.json | jq .

This is the minimal eval loop.

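12. When to use deepseek-reasoner

Do not rush into deepseek-reasoner.

Suggested order:

Stage 1: deepseek-chat + plain messages
Stage 2: deepseek-chat + JSON output
Stage 3: deepseek-chat + tool calling
Stage 4: deepseek-chat + memory / KB / eval
Stage 5: deepseek-reasoner + complex reasoning
Stage 6: deepseek-reasoner + tool calling

Reason: when thinking mode is combined with tool calls, reasoning_content has to be handled correctly. If you do not pass the intermediate reasoning content back according to the API rules, requests may fail. So it is best learned after you already understand the plain tool loop.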

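13. How to split modules when translating curl into code

Start writing code only once you can run every flow with curl.

The mapping is:

| Curl stage | Future code module | Responsibility |
| --- | --- | --- |
| `requests/*.json` | `ModelRequestBuilder` | Builds model requests |
| `curl https://api.deepseek.com/chat/completions` | `DeepSeekChatAdapter` | Sends HTTP requests |
| `messages` array | `ConversationStore` | Stores conversation context |
| Hand-assembled memory / KB | `ContextBuilder` | Selects and assembles context |
| `tools` JSON schema | `ToolRegistry` | Registers available tools |
| Manual grep over the KB | `ToolExecutor` | Executes tools |
| `role: tool` message | `ToolResultSerializer` | Turns tool results back into model context |
| `traces/` folder | `TraceStore` | Records and replays runs |
| `evals/*.json` | `EvalRunner` | Regression testing |
| Approval JSON | `ApprovalPolicy` | Gates side-effecting actions |

Keep in mind:

Do not start by writing an Agent class.
Write DeepSeekChatAdapter first.
Then ConversationStore.
Then ToolRegistry.
Then ToolExecutor.
Only at the end write AgentHarness.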
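
Before porting to another language, the adapter can be prototyped as a thin shell wrapper around the same curl call used throughout this article. This is a sketch, not a reference implementation: `deepseek_chat` is a hypothetical function name, and the decision to treat any non-200 status as a failure is an assumption of this example.

```bash
# deepseek_chat REQUEST_FILE RESPONSE_FILE
# Sends one chat completion request and stores the body in RESPONSE_FILE.
# Returns non-zero on missing key, missing file, transport error,
# or any HTTP status other than 200.
# (Hypothetical helper wrapping the curl call shown earlier.)
deepseek_chat() {
  if [ -z "${DEEPSEEK_API_KEY:-}" ]; then
    echo "DEEPSEEK_API_KEY is not set" >&2
    return 2
  fi
  if [ ! -f "$1" ]; then
    echo "request file not found: $1" >&2
    return 2
  fi
  # -o writes the body to the response file; -w prints only the status code.
  status=$(curl -sS -o "$2" -w '%{http_code}' \
    https://api.deepseek.com/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${DEEPSEEK_API_KEY}" \
    -d @"$1") || return 1
  if [ "$status" != "200" ]; then
    echo "HTTP $status, see $2 for the error body" >&2
    return 1
  fi
}
```

Keeping the adapter this small is the point: it knows about HTTP and nothing about messages, tools, or memory, which is why it can be written and tested first.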

14. Pseudo-flow of a minimal AgentHarness

Not code, just the flow:

run(user_input):

1. load conversation history
2. select memory
3. retrieve KB passages
4. build messages
5. send chat completion request
6. if response has tool_calls:
     a. validate tool name
     b. validate arguments
     c. check approval policy
     d. execute tool
     e. append assistant tool_call message
     f. append tool result message
     g. call model again (repeat step 6 while the response still has tool_calls)
7. save final answer
8. save trace
9. optionally run eval
10. return answer

This is the core runtime you will write later.
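
Steps 5 and 6 of that flow can be prototyped in shell before any real code exists. The sketch below is only an illustration under stated assumptions: `run_tool_loop` is a hypothetical name, and it expects two functions you define yourself, `call_model REQUEST_FILE RESPONSE_FILE` (one model call) and `run_tool NAME ARGS_JSON` (validation, approval, and execution, printing the result). It handles one tool call per round and stops after a fixed number of rounds.

```bash
# run_tool_loop REQUEST_FILE WORK_DIR [MAX_STEPS]
# Repeats "call model -> run tool -> append result" until the model
# answers without tool_calls, then prints the final answer.
# Every request and response is kept in WORK_DIR as a trace.
# Assumes call_model and run_tool exist (both hypothetical, user-defined).
run_tool_loop() {
  req=$1
  dir=$2
  max=${3:-5}
  step=1
  mkdir -p "$dir"
  cp "$req" "$dir/request_$step.json"
  while [ "$step" -le "$max" ]; do
    call_model "$dir/request_$step.json" "$dir/response_$step.json" || return 1
    calls=$(jq '.choices[0].message.tool_calls // [] | length' \
      "$dir/response_$step.json")
    if [ "$calls" -eq 0 ]; then
      jq -r '.choices[0].message.content' "$dir/response_$step.json"
      return 0
    fi
    name=$(jq -r '.choices[0].message.tool_calls[0].function.name' \
      "$dir/response_$step.json")
    args=$(jq -r '.choices[0].message.tool_calls[0].function.arguments' \
      "$dir/response_$step.json")
    # A failing tool is reported back to the model instead of aborting.
    result=$(run_tool "$name" "$args") || result="TOOL_ERROR"
    next=$((step + 1))
    # Only the first tool call is echoed back, so every call has a result.
    jq --slurpfile resp "$dir/response_$step.json" --arg result "$result" '
      ($resp[0].choices[0].message) as $a
      | .messages += [
          {role: "assistant", content: $a.content,
           tool_calls: [$a.tool_calls[0]]},
          {role: "tool", tool_call_id: $a.tool_calls[0].id, content: $result}
        ]' "$dir/request_$step.json" > "$dir/request_$next.json"
    step=$next
  done
  echo "stopped: step budget exhausted" >&2
  return 1
}
```

The step budget matters: without it, a model that keeps asking for tools would loop forever. Because the model call and the tool runner are passed in as separate functions, the loop can be exercised with stubs before it ever touches the network.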

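15. Your recommended end goal

Once everything runs with curl, what you write next is not a "chatbot".

What you write is:

DeepSeekChatAdapter
ConversationStore
MemoryStore
KnowledgeBaseRetriever
ContextBuilder
ToolRegistry
ToolExecutor
ApprovalPolicy
TraceStore
EvalRunner
AgentHarness

The core skill of a professional agent engineer is not calling an API. It is knowing that:

Model output is untrusted and must be validated.
Tool execution is risky and must be approved.
More context is not better; it must be selected and compressed.
Memory is not chat history; it must be managed as structured data.
The knowledge base is not a prompt; it must be retrieved and cited.
Without eval there is no reliable iteration.
Without trace there is no engineered agent.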

16. Official resources

DeepSeek API Docs:

https://api-docs.deepseek.com/

Chat Completion API:

https://api-docs.deepseek.com/api/create-chat-completion

Multi-round Conversation:

https://api-docs.deepseek.com/guides/multi_round_chat

Function / Tool Calling:

https://api-docs.deepseek.com/guides/function_calling

JSON Output:

https://api-docs.deepseek.com/guides/json_mode

Context Caching:

https://api-docs.deepseek.com/guides/kv_cache

Thinking Mode:

https://api-docs.deepseek.com/guides/thinking_mode

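17. Next exercises

After finishing this article you should be able to answer:

Why is the Chat Completion API stateless?
Why does tool calling always require a second model call?
How do assistant.tool_calls and role: tool relate?
What problems do memory and the KB each solve?
Why does context caching depend on a repeated prefix?
Why should trace and eval start on day one?

Once you can answer these clearly, you are no longer "calling an LLM API"; you are understanding the Agent Runtime.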