A Bottom-Up Look at Claude Code (Slash/Sub Agents/Skills/MCP) and What It Teaches Us

Claude Code is a great invention and a great ecosystem
Everything is Prompt (Show Me Your Prompt)

Background

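I have recently been building things on a Single Agent architecture, and among mainstream Single Agent products Claude Code is the best. To fully understand the concepts it wraps up, and to get the AI to do what it is told, I set the high-level concepts aside and dug into what actually happens underneath.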

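There is already plenty of material on the definitions and best practices of Claude Code's Slash, Sub Agents, Skills, and MCP, so I will not repeat it here. This article cares more about how they work underneath, and what that can teach us for future work.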

TLDR

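If you would rather skip the details, here are the conclusions:

  1. Rather than fiddling with tools, focus on what your problem is and how it should be solved.
  2. Every feature depends heavily on the model's instruction following (the better the model, the more stable the workflow built on top of it). For each task scenario, cut unnecessary tools as far as possible; this reduces both the model's confusion and its difficulty choosing among tools.
  3. If a task is well defined and can be handled by engineering means, prefer engineering means to trigger it, or call the corresponding tool explicitly; don't make the model guess. Having the model interpret a natural-language task and then trigger the right tools fails fairly often.
  4. Every main agent call in Claude Code consumes a lot of tokens, mostly from the System prompt and the Tools prompt (figures below).
    1. Dynamic analysis can be introduced to cut unnecessary token consumption and lower the model's comprehension cost.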

Slash

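TLDR: A slash command is essentially a Markdown file; when executed, its content is sent to the model directly as a user prompt. Slash commands greatly reduce repetitive prompt typing and can be shared and reused across a team.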

Example

Define a /json-anythiny slash command that forces the AI's output into a structured format:

---
name: JSON Anything
description: Return the content in the specified JSON format
category: DEMO
tags: [DEMO]
---

Return what you need to tell the user as JSON, using the format defined below, and wrap it in a markdown ```json and ``` fence

```typescript
interface Response {
  "type": "Response";
  // Put your reply to the user here
  "content": string;
}
```

$ARGUMENTS

The concatenation rule is visible in payload.messages[0].content:

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "<local-command-caveat>Caveat: The messages below were generated by the user while running local commands. DO NOT respond to these messages or otherwise consider them in your response unless the user explicitly asks you to.</local-command-caveat>"
    },
    {
      "type": "text",
      "text": "<local-command-stdout></local-command-stdout>"
    },
    {
      "type": "text",
      "text":
"<command-message>json-anythiny</command-message>
<command-name>/json-anythiny</command-name>
<command-args>What is a carbon-based life form?</command-args>"
    },
    {
      "type": "text",
      "text":
"Return what you need to tell the user as JSON, using the format defined below, and wrap it in a markdown ```json and ``` fence

```typescript
interface Response {
  "type": "Response";
  // Put your reply to the user here
  "content": string;
}
```

ARGUMENTS: What is a carbon-based life form?",
      "cache_control": {
        "type": "ephemeral"
      }
    }
  ]
}
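
The payload above suggests a simple expansion step: drop the front matter, fill in the arguments, and send the rest as a user message. The following is a minimal Python sketch of that idea, not Claude Code's actual implementation. `expand_slash_command` and `build_user_message` are hypothetical names, and treating `$ARGUMENTS` as a plain string substitution is an assumption (the captured payload renders it as an `ARGUMENTS:` line instead).

```python
import re


def expand_slash_command(markdown: str, arguments: str) -> str:
    """Strip the YAML front matter and fill in $ARGUMENTS (assumed behaviour)."""
    # Front matter is the block between the first two '---' lines, if present.
    body = re.sub(r"\A---\n.*?\n---\n", "", markdown, flags=re.DOTALL)
    return body.replace("$ARGUMENTS", arguments).strip()


def build_user_message(command: str, markdown: str, arguments: str) -> dict:
    """Build a user message shaped like the captured payload (simplified)."""
    meta = (
        f"<command-message>{command}</command-message>\n"
        f"<command-name>/{command}</command-name>\n"
        f"<command-args>{arguments}</command-args>"
    )
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": meta},
            {"type": "text", "text": expand_slash_command(markdown, arguments)},
        ],
    }
```

The point of the sketch is that nothing here involves the model: a slash command is deterministic text assembly, which is why it is the most predictable of the mechanisms covered in this article.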

Takeaways

The only takeaway is that slash commands can be shared and built into a project (which is what slash commands are for).

Agent (Sub Agent)

TLDR: Agents rest on the LLM's tools, which in turn rest on the LLM's instruction following.

The contents of ~/.claude/agents and project/.claude/agents are concatenated into payload.tools.Task when the model is called. Reading the Task description, it is easy to see that it consists of:

Task description (states that the Task tool is for handling complex tasks; this part matters)
built-in agents description
user agents description (user-level agents)
project agents description (project-level agents)
usage (usage examples)
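
The composition above can be sketched as plain string assembly. This is a minimal Python illustration under stated assumptions: `AgentDef`, `render_agent`, and `build_task_description` are hypothetical names, and the exact separators Claude Code uses are not known from the source beyond what the captured prompt shows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentDef:
    """One agent entry; `tools` stays empty when the entry lists no tools."""
    name: str
    description: str
    tools: str = ""


def render_agent(agent: AgentDef) -> str:
    """Render one '- name: description (Tools: ...)' line."""
    suffix = f" (Tools: {agent.tools})" if agent.tools else ""
    return f"- {agent.name}: {agent.description}{suffix}"


def build_task_description(header: str, builtin: List[AgentDef],
                           user: List[AgentDef], project: List[AgentDef],
                           usage: str) -> str:
    """Join the parts in the observed order: header, built-in, user, project, usage."""
    lines = [header, "", "Available agent types and the tools they have access to:"]
    lines += [render_agent(a) for a in [*builtin, *user, *project]]
    lines += ["", usage]
    return "\n".join(lines)
```

Seen this way, every agent you define lengthens a single tool description, and the model has to pick the right `subagent_type` out of that growing list.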

The original text follows. snap-json-repair is an agent I defined in ~/.claude/agents; typo-checker, frontend-solution-designer, prd-expander, and technical-docs-reviewer are agents I defined in project/.claude/agents.

Launch a new agent to handle complex, multi-step tasks autonomously. 

The Task tool launches specialized agents (subprocesses) that autonomously handle complex tasks. Each agent type has specific capabilities and tools available to it.

Available agent types and the tools they have access to:
- general-purpose: General-purpose agent for researching complex questions, searching for code, and executing multi-step tasks. When you are searching for a keyword or file and are not confident that you will find the right match in the first few tries use this agent to perform the search for you. (Tools: *)
- statusline-setup: Use this agent to configure the user's Claude Code status line setting. (Tools: Read, Edit)
- Explore: Fast agent specialized for exploring codebases. Use this when you need to quickly find files by patterns (eg. "src/components/**/*.tsx"), search code for keywords (eg. "API endpoints"), or answer questions about the codebase (eg. "how do API endpoints work?"). When calling this agent, specify the desired thoroughness level: "quick" for basic searches, "medium" for moderate exploration, or "very thorough" for comprehensive analysis across multiple locations and naming conventions. (Tools: All tools)
- Plan: Software architect agent for designing implementation plans. Use this when you need to plan the implementation strategy for a task. Returns step-by-step plans, identifies critical files, and considers architectural trade-offs. (Tools: All tools)
- claude-code-guide: Use this agent when the user asks questions ("Can Claude...", "Does Claude...", "How do I...") about: (1) Claude Code (the CLI tool) - features, hooks, slash commands, MCP servers, settings, IDE integrations, keyboard shortcuts; (2) Claude Agent SDK - building custom agents; (3) Claude API (formerly Anthropic API) - API usage, tool use, Anthropic SDK usage. **IMPORTANT:** Before spawning a new agent, check if there is already a running or recently completed claude-code-guide agent that you can resume using the "resume" parameter. (Tools: Glob, Grep, Read, WebFetch, WebSearch)
- snap-json-repair: Processes JSON data returned by the AI, validates it against the defined JSON spec, and attempts a repair if it does not conform (Tools: All tools)
- typo-checker: Use this agent when a chunk of text has just been completed and you need to check it for Chinese character typos and incorrect word usage. This includes proactively running after every task involving Chinese text content creation or modification. 
- frontend-solution-designer: Use this agent when a user requests to interpret a requirement from a frontend development perspective and design a plan using the frontend technology stack. This could be when a requirement is described in general terms and needs translation into a concrete frontend implementation strategy, including architecture, components, frameworks, and tooling.
- prd-expander: Use this agent when a user provides a brief or incomplete requirement description and requests it to be expanded into a full PRD (Product Requirement Document) following standardized formatting and structure.
- technical-docs-reviewer: Use this agent when a user has completed writing or updating a technical document and needs a detailed review for accuracy, clarity, completeness, and adherence to documentation standards.

When using the Task tool, you must specify a subagent_type parameter to select which agent type to use.

When NOT to use the Task tool:
- If you want to read a specific file path, use the Read or Glob tool instead of the Task tool, to find the match more quickly
- If you are searching for a specific class definition like "class Foo", use the Glob tool instead, to find the match more quickly
- If you are searching for code within a specific file or set of 2-3 files, use the Read tool instead of the Task tool, to find the match more quickly
- Other tasks that are not related to the agent descriptions above


Usage notes:
- Always include a short description (3-5 words) summarizing what the agent will do
- Launch multiple agents concurrently whenever possible, to maximize performance; to do that, use a single message with multiple tool uses
- When the agent is done, it will return a single message back to you. The result returned by the agent is not visible to the user. To show the user the result, you should send a text message back to the user with a concise summary of the result.
- You can optionally run agents in the background using the run_in_background parameter. When an agent runs in the background, you will need to use TaskOutput to retrieve its results once it's done. You can continue to work while background agents run - When you need their results to continue you can use TaskOutput in blocking mode to pause and wait for their results.
- Agents can be resumed using the `resume` parameter by passing the agent ID from a previous invocation. When resumed, the agent continues with its full previous context preserved. When NOT resuming, each invocation starts fresh and you should provide a detailed task description with all necessary context.
- When the agent is done, it will return a single message back to you along with its agent ID. You can use this ID to resume the agent later if needed for follow-up work.
- Provide clear, detailed prompts so the agent can work autonomously and return exactly the information you need.
- Agents with "access to current context" can see the full conversation history before the tool call. When using these agents, you can write concise prompts that reference earlier context (e.g., "investigate the error discussed above") instead of repeating information. The agent will receive all prior messages and understand the context.
- The agent's outputs should generally be trusted
- Clearly tell the agent whether you expect it to write code or just to do research (search, file reads, web fetches, etc.), since it is not aware of the user's intent
- If the agent description mentions that it should be used proactively, then you should try your best to use it without the user having to ask for it first. Use your judgement.
- If the user specifies that they want you to run agents "in parallel", you MUST send a single message with multiple Task tool use content blocks. For example, if you need to launch both a code-reviewer agent and a test-runner agent in parallel, send a single message with both tool calls.

Example usage:

<example_agent_descriptions>
"code-reviewer": use this agent after you are done writing a signficant piece of code
"greeting-responder": use this agent when to respond to user greetings with a friendly joke
</example_agent_description>

<example>
user: "Please write a function that checks if a number is prime"
assistant: Sure let me write a function that checks if a number is prime
assistant: First let me use the Write tool to write a function that checks if a number is prime
assistant: I'm going to use the Write tool to write the following code:
<code>
function isPrime(n) {
  if (n <= 1) return false
  for (let i = 2; i * i <= n; i++) {
    if (n % i === 0) return false
  }
  return true
}
</code>
<commentary>
Since a signficant piece of code was written and the task was completed, now use the code-reviewer agent to review the code
</commentary>
assistant: Now let me use the code-reviewer agent to review the code
assistant: Uses the Task tool to launch the code-reviewer agent 
</example>

<example>
user: "Hello"
<commentary>
Since the user is greeting, use the greeting-responder agent to respond with a friendly joke
</commentary>
assistant: "I'm going to use the Task tool to launch the greeting-responder agent"
</example>

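Example

Query: I have a requirement to build a login page. Expand this requirement, review the expanded version, and then complete the frontend technical design.

The agents here are not triggered reliably, because the few-shot cases are too narrow. It is better to trigger them explicitly with @agent-name.

Judging from the result, prd-expander and technical-docs-reviewer were called.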

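⏺ You have the following combination of requirements:
  1. Expand "build a login page" into a standard PRD document
  2. Review the expanded PRD
  3. Complete the frontend technical design

  This is a classic multi-step design and development scenario. I will first build a to-do list so that every step is clear and trackable:

  1. Expand the login page requirement into a standard PRD document
  2. Review and refine the expanded PRD document
  3. Produce the frontend technical design based on the PRD

  I will complete the PRD expansion first, then work through the remaining steps in order.

  Next, I will use the appropriate agent tool and move the task forward.

⏺ prd-expander(Expand login page into a PRD)
  ⎿  Done (0 tool uses · 10.6k tokens · 20s)

⏺ The login page PRD has been expanded into the standard format, covering business background, feature list, interaction flow, UI prototype elements, and non-functional requirements.

  The next step is to review this PRD to make sure it is accurate, complete, and clearly structured. After that I will complete the frontend technical design.

  Do you have a specific tech stack in mind (for example React, Vue, or Angular) or any special design requirements? If not, I will default to a mainstream React stack for the frontend design.

> Continue

⏺ technical-docs-reviewer(Review the expanded login page PRD)
  ⎿  Done (0 tool uses · 9.0k tokens · 4s)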

Look closely at the messages:

assistant content

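You have the following combination of requirements:
1. Expand "build a login page" into a standard PRD document
2. Review the expanded PRD
3. Complete the frontend technical design

This is a classic multi-step design and development scenario. I will first build a to-do list so that every step is clear and trackable:

1. Expand the login page requirement into a standard PRD document
2. Review and refine the expanded PRD document
3. Produce the frontend technical design based on the PRD

I will complete the PRD expansion first, then work through the remaining steps in order.

Next, I will use the appropriate agent tool and move the task forward.

{
  "type": "tool_use",
  "id": "call_qLnkEo2PhBoNZXImGM1vW8qW",
  "name": "Task",
  "input": {
    "description": "Expand login page into a PRD",
    "prompt": "Expand the requirement \"build a login page\" into a complete standard PRD, covering product background, requirement description, feature list, interaction flow, UI prototype elements, non-functional requirements, and so on.",
    "subagent_type": "prd-expander"
  }
}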

In the end, you find that an agent is also activated through tool_use.
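
On the receiving side, a `tool_use` block like the one above has to be routed to the named sub agent and answered with a `tool_result`. The sketch below shows one way that routing could look. `dispatch_tool_use` and the `agents` registry are hypothetical; how Claude Code actually runs sub agents is not shown in the source. Only the `tool_use` and `tool_result` block shapes follow the Messages API.

```python
from typing import Callable, Dict


def dispatch_tool_use(block: dict, agents: Dict[str, Callable[[str], str]]) -> dict:
    """Route a Task tool_use block to the sub agent named by subagent_type."""
    if block.get("type") != "tool_use" or block.get("name") != "Task":
        raise ValueError("not a Task tool_use block")
    args = block["input"]
    handler = agents.get(args["subagent_type"])
    if handler is None:
        raise KeyError(f"unknown subagent_type: {args['subagent_type']}")
    # The sub agent only sees the prompt the main agent wrote for it.
    result = handler(args["prompt"])
    return {"type": "tool_result", "tool_use_id": block["id"], "content": result}
```

Because the sub agent receives nothing but `prompt`, the quality of that string, which the main model writes, largely decides how well the sub agent can do its job.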

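Takeaways

As the official docs recommend, agents can be triggered in parallel to speed up execution. Triggering is not very stable, though, so explicit invocation is best (and even that sometimes misses, which may depend on the model).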
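
One engineering alternative to hoping the model picks the right agent is to detect explicit `@agent-name` mentions yourself and route on them. The sketch below is a hypothetical pre-processing step, not something Claude Code exposes; `explicit_agents` and the mention pattern are assumptions for illustration.

```python
import re
from typing import Iterable, List

# Assumed mention syntax: '@' followed by lowercase letters, digits, and hyphens.
MENTION = re.compile(r"@([a-z0-9][a-z0-9-]*)")


def explicit_agents(query: str, known: Iterable[str]) -> List[str]:
    """Return known agents mentioned as @name, in order of first appearance."""
    known_set = set(known)
    found: List[str] = []
    for name in MENTION.findall(query):
        if name in known_set and name not in found:
            found.append(name)
    return found
```

With a list like this in hand, a wrapper can call the agents in a fixed order and leave the model only the parts of the task that genuinely need interpretation.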

Skills

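Like Agents, Skills are also a tool underneath. The description of each skill you write is injected into the context as prompt text, and the AI is left to call it according to your intent (no guarantees).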

The description of tools.Skill is composed as follows:

1. Skill description (when to use a skill: passive detection or slash)
2. usage (usage examples)
3. user skills description (user-level skills)
4. project skills description (project-level skills)

The complete prompt is as follows:

Execute a skill within the main conversation

When users ask you to perform tasks, check if any of the available skills below can help complete the task more effectively. Skills provide specialized capabilities and domain knowledge.

When users ask you to run a "slash command" or reference "/<something>" (e.g., "/commit", "/review-pr"), they are referring to a skill. Use this tool to invoke the corresponding skill.

Example:
  User: "run /commit"
  Assistant: [Calls Skill tool with skill: "commit"]

How to invoke:
- Use this tool with the skill name and optional arguments
- Examples:
  - `skill: "pdf"` - invoke the pdf skill
  - `skill: "commit", args: "-m 'Fix bug'"` - invoke with arguments
  - `skill: "review-pr", args: "123"` - invoke with arguments
  - `skill: "ms-office-suite:pdf"` - invoke using fully qualified name

Important:
- When a skill is relevant, you must invoke this tool IMMEDIATELY as your first action
- NEVER just announce or mention a skill in your text response without actually calling this tool
- This is a BLOCKING REQUIREMENT: invoke the relevant Skill tool BEFORE generating any other response about the task
- Only use skills listed in "Available skills" below
- Do not invoke a skill that is already running
- Do not use this tool for built-in CLI commands (like /help, /clear, etc.)
- If you see a <command-name> tag in the current conversation turn (e.g., <command-name>/commit</command-name>), the skill has ALREADY been loaded and its instructions follow in the next message. Do NOT call this tool - just follow the skill instructions directly.

Available skills:
- developer-web: Understand the user's one-sentence requirement and carry out the development of a frontend product
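
The "Available skills" list at the end of that prompt can be thought of as generated from each skill file's front matter. The sketch below is a minimal Python illustration under that assumption. `parse_front_matter` and `available_skills` are hypothetical names, and the parser only handles flat `key: value` lines rather than full YAML.

```python
from typing import Dict, Iterable


def parse_front_matter(text: str) -> Dict[str, str]:
    """Read flat 'key: value' pairs between the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def available_skills(skill_files: Iterable[str]) -> str:
    """Render the 'Available skills' section from skill file contents."""
    out = ["Available skills:"]
    for text in skill_files:
        meta = parse_front_matter(text)
        if "name" in meta and "description" in meta:
            out.append(f"- {meta['name']}: {meta['description']}")
    return "\n".join(out)
```

Only the name and description reach the model up front, so the description is the sole signal it has when deciding whether to invoke a skill.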

Example

I defined a skill, @developer-web, to help me develop a document. Combined with the Sub Agent definitions above, it completes a complex task.

---
name: developer-web
description: Understand the frontend requirement the user mentions and write the technical documentation
---

# Developer Web

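Understand the user's requirement and write the frontend product development documentation

## Steps

1. Use the @prd-expander agent to expand the product requirements document
2. With the expanded content as context, use the @frontend-solution-designer agent to write the frontend product development documentation
3. Finally, use the @technical-docs-reviewer agent to review the frontend product development documentation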

The result looks like this:

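⏺ Based on the requirement "complete the design of a frontend login box", I have followed the three-step flow the skill specifies:

  1. PRD expansion
    - Refined the login box requirement into a complete, standardized product requirements document covering background, goals, scope, user stories, functional and non-functional requirements, UI/UX details, dependencies, risks and mitigations, acceptance criteria, QA plan, and timeline milestones.
    - The document can be handed directly to the design, development, and test teams as a reference standard.
  2. Frontend development design
    - Based on the PRD, produced an actionable technical plan: stack selection (React 18+TS, Ant Design/Tailwind, React Hook Form+Yup, Axios, etc.), component structure (LoginModal, LoginForm, InputField, PasswordField, RememberMe, ErrorMessage, etc.), state and data flow planning, interaction logic details, security measures (HTTPS, XSS filtering, CSRF protection, encrypted transport, brute-force protection, etc.), and performance optimization strategies (lazy loading, debouncing, caching).
    - Provided a suggested directory structure and core code samples so the team can implement quickly.
  3. Review preparation
    - The development document is written and ready for technical review, to make sure the architecture choices, implementation details, and security and performance plans are feasible and sound.
    - Once the final file path or content is provided, a full technical review can be carried out with optimization suggestions.

  This output is a complete plan that can directly guide the frontend login box from requirement to implementation. If you like, I can go on to step 3 and give the plan a detailed technical review with improvement suggestions. Shall I continue?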

The messages show the execution flow. The core responses are:

assistant content

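You have the following combination of requirements:
1. Expand "build a login page" into a standard PRD document
2. Review the expanded PRD
3. Complete the frontend technical design

This is a classic multi-step design and development scenario. I will first build a to-do list so that every step is clear and trackable:

1. Expand the login page requirement into a standard PRD document
2. Review and refine the expanded PRD document
3. Produce the frontend technical design based on the PRD

I will complete the PRD expansion first, then work through the remaining steps in order.

Next, I will use the appropriate agent tool and move the task forward.

{
  "type": "tool_use",
  "id": "call_qLnkEo2PhBoNZXImGM1vW8qW",
  "name": "Task",
  "input": {
    "description": "Expand login page into a PRD",
    "prompt": "Expand the requirement \"build a login page\" into a complete standard PRD, covering product background, requirement description, feature list, interaction flow, UI prototype elements, non-functional requirements, and so on.",
    "subagent_type": "prd-expander"
  }
}

{
  "type": "tool_use",
  "id": "call_5xyzFVnW44H1DKrmqoEM6dOS",
  "name": "TodoWrite",
  "input": {
    "todos": [
      {
        "content": "Expand the login page requirement into a standard PRD document",
        "activeForm": "Expanding the login page PRD",
        "status": "in_progress"
      },
      {
        "content": "Review and refine the PRD document",
        "activeForm": "Reviewing the PRD document",
        "status": "pending"
      },
      {
        "content": "Frontend technical design",
        "activeForm": "Designing the frontend technical plan",
        "status": "pending"
      }
    ]
  }
}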

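Takeaways

For now I orchestrate Sub Agent work inside Skills and use the skill as the scheduler for the main process, which improves parallel efficiency quite a bit.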

CLAUDE.md

TLDR: Officially, CLAUDE.md is defined as persistent memory. Underneath, its content is simply placed inside a <system-reminder> in the user prompt.

Example

The CLAUDE.md file contains:

# Note

1. Always communicate with me in Chinese

This message is attached at the very top of every conversation turn:

<system-reminder>
As you answer the user's questions, you can use the following context:
# claudeMd
Codebase and user instructions are shown below. Be sure to adhere to these instructions. IMPORTANT: These instructions OVERRIDE any default behavior and you MUST follow them exactly as written.

Contents of xxxx/CLAUDE.md (project instructions, checked into the codebase):

# Note

1. Always communicate with me in Chinese

      IMPORTANT: this context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant to your task.
</system-reminder>

Takeaways

Frequently used content and constraints fit well in CLAUDE.md, so that every conversation reinforces them.
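
The injection itself amounts to wrapping the file content and placing it ahead of the user's text. The sketch below illustrates that in Python. `wrap_claude_md` and `prepend_reminder` are hypothetical names, and the wrapper text is abbreviated from the captured reminder rather than reproduced in full.

```python
def wrap_claude_md(path: str, content: str) -> str:
    """Wrap CLAUDE.md content in a <system-reminder> block (abbreviated wording)."""
    return (
        "<system-reminder>\n"
        "As you answer the user's questions, you can use the following context:\n"
        "# claudeMd\n"
        f"Contents of {path} (project instructions, checked into the codebase):\n\n"
        f"{content.strip()}\n"
        "</system-reminder>"
    )


def prepend_reminder(user_text: str, path: str, content: str) -> dict:
    """Put the reminder ahead of the user's own text in a single user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": wrap_claude_md(path, content)},
            {"type": "text", "text": user_text},
        ],
    }
```

Since the whole file travels with every turn, each line added to CLAUDE.md is paid for in tokens on every request.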

MCP

Like Agents and Skills, MCP is also a tool underneath. Unlike Agents and Skills, though, each MCP contributes its own separate tool definitions.

Example

Configure a bing-search MCP:

{
  "mcpServers": {
    "bing-search": {
      "command": "npx",
      "args": [
        "-y",
        "bing-cn-mcp"
      ]
    }
  }
}

The definition of tools.mcp__bing-search__bing_search shows:

{
  "name": "mcp__bing-search__bing_search",
  "description": "Search Bing for the given keywords and return a list of results, including title, link, summary, and ID",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search keywords"
      },
      "num_results": {
        "type": "number",
        "default": 5,
        "description": "Number of results to return; defaults to 5"
      }
    },
    "required": [
      "query"
    ],
    "additionalProperties": false,
    "$schema": "http://json-schema.org/draft-07/schema#"
  }
}
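
The tool name in that definition follows a visible pattern: `mcp__`, the server key from the config, `__`, then the tool's own name. The sketch below encodes that observed pattern in Python. The helper names are hypothetical, and the assumption that server keys never contain a double underscore is mine, not something the source states.

```python
from typing import Dict, List, Tuple


def mcp_tool_name(server: str, tool: str) -> str:
    """Build the namespaced tool name seen in the payload."""
    return f"mcp__{server}__{tool}"


def parse_mcp_tool_name(name: str) -> Tuple[str, str]:
    """Split a namespaced name back into (server, tool)."""
    parts = name.split("__", 2)
    if len(parts) != 3 or parts[0] != "mcp":
        raise ValueError(f"not an MCP tool name: {name!r}")
    return parts[1], parts[2]


def flatten_mcp_tools(servers: Dict[str, List[dict]]) -> List[dict]:
    """Turn per-server tool definitions into top-level tool entries."""
    tools: List[dict] = []
    for server, definitions in servers.items():
        for definition in definitions:
            tools.append({**definition, "name": mcp_tool_name(server, definition["name"])})
    return tools
```

Every tool a server exposes becomes one more top-level entry in `tools`, so a server with many tools widens the model's choice set by that many entries.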

The core message is:

assistant content

{
  "type": "tool_use",
  "id": "call_6bTI6bbIJLsrHQHWuMcOUUiL",
  "name": "mcp__bing-search__bing_search",
  "input": {
    "query": "TypeScript concepts"
  },
  "cache_control": {
    "type": "ephemeral"
  }
}

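Takeaways

Every added MCP means at least one more tool, and every added tool makes it harder for the model to choose. Here I prefer the Bash command-line tool: simple, pure, and easy to control.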

About tokens

Tokens aren't selling, please help us out
Claude Code ships with many built-in prompts:

  • system prompt
    • Characters: 12k
    • Tokens: 2.4k
  • tools prompt
    • Characters: 62k
    • Tokens: 13.4k
  • total
    • Tokens: 15.8k
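
The arithmetic behind these figures is easy to check. The short Python sketch below uses only the numbers listed above; the 200,000-token context window in the usage note is an assumed value for illustration, not a figure from this article.

```python
# (characters, tokens) for each built-in prompt, as reported above
BUILTIN_PROMPTS = {
    "system prompt": (12_000, 2_400),
    "tools prompt": (62_000, 13_400),
}


def total_tokens(prompts: dict) -> int:
    """Sum the token counts of all built-in prompts."""
    return sum(tokens for _chars, tokens in prompts.values())


def chars_per_token(chars: int, tokens: int) -> float:
    """Average characters represented by one token."""
    return chars / tokens


def overhead_share(prompts: dict, context_window: int) -> float:
    """Fraction of a context window taken up by built-in prompts."""
    return total_tokens(prompts) / context_window
```

`total_tokens(BUILTIN_PROMPTS)` gives 15,800, matching the 15.8k total. With the assumed 200,000-token window, `overhead_share` comes to about 7.9% spent before the user has typed anything.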

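Summary

My current approach is to trim the fat:

  • Remove irrelevant tools (this needs a proxy)
  • For well-defined work, don't make the AI guess the toolchain; reduce tool conflicts and choice overload
  • To be continued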
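
The first bullet, stripping irrelevant tools, comes down to rewriting the `tools` array of each request before it reaches the model. The sketch below shows only that rewriting step in Python. `filter_tools` is a hypothetical helper, and the proxy around it (intercepting and forwarding requests) is left out.

```python
from typing import Iterable


def filter_tools(payload: dict, allowed: Iterable[str]) -> dict:
    """Return a copy of the request payload with only the allowed tools kept."""
    allowed_set = set(allowed)
    trimmed = dict(payload)  # shallow copy; the caller's payload is not modified
    trimmed["tools"] = [
        tool for tool in payload.get("tools", [])
        if tool.get("name") in allowed_set
    ]
    return trimmed
```

For example, a task that only needs to read and edit files could be sent with `allowed=["Read", "Edit"]`, dropping Task, Skill, and every MCP tool from that request.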
