Agent Project Implementation Template
1. Route Selection Principles
If you are actually going to start building, I recommend this route by default:
- Build L1/L3 first
- Do not start with a pure L2 loop agent
- Design the directory structure so it can be upgraded to a graph
- Keep tools, state, planning, and the executor separate
- Every side-effect tool must be auditable
In one sentence:
Build the Agent as an "LLM-driven workflow system" first, not as a "fully autonomous agent".
2. Project Types This Template Fits
This template suits the following projects:
- Internal enterprise knowledge assistants
- Research Agents
- Automated report generation
- Multi-tool office automation
- Ops / data-analysis assistants
- API-facing Agent services
Less suitable for:
- Pure chatbots
- Minimal services doing a single function call
- Extremely high-frequency, low-latency systems
3. Overall Architecture
Start with the overall layering.
```text
client / ui
    ↓
api layer
    ↓
application layer
    ↓
agent runtime
 ├── planner
 ├── executor
 ├── tools
 ├── state
 ├── memory
 ├── policies
 └── observability
    ↓
infra layer
 ├── llm provider
 ├── cache
 ├── db
 ├── queue
 └── tracing/logging
```
In more detail:
```text
User Request
    ↓
FastAPI Router
    ↓
Agent Service
    ↓
Task Interpreter
    ↓
Planner
    ↓
Plan Validator
    ↓
Executor
 ├── Tool Registry
 ├── State Store
 ├── Policy Check
 ├── Cache
 └── Retry Manager
    ↓
Final Synthesizer
    ↓
Response
```
4. Directory Structure
Here is a fairly practical directory layout.
```text
agent_app/
├── app/
│   ├── api/
│   │   ├── routers/
│   │   │   ├── health.py
│   │   │   ├── chat.py
│   │   │   └── tasks.py
│   │   └── deps.py
│   │
│   ├── core/
│   │   ├── config.py
│   │   ├── logging.py
│   │   ├── exceptions.py
│   │   └── security.py
│   │
│   ├── domain/
│   │   ├── models/
│   │   │   ├── task.py
│   │   │   ├── plan.py
│   │   │   ├── state.py
│   │   │   ├── tool.py
│   │   │   └── response.py
│   │   ├── enums.py
│   │   └── policies.py
│   │
│   ├── application/
│   │   ├── services/
│   │   │   ├── agent_service.py
│   │   │   ├── planning_service.py
│   │   │   ├── execution_service.py
│   │   │   └── evaluation_service.py
│   │   └── usecases/
│   │       ├── run_agent.py
│   │       └── run_workflow.py
│   │
│   ├── agent/
│   │   ├── planner/
│   │   │   ├── base.py
│   │   │   ├── llm_planner.py
│   │   │   └── prompts.py
│   │   ├── executor/
│   │   │   ├── base.py
│   │   │   ├── sequential_executor.py
│   │   │   ├── graph_executor.py
│   │   │   └── node_runner.py
│   │   ├── tools/
│   │   │   ├── base.py
│   │   │   ├── registry.py
│   │   │   ├── web_search.py
│   │   │   ├── database_query.py
│   │   │   ├── python_exec.py
│   │   │   └── email_sender.py
│   │   ├── memory/
│   │   │   ├── base.py
│   │   │   ├── short_term.py
│   │   │   └── vector_memory.py
│   │   ├── state/
│   │   │   ├── store.py
│   │   │   └── serializers.py
│   │   ├── policies/
│   │   │   ├── tool_policy.py
│   │   │   ├── budget_policy.py
│   │   │   └── safety_policy.py
│   │   ├── prompts/
│   │   │   ├── system_prompts.py
│   │   │   ├── planner_prompts.py
│   │   │   └── synthesis_prompts.py
│   │   └── runtime/
│   │       ├── orchestrator.py
│   │       ├── guards.py
│   │       └── context_builder.py
│   │
│   ├── integrations/
│   │   ├── llm/
│   │   │   ├── base.py
│   │   │   ├── openai_client.py
│   │   │   ├── anthropic_client.py
│   │   │   └── gemini_client.py
│   │   ├── storage/
│   │   │   ├── redis_store.py
│   │   │   ├── postgres_store.py
│   │   │   └── s3_store.py
│   │   ├── cache/
│   │   │   └── redis_cache.py
│   │   └── tracing/
│   │       ├── logger.py
│   │       ├── metrics.py
│   │       └── opentelemetry.py
│   │
│   ├── tests/
│   │   ├── unit/
│   │   ├── integration/
│   │   ├── evals/
│   │   └── fixtures/
│   │
│   └── main.py
│
├── scripts/
│   ├── run_local.sh
│   ├── seed_tools.py
│   └── eval_runner.py
│
├── configs/
│   ├── dev.yaml
│   ├── prod.yaml
│   └── prompts.yaml
│
├── Dockerfile
├── docker-compose.yml
├── pyproject.toml
└── README.md
```
5. Module Responsibilities
5.1 api/
Handles the external HTTP interface.
Should include:
- /chat
- /tasks/run
- /health
- /metrics (optional)
Principles:
- Do not put Agent core logic in the API layer
- Only do parameter parsing, auth, and usecase invocation
5.2 domain/
Core data structures live here; do not mix in business logic.
Suggested core models:
task.py
```python
from pydantic import BaseModel
from typing import Optional, Dict, Any

class TaskRequest(BaseModel):
    user_id: str
    query: str
    metadata: Dict[str, Any] = {}
    session_id: Optional[str] = None
```
plan.py
```python
from pydantic import BaseModel
from typing import List, Dict, Any, Optional

class PlanStep(BaseModel):
    name: str
    kind: str  # tool | llm
    tool_name: Optional[str] = None
    instruction: Optional[str] = None
    args: Dict[str, Any] = {}
    depends_on: List[str] = []

class ExecutionPlan(BaseModel):
    steps: List[PlanStep]
    goal: str
```
state.py
```python
from pydantic import BaseModel, Field
from typing import Dict, Any

class AgentState(BaseModel):
    values: Dict[str, Any] = Field(default_factory=dict)

    def set(self, key: str, value: Any):
        self.values[key] = value

    def get(self, key: str, default=None):
        return self.values.get(key, default)
```
tool.py
```python
from pydantic import BaseModel
from typing import Dict, Any, Optional

class ToolCall(BaseModel):
    tool_name: str
    args: Dict[str, Any]

class ToolResult(BaseModel):
    success: bool
    output: Any
    error: Optional[str] = None
```
5.3 application/
This is the use-case orchestration layer: it turns API requests into domain service calls.
Examples: run_agent.py and run_workflow.py.
run_agent.py
```python
class RunAgentUseCase:
    def __init__(self, agent_service):
        self.agent_service = agent_service

    async def execute(self, request):
        return await self.agent_service.run(request)
```
5.4 agent/planner/
Responsible for task planning.
Responsibilities:
- Parse the user's goal
- Output a linear plan (L3) or a task graph (L4)
- Never execute anything itself
Planner interface:
```python
from abc import ABC, abstractmethod

class BasePlanner(ABC):
    @abstractmethod
    async def create_plan(self, task_request):
        ...
```
A simple planner:
```python
class LLMPlanner(BasePlanner):
    def __init__(self, llm_client, prompt_builder):
        self.llm = llm_client
        self.prompt_builder = prompt_builder

    async def create_plan(self, task_request):
        prompt = self.prompt_builder.build(task_request)
        raw = await self.llm.generate(prompt)
        # parse_plan validates the raw LLM output into an ExecutionPlan
        return parse_plan(raw)
```
Key principles:
- Planner output must be structured
- Ideally validated against a JSON schema
- The planner must not execute tools directly
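As a minimal sketch of what that validation can look like, `parse_plan` can lean on the Pydantic models defined in domain/ above; the `PlanParseError` type is an assumption introduced here so the orchestrator has a single failure mode to catch and retry on.

```python
import json
from typing import Any, Dict, List, Optional
from pydantic import BaseModel, ValidationError

class PlanStep(BaseModel):
    name: str
    kind: str  # tool | llm
    tool_name: Optional[str] = None
    instruction: Optional[str] = None
    args: Dict[str, Any] = {}
    depends_on: List[str] = []

class ExecutionPlan(BaseModel):
    steps: List[PlanStep]
    goal: str

class PlanParseError(Exception):
    """Raised when the planner's output is not a valid plan."""

def parse_plan(raw: str) -> ExecutionPlan:
    """Parse raw LLM output (expected to be JSON) into a validated plan."""
    try:
        data = json.loads(raw)
        return ExecutionPlan(**data)
    except (json.JSONDecodeError, ValidationError, TypeError) as exc:
        # Collapse all failure modes into one error the caller can retry on.
        raise PlanParseError(str(exc)) from exc
```

In practice you would also prompt the model with the JSON schema itself (e.g. via `ExecutionPlan.model_json_schema()` in Pydantic v2) so its output has a fighting chance of validating on the first try.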
5.5 agent/executor/
Executes the plan; this is the heart of the system.
L3: sequential executor
```python
class SequentialExecutor:
    def __init__(self, tool_registry, llm_client, state_store):
        self.tool_registry = tool_registry
        self.llm = llm_client
        self.state_store = state_store

    async def execute(self, plan, state):
        for step in plan.steps:
            if step.kind == "tool":
                tool = self.tool_registry.get(step.tool_name)
                result = await tool.run(step.args, state)
                state.set(step.name, result)
            elif step.kind == "llm":
                # build_step_prompt renders the step instruction plus relevant state
                prompt = build_step_prompt(step, state)
                result = await self.llm.generate(prompt)
                state.set(step.name, result)
        return state
```
L4: graph executor
```python
import asyncio

class GraphExecutor:
    def __init__(self, node_runner):
        self.node_runner = node_runner

    async def execute(self, plan, state):
        completed = set()
        while len(completed) < len(plan.steps):
            ready = [
                s for s in plan.steps
                if s.name not in completed
                and all(dep in completed for dep in s.depends_on)
            ]
            if not ready:
                # Unsatisfiable dependencies: the plan has a cycle or a bad step name
                raise RuntimeError("Execution plan is not a valid DAG")
            results = await asyncio.gather(
                *(self.node_runner.run(step, state) for step in ready)
            )
            for step, result in zip(ready, results):
                state.set(step.name, result)
                completed.add(step.name)
        return state
```
5.6 agent/tools/
This is the easiest place to get wrong.
Give every tool the same interface.
Tool base class:
```python
from abc import ABC, abstractmethod

class BaseTool(ABC):
    name: str
    description: str

    @abstractmethod
    async def run(self, args, state):
        ...
```
Example: web search
```python
class WebSearchTool(BaseTool):
    name = "web_search"
    description = "Search the web for information"

    async def run(self, args, state):
        query = args["query"]
        # Call a third-party search API here
        return {"query": query, "results": ["result1", "result2"]}
```
Tool registry:
```python
class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def register(self, tool):
        self._tools[tool.name] = tool

    def get(self, name):
        if name not in self._tools:
            raise ValueError(f"Tool not found: {name}")
        return self._tools[name]
```
Tool design principles
Every tool must make explicit:
- Input schema
- Output schema
- Whether it has side effects
- Permission level
- Timeout
- Retry policy
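One way to make these properties machine-readable is to attach a small spec object to each tool that the registry and policies can inspect. This is a sketch; the field names and the `ToolSpec` class are assumptions, not part of the template above.

```python
from dataclasses import dataclass
from typing import Any, Dict

@dataclass(frozen=True)
class ToolSpec:
    """Declarative tool metadata: policies and guards read this, not the tool code."""
    name: str
    input_schema: Dict[str, Any]    # e.g. a JSON Schema for the args
    output_schema: Dict[str, Any]
    has_side_effects: bool = False
    permission_level: str = "guest"  # guest | employee | admin
    timeout_seconds: float = 30.0
    max_retries: int = 0

# Example spec for the web_search tool shown earlier.
WEB_SEARCH_SPEC = ToolSpec(
    name="web_search",
    input_schema={"type": "object", "properties": {"query": {"type": "string"}}},
    output_schema={"type": "object"},
    has_side_effects=False,
    timeout_seconds=10.0,
    max_retries=2,
)
```

With specs in place, the registry can refuse to register a tool without one, and the policy layer can gate side-effect tools without ever importing their implementations.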
5.7 agent/policies/
A production system must have these.
At minimum, three kinds of policy are recommended.
1) Tool permission policy
```python
class ToolPolicy:
    def is_allowed(self, user, tool_name):
        forbidden = {"delete_user", "refund_payment"}
        return tool_name not in forbidden
```
2) Budget policy
```python
class BudgetPolicy:
    def __init__(self, max_steps=8, max_tokens=20000):
        self.max_steps = max_steps
        self.max_tokens = max_tokens

    def allows(self, steps_used, tokens_used):
        return steps_used < self.max_steps and tokens_used < self.max_tokens
```
3) Safety policy
- Prompt injection detection
- PII masking
- Approval for sensitive actions
- Review of outbound email
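A real safety policy is usually a dedicated service with a classifier behind it. Purely as an illustrative first pass (the patterns and class shape here are assumptions, not a recommendation to rely on keyword matching), a naive screen over tool arguments might look like:

```python
import re

# Toy patterns; real systems pair a screen like this with a trained classifier.
INJECTION_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all )?previous instructions",
        r"you are now",
        r"system prompt",
    )
]

class SafetyPolicy:
    def looks_like_injection(self, text: str) -> bool:
        return any(p.search(text) for p in INJECTION_PATTERNS)

    def check_tool_args(self, args: dict) -> None:
        """Raise before a tool runs if any string argument looks hostile."""
        for value in args.values():
            if isinstance(value, str) and self.looks_like_injection(value):
                raise PermissionError("Possible prompt injection in tool args")
```

The point is the hook, not the patterns: the orchestrator calls `check_tool_args` before every tool execution, so the detection logic can be upgraded without touching the executor.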
5.8 agent/runtime/
This is the core of orchestration.
orchestrator.py
```python
class AgentOrchestrator:
    def __init__(self, planner, executor, synthesizer, policies):
        self.planner = planner
        self.executor = executor
        self.synthesizer = synthesizer
        self.policies = policies

    async def run(self, task_request):
        self.policies.check_request(task_request)
        plan = await self.planner.create_plan(task_request)
        self.policies.check_plan(plan)
        state = AgentState()
        state = await self.executor.execute(plan, state)
        return await self.synthesizer.generate(task_request, state)
```
guards.py
Responsible for:
- max steps
- timeout
- loop detection
- tool retry guard
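A sketch of what guards.py can contain, assuming the executor calls `before_step()` once per step (the class and method names are illustrative):

```python
class GuardViolation(Exception):
    """Raised when a run exceeds its limits; the orchestrator aborts the run."""

class ExecutionGuard:
    """Tracks per-run limits: step budget plus naive loop detection."""

    def __init__(self, max_steps: int = 8, max_repeats: int = 2):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.steps_taken = 0
        self.call_counts: dict = {}

    def before_step(self, step_signature: str) -> None:
        """step_signature should encode tool name + args, e.g. 'web_search:{query}'."""
        self.steps_taken += 1
        if self.steps_taken > self.max_steps:
            raise GuardViolation(f"max_steps exceeded ({self.max_steps})")
        # Loop detection: the same signature executed too many times.
        count = self.call_counts.get(step_signature, 0) + 1
        self.call_counts[step_signature] = count
        if count > self.max_repeats:
            raise GuardViolation(f"possible loop on step: {step_signature}")
```

Timeouts are best handled separately with `asyncio.wait_for` around the whole `execute()` call, since a step-level counter cannot see wall-clock time spent inside a single tool.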
5.9 integrations/llm/
This layer must be replaceable.
Interface:
```python
class BaseLLMClient:
    async def generate(self, prompt: str) -> str:
        raise NotImplementedError
```
OpenAI example:
```python
class OpenAIClient(BaseLLMClient):
    def __init__(self, client, model):
        self.client = client
        self.model = model

    async def generate(self, prompt: str) -> str:
        resp = await self.client.responses.create(
            model=self.model,
            input=prompt
        )
        return resp.output_text
```
Capabilities worth encapsulating here:
- Retries
- Timeouts
- Token accounting
- Tracing
- Tool-call abstraction
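One way to fold retry and timeout in without touching any provider client is a thin wrapper over the `BaseLLMClient` interface. A sketch (the backoff numbers and the set of retryable exceptions are assumptions; tune both for your provider):

```python
import asyncio

class ResilientLLMClient:
    """Wraps any client exposing `async generate(prompt) -> str`."""

    def __init__(self, inner, timeout_s: float = 30.0,
                 max_attempts: int = 3, backoff_s: float = 1.0):
        self.inner = inner
        self.timeout_s = timeout_s
        self.max_attempts = max_attempts
        self.backoff_s = backoff_s

    async def generate(self, prompt: str) -> str:
        last_exc = None
        for attempt in range(self.max_attempts):
            try:
                # Bound each attempt's wall-clock time.
                return await asyncio.wait_for(
                    self.inner.generate(prompt), timeout=self.timeout_s
                )
            except (asyncio.TimeoutError, ConnectionError) as exc:
                last_exc = exc
                # Exponential backoff between attempts.
                await asyncio.sleep(self.backoff_s * (2 ** attempt))
        raise last_exc
```

Because the wrapper speaks the same interface, the planner, executor, and synthesizer never know whether they hold a raw client or a wrapped one; token accounting and tracing can be layered the same way.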
5.10 integrations/tracing/
Tracing is mandatory in production.
Record at least:
- request_id
- session_id
- user_id
- plan
- tool_name
- tool_args
- tool_result summary
- token usage
- latency
- final answer quality signal
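The fields above fit naturally into one structured record per tool call, emitted as a JSON log line or an OpenTelemetry span attribute set. A sketch of one possible shape (the exact field names are assumptions):

```python
import json
import time
from dataclasses import dataclass, asdict, field

@dataclass
class ToolCallTrace:
    """One structured record per tool invocation."""
    request_id: str
    session_id: str
    user_id: str
    tool_name: str
    tool_args: dict
    result_summary: str     # summarized, never the raw payload
    tokens_used: int
    latency_ms: float
    ts: float = field(default_factory=time.time)

    def to_json(self) -> str:
        return json.dumps(asdict(self))

trace = ToolCallTrace(
    request_id="req-001",
    session_id="s1",
    user_id="u1",
    tool_name="web_search",
    tool_args={"query": "agents"},
    result_summary="2 results",
    tokens_used=512,
    latency_ms=230.0,
)
```

Summarizing `tool_result` instead of logging it verbatim matters twice over: it keeps log volume sane and keeps PII out of the tracing backend.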
6. A Minimal Runnable L3 Agent Template
Here is a minimal skeleton.
6.1 agent_service.py
```python
class AgentService:
    def __init__(self, orchestrator):
        self.orchestrator = orchestrator

    async def run(self, request):
        return await self.orchestrator.run(request)
```
6.2 llm_planner.py
```python
class LLMPlanner:
    def __init__(self, llm_client):
        self.llm = llm_client

    async def create_plan(self, task_request):
        # In a real project, ask the model for JSON output here.
        # Simplified to a fixed plan for this skeleton.
        return ExecutionPlan(
            goal=task_request.query,
            steps=[
                PlanStep(
                    name="search",
                    kind="tool",
                    tool_name="web_search",
                    args={"query": task_request.query}
                ),
                PlanStep(
                    name="summary",
                    kind="llm",
                    instruction="Summarize the search results"
                )
            ]
        )
```
6.3 sequential_executor.py
```python
class SequentialExecutor:
    def __init__(self, tool_registry, llm_client):
        self.tool_registry = tool_registry
        self.llm = llm_client

    async def execute(self, plan, state):
        for step in plan.steps:
            if step.kind == "tool":
                tool = self.tool_registry.get(step.tool_name)
                result = await tool.run(step.args, state)
                state.set(step.name, result)
            elif step.kind == "llm":
                prompt = f"{step.instruction}\nContext:\n{state.values}"
                result = await self.llm.generate(prompt)
                state.set(step.name, result)
        return state
```
6.4 web_search.py
```python
class WebSearchTool:
    name = "web_search"
    description = "Search web content"

    async def run(self, args, state):
        query = args["query"]
        # Pretend to call a search engine
        return {
            "query": query,
            "results": [
                {"title": "Result A", "snippet": "Info A"},
                {"title": "Result B", "snippet": "Info B"},
            ]
        }
```
6.5 synthesizer.py
```python
class FinalSynthesizer:
    def __init__(self, llm_client):
        self.llm = llm_client

    async def generate(self, task_request, state):
        prompt = f"""
User request: {task_request.query}
Execution state:
{state.values}
Produce a final user-friendly answer.
"""
        return await self.llm.generate(prompt)
```
6.6 main.py
```python
from fastapi import FastAPI

app = FastAPI()

# agent_service is assembled at startup (dependency wiring omitted in this skeleton)

@app.post("/tasks/run")
async def run_task(payload: dict):
    request = TaskRequest(**payload)
    result = await agent_service.run(request)
    return {"result": result}
```
7. Design Reserves for Upgrading from L3 to L4
If you are building L3 now but may upgrade to L4 later, reserve these interfaces today.
7.1 Reserve a dependency field in PlanStep
Even if you don't use a graph yet, add it now:
```python
depends_on: List[str] = []
```
This allows a smooth upgrade to a DAG later.
7.2 Abstract the executor interface
```python
from abc import ABC, abstractmethod

class BaseExecutor(ABC):
    @abstractmethod
    async def execute(self, plan, state):
        ...
```
Then you can swap freely between SequentialExecutor and GraphExecutor.
7.3 Don't hard-code state as string concatenation
Use structured state:
```python
state.set("search_result", {...})
state.set("analysis", {...})
```
This is what makes passing data between nodes reliable.
7.4 Keep tool output structured
Don't just return a blob of text.
Bad:
```python
return "I found some web pages..."
```
Good:
```python
return {
    "results": [...],
    "source_count": 5,
    "query": query
}
```
8. Recommended Tech Stack
Chosen with pragmatic delivery in mind.
8.1 Web/API
- FastAPI
8.2 Data models
- Pydantic
8.3 LLM access
- Official SDKs (OpenAI / Anthropic / Gemini)
- Or LiteLLM for multi-model adaptation
8.4 Cache
- Redis
8.5 State / persistence
- Postgres
- Or MongoDB (if document-oriented)
8.6 Observability
- OpenTelemetry
- Prometheus + Grafana
- Langfuse / Helicone / Phoenix (pick one of these)
8.7 Task queue
- Celery / Dramatiq / Arq
- Prefect / Airflow for batch processing
8.8 Vector retrieval
- pgvector / Qdrant / Weaviate
9. Recommended Open-Source Projects
Choose by the level you are building.
9.1 If you are building L1/L3
Prefer:
- Official SDKs
- PydanticAI
- LiteLLM
- A home-grown planner + executor
This is the combination I recommend most; it is the most stable.
9.2 If you are building L4
Prefer:
- LangGraph
- Prefect / Dagster / Airflow (workflow-oriented)
- A home-grown graph executor (if you have a platform team)
9.3 If you are experimenting / prototyping
Usable:
- LangChain
- AutoGen
- CrewAI
But:
- Use them to validate ideas
- Don't default to them as the final production core
10. Test Strategy Template
Testing is the most neglected part of Agent projects. Use at least three layers.
10.1 Unit tests
Test:
- Planner output structure
- Tool inputs and outputs
- Policy checks
- State update logic
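A minimal sketch of such unit tests, pytest-style, against the ToolRegistry from section 5.6 (the registry is restated inline here so the file is self-contained; in the real project you would import it):

```python
# tests/unit/test_tool_registry.py -- pytest discovers and runs test_* functions.

class DummyTool:
    name = "web_search"

class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def register(self, tool):
        self._tools[tool.name] = tool

    def get(self, name):
        if name not in self._tools:
            raise ValueError(f"Tool not found: {name}")
        return self._tools[name]

def test_registry_roundtrip():
    registry = ToolRegistry()
    registry.register(DummyTool())
    assert registry.get("web_search").name == "web_search"

def test_registry_unknown_tool():
    registry = ToolRegistry()
    try:
        registry.get("missing")
        assert False, "expected ValueError"
    except ValueError:
        pass
```

The same pattern covers the other bullets: feed the planner a fixed request and assert on plan structure, feed a policy a forbidden tool name and assert it refuses.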
10.2 Integration tests
Test:
- The full API → planner → executor → tool → synthesizer chain
- Third-party APIs mocked out
- Timeouts and retry-on-failure
10.3 Evals
These are the core tests of an Agent project.
Prepare a dataset:
```json
[
  {
    "query": "Analyze AI coding market trends",
    "expected_signals": ["market", "trend", "competitor"]
  },
  {
    "query": "Check the status of order 123",
    "expected_tool": "order_query"
  }
]
```
Evaluate:
- Was the right tool chosen
- Was the task completed successfully
- Latency
- Token cost
- Answer quality
11. Protections Required in Production
This part is critical.
11.1 Max limits
- max_steps
- max_tool_calls
- max_tokens
- max_execution_time
11.2 Tool allowlist
Gate tools by user role:
```python
allowed_tools = {
    "guest": ["web_search"],
    "employee": ["web_search", "db_query"],
    "admin": ["web_search", "db_query", "send_email"]
}
```
11.3 Approval for side-effect tools
For example:
- send_email
- create_ticket
- update_db
These should support:
- dry-run
- human approval
- audit log
11.4 Fallback
When the planner fails:
- Fall back to a fixed workflow
- Fall back to a simple answer
- Fall back to a human
12. A Simplified README Template
Something you can write in your project:
Project Overview
This project implements a production-oriented AI Agent system with:
- planner + executor architecture
- structured tools
- safety policies
- observability and evaluation hooks
Architecture
- L3 by default
- L4-ready abstractions
- tool registry
- state store
- policy engine
Run locally
```bash
cp .env.example .env
docker-compose up -d
uvicorn app.main:app --reload
```
Test
```bash
pytest
python scripts/eval_runner.py
```
13. The Most Recommended Minimal Path
If you want to start right now, build this MVP architecture first.
First version:
- FastAPI
- Pydantic
- Official LLM SDK
- ToolRegistry
- LLMPlanner
- SequentialExecutor
- Redis cache
- Postgres logs
- Langfuse tracing
Don't build yet:
- No multi-agent
- No complex memory
- No free-running loop agent
- No full graph scheduling
- No heavy framework lock-in
14. Finally, a Copy-Paste-Ready Minimal Skeleton
```text
app/
├── api/
├── domain/
├── application/
├── agent/
│   ├── planner/
│   ├── executor/
│   ├── tools/
│   ├── policies/
│   ├── runtime/
│   └── prompts/
├── integrations/
│   ├── llm/
│   ├── cache/
│   ├── storage/
│   └── tracing/
└── main.py
```
The five most important classes are:
- LLMPlanner
- SequentialExecutor
- ToolRegistry
- AgentState
- AgentOrchestrator
Get these five classes right, and you already have the working core of a genuinely deployable Agent system.