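Building an AI Agent with Executive Function: A Working-Memory-Based Architecture for Task Planning and Metacognitive Monitoring
Core thesis: real intelligence lies less in calling tools than in monitoring and regulating one's own cognitive state. This article implements an Agent state-management system modeled on the "executive function" theory from cognitive psychology.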
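I. Cognitive Architecture Foundations: Why Does an Agent Need Working Memory?
1.1 From Human Cognition to Agent Architecture
In the human brain, the prefrontal cortex is responsible for executive function, which comprises three core components:
- Working memory: temporarily holds task context and intermediate state
- Cognitive flexibility: switches task strategy in response to feedback from the environment
- Inhibitory control: suppresses irrelevant distractions and keeps the goal in focus
These map onto an AI Agent architecture as follows: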
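| Cognitive function | Agent implementation | Module in this article |
|---|---|---|
| Working memory | Short-term state storage and updates | WorkingMemory (evolved from TodoManager) |
| Cognitive flexibility | State-based task replanning | ReplanningStrategy |
| Inhibitory control | Monitoring that prevents context drift | MetacognitiveMonitor (evolved from Nag Reminder) |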
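1.2 Limits of Plain Tool Calling
Agents built on the traditional ReAct pattern suffer from a **"goldfish memory"** problem:
- They cannot maintain task context across rounds
- They easily lose direction in complex tasks (goal drift)
- They lack metacognition about their own progress (no way to assess "how much have I finished?")
Solution: introduce an explicit symbolic working memory that externalizes task state as structured data which can be persisted and queried.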
II. Architecture Design: A Layered State-Management System
2.1 System Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│                       Cognitive Layer                       │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │ Goal Manager │  │   Working    │  │  Metacognitive   │   │
│  │              │  │    Memory    │  │     Monitor      │   │
│  └──────┬───────┘  └──────┬───────┘  └────────┬─────────┘   │
└─────────┼─────────────────┼───────────────────┼─────────────┘
          │                 │                   │
┌─────────┼─────────────────┼───────────────────┼─────────────┐
│         │          Application Layer          │             │
│  ┌──────┴───────┐  ┌──────┴───────┐  ┌────────┴─────────┐   │
│  │     Task     │  │    State     │  │    Reflection    │   │
│  │  Decomposer  │  │  Transition  │  │      Engine      │   │
│  └──────┬───────┘  └──────┬───────┘  └────────┬─────────┘   │
└─────────┼─────────────────┼───────────────────┼─────────────┘
          │                 │                   │
┌─────────┼─────────────────┼───────────────────┼─────────────┐
│         │        Infrastructure Layer         │             │
│  ┌──────┴───────┐  ┌──────┴───────┐  ┌────────┴─────────┐   │
│  │    State     │  │    Event     │  │   Concurrency    │   │
│  │ Persistence  │  │     Bus      │  │     Control      │   │
│  └──────────────┘  └──────────────┘  └──────────────────┘   │
└─────────────────────────────────────────────────────────────┘
2.2 Core Domain Model
java
import java.time.Instant;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.stream.Collectors;
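/**
 * Domain object: a task item.
 * Implemented as an immutable Java record, so every state change has to go
 * through a domain method that returns a new instance.
 */
public record TaskItem(
    String id,
    String description,
    TaskStatus status,
    int priority,                  // priority 1-5, used for conflict resolution
    Instant createdAt,
    Instant startedAt,             // timestamps of state transitions
    Instant completedAt,
    Map<String, Object> metadata   // extension fields: context, tool parameters, etc.
) {
    public TaskItem {
        Objects.requireNonNull(id, "task id must not be null");
        Objects.requireNonNull(description, "task description must not be null");
        if (priority < 1 || priority > 5) {
            throw new IllegalArgumentException("priority must be between 1 and 5");
        }
        // Defensive copy: otherwise callers could mutate the map behind the record's back
        metadata = metadata == null ? Map.of() : Map.copyOf(metadata);
    }

    // Factory method: create a pending task
    public static TaskItem pending(String id, String description, int priority) {
        return new TaskItem(
            id, description, TaskStatus.PENDING, priority,
            Instant.now(), null, null, Map.of()
        );
    }

    // State transition methods: they follow the state-machine rules in TaskStatus
    public TaskItem start() {
        // canTransitionTo allows PENDING -> IN_PROGRESS and BLOCKED -> IN_PROGRESS
        if (!status.canTransitionTo(TaskStatus.IN_PROGRESS)) {
            throw new IllegalStateException("task cannot be started from status " + status);
        }
        return new TaskItem(
            id, description, TaskStatus.IN_PROGRESS, priority,
            createdAt, startedAt != null ? startedAt : Instant.now(), completedAt, metadata
        );
    }

    public TaskItem complete(Map<String, Object> result) {
        if (status != TaskStatus.IN_PROGRESS) {
            throw new IllegalStateException("only an in-progress task can be completed");
        }
        Map<String, Object> newMeta = new HashMap<>(metadata);
        newMeta.put("result", result == null ? Map.of() : result);
        return new TaskItem(
            id, description, TaskStatus.COMPLETED, priority,
            createdAt, startedAt, Instant.now(), newMeta
        );
    }

    public TaskItem block(String reason) {
        Objects.requireNonNull(reason, "block reason must not be null");
        // Terminal states must stay terminal
        if (status == TaskStatus.COMPLETED || status == TaskStatus.CANCELLED) {
            throw new IllegalStateException("a finished task cannot be blocked: " + status);
        }
        // Keep the existing metadata instead of replacing it wholesale
        Map<String, Object> newMeta = new HashMap<>(metadata);
        newMeta.put("blockReason", reason);
        newMeta.put("previousStatus", status.name());
        return new TaskItem(
            id, description, TaskStatus.BLOCKED, priority,
            createdAt, startedAt, completedAt, newMeta
        );
    }
}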
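/**
 * Status enum: a closed set of states plus the legal transitions between them.
 * (This is a plain enum, not a sealed class. In a real project each public
 * type also lives in its own source file.)
 */
public enum TaskStatus {
    PENDING("Pending", "waiting to be executed"),
    IN_PROGRESS("In progress", "the current focus task"),
    COMPLETED("Completed", "goal achieved"),
    BLOCKED("Blocked", "needs external input or a resolved dependency"),
    CANCELLED("Cancelled", "goal abandoned");

    private final String label;
    private final String description;

    TaskStatus(String label, String description) {
        this.label = label;
        this.description = description;
    }

    public String label() { return label; }
    public String description() { return description; }

    // Transition validation
    public boolean canTransitionTo(TaskStatus newStatus) {
        return switch (this) {
            case PENDING -> newStatus == IN_PROGRESS || newStatus == CANCELLED;
            case IN_PROGRESS -> newStatus == COMPLETED || newStatus == BLOCKED || newStatus == CANCELLED;
            case BLOCKED -> newStatus == IN_PROGRESS || newStatus == CANCELLED;
            case COMPLETED, CANCELLED -> false; // terminal states
        };
    }
}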
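The transition rules above can also be written as a lookup table, which makes it easy to funnel every state change through one guard. The following is a minimal sketch under that assumption: `Status` and `TransitionGuard` are hypothetical stand-ins written for this illustration, not types from the article, and the table simply mirrors the cases in `canTransitionTo`.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for TaskStatus, reduced to the bare states. */
enum Status { PENDING, IN_PROGRESS, COMPLETED, BLOCKED, CANCELLED }

final class TransitionGuard {
    // Allowed targets per source state, mirroring canTransitionTo
    private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);
    static {
        ALLOWED.put(Status.PENDING, EnumSet.of(Status.IN_PROGRESS, Status.CANCELLED));
        ALLOWED.put(Status.IN_PROGRESS,
                EnumSet.of(Status.COMPLETED, Status.BLOCKED, Status.CANCELLED));
        ALLOWED.put(Status.BLOCKED, EnumSet.of(Status.IN_PROGRESS, Status.CANCELLED));
        ALLOWED.put(Status.COMPLETED, EnumSet.noneOf(Status.class)); // terminal
        ALLOWED.put(Status.CANCELLED, EnumSet.noneOf(Status.class)); // terminal
    }

    static boolean canTransition(Status from, Status to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Returns the target status, or throws if the move is not in the table. */
    static Status transition(Status from, Status to) {
        if (!canTransition(from, to)) {
            throw new IllegalStateException("Illegal transition: " + from + " -> " + to);
        }
        return to;
    }
}
```

A single guard like this keeps the domain methods and the enum from drifting apart, since both would consult the same table.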
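III. The Working Memory Manager: A Production-Grade Implementation
3.1 Architecture Evolution: From a Simple List to Transactional Storage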
java
import java.time.Instant;
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.locks.*;
import java.util.function.Predicate;
import java.util.stream.Collectors;
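/**
 * WorkingMemory: the Agent's working-memory system.
 * Responsibilities:
 * 1. Maintain the current task context (focus task + background tasks)
 * 2. Provide atomic state updates (transaction-style)
 * 3. Concurrency control (prevent race conditions)
 * 4. Event publication (state-change notifications)
 */
public class WorkingMemory {
    // Concurrency control: the read-write lock allows concurrent reads and exclusive writes
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Lock readLock = lock.readLock();
    private final Lock writeLock = lock.writeLock();
    // State storage
    private final Map<String, TaskItem> taskStore = new ConcurrentHashMap<>();
    private final Deque<String> focusStack = new ConcurrentLinkedDeque<>(); // focus-task stack (supports nested subtasks)
    private String currentFocusId = null;
    // Domain events (these drive the metacognitive monitor)
    private final List<StateChangeEvent> eventLog = new CopyOnWriteArrayList<>();
    private final List<MemoryEventListener> listeners = new CopyOnWriteArrayList<>();
    // Configuration limits
    private final int maxTasks;
    private final int maxFocusHistory;

    public WorkingMemory(int maxTasks, int maxFocusHistory) {
        this.maxTasks = maxTasks;
        this.maxFocusHistory = maxFocusHistory;
    }

    // Registers a state-change listener; MetacognitiveMonitor calls this from its constructor
    public void addListener(MemoryEventListener listener) {
        listeners.add(Objects.requireNonNull(listener, "listener must not be null"));
    }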
    // ------------------- Core operations -------------------
    /**
     * Atomic batch update (transaction-style semantics).
     * All validation runs before the first write, so a rejected batch leaves the store untouched.
     * Unlike a database transaction, there is no rollback if an exception is thrown midway.
     */
    public BatchUpdateResult atomicUpdate(List<TaskItem> newTasks,
                                          List<String> tasksToRemove,
                                          Optional<String> newFocusId) {
        writeLock.lock();
        try {
            // Pre-validation: compute the ids that would exist after the batch, so that
            // updates to existing tasks and removals of unknown ids are not miscounted
            Set<String> projectedIds = new HashSet<>(taskStore.keySet());
            newTasks.forEach(t -> projectedIds.add(t.id()));
            tasksToRemove.forEach(projectedIds::remove);
            if (projectedIds.size() > maxTasks) {
                return BatchUpdateResult.failure("Task limit exceeded: " + maxTasks);
            }
            // The new focus task must exist once the batch has been applied
            if (newFocusId.isPresent() && !projectedIds.contains(newFocusId.get())) {
                return BatchUpdateResult.failure("Focus task does not exist: " + newFocusId.get());
            }
            // Apply the update
            List<StateChangeEvent> changes = new ArrayList<>();
            // 1. Add or update tasks
            for (TaskItem task : newTasks) {
                TaskItem old = taskStore.put(task.id(), task);
                changes.add(new StateChangeEvent(
                    old == null ? ChangeType.CREATED : ChangeType.UPDATED,
                    old, task, Instant.now()
                ));
            }
            // 2. Remove tasks
            for (String id : tasksToRemove) {
                TaskItem removed = taskStore.remove(id);
                if (removed != null) {
                    changes.add(new StateChangeEvent(
                        ChangeType.REMOVED, removed, null, Instant.now()
                    ));
                }
                if (id.equals(currentFocusId)) {
                    currentFocusId = null;
                }
            }
            // 3. Update the focus (the previous focus goes onto the bounded history stack)
            newFocusId.ifPresent(id -> {
                if (currentFocusId != null && !currentFocusId.equals(id)) {
                    focusStack.push(currentFocusId);
                    if (focusStack.size() > maxFocusHistory) {
                        focusStack.removeLast();
                    }
                }
                currentFocusId = id;
            });
            // 4. Append the events to the log
            eventLog.addAll(changes);
            // 5. Publish the events (listeners run synchronously, while the write lock is still held)
            changes.forEach(this::notifyListeners);
            return BatchUpdateResult.success(changes);
        } finally {
            writeLock.unlock();
        }
    }
/**
 * Get the current cognitive context (used to build the LLM prompt)
 */
public CognitiveContext getCognitiveContext() {
readLock.lock();
try {
TaskItem focus = currentFocusId != null ? taskStore.get(currentFocusId) : null;
List<TaskItem> pending = taskStore.values().stream()
.filter(t -> t.status() == TaskStatus.PENDING)
.sorted(Comparator.comparingInt(TaskItem::priority).reversed())
.limit(5)
.collect(Collectors.toList());
List<TaskItem> blocked = taskStore.values().stream()
.filter(t -> t.status() == TaskStatus.BLOCKED)
.collect(Collectors.toList());
return new CognitiveContext(focus, pending, blocked, focusStack.size());
} finally {
readLock.unlock();
}
}
/**
 * Serialize to an LLM-readable text format (a chain-of-thought prompt aid)
 */
public String toPromptRepresentation() {
CognitiveContext ctx = getCognitiveContext();
StringBuilder sb = new StringBuilder();
sb.append("<working_memory>\n");
// Focus task (the core of the current context window)
if (ctx.focus() != null) {
sb.append(String.format(" <focus_task id=\"%s\" status=\"%s\">\n",
ctx.focus().id(), ctx.focus().status()));
sb.append(String.format(" %s\n", ctx.focus().description()));
if (!ctx.focus().metadata().isEmpty()) {
sb.append(" <context>\n");
ctx.focus().metadata().forEach((k, v) ->
sb.append(String.format(" %s: %s\n", k, v)));
sb.append(" </context>\n");
}
sb.append(" </focus_task>\n");
}
// Pending queue (the "periphery" of working memory)
if (!ctx.pending().isEmpty()) {
sb.append(" <pending_queue>\n");
ctx.pending().forEach(t ->
sb.append(String.format(" [P%d] #%s: %s\n",
t.priority(), t.id(), t.description())));
sb.append(" </pending_queue>\n");
}
// Blocked tasks (these need intervention)
if (!ctx.blocked().isEmpty()) {
sb.append(" <blocked_tasks>\n");
ctx.blocked().forEach(t -> {
sb.append(String.format(" #%s: %s (Reason: %s)\n",
t.id(), t.description(),
t.metadata().getOrDefault("blockReason", "Unknown")));
});
sb.append(" </blocked_tasks>\n");
}
// Metacognitive hints
double completionRate = calculateCompletionRate();
sb.append(String.format(" <meta_progress>%.1f%%</meta_progress>\n", completionRate * 100));
sb.append(String.format(" <context_depth>%d</context_depth>\n", ctx.focusStackDepth()));
sb.append("</working_memory>\n");
return sb.toString();
}
private double calculateCompletionRate() {
long total = taskStore.size();
if (total == 0) return 0.0;
long completed = taskStore.values().stream()
.filter(t -> t.status() == TaskStatus.COMPLETED)
.count();
return (double) completed / total;
}
private void notifyListeners(StateChangeEvent event) {
listeners.forEach(listener -> {
try {
listener.onStateChanged(event);
} catch (Exception e) {
// Keep a failing listener from breaking the main flow
System.err.println("Listener error: " + e.getMessage());
}
});
}
// ------------------- Domain events -------------------
public record CognitiveContext(
TaskItem focus,
List<TaskItem> pending,
List<TaskItem> blocked,
int focusStackDepth
) {}
public record StateChangeEvent(
ChangeType type,
TaskItem oldState,
TaskItem newState,
Instant timestamp
) {}
public enum ChangeType { CREATED, UPDATED, REMOVED }
public interface MemoryEventListener {
void onStateChanged(StateChangeEvent event);
}
public record BatchUpdateResult(
boolean success,
String errorMessage,
List<StateChangeEvent> changes
) {
static BatchUpdateResult success(List<StateChangeEvent> changes) {
return new BatchUpdateResult(true, null, changes);
}
static BatchUpdateResult failure(String message) {
return new BatchUpdateResult(false, message, List.of());
}
}
}
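For the "all or nothing" semantics described in `atomicUpdate`, one common way to get rollback for free is to build the new state on a private copy and publish it with a single reference swap. The sketch below assumes that approach; `AtomicBatchStore` and its string-valued map are hypothetical simplifications, not the article's `WorkingMemory`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store illustrating copy, validate, swap. */
final class AtomicBatchStore {
    private final int maxEntries;
    private Map<String, String> entries = new HashMap<>();

    AtomicBatchStore(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Applies all puts and removals, or none of them. */
    synchronized boolean applyBatch(Map<String, String> puts, List<String> removals) {
        Map<String, String> draft = new HashMap<>(entries); // work on a private copy
        draft.putAll(puts);
        removals.forEach(draft::remove);
        if (draft.size() > maxEntries) {
            return false; // the draft is discarded; the live map was never touched
        }
        entries = draft; // one reference swap publishes the whole batch
        return true;
    }

    synchronized Map<String, String> snapshot() {
        return Map.copyOf(entries);
    }
}
```

The trade-off is an O(n) copy per batch, which is usually acceptable for a task list capped at a small `maxTasks`.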
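IV. Metacognitive Monitoring: From Nag Reminder to Intelligent Intervention
4.1 A Theoretical Model of Cognitive Monitoring
Metacognitive monitoring operates at three levels:
- Plan monitoring: checks whether a reasonable execution plan has been made
- Process monitoring: checks whether the current action is consistent with the goal
- Outcome monitoring: evaluates the quality of completed tasks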
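Process monitoring needs some notion of "no progress for too long". A minimal sketch of such a timer follows, assuming that progress is reported explicitly and that the current time is passed in so the logic stays testable; `StagnationTracker` is a hypothetical helper, not part of the article's code.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: tracks when a task last made real progress. */
final class StagnationTracker {
    private final Duration threshold;
    private Instant lastProgress;

    StagnationTracker(Duration threshold, Instant startedAt) {
        this.threshold = threshold;
        this.lastProgress = startedAt;
    }

    /** Call only when the task state actually changes, not on every tool call. */
    void recordProgress(Instant when) {
        lastProgress = when;
    }

    /** True once strictly more than the threshold has passed without progress. */
    boolean isStagnant(Instant now) {
        return Duration.between(lastProgress, now).compareTo(threshold) > 0;
    }
}
```

Resetting the timer on every tool call would make the check unable to fire, which is why progress and activity are kept separate here.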
4.2 Adaptive Monitoring Engine
java
import java.time.Duration;
import java.time.Instant;
import java.util.*;
/**
 * MetacognitiveMonitor: the metacognitive monitor.
 * Responsibilities:
 * 1. Detect cognitive drift (working memory not updated for a long time)
 * 2. Identify task stagnation (a task stuck in IN_PROGRESS for too long)
 * 3. Trigger reflection or replanning
 * 4. Adapt the reminder frequency (avoid over-intervening)
 */
public class MetacognitiveMonitor implements WorkingMemory.MemoryEventListener {
    private final WorkingMemory memory;
    private final LLMGateway llmGateway; // gateway for talking to the LLM
    // Monitoring configuration
    private final Duration stagnationThreshold; // task stagnation threshold (e.g. 5 minutes)
    private final int maxRoundsWithoutUpdate;   // maximum tolerated rounds without an update
    // A context-drift threshold based on semantic similarity is not wired up in this
    // version, so the field is omitted (an unassigned final field would not compile).
    // Runtime state
    private final Map<String, TaskMetrics> taskMetrics = new HashMap<>();
    private int roundsSinceLastUpdate = 0;
    private Instant lastMemoryUpdate = Instant.now();
    private List<String> recentToolCalls = new ArrayList<>(); // used to detect repeated patterns

    public MetacognitiveMonitor(WorkingMemory memory, LLMGateway llmGateway,
                                Duration stagnationThreshold,
                                int maxRoundsWithoutUpdate) {
        this.memory = memory;
        this.llmGateway = llmGateway;
        this.stagnationThreshold = stagnationThreshold;
        this.maxRoundsWithoutUpdate = maxRoundsWithoutUpdate;
        this.memory.addListener(this);
    }
    /**
     * Called once per Agent loop iteration to run the monitoring checks
     */
    public MonitoringResult afterRound(List<ToolCall> currentRoundTools) {
        roundsSinceLastUpdate++;
        recentToolCalls.addAll(currentRoundTools.stream().map(ToolCall::name).toList());
        List<Intervention> interventions = new ArrayList<>();
        // Check 1: working memory has not been updated for too long
        if (roundsSinceLastUpdate >= maxRoundsWithoutUpdate) {
            interventions.add(Intervention.memoryUpdateRequired(
                "Task state has not been updated for " + roundsSinceLastUpdate
                    + " rounds; review progress and update the to-do list"
            ));
        }
        // Check 2: task stagnation (IN_PROGRESS for a long time without change).
        // focus() returns a nullable TaskItem, so wrap it before calling ifPresent.
        Optional.ofNullable(memory.getCognitiveContext().focus()).ifPresent(focus -> {
            TaskMetrics metrics = taskMetrics.computeIfAbsent(focus.id(),
                k -> new TaskMetrics(focus.id()));
            metrics.recordActivity(currentRoundTools);
            if (metrics.isStagnant(stagnationThreshold)) {
                interventions.add(Intervention.taskStagnation(
                    focus.id(),
                    "Task #" + focus.id() + " has been stagnant for more than "
                        + stagnationThreshold.toMinutes() + " minutes",
                    suggestRecoveryStrategy(metrics)
                ));
            }
            // Check 3: abnormal tool-call pattern (the same tool called repeatedly;
            // only tool names are compared here, not parameters)
            if (metrics.detectRepetitivePattern(3)) {
                interventions.add(Intervention.repetitiveBehavior(
                    "Repetitive behavior detected; consider changing strategy or asking for help"
                ));
            }
        });
        // Check 4: context-window pollution (simple heuristic: keep only the last few calls)
        if (recentToolCalls.size() > 10) {
            // Copy the tail: a bare subList is a view that keeps the whole backing list alive
            recentToolCalls = new ArrayList<>(
                recentToolCalls.subList(recentToolCalls.size() - 5, recentToolCalls.size()));
        }
        return new MonitoringResult(interventions, calculateCognitiveLoad());
    }
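    /**
     * State-change callback: resets the monitoring counters
     */
    @Override
    public void onStateChanged(WorkingMemory.StateChangeEvent event) {
        roundsSinceLastUpdate = 0;
        lastMemoryUpdate = Instant.now();
        // A genuine status change counts as progress for that task's stagnation timer
        if (event.newState() != null
                && (event.oldState() == null
                    || event.oldState().status() != event.newState().status())) {
            TaskMetrics metrics = taskMetrics.get(event.newState().id());
            if (metrics != null) {
                metrics.lastStatusChange = Instant.now();
            }
        }
        // When a task completes, drop its metrics
        if (event.type() == WorkingMemory.ChangeType.UPDATED &&
                event.newState() != null &&
                event.newState().status() == TaskStatus.COMPLETED) {
            taskMetrics.remove(event.newState().id());
        }
    }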
    private RecoveryStrategy suggestRecoveryStrategy(TaskMetrics metrics) {
        // Suggest a recovery strategy based on the collected metrics
        if (metrics.failureCount > 3) {
            return RecoveryStrategy.DECOMPOSE;       // suggest decomposing the task
        } else if (metrics.sameToolRepeatCount > 2) {
            return RecoveryStrategy.CHANGE_APPROACH; // suggest a different approach
        } else {
            return RecoveryStrategy.SEEK_HELP;       // suggest asking for help
        }
    }

    private double calculateCognitiveLoad() {
        // Rough cognitive-load estimate: pending tasks plus blocked tasks at double weight
        var ctx = memory.getCognitiveContext();
        int load = ctx.pending().size() + ctx.blocked().size() * 2;
        return Math.min(1.0, load / 10.0); // normalized to the range 0-1
    }
// ------------------- Domain objects -------------------
public record MonitoringResult(
List<Intervention> interventions,
double cognitiveLoad
) {
public boolean requiresIntervention() {
return !interventions.isEmpty();
}
public String toPromptInjection() {
if (interventions.isEmpty()) return "";
StringBuilder sb = new StringBuilder();
sb.append("<metacognitive_alert>\n");
sb.append(String.format(" <cognitive_load>%.0f%%</cognitive_load>\n", cognitiveLoad * 100));
sb.append(" <interventions>\n");
interventions.forEach(i -> {
sb.append(String.format(" <intervention type=\"%s\">\n", i.type()));
sb.append(String.format(" <message>%s</message>\n", i.message()));
i.suggestedStrategy().ifPresent(s ->
sb.append(String.format(" <suggestion>%s</suggestion>\n", s)));
sb.append(" </intervention>\n");
});
sb.append(" </interventions>\n");
sb.append("</metacognitive_alert>\n");
return sb.toString();
}
}
public record Intervention(
InterventionType type,
String message,
Optional<RecoveryStrategy> suggestedStrategy
) {
static Intervention memoryUpdateRequired(String msg) {
return new Intervention(InterventionType.MEMORY_UPDATE, msg, Optional.empty());
}
static Intervention taskStagnation(String taskId, String msg, RecoveryStrategy strategy) {
return new Intervention(InterventionType.STAGNATION, msg, Optional.of(strategy));
}
static Intervention repetitiveBehavior(String msg) {
return new Intervention(InterventionType.REPETITION, msg, Optional.of(RecoveryStrategy.CHANGE_APPROACH));
}
}
public enum InterventionType {
    MEMORY_UPDATE, // working memory needs an update
    STAGNATION,    // task stagnation
    REPETITION,    // repetitive pattern
    PLAN_DRIFT     // plan drift
}
public enum RecoveryStrategy {
    DECOMPOSE,       // decompose the task
    CHANGE_APPROACH, // change the approach
    SEEK_HELP,       // ask for help
    REPLAN           // replan
}
private static class TaskMetrics {
final String taskId;
Instant lastStatusChange = Instant.now();
int toolCallCount = 0;
int failureCount = 0;
String lastToolName = null;
int sameToolRepeatCount = 0;
List<String> toolSequence = new ArrayList<>();
TaskMetrics(String taskId) {
this.taskId = taskId;
}
        void recordActivity(List<ToolCall> tools) {
            toolCallCount += tools.size();
            for (ToolCall tool : tools) {
                if (tool.name().equals(lastToolName)) {
                    sameToolRepeatCount++;
                } else {
                    sameToolRepeatCount = 0;
                    lastToolName = tool.name();
                }
                if (!tool.success()) failureCount++;
                toolSequence.add(tool.name());
            }
            // lastStatusChange is deliberately not touched here: resetting it on every
            // round would mean isStagnant() could never return true.
        }

        boolean isStagnant(Duration threshold) {
            return Duration.between(lastStatusChange, Instant.now()).compareTo(threshold) > 0;
        }

        boolean detectRepetitivePattern(int threshold) {
            if (toolSequence.size() < threshold * 2) return false;
            // Simple check: are the last `threshold` tool names all identical?
            List<String> recent = toolSequence.subList(
                Math.max(0, toolSequence.size() - threshold),
                toolSequence.size()
            );
            Set<String> unique = new HashSet<>(recent);
            return unique.size() == 1 && recent.size() >= threshold; // same tool called consecutively
        }
    }
public record ToolCall(String name, boolean success, Map<String, Object> params) {}
public interface LLMGateway {
// Used to trigger deep reflection or replanning
String requestReflection(String context);
}
}
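The repetition check above compares tool names only. If repeated calls with identical parameters are the real signal of a stuck Agent, a bounded sliding window over (tool, params) pairs is one way to detect them. This is a sketch under that assumption; `RepetitionDetector` and its `Call` record are hypothetical names, and parameter maps are assumed to contain no null keys or values.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

/** Hypothetical detector: flags N identical consecutive calls (same tool, same params). */
final class RepetitionDetector {
    record Call(String tool, Map<String, Object> params) {}

    private final int window;
    private final Deque<Call> recent = new ArrayDeque<>();

    RepetitionDetector(int window) {
        if (window < 2) throw new IllegalArgumentException("window must be >= 2");
        this.window = window;
    }

    /** Records a call and reports whether the last `window` calls are all equal. */
    boolean record(String tool, Map<String, Object> params) {
        recent.addLast(new Call(tool, Map.copyOf(params)));
        if (recent.size() > window) {
            recent.removeFirst(); // keep memory bounded
        }
        if (recent.size() < window) {
            return false;
        }
        Call first = recent.peekFirst();
        return recent.stream().allMatch(first::equals);
    }
}
```

Because records derive `equals` from their components, two calls match only when both the tool name and the full parameter map are equal.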
V. The Agent Core Loop: Integration and Coordination
5.1 The Complete Agent Main Loop
java
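import java.time.Duration;
import java.util.*;
/**
 * ExecutiveAgent: an Agent with executive function.
 * A complete implementation that integrates working memory with metacognitive monitoring.
 */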
public class ExecutiveAgent {
private final WorkingMemory workingMemory;
private final MetacognitiveMonitor monitor;
private final ToolRegistry toolRegistry;
private final LLMClient llmClient;
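    // Configuration
    private final int maxIterations;
    private final boolean autoReflect; // whether automatic reflection is enabled

    public ExecutiveAgent(LLMClient llmClient, ToolRegistry toolRegistry,
                          int maxTasks, int maxIterations) {
        this.llmClient = llmClient;
        this.toolRegistry = toolRegistry;
        this.maxIterations = maxIterations;
        this.autoReflect = false; // not used by the loop below; assigned so the final field compiles
        this.workingMemory = new WorkingMemory(maxTasks, 5); // at most 5 levels of nesting
        this.monitor = new MetacognitiveMonitor(
            workingMemory,
            // LLMGateway returns a String, so unwrap the content of the LLMResponse
            context -> llmClient.generate(context).content(),
            Duration.ofMinutes(5), // 5-minute stagnation threshold
            3                      // remind after 3 rounds without an update
        );
    }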
    /**
     * Execute a user request
     */
    public AgentResult execute(UserRequest request) {
        // Initialization: decompose the user's goal into initial tasks
        initializeTasks(request);
        List<Message> conversation = new ArrayList<>();
        conversation.add(new Message("user", request.content()));
        for (int round = 0; round < maxIterations; round++) {
            System.out.println("=== Round " + (round + 1) + " ===");
            // 1. Build the cognitively augmented prompt
            String augmentedPrompt = constructCognitivePrompt(conversation);
            // 2. Call the LLM
            LLMResponse response = llmClient.generate(augmentedPrompt);
            // 3. Read the tool calls (LLMResponse already carries them in parsed form)
            List<MetacognitiveMonitor.ToolCall> toolCalls =
                response.toolCalls() != null ? response.toolCalls() : List.of();
            // 4. Execute the tools (possibly including the todo tool)
            List<ToolResult> results = executeTools(toolCalls);
            // 5. Metacognitive monitoring (the key improvement)
            MetacognitiveMonitor.MonitoringResult monitoring =
                monitor.afterRound(toolCalls);
            // 6. Build the context for the next round
            conversation.add(new Message("assistant", response.content()));
            // If there are intervention suggestions, insert them into the context
            if (monitoring.requiresIntervention()) {
                String intervention = monitoring.toPromptInjection();
                conversation.add(new Message("system", intervention));
                System.out.println(">>> Metacognitive intervention: " + intervention);
            }
            // Append the tool results
            results.forEach(r -> conversation.add(new Message("tool", r.toString())));
            // 7. Check the termination condition
            if (isTaskComplete()) {
                return AgentResult.success(workingMemory.getCognitiveContext());
            }
        }
        return AgentResult.incomplete(workingMemory.getCognitiveContext(),
            "Maximum number of iterations reached");
    }
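    private String constructCognitivePrompt(List<Message> conversation) {
        StringBuilder prompt = new StringBuilder();
        // System prompt: defines the Agent's executive functions
        prompt.append("""
            <system>
            You are an AI Agent with executive function. You have the following capabilities:
            1. Use tools (bash, read_file, write_file, edit_file)
            2. Manage working memory (update task state through the todo tool)
            3. Self-monitor (pay attention to <metacognitive_alert> prompts)
            Working memory protocol:
            - Update task state at least once every 1-3 rounds (using the todo tool)
            - Only one task may be IN_PROGRESS at a time
            - Mark a task COMPLETED as soon as it is done
            - When blocked, mark the task BLOCKED and state the reason
            State transitions:
            PENDING -> IN_PROGRESS -> COMPLETED
                            |
                            v
                         BLOCKED (state the reason)
            </system>
            """);
        // Inject the current working memory (chain-of-thought context)
        prompt.append(workingMemory.toPromptRepresentation());
        // Conversation history
        conversation.forEach(msg ->
            prompt.append(String.format("<%s>%s</%s>\n",
                msg.role(), msg.content(), msg.role())));
        return prompt.toString();
    }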
    private List<ToolResult> executeTools(List<MetacognitiveMonitor.ToolCall> toolCalls) {
        List<ToolResult> results = new ArrayList<>();
        for (MetacognitiveMonitor.ToolCall call : toolCalls) {
            try {
                Object result;
                if (call.name().equals("todo")) {
                    // Special case: the todo tool operates on working memory directly
                    result = handleTodoTool(call.params());
                } else {
                    // Regular tool call
                    result = toolRegistry.execute(call.name(), call.params());
                }
results.add(new ToolResult(call.name(), true, result));
} catch (Exception e) {
results.add(new ToolResult(call.name(), false, e.getMessage()));
}
}
return results;
}
    private String handleTodoTool(Map<String, Object> params) {
        @SuppressWarnings("unchecked")
        List<Map<String, Object>> items = (List<Map<String, Object>>) params.get("items");
        if (items == null) {
            return "Update failed: missing \"items\"";
        }
        List<TaskItem> newTasks = new ArrayList<>();
        for (Map<String, Object> item : items) {
            String id = (String) item.get("id");
            String text = (String) item.get("text");
            String statusStr = (String) item.getOrDefault("status", "pending");
            // JSON parsers may return Integer, Long, or Double, so go through Number
            int priority = ((Number) item.getOrDefault("priority", 3)).intValue();
            TaskItem task = TaskItem.pending(id, text, priority);
            // Move the fresh PENDING item to the requested status
            TaskStatus targetStatus = TaskStatus.valueOf(statusStr.toUpperCase());
            task = switch (targetStatus) {
                case IN_PROGRESS -> task.start();
                // complete() is only legal from IN_PROGRESS, so pass through start() first
                case COMPLETED -> task.start().complete(Map.of());
                case BLOCKED -> task.block((String) item.getOrDefault("reason", "Unknown"));
                // CANCELLED has no transition method on TaskItem, so the item stays PENDING
                default -> task;
            };
            newTasks.add(task);
        }
        var result = workingMemory.atomicUpdate(newTasks, List.of(), Optional.empty());
        return result.success() ? "Update succeeded" : "Update failed: " + result.errorMessage();
    }
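    private boolean isTaskComplete() {
        var ctx = workingMemory.getCognitiveContext();
        // Done when nothing is pending or blocked and the focus task, if any, is completed.
        // (The original returned true whenever there was no focus task, even with work pending.)
        boolean nothingOutstanding = ctx.pending().isEmpty() && ctx.blocked().isEmpty();
        boolean focusDone = ctx.focus() == null
            || ctx.focus().status() == TaskStatus.COMPLETED;
        return nothingOutstanding && focusDone;
    }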
    private void initializeTasks(UserRequest request) {
        // Use the LLM for the initial task decomposition
        String decompositionPrompt = String.format("""
            Decompose the following goal into 3-5 concrete, executable tasks and return a JSON array:
            [{"id": "1", "description": "task description", "priority": 1-5}]
            Goal: %s
            """, request.content());
        // Simplification: a single initial task is created here and decompositionPrompt is
        // left unused; a real implementation would send it to the LLM
        TaskItem initialTask = TaskItem.pending("task-1", request.content(), 1);
        workingMemory.atomicUpdate(
            List.of(initialTask),
            List.of(),
            Optional.of("task-1")
        );
    }
    // Record definitions (CognitiveContext is nested in WorkingMemory, so it is qualified here)
    public record Message(String role, String content) {}
    public record UserRequest(String content) {}
    public record AgentResult(boolean success, WorkingMemory.CognitiveContext finalState, String reason) {
        static AgentResult success(WorkingMemory.CognitiveContext state) {
            return new AgentResult(true, state, "Task completed");
        }
        static AgentResult incomplete(WorkingMemory.CognitiveContext state, String reason) {
            return new AgentResult(false, state, reason);
        }
    }
public record ToolResult(String toolName, boolean success, Object data) {}
public interface ToolRegistry {
Object execute(String name, Map<String, Object> params);
}
public interface LLMClient {
LLMResponse generate(String prompt);
}
public record LLMResponse(String content, List<MetacognitiveMonitor.ToolCall> toolCalls) {}
}
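Because the todo tool's arguments come from model-generated JSON, parsing them defensively is worth a small helper. The following sketch shows one way to validate a single item before it touches working memory. `TodoItemParser`, its nested `Status` enum, and `ParsedItem` are hypothetical names for this illustration; the field names `id`, `text`, `status`, and `priority` follow `handleTodoTool` above.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser for one entry of the todo tool's "items" list. */
final class TodoItemParser {
    enum Status { PENDING, IN_PROGRESS, COMPLETED, BLOCKED, CANCELLED }
    record ParsedItem(String id, String text, Status status, int priority) {}

    static ParsedItem parse(Map<String, Object> item) {
        Object id = item.get("id");
        Object text = item.get("text");
        if (!(id instanceof String) || !(text instanceof String)) {
            throw new IllegalArgumentException("id and text must be strings: " + item);
        }
        // JSON libraries may hand back Integer, Long, or Double; accept any Number
        int priority = item.get("priority") instanceof Number n ? n.intValue() : 3;
        if (priority < 1 || priority > 5) {
            throw new IllegalArgumentException("priority must be between 1 and 5");
        }
        String raw = String.valueOf(item.getOrDefault("status", "pending"));
        Status status;
        try {
            status = Status.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("unknown status: " + raw);
        }
        return new ParsedItem((String) id, (String) text, status, priority);
    }
}
```

Failing fast with a descriptive message gives the model something actionable in the tool result, instead of a bare `ClassCastException`.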
VI. Advanced Topics and Production Optimization
6.1 State Persistence and Recovery
java
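import java.io.IOException;
import java.nio.file.*;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
/**
 * PersistableWorkingMemory: a persistence layer based on event sourcing.
 * Supports saving, restoring, and auditing Agent sessions.
 * Assumes taskStore and eventLog are declared protected in WorkingMemory
 * (private fields are not visible to a subclass).
 */
public class PersistableWorkingMemory extends WorkingMemory {
    private final Path snapshotPath;
    private final ObjectMapper mapper;

    public PersistableWorkingMemory(int maxTasks, int maxFocusHistory, Path snapshotPath) {
        super(maxTasks, maxFocusHistory);
        this.snapshotPath = snapshotPath;
        // Instant fields need the jackson-datatype-jsr310 module on the classpath;
        // findAndRegisterModules() picks it up when present.
        this.mapper = new ObjectMapper().findAndRegisterModules();
    }

    public void saveSnapshot(String sessionId) throws IOException {
        Snapshot snapshot = new Snapshot(
            sessionId,
            Instant.now(),
            Map.copyOf(taskStore),
            List.copyOf(eventLog)
        );
        Files.writeString(
            snapshotPath.resolve(sessionId + ".json"),
            mapper.writeValueAsString(snapshot)
        );
    }

    public void restoreFromSnapshot(String sessionId) throws IOException {
        String json = Files.readString(snapshotPath.resolve(sessionId + ".json"));
        Snapshot snapshot = mapper.readValue(json, Snapshot.class);
        // Replay the events to restore state...
    }

    public record Snapshot(String sessionId, Instant timestamp,
                           Map<String, TaskItem> tasks,
                           List<StateChangeEvent> events) {}
}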
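The restore step above is left open in the article. One plausible reading of "replay the events" is a left fold over the log, oldest event first. The sketch below illustrates only that idea with a deliberately simplified event model: `EventReplay`, its `Event` record, and the string-valued status are hypothetical and much smaller than the article's `StateChangeEvent`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified event model used only to illustrate replay. */
final class EventReplay {
    enum ChangeType { CREATED, UPDATED, REMOVED }
    record Event(ChangeType type, String taskId, String newStatus) {}

    /** Folds the log, oldest first, into the task-id to status map it describes. */
    static Map<String, String> replay(List<Event> log) {
        Map<String, String> state = new LinkedHashMap<>();
        for (Event e : log) {
            switch (e.type()) {
                case CREATED, UPDATED -> state.put(e.taskId(), e.newStatus());
                case REMOVED -> state.remove(e.taskId());
            }
        }
        return state;
    }
}
```

Replay of this kind only reproduces the original state if the log is complete and ordered, which is why a snapshot plus the events recorded after it is a common combination.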
6.2 Multi-Agent Coordination
When several Agents need to collaborate, WorkingMemory can be extended into a shared blackboard system (Blackboard Architecture):
java
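public class CollaborativeBlackboard {
    // Shared workspace that multiple Agents can read and write
    private final WorkingMemory sharedMemory;
    private final Map<String, AgentIdentity> contributors;

    // AgentIdentity is not defined in this article; any type that identifies an Agent will do
    public CollaborativeBlackboard(WorkingMemory sharedMemory,
                                   Map<String, AgentIdentity> contributors) {
        this.sharedMemory = sharedMemory;
        this.contributors = contributors;
    }

    // To implement: task delegation, result aggregation, conflict resolution
    public void delegateTask(String taskId, AgentIdentity agent) {
        // Mark the task as delegated so that several Agents do not execute it twice
    }
}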
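The "mark the task as delegated" step needs an atomic check-and-set, otherwise two Agents can both see an unclaimed task and start it. A minimal sketch using `ConcurrentHashMap.putIfAbsent` follows; `TaskClaims` is a hypothetical helper, and Agent ids are plain strings here because the article does not define `AgentIdentity`.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical claim registry; agent ids are plain strings for brevity. */
final class TaskClaims {
    private final ConcurrentMap<String, String> ownerByTask = new ConcurrentHashMap<>();

    /** Returns true only for the first agent to claim the task. */
    boolean tryClaim(String taskId, String agentId) {
        return ownerByTask.putIfAbsent(taskId, agentId) == null;
    }

    /** Releases the claim only if it is held by the given agent. */
    boolean release(String taskId, String agentId) {
        return ownerByTask.remove(taskId, agentId);
    }

    Optional<String> ownerOf(String taskId) {
        return Optional.ofNullable(ownerByTask.get(taskId));
    }
}
```

Both `putIfAbsent` and the two-argument `remove` are atomic on a `ConcurrentMap`, so no extra locking is needed for claiming or releasing.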
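VII. Pattern Comparison and the Evolution of Cognitive Architectures
| Architecture pattern | State management | Self-monitoring | Suitable scenarios | Complexity |
|---|---|---|---|---|
| ReAct | Implicit (prompt context) | None | Simple tool calling | ⭐⭐ |
| Plan-and-Solve | Static plan | None | Tasks with well-defined steps | ⭐⭐⭐ |
| This article's architecture | Explicit working memory | Metacognitive monitoring | Complex long-horizon tasks | ⭐⭐⭐⭐ |
| Reflexion | Try, fail, reflect | Post-hoc evaluation | Tasks that need self-improvement | ⭐⭐⭐⭐ |
| Multi-Agent | Distributed shared state | Mutual monitoring | Complex collaborative systems | ⭐⭐⭐⭐⭐ |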
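VIII. Summary: Building an Agent with "Executive Function"
What this article implements is more than a to-do list manager; it is an executive-function architecture grounded in principles from cognitive science:
- Working memory system: WorkingMemory keeps context across rounds, which mitigates the limits of the model's context window
- State-machine constraints: the transition rules in TaskStatus guard against cognitive confusion (such as running several focus tasks at once)
- Metacognitive monitoring: MetacognitiveMonitor provides self-monitoring and detects stagnation, repetition, and drift
- Intervention mechanism: system-level prompt injection gives the Agent a human-like way of "reminding itself"
Key design philosophy:
- Explicit over implicit: do not rely on the LLM to implicitly "remember" tasks; require structured state instead
- Constraints as intelligence: state-machine limits (single focus, mandatory updates) keep the Agent from running out of control
- Monitoring as feedback: the metacognitive layer not only monitors but also proposes recovery strategies, closing the cognitive loop
This architecture is meant to let an Agent handle complex tasks that take dozens of interaction rounds (such as large code refactorings or multi-step data analysis) without losing direction or getting stuck in a loop.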