Alibaba's Open-Source AgentScope Multi-Agent Framework, Analysis Series (Part 3), Chapter 3: The Model Interface and the Adapter Pattern

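Overview

The Model is the bridge between an Agent and an LLM in AgentScope. Through the Model interface and the Formatter system, AgentScope provides a single abstraction over multiple LLM vendors, so developers can switch models without modifying Agent logic.

This chapter covers the Model interface design, multi-model support, streaming responses, generation parameters, and Formatter adaptation, and explains how AgentScope supports different LLMs behind one interface.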

3.1 Model Interface Design

3.1.1 Why Is a Model Interface Needed?

The problem

Without a unified interface:

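Problem 1: Code is coupled to the LLM
├─ OpenAI API call logic is mixed into Agent logic
├─ Switching to Anthropic means rewriting Agent code
└─ Violates the open-closed principle

Problem 2: Message formats are not uniform
├─ OpenAI requires its own message format
├─ DashScope has its own format requirements
└─ The Agent cannot handle messages in one consistent way

Problem 3: Handling feature differences is complex
├─ Different LLMs support different features (tools, vision, etc.)
├─ The Agent fills up with if-else branches
└─ The code becomes hard to maintain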

AgentScope's solution: Model interface + Formatter

Agent ←→ Model interface (unified) ←→ Formatter ←→ LLM API
         │                            │
         │                            ├─ OpenAIFormatter
         │                            ├─ DashScopeFormatter
         │                            ├─ AnthropicFormatter
         │                            └─ GeminiFormatter
         │
         └─ The Agent does not care which LLM is used; it only calls the Model interface

3.1.2 Core Methods of the Model Interface

public interface Model {

    /**
     * Calls the LLM in streaming mode.
     * @param messages message list (in AgentScope's unified Msg format)
     * @param tools tool definitions (optional)
     * @param options generation parameters (optional)
     * @return a Flux<ChatResponse> that emits response chunks
     */
    Flux<ChatResponse> stream(
        List<Msg> messages,
        List<ToolSchema> tools,
        GenerateOptions options
    );

    /**
     * Returns the model name.
     */
    String getModelName();
}

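Design characteristics:

Characteristic 1: Reactive
├─ Returns Flux<ChatResponse>, so responses can be processed as a stream
├─ Non-blocking I/O improves throughput
└─ Supports backpressure

Characteristic 2: Generality
├─ All parameters use AgentScope's unified formats
├─ The Formatter converts them into each LLM's own format
└─ The Agent does not need to know which LLM is behind the interface

Characteristic 3: Extensibility
├─ Adding an LLM only requires implementing the Model interface
├─ and writing the matching Formatter
└─ Existing code stays unchanged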
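
The three characteristics above can be illustrated with a deliberately simplified, dependency-free sketch. Everything in it is hypothetical: `SketchMsg`, `SketchFormatter`, `SketchModel` and the two vendor classes are stand-ins invented for illustration, not AgentScope classes. The "request" is just a string rather than an HTTP payload, and a plain `List<String>` takes the place of `Flux<ChatResponse>`.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-ins for illustration only; these are not AgentScope types.
record SketchMsg(String role, String text) {}

/** Converts unified messages into one vendor's request shape. */
interface SketchFormatter {
    String format(List<SketchMsg> messages);
}

/** The single interface the agent depends on. */
interface SketchModel {
    List<String> stream(List<SketchMsg> messages);
    String getModelName();
}

/** Base adapter: formats the request, then "calls" the vendor (echoed here). */
abstract class AbstractSketchModel implements SketchModel {
    private final SketchFormatter formatter;

    protected AbstractSketchModel(SketchFormatter formatter) {
        this.formatter = formatter;
    }

    @Override
    public List<String> stream(List<SketchMsg> messages) {
        String request = formatter.format(messages);
        // A real adapter would send the request over HTTP and emit chunks as they arrive.
        return List.of("[" + getModelName() + "] ", request);
    }
}

class VendorAModel extends AbstractSketchModel {
    VendorAModel() {
        super(msgs -> msgs.stream()
            .map(m -> "{role=" + m.role() + ", content=" + m.text() + "}")
            .collect(Collectors.joining(",", "messages=[", "]")));
    }
    @Override public String getModelName() { return "vendor-a"; }
}

class VendorBModel extends AbstractSketchModel {
    VendorBModel() {
        super(msgs -> msgs.stream()
            .map(m -> "{role=" + m.role() + ", parts=" + m.text() + "}")
            .collect(Collectors.joining(",", "contents=[", "]")));
    }
    @Override public String getModelName() { return "vendor-b"; }
}

class AdapterSketch {
    /** Agent-side logic: identical for every vendor. */
    static String ask(SketchModel model, String question) {
        return String.join("", model.stream(List.of(new SketchMsg("user", question))));
    }

    public static void main(String[] args) {
        System.out.println(ask(new VendorAModel(), "hi"));
        System.out.println(ask(new VendorBModel(), "hi"));
    }
}
```

Adding a third vendor in this sketch means adding one class with its own formatter; `ask` and the existing vendors stay untouched, which is the open-closed property the section describes.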

3.1.3 GenerateOptions: Generation Parameters

// GenerateOptions controls the LLM's behavior
GenerateOptions options = GenerateOptions.builder()
    .temperature(0.7)              // temperature (0-2; higher means more varied output)
    .topP(0.9)                     // nucleus sampling (affects diversity)
    .maxTokens(2000)               // maximum number of output tokens
    .frequencyPenalty(0.5)         // frequency penalty (suppresses repetition)
    .presencePenalty(0.5)          // presence penalty (encourages new topics)
    .stop(List.of("END"))          // stop sequences
    .seed(42)                      // random seed (for reproducible results)
    .toolChoice(ToolChoice.AUTO)   // tool selection strategy
    .executionConfig(ExecutionConfig.builder()
        .timeout(Duration.ofMinutes(1))
        .maxAttempts(3)            // maximum number of attempts
        .build())
    .build();

Msg response = agent.call(userMsg, options).block();

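Parameter details

Parameter	Range	Default	Description
temperature	0-2	0.7	Controls output randomness; higher values give more varied output
topP	0-1	0.9	Nucleus sampling: sample only from the most likely tokens whose cumulative probability reaches topP
maxTokens	≥ 1	model-dependent	Upper limit on the number of tokens generated in one response
frequencyPenalty	-2 to 2	0	Penalizes tokens in proportion to how often they have already appeared
presencePenalty	-2 to 2	0	Penalizes tokens that have appeared at all, nudging the model toward new topics

These ranges follow OpenAI-style conventions; some providers accept narrower ranges (Anthropic, for example, limits temperature to 0-1).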
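
To make `temperature` and `topP` concrete, here is a small sketch of the standard sampling math: logits are divided by the temperature before the softmax, and nucleus sampling keeps the smallest set of most likely tokens whose cumulative probability reaches `topP`. The class and method names are hypothetical and the logits are made up; real providers may differ in implementation details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SamplingSketch {
    /** Softmax over logits scaled by 1/temperature (temperature must be > 0). */
    static double[] softmax(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double l : logits) max = Math.max(max, l / temperature);
        double sum = 0;
        double[] probs = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp(logits[i] / temperature - max); // subtract max for numerical stability
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;
        return probs;
    }

    /** Indices kept by nucleus (top-p) filtering, most probable first. */
    static List<Integer> nucleus(double[] probs, double topP) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) order.add(i);
        order.sort(Comparator.comparingDouble((Integer i) -> probs[i]).reversed());
        List<Integer> kept = new ArrayList<>();
        double cumulative = 0;
        for (int i : order) {
            kept.add(i);
            cumulative += probs[i];
            if (cumulative >= topP) break;  // smallest prefix that reaches topP
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.0};  // made-up scores for three candidate tokens
        System.out.println("T=0.3: " + Arrays.toString(softmax(logits, 0.3)));
        System.out.println("T=1.2: " + Arrays.toString(softmax(logits, 1.2)));
        System.out.println("kept at topP=0.9: " + nucleus(softmax(logits, 1.0), 0.9));
    }
}
```

With these logits, a low temperature concentrates almost all probability on the first token, while a higher one spreads it out. At temperature 1.0 and `topP` 0.9, only the first two tokens survive the nucleus filter.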

In production: tune parameters to the use case

// Scenario 1: creative writing (needs diversity)
GenerateOptions creativeOptions = GenerateOptions.builder()
    .temperature(1.2)     // higher creativity
    .topP(0.95)           // more diversity
    .maxTokens(4000)      // allow longer output
    .build();

Msg story = agent.call(userMsg, creativeOptions).block();

// Scenario 2: technical Q&A (needs accuracy)
GenerateOptions technicalOptions = GenerateOptions.builder()
    .temperature(0.3)     // low temperature, more deterministic answers
    .topP(0.8)            // more conservative sampling
    .maxTokens(1000)      // keep it concise
    .build();

Msg answer = agent.call(userMsg, technicalOptions).block();

// Scenario 3: code generation (format must be correct)
GenerateOptions codeOptions = GenerateOptions.builder()
    .temperature(0.2)       // very low, to keep formatting stable
    .stop(List.of("```"))   // stop at a code fence; a "\n\n" stop is left out because it would cut the code at its first blank line
    .maxTokens(2000)
    .build();

Msg code = agent.call(userMsg, codeOptions).block();
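
A stop sequence ends generation at its first occurrence, and the sequence itself is not returned. The sketch below mimics that behavior on the client side to show how the choice of stop sequences shapes the result, in particular why a blank-line stop such as `"\n\n"` cuts multi-paragraph code short. `StopSequenceSketch` and `applyStop` are hypothetical names; the real truncation happens on the provider's side.

```java
import java.util.List;

class StopSequenceSketch {
    /** Returns the text up to (not including) the earliest stop sequence, if any. */
    static String applyStop(String generated, List<String> stops) {
        int cut = generated.length();
        for (String stop : stops) {
            int idx = generated.indexOf(stop);
            if (idx >= 0) cut = Math.min(cut, idx);  // earliest match wins
        }
        return generated.substring(0, cut);
    }

    public static void main(String[] args) {
        String output = "int a = 1;\n\nint b = 2;\nEND trailing text";
        // Only "END": both statements survive.
        System.out.println(applyStop(output, List.of("END")));
        System.out.println("---");
        // Adding "\n\n": everything after the first blank line is lost.
        System.out.println(applyStop(output, List.of("END", "\n\n")));
    }
}
```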

3.2 Multi-Model Support

3.2.1 Supported LLM Vendors

1. DashScope (Alibaba Cloud Tongyi Qianwen / Qwen)

// Create a DashScope model
DashScopeChatModel model = DashScopeChatModel.builder()
    .apiKey(System.getenv("DASHSCOPE_API_KEY"))
    .modelName("qwen-plus")         // or qwen-turbo, qwen-max, etc.
    .requestTimeout(Duration.ofSeconds(30))
    .build();

// Use it in a ReActAgent
ReActAgent agent = ReActAgent.builder()
    .name("Assistant")
    .model(model)
    .build();

DashScope strengths:

✓ China-based vendor: low latency and low prices for users in mainland China
✓ Optimized for Chinese
✓ Long-context input (up to 200K tokens, depending on the model)
✓ Reasoning (thinking) mode on supported Qwen models

2. OpenAI GPT

// Create an OpenAI model
OpenAIChatModel model = OpenAIChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o")            // or gpt-4-turbo, gpt-3.5-turbo
    .baseUrl("https://api.openai.com/v1")  // custom endpoints are supported
    .build();

ReActAgent agent = ReActAgent.builder()
    .model(model)
    .build();

OpenAI strengths:

✓ De facto industry standard with a complete feature set
✓ Vision (image understanding)
✓ JSON Mode (structured output)
✓ Rich community resources

3. Anthropic Claude

// Create an Anthropic model
AnthropicChatModel model = AnthropicChatModel.builder()
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))
    .modelName("claude-sonnet-4-5")  // a recent Claude model
    .build();

ReActAgent agent = ReActAgent.builder()
    .model(model)
    .build();

Claude strengths:

✓ Strong reasoning
✓ Exposes its thinking process (visible reasoning steps)
✓ 200K-token context window on current models
✓ Strong safety properties

4. Google Gemini

// Option 1: use the Gemini API
GeminiChatModel model = GeminiChatModel.builder()
    .apiKey(System.getenv("GEMINI_API_KEY"))
    .modelName("gemini-2.0-flash")
    .build();

// Option 2: use Vertex AI (GCP)
GeminiChatModel vertexModel = GeminiChatModel.builder()
    .modelName("gemini-2.0-flash")
    .project("your-gcp-project-id")
    .location("us-central1")
    .vertexAI(true)
    .build();

ReActAgent agent = ReActAgent.builder()
    .model(model)   // or vertexModel
    .build();

Gemini strengths:

✓ Multimodal (vision, audio)
✓ Very large context window (1M+ tokens)
✓ Strong reasoning
✓ Competitive pricing

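3.2.2 Model Compatibility and Selection Guide

Selection criteria:

┌─ Feature requirements
│  ├─ Need vision (image understanding)? → OpenAI / Claude / Gemini
│  ├─ Need strong reasoning? → Claude / Gemini / DashScope (Qwen reasoning models)
│  ├─ Need long context? → Gemini (1M) > DashScope (200K) ≈ Claude (200K)
│  └─ Need structured output? → OpenAI (JSON Mode) ✓
│
├─ Cost
│  ├─ Cost-sensitive → DashScope (cheapest)
│  ├─ Balanced → OpenAI / Anthropic
│  └─ Cost is no object → Claude (the author's pick for quality)
│
└─ Location
   ├─ Mainland China → DashScope (low latency)
   ├─ International → OpenAI (good stability)
   └─ GCP users → Gemini / Vertex AI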

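3.2.3 Managing Multiple Models

// Scenario: using several models side by side

// Cheap model for quick replies
Model fastModel = DashScopeChatModel.builder()
    .apiKey(System.getenv("DASHSCOPE_API_KEY"))
    .modelName("qwen-turbo")  // fast and inexpensive
    .build();

// Stronger model for complex tasks
Model powerfulModel = OpenAIChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o")      // broad feature set
    .build();

// Vision tasks: gpt-4o accepts image input, so the same instance is reused
Model visionModel = powerfulModel;

// Routing decision
public Model selectModel(String taskType) {
    return switch(taskType) {
        case "chat" -> fastModel;         // simple conversation: fast model
        case "analysis" -> powerfulModel; // complex analysis: powerful model
        case "vision" -> visionModel;     // vision tasks: vision-capable model
        default -> fastModel;
    };
}

// Use in an Agent (taskType is supplied by the caller, e.g. a request classifier)
ReActAgent agent = ReActAgent.builder()
    .name("SmartAssistant")
    .model(selectModel(taskType))  // choose the model dynamically
    .build();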

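3.3 Streaming Responses and Reactive Programming

3.3.1 Why Use Streaming Responses?

The problem with the traditional non-streaming approach

User input → wait... → wait... → model finishes → full reply returned
                    (long perceived latency)

Advantages of streaming

User input → Token1 → Token2 → Token3 → ... → done
          (rendered as it arrives, so it feels faster)

Value in production:

✓ Better user experience (immediate feedback, no apparent stall)
✓ Reasoning can be displayed live (e.g. Claude's thinking process)
✓ Errors surface sooner (e.g. a failed tool call)
✓ Long outputs can be consumed before generation finishes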

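3.3.2 Using the Streaming API

// extractText, updateUI, handleError and cleanup are application-side helpers, not AgentScope APIs.

// Option 1: basic streaming
Flux<ChatResponse> responses = model.stream(
    List.of(userMessage),
    null,  // no tools
    null   // use default parameters
);

responses.subscribe(
    response -> System.out.print(extractText(response)),  // print each chunk as it arrives
    error -> System.err.println("Error: " + error),
    () -> System.out.println("\n[done]")
);

// Option 2: collect all chunks, then process
// Note: if the Flux is cold, every subscription issues a new request to the LLM,
// so use one of these options per call rather than all three on the same Flux.
String fullResponse = responses
    .map(response -> extractText(response))
    .collect(Collectors.joining())
    .block();  // waits for completion; avoid on non-blocking (event-loop) threads
System.out.println(fullResponse);

// Option 3: filter and transform while streaming
responses
    .filter(response -> response.getContent() != null && !response.getContent().isEmpty())
    .map(response -> extractText(response))
    .doOnNext(chunk -> updateUI(chunk))          // update the UI
    .onErrorResume(error -> handleError(error))  // handleError must return a fallback Publisher
    .doFinally(signal -> cleanup())
    .subscribe();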
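
Backpressure, listed earlier as a benefit of returning a `Flux`, means the consumer tells the producer how many items it is ready to receive. The JDK's `java.util.concurrent.Flow` interfaces follow the same Reactive Streams contract that Reactor implements, so the idea can be sketched without any dependency. The class names and chunk strings below are hypothetical, and this is a minimal sketch of the demand protocol rather than of AgentScope's internals.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class BackpressureSketch {
    /** Subscriber that pulls one chunk at a time instead of being flooded. */
    static class OneAtATime implements Flow.Subscriber<String> {
        final List<String> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);                 // signal demand for the first chunk only
        }
        @Override public void onNext(String chunk) {
            received.add(chunk);          // e.g. render the chunk in the UI
            subscription.request(1);      // ask for the next chunk once this one is handled
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    /** Publishes the chunks and returns what the subscriber received, in order. */
    static List<String> run(List<String> chunks) {
        OneAtATime subscriber = new OneAtATime();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (String c : chunks) publisher.submit(c);  // blocks when the buffer is full
        }                                                  // close() signals onComplete
        try {
            subscriber.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(String.join("", run(List.of("Hel", "lo", "!"))));
    }
}
```

The consumer never receives more than it asked for; a slow consumer therefore slows the producer down instead of overflowing an unbounded queue.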

3.3.3 Displaying the Reasoning Process in Real Time

// Some models (such as Claude) can return their thinking process

public void demonstrateReasoningStream() {
    ReActAgent agent = ReActAgent.builder()
        .name("ThinkingBot")
        .model(AnthropicChatModel.builder()
            .apiKey(System.getenv("ANTHROPIC_API_KEY"))
            .modelName("claude-opus")  // placeholder; use the ID of a model that supports thinking
            .build())
        .build();

    // Streaming call
    agent.stream(
        Msg.builder().textContent("Explain the basic principles of quantum computing")
        .build()
    ).subscribe(event -> {
        // Handle each event according to its type (pattern-matching switch, Java 21+)
        switch(event) {
            case ReasoningChunkEvent reasoning ->
                System.out.print("💭 " + reasoning.getDelta());  // show the thinking process
            case ActingChunkEvent acting ->
                System.out.print("🔧 " + acting.getDelta());     // show tool calls
            default ->
                System.out.print("📝 " + event.getDelta());      // show the final answer
        }
    });
}

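3.3.4 Real-Time Streaming over SSE (Server-Sent Events)

// Spring Boot controller example: real-time streaming output over SSE

@RestController
@RequestMapping("/api/agent")
public class AgentController {

    private final ReActAgent agent;  // injected; construction omitted

    public AgentController(ReActAgent agent) {
        this.agent = agent;
    }

    @PostMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public SseEmitter stream(@RequestBody String userInput) {
        SseEmitter emitter = new SseEmitter(60000L);  // 60-second timeout

        // subscribeOn moves the work off the request thread without
        // creating (and leaking) a new executor for every request
        Disposable subscription = agent.stream(
            Msg.builder().textContent(userInput).build()
        ).subscribeOn(Schedulers.boundedElastic())
         .subscribe(
            event -> {
                try {
                    // Push each chunk to the client as it arrives.
                    // send() takes the builder itself, so build() is not called.
                    emitter.send(SseEmitter.event()
                        .id(event.getMessageId())
                        .data(event.getDelta()));
                } catch (IOException e) {
                    emitter.completeWithError(e);
                }
            },
            error -> {
                try {
                    emitter.send(SseEmitter.event()
                        .name("error")
                        .data("Error: " + error.getMessage()));
                    emitter.complete();
                } catch (IOException e) {
                    emitter.completeWithError(e);
                }
            },
            () -> {
                try {
                    emitter.send(SseEmitter.event()
                        .name("done")
                        .data("done"));
                    emitter.complete();
                } catch (IOException e) {
                    emitter.completeWithError(e);
                }
            }
        );

        // Cancel the upstream call if the client disconnects or the emitter times out
        emitter.onCompletion(subscription::dispose);
        emitter.onTimeout(subscription::dispose);

        return emitter;
    }
}

// Front-end JavaScript
// EventSource can only issue GET requests without a body, so the POST endpoint
// is consumed with fetch() and a stream reader instead.
async function streamAgent(userInput) {
    const response = await fetch('/api/agent/stream', {
        method: 'POST',
        headers: {'Content-Type': 'text/plain'},
        body: userInput
    });
    const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
    let buffer = '';
    while (true) {
        const {value, done} = await reader.read();
        if (done) break;
        buffer += value;
        const frames = buffer.split('\n\n');  // a blank line ends each SSE event
        buffer = frames.pop();                 // keep the incomplete tail for the next read
        for (const frame of frames) {
            const lines = frame.split('\n');
            const name = lines.find(l => l.startsWith('event:'))?.slice(6).trim();
            const data = lines.filter(l => l.startsWith('data:'))
                .map(l => l.slice(5)).join('\n');
            if (name === 'done') return;
            if (name === 'error') { console.error(data); return; }
            // textContent avoids interpreting model output as HTML
            document.getElementById('response').textContent += data;
        }
    }
}

streamAgent('Hello');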
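
On the wire, each Server-Sent Event is a block of `field: value` lines terminated by a blank line. The optional `event` field names the event type, and a payload containing newlines is sent as repeated `data` lines. The sketch below formats a frame by hand to show what the client receives; `SseFrameSketch` and `frame` are hypothetical helpers, not Spring APIs.

```java
class SseFrameSketch {
    /** Builds one SSE frame; id and name may be null. */
    static String frame(String id, String name, String data) {
        StringBuilder sb = new StringBuilder();
        if (id != null) sb.append("id: ").append(id).append('\n');
        if (name != null) sb.append("event: ").append(name).append('\n');
        // A payload containing newlines must be split into one data line per line.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();  // the blank line terminates the event
    }

    public static void main(String[] args) {
        System.out.print(frame("42", null, "Hello"));
        System.out.print(frame(null, "done", "line one\nline two"));
    }
}
```

Naming control events with the `event` field, rather than overloading `id`, lets the client tell a token chunk from a completion or error signal without inspecting the payload.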

3.4 Formatter and Message Format Conversion

3.4.1 Why Is a Formatter Needed?

Each LLM provider's API uses a different request format:

Format expected by DashScope (native API):
{
  "model": "qwen-plus",
  "input": {"messages": [{"role": "user", "content": [...]}]},
  "parameters": {"temperature": 0.7}
}

Format expected by OpenAI:
{
  "model": "gpt-4",
  "messages": [{"role": "user", "content": [...]}],
  "temperature": 0.7
}

Format expected by Anthropic (max_tokens is required):
{
  "model": "claude-opus",
  "max_tokens": 1024,
  "messages": [{"role": "user", "content": [...]}],
  "temperature": 0.7
}

Format expected by Gemini:
{
  "generationConfig": {"temperature": 0.7},
  "contents": [{"role": "user", "parts": [...]}]
}

What the Formatter does:

AgentScope Msg format
    ↓ (converted by a Formatter)
    ├─ DashScopeFormatter → DashScope API format
    ├─ OpenAIFormatter → OpenAI API format
    ├─ AnthropicFormatter → Anthropic API format
    └─ GeminiFormatter → Gemini API format
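
As a rough illustration of what a formatter does, the sketch below turns one unified message list into an OpenAI-style and a Gemini-style request body using plain maps. The field names follow the JSON shapes shown above, and the `assistant` to `model` rename reflects Gemini's role naming. `UnifiedMsg` and `FormatterSketch` are hypothetical names; real formatters also handle tools, multimodal blocks, and system prompts.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FormatterSketch {
    // Hypothetical unified message; AgentScope's Msg carries richer content blocks.
    record UnifiedMsg(String role, String text) {}

    /** OpenAI-style body: sampling parameters sit at the top level. */
    static Map<String, Object> toOpenAi(String model, List<UnifiedMsg> msgs, double temperature) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("model", model);
        body.put("messages", msgs.stream()
            .map(m -> Map.of("role", m.role(), "content", m.text()))
            .toList());
        body.put("temperature", temperature);
        return body;
    }

    /** Gemini-style body: "contents"/"parts", and the assistant role is called "model". */
    static Map<String, Object> toGemini(List<UnifiedMsg> msgs, double temperature) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("generationConfig", Map.of("temperature", temperature));
        body.put("contents", msgs.stream()
            .map(m -> Map.of(
                "role", m.role().equals("assistant") ? "model" : m.role(),
                "parts", List.of(Map.of("text", m.text()))))
            .toList());
        return body;
    }

    public static void main(String[] args) {
        List<UnifiedMsg> history = List.of(
            new UnifiedMsg("user", "Hi"),
            new UnifiedMsg("assistant", "Hello!"));
        System.out.println(toOpenAi("gpt-4", history, 0.7));
        System.out.println(toGemini(history, 0.7));
    }
}
```

The agent only ever builds the `history` list; which of the two shapes goes over the wire is decided by the formatter attached to the model.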

3.4.2 Automatic Formatter Selection

// key holds the API key

// Option 1: use the default Formatter automatically
DashScopeChatModel model1 = DashScopeChatModel.builder()
    .apiKey(key)
    .modelName("qwen-plus")
    // DashScopeChatFormatter is used automatically
    .build();

// Option 2: specify the Formatter explicitly
DashScopeChatModel model2 = DashScopeChatModel.builder()
    .apiKey(key)
    .modelName("qwen-plus")
    .formatter(new DashScopeMultiAgentFormatter())  // for multi-agent scenarios
    .build();

// Option 3: custom Formatter
public class CustomFormatter implements Formatter {
    @Override
    public Object format(List<Msg> messages, List<ToolSchema> tools) {
        // Custom formatting logic goes here; customFormattedRequest stands for
        // the vendor-specific request object it produces
        return customFormattedRequest;
    }
}

DashScopeChatModel model3 = DashScopeChatModel.builder()
    .apiKey(key)
    .modelName("qwen-plus")
    .formatter(new CustomFormatter())
    .build();

3.4.3 How Formatters Handle Multimodal Content

// How is a message that contains an image converted?

Msg multimodalMsg = Msg.builder()
    .role(MsgRole.USER)
    .content(
        TextBlock.builder().text("Analyze this image").build(),
        ImageBlock.builder()
            .source(Base64Source.builder()
                .data(base64Image)
                .mediaType("image/png")
                .build())
            .build()
    )
    .build();

// After conversion by DashScopeFormatter:
/*
{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Analyze this image"},
      {"type": "image", "image": "base64_image_data"}
    ]
  }]
}
*/

// After conversion by OpenAIFormatter:
/*
{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "Analyze this image"},
      {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
    ]
  }]
}
*/
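
The `data:image/png;base64,...` value in the OpenAI-style output is a data URL: the media type, the `;base64,` marker, and the Base64-encoded bytes. A minimal sketch of building one with the JDK follows; `DataUrlSketch` and `toDataUrl` are hypothetical names, and two text bytes stand in for real image bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DataUrlSketch {
    /** Builds the data URL form used by OpenAI-style image_url blocks. */
    static String toDataUrl(byte[] bytes, String mediaType) {
        return "data:" + mediaType + ";base64," + Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Two text bytes stand in for real PNG bytes.
        byte[] fake = "hi".getBytes(StandardCharsets.UTF_8);
        System.out.println(toDataUrl(fake, "image/png"));  // prints data:image/png;base64,aGk=
    }
}
```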

3.5 Production Examples

3.5.1 Intelligent Routing Across Models

public class IntelligentModelRouter {
    
    private final DashScopeChatModel fastModel;
    private final OpenAIChatModel standardModel;
    private final AnthropicChatModel powerfulModel;
    
    public IntelligentModelRouter() {
        this.fastModel = DashScopeChatModel.builder()
            .apiKey(System.getenv("DASHSCOPE_API_KEY"))
            .modelName("qwen-turbo")
            .build();
        
        this.standardModel = OpenAIChatModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("gpt-4o")
            .build();
        
        this.powerfulModel = AnthropicChatModel.builder()
            .apiKey(System.getenv("ANTHROPIC_API_KEY"))
            .modelName("claude-opus")
            .build();
    }
    
    /**
     * Selects a model based on task complexity and the cost budget.
     */
    public Model selectModel(String userQuery, QueryContext context) {
        // Estimate query complexity
        int complexity = analyzeComplexity(userQuery);
        double costBudget = context.getCostBudget();

        // Simple query with a tight budget → fast model
        if (complexity < 3 && costBudget < 0.01) {
            return fastModel;
        }

        // Low to medium complexity → standard model
        if (complexity < 6) {
            return standardModel;
        }

        // Complex task with a generous budget → powerful model
        if (costBudget > 0.1) {
            return powerfulModel;
        }

        // Otherwise fall back to the standard model
        return standardModel;
    }

    private int analyzeComplexity(String query) {
        // Keyword-based complexity estimate
        int complexity = 0;
        if (query.contains("analyze")) complexity += 2;
        if (query.contains("compare")) complexity += 2;
        if (query.contains("reason")) complexity += 3;
        if (query.contains("code")) complexity += 2;
        return complexity;
    }
}

// Usage example
public void processQuery(String userQuery) {
    IntelligentModelRouter router = new IntelligentModelRouter();
    QueryContext context = new QueryContext()
        .setCostBudget(0.05);  // cost budget of 0.05 per query
    
    Model selectedModel = router.selectModel(userQuery, context);
    
    ReActAgent agent = ReActAgent.builder()
        .name("SmartAssistant")
        .model(selectedModel)
        .build();
    
    Msg response = agent.call(
        Msg.builder().textContent(userQuery).build()
    ).block();
    
    System.out.println("Response: " + response.getTextContent());
    System.out.println("Model: " + selectedModel.getModelName());
}

3.5.2 Model Calls with Timeout and Retry

public void robustModelCall() {
    // Configure timeout and retry
    ExecutionConfig execConfig = ExecutionConfig.builder()
        .timeout(Duration.ofMinutes(2))        // 2-minute timeout
        .maxAttempts(3)                        // at most 3 attempts
        .initialBackoff(Duration.ofSeconds(1)) // 1-second delay before the first retry
        .maxBackoff(Duration.ofSeconds(10))    // delay capped at 10 seconds
        .backoffMultiplier(2.0)                // delay doubles each time
        .build();

    GenerateOptions options = GenerateOptions.builder()
        .executionConfig(execConfig)
        .temperature(0.7)
        .maxTokens(1000)
        .build();

    ReActAgent agent = ReActAgent.builder()
        .name("RobustAgent")
        .model(DashScopeChatModel.builder()
            .apiKey(System.getenv("DASHSCOPE_API_KEY"))
            .modelName("qwen-plus")
            .build())
        .maxIters(5)  // at most 5 reasoning rounds per call
        .build();

    try {
        Msg response = agent.call(
            Msg.builder().textContent("Please explain quantum computing").build(),
            options  // apply the timeout and retry configuration
        ).timeout(Duration.ofMinutes(3))  // overall cap of 3 minutes; it cuts retries short if several 2-minute attempts are needed
         .onErrorMap(TimeoutException.class,
                     e -> new RuntimeException("Model response timed out, please retry"))
         .block();

        System.out.println("Success: " + response.getTextContent());
    } catch (Exception e) {
        System.err.println("Failed after retries: " + e.getMessage());
        // Degrade gracefully: serve a cached or preset answer
    }
}
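
The backoff settings above describe an exponential schedule: each retry waits `multiplier` times longer than the previous one, up to a cap. The sketch below computes the delays that such a configuration implies. It assumes that `maxAttempts` counts the first call, so three attempts mean two retries; whether AgentScope counts it that way is not confirmed here. `BackoffSketch` is a hypothetical helper.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

class BackoffSketch {
    /**
     * Delays before each retry. Assumption: maxAttempts includes the first call,
     * so the schedule has maxAttempts - 1 entries.
     */
    static List<Duration> schedule(int maxAttempts, Duration initial, Duration max, double multiplier) {
        List<Duration> delays = new ArrayList<>();
        double millis = initial.toMillis();
        for (int retry = 1; retry < maxAttempts; retry++) {
            delays.add(Duration.ofMillis((long) Math.min(millis, max.toMillis())));  // cap at max
            millis *= multiplier;                                                     // grow for the next retry
        }
        return delays;
    }

    public static void main(String[] args) {
        // Values from the configuration above: 1s initial, 10s cap, multiplier 2.0
        System.out.println(schedule(3, Duration.ofSeconds(1), Duration.ofSeconds(10), 2.0));
        System.out.println(schedule(6, Duration.ofSeconds(1), Duration.ofSeconds(10), 2.0));
    }
}
```

Under that assumption, three attempts wait 1 s and then 2 s between calls, and a six-attempt schedule reaches the 10 s cap on its last retry (1, 2, 4, 8, 10 s). If the 2-minute timeout applies to each attempt, three full attempts can take just over six minutes, so an outer 3-minute timeout ends the call before the retries are exhausted.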

3.5.3 Token Usage Monitoring and Cost Calculation

public class ModelCostTracker {

    private final Map<String, Double> tokenPrices = Map.ofEntries(
        // Price per 1K tokens. Illustrative figures only: check each vendor's
        // current price list and keep all entries in the same currency.
        Map.entry("qwen-turbo-input", 0.0001),
        Map.entry("qwen-turbo-output", 0.0002),
        Map.entry("gpt-4o-input", 0.015),
        Map.entry("gpt-4o-output", 0.06),
        Map.entry("claude-opus-input", 0.015),
        Map.entry("claude-opus-output", 0.075)
    );
    
    public double calculateCost(String modelName, ChatUsage usage) {
        String inputKey = modelName + "-input";
        String outputKey = modelName + "-output";
        
        double inputCost = (usage.getInputTokens() * 
                           tokenPrices.getOrDefault(inputKey, 0.0)) / 1000;
        double outputCost = (usage.getOutputTokens() * 
                            tokenPrices.getOrDefault(outputKey, 0.0)) / 1000;
        
        return inputCost + outputCost;
    }
    
    public void trackAndLog(Msg response, String modelName) {
        ChatUsage usage = response.getChatUsage();
        if (usage == null || usage.getTotalTokens() == 0) return;  // nothing to report; also avoids dividing by zero

        double cost = calculateCost(modelName, usage);
        double costPer1K = (cost / usage.getTotalTokens()) * 1000;  // blended price per 1K tokens

        System.out.printf(
            "Model: %s | Tokens (in/out): %d/%d | Cost: ¥%.4f (¥%.4f/1K)%n",
            modelName,
            usage.getInputTokens(),
            usage.getOutputTokens(),
            cost,
            costPer1K
        );

        // Persist to a database for cost analysis
        saveCostMetrics(modelName, usage, cost);
    }

    private void saveCostMetrics(String modelName, ChatUsage usage, double cost) {
        // Store in a database or monitoring system
    }
}
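
A worked example of the per-1K pricing arithmetic may help: take a call that uses 1,200 input tokens and 300 output tokens at the illustrative qwen-turbo prices from the table above (0.0001 and 0.0002 per 1K tokens). `CostArithmeticSketch` is a hypothetical, self-contained helper that repeats the same formula without the AgentScope types.

```java
class CostArithmeticSketch {
    /** Cost of one call, given token counts and prices per 1K tokens. */
    static double cost(int inputTokens, int outputTokens, double inPricePer1K, double outPricePer1K) {
        return inputTokens / 1000.0 * inPricePer1K + outputTokens / 1000.0 * outPricePer1K;
    }

    public static void main(String[] args) {
        // 1.2 * 0.0001 + 0.3 * 0.0002 = 0.00012 + 0.00006 = 0.00018
        System.out.printf("%.5f%n", cost(1200, 300, 0.0001, 0.0002));
    }
}
```

At these prices a single call costs 0.00018 currency units, so cost differences between models only become visible when aggregated over many calls, which is why the tracker persists per-call metrics.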

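Summary

Key concepts

Concept	Description
Model interface	A unified LLM interface that hides vendor differences
Multi-model support	Supports DashScope, OpenAI, Anthropic, Gemini, and others
Streaming responses	Pushes tokens as they are generated, improving user experience
Formatter	Converts messages automatically into each LLM's format
GenerateOptions	Flexible control over generation parameters

Implementation points

1. Applying the adapter pattern
   ├─ Model interface + multiple implementations + Formatter
   └─ Adding a model requires no changes to existing code

2. Streaming
   ├─ Flux<ChatResponse> supports backpressure
   └─ Improves user experience

3. Balancing cost and performance
   ├─ Choose the model according to task complexity
   └─ Monitor token usage and cost

4. Designing for reliability
   ├─ Timeouts prevent waiting indefinitely
   └─ Automatic retries absorb transient failures

End of chapter

The next chapter looks at the Agent interface and the AgentBase base class, and shows how to implement a custom Agent.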