Spring AI Study Notes (2): Models

I. Commercial API models (comparable to GPT-3.5-turbo)

1. OpenAI series


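# Example configuration
spring:
  ai:
    openai:
      chat:
        options:
          model: gpt-3.5-turbo        # best value for money
          # model: gpt-3.5-turbo-1106  # later snapshot, 16K context
          # model: gpt-4               # stronger, more expensive
          # model: gpt-4-turbo         # 128K context, balances capability and cost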

2. Anthropic Claude series

spring:
  ai:
    anthropic:
      chat:
        options:
          model: claude-3-haiku-20240307      # cheapest, fastest responses
          # model: claude-3-sonnet-20240229   # balanced, roughly GPT-3.5 class
          # model: claude-3-opus-20240229     # most capable, comparable to GPT-4

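Feature comparison (prices are per 1M input tokens; output tokens cost more):

Haiku: fastest responses, lowest cost ($0.25/1M input tokens)
Sonnet: balanced, good value ($3/1M input tokens)
Opus: strongest reasoning ($15/1M input tokens)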

3. Google Gemini series


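# Property prefix as used by Spring AI's Vertex AI Gemini starter
spring:
  ai:
    vertex:
      ai:
        gemini:
          chat:
            options:
              model: gemini-1.5-pro    # newer, 128K standard context
              # model: gemini-1.0-pro   # stable release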


4. Other commercial APIs


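# NOTE: the property prefixes below are schematic. The real prefix depends on
# the starter (or OpenAI-compatible endpoint) you use for each vendor.

# Tencent Hunyuan
spring:
  ai:
    tencent:
      chat:
        model: hunyuan-standard

# Alibaba Tongyi Qianwen (Qwen)
spring:
  ai:
    aliyun:
      chat:
        model: qwen-plus

# Baidu ERNIE Bot (Wenxin Yiyan)
spring:
  ai:
    baidu:
      chat:
        model: ernie-3.5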

II. Open-source / locally deployed models

1. Running with Ollama (easiest)


# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull models
ollama pull llama2      # 7B parameters
ollama pull mistral     # 7B, strong performer
ollama pull codellama   # code-focused
ollama pull qwen:7b     # Tongyi Qianwen (Qwen)


# Spring AI configuration
spring:
  ai:
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: llama2    # or mistral, qwen:7b, etc.
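
Before wiring Spring AI to the local server, it can help to confirm that Ollama answers on the configured base URL. The sketch below is a hypothetical helper (the class name `OllamaCheck` is not part of Spring AI or Ollama) that posts a prompt to Ollama's `/api/generate` endpoint using only the JDK HTTP client. It assumes the server runs at the `base-url` shown above and that the model has already been pulled.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for a manual smoke test of a local Ollama server.
class OllamaCheck {

    // Builds the JSON body for a non-streaming /api/generate request.
    static String requestBody(String model, String prompt) {
        return "{\"model\":\"" + escape(model) + "\",\"prompt\":\"" + escape(prompt)
                + "\",\"stream\":false}";
    }

    // Minimal JSON string escaping: backslash, double quote, newline.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Sends the prompt and returns the raw JSON response body.
    static String generate(String baseUrl, String model, String prompt) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody(model, prompt)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Ollama returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

Calling `OllamaCheck.generate("http://localhost:11434", "llama2", "Hello")` should return a JSON document whose `response` field holds the generated text. If the call fails, the Spring AI configuration above will most likely fail for the same reason.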


2. Llama series (open-sourced by Meta)


# Llama 2 (license request required)
- llama2-7b
- llama2-13b
- llama2-70b

# Llama 3 (newer, stronger)
- llama3-8b
- llama3-70b

# Code Llama (code-focused)
- codellama-7b
- codellama-13b
- codellama-34b


3. Mistral AI series


# Mistral 7B (close to GPT-3.5 in quality)
- mistral-7b

# Mixtral 8x7B (MoE architecture, good value)
- mixtral-8x7b

# Mistral Small (commercial offering, strong performance)


4. Open-source models from China


# Tongyi Qianwen (Qwen)
- Qwen-7B
- Qwen-14B
- Qwen-72B

# Zhipu AI
- ChatGLM3-6B
- ChatGLM3-6B-128K (long context)

# Baichuan
- Baichuan2-7B
- Baichuan2-13B

# 01.AI
- Yi-6B
- Yi-34B


5. Code-focused models


# Code generation
- starcoder
- starcoder2
- deepseek-coder

# Code explanation
- wizardcoder
- phind-codellama

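III. Performance comparison table

Model	Parameters	Context	Approx. GPT level	Best for
GPT-3.5-turbo	Undisclosed	16K	Baseline	General chat, API integration
Claude Haiku	Undisclosed	200K	3.5	Fast responses, long documents
Gemini Pro	Undisclosed	128K	3.5	Multimodal, Google ecosystem
Llama 3 8B	8B	8K	3.0	Local deployment, lightweight
Mistral 7B	7B	8K	3.0	Balanced performance, strong reasoning
Qwen 7B	7B	8K	3.0	Chinese-optimized, code
Mixtral 8x7B	47B	32K	3.5	Specialized tasks, high quality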
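
The context column is often the first filter when choosing a model: a prompt plus the tokens reserved for the answer must fit in the window. The following is a minimal sketch of that check, assuming the context sizes from the table above with K taken as 1,000 tokens. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative lookup of context windows, using the figures from the table.
class ContextWindows {

    // Context size in tokens per model, in table order.
    static final Map<String, Integer> CONTEXT = new LinkedHashMap<>();
    static {
        CONTEXT.put("GPT-3.5-turbo", 16_000);
        CONTEXT.put("Claude Haiku", 200_000);
        CONTEXT.put("Gemini Pro", 128_000);
        CONTEXT.put("Llama 3 8B", 8_000);
        CONTEXT.put("Mistral 7B", 8_000);
        CONTEXT.put("Qwen 7B", 8_000);
        CONTEXT.put("Mixtral 8x7B", 32_000);
    }

    // Returns the models whose window holds the prompt plus the reserved output.
    static List<String> modelsThatFit(int promptTokens, int reservedOutputTokens) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : CONTEXT.entrySet()) {
            if (promptTokens + reservedOutputTokens <= entry.getValue()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

For a 30,000-token prompt with 1,000 tokens reserved for the answer, `modelsThatFit(30_000, 1_000)` leaves Claude Haiku, Gemini Pro, and Mixtral 8x7B.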

IV. Spring AI configuration examples

1. Switching between multiple models


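import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;

// Selects the primary ChatModel from the custom "ai.provider" property.
// Assumes all three Spring AI starters are on the classpath, so the
// provider-specific beans injected below are already auto-configured.
@Configuration
public class AiModelConfig {

    @Bean
    @Primary
    @ConditionalOnProperty(name = "ai.provider", havingValue = "openai")
    public ChatModel primaryOpenAiChatModel(OpenAiChatModel openAiChatModel) {
        return openAiChatModel;
    }

    @Bean
    @Primary
    @ConditionalOnProperty(name = "ai.provider", havingValue = "anthropic")
    public ChatModel primaryAnthropicChatModel(AnthropicChatModel anthropicChatModel) {
        return anthropicChatModel;
    }

    @Bean
    @Primary
    @ConditionalOnProperty(name = "ai.provider", havingValue = "ollama")
    public ChatModel primaryOllamaChatModel(OllamaChatModel ollamaChatModel) {
        return ollamaChatModel;
    }
}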


2. Model factory pattern


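import java.util.Map;

import org.springframework.ai.chat.model.ChatModel;
import org.springframework.stereotype.Service;

@Service
public class ModelFactory {

    // Spring injects every ChatModel bean, keyed by bean name, so the keys
    // returned by routeModel must match the bean names you register.
    private final Map<String, ChatModel> models;

    public ModelFactory(Map<String, ChatModel> models) {
        this.models = models;
    }

    // Returns the requested model, or the bean named "default" if it is missing.
    public ChatModel getModel(String modelType) {
        return models.getOrDefault(modelType, models.get("default"));
    }

    // Routing logic: picks a model key from simple features of the query.
    public String routeModel(String query) {
        // "代码" is Chinese for "code"
        if (query.contains("代码") || query.contains("programming")) {
            return "codellama";
        } else if (query.length() > 1000) {
            return "claude-haiku"; // use Claude for long text
        } else {
            return "gpt-3.5-turbo";
        }
    }
}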
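
The factory only pays off once routing and lookup are combined into a single call. The sketch below shows that composition without any Spring dependency: `Function<String, String>` stands in for `ChatModel` (an assumption made purely so the example runs on a plain JDK), and `RoutingDemo` and `ask` are hypothetical names.

```java
import java.util.Map;
import java.util.function.Function;

// Stand-alone illustration of "route, then look up, then call".
class RoutingDemo {

    // Same rules as routeModel above.
    static String routeModel(String query) {
        if (query.contains("代码") || query.contains("programming")) {
            return "codellama";
        } else if (query.length() > 1000) {
            return "claude-haiku";
        } else {
            return "gpt-3.5-turbo";
        }
    }

    // Routes the query, falls back to the "default" entry, and calls the model.
    static String ask(Map<String, Function<String, String>> models, String query) {
        Function<String, String> model =
                models.getOrDefault(routeModel(query), models.get("default"));
        if (model == null) {
            throw new IllegalStateException("No model registered for this query and no default");
        }
        return model.apply(query);
    }
}
```

With only `codellama` and `default` registered, a query containing "programming" goes to `codellama`, while a short greeting routes to `gpt-3.5-turbo`, finds no such entry, and lands on `default`. The explicit null check matters because `getOrDefault` returns null silently when the default entry is missing too.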


3. Multi-model fallback strategy


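import java.util.List;

import org.springframework.ai.chat.model.ChatModel;
import org.springframework.stereotype.Component;

@Component
public class ChatService {

    // All ChatModel beans, tried in list order (use @Order to set priority).
    private final List<ChatModel> chatModels;

    public ChatService(List<ChatModel> chatModels) {
        this.chatModels = chatModels;
    }

    public String chatWithFallback(String message) {
        Exception lastError = null;
        for (ChatModel model : chatModels) {
            try {
                return model.call(message);
            } catch (Exception e) {
                // Log the failure here, then fall through to the next model.
                lastError = e;
            }
        }
        // Keep the last failure as the cause so it is not lost.
        throw new RuntimeException("All models failed", lastError);
    }
}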
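
The fallback loop is easy to exercise without calling any real provider. The sketch below repeats the same idea with `Function<String, String>` standing in for `ChatModel` (an assumption for the sake of a dependency-free example) and records every failure as a suppressed exception, a small variation that keeps all causes rather than only the last one. `FallbackDemo` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Dependency-free version of the fallback loop, suitable for a unit test.
class FallbackDemo {

    // Tries each model in order and returns the first successful answer.
    static String callWithFallback(List<Function<String, String>> models, String message) {
        List<RuntimeException> failures = new ArrayList<>();
        for (Function<String, String> model : models) {
            try {
                return model.apply(message);
            } catch (RuntimeException e) {
                failures.add(e); // remember the failure, try the next model
            }
        }
        IllegalStateException allFailed = new IllegalStateException("All models failed");
        for (RuntimeException failure : failures) {
            allFailed.addSuppressed(failure);
        }
        throw allFailed;
    }
}
```

Passing a list whose first entry always throws and whose second entry answers normally returns the second entry's answer. If every entry throws, the resulting exception carries one suppressed exception per model, which makes the log far easier to read than a bare "all models failed".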


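V. Recommendations

By scenario:

Production API calls:
- Preferred: GPT-3.5-turbo (best stability)
- Alternatives: Claude Haiku (low cost), Gemini Pro
Local / private deployment:
- Limited resources: Llama 3 8B, Qwen 7B
- High performance requirements: Mixtral 8x7B, Qwen 72B
- Chinese-language workloads: Qwen, ChatGLM, Yi
Specific domains:
- Code generation: CodeLlama, StarCoder
- Long-document processing: Claude (200K context)
- Multilingual: Llama, Mistral

Cost considerations: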


import java.util.Map;

// Rough cost estimate. All prices are USD per 1K tokens.
public class CostCalculator {

    private static final Map<String, Double> INPUT_PRICE = Map.of(
        "gpt-3.5-turbo", 0.0015,
        "claude-haiku", 0.00025,
        "llama3-8b", 0.0  // local deployment, no API cost
    );

    private static final Map<String, Double> OUTPUT_PRICE = Map.of(
        "gpt-3.5-turbo", 0.002,
        "claude-haiku", 0.00125,
        "llama3-8b", 0.0
    );

    // Unknown models are priced at 0.0, so verify the model name before
    // trusting a zero result.
    public double calculateCost(String model, int inputTokens, int outputTokens) {
        return (inputTokens / 1000.0) * INPUT_PRICE.getOrDefault(model, 0.0)
             + (outputTokens / 1000.0) * OUTPUT_PRICE.getOrDefault(model, 0.0);
    }
}
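
Per-request costs look negligible until they are multiplied by traffic. As a worked example, the sketch below scales the same per-1K prices to a monthly volume. The workload figures (100,000 requests per month, 500 input and 200 output tokens each) are invented for illustration, and `CostExample` is a hypothetical class; unlike the calculator above, it rejects unknown model names instead of pricing them at zero.

```java
import java.util.Map;

// Worked example: monthly cost from per-request token counts.
class CostExample {

    // USD per 1K tokens as {input, output}, same figures as the calculator above.
    static final Map<String, double[]> PRICE_PER_1K = Map.of(
        "gpt-3.5-turbo", new double[] {0.0015, 0.002},
        "claude-haiku", new double[] {0.00025, 0.00125}
    );

    static double monthlyCost(String model, long requestsPerMonth,
                              int inputTokens, int outputTokens) {
        double[] price = PRICE_PER_1K.get(model);
        if (price == null) {
            throw new IllegalArgumentException("Unknown model: " + model);
        }
        double perRequest = inputTokens / 1000.0 * price[0]
                          + outputTokens / 1000.0 * price[1];
        return perRequest * requestsPerMonth;
    }
}
```

Under these assumptions, GPT-3.5-turbo costs 0.5 × 0.0015 + 0.2 × 0.002 = $0.00115 per request, or about $115 per month, while Claude Haiku costs 0.5 × 0.00025 + 0.2 × 0.00125 = $0.000375 per request, or about $37.50 per month.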


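VI. Recent trends

Smaller models: 7B-13B parameter models approach GPT-3.5 quality
Long context: 128K-200K is becoming the new standard
MoE architecture: models such as Mixtral balance cost and quality
Multimodality: text, image, and audio in a single model
Edge deployment: small models running on phones and in browsers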
