[Spring AI in Practice] Chapter 2: Basic LLM Calls: Synchronous, Asynchronous, and Streaming Output

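1. Synchronous Q&A: A Basic Text Chat Endpoint

This section walks through synchronous question answering with Spring AI, focusing on a basic text chat endpoint.

Project dependencies

First, add the dependencies to pom.xml (or the Gradle equivalents to build.gradle):

XML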
<!-- Spring AI core dependency -->
<!-- The ChatClient.call(...) API used in this chapter is the 0.8.x one; from 1.0.0-M1 on, the interface is named ChatModel -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-core</artifactId>
    <version>0.8.1</version>
</dependency>

<!-- Pick one AI provider, for example OpenAI -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>0.8.1</version>
</dependency>

<!-- Or use Azure OpenAI -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-azure-openai-spring-boot-starter</artifactId>
    <version>0.8.1</version>
</dependency>

Configuration file

Configure the client in application.yml (or with the equivalent keys in application.properties):

YAML
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
          max-tokens: 1000

Basic synchronous Q&A endpoint

3.1 Create the ChatController
java
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.web.bind.annotation.*;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

@RestController
@RequestMapping("/api/chat")
public class ChatController {
    
    private final ChatClient chatClient;
    
    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    
    /**
     * Basic synchronous Q&A: plain text in, plain text out
     */
    @PostMapping("/simple")
    public String simpleChat(@RequestBody Map<String, String> request) {
        String message = request.get("message");
        return chatClient.call(message);
    }
    
    /**
     * Synchronous Q&A with per-request model options
     */
    @PostMapping("/with-params")
    public ChatResponse chatWithParams(@RequestBody ChatRequest request) {
        Prompt prompt = new Prompt(
            List.<Message>of(new UserMessage(request.getMessage())),
            request.getOptions()
        );
        
        return chatClient.call(prompt);
    }
    
    /**
     * Detailed response (blocking, not streamed): returns the whole
     * ChatResponse, including metadata, rather than only the text
     */
    @PostMapping("/detailed")
    public ChatResponse detailedChat(@RequestBody ChatRequest request) {
        Prompt prompt = new Prompt(
            List.<Message>of(new UserMessage(request.getMessage())),
            request.getOptions()
        );
        
        return chatClient.call(prompt);
    }
    
    /**
     * Q&A with conversation history supplied by the caller
     */
    @PostMapping("/with-history")
    public String chatWithHistory(@RequestBody ChatHistoryRequest request) {
        List<Message> messages = request.getMessages().stream()
            .map(msg -> {
                // Any role other than "user" is treated as an assistant turn
                if ("user".equals(msg.getRole())) {
                    return (Message) new UserMessage(msg.getContent());
                } else {
                    return (Message) new AssistantMessage(msg.getContent());
                }
            })
            .collect(Collectors.toList());
            
        Prompt prompt = new Prompt(messages, request.getOptions());
        ChatResponse response = chatClient.call(prompt);
        
        return response.getResult().getOutput().getContent();
    }
}
3.2 Request and response objects
java
import org.springframework.ai.openai.OpenAiChatOptions;
import java.util.List;

// Each public class below belongs in its own .java file.

// Basic request object
public class ChatRequest {
    private String message;
    // A concrete options type, so that Jackson can bind the JSON body to it.
    // Its JSON property names are snake_case (for example max_tokens).
    private OpenAiChatOptions options;
    
    // getters and setters
}

// Request object carrying conversation history
public class ChatHistoryRequest {
    private List<ChatMessage> messages;
    private OpenAiChatOptions options;
    
    // getters and setters
    
    public static class ChatMessage {
        private String role; // "user" or "assistant"
        private String content;
        
        // getters and setters
    }
}

// Response object
public class ChatResponseDTO {
    private String content;
    private String model;
    private Long tokens;
    private Long completionTokens;
    
    // getters and setters
}

Service layer

java
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.List;

@Service
public class ChatService {
    
    private final ChatClient chatClient;
    
    public ChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    
    /**
     * Simple Q&A
     */
    public String simpleChat(String message) {
        return chatClient.call(message);
    }
    
    /**
     * Q&A with a system prompt
     */
    public String chatWithSystemPrompt(String systemPrompt, String userMessage) {
        List<Message> messages = new ArrayList<>();
        messages.add(new SystemMessage(systemPrompt));
        messages.add(new UserMessage(userMessage));
        
        Prompt prompt = new Prompt(messages);
        ChatResponse response = chatClient.call(prompt);
        
        return response.getResult().getOutput().getContent();
    }
    
    /**
     * Q&A with explicit model options
     */
    public ChatResponse chatWithOptions(String message, 
                                        String model, 
                                        Float temperature, 
                                        Integer maxTokens) {
        // The 0.8.x option builders take Float values; later milestones moved to Double
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel(model)
            .withTemperature(temperature)
            .withMaxTokens(maxTokens)
            .build();
        
        Prompt prompt = new Prompt(List.<Message>of(new UserMessage(message)), options);
        
        return chatClient.call(prompt);
    }
    
    /**
     * Multi-turn conversation. ChatTurn is an application-defined DTO exposing
     * getUserMessage() and getAssistantMessage(); it is not part of Spring AI.
     */
    public String multiTurnChat(List<ChatTurn> conversationHistory, String newMessage) {
        List<Message> messages = new ArrayList<>();
        
        // Replay the earlier turns in order
        for (ChatTurn turn : conversationHistory) {
            messages.add(new UserMessage(turn.getUserMessage()));
            messages.add(new AssistantMessage(turn.getAssistantMessage()));
        }
        
        // Append the new user message
        messages.add(new UserMessage(newMessage));
        
        Prompt prompt = new Prompt(messages);
        ChatResponse response = chatClient.call(prompt);
        
        return response.getResult().getOutput().getContent();
    }
}

Advanced configuration and customization

5.1 Custom ChatClient configuration
java
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.openai.OpenAiChatClient;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AiConfig {
    
    @Bean
    public ChatClient chatClient(@Value("${spring.ai.openai.api-key}") String apiKey) {
        // Read the key from configuration instead of hardcoding it in source
        OpenAiApi openAiApi = new OpenAiApi(apiKey);
        
        // The 0.8.x option builders take Float values, hence the f suffix
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel("gpt-4")
            .withTemperature(0.8f)
            .withMaxTokens(2000)
            .withTopP(0.9f)
            .withFrequencyPenalty(0.0f)
            .withPresencePenalty(0.0f)
            .build();
            
        return new OpenAiChatClient(openAiApi, options);
    }
}
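5.2 Exception handling
java
import org.springframework.ai.chat.ChatClient;
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// @Retryable needs the spring-retry dependency and @EnableRetry on a configuration class
@Service
public class RobustChatService {
    
    private final ChatClient chatClient;
    
    public RobustChatService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }
    
    @Retryable(
        value = { Exception.class },
        maxAttempts = 3,
        backoff = @Backoff(delay = 1000, multiplier = 2)
    )
    public String chatWithRetry(String message) {
        try {
            return chatClient.call(message);
        } catch (Exception e) {
            // Log here, then rethrow so that @Retryable sees the failure
            throw new ChatServiceException("Chat service call failed", e);
        }
    }
    
    /**
     * Chat with a timeout. orTimeout only stops waiting for the result;
     * it does not cancel the HTTP call that is already in flight.
     */
    public String chatWithTimeout(String message, long timeoutMillis) {
        try {
            return CompletableFuture.supplyAsync(() -> chatClient.call(message))
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .join();
        } catch (CompletionException e) {
            // Only a real timeout gets the timeout message; other failures are rethrown
            if (e.getCause() instanceof TimeoutException) {
                return "The request timed out, please try again later";
            }
            throw new ChatServiceException("Chat service call failed", e.getCause());
        }
    }
    
    // Application-defined exception, declared here so that the example compiles
    public static class ChatServiceException extends RuntimeException {
        public ChatServiceException(String message, Throwable cause) {
            super(message, cause);
        }
    }
}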

Test example

java
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import static org.assertj.core.api.Assertions.assertThat;

@SpringBootTest
class ChatServiceTest {
    
    @Autowired
    private ChatService chatService;
    
    @Test
    void testSimpleChat() {
        String response = chatService.simpleChat("你好,请介绍一下你自己");
        
        assertThat(response).isNotNull();
        assertThat(response).isNotBlank();
        System.out.println("AI回复: " + response);
    }
    
    @Test
    void testChatWithSystemPrompt() {
        String systemPrompt = "你是一个专业的编程助手,请用简洁的语言回答";
        String userMessage = "如何学习Spring Boot?";
        
        String response = chatService.chatWithSystemPrompt(systemPrompt, userMessage);
        
        assertThat(response).contains("Spring");
        System.out.println("AI回复: " + response);
    }
}

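Usage example (HTTP calls)

Bash
# Simple Q&A
curl -X POST http://localhost:8080/api/chat/simple \
  -H "Content-Type: application/json" \
  -d '{"message": "What is a microservice?"}'

# Q&A with parameters. Option names must match the JSON properties of the options
# class bound in ChatRequest (assumed here to be OpenAiChatOptions, hence max_tokens)
curl -X POST http://localhost:8080/api/chat/with-params \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explain the design principles of RESTful APIs",
    "options": {
      "model": "gpt-4",
      "temperature": 0.5,
      "max_tokens": 500
    }
  }'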

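Best practices

  1. Error handling: always handle the exceptions an API call can throw
  2. Timeouts: set sensible timeouts
  3. Rate limiting: avoid bursts of calls that run into the provider's API limits
  4. Logging: log requests and responses for debugging and monitoring
  5. Input validation: validate incoming parameters
  6. Sensitive data: never leak sensitive information in responses
  7. Performance monitoring: track the latency and success rate of API calls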
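The rate-limiting item can be illustrated with a small fixed-window limiter. This is only a sketch under assumptions: `FixedWindowRateLimiter`, its constructor parameters, and the idea of keying by a client id are hypothetical and not part of Spring AI; a production setup would more likely rely on a gateway or a dedicated library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical helper: allows at most maxRequests per client in each fixed time window. */
class FixedWindowRateLimiter {

    /** Start time of the current window and the number of calls seen in it. */
    private record Window(long startMillis, int count) {}

    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clock; // injectable, so the limiter can be tested without sleeping
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    FixedWindowRateLimiter(int maxRequests, long windowMillis, LongSupplier clock) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if the call may proceed, false if the client has used up its quota. */
    boolean tryAcquire(String clientId) {
        long now = clock.getAsLong();
        // compute() runs atomically per key, so concurrent calls cannot lose an increment
        Window updated = windows.compute(clientId, (id, w) -> {
            if (w == null || now - w.startMillis() >= windowMillis) {
                return new Window(now, 1); // first call of a new window
            }
            return new Window(w.startMillis(), w.count() + 1);
        });
        return updated.count() <= maxRequests;
    }
}
```

A controller could call `tryAcquire(clientId)` before `chatClient.call(...)` and answer with HTTP 429 when it returns false; passing `System::currentTimeMillis` as the clock gives wall-clock windows.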

This basic text chat endpoint is a complete synchronous implementation that can be extended and tuned to fit specific needs.

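2. Asynchronous Calls: Avoiding Endpoint Timeouts and Blocking

Calling the model asynchronously is an effective way to deal with endpoint timeouts and blocked request threads. The approaches below cover the common cases.

Core asynchronous approach

1.1 Using the @Async annotation
java
@Service
public class AIService {
    
    private final ChatClient aiClient;
    
    public AIService(ChatClient aiClient) {
        this.aiClient = aiClient;
    }
    
    // Runs on the "aiTaskExecutor" pool; the caller gets the future back immediately
    @Async("aiTaskExecutor")
    public CompletableFuture<String> generateContentAsync(String prompt) {
        try {
            String result = aiClient.call(prompt);
            return CompletableFuture.completedFuture(result);
        } catch (Exception e) {
            return CompletableFuture.failedFuture(e);
        }
    }
}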
1.2 Configuring the thread pool
java
@Configuration
@EnableAsync
public class AsyncConfig {
    
    @Bean("aiTaskExecutor")
    public TaskExecutor aiTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(50);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("ai-async-");
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}

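Reactive approach

2.1 Using WebFlux + Reactor
java
@RestController
@RequestMapping("/api/ai")
public class AIAsyncController {
    
    private final ChatClient aiClient;
    // In Spring AI 0.8.x, streaming is exposed by StreamingChatClient
    private final StreamingChatClient streamingClient;
    
    public AIAsyncController(ChatClient aiClient, StreamingChatClient streamingClient) {
        this.aiClient = aiClient;
        this.streamingClient = streamingClient;
    }
    
    @GetMapping("/stream")
    public Flux<String> streamResponse(@RequestParam String prompt) {
        // Return the stream directly: wrapping it in Flux.create with a nested
        // subscribe() would lose cancellation and backpressure
        return streamingClient.stream(new Prompt(prompt))
            .mapNotNull(response -> response.getResult().getOutput().getContent());
    }
    
    @GetMapping("/async")
    public Mono<ResponseEntity<String>> asyncGenerate(@RequestParam String prompt) {
        return Mono.fromCallable(() -> aiClient.call(prompt))
            // call() blocks, so keep it off the event-loop threads
            .subscribeOn(Schedulers.boundedElastic())
            .timeout(Duration.ofSeconds(30))
            .map(ResponseEntity::ok)
            // Only timeouts are mapped here; other errors propagate unchanged
            .onErrorResume(TimeoutException.class, e -> Mono.just(
                ResponseEntity.status(HttpStatus.GATEWAY_TIMEOUT)
                    .body("Request timed out")
            ));
    }
}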

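Decoupling with a message queue

3.1 Using RabbitMQ (the same pattern applies to Kafka)
java
@Component
public class AIRequestProcessor {
    
    private final ChatClient aiClient;
    private final StringRedisTemplate redisTemplate;
    private final SimpMessagingTemplate messagingTemplate;
    
    public AIRequestProcessor(ChatClient aiClient,
                              StringRedisTemplate redisTemplate,
                              SimpMessagingTemplate messagingTemplate) {
        this.aiClient = aiClient;
        this.redisTemplate = redisTemplate;
        this.messagingTemplate = messagingTemplate;
    }
    
    @RabbitListener(queues = "ai.request.queue")
    public void processRequest(AIRequest request) {
        String result = aiClient.call(request.getPrompt());
        
        // Cache the result so that the polling endpoint can pick it up
        redisTemplate.opsForValue().set(
            "ai:result:" + request.getRequestId(),
            result,
            Duration.ofMinutes(10)
        );
        
        // Push a WebSocket notification to subscribers of this request
        messagingTemplate.convertAndSend(
            "/topic/ai-result/" + request.getRequestId(),
            new AIResponse(request.getRequestId(), result)
        );
    }
}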
3.2 Result polling endpoint
java
@RestController
public class AIResultController {
    
    @GetMapping("/ai/result/{requestId}")
    public ResponseEntity<?> getResult(@PathVariable String requestId) {
        String result = redisTemplate.opsForValue()
            .get("ai:result:" + requestId);
        
        if (result == null) {
            return ResponseEntity.accepted()
                .header("Retry-After", "5")
                .body(Map.of("status", "processing"));
        }
        
        return ResponseEntity.ok(Map.of("result", result));
    }
}

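Timeouts and retries

4.1 Configuring timeouts
java
@Configuration
public class AIClientConfig {
    
    @Bean
    public ChatClient aiClient(@Value("${spring.ai.openai.api-key}") String apiKey) {
        // Timeouts are a property of the HTTP layer, not of the chat options
        SimpleClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory();
        requestFactory.setConnectTimeout(10_000); // milliseconds
        requestFactory.setReadTimeout(60_000);    // milliseconds
        
        // Assumes the OpenAiApi(baseUrl, apiKey, RestClient.Builder) constructor of
        // Spring AI 0.8.1; this covers the blocking call(...) path only
        OpenAiApi openAiApi = new OpenAiApi(
            "https://api.openai.com",
            apiKey,
            RestClient.builder().requestFactory(requestFactory)
        );
        
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel("gpt-4")
            .withTemperature(0.7f)
            .build();
        
        return new OpenAiChatClient(openAiApi, options);
    }
}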
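4.2 Retries
java
@Service
public class AIServiceWithRetry {
    
    private final ChatClient aiClient;
    
    public AIServiceWithRetry(ChatClient aiClient) {
        this.aiClient = aiClient;
    }
    
    // @Retryable comes from Spring Retry, @CircuitBreaker from Resilience4j;
    // AIException stands for an application-defined exception type
    @Retryable(
        value = {TimeoutException.class, AIException.class},
        maxAttempts = 3,
        backoff = @Backoff(delay = 1000, multiplier = 2)
    )
    @CircuitBreaker(name = "aiService", fallbackMethod = "fallback")
    public String generateWithRetry(String prompt) {
        return aiClient.call(prompt);
    }
    
    public String fallback(String prompt, Throwable t) {
        return "The service is temporarily unavailable, please try again later";
    }
}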
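To make the `maxAttempts = 3` and `@Backoff(delay = 1000, multiplier = 2)` settings concrete, the following plain-Java sketch performs the same kind of exponential backoff by hand. `BackoffRetry` and its method names are hypothetical and only illustrate the idea; they are not Spring Retry's implementation.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper that retries a task with exponentially growing pauses. */
class BackoffRetry {

    /** Pause before retry number `retry` (1-based): initialDelay * multiplier^(retry - 1). */
    static long delayMillis(long initialDelayMillis, double multiplier, int retry) {
        return Math.round(initialDelayMillis * Math.pow(multiplier, retry - 1));
    }

    /** Runs the task up to maxAttempts times and rethrows the last failure if all attempts fail. */
    static <T> T call(Callable<T> task, int maxAttempts, long initialDelayMillis, double multiplier)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // No pause after the final attempt: there is nothing left to wait for
                    Thread.sleep(delayMillis(initialDelayMillis, multiplier, attempt));
                }
            }
        }
        throw last;
    }
}
```

With an initial delay of 1000 ms and a multiplier of 2, three attempts mean at most two pauses, of 1000 ms and 2000 ms.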

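Complete example: asynchronous streaming response

java
@RestController
public class AIStreamController {
    
    private final ChatClient aiClient;
    private final StreamingChatClient streamingClient;
    private final Executor aiTaskExecutor;
    
    public AIStreamController(ChatClient aiClient,
                              StreamingChatClient streamingClient,
                              @Qualifier("aiTaskExecutor") Executor aiTaskExecutor) {
        this.aiClient = aiClient;
        this.streamingClient = streamingClient;
        this.aiTaskExecutor = aiTaskExecutor;
    }
    
    @GetMapping(value = "/ai/chat-stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<String>> chatStream(
            @RequestParam String message,
            @RequestParam(defaultValue = "false") boolean async) {
        
        // Both branches are non-blocking; the flag only toggles timeout and error handling
        Flux<ServerSentEvent<String>> events = streamingClient.stream(new Prompt(message))
            .mapNotNull(response -> response.getResult().getOutput().getContent())
            .map(chunk -> ServerSentEvent.builder(chunk).build());
        
        if (!async) {
            // Plain stream: errors propagate to the client as-is
            return events;
        }
        
        // Guarded stream: fail if no chunk arrives for 30 seconds and
        // report errors to the client as a final SSE event
        return events
            .timeout(Duration.ofSeconds(30))
            .onErrorResume(e -> Flux.just(
                ServerSentEvent.builder("[ERROR] " + e.getMessage()).build()
            ));
    }
    
    @PostMapping("/ai/async-batch")
    public CompletableFuture<BatchResponse> batchProcess(
            @RequestBody BatchRequest request) {
        
        // Start one blocking call per prompt on the dedicated pool
        List<CompletableFuture<String>> futures = request.getPrompts().stream()
            .map(prompt -> CompletableFuture.supplyAsync(
                () -> aiClient.call(prompt),
                aiTaskExecutor
            ))
            .collect(Collectors.toList());
        
        // allOf completes when every call has finished; join() then returns immediately
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .thenApply(v -> {
                List<String> results = futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
                return new BatchResponse(results);
            });
    }
}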

Monitoring and circuit breaking

java
@Configuration
public class ResilienceConfig {
    
    @Bean
    public CircuitBreakerConfig aiCircuitBreakerConfig() {
        return CircuitBreakerConfig.custom()
            .failureRateThreshold(50)
            .waitDurationInOpenState(Duration.ofSeconds(30))
            .slidingWindowSize(10)
            .build();
    }
    
    @Bean
    public TimeLimiterConfig aiTimeLimiterConfig() {
        return TimeLimiterConfig.custom()
            .timeoutDuration(Duration.ofSeconds(30))
            .cancelRunningFuture(true)
            .build();
    }
}

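Best practices

  1. Choose the right asynchronous approach:
     • Short tasks: use @Async
     • Streaming responses: use WebFlux
     • Long-running tasks: use a message queue
  2. Resource management:
     • Size the thread pool according to the workload
     • Set a sensible queue capacity
     • Monitor the state of the thread pool
  3. Error handling:
     • Release resources promptly after a timeout
     • Implement a fallback strategy
     • Log failures in detail
  4. Performance:
     • Use connection pooling
     • Enable reactive backpressure
     • Cache frequently requested results

An asynchronous architecture along these lines addresses endpoint timeouts and blocking, and improves both throughput and user experience.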

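3. Streaming Output in Practice: A Typewriter Effect with SSE (Real-Time Front-End Updates)

This section shows how to combine Spring AI with Server-Sent Events (SSE) to stream output with a typewriter effect, so that the front end can render AI-generated content as it arrives.

I. SSE overview

Server-Sent Events (SSE) is an HTML5 technology that lets a server push events to a client over a long-lived HTTP response. Unlike WebSocket, SSE is one-way (server to client), which is all that streaming model output to a browser requires.

Key characteristics of SSE:
  • Runs over plain HTTP/HTTPS
  • Reconnects automatically
  • Lightweight, with a simple API
  • Naturally suited to streaming data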
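To show what actually travels over the wire, here is a small sketch that formats SSE frames by hand. `SseFrames` is a hypothetical helper for illustration only; in the controllers later in this section, Spring's `ServerSentEvent` type takes care of this encoding.

```java
/** Hypothetical helper that renders Server-Sent Events in their text/event-stream form. */
class SseFrames {

    /** Formats one event; event and id are optional and skipped when null. */
    static String format(String event, String id, String data) {
        StringBuilder frame = new StringBuilder();
        if (event != null) {
            frame.append("event: ").append(event).append('\n');
        }
        if (id != null) {
            frame.append("id: ").append(id).append('\n');
        }
        // Every payload line gets its own "data:" field; the client joins them with '\n'
        for (String line : data.split("\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        // A blank line terminates the event
        return frame.append('\n').toString();
    }

    /** A comment frame: ignored by the client, commonly used as a keep-alive heartbeat. */
    static String comment(String text) {
        return ": " + text + "\n\n";
    }
}
```

One detail matters for a typewriter effect: the client strips a single space after `data:`, so a chunk that itself begins with a space keeps it as long as exactly one separator space is written, as above. Also note that browsers do not dispatch an event whose data is empty.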

II. Integrating SSE with Spring AI

Environment setup

pom.xml dependencies:

XML
<dependencies>
    <!-- Spring Boot Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    
    <!-- Spring AI OpenAI -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        <version>0.8.1</version>
    </dependency>
    
    <!-- Lombok (optional) -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>
Configuration

application.yml:

YAML
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
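SSE controller implementation
java
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.StreamingChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.http.MediaType;
import org.springframework.http.codec.ServerSentEvent;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.time.Duration;
import java.util.Map;

@RestController
@RequestMapping("/api/ai")
@CrossOrigin(origins = "*") // allows any origin; restrict this outside of local demos
public class AIChatController {
    
    private final ChatClient chatClient;
    // In Spring AI 0.8.x, streaming is exposed by StreamingChatClient, not ChatClient
    private final StreamingChatClient streamingChatClient;
    
    public AIChatController(ChatClient chatClient, StreamingChatClient streamingChatClient) {
        this.chatClient = chatClient;
        this.streamingChatClient = streamingChatClient;
    }
    
    /**
     * SSE streaming chat endpoint
     * @param message the user message
     * @return the SSE stream, ending with a "done" event
     */
    @GetMapping(value = "/stream-chat", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<String>> streamChat(
            @RequestParam String message,
            @RequestParam(required = false, defaultValue = "false") boolean withMetadata) {
        
        Prompt prompt = new Prompt(new UserMessage(message));
        
        Flux<ServerSentEvent<String>> events = streamingChatClient.stream(prompt)
            // A chunk may carry no text (typically the last one), so skip nulls
            .mapNotNull(response -> response.getResult().getOutput().getContent())
            .map(content -> {
                // Build the SSE event
                ServerSentEvent.Builder<String> builder = ServerSentEvent.builder(content);
                
                // Event type (optional)
                builder.event("message");
                
                // Reconnection delay hint for the client
                builder.retry(Duration.ofSeconds(5));
                
                // Optional metadata such as an event id
                if (withMetadata) {
                    builder.id(String.valueOf(System.currentTimeMillis()));
                }
                
                return builder.build();
            })
            // Explicit end marker, so the front end knows when to close its EventSource
            .concatWith(Mono.just(ServerSentEvent.<String>builder()
                .event("done")
                .data("[DONE]")
                .build()));
        
        // Heartbeats keep the connection alive. The interval never completes on its
        // own, so stop the merged stream once the end marker has gone out.
        return events
            .mergeWith(heartbeatFlux())
            .takeUntil(event -> "done".equals(event.event()));
    }
    
    /**
     * Heartbeat Flux that keeps the SSE connection alive
     */
    private Flux<ServerSentEvent<String>> heartbeatFlux() {
        return Flux.interval(Duration.ofSeconds(15))
            .map(seq -> ServerSentEvent.<String>builder()
                .comment("heartbeat")
                .build());
    }
    
    /**
     * Blocking chat endpoint (non-streaming, for comparison)
     */
    @PostMapping("/chat")
    public Map<String, String> chat(@RequestBody Map<String, String> request) {
        String message = request.get("message");
        Prompt prompt = new Prompt(new UserMessage(message));
        // call(Prompt) returns a ChatResponse; extract the text from it
        String response = chatClient.call(prompt).getResult().getOutput().getContent();
        
        return Map.of("response", response);
    }
}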
Enhanced version: conversation context and configurable parameters
java
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.StreamingChatClient;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.MediaType;
import org.springframework.http.codec.ServerSentEvent;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

@RestController
@RequestMapping("/api/ai/v2")
public class EnhancedAIChatController {
    
    // In Spring AI 0.8.x, streaming is exposed by StreamingChatClient
    private final StreamingChatClient chatClient;
    
    // Per-session conversation context (use Redis or similar in production)
    private final Map<String, List<Message>> sessionContexts = new ConcurrentHashMap<>();
    
    @Value("${spring.ai.openai.chat.options.model:gpt-3.5-turbo}")
    private String model;
    
    public EnhancedAIChatController(StreamingChatClient chatClient) {
        this.chatClient = chatClient;
    }
    
    /**
     * Enhanced streaming chat with conversation context
     */
    @GetMapping(value = "/stream-chat-enhanced", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent<String>> streamChatEnhanced(
            @RequestParam String message,
            @RequestParam(required = false) String sessionId,
            @RequestParam(required = false, defaultValue = "0.7") float temperature) {
        
        // Look up or create the session context; the list is mutated from
        // reactive threads, so use a thread-safe implementation
        String actualSessionId = sessionId != null ? sessionId : generateSessionId();
        List<Message> context = sessionContexts.computeIfAbsent(
            actualSessionId, k -> new CopyOnWriteArrayList<>());
        
        // Seed a new session with a system message
        if (context.isEmpty()) {
            context.add(new SystemMessage("You are a helpful AI assistant"));
        }
        
        // Add the user message to the context
        context.add(new UserMessage(message));
        
        // Build the prompt from a snapshot of the context plus per-request options
        // (the 0.8.x builders use withXxx methods and Float values)
        OpenAiChatOptions options = OpenAiChatOptions.builder()
            .withModel(model)
            .withTemperature(temperature)
            .withMaxTokens(1000)
            .build();
        
        Prompt prompt = new Prompt(List.copyOf(context), options);
        
        // Collect the chunks so the complete reply is stored as ONE assistant
        // message; storing every chunk separately would fragment the history
        StringBuilder fullReply = new StringBuilder();
        
        // End marker; its data must be non-empty or the browser will not dispatch it
        ServerSentEvent<String> done = ServerSentEvent.<String>builder()
            .event("done")
            .id(actualSessionId)
            .data("[DONE]")
            .build();
        
        Flux<ServerSentEvent<String>> chunks = chatClient.stream(prompt)
            .flatMapIterable(ChatResponse::getResults)
            // A chunk may carry no text (typically the last one), so skip nulls
            .mapNotNull(generation -> generation.getOutput().getContent())
            .map(content -> {
                fullReply.append(content);
                return ServerSentEvent.<String>builder()
                    .data(content)
                    .event("chunk")
                    .id(actualSessionId)
                    .build();
            })
            .doOnComplete(() -> {
                context.add(new AssistantMessage(fullReply.toString()));
                
                // Cap the context length to stay within the token limit,
                // always keeping the system message at index 0
                while (context.size() > 20) {
                    context.remove(1);
                }
            })
            .concatWith(Mono.just(done))
            .onErrorResume(e -> Flux.just(
                // Report the error, then end the stream
                ServerSentEvent.<String>builder()
                    .event("error")
                    .data("Service temporarily unavailable: " + e.getMessage())
                    .build(),
                done
            ));
        
        // The heartbeat interval never completes on its own, so stop the merged
        // stream once the end marker has gone out
        return chunks
            .mergeWith(heartbeatFlux())
            .takeUntil(event -> "done".equals(event.event()));
    }
    
    /**
     * Clear the session context
     */
    @DeleteMapping("/session/{sessionId}")
    public Map<String, String> clearSession(@PathVariable String sessionId) {
        sessionContexts.remove(sessionId);
        return Map.of("status", "success", "message", "Session cleared");
    }
    
    /**
     * Get session information
     */
    @GetMapping("/session/{sessionId}/info")
    public Map<String, Object> getSessionInfo(@PathVariable String sessionId) {
        List<Message> context = sessionContexts.get(sessionId);
        return Map.of(
            "sessionId", sessionId,
            "hasContext", context != null,
            "contextSize", context != null ? context.size() : 0
        );
    }
    
    private String generateSessionId() {
        return "session_" + System.currentTimeMillis() + "_" + 
               Math.abs((int) (Math.random() * 10000));
    }
    
    private Flux<ServerSentEvent<String>> heartbeatFlux() {
        return Flux.interval(Duration.ofSeconds(30))
            .map(seq -> ServerSentEvent.<String>builder()
                .comment("heartbeat")
                .build());
    }
}
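The enhanced controller caps the context at 20 messages while keeping the system message in place. The sketch below isolates that trimming rule as a pure function so it can be reasoned about and tested on its own. `ContextWindow` and `trim` are hypothetical names, and the rule (keep the first entry plus the most recent ones) is a simplification: it counts messages, not tokens.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper implementing a sliding context window that pins the first entry. */
class ContextWindow {

    /** Returns a copy holding element 0 (the system message) and the newest maxSize - 1 others. */
    static <T> List<T> trim(List<T> context, int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be at least 1");
        }
        if (context.size() <= maxSize) {
            return new ArrayList<>(context); // nothing to drop
        }
        List<T> trimmed = new ArrayList<>(maxSize);
        trimmed.add(context.get(0));
        // Keep only the tail; the oldest turns after the system message are dropped
        trimmed.addAll(context.subList(context.size() - (maxSize - 1), context.size()));
        return trimmed;
    }
}
```

Because the function returns a new list instead of mutating shared state, a caller can build the prompt from the trimmed copy without holding a lock on the session's history.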

III. Implementing the typewriter effect on the front end

HTML page
HTML
<!DOCTYPE html>
<html lang="zh-CN">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>SpringAI Typewriter Effect Demo</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }
        
        body {
            font-family: 'Segoe UI', 'Microsoft YaHei', sans-serif;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            min-height: 100vh;
            padding: 20px;
        }
        
        .container {
            max-width: 1200px;
            margin: 0 auto;
            display: grid;
            grid-template-columns: 300px 1fr;
            gap: 20px;
            height: calc(100vh - 40px);
        }
        
        /* Left-hand control panel */
        .control-panel {
            background: rgba(255, 255, 255, 0.95);
            border-radius: 20px;
            padding: 25px;
            box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
            backdrop-filter: blur(10px);
            border: 1px solid rgba(255, 255, 255, 0.2);
        }
        
        .panel-title {
            font-size: 24px;
            font-weight: 600;
            color: #333;
            margin-bottom: 25px;
            display: flex;
            align-items: center;
            gap: 10px;
        }
        
        .panel-title i {
            color: #667eea;
        }
        
        .control-group {
            margin-bottom: 25px;
        }
        
        .control-label {
            display: block;
            font-size: 14px;
            color: #666;
            margin-bottom: 8px;
            font-weight: 500;
        }
        
        .slider-container {
            position: relative;
            padding: 10px 0;
        }
        
        .slider-value {
            position: absolute;
            right: 0;
            top: 0;
            background: #667eea;
            color: white;
            padding: 2px 8px;
            border-radius: 12px;
            font-size: 12px;
            font-weight: 600;
        }
        
        input[type="range"] {
            width: 100%;
            height: 6px;
            -webkit-appearance: none;
            background: linear-gradient(to right, #667eea, #764ba2);
            border-radius: 3px;
            outline: none;
        }
        
        input[type="range"]::-webkit-slider-thumb {
            -webkit-appearance: none;
            width: 20px;
            height: 20px;
            background: white;
            border: 2px solid #667eea;
            border-radius: 50%;
            cursor: pointer;
            box-shadow: 0 2px 10px rgba(0, 0, 0, 0.2);
        }
        
        .btn-group {
            display: flex;
            gap: 10px;
            margin-top: 20px;
        }
        
        button {
            flex: 1;
            padding: 12px;
            border: none;
            border-radius: 12px;
            font-weight: 600;
            cursor: pointer;
            transition: all 0.3s ease;
            display: flex;
            align-items: center;
            justify-content: center;
            gap: 8px;
        }
        
        .btn-primary {
            background: linear-gradient(135deg, #667eea, #764ba2);
            color: white;
        }
        
        .btn-secondary {
            background: #f1f3f9;
            color: #667eea;
            border: 2px solid #e4e8f7;
        }
        
        button:hover {
            transform: translateY(-2px);
            box-shadow: 0 10px 20px rgba(0, 0, 0, 0.2);
        }
        
        button:active {
            transform: translateY(0);
        }
        
        button:disabled {
            opacity: 0.5;
            cursor: not-allowed;
            transform: none !important;
        }
        
        /* Right-hand chat area */
        .chat-area {
            background: rgba(255, 255, 255, 0.95);
            border-radius: 20px;
            display: flex;
            flex-direction: column;
            box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
            backdrop-filter: blur(10px);
            border: 1px solid rgba(255, 255, 255, 0.2);
            overflow: hidden;
        }
        
        .chat-header {
            padding: 20px 25px;
            border-bottom: 1px solid #eee;
            display: flex;
            justify-content: space-between;
            align-items: center;
        }
        
        .header-title {
            font-size: 20px;
            font-weight: 600;
            color: #333;
        }
        
        .connection-status {
            display: flex;
            align-items: center;
            gap: 8px;
            font-size: 14px;
            color: #666;
        }
        
        .status-dot {
            width: 8px;
            height: 8px;
            border-radius: 50%;
            background: #ccc;
        }
        
        .status-dot.connected {
            background: #4CAF50;
            animation: pulse 2s infinite;
        }
        
        @keyframes pulse {
            0% { opacity: 1; }
            50% { opacity: 0.5; }
            100% { opacity: 1; }
        }
        
        .chat-container {
            flex: 1;
            padding: 25px;
            overflow-y: auto;
            display: flex;
            flex-direction: column;
            gap: 20px;
        }
        
        .message {
            max-width: 80%;
            padding: 15px 20px;
            border-radius: 18px;
            line-height: 1.6;
            position: relative;
            animation: fadeIn 0.3s ease;
        }
        
        @keyframes fadeIn {
            from { opacity: 0; transform: translateY(10px); }
            to { opacity: 1; transform: translateY(0); }
        }
        
        .user-message {
            align-self: flex-end;
            background: linear-gradient(135deg, #667eea, #764ba2);
            color: white;
            border-bottom-right-radius: 5px;
        }
        
        .ai-message {
            align-self: flex-start;
            background: #f1f3f9;
            color: #333;
            border-bottom-left-radius: 5px;
        }
        
        .ai-message.typing {
            min-height: 60px;
            display: flex;
            align-items: center;
        }
        
        .typing-indicator {
            display: flex;
            gap: 4px;
        }
        
        .typing-dot {
            width: 8px;
            height: 8px;
            border-radius: 50%;
            background: #667eea;
            animation: typing 1.4s infinite;
        }
        
        .typing-dot:nth-child(2) { animation-delay: 0.2s; }
        .typing-dot:nth-child(3) { animation-delay: 0.4s; }
        
        @keyframes typing {
            0%, 60%, 100% { transform: translateY(0); }
            30% { transform: translateY(-10px); }
        }
        
        .chat-input-area {
            padding: 20px 25px;
            border-top: 1px solid #eee;
            background: #fafafa;
        }
        
        .input-container {
            display: flex;
            gap: 15px;
            align-items: flex-end;
        }
        
        textarea {
            flex: 1;
            min-height: 60px;
            max-height: 150px;
            padding: 15px;
            border: 2px solid #e4e8f7;
            border-radius: 12px;
            font-family: inherit;
            font-size: 14px;
            resize: none;
            outline: none;
            transition: border-color 0.3s ease;
            background: white;
        }
        
        textarea:focus {
            border-color: #667eea;
        }
        
        .send-btn {
            width: 50px;
            height: 50px;
            border-radius: 50%;
            background: linear-gradient(135deg, #667eea, #764ba2);
            color: white;
            border: none;
            cursor: pointer;
            display: flex;
            align-items: center;
            justify-content: center;
            transition: all 0.3s ease;
        }
        
        .send-btn:hover {
            transform: scale(1.1);
            box-shadow: 0 10px 20px rgba(102, 126, 234, 0.4);
        }
        
        /* 滚动条样式 */
        .chat-container::-webkit-scrollbar {
            width: 6px;
        }
        
        .chat-container::-webkit-scrollbar-track {
            background: #f1f1f1;
            border-radius: 3px;
        }
        
        .chat-container::-webkit-scrollbar-thumb {
            background: #c1c1c1;
            border-radius: 3px;
        }
        
        .chat-container::-webkit-scrollbar-thumb:hover {
            background: #a1a1a1;
        }
        
        /* 响应式设计 */
        @media (max-width: 768px) {
            .container {
                grid-template-columns: 1fr;
                height: auto;
            }
            
            .control-panel {
                order: 2;
            }
            
            .message {
                max-width: 90%;
            }
        }
    </style>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css">
</head>
<body>
    <div class="container">
        <!-- 左侧控制面板 -->
        <div class="control-panel">
            <div class="panel-title">
                <i class="fas fa-sliders-h"></i>
                控制面板
            </div>
            
            <div class="control-group">
                <label class="control-label">
                    <i class="fas fa-thermometer-half"></i>
                    温度 (Temperature)
                </label>
                <div class="slider-container">
                    <span class="slider-value" id="tempValue">0.7</span>
                    <input type="range" id="temperature" min="0" max="1" step="0.1" value="0.7">
                </div>
                <div class="help-text">值越高,回答越随机;值越低,回答越确定</div>
            </div>
            
            <div class="control-group">
                <label class="control-label">
                    <i class="fas fa-tachometer-alt"></i>
                    速度 (Speed)
                </label>
                <div class="slider-container">
                    <span class="slider-value" id="speedValue">50</span>
                    <input type="range" id="typingSpeed" min="10" max="100" value="50">
                </div>
                <div class="help-text">打字机效果的速度</div>
            </div>
            
            <div class="control-group">
                <label class="control-label">
                    <i class="fas fa-robot"></i>
                    AI 模型
                </label>
                <select id="modelSelect" class="model-select">
                    <option value="gpt-3.5-turbo">GPT-3.5 Turbo</option>
                    <option value="gpt-4">GPT-4</option>
                    <option value="gpt-4-turbo">GPT-4 Turbo</option>
                </select>
            </div>
            
            <div class="control-group">
                <label class="control-label">
                    <i class="fas fa-history"></i>
                    上下文长度
                </label>
                <select id="contextLength" class="model-select">
                    <option value="5">5 轮对话</option>
                    <option value="10" selected>10 轮对话</option>
                    <option value="20">20 轮对话</option>
                    <option value="50">50 轮对话</option>
                </select>
            </div>
            
            <div class="btn-group">
                <button class="btn-secondary" onclick="clearChat()">
                    <i class="fas fa-trash"></i>
                    清空对话
                </button>
                <button class="btn-secondary" onclick="newSession()">
                    <i class="fas fa-plus"></i>
                    新会话
                </button>
            </div>
            
            <div class="connection-info">
                <div class="control-label">
                    <i class="fas fa-link"></i>
                    连接状态
                </div>
                <div class="connection-status">
                    <span class="status-dot" id="connectionDot"></span>
                    <span id="connectionStatus">未连接</span>
                </div>
            </div>
        </div>
        
        <!-- 右侧聊天区域 -->
        <div class="chat-area">
            <div class="chat-header">
                <div class="header-title">
                    <i class="fas fa-robot"></i>
                    SpringAI 智能助手
                </div>
                <div class="connection-status">
                    <span class="status-dot" id="chatConnectionDot"></span>
                    <span id="chatConnectionStatus">准备就绪</span>
                </div>
            </div>
            
            <div class="chat-container" id="chatContainer">
                <div class="message ai-message">
                    您好!我是基于 SpringAI 的智能助手。我可以帮您解答问题、编写代码、创作内容等。
                    <br><br>
                    请在下方的输入框中输入您的问题,我会以打字机效果实时回复您。
                </div>
            </div>
            
            <div class="chat-input-area">
                <div class="input-container">
                    <textarea 
                        id="messageInput" 
                        placeholder="请输入您的问题(Shift + Enter 换行,Enter 发送)..."
                        onkeydown="handleKeyPress(event)"
                        rows="1"></textarea>
                    <button class="send-btn" id="sendButton" onclick="sendMessage()">
                        <i class="fas fa-paper-plane"></i>
                    </button>
                </div>
                <div class="input-hint">
                    <small>按 Enter 发送,Shift + Enter 换行</small>
                </div>
            </div>
        </div>
    </div>

    <script>
        // 全局变量
        let eventSource = null;
        let sessionId = 'session_' + Date.now();
        let isConnected = false;
        let typingSpeed = 50; // 打字速度(字符/秒)
        let currentTypingAnimation = null;
        
        // 初始化
        document.addEventListener('DOMContentLoaded', function() {
            // 初始化滑块
            initSliders();
            
            // 初始化事件监听
            initEventListeners();
            
            // 自动调整输入框高度
            autoResizeTextarea();
            
            // 连接SSE
            connectSSE();
        });
        
        // 初始化滑块
        function initSliders() {
            const tempSlider = document.getElementById('temperature');
            const tempValue = document.getElementById('tempValue');
            const speedSlider = document.getElementById('typingSpeed');
            const speedValue = document.getElementById('speedValue');
            
            tempSlider.addEventListener('input', function() {
                tempValue.textContent = this.value;
            });
            
            speedSlider.addEventListener('input', function() {
                typingSpeed = parseInt(this.value);
                speedValue.textContent = this.value;
            });
        }
        
        // 初始化事件监听
        function initEventListeners() {
            // 输入框自动高度
            const textarea = document.getElementById('messageInput');
            textarea.addEventListener('input', autoResizeTextarea);
            
            // 模型切换
            document.getElementById('modelSelect').addEventListener('change', function() {
                console.log('切换模型:', this.value);
            });
            
            // 上下文长度切换
            document.getElementById('contextLength').addEventListener('change', function() {
                console.log('切换上下文长度:', this.value);
            });
        }
        
        // 自动调整输入框高度
        function autoResizeTextarea() {
            const textarea = document.getElementById('messageInput');
            textarea.style.height = 'auto';
            textarea.style.height = Math.min(textarea.scrollHeight, 150) + 'px';
        }
        
        // 连接SSE
        function connectSSE() {
            if (eventSource) {
                eventSource.close();
            }
            
            updateConnectionStatus('connecting');
            
            // 构建SSE URL
            const model = document.getElementById('modelSelect').value;
            const temperature = document.getElementById('temperature').value;
            const url = `/api/ai/stream-chat?sessionId=${encodeURIComponent(sessionId)}&temperature=${encodeURIComponent(temperature)}&model=${encodeURIComponent(model)}`;
            
            eventSource = new EventSource(url);
            
            eventSource.onopen = function() {
                updateConnectionStatus('connected');
                console.log('SSE连接已建立');
            };
            
            eventSource.onmessage = function(event) {
                const data = JSON.parse(event.data);
                handleStreamData(data);
            };
            
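            eventSource.onerror = function(error) {
                console.error('SSE连接错误:', error);
                updateConnectionStatus('error');
                
                // Close this source so the browser's built-in retry does not race the manual one
                this.close();
                
                // Try to reconnect after 3 seconds
                setTimeout(() => {
                    if (!isConnected) {
                        console.log('尝试重新连接...');
                        connectSSE();
                    }
                }, 3000);
            };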
        }
        
        // 更新连接状态
        function updateConnectionStatus(status) {
            const dot = document.getElementById('connectionDot');
            const chatDot = document.getElementById('chatConnectionDot');
            const statusText = document.getElementById('connectionStatus');
            const chatStatusText = document.getElementById('chatConnectionStatus');
            
            switch(status) {
                case 'connecting':
                    dot.className = 'status-dot connecting';
                    chatDot.className = 'status-dot connecting';
                    statusText.textContent = '连接中...';
                    chatStatusText.textContent = '连接中...';
                    isConnected = false;
                    break;
                case 'connected':
                    dot.className = 'status-dot connected';
                    chatDot.className = 'status-dot connected';
                    statusText.textContent = '已连接';
                    chatStatusText.textContent = '已连接';
                    isConnected = true;
                    break;
                case 'error':
                    dot.className = 'status-dot error';
                    chatDot.className = 'status-dot error';
                    statusText.textContent = '连接错误';
                    chatStatusText.textContent = '连接错误';
                    isConnected = false;
                    break;
                case 'disconnected':
                    dot.className = 'status-dot';
                    chatDot.className = 'status-dot';
                    statusText.textContent = '未连接';
                    chatStatusText.textContent = '未连接';
                    isConnected = false;
                    break;
            }
        }
        
        // 处理流式数据
        function handleStreamData(data) {
            const { type, content, messageId, isComplete } = data;
            
            switch(type) {
                case 'start':
                    // 开始新的AI消息
                    createAIMessage(messageId);
                    break;
                    
                case 'content':
                    // 追加内容到AI消息
                    appendToAIMessage(messageId, content);
                    break;
                    
                case 'complete':
                    // 消息完成
                    completeAIMessage(messageId, isComplete);
                    break;
                    
                case 'error':
                    // 错误处理
                    showErrorMessage(content);
                    break;
            }
        }
        
        // 创建AI消息容器
        function createAIMessage(messageId) {
            const chatContainer = document.getElementById('chatContainer');
            
            // 移除之前的打字中状态
            const existingTyping = chatContainer.querySelector('.ai-message.typing');
            if (existingTyping) {
                chatContainer.removeChild(existingTyping);
            }
            
            // 创建新的消息容器
            const messageDiv = document.createElement('div');
            messageDiv.className = 'message ai-message typing';
            messageDiv.id = `msg-${messageId}`;
            messageDiv.dataset.messageId = messageId;
            messageDiv.dataset.content = '';
            
            // 添加打字指示器
            const typingIndicator = document.createElement('div');
            typingIndicator.className = 'typing-indicator';
            typingIndicator.innerHTML = `
                <div class="typing-dot"></div>
                <div class="typing-dot"></div>
                <div class="typing-dot"></div>
            `;
            
            messageDiv.appendChild(typingIndicator);
            chatContainer.appendChild(messageDiv);
            
            // 滚动到底部
            scrollToBottom();
        }
        
        // Append content to the AI message (typewriter effect)
        function appendToAIMessage(messageId, newContent) {
            const messageDiv = document.getElementById(`msg-${messageId}`);
            if (!messageDiv || !newContent) return;
            
            // dataset.content holds everything received so far and dataset.shown counts
            // the characters already rendered, so a chunk that arrives mid-animation
            // extends the target instead of discarding the untyped remainder
            const received = messageDiv.dataset.content || '';
            const alreadyTyping = Number(messageDiv.dataset.shown || 0) < received.length;
            messageDiv.dataset.content = received + newContent;
            
            // A running timer picks up the new text by itself
            if (alreadyTyping) return;
            
            function typeCharacter() {
                const fullContent = messageDiv.dataset.content;
                let shown = Number(messageDiv.dataset.shown || 0);
                if (shown >= fullContent.length) {
                    currentTypingAnimation = null;
                    return;
                }
                
                // First character: drop the typing indicator and the typing state
                if (shown === 0) {
                    messageDiv.innerHTML = '';
                    messageDiv.className = 'message ai-message';
                }
                
                // Create or update the content element
                let contentSpan = messageDiv.querySelector('.message-content');
                if (!contentSpan) {
                    contentSpan = document.createElement('span');
                    contentSpan.className = 'message-content';
                    messageDiv.appendChild(contentSpan);
                }
                
                shown++;
                messageDiv.dataset.shown = shown;
                contentSpan.textContent = fullContent.substring(0, shown);
                
                // Highlight code blocks
                highlightCodeBlocks(messageDiv);
                
                // Apply Markdown formatting
                formatMarkdown(messageDiv);
                
                // Schedule the next character only while text is pending; typingSpeed is
                // read on every tick so slider changes take effect immediately
                if (shown < fullContent.length) {
                    currentTypingAnimation = setTimeout(typeCharacter, 1000 / typingSpeed);
                } else {
                    currentTypingAnimation = null;
                }
                
                // Scroll to the bottom
                scrollToBottom();
            }
            
            typeCharacter();
        }
        
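        // Complete the AI message
        function completeAIMessage(messageId, success) {
            const messageDiv = document.getElementById(`msg-${messageId}`);
            if (!messageDiv) return;
            
            // Stop the typewriter first, then render the text received so far.
            // This has to happen before the copy button is added, because the
            // formatting helpers rewrite innerHTML and would drop its click handler
            if (currentTypingAnimation) {
                clearTimeout(currentTypingAnimation);
                currentTypingAnimation = null;
            }
            const contentSpan = messageDiv.querySelector('.message-content');
            if (contentSpan && messageDiv.dataset.content) {
                contentSpan.textContent = messageDiv.dataset.content;
                highlightCodeBlocks(messageDiv);
                formatMarkdown(messageDiv);
            }
            
            // Leave the typing state
            messageDiv.classList.remove('typing');
            const typingIndicator = messageDiv.querySelector('.typing-indicator');
            if (typingIndicator) {
                messageDiv.removeChild(typingIndicator);
            }
            
            // If the message is empty, show a hint
            if (!messageDiv.dataset.content || messageDiv.dataset.content.trim() === '') {
                messageDiv.innerHTML = '<span class="message-content">AI没有返回内容</span>';
            }
            
            // Mark the message as complete or failed
            if (success) {
                messageDiv.classList.add('complete');
                
                // Add the copy button
                addCopyButton(messageDiv);
            } else {
                messageDiv.classList.add('error');
                messageDiv.innerHTML = '<span class="message-content">消息生成失败</span>';
            }
            
            // Scroll to the bottom
            scrollToBottom();
        }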
        
        // 发送消息
        function sendMessage() {
            const input = document.getElementById('messageInput');
            const message = input.value.trim();
            
            if (!message) {
                showToast('请输入消息');
                return;
            }
            
            if (!isConnected) {
                showToast('正在连接服务器,请稍后...');
                return;
            }
            
            // 添加用户消息到界面
            addUserMessage(message);
            
            // 清空输入框
            input.value = '';
            autoResizeTextarea();
            
            // 发送到服务器
            const model = document.getElementById('modelSelect').value;
            const temperature = document.getElementById('temperature').value;
            const contextLength = document.getElementById('contextLength').value;
            
            fetch('/api/ai/chat', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                    sessionId: sessionId,
                    message: message,
                    temperature: parseFloat(temperature),
                    model: model,
                    maxContextLength: parseInt(contextLength)
                })
            })
            .then(response => {
                if (!response.ok) {
                    throw new Error('发送失败');
                }
            })
            .catch(error => {
                console.error('发送消息失败:', error);
                showToast('发送失败,请检查网络连接');
            });
        }
        
        // 添加用户消息
        function addUserMessage(content) {
            const chatContainer = document.getElementById('chatContainer');
            
            const messageDiv = document.createElement('div');
            messageDiv.className = 'message user-message';
            messageDiv.innerHTML = `<span class="message-content">${escapeHtml(content)}</span>`;
            
            chatContainer.appendChild(messageDiv);
            scrollToBottom();
        }
        
        // 显示错误消息
        function showErrorMessage(error) {
            const chatContainer = document.getElementById('chatContainer');
            
            const errorDiv = document.createElement('div');
            errorDiv.className = 'message ai-message error';
            errorDiv.innerHTML = `<span class="message-content">错误: ${escapeHtml(error)}</span>`;
            
            chatContainer.appendChild(errorDiv);
            scrollToBottom();
        }
        
        // 滚动到底部
        function scrollToBottom() {
            const chatContainer = document.getElementById('chatContainer');
            chatContainer.scrollTop = chatContainer.scrollHeight;
        }
        
        // 键盘事件处理
        function handleKeyPress(event) {
            if (event.key === 'Enter' && !event.shiftKey) {
                event.preventDefault();
                sendMessage();
            }
        }
        
        // Clear the conversation
        function clearChat() {
            const chatContainer = document.getElementById('chatContainer');
            chatContainer.innerHTML = `
                <div class="message ai-message">
                    对话已清空。请输入您的问题开始新的对话。
                </div>
            `;
            
            // Generate a new session ID and reopen the stream, since the SSE URL carries the session ID
            sessionId = 'session_' + Date.now();
            connectSSE();
            
            showToast('对话已清空');
        }
        
        // New session
        function newSession() {
            // clearChat() already generates a fresh session ID and reconnects
            clearChat();
            showToast('新会话已创建');
        }
        
        // 添加复制按钮
        function addCopyButton(messageDiv) {
            const copyBtn = document.createElement('button');
            copyBtn.className = 'copy-btn';
            copyBtn.innerHTML = '<i class="far fa-copy"></i>';
            copyBtn.title = '复制内容';
            copyBtn.onclick = function() {
                const content = messageDiv.dataset.content || 
                              messageDiv.querySelector('.message-content').textContent;
                navigator.clipboard.writeText(content).then(() => {
                    showToast('已复制到剪贴板');
                });
            };
            
            messageDiv.appendChild(copyBtn);
        }
        
        // 高亮代码块
        function highlightCodeBlocks(messageDiv) {
            const content = messageDiv.innerHTML;
            const codeBlockRegex = /```(\w+)?\n([\s\S]*?)```/g;
            
            let highlighted = content.replace(codeBlockRegex, (match, lang, code) => {
                const language = lang || 'text';
                return `
                    <div class="code-block">
                        <div class="code-header">
                            <span class="language">${language}</span>
                            <button class="copy-code-btn" onclick="copyCode(this)">
                                <i class="far fa-copy"></i>
                            </button>
                        </div>
                        <pre><code class="language-${language}">${code.trim()}</code></pre>
                    </div>
                `;
            });
            
            // 行内代码
            highlighted = highlighted.replace(/`([^`]+)`/g, '<code class="inline-code">$1</code>');
            
            messageDiv.innerHTML = highlighted;
            
            // 如果有highlight.js,应用语法高亮
            if (window.hljs) {
                messageDiv.querySelectorAll('pre code').forEach((block) => {
                    hljs.highlightElement(block);
                });
            }
        }
        
        // 格式化Markdown
        function formatMarkdown(messageDiv) {
            let content = messageDiv.innerHTML;
            
            // 标题
            content = content.replace(/^### (.*$)/gm, '<h3>$1</h3>');
            content = content.replace(/^## (.*$)/gm, '<h2>$1</h2>');
            content = content.replace(/^# (.*$)/gm, '<h1>$1</h1>');
            
            // 列表
            content = content.replace(/^\* (.*$)/gm, '<li>$1</li>');
            content = content.replace(/(<li>.*<\/li>)/s, '<ul>$1</ul>');
            
            // 粗体
            content = content.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>');
            
            // 斜体
            content = content.replace(/\*(.*?)\*/g, '<em>$1</em>');
            
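            // Links: accept only http(s) URLs without quotes, so model output cannot inject attributes or script URLs
            content = content.replace(/\[([^\]]+)\]\((https?:\/\/[^)\s"']+)\)/g, '<a href="$2" target="_blank" rel="noopener noreferrer">$1</a>');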
            
            messageDiv.innerHTML = content;
        }
        
        // 复制代码
        function copyCode(button) {
            const codeBlock = button.closest('.code-block');
            const code = codeBlock.querySelector('code').textContent;
            
            navigator.clipboard.writeText(code).then(() => {
                const originalHTML = button.innerHTML;
                button.innerHTML = '<i class="fas fa-check"></i>';
                button.style.color = '#4CAF50';
                
                setTimeout(() => {
                    button.innerHTML = originalHTML;
                    button.style.color = '';
                }, 2000);
            });
        }
        
        // 显示Toast提示
        function showToast(message) {
            // 移除现有的toast
            const existingToast = document.querySelector('.toast');
            if (existingToast) {
                existingToast.remove();
            }
            
            const toast = document.createElement('div');
            toast.className = 'toast';
            toast.textContent = message;
            
            // 添加样式
            toast.style.cssText = `
                position: fixed;
                bottom: 20px;
                left: 50%;
                transform: translateX(-50%) translateY(100px);
                background: rgba(0, 0, 0, 0.8);
                color: white;
                padding: 12px 24px;
                border-radius: 8px;
                font-size: 14px;
                z-index: 1000;
                transition: transform 0.3s ease;
            `;
            
            document.body.appendChild(toast);
            
            // 显示动画
            setTimeout(() => {
                toast.style.transform = 'translateX(-50%) translateY(0)';
            }, 10);
            
            // 自动消失
            setTimeout(() => {
                toast.style.transform = 'translateX(-50%) translateY(100px)';
                setTimeout(() => {
                    if (toast.parentNode) {
                        toast.parentNode.removeChild(toast);
                    }
                }, 300);
            }, 3000);
        }
        
        // HTML转义
        function escapeHtml(text) {
            const div = document.createElement('div');
            div.textContent = text;
            return div.innerHTML;
        }
        
        // 添加CSS样式
        const style = document.createElement('style');
        style.textContent = `
            .code-block {
                background: #282c34;
                border-radius: 8px;
                margin: 10px 0;
                overflow: hidden;
            }
            
            .code-header {
                display: flex;
                justify-content: space-between;
                align-items: center;
                padding: 8px 16px;
                background: #1e2227;
                color: #abb2bf;
                font-size: 12px;
                font-family: 'Consolas', monospace;
            }
            
            .copy-code-btn {
                background: none;
                border: none;
                color: #abb2bf;
                cursor: pointer;
                padding: 4px 8px;
                border-radius: 4px;
                transition: background 0.2s;
            }
            
            .copy-code-btn:hover {
                background: rgba(255, 255, 255, 0.1);
            }
            
            .code-block pre {
                margin: 0;
                padding: 16px;
                overflow-x: auto;
            }
            
            .code-block code {
                font-family: 'Consolas', 'Monaco', 'Courier New', monospace;
                font-size: 14px;
                line-height: 1.5;
                color: #abb2bf;
            }
            
            .inline-code {
                background: #f1f3f9;
                padding: 2px 6px;
                border-radius: 4px;
                font-family: 'Consolas', monospace;
                font-size: 0.9em;
                color: #e74c3c;
            }
            
            .message {
                position: relative;
            }
            
            .copy-btn {
                position: absolute;
                top: 10px;
                right: 10px;
                background: rgba(255, 255, 255, 0.8);
                border: none;
                border-radius: 4px;
                padding: 4px 8px;
                font-size: 12px;
                cursor: pointer;
                opacity: 0;
                transition: opacity 0.2s;
            }
            
            .message:hover .copy-btn {
                opacity: 1;
            }
            
            .message.error {
                background: #ffebee;
                color: #c62828;
                border: 1px solid #ffcdd2;
            }
            
            .status-dot.connecting {
                background: #ff9800;
                animation: pulse 1s infinite;
            }
            
            .status-dot.error {
                background: #f44336;
                animation: none;
            }
            
            .help-text {
                font-size: 12px;
                color: #888;
                margin-top: 4px;
            }
            
            .model-select {
                width: 100%;
                padding: 10px;
                border: 2px solid #e4e8f7;
                border-radius: 8px;
                font-size: 14px;
                outline: none;
                background: white;
                cursor: pointer;
            }
            
            .model-select:focus {
                border-color: #667eea;
            }
            
            .connection-info {
                margin-top: 30px;
                padding-top: 20px;
                border-top: 1px solid #eee;
            }
            
            .input-hint {
                margin-top: 8px;
                text-align: right;
                color: #888;
                font-size: 12px;
            }
        `;
        document.head.appendChild(style);
    </script>
</body>
</html>
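
The page above never shows the server side, but `handleStreamData()` implies a small protocol: every SSE message carries a JSON object with the fields `type` (`start`, `content`, `complete`, or `error`), `content`, `messageId`, and `isComplete`. The following is only a sketch of how such frames could be produced, using plain Java with no framework. The class and method names (`SseEventSketch`, `StreamEvent`, `framesFor`) are hypothetical and do not come from the article; a real Spring endpoint would more likely hand the serialisation to `SseEmitter` or `Flux<ServerSentEvent<...>>` with a JSON library.

```java
import java.util.ArrayList;
import java.util.List;

public class SseEventSketch {

    /** One event in the shape that the page's handleStreamData() destructures. */
    record StreamEvent(String type, String content, String messageId, boolean isComplete) {

        /** Serialises the event as a JSON object, escaping the string fields. */
        String toJson() {
            return "{\"type\":\"" + escape(type) + "\","
                 + "\"content\":\"" + escape(content) + "\","
                 + "\"messageId\":\"" + escape(messageId) + "\","
                 + "\"isComplete\":" + isComplete + "}";
        }

        /** Wraps the JSON in one SSE frame: a "data:" line followed by a blank line. */
        String toSseFrame() {
            return "data: " + toJson() + "\n\n";
        }
    }

    /** Minimal JSON string escaping; raw newlines must never reach the "data:" line. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"'  -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
                }
            }
        }
        return sb.toString();
    }

    /** Builds the start / content... / complete sequence for one AI reply. */
    static List<String> framesFor(String messageId, List<String> chunks) {
        List<String> frames = new ArrayList<>();
        frames.add(new StreamEvent("start", "", messageId, false).toSseFrame());
        for (String chunk : chunks) {
            frames.add(new StreamEvent("content", chunk, messageId, false).toSseFrame());
        }
        frames.add(new StreamEvent("complete", "", messageId, true).toSseFrame());
        return frames;
    }

    public static void main(String[] args) {
        // Prints four frames: start, two content chunks, complete
        framesFor("m1", List.of("Hello", " \"world\"\n")).forEach(System.out::print);
    }
}
```

Because the JSON escaping turns line breaks into `\n` sequences, each event fits on a single `data:` line, which is what lets the page call `JSON.parse(event.data)` directly in `onmessage`.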

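4. Batch Q&A and multi-turn conversation: basic implementation

This section walks through a basic implementation of batch Q&A and multi-turn conversation with Spring AI.

Environment setup

First, add the dependency. No version is given here, so it has to be managed elsewhere, for example by importing the Spring AI BOM:

XML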
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

Configure application.yml:

YAML
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7

Batch Q&A Implementation

2.1 Basic Batch Q&A
java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

import java.util.List;
import java.util.ArrayList;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

@Service
public class BatchChatService {
    
    private final ChatClient chatClient;
    private final ExecutorService executorService;
    
    public BatchChatService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
        this.executorService = Executors.newFixedThreadPool(10);
    }
    
    /**
     * Synchronous batch Q&A: questions are sent one at a time, in order.
     */
    public List<String> batchChatSync(List<String> questions) {
        List<String> answers = new ArrayList<>();
        
        for (String question : questions) {
            String answer = chatClient.prompt()
                .user(question)
                .call()
                .content();
            answers.add(answer);
        }
        
        return answers;
    }
    
    /**
     * Asynchronous batch Q&A: calls run concurrently on the thread pool for better throughput.
     */
    public CompletableFuture<List<String>> batchChatAsync(List<String> questions) {
        List<CompletableFuture<String>> futures = questions.stream()
            .map(question -> CompletableFuture.supplyAsync(() -> 
                chatClient.prompt()
                    .user(question)
                    .call()
                    .content(),
                executorService))
            .toList();
        
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
            .thenApply(v -> futures.stream()
                .map(CompletableFuture::join)
                .toList());
    }
    
    /**
     * Streaming batch Q&A: each question is streamed, its chunks are joined,
     * and one complete answer per question is emitted in question order.
     */
    public Flux<String> batchChatStream(List<String> questions) {
        return Flux.fromIterable(questions)
            // concatMap (unlike flatMap) keeps answers in the order of the questions
            .concatMap(question ->
                chatClient.prompt()
                    .user(question)
                    .stream()
                    .content()  // Flux<String> of text chunks
                    .reduce("", String::concat)
            );
    }
    
    /**
     * Batch Q&A with a shared context
     */
    public List<String> batchChatWithContext(List<String> questions, String context) {
        List<String> answers = new ArrayList<>();
        
        for (String question : questions) {
            String answer = chatClient.prompt()
                .system(context)  // the shared context is sent as the system message
                .user(question)
                .call()
                .content();
            answers.add(answer);
        }
        
        return answers;
    }
}
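
The asynchronous variant above can be reduced to a dependency-free sketch that shows why the answers come back in the same order as the questions even though the calls overlap. This is only an illustration: `askModel` is a hypothetical stand-in for the blocking `chatClient.prompt().user(q).call().content()` call, and the pool size is an arbitrary assumption.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class OrderedBatch {

    /**
     * Runs askModel for every question on the given pool and returns the
     * answers in the same order as the questions.
     */
    static List<String> askAll(List<String> questions,
                               Function<String, String> askModel,
                               ExecutorService pool) {
        // Start every call first so that they can overlap.
        List<CompletableFuture<String>> futures = questions.stream()
                .map(q -> CompletableFuture.supplyAsync(() -> askModel.apply(q), pool))
                .toList();
        // Joining in list order keeps answers aligned with questions,
        // regardless of which call finishes first.
        return futures.stream().map(CompletableFuture::join).toList();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Hypothetical stand-in for the model call: it just echoes the question.
            List<String> answers = askAll(List.of("q1", "q2", "q3"), q -> "answer to " + q, pool);
            System.out.println(answers);
        } finally {
            pool.shutdown(); // release the worker threads
        }
    }
}
```

The `finally` block matters in a real service too: a pool created in a constructor should be shut down when the bean is destroyed, otherwise its threads outlive the application context.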

2.2 Batch Q&A Controller
java
import org.springframework.web.bind.annotation.*;
import org.springframework.http.ResponseEntity;
import reactor.core.publisher.Flux;

import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

@RestController
@RequestMapping("/api/batch")
public class BatchChatController {
    
    private final BatchChatService batchChatService;
    
    public BatchChatController(BatchChatService batchChatService) {
        this.batchChatService = batchChatService;
    }
    
    @PostMapping("/sync")
    public ResponseEntity<Map<String, List<String>>> batchChatSync(
            @RequestBody List<String> questions) {
        List<String> answers = batchChatService.batchChatSync(questions);
        return ResponseEntity.ok(Map.of("answers", answers));
    }
    
    @PostMapping("/async")
    public CompletableFuture<ResponseEntity<Map<String, List<String>>>> batchChatAsync(
            @RequestBody List<String> questions) {
        return batchChatService.batchChatAsync(questions)
            .thenApply(answers -> 
                ResponseEntity.ok(Map.of("answers", answers)));
    }
    
    @PostMapping("/stream")
    public Flux<String> batchChatStream(@RequestBody List<String> questions) {
        return batchChatService.batchChatStream(questions);
    }
    
    @PostMapping("/with-context")
    public ResponseEntity<Map<String, List<String>>> batchChatWithContext(
            @RequestBody BatchChatRequest request) {
        List<String> answers = batchChatService.batchChatWithContext(
            request.getQuestions(), 
            request.getContext()
        );
        return ResponseEntity.ok(Map.of("answers", answers));
    }
    
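    // Request DTO
    public static class BatchChatRequest {
        private List<String> questions;
        private String context;
        
        public List<String> getQuestions() { return questions; }
        public void setQuestions(List<String> questions) { this.questions = questions; }
        public String getContext() { return context; }
        public void setContext(String context) { this.context = context; }
    }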
}

Multi-Turn Conversation Implementation

3.1 Conversation Session Management
java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.Message;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Note: Message.getText() is the Spring AI 1.0 GA accessor; earlier milestone
// releases call it getContent().
@Service
public class MultiTurnChatService {
    
    private final ChatClient chatClient;
    // sessionId -> message history; each list is guarded by synchronizing on the list itself
    private final Map<String, List<Message>> conversationHistory;
    
    public MultiTurnChatService(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
        this.conversationHistory = new ConcurrentHashMap<>();
    }
    
    /**
     * Start a new conversation
     */
    public String startConversation(String sessionId, String systemPrompt) {
        List<Message> messages = new ArrayList<>();
        if (systemPrompt != null && !systemPrompt.isEmpty()) {
            messages.add(new SystemMessage(systemPrompt));
        }
        conversationHistory.put(sessionId, messages);
        return "Conversation started, sessionId: " + sessionId;
    }
    
    /**
     * Multi-turn chat (keeps the full history)
     */
    public String chat(String sessionId, String userMessage) {
        List<Message> messages = historyFor(sessionId);
        // Turns within one session are serialized so the history stays consistent
        synchronized (messages) {
            messages.add(new UserMessage(userMessage));
            
            String assistantReply = chatClient.prompt()
                .messages(messages)
                .call()
                .content();
            
            // Append the reply so the next turn sees it
            messages.add(new AssistantMessage(assistantReply));
            return assistantReply;
        }
    }
    
    /**
     * Streaming multi-turn chat: emits chunks as they arrive and records
     * the complete reply in the history when the stream finishes
     */
    public Flux<String> chatStream(String sessionId, String userMessage) {
        return Flux.defer(() -> {
            List<Message> messages = historyFor(sessionId);
            List<Message> request;
            synchronized (messages) {
                messages.add(new UserMessage(userMessage));
                request = new ArrayList<>(messages); // snapshot sent to the model
            }
            StringBuilder reply = new StringBuilder();
            return chatClient.prompt()
                .messages(request)
                .stream()
                .content()
                .doOnNext(reply::append)
                .doOnComplete(() -> {
                    synchronized (messages) {
                        messages.add(new AssistantMessage(reply.toString()));
                    }
                });
        });
    }
    
    /**
     * Multi-turn chat with a context window (limits the history length)
     */
    public String chatWithContextWindow(String sessionId, String userMessage, int maxHistory) {
        List<Message> messages = historyFor(sessionId);
        synchronized (messages) {
            messages.add(new UserMessage(userMessage));
            
            // Keep the leading system prompt (if any), the previous maxHistory rounds
            // (one user and one assistant message each) and the new user message
            int start = messages.get(0) instanceof SystemMessage ? 1 : 0;
            int excess = (messages.size() - start) - (maxHistory * 2 + 1);
            if (excess > 0) {
                messages.subList(start, start + excess).clear();
            }
            
            String assistantReply = chatClient.prompt()
                .messages(messages)
                .call()
                .content();
            
            messages.add(new AssistantMessage(assistantReply));
            return assistantReply;
        }
    }
    
    /**
     * Get the conversation history
     */
    public List<Map<String, String>> getConversationHistory(String sessionId) {
        List<Message> messages = conversationHistory.get(sessionId);
        if (messages == null) {
            return new ArrayList<>();
        }
        
        List<Map<String, String>> history = new ArrayList<>();
        synchronized (messages) {
            for (Message message : messages) {
                history.add(Map.of(
                    "role", message.getMessageType().getValue(),
                    "content", message.getText()
                ));
            }
        }
        return history;
    }
    
    /**
     * Clear the conversation history
     */
    public void clearConversationHistory(String sessionId) {
        conversationHistory.remove(sessionId);
    }
    
    /**
     * Chat with memory (automatically summarizes long conversations)
     */
    public String chatWithMemory(String sessionId, String userMessage) {
        List<Message> messages = historyFor(sessionId);
        synchronized (messages) {
            // Summarize once the history grows too long
            if (messages.size() > 20) {
                summarizeConversation(messages);
            }
        }
        return chat(sessionId, userMessage);
    }
    
    private List<Message> historyFor(String sessionId) {
        // computeIfAbsent avoids the get-then-put race when two requests arrive together
        return conversationHistory.computeIfAbsent(sessionId, id -> new ArrayList<>());
    }
    
    private void summarizeConversation(List<Message> messages) {
        // Summarize the whole history: summarizing only the most recent messages
        // and then clearing the list would silently drop the older turns
        List<String> lines = messages.stream()
            .map(m -> m.getMessageType().getValue() + ": " + m.getText())
            .toList();
        
        String summary = chatClient.prompt()
            .user("Summarize the main points of the following conversation:\n"
                + String.join("\n", lines))
            .call()
            .content();
        
        // Replace the history with the summary
        messages.clear();
        messages.add(new SystemMessage("Summary of the earlier conversation: " + summary));
    }
}
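
The context-window idea can be shown in isolation. The sketch below trims a history to the most recent messages while always keeping a leading system prompt, which a plain "last N messages" cut would lose. `Msg` is a hypothetical minimal message type introduced for illustration, not a Spring AI class.

```java
import java.util.ArrayList;
import java.util.List;

final class HistoryWindow {

    /** Hypothetical minimal message: a role ("system", "user", "assistant") plus text. */
    record Msg(String role, String text) {}

    /**
     * Returns the leading system message (if any) followed by at most
     * maxMessages of the most recent remaining messages.
     */
    static List<Msg> trim(List<Msg> history, int maxMessages) {
        List<Msg> result = new ArrayList<>();
        int start = 0;
        if (!history.isEmpty() && history.get(0).role().equals("system")) {
            result.add(history.get(0)); // never drop the system prompt
            start = 1;
        }
        // First index to keep; never reaches back into the system prompt slot.
        int from = Math.max(start, history.size() - maxMessages);
        result.addAll(history.subList(from, history.size()));
        return result;
    }

    public static void main(String[] args) {
        List<Msg> history = List.of(
                new Msg("system", "You are a coding assistant"),
                new Msg("user", "u1"), new Msg("assistant", "a1"),
                new Msg("user", "u2"), new Msg("assistant", "a2"),
                new Msg("user", "u3"));
        // Keeps the system prompt plus u2, a2, u3.
        System.out.println(trim(history, 3));
    }
}
```

Choosing an odd window size (two messages per earlier round plus the new user message) keeps the window starting on a user message rather than on an orphaned assistant reply.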

3.2 Multi-Turn Conversation Controller
java
import org.springframework.web.bind.annotation.*;
import org.springframework.http.ResponseEntity;
import reactor.core.publisher.Flux;

import java.util.List;
import java.util.Map;

@RestController
@RequestMapping("/api/conversation")
public class MultiTurnChatController {
    
    private final MultiTurnChatService chatService;
    
    public MultiTurnChatController(MultiTurnChatService chatService) {
        this.chatService = chatService;
    }
    
    @PostMapping("/start")
    public ResponseEntity<Map<String, String>> startConversation(
            @RequestParam(required = false) String systemPrompt) {
        String sessionId = generateSessionId();
        String result = chatService.startConversation(sessionId, systemPrompt);
        return ResponseEntity.ok(Map.of(
            "sessionId", sessionId,
            "message", result
        ));
    }
    
    @PostMapping("/chat")
    public ResponseEntity<Map<String, String>> chat(
            @RequestParam String sessionId,
            @RequestBody ChatRequest request) {
        String reply = chatService.chat(sessionId, request.getMessage());
        return ResponseEntity.ok(Map.of(
            "sessionId", sessionId,
            "reply", reply
        ));
    }
    
    @PostMapping("/chat-stream")
    public Flux<String> chatStream(
            @RequestParam String sessionId,
            @RequestBody ChatRequest request) {
        // Streaming chat; this relies on MultiTurnChatService exposing chatStream(sessionId, message)
        return chatService.chatStream(sessionId, request.getMessage());
    }
    
    @PostMapping("/chat-with-context")
    public ResponseEntity<Map<String, String>> chatWithContext(
            @RequestParam String sessionId,
            @RequestParam(defaultValue = "10") int maxHistory,
            @RequestBody ChatRequest request) {
        String reply = chatService.chatWithContextWindow(
            sessionId, 
            request.getMessage(), 
            maxHistory
        );
        return ResponseEntity.ok(Map.of(
            "sessionId", sessionId,
            "reply", reply
        ));
    }
    
    @GetMapping("/history/{sessionId}")
    public ResponseEntity<Map<String, Object>> getHistory(
            @PathVariable String sessionId) {
        List<Map<String, String>> history = 
            chatService.getConversationHistory(sessionId);
        return ResponseEntity.ok(Map.of(
            "sessionId", sessionId,
            "history", history
        ));
    }
    
    @DeleteMapping("/history/{sessionId}")
    public ResponseEntity<Map<String, String>> clearHistory(
            @PathVariable String sessionId) {
        chatService.clearConversationHistory(sessionId);
        return ResponseEntity.ok(Map.of(
            "sessionId", sessionId,
            "message", "Conversation history cleared"
        ));
    }
    
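    private String generateSessionId() {
        // A random UUID avoids the collisions that a timestamp plus a small random number can produce
        return "session_" + java.util.UUID.randomUUID();
    }
    
    // Request DTO
    public static class ChatRequest {
        private String message;
        
        public String getMessage() { return message; }
        public void setMessage(String message) { this.message = message; }
    }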
}

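Advanced Features

4.1 Batch Multi-Turn Conversations
java
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.messages.Message;
import org.springframework.stereotype.Service;

import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Service
public class AdvancedChatService {
    
    private final ChatClient chatClient;
    // Per-turn history is delegated to the MultiTurnChatService from section 3.1
    private final MultiTurnChatService multiTurnChatService;
    private final Map<String, ConversationSession> sessions;
    
    public AdvancedChatService(ChatClient.Builder chatClientBuilder,
                               MultiTurnChatService multiTurnChatService) {
        this.chatClient = chatClientBuilder.build();
        this.multiTurnChatService = multiTurnChatService;
        this.sessions = new ConcurrentHashMap<>();
    }
    
    /**
     * Batch multi-turn chat: sessions run in parallel, turns within a session run in order
     */
    public Map<String, List<String>> batchMultiTurnChat(
            Map<String, List<String>> conversations) {
        
        Map<String, List<String>> results = new ConcurrentHashMap<>();
        
        // parallelStream uses the common ForkJoinPool; for many blocking calls
        // a dedicated executor is usually the safer choice
        conversations.entrySet().parallelStream().forEach(entry -> {
            String sessionId = entry.getKey();
            List<String> userMessages = entry.getValue();
            List<String> responses = new ArrayList<>();
            
            for (String message : userMessages) {
                String response = multiTurnChatService.chat(sessionId, message);
                responses.add(response);
            }
            
            results.put(sessionId, responses);
        });
        
        return results;
    }
    
    /**
     * Batch processing where each task reports its own success or failure
     */
    public List<ChatResult> processBatchWithConditions(
            List<ChatTask> tasks) {
        
        return tasks.parallelStream()
            .map(task -> {
                try {
                    String response = chatClient.prompt()
                        .system(task.getSystemPrompt())
                        .user(task.getUserMessage())
                        .call()
                        .content();
                    
                    return new ChatResult(task.getId(), response, "SUCCESS");
                } catch (Exception e) {
                    // One failed task must not abort the rest of the batch
                    return new ChatResult(task.getId(), null, "ERROR: " + e.getMessage());
                }
            })
            .toList();
    }
    
    // Nested types (Lombok generates the accessors)
    @Data
    public static class ConversationSession {
        private String sessionId;
        private List<Message> history;
        private LocalDateTime createdAt;
        private LocalDateTime lastActive;
    }
    
    @Data
    public static class ChatTask {
        private String id;
        private String systemPrompt;
        private String userMessage;
    }
    
    // @Data alone does not generate the three-argument constructor used above
    @Data
    @NoArgsConstructor
    @AllArgsConstructor
    public static class ChatResult {
        private String taskId;
        private String response;
        private String status;
    }
}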
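
4.2 Configuration Class
java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.ChatOptions;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class ChatConfig {
    
    // The services above build their own client from ChatClient.Builder, so they only
    // pick up these defaults if they inject this ChatClient bean instead.
    @Bean
    public ChatClient chatClient(ChatClient.Builder builder) {
        return builder
            .defaultSystem("You are a helpful AI assistant")
            .defaultOptions(ChatOptions.builder()
                .temperature(0.7)
                .maxTokens(1000)
                .build())
            .build();
    }
    
    // Dedicated pool for chat calls, so blocking model requests do not starve other work
    @Bean
    public TaskExecutor chatTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(50);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("chat-");
        executor.initialize();
        return executor;
    }
}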

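Usage Examples

5.1 Calling Batch Q&A
java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RestController;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

@RestController
public class ExampleController {
    
    @Autowired
    private BatchChatService batchChatService;
    
    @Autowired
    private MultiTurnChatService chatService;
    
    // Batch Q&A example
    public void batchExample() {
        List<String> questions = Arrays.asList(
            "What is artificial intelligence?",
            "What are the advantages of Spring Boot?",
            "How do I learn programming?"
        );
        
        // Synchronous batch
        List<String> answers = batchChatService.batchChatSync(questions);
        
        // Asynchronous batch
        CompletableFuture<List<String>> futureAnswers = 
            batchChatService.batchChatAsync(questions);
    }
    
    // Multi-turn conversation example
    public void multiTurnExample() {
        String sessionId = "user123";
        
        // Start the conversation
        chatService.startConversation(sessionId, "You are a programming assistant");
        
        // First turn
        String reply1 = chatService.chat(sessionId, "How do I learn Java?");
        
        // Second turn (the earlier context is remembered)
        String reply2 = chatService.chat(sessionId, "What about the Spring framework?");
        
        // Fetch the history
        List<Map<String, String>> history = 
            chatService.getConversationHistory(sessionId);
    }
}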

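Key Takeaways

  1. Batch Q&A:
     1. Process multiple questions synchronously or asynchronously
     2. Consider performance optimization and resource management
     3. Streaming output is supported
  2. Multi-turn conversations:
     1. Session management
     2. Conversation history maintenance
     3. Context window control
     4. Memory management (automatic summarization)
  3. Best practices:
     1. Use a thread pool to handle concurrency
     2. Clean up sessions after a timeout
     3. Add request rate limiting
     4. Log conversations
     5. Handle exceptions and add a retry mechanism

This implementation provides basic batch Q&A and multi-turn conversation features, and can be extended and optimized to fit your actual requirements.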
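
Session timeout cleanup, one of the best practices listed above, is not implemented in the services shown. A minimal sketch of the idea follows, assuming a last-activity timestamp per session; the `SessionExpiry` class, its method names, and the 30-minute TTL in the example are all hypothetical. In a real service the eviction step would also remove the session's message history and would typically run from a scheduled task.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SessionExpiry {

    // sessionId -> time of the session's last activity
    private final Map<String, Instant> lastActive = new ConcurrentHashMap<>();
    private final Duration ttl;

    SessionExpiry(Duration ttl) {
        this.ttl = ttl;
    }

    /** Records activity for a session; call this on every chat request. */
    void touch(String sessionId, Instant now) {
        lastActive.put(sessionId, now);
    }

    boolean isActive(String sessionId) {
        return lastActive.containsKey(sessionId);
    }

    /** Removes sessions idle for longer than the TTL and returns how many were removed. */
    int evictExpired(Instant now) {
        int evicted = 0;
        for (Map.Entry<String, Instant> e : lastActive.entrySet()) {
            boolean expired = Duration.between(e.getValue(), now).compareTo(ttl) > 0;
            // remove(key, value) does nothing if the session was touched in the meantime
            if (expired && lastActive.remove(e.getKey(), e.getValue())) {
                evicted++;
            }
        }
        return evicted;
    }

    public static void main(String[] args) {
        SessionExpiry expiry = new SessionExpiry(Duration.ofMinutes(30));
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        expiry.touch("a", start);
        expiry.touch("b", start.plus(Duration.ofMinutes(20)));
        // 40 minutes after start: "a" has been idle 40 minutes, "b" only 20
        System.out.println(expiry.evictExpired(start.plus(Duration.ofMinutes(40))));
    }
}
```

Passing the current time in as a parameter, rather than calling `Instant.now()` inside the class, keeps the eviction logic easy to test.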

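5. Basic Parameter Tuning: Temperature, Max Tokens, Top-P, and Penalties

Spring AI is an AI application development framework built on the Spring ecosystem. It abstracts and integrates multiple large language models (such as OpenAI GPT, Anthropic Claude, and local models). When calling these models' APIs, a few core parameters directly affect the quality, diversity, and controllability of the generated text.

The following explains these basic parameters in Spring AI and how to tune them:

Temperature

Purpose: controls the randomness and creativity of the generated text. The higher the value, the more random and varied the output; the lower the value, the more deterministic and conservative.

  • Range: typically 0.0 to 2.0 (the range may differ between models).
  • Tuning suggestions:
    • Low temperature (0.0 to 0.3): suited to scenarios that need precise, reliable, factual answers, such as code generation, data extraction, and Q&A. Output is highly deterministic, and repeated calls give similar results.
    • Medium temperature (0.5 to 0.7): a balance point for general use that combines some creativity with coherence; suited to conversation and summarization.
    • High temperature (0.8 to 1.2+): suited to scenarios that need creativity and variety, such as writing, brainstorming, and generating ad copy. Too high a temperature can make the text incoherent or off-topic.
  • Configuration in Spring AI:
YAML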
spring:
  ai:
    openai:
      chat:
        options:
          temperature: 0.7

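Or set it dynamically in code:

java
// Builder method names depend on the Spring AI version: 1.0 GA uses temperature(...),
// while early milestone releases used withTemperature(...).
ChatOptions options = OpenAiChatOptions.builder()
        .temperature(0.7)
        .build();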
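
To see why low temperatures make output more deterministic, it helps to look at the usual formulation: the model's raw scores (logits) are divided by the temperature before the softmax turns them into probabilities. The sketch below illustrates that general mechanism with made-up logits; it is not taken from any provider's implementation, and `TemperatureDemo` is a name introduced here.

```java
import java.util.Arrays;

final class TemperatureDemo {

    /** Softmax over logits scaled by 1/temperature; temperature must be positive. */
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be > 0");
        }
        // Subtract the largest scaled logit before exponentiating, for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit / temperature);
        }
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp(logits[i] / temperature - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.0}; // made-up scores for three candidate tokens
        System.out.println(Arrays.toString(softmax(logits, 0.2))); // sharply peaked
        System.out.println(Arrays.toString(softmax(logits, 1.5))); // much flatter
    }
}
```

At temperature 0.2 nearly all of the probability sits on the top-scoring token, while at 1.5 the three candidates are much closer together, which is what produces more varied sampling.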

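Max Tokens (Maximum Generation Length)

Purpose: limits the maximum number of tokens the model may generate, which controls the length of the response.

  • Note: a token is not a word. For English, 1 token is roughly 0.75 words; for Chinese, one character is typically about 1 to 2 tokens.
  • Tuning suggestions:
    • Set a reasonable upper bound for the scenario so that responses are neither too long nor too short.
    • Too low a limit cuts the answer off before it is complete; too high a limit allows redundant output and raises cost.
    • It is often combined with the stop parameter (stop sequences) to end generation early.
  • Configuration in Spring AI:
YAML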
spring:
  ai:
    openai:
      chat:
        options:
          max-tokens: 500
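
The rule of thumb above (1 token is roughly 0.75 English words) gives a quick way to sanity-check a max-tokens value against an expected answer length. The sketch below applies only that heuristic; it is an approximation for English text, not a tokenizer, and `TokenEstimate` is a name introduced here for illustration.

```java
final class TokenEstimate {

    // Rule of thumb from the text: one token is roughly 0.75 English words.
    private static final double WORDS_PER_TOKEN = 0.75;

    /** Rough token estimate for English text, based on the whitespace-separated word count. */
    static int estimateTokens(String englishText) {
        if (englishText == null || englishText.isBlank()) {
            return 0;
        }
        int words = englishText.trim().split("\\s+").length;
        return (int) Math.ceil(words / WORDS_PER_TOKEN);
    }

    public static void main(String[] args) {
        String sample = "Spring AI wraps model calls behind a fluent client";
        System.out.println(estimateTokens(sample) + " tokens (estimated)");
    }
}
```

For anything that affects billing or hard limits, use the provider's own tokenizer; this estimate is only meant for picking a sensible starting value.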

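Top-P (Nucleus Sampling)

Purpose: restricts sampling to the most probable tokens whose cumulative probability reaches a threshold, which controls the diversity of the generated text. The effect is similar to temperature, but the method is different.

  • Range: 0.0 to 1.0
  • How it works: the model accumulates token probabilities from highest to lowest until the cumulative probability reaches the Top-P value, then samples only from those tokens.
  • Tuning suggestions:
    • Low Top-P (for example 0.1): only the few most likely tokens are considered, so output is highly deterministic.
    • High Top-P (for example 0.9): a wider set of tokens is considered, so output is more varied.
    • Interaction with temperature: the usual advice is to adjust only one of the two, because changing both stacks two sources of randomness and makes the output harder to control.
    • A common combination: temperature=0.7, top-p=0.9
  • Configuration in Spring AI:
YAML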
spring:
  ai:
    openai:
      chat:
        options:
          top-p: 0.9
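
The accumulation step described under "How it works" can be sketched directly. The example below returns which token indices survive the Top-P cut for a made-up probability distribution; `NucleusFilter` is a name introduced for illustration, and a real sampler would go on to renormalize the kept probabilities and draw from them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class NucleusFilter {

    /**
     * Returns the indices of the tokens kept by nucleus (Top-P) filtering,
     * ordered from most to least probable.
     */
    static List<Integer> keep(double[] probs, double topP) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) {
            order.add(i);
        }
        // Sort token indices by probability, highest first.
        order.sort(Comparator.comparingDouble((Integer i) -> probs[i]).reversed());

        List<Integer> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (int idx : order) {
            kept.add(idx);
            cumulative += probs[idx];
            if (cumulative >= topP) {
                break; // the nucleus is complete
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] probs = {0.05, 0.5, 0.15, 0.3}; // made-up distribution over four tokens
        System.out.println(keep(probs, 0.75)); // the two most likely tokens
        System.out.println(keep(probs, 0.9));  // the three most likely tokens
    }
}
```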

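Penalties (Frequency & Presence Penalty)

Purpose: reduce repeated content by penalizing tokens that have already been generated, which increases the variety of the text.

  • Frequency Penalty: lowers the probability of a token in proportion to how many times it has already appeared, which discourages repeated wording.
  • Presence Penalty: lowers the probability of any token that has already appeared, regardless of how many times, which encourages introducing new topics.
  • Range: typically -2.0 to 2.0
    • Positive values: stronger penalty, less repetition.
    • Negative values: the penalty becomes a bonus, so repetition increases (and the output may loop).
  • Tuning suggestions:
    • For long-form generation (articles, stories), a moderate penalty (0.5 to 1.0) helps avoid repetition.
    • For short conversations or precise instructions, use 0 or a small value.
    • Note: too high a penalty can make the text unnatural or push it off-topic.
  • Configuration in Spring AI:
YAML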
spring:
  ai:
    openai:
      chat:
        options:
          frequency-penalty: 0.5
          presence-penalty: 0.5
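
The difference between the two penalties is easiest to see as arithmetic on a token's logit. The sketch below follows the adjustment described in OpenAI's API documentation (the frequency penalty scales with the token's count so far, the presence penalty is applied once if the count is non-zero); other providers may define these parameters differently, and the class name and numbers are made up for illustration.

```java
final class PenaltyDemo {

    /**
     * Adjusted logit for a token that has already appeared `count` times:
     * logit - count * frequencyPenalty - (count > 0 ? presencePenalty : 0).
     */
    static double penalize(double logit, int count,
                           double frequencyPenalty, double presencePenalty) {
        double presence = count > 0 ? presencePenalty : 0.0;
        return logit - count * frequencyPenalty - presence;
    }

    public static void main(String[] args) {
        double logit = 2.0; // made-up raw score for one candidate token
        // Never seen: no penalty applies.
        System.out.println(penalize(logit, 0, 0.5, 0.5));
        // Seen once: one frequency step plus the flat presence penalty.
        System.out.println(penalize(logit, 1, 0.5, 0.5));
        // Seen three times: the frequency part grows, the presence part does not.
        System.out.println(penalize(logit, 3, 0.5, 0.5));
    }
}
```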

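Stop Sequences

Purpose: specify one or more strings; generation stops as soon as the model produces one of them.

  • Uses:
    • Control the output format (for example, stop at a "\n" line break).
    • Limit the answer length (for example "。" or "###").
  • Configuration in Spring AI:
YAML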
spring:
  ai:
    openai:
      chat:
        options:
          stop: ["。", "\n", "###"]
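
The effect of stop sequences can be mimicked on a finished string: the text is cut at the earliest occurrence of any stop sequence, and the stop sequence itself is dropped. This sketch only illustrates the observable behaviour (the model API stops generating rather than trimming afterwards); `StopSequences` is a name introduced here.

```java
import java.util.List;

final class StopSequences {

    /** Cuts text at the earliest occurrence of any stop sequence; the stop itself is dropped. */
    static String truncate(String text, List<String> stops) {
        int cut = text.length();
        for (String stop : stops) {
            if (stop.isEmpty()) {
                continue; // an empty stop sequence would match at index 0
            }
            int idx = text.indexOf(stop);
            if (idx >= 0 && idx < cut) {
                cut = idx;
            }
        }
        return text.substring(0, cut);
    }

    public static void main(String[] args) {
        List<String> stops = List.of("\n", "###");
        System.out.println(truncate("Hello world.###ignored", stops));
        System.out.println(truncate("line one\nline two###x", stops));
    }
}
```

Note that a very common string such as "。" will end the answer after its first sentence, so stop sequences are best chosen to be markers that only appear where the output should end.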

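Combined Tuning Strategy

The per-parameter recommendations above combine as follows:

  • Precise tasks (code generation, data extraction, factual Q&A): temperature 0.0 to 0.3, penalties at 0 or a small value.
  • General conversation and summarization: temperature 0.5 to 0.7.
  • Creative writing, brainstorming, and ad copy: temperature 0.8 to 1.2.
  • Long-form generation (articles, stories): frequency and presence penalties of 0.5 to 1.0.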

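Global and Dynamic Configuration in Spring AI

  • Global configuration: set in application.yml; applies to all requests.
  • Dynamic override: pass ChatOptions with an individual request to override the global settings:
java
// chatModel is an injected ChatModel (for example OpenAiChatModel). call(Prompt) is a
// ChatModel method; the fluent ChatClient used earlier has no call(Prompt) overload.
// Builder method names follow Spring AI 1.0 GA; early milestones used
// withTemperature(...) and withMaxTokens(...).
UserMessage message = new UserMessage("Hello");
ChatOptions options = OpenAiChatOptions.builder()
        .temperature(0.5)
        .maxTokens(200)
        .build();

ChatResponse response = chatModel.call(
    new Prompt(message, options)
);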
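
The override rule itself is simple: any option set on the request wins, and anything left unset falls back to the global default. The sketch below shows that rule with a hypothetical `Options` record; it illustrates the general idea only and is not Spring AI's internal merge code.

```java
final class OptionMerge {

    /** Hypothetical option holder; a null field means "not set". */
    record Options(Double temperature, Integer maxTokens, Double topP) {}

    /** Per-request values win; unset (null) fields fall back to the defaults. */
    static Options merge(Options defaults, Options overrides) {
        return new Options(
                overrides.temperature() != null ? overrides.temperature() : defaults.temperature(),
                overrides.maxTokens() != null ? overrides.maxTokens() : defaults.maxTokens(),
                overrides.topP() != null ? overrides.topP() : defaults.topP());
    }

    public static void main(String[] args) {
        Options global = new Options(0.7, 1000, 0.9);      // as if read from application.yml
        Options perRequest = new Options(0.5, 200, null);   // top-p deliberately left unset
        System.out.println(merge(global, perRequest));
    }
}
```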

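Debugging Tips

  1. Start from conservative values: for example temperature=0.5, top-p=0.9, penalties=0.
  2. Adjust in small steps: change only one parameter at a time and observe how the output changes.
  3. Fit the business scenario: different tasks are sensitive to different parameters, so test each one specifically.
  4. Record experiments: keep notes on parameter combinations and their results to build up your own tuning experience.

By tuning these parameters sensibly, you can better control the LLM's output and make it fit your application's needs.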
