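Spring Boot + Spring AI Integrated Hands-On Guide
Applicable versions:
Spring Boot 3.2.x / JDK 17+ / Spring AI 1.1.4
Covers: getting started, configuration, core APIs, streaming, multi-turn memory, RAG, function calling, exception handling, retries, monitoring, utility classes, prompt templates, production guidelines, and common pitfalls.
Contents
- Version Constraints & Prerequisites
- Complete Maven Dependencies (BOM + all models)
- Full Production application.yml
- Core APIs: ChatModel / ChatClient
- Basic Chat + System Role
- Prompt Templating Best Practices
- SSE Streaming Output (with error fallback)
- Multi-Turn Conversations (ChatMemory to prevent token overflow)
- Seamless Multi-Model Switching
- Function Calling
- RAG: Retrieval-Augmented Generation (enterprise knowledge base)
- Unified Global Exception Handling
- Production Hardening: timeouts, retries, circuit breaking, degradation
- Global Interceptor | Logging | Latency Monitoring
- Enterprise Best Practices
- Common Pitfalls & How to Avoid Them
- Production Prompt Template Library
- General-Purpose AI Utility Classes
- Global Configuration | In-Memory Chat Memory | Constants
- Unified Response Entity + General Controller Template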
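1. Version Constraints & Prerequisites
1.1 Version Pairing
- Spring AI 1.1.x → Spring Boot 3.2.x | JDK 17+
- Spring AI 1.2.x → Spring Boot 3.3.x
- Spring AI 2.0.x → Spring Boot 3.4+ | JDK 21+
Confirm these pairings against the release notes of the exact Spring AI version you adopt. The Spring AI 1.0 reference documentation, for instance, lists Spring Boot 3.4.x and 3.5.x as the supported line.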
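1.2 Mandatory Development Rules
- Import the Spring AI BOM to pin versions and prevent dependency conflicts
- Never hard-code API keys; read them from environment variables or a configuration center
- Every AI endpoint must have a timeout, an error fallback, and a retry policy
- Cap the number of context rounds in multi-turn conversations to avoid runaway token usage
- RAG must apply a similarity threshold so the model does not answer from irrelevant context
- Never log full prompts or responses in production; mask sensitive content first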
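Two of the rules above (no hard-coded keys, no secrets in logs) can be sketched in plain Java. This is a minimal illustration under stated assumptions: `ApiKeyResolver`, `resolve`, and `mask` are hypothetical names that are not part of Spring AI, and the environment is passed in as a map so the logic can be exercised without touching real variables.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: resolves an API key from the environment and fails fast when it is missing. */
final class ApiKeyResolver {

    private ApiKeyResolver() {
    }

    /** Looks up the key in the supplied environment map (pass System.getenv() in production). */
    static String resolve(Map<String, String> env, String name) {
        return Optional.ofNullable(env.get(name))
                .map(String::trim)
                .filter(value -> !value.isEmpty())
                .orElseThrow(() -> new IllegalStateException(
                        "Missing required environment variable: " + name));
    }

    /** Returns a log-safe form that exposes only the last four characters. */
    static String mask(String key) {
        if (key.length() <= 4) {
            return "****";
        }
        return "****" + key.substring(key.length() - 4);
    }
}
```

A typical call would be `ApiKeyResolver.resolve(System.getenv(), "OPENAI_API_KEY")` at startup, logging only `ApiKeyResolver.mask(key)`. Failing fast here surfaces a missing key at boot time rather than on the first user request.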
2. Complete Maven Dependencies
2.1 Dependency Management (BOM)
xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.1.4</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
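2.2 Core Dependencies
xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- OpenAI -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
    </dependency>
    <!-- Alibaba Cloud Tongyi Qianwen (DashScope): provided by Spring AI Alibaba, not covered by the Spring AI BOM. -->
    <!-- The version property below is a placeholder; define it to match your Spring AI version. -->
    <dependency>
        <groupId>com.alibaba.cloud.ai</groupId>
        <artifactId>spring-ai-alibaba-starter-dashscope</artifactId>
        <version>${spring-ai-alibaba.version}</version>
    </dependency>
    <!-- Local offline Ollama -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-ollama</artifactId>
    </dependency>
    <!-- RAG vector store -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
    </dependency>
    <!-- PDF reader used by the RAG service -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pdf-document-reader</artifactId>
    </dependency>
    <!-- Retry and fault tolerance (@EnableRetry needs Spring AOP) -->
    <dependency>
        <groupId>org.springframework.retry</groupId>
        <artifactId>spring-retry</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>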
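2.3 Repository Configuration
xml
<!-- Only needed for milestone builds; GA releases are published to Maven Central -->
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
    </repository>
</repositories>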
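3. Full Production application.yml
yaml
spring:
  # Manually exclude a specific auto-configuration (troubleshooting only; uncomment when needed)
  # autoconfigure:
  #   exclude:
  #     - org.springframework.ai.model.openai.autoconfigure.OpenAiChatAutoConfiguration
  ai:
    # Selects the active chat provider when several model starters are on the classpath
    model:
      chat: openai
    # Sampling options are set per provider under <provider>.chat.options.
    # A default system prompt is set in code (ChatClient.Builder#defaultSystem), for example:
    # "You are a senior Java architect. Answer concisely and professionally, include code examples, and stay close to production practice."
    # HTTP timeouts are configured on the underlying HTTP client rather than in this file.
    # OpenAI
    openai:
      api-key: ${OPENAI_API_KEY:}
      # Host only; the client appends /v1/chat/completions itself
      base-url: https://api.openai.com
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.5
          max-tokens: 2048
          top-p: 0.9
    # Tongyi Qianwen (DashScope)
    dashscope:
      api-key: ${DASHSCOPE_API_KEY:}
      chat:
        options:
          model: qwen-plus
    # Local Ollama
    ollama:
      base-url: http://localhost:11434
      chat:
        options:
          model: qwen2.5
# Log level
logging:
  level:
    org.springframework.ai: INFO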
4. The Two Core APIs
4.1 Low-Level ChatModel (maximum flexibility)
java
import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.prompt.ChatOptions;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import java.util.List;

@RestController
@RequiredArgsConstructor
public class AiBaseController {

    private final ChatModel chatModel;

    @GetMapping("/model/chat")
    public String modelChat(@RequestParam String msg) {
        // Per-request options override the defaults from application.yml
        Prompt prompt = new Prompt(
                List.of(new UserMessage(msg)),
                ChatOptions.builder().temperature(0.5).build());
        return chatModel.call(prompt).getResult().getOutput().getText();
    }
}
4.2 High-Level ChatClient (recommended for enterprise use)
java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AiClientController {

    private final ChatClient chatClient;

    // Spring AI auto-configures ChatClient.Builder, not ChatClient itself
    public AiClientController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/client/chat")
    public String clientChat(@RequestParam String msg) {
        return chatClient.prompt()
                .user(msg)
                .call()
                .content();
    }
}
5. Setting the System Role
java
@GetMapping("/system/chat")
public String systemChat(@RequestParam String msg) {
    return chatClient.prompt()
            .system("You are a senior Redis expert. Cover only core principles, production tuning, and pitfalls to avoid.")
            .user(msg)
            .call()
            .content();
}
6. Prompt Templating (no hard-coded prompts)
java
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import java.util.Map;

@GetMapping("/template/chat")
public String templateChat(@RequestParam String question) {
    String promptStr = """
            Answer from the perspective of a Java backend engineer:
            Question: {question}
            Requirements: keep it concise, include runnable code, and note production considerations
            """;
    PromptTemplate template = new PromptTemplate(promptStr);
    // {question} is filled in from the map when the Prompt is created
    Prompt prompt = template.create(Map.of("question", question));
    return chatClient.prompt(prompt).call().content();
}
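To make the placeholder mechanics concrete, the following is a simplified stand-in for what a template engine does with `{question}`. It is a sketch, not Spring AI's `PromptTemplate` implementation: `SimplePromptRenderer` is a hypothetical class, it only recognises `{name}` placeholders made of word characters, and it rejects missing variables instead of leaving them in place.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified renderer for {name} placeholders. */
final class SimplePromptRenderer {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    private SimplePromptRenderer() {
    }

    /** Replaces every {name} with its value and fails when a value is missing. */
    static String render(String template, Map<String, ?> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            Object value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value supplied for placeholder: " + name);
            }
            // quoteReplacement keeps '$' and '\' in user input from being read as regex group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(value.toString()));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Rejecting unresolved placeholders early is a deliberate choice in this sketch: a prompt that reaches the model with a literal `{question}` in it tends to produce confusing answers that are hard to trace back to the template.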
7. SSE Streaming Output (production standard)
java
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import reactor.core.publisher.Flux;

@GetMapping(value = "/ai/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<String> streamChat(@RequestParam String prompt) {
    return chatClient.prompt()
            .user(prompt)
            .stream()
            .content()
            // Fallback: emit a single friendly message instead of breaking the stream
            .onErrorResume(e -> Flux.just("The AI service failed, please try again later"));
}
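With `text/event-stream`, each emitted chunk reaches the browser as an SSE frame. The framework does this framing for you; the sketch below only illustrates the wire format, using a hypothetical `SseFrames` helper: every line of the payload is prefixed with `data: ` and the frame ends with a blank line.

```java
/** Hypothetical helper that formats a payload as a Server-Sent Events data frame. */
final class SseFrames {

    private SseFrames() {
    }

    /** Builds one SSE frame; multi-line payloads become several data: lines. */
    static String dataEvent(String payload) {
        StringBuilder frame = new StringBuilder();
        // The -1 limit keeps trailing empty lines so the payload round-trips unchanged
        for (String line : payload.split("\\r?\\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        // A blank line terminates the event
        return frame.append('\n').toString();
    }
}
```

This is why a model chunk containing a newline arrives on the client as a single event whose data lines are joined back with `\n`, and why clients should not assume one token per event.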
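8. Multi-Turn Conversations | Global ChatMemory
8.1 In-Memory Chat Memory Configuration
java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.memory.MessageWindowChatMemory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AiMemoryConfig {

    @Bean
    public ChatMemory chatMemory() {
        // Keep the last 10 messages (about 5 user/assistant rounds) so the context cannot grow unbounded
        return MessageWindowChatMemory.builder().maxMessages(10).build();
    }

    // Shared ChatClient bean, built from the auto-configured ChatClient.Builder
    @Bean
    public ChatClient chatClient(ChatClient.Builder builder) {
        return builder.build();
    }
}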
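8.2 Multi-Turn Chat Endpoint
java
private final ChatMemory chatMemory;

@GetMapping("/multi/chat")
public String multiChat(@RequestParam String msg,
                        @RequestParam(defaultValue = "default") String conversationId) {
    return chatClient.prompt()
            .user(msg)
            // Memory is attached through an advisor; the conversation id keeps users' histories apart
            .advisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
            .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, conversationId))
            .call()
            .content();
}

@PostMapping("/memory/clear")
public void clearMemory(@RequestParam(defaultValue = "default") String conversationId) {
    chatMemory.clear(conversationId);
}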
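The idea behind capping the history can be shown without any framework. The class below is a minimal sketch of a sliding window over conversation turns; `SlidingWindowMemory` and `Turn` are hypothetical names, and real chat memory implementations differ in detail (they usually count messages rather than turns and key the history by conversation id).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sliding-window memory: keeps only the most recent turns of one conversation. */
final class SlidingWindowMemory {

    /** One user question together with the assistant's answer. */
    record Turn(String user, String assistant) {
    }

    private final int maxTurns;
    private final Deque<Turn> turns = new ArrayDeque<>();

    SlidingWindowMemory(int maxTurns) {
        if (maxTurns < 1) {
            throw new IllegalArgumentException("maxTurns must be at least 1");
        }
        this.maxTurns = maxTurns;
    }

    /** Appends a turn and evicts the oldest ones once the window is full. */
    synchronized void add(String user, String assistant) {
        turns.addLast(new Turn(user, assistant));
        while (turns.size() > maxTurns) {
            turns.removeFirst();
        }
    }

    /** Returns an immutable snapshot, oldest turn first. */
    synchronized List<Turn> history() {
        return List.copyOf(turns);
    }

    synchronized void clear() {
        turns.clear();
    }
}
```

Because eviction happens on write, the prompt sent to the model never contains more than `maxTurns` rounds, which puts a hard ceiling on the per-request token cost contributed by history.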
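9. Seamless Multi-Model Switching
- Swap the starter dependency
- Update the matching model settings in application.yml
- Business code written against ChatClient needs no changes
Candidate providers include OpenAI, Tongyi Qianwen, ERNIE Bot (Wenxin Yiyan), iFlytek Spark, and Ollama, provided a compatible Spring AI starter is available for the provider.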
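The steps above switch providers at build or deploy time. When one deployment has to choose a provider per request, a small registry is one possible approach. The sketch below is hypothetical: `ModelRouter` is not a Spring AI class, and each provider is modelled as a plain `Function<String, String>`, which in a real application would wrap a `ChatClient` built on that provider's `ChatModel`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical router that picks a chat backend by name and falls back to a default. */
final class ModelRouter {

    private final Map<String, Function<String, String>> models = new LinkedHashMap<>();
    private final String defaultModel;

    ModelRouter(String defaultModel) {
        this.defaultModel = defaultModel;
    }

    /** Registers a backend under a name; returns this router so calls can be chained. */
    ModelRouter register(String name, Function<String, String> model) {
        models.put(name, model);
        return this;
    }

    /** Routes to the requested backend, or to the default when the name is unknown or null. */
    String chat(String requestedModel, String message) {
        String key = (requestedModel != null && models.containsKey(requestedModel))
                ? requestedModel
                : defaultModel;
        Function<String, String> model = models.get(key);
        if (model == null) {
            throw new IllegalStateException("No model registered under: " + key);
        }
        return model.apply(message);
    }
}
```

Callers only see `chat(name, message)`, so adding or removing a provider stays a registration change and does not touch business code.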
10. Function Calling
10.1 Business Tool Configuration
java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;
import java.util.function.Function;

@Configuration
public class AiFunctionConfig {

    // The description tells the model when this function is worth calling
    @Bean
    @Description("Look up order information by user ID")
    public Function<OrderQueryReq, OrderResp> getOrderInfo() {
        // Stubbed response; replace with a real order lookup
        return req -> new OrderResp("OD20260428001", "COMPLETED", 299.00);
    }
}

// Request and response types
record OrderQueryReq(String userId) {}
record OrderResp(String orderNo, String status, Double amount) {}
10.2 Invocation Entry Point
java
@GetMapping("/function/chat")
public String functionChat(@RequestParam String msg) {
    return chatClient.prompt()
            .user(msg)
            // Refers to the function bean by name
            .toolNames("getOrderInfo")
            .call()
            .content();
}
11. RAG (Retrieval-Augmented Generation)
11.1 Core RAG Service
java
import jakarta.annotation.PostConstruct;
import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.core.io.ClassPathResource;
import org.springframework.stereotype.Service;
import java.util.List;
import java.util.stream.Collectors;

@Service
@RequiredArgsConstructor
public class RagService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    // Runs on every startup; guard against re-ingesting the same document in production
    @PostConstruct
    public void loadDoc() {
        List<Document> docs = new PagePdfDocumentReader(new ClassPathResource("doc.pdf")).get();
        // chunkSize, minChunkSizeChars, minChunkLengthToEmbed, maxNumChunks, keepSeparator
        TokenTextSplitter splitter = new TokenTextSplitter(500, 50, 10, 100, true);
        vectorStore.add(splitter.apply(docs));
    }

    public String ragChat(String question) {
        // Top 3 chunks whose similarity is at least 0.7; weaker matches are dropped
        List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.builder()
                        .query(question)
                        .topK(3)
                        .similarityThreshold(0.7)
                        .build());
        String context = docs.stream().map(Document::getText).collect(Collectors.joining("\n"));
        String prompt = """
                Answer using only the reference material. If it contains nothing relevant, reply: No relevant information available.
                Reference material: %s
                Question: %s
                """.formatted(context, question);
        return chatClient.prompt().user(prompt).call().content();
    }
}
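The `topK` and similarity-threshold settings are easier to reason about with the arithmetic spelled out. The sketch below assumes cosine similarity as the score, which is a common choice but depends on how the vector store is configured; `SimilarityFilter` and `Scored` are hypothetical names used only for illustration.

```java
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of threshold plus top-k filtering over similarity scores. */
final class SimilarityFilter {

    /** A text chunk with its similarity score against the query. */
    record Scored(String text, double score) {
    }

    private SimilarityFilter() {
    }

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 when either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Drops candidates below the threshold, then keeps the k best, highest score first. */
    static List<Scored> topK(List<Scored> candidates, int k, double threshold) {
        return candidates.stream()
                .filter(c -> c.score() >= threshold)
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(k)
                .toList();
    }
}
```

Note that the threshold is applied before the top-k cut: when nothing clears it, the result is empty, and the prompt's "No relevant information available" branch is what keeps the model from improvising.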
12. Global Exception Handling
java
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import java.util.concurrent.TimeoutException;

@Slf4j
@RestControllerAdvice
public class AiGlobalExceptionHandler {

    @ExceptionHandler(TimeoutException.class)
    public AiResult<Void> timeoutError() {
        return AiResult.fail(5002, "The model request timed out");
    }

    // Catch-all: log the details server-side, return a generic message to the caller
    @ExceptionHandler(Exception.class)
    public AiResult<Void> commonError(Exception e) {
        log.error("AI call failed:", e);
        return AiResult.fail(5000, "The AI service is temporarily unavailable");
    }
}
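13. Production Fault Tolerance | Retry
13.1 Enabling Retry
Add to the application class:
java
@EnableRetry
@SpringBootApplication
public class AiApplication {}
13.2 Using the Retry Annotation
java
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Retryable;
import java.util.concurrent.TimeoutException;

// Place on a Spring bean method that is called from another bean, so the retry proxy applies.
// Up to 3 attempts in total, waiting 1 second between attempts, only for TimeoutException.
@Retryable(retryFor = TimeoutException.class, maxAttempts = 3, backoff = @Backoff(delay = 1000))
public String aiChat(String msg) {
    return chatClient.prompt().user(msg).call().content();
}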
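What the annotation asks for (a bounded number of attempts, a fixed delay, and a filter on the exception type) can be written out by hand. The following is a minimal sketch of that behaviour, not how Spring Retry is implemented; `SimpleRetry` and its parameters are hypothetical names.

```java
import java.util.concurrent.Callable;

/** Hypothetical fixed-delay retry loop mirroring maxAttempts, delay, and retryFor. */
final class SimpleRetry {

    private SimpleRetry() {
    }

    /**
     * Runs the task up to maxAttempts times. Only exceptions that are instances of retryOn
     * trigger another attempt; anything else, or the final failure, is rethrown unchanged.
     */
    static <T> T call(Callable<T> task, int maxAttempts, long delayMillis,
                      Class<? extends Exception> retryOn) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (!retryOn.isInstance(e) || attempt == maxAttempts) {
                    throw e;
                }
                // Wait before the next attempt
                Thread.sleep(delayMillis);
            }
        }
        // Not reachable: the last attempt either returns or throws
        throw new IllegalStateException("retry loop exited unexpectedly");
    }
}
```

Restricting retries to a specific exception type matters for AI calls: retrying a timeout is usually reasonable, while retrying an authentication or validation error only adds latency and cost.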
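14. Global Interceptor | Latency Monitoring
java
import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.client.ChatClientCustomizer;
import org.springframework.ai.chat.client.ChatClientRequest;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.advisor.api.CallAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAdvisorChain;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Slf4j
@Configuration
public class ChatClientInterceptorConfig {

    // ChatClient has no interceptor API; cross-cutting logic is added as an advisor
    @Bean
    public ChatClientCustomizer aiChatClientCustomizer() {
        return builder -> builder.defaultAdvisors(new CallAdvisor() {
            @Override
            public ChatClientResponse adviseCall(ChatClientRequest request, CallAdvisorChain chain) {
                long start = System.currentTimeMillis();
                ChatClientResponse response = chain.nextCall(request);
                long cost = System.currentTimeMillis() - start;
                var options = request.prompt().getOptions();
                String model = options != null ? options.getModel() : "unknown";
                log.info("[AI call] model: {}, elapsed: {} ms", model, cost);
                return response;
            }

            @Override
            public String getName() {
                return "timingAdvisor";
            }

            @Override
            public int getOrder() {
                return 0;
            }
        });
    }
}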
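The timing pattern itself is independent of Spring AI. As a hedged sketch, the hypothetical `CallTimer` below wraps any call, reports the elapsed milliseconds to a callback, and does so even when the call throws, so failed requests still show up in latency metrics.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical helper that measures how long a call takes, including failed calls. */
final class CallTimer {

    private CallTimer() {
    }

    /** Runs the call and always reports the elapsed time in milliseconds to the callback. */
    static <T> T timed(Supplier<T> call, LongConsumer onElapsedMillis) {
        // nanoTime is monotonic, so the measurement is unaffected by wall-clock adjustments
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            onElapsedMillis.accept((System.nanoTime() - start) / 1_000_000);
        }
    }
}
```

In practice the callback would feed a metrics registry or a structured log line instead of a plain log statement, which makes call volume, latency, and failure rate queryable.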
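15. Enterprise Best Practices
- Keep secrets in environment variables, never in plain text
- Use one global system prompt to standardize output
- Always configure timeouts so threads cannot block indefinitely
- Cap the number of context rounds to control token cost
- Apply a similarity threshold in RAG to curb hallucination
- Serve streaming endpoints as standard SSE with an error fallback
- Hook up monitoring: call volume, latency, failure rate
- Mask output and filter sensitive words
16. Common Pitfalls
- Version mismatch → auto-configuration fails and beans are missing
- Network restrictions in mainland China → switch to a domestic model or configure a proxy
- Multi-turn context accumulates without limit → bound the history with a size-limited ChatMemory
- RAG fabricates answers → add a similarity threshold and a strictly constrained prompt
- Function calling does not trigger → the @Description annotation is missing
- Logs print secrets in plain text → mask them in one central place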
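17. Production Prompt Templates
17.1 Java Technical Q&A
text
You are a Java architect with 10 years of experience. Answer concisely and get straight to the point, include minimal runnable code, draw on production pitfalls and performance tuning, and avoid filler.
Question: {question}
17.2 Code Review and Optimization
text
Perform an enterprise-grade code review: fix NPEs, database queries inside loops, N+1 queries, and inefficient code; flatten nesting, tighten transactions, and standardize logging; output the complete optimized code plus a list of what was changed.
Code to optimize: {code}
17.3 RAG Anti-Hallucination
text
Answer only from the given reference material. Do not fabricate and do not bring in outside knowledge. If there is no answer, reply: No relevant information available.
Reference material: {context}
Question: {question}
17.4 Production Log Troubleshooting
text
You are an expert in troubleshooting Java production incidents. Analyze the exception log: identify the root cause, give an immediate fix, and propose a long-term preventive improvement.
Log content: {errorLog}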
18. General-Purpose AI Utility Classes
18.1 Prompt Utility
java
import org.springframework.ai.chat.prompt.PromptTemplate;
import java.util.Map;

public class AiPromptUtil {

    // Renders a template with several variables
    public static String build(String template, Map<String, Object> paramMap) {
        return new PromptTemplate(template).render(paramMap);
    }

    // Convenience overload for a single variable
    public static String build(String template, String key, String value) {
        return build(template, Map.of(key, value));
    }
}
18.2 Content Masking Utility
java
import org.springframework.util.StringUtils;

public class AiContentUtil {

    // Replaces the listed keywords themselves with ***; the values that follow them are left untouched
    public static String filterSensitive(String content) {
        if (!StringUtils.hasText(content)) return content;
        return content.replaceAll("密钥|password|token|accessKey", "***");
    }
}
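A keyword filter hides the word `password` but not the password. If the goal is to keep secret values out of logs and responses, one option is to mask what follows the keyword instead. The sketch below is an assumption-laden illustration: `SecretMasker` is a hypothetical class, and its pattern only covers `key=value` and `key: value` forms for a handful of key names.

```java
import java.util.regex.Pattern;

/** Hypothetical masker that hides the value in key=value or key: value pairs. */
final class SecretMasker {

    // Group 1: key name, group 2: separator, group 3: the secret value up to whitespace or , ; &
    private static final Pattern SECRET_PAIR = Pattern.compile(
            "(?i)\\b(password|token|accessKey|api[-_]?key)(\\s*[=:]\\s*)([^\\s,;&]+)");

    private SecretMasker() {
    }

    /** Keeps the key and separator, replaces the value with ***. */
    static String mask(String content) {
        if (content == null || content.isBlank()) {
            return content;
        }
        return SECRET_PAIR.matcher(content).replaceAll("$1$2***");
    }
}
```

For example, `SecretMasker.mask("password=abc123, user=bob")` yields `password=***, user=bob`. Quoted JSON fields and other formats are not matched by this pattern and would need their own rules.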
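18.3 Global Constants
java
public final class AiConstant {

    // Maximum number of conversation rounds kept in memory
    public static final int MAX_CHAT_MEMORY = 5;
    // Number of chunks retrieved per RAG query
    public static final int RAG_TOP_K = 3;
    // Minimum similarity score for a chunk to be used
    public static final double RAG_THRESHOLD = 0.7;
    // Maximum attempts per AI call
    public static final int AI_RETRY_COUNT = 3;

    private AiConstant() {
    }
}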
19. Unified Response Entity
java
import lombok.Data;

@Data
public class AiResult<T> {

    private Integer code;
    private String msg;
    private T data;

    public static <T> AiResult<T> success(T data) {
        AiResult<T> r = new AiResult<>();
        r.setCode(200);
        r.setMsg("Request succeeded");
        r.setData(data);
        return r;
    }

    public static <T> AiResult<T> fail(Integer code, String msg) {
        AiResult<T> r = new AiResult<>();
        r.setCode(code);
        r.setMsg(msg);
        return r;
    }
}
20. General-Purpose Controller Template
java
import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;

// Expects ChatClient and ChatMemory beans to be defined in a @Configuration class
@RestController
@RequestMapping("/ai")
@RequiredArgsConstructor
public class AiAllController {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;
    private final RagService ragService;

    @GetMapping("/chat")
    public AiResult<String> chat(@RequestParam String question) {
        return AiResult.success(chatClient.prompt().user(question).call().content());
    }

    @GetMapping("/multi")
    public AiResult<String> multi(@RequestParam String question,
                                  @RequestParam(defaultValue = "default") String conversationId) {
        return AiResult.success(chatClient.prompt()
                .user(question)
                .advisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
                .advisors(a -> a.param(ChatMemory.CONVERSATION_ID, conversationId))
                .call()
                .content());
    }

    @GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> stream(@RequestParam String question) {
        return chatClient.prompt().user(question).stream().content()
                .map(AiContentUtil::filterSensitive)
                .onErrorResume(e -> Flux.just("The AI service failed"));
    }

    @GetMapping("/rag")
    public AiResult<String> rag(@RequestParam String question) {
        return AiResult.success(ragService.ragChat(question));
    }

    @PostMapping("/memory/clear")
    public AiResult<Void> clear(@RequestParam(defaultValue = "default") String conversationId) {
        chatMemory.clear(conversationId);
        return AiResult.success(null);
    }
}