AI chatbots are everywhere these days, so I wanted to build one in Java for fun. Honestly, I figured it would be easy at first: isn't it just calling an API? It turned out I hit a whole pile of pitfalls. In this post I'll walk through the entire zero-to-one process, along with the traps that nearly drove me crazy.

Prep work: framework or raw API calls?
At first I went back and forth: use a framework like Spring AI, or call the OpenAI API directly? In the end I decided that if I was going to learn this, I might as well learn it properly, so Spring AI it was. It's an official Spring project, so at least there's somewhere to turn when things break.
The project structure looks roughly like this:
chatbot/
├── src/main/java/
│   └── com/moyulao/chatbot/
│       ├── ChatbotApplication.java
│       ├── controller/ChatController.java
│       ├── service/ChatService.java
│       └── config/OpenAIConfig.java
└── pom.xml
Step 1: Set up the base project
Generate a project with Spring Initializr and add the Spring Web and Spring AI dependencies.
xml
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        <version>1.0.0</version>
    </dependency>
</dependencies>
Here came the first pitfall: be explicit about the Spring AI version. I left the version out, Maven couldn't resolve the dependency, and it took me ages to realize the missing version was the problem. (Importing the spring-ai-bom in dependencyManagement is another way to keep the versions consistent.)
Step 2: Configure OpenAI
Configure the OpenAI API key in application.yml:
yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
Note: keep the API key in an environment variable instead of hard-coding it in the config file, otherwise things get awkward the moment the code lands on GitHub. I've seen someone's key leak and their bill get run up badly.
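If you'd rather fail fast at startup than discover a missing key on the first request, a small check in the config class works. This is just an illustrative sketch (the class name OpenAIConfig matches the project structure above, but the check itself is my addition, not something Spring AI requires):
java
import jakarta.annotation.PostConstruct;
import org.springframework.context.annotation.Configuration;

@Configuration
public class OpenAIConfig {

    // Fail fast at startup if the key is missing, instead of failing on the first chat request
    @PostConstruct
    public void checkApiKey() {
        String apiKey = System.getenv("OPENAI_API_KEY");
        if (apiKey == null || apiKey.isBlank()) {
            throw new IllegalStateException("OPENAI_API_KEY is not set");
        }
    }
}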
Step 3: Write a simple chat endpoint
My first version was as bare-bones as it gets:
java
@RestController
@RequestMapping("/api/chat")
public class ChatController {

    @Autowired
    private ChatClient chatClient;

    @PostMapping
    public ResponseEntity<String> chat(@RequestBody String message) {
        String response = chatClient.call(message);
        return ResponseEntity.ok(response);
    }
}
I tested it, feeling pretty pleased with myself, and then noticed that every exchange was completely independent: no context at all. That obviously won't do; a chatbot needs memory.
Step 4: Add conversation memory
This was pitfall number two. I had assumed Spring AI would handle context automatically; it turns out you have to maintain it yourself. After reading the docs, it came down to either using whatever conversation-memory support the framework offers, or keeping the message history myself.
I chose to maintain it myself: a bit more work, but more flexible.
java
@Service
public class ChatService {

    @Autowired
    private ChatClient chatClient;

    // In-memory map of per-session history; in a real project use Redis instead
    private final Map<String, List<Message>> conversationHistory = new ConcurrentHashMap<>();

    public String chat(String sessionId, String userMessage) {
        // Fetch the history for this session
        List<Message> history = conversationHistory.getOrDefault(
                sessionId,
                new ArrayList<>()
        );

        // Append the user message
        history.add(new Message("user", userMessage));

        // Assemble the full conversation context
        String context = buildContext(history);

        // Call the AI
        String response = chatClient.call(context);

        // Append the AI reply
        history.add(new Message("assistant", response));

        // Save the history, capped so it doesn't grow without bound
        // (copy the subList view instead of keeping a reference to the old backing list)
        if (history.size() > 20) {
            history = new ArrayList<>(history.subList(history.size() - 20, history.size()));
        }
        conversationHistory.put(sessionId, history);

        return response;
    }

    private String buildContext(List<Message> history) {
        StringBuilder sb = new StringBuilder();
        sb.append("你是一个友好的AI助手。以下是对话历史:\n\n");
        for (Message msg : history) {
            sb.append(msg.getRole())
              .append(": ")
              .append(msg.getContent())
              .append("\n");
        }
        return sb.toString();
    }
}
Another pitfall here: at first I didn't cap the history, and once conversations got long, token usage exploded and API calls started timing out. The fix was capping the history at the last 20 messages (roughly the last 10 turns).
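For reference, the Message type used above is just a small POJO in the model package. A minimal sketch, with field names assumed to match the getRole()/getContent() calls in the service:
java
// model/Message.java - minimal sketch of the message POJO assumed by ChatService
public class Message {

    private String role;     // "user" or "assistant"
    private String content;  // the message text

    public Message(String role, String content) {
        this.role = role;
        this.content = content;
    }

    public String getRole() { return role; }
    public String getContent() { return content; }
}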
Step 5: Handle streaming responses
Users wanted to see the reply appear word by word, just like ChatGPT. Fair enough; the experience really is better.
Spring AI supports streaming calls, but handling them takes a bit more work:
java
@PostMapping("/stream")
public SseEmitter streamChat(
@RequestBody String message,
@RequestParam String sessionId) {
SseEmitter emitter = new SseEmitter(60000L);
// 异步处理,避免阻塞
CompletableFuture.runAsync(() -> {
try {
chatClient.stream(message)
.doOnNext(chunk -> {
try {
emitter.send(SseEmitter.event()
.data(chunk.getResult().getOutput().getContent())
.name("message"));
} catch (IOException e) {
emitter.completeWithError(e);
}
})
.doOnComplete(emitter::complete)
.doOnError(emitter::completeWithError)
.subscribe();
} catch (Exception e) {
emitter.completeWithError(e);
}
});
return emitter;
}
On the front end you can receive it with EventSource, with one caveat (see the comment below): EventSource can only make GET requests.
javascript
// Note: EventSource only issues GET requests, so this only works if the endpoint
// also accepts GET with query parameters. For the POST endpoint above, use
// fetch + ReadableStream instead (full example later in this post).
const eventSource = new EventSource('/api/chat/stream?sessionId=123&message=你好');

eventSource.addEventListener('message', function (event) {
    // Append each chunk to the chat window
    appendToChat(event.data);
});

eventSource.onerror = function () {
    eventSource.close();
};
Step 6: Add error handling and retries
In real use, API calls fail: network issues, rate limiting, timeouts. A retry mechanism is a must.
java
@Service
public class ChatService {

    @Retryable(
            value = {Exception.class},
            maxAttempts = 3,
            backoff = @Backoff(delay = 1000, multiplier = 2)
    )
    public String chatWithRetry(String sessionId, String message) {
        try {
            return chat(sessionId, message);
        } catch (Exception e) {
            log.error("Chat failed, retrying...", e);
            throw e;
        }
    }

    @Recover
    public String recover(Exception e, String sessionId, String message) {
        log.error("Chat failed after retries", e);
        return "抱歉,我现在有点忙,请稍后再试。";
    }
}
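One gotcha: @Retryable does nothing unless the spring-retry dependency is on the classpath and retries are enabled on a configuration class. A minimal sketch of that switch:
java
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.annotation.EnableRetry;

// Without @EnableRetry, the @Retryable/@Recover annotations above are silently ignored
@Configuration
@EnableRetry
public class RetryConfig {
}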
Step 7: Performance tuning
Once there were more users, responses felt slow. A bit of profiling pointed to a few bottlenecks:
- API call latency: not much to do about this, it's an external service
- Blocking synchronous calls: switch to asynchronous handling
- History/context assembly: optimize the string building
Here's the async version:
java
@Async
public CompletableFuture<String> chatAsync(String sessionId, String message) {
    // @Async already runs this on Spring's task executor,
    // so just wrap the result instead of dispatching to yet another pool
    return CompletableFuture.completedFuture(chat(sessionId, message));
}
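@Async only kicks in when async support is enabled. A minimal sketch of that configuration, with a dedicated pool for chat calls; the pool sizes and the chatExecutor name are placeholders, not values from the original project:
java
import java.util.concurrent.Executor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    // Dedicated pool so slow AI calls don't starve other async work;
    // reference it with @Async("chatExecutor") on the method
    @Bean(name = "chatExecutor")
    public Executor chatExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);
        executor.setMaxPoolSize(16);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("chat-");
        executor.initialize();
        return executor;
    }
}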
Summary of the pitfalls
- Token usage: I didn't pay attention at first, the history kept growing, and token consumption got scary. Fixed with a length cap and token counting.
- Concurrency: storing session history in a plain Map breaks under concurrent access. I switched to ConcurrentHashMap, though Redis is the better answer.
- Timeouts: API calls are sometimes slow and the default timeout isn't enough; it has to be set explicitly.
- Cost control: I didn't watch the number of API calls and the bill was a bit frightening. Rate limiting and caching helped (a minimal cache sketch follows this list).
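To illustrate the caching idea (not the exact code I used), here's a tiny sketch that short-circuits repeated identical questions; the key scheme and TTL handling are deliberately simplistic:
java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Very small in-process cache: identical prompts within the TTL reuse the previous answer
public class AnswerCache {

    private record Entry(String answer, long expiresAt) {}

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public AnswerCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public String get(String prompt) {
        Entry entry = cache.get(prompt);
        if (entry == null || entry.expiresAt() < System.currentTimeMillis()) {
            return null; // miss or expired
        }
        return entry.answer();
    }

    public void put(String prompt, String answer) {
        cache.put(prompt, new Entry(answer, System.currentTimeMillis() + ttlMillis));
    }
}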
The full Controller
Finally, here's the complete controller for anyone who needs it:
java
@RestController
@RequestMapping("/api/chat")
@Slf4j
public class ChatController {

    @Autowired
    private ChatService chatService;

    @PostMapping
    public ResponseEntity<ChatResponse> chat(
            @RequestBody ChatRequest request) {
        try {
            String response = chatService.chat(
                    request.getSessionId(),
                    request.getMessage()
            );
            return ResponseEntity.ok(new ChatResponse(response));
        } catch (Exception e) {
            log.error("Chat error", e);
            return ResponseEntity.status(500)
                    .body(new ChatResponse("服务暂时不可用,请稍后再试"));
        }
    }

    @PostMapping("/stream")
    public SseEmitter streamChat(@RequestBody ChatRequest request) {
        // The streaming implementation is shown earlier, so it isn't repeated here
        // ...
        return null; // placeholder
    }
}
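The ChatRequest and ChatResponse DTOs referenced above aren't shown in the original snippets. A minimal sketch of what the controller assumes, with field names inferred from the getters and the JSON used in the tests below:
java
// model/ChatRequest.java
public class ChatRequest {
    private String sessionId;
    private String message;

    public String getSessionId() { return sessionId; }
    public void setSessionId(String sessionId) { this.sessionId = sessionId; }
    public String getMessage() { return message; }
    public void setMessage(String message) { this.message = message; }
}

// model/ChatResponse.java
public class ChatResponse {
    private String response;

    public ChatResponse() {}
    public ChatResponse(String response) { this.response = response; }

    public String getResponse() { return response; }
    public void setResponse(String response) { this.response = response; }
}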
Step 8: Persist session history in Redis
Keeping session history in an in-memory Map means it's gone on every restart, which clearly won't do. Move it to Redis:
java
@Service
@Slf4j
public class RedisChatService {

    @Autowired
    private ChatClient chatClient;

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    @Autowired
    private ObjectMapper objectMapper;

    private static final String HISTORY_KEY_PREFIX = "chat:history:";
    private static final int MAX_HISTORY_LENGTH = 20;
    private static final int HISTORY_TTL_HOURS = 24;

    public String chat(String sessionId, String userMessage) {
        try {
            // 1. Load the history
            List<Message> history = getHistory(sessionId);

            // 2. Append the user message
            history.add(new Message("user", userMessage));

            // 3. Cap the history length (copy the subList view)
            if (history.size() > MAX_HISTORY_LENGTH) {
                history = new ArrayList<>(history.subList(history.size() - MAX_HISTORY_LENGTH, history.size()));
            }

            // 4. Build the context
            String context = buildContext(history);

            // 5. Call the AI
            String response = chatClient.call(context);

            // 6. Append the AI reply
            history.add(new Message("assistant", response));

            // 7. Persist the history
            saveHistory(sessionId, history);

            return response;
        } catch (Exception e) {
            log.error("Chat failed for session: {}", sessionId, e);
            throw new RuntimeException("Chat failed", e);
        }
    }

    private List<Message> getHistory(String sessionId) {
        String key = HISTORY_KEY_PREFIX + sessionId;
        String historyJson = redisTemplate.opsForValue().get(key);
        if (historyJson == null) {
            return new ArrayList<>();
        }
        try {
            List<Map<String, String>> list = objectMapper.readValue(
                    historyJson,
                    new TypeReference<List<Map<String, String>>>() {}
            );
            return list.stream()
                    .map(map -> new Message(map.get("role"), map.get("content")))
                    .collect(Collectors.toList());
        } catch (Exception e) {
            log.error("Failed to parse history", e);
            return new ArrayList<>();
        }
    }

    private void saveHistory(String sessionId, List<Message> history) {
        String key = HISTORY_KEY_PREFIX + sessionId;
        try {
            List<Map<String, String>> list = history.stream()
                    .map(msg -> Map.of("role", msg.getRole(), "content", msg.getContent()))
                    .collect(Collectors.toList());
            String historyJson = objectMapper.writeValueAsString(list);
            redisTemplate.opsForValue().set(
                    key,
                    historyJson,
                    HISTORY_TTL_HOURS,
                    TimeUnit.HOURS
            );
        } catch (Exception e) {
            log.error("Failed to save history", e);
        }
    }

    private String buildContext(List<Message> history) {
        StringBuilder sb = new StringBuilder();
        sb.append("你是一个友好的AI助手。以下是对话历史:\n\n");
        for (Message msg : history) {
            sb.append(msg.getRole())
              .append(": ")
              .append(msg.getContent())
              .append("\n");
        }
        return sb.toString();
    }
}
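One thing to watch: Spring Boot auto-configures a StringRedisTemplate (which does satisfy a RedisTemplate<String, String> injection point), so an explicit bean isn't strictly required. Still, the project structure lists a config/RedisConfig.java; a minimal sketch of what it might contain, marked @Primary so it doesn't clash with the auto-configured template:
java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {

    // Explicit String serializers so the stored history is readable with redis-cli
    @Bean
    @Primary
    public RedisTemplate<String, String> chatRedisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, String> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        template.setKeySerializer(new StringRedisSerializer());
        template.setValueSerializer(new StringRedisSerializer());
        return template;
    }
}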
Step 9: The complete streaming implementation
Streaming came up earlier; here is the full implementation:
java
@RestController
@RequestMapping("/api/chat")
@Slf4j
public class StreamChatController {

    @Autowired
    private ChatClient chatClient;

    @Autowired
    private ChatService chatService;

    @PostMapping("/stream")
    public SseEmitter streamChat(
            @RequestBody ChatRequest request,
            @RequestParam(required = false, defaultValue = "false") boolean useHistory) {

        SseEmitter emitter = new SseEmitter(120000L); // 2-minute timeout

        CompletableFuture.runAsync(() -> {
            try {
                String message = request.getMessage();
                String sessionId = request.getSessionId();

                // Prepend the conversation history if the caller asked for it
                String prompt = message;
                if (useHistory && sessionId != null) {
                    prompt = chatService.buildContextWithHistory(sessionId, message);
                }

                // Streaming call
                chatClient.stream(prompt)
                        .doOnNext(chunk -> {
                            try {
                                String content = extractContent(chunk);
                                if (content != null && !content.isEmpty()) {
                                    emitter.send(SseEmitter.event()
                                            .data(content)
                                            .name("message"));
                                }
                            } catch (IOException e) {
                                log.error("Failed to send SSE", e);
                                emitter.completeWithError(e);
                            }
                        })
                        .doOnComplete(() -> {
                            try {
                                // Send an end-of-stream marker
                                emitter.send(SseEmitter.event()
                                        .data("[DONE]")
                                        .name("done"));
                                emitter.complete();
                            } catch (IOException e) {
                                emitter.completeWithError(e);
                            }
                        })
                        .doOnError(error -> {
                            log.error("Stream error", error);
                            try {
                                emitter.send(SseEmitter.event()
                                        .data("错误:" + error.getMessage())
                                        .name("error"));
                            } catch (IOException e) {
                                // ignore
                            }
                            emitter.completeWithError(error);
                        })
                        .subscribe();
            } catch (Exception e) {
                log.error("Stream chat failed", e);
                emitter.completeWithError(e);
            }
        });

        // Handle client disconnects
        emitter.onCompletion(() -> log.info("SSE connection completed"));
        emitter.onTimeout(() -> log.info("SSE connection timeout"));
        emitter.onError((ex) -> log.error("SSE connection error", ex));

        return emitter;
    }

    private String extractContent(org.springframework.ai.chat.ChatResponse chunk) {
        if (chunk.getResult() != null &&
                chunk.getResult().getOutput() != null &&
                chunk.getResult().getOutput().getContent() != null) {
            return chunk.getResult().getOutput().getContent();
        }
        return null;
    }
}
Front-end integration example:
javascript
// Front-end streaming chat client
class StreamChatClient {
    constructor(baseUrl) {
        this.baseUrl = baseUrl;
    }

    async streamChat(message, sessionId, onChunk, onComplete, onError) {
        try {
            const response = await fetch(`${this.baseUrl}/api/chat/stream?useHistory=true`, {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                    message: message,
                    sessionId: sessionId
                })
            });

            const reader = response.body.getReader();
            const decoder = new TextDecoder();
            let buffer = '';

            while (true) {
                const { done, value } = await reader.read();
                if (done) {
                    onComplete();
                    break;
                }
                buffer += decoder.decode(value, { stream: true });

                // SSE events are separated by a blank line
                const events = buffer.split('\n\n');
                buffer = events.pop(); // keep the last (possibly incomplete) event

                for (const rawEvent of events) {
                    // Each event looks like "event:message\ndata:chunk"
                    let eventName = 'message';
                    let data = '';
                    for (const line of rawEvent.split('\n')) {
                        if (line.startsWith('event:')) {
                            eventName = line.slice(6).trim();
                        } else if (line.startsWith('data:')) {
                            data += line.slice(5).replace(/^ /, '');
                        }
                    }
                    if (eventName === 'done' || data === '[DONE]') {
                        onComplete();
                        return;
                    }
                    if (eventName === 'error') {
                        onError(new Error(data));
                        return;
                    }
                    if (data) {
                        onChunk(data);
                    }
                }
            }
        } catch (error) {
            onError(error);
        }
    }
}

// Usage example
const chatClient = new StreamChatClient('http://localhost:8080');
chatClient.streamChat(
    '你好,介绍一下Java',
    'session-123',
    (chunk) => {
        // Append content as it arrives
        document.getElementById('chat-content').innerText += chunk;
    },
    () => {
        console.log('Chat completed');
    },
    (error) => {
        console.error('Chat error:', error);
    }
);
Step 10: Add authentication and access control
Production needs authentication and access control:
java
@RestController
@RequestMapping("/api/chat")
@Slf4j
public class SecureChatController {

    @Autowired
    private ChatService chatService;

    @Autowired
    private RateLimitService rateLimitService;

    @PostMapping
    @PreAuthorize("hasRole('USER')")
    public ResponseEntity<ChatResponse> chat(
            @RequestBody ChatRequest request,
            @AuthenticationPrincipal UserDetails userDetails) {
        try {
            // 1. Rate-limit check
            String userId = userDetails.getUsername();
            if (!rateLimitService.allowRequest(userId)) {
                return ResponseEntity.status(429)
                        .body(new ChatResponse("请求过于频繁,请稍后再试"));
            }

            // 2. Content safety check
            if (!contentSafetyCheck(request.getMessage())) {
                return ResponseEntity.status(400)
                        .body(new ChatResponse("内容不合规,请重新输入"));
            }

            // 3. Call the chat service
            String sessionId = generateSessionId(userId, request.getSessionId());
            String response = chatService.chat(sessionId, request.getMessage());

            // 4. Audit logging
            logChatRequest(userId, request.getMessage(), response);

            return ResponseEntity.ok(new ChatResponse(response));
        } catch (Exception e) {
            log.error("Chat error for user: {}", userDetails.getUsername(), e);
            return ResponseEntity.status(500)
                    .body(new ChatResponse("服务暂时不可用,请稍后再试"));
        }
    }

    private boolean contentSafetyCheck(String message) {
        // Naive keyword check; a real system should use a proper content-moderation service
        List<String> sensitiveWords = Arrays.asList("暴力", "色情", "政治");
        String lowerMessage = message.toLowerCase();
        return sensitiveWords.stream()
                .noneMatch(word -> lowerMessage.contains(word.toLowerCase()));
    }

    private String generateSessionId(String userId, String clientSessionId) {
        if (clientSessionId != null && !clientSessionId.isEmpty()) {
            return userId + ":" + clientSessionId;
        }
        return userId + ":" + UUID.randomUUID().toString();
    }

    private void logChatRequest(String userId, String question, String answer) {
        // Persist to a database or logging system
        // (named chatLog rather than log, to avoid shadowing the Lombok logger)
        ChatLog chatLog = new ChatLog();
        chatLog.setUserId(userId);
        chatLog.setQuestion(question);
        chatLog.setAnswer(answer);
        chatLog.setCreateTime(new Date());
        // chatLogRepository.save(chatLog);
    }
}

// Rate-limiting service
@Service
public class RateLimitService {

    @Autowired
    private RedisTemplate<String, Long> redisTemplate;

    private static final int MAX_REQUESTS_PER_MINUTE = 20;

    public boolean allowRequest(String userId) {
        String key = "ratelimit:chat:" + userId;
        Long count = redisTemplate.opsForValue().increment(key);
        // Start the one-minute window on the first request
        if (count == 1) {
            redisTemplate.expire(key, 60, TimeUnit.SECONDS);
        }
        return count <= MAX_REQUESTS_PER_MINUTE;
    }
}
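@PreAuthorize only takes effect when method security is enabled. A minimal sketch of the SecurityConfig.java from the project structure, assuming the JWT resource-server setup shown in the configuration file further down; the matcher rules are placeholders:
java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.method.configuration.EnableMethodSecurity;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
@EnableMethodSecurity // required for @PreAuthorize to be evaluated
public class SecurityConfig {

    @Bean
    public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
        http
            .csrf(csrf -> csrf.disable())
            .authorizeHttpRequests(auth -> auth
                .requestMatchers("/actuator/health").permitAll()
                .anyRequest().authenticated())
            .oauth2ResourceServer(oauth2 -> oauth2.jwt(jwt -> {}));
        return http.build();
    }
}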
Step 11: Monitoring and metrics
Add monitoring so you can see what the system is actually doing:
java
@Component
@Slf4j
public class ChatMetrics {

    private final MeterRegistry meterRegistry;
    private final Counter chatRequestCounter;
    private final Timer chatResponseTimer;
    private final Counter chatErrorCounter;

    public ChatMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        this.chatRequestCounter = Counter.builder("chat.requests")
                .description("Total chat requests")
                .register(meterRegistry);
        this.chatResponseTimer = Timer.builder("chat.response.time")
                .description("Chat response time")
                .register(meterRegistry);
        this.chatErrorCounter = Counter.builder("chat.errors")
                .description("Chat errors")
                .register(meterRegistry);
    }

    public void recordRequest() {
        chatRequestCounter.increment();
    }

    public void recordResponseTime(Duration duration) {
        chatResponseTimer.record(duration);
    }

    public void recordError() {
        chatErrorCounter.increment();
    }
}

// Using the metrics in a service
@Service
public class MonitoredChatService {

    @Autowired
    private ChatService chatService;

    @Autowired
    private ChatMetrics metrics;

    public String chat(String sessionId, String message) {
        metrics.recordRequest();
        long startTime = System.currentTimeMillis();
        try {
            String response = chatService.chat(sessionId, message);
            long duration = System.currentTimeMillis() - startTime;
            metrics.recordResponseTime(Duration.ofMillis(duration));
            return response;
        } catch (Exception e) {
            metrics.recordError();
            throw e;
        }
    }
}
Step 12: The final project structure
The final layout:
chatbot/
├── src/main/java/com/moyulao/chatbot/
│   ├── ChatbotApplication.java
│   ├── config/
│   │   ├── OpenAIConfig.java
│   │   ├── RedisConfig.java
│   │   └── SecurityConfig.java
│   ├── controller/
│   │   ├── ChatController.java
│   │   └── StreamChatController.java
│   ├── service/
│   │   ├── ChatService.java
│   │   ├── RedisChatService.java
│   │   ├── RateLimitService.java
│   │   └── ContentSafetyService.java
│   ├── model/
│   │   ├── Message.java
│   │   ├── ChatRequest.java
│   │   └── ChatResponse.java
│   ├── metrics/
│   │   └── ChatMetrics.java
│   └── exception/
│       └── ChatException.java
├── src/main/resources/
│   ├── application.yml
│   └── application-prod.yml
└── pom.xml
Full configuration file example
yaml
# application.yml
spring:
  application:
    name: chatbot
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
          max-tokens: 1000
  data:
    redis:
      host: localhost
      port: 6379
      password: ${REDIS_PASSWORD:}
      timeout: 2000ms
      lettuce:
        pool:
          max-active: 8
          max-idle: 8
          min-idle: 0
  security:
    oauth2:
      resourceserver:
        jwt:
          issuer-uri: ${JWT_ISSUER_URI}

# Application settings
app:
  chat:
    max-history-length: 20
    history-ttl-hours: 24
    rate-limit:
      max-requests-per-minute: 20
    stream:
      timeout-seconds: 120

# Monitoring
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  metrics:
    export:
      prometheus:
        enabled: true
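The custom app.chat.* values can be bound to a properties class instead of sprinkling @Value around. A minimal sketch (this class isn't part of the original structure, so treat it as optional; the nested rate-limit and stream settings are left out for brevity):
java
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;

// Binds the app.chat.* block from application.yml
@Component
@ConfigurationProperties(prefix = "app.chat")
public class ChatProperties {

    private int maxHistoryLength = 20;
    private int historyTtlHours = 24;

    public int getMaxHistoryLength() { return maxHistoryLength; }
    public void setMaxHistoryLength(int maxHistoryLength) { this.maxHistoryLength = maxHistoryLength; }
    public int getHistoryTtlHours() { return historyTtlHours; }
    public void setHistoryTtlHours(int historyTtlHours) { this.historyTtlHours = historyTtlHours; }
}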
Test cases
java
@SpringBootTest
@AutoConfigureMockMvc
class ChatControllerTest {

    @Autowired
    private MockMvc mockMvc;

    @MockBean
    private ChatClient chatClient;

    @Test
    void testChat() throws Exception {
        // Mock the AI response
        when(chatClient.call(anyString())).thenReturn("这是AI的回复");

        // Send the request
        mockMvc.perform(post("/api/chat")
                        .contentType(MediaType.APPLICATION_JSON)
                        .content("{\"message\":\"你好\",\"sessionId\":\"test-123\"}"))
                .andExpect(status().isOk())
                .andExpect(jsonPath("$.response").exists());
    }

    @Test
    void testRateLimit() throws Exception {
        // Use up the quota first: the limit is 20 requests per minute
        for (int i = 0; i < 20; i++) {
            mockMvc.perform(post("/api/chat")
                    .contentType(MediaType.APPLICATION_JSON)
                    .content("{\"message\":\"test\"}"));
        }

        // The 21st request should be rate limited
        mockMvc.perform(post("/api/chat")
                        .contentType(MediaType.APPLICATION_JSON)
                        .content("{\"message\":\"test\"}"))
                .andExpect(status().is(429));
    }
}
Where to go next
This version is in reasonable shape, but there's still plenty that could be improved:
1. Plug in a vector database for RAG
Let the bot answer from a knowledge base:
java
@Service
public class RAGChatService {

    @Autowired
    private ChatClient chatClient;   // needed for the call below

    @Autowired
    private VectorStore vectorStore;

    @Autowired
    private EmbeddingClient embeddingClient;

    public String chatWithRAG(String question, String sessionId) {
        // 1. Vector similarity search
        List<Document> relevantDocs = vectorStore.similaritySearch(question, 5);

        // 2. Build the knowledge-base context
        String context = buildContext(relevantDocs);

        // 3. Pull in the conversation history
        String history = getHistoryContext(sessionId);

        // 4. Call the AI
        String prompt = String.format(
                "基于以下知识库内容和对话历史回答问题:\n\n" +
                "知识库:\n%s\n\n" +
                "对话历史:\n%s\n\n" +
                "问题:%s\n\n回答:",
                context, history, question
        );
        return chatClient.call(prompt);
    }
}
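Before retrieval can work, documents obviously have to be loaded into the vector store. A minimal ingestion sketch; the fixed-size chunking and the "source" metadata key are placeholders, not part of the original project:
java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class KnowledgeBaseIngestService {

    @Autowired
    private VectorStore vectorStore;

    // Splits raw text into fixed-size chunks and stores them (embeddings are computed by the store)
    public void ingest(String sourceName, String text) {
        int chunkSize = 500; // characters per chunk, a placeholder value
        List<Document> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            String chunk = text.substring(start, Math.min(start + chunkSize, text.length()));
            chunks.add(new Document(chunk, Map.of("source", sourceName)));
        }
        vectorStore.add(chunks);
    }
}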
2. Multi-turn conversation optimization
Improve context understanding and cut repetition:
java
public String optimizeContext(List<Message> history) {
    // Keep the last few turns verbatim and summarize everything older
    int keep = Math.min(5, history.size());

    // 1. Extract key information
    Map<String, String> extractedInfo = extractKeyInfo(history);

    // 2. Summarize the older part of the conversation
    String summary = summarizeHistory(history.subList(0, history.size() - keep));

    // 3. Keep the most recent turns as-is
    List<Message> recentHistory = history.subList(history.size() - keep, history.size());

    // 4. Combine
    return buildOptimizedContext(extractedInfo, summary, recentHistory);
}
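The summarizeHistory helper above isn't spelled out. One straightforward option is to ask the model itself for the summary; a sketch under that assumption (the prompt wording and the reuse of chatClient.call are mine):
java
// Have the model compress the older part of the conversation into a short summary
private String summarizeHistory(List<Message> olderMessages) {
    if (olderMessages.isEmpty()) {
        return "";
    }
    StringBuilder sb = new StringBuilder(
            "Summarize the following conversation in a few sentences, keeping any facts the user stated about themselves:\n\n");
    for (Message msg : olderMessages) {
        sb.append(msg.getRole()).append(": ").append(msg.getContent()).append("\n");
    }
    return chatClient.call(sb.toString());
}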
3. User profiles and personalization
Personalize replies based on the user's conversation history:
java
@Service
public class PersonalizedChatService {

    @Autowired
    private ChatClient chatClient;                  // used below

    @Autowired
    private UserProfileService userProfileService;  // used below

    public String chatWithPersonalization(String userId, String message) {
        // 1. Load the user profile
        UserProfile profile = userProfileService.getProfile(userId);

        // 2. Build a personalized prompt
        String personalizedPrompt = buildPersonalizedPrompt(profile, message);

        // 3. Call the AI
        return chatClient.call(personalizedPrompt);
    }

    private String buildPersonalizedPrompt(UserProfile profile, String message) {
        return String.format(
                "你是一个AI助手。用户信息:%s,偏好:%s\n\n" +
                "请根据用户特点回答以下问题:%s",
                profile.getInfo(),
                profile.getPreferences(),
                message
        );
    }
}
4. Deployment
Containerize it so it can scale horizontally:
dockerfile
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY target/chatbot-*.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
That's it for today. The project looks simple, but the problems you hit along the way are pretty typical: from basic features to performance tuning, from error handling to security, every step has details worth getting right. If you're building something similar, I'd love to trade notes on the pitfalls.
The full code is on GitHub for anyone who wants it; the link is in the comments. Don't forget to leave a star, haha.