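Spring AI Model API Calls in Practice: Error Handling, Retries, and Circuit-Breaker Fallback

When you call an external LLM API, network jitter, rate limiting, and server-side 500s are routine. If your code throws a stack trace at the user on the first error, the experience is close to zero. This post starts from basic exception catching and works up through retries and circuit-breaker fallback, so you end up with a complete set of defenses.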

1. Know your enemy: common error types

Before you can handle an error, you have to know which error it is. Here is a table that sorts the common cases:

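HTTP status / condition | Error type | How to handle
401 | API key invalid or expired | Alert, do not retry
403 | No permission (model not enabled for the account) | Alert, do not retry
429 | Rate limited (requests too frequent) | Wait and retry, or switch to a backup model
500/502/503 | Server-side error | Retry 1-3 times
Timeout | Network or model response too slow | Retry, or degrade
Insufficient balance | API account quota used up | Alert, switch to backup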

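In Spring AI, these HTTP errors typically surface as subclasses of HttpClientErrorException (4xx) and HttpServerErrorException (5xx), so branching on them is straightforward. Depending on the Spring AI version and the response error handler in use, they may instead arrive as Spring AI's own TransientAiException / NonTransientAiException, so it is worth checking which types actually reach your code.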
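
The status-code rows of the table can be sketched as a small lookup. This is only an illustration: the class name ErrorPolicy and the Action enum are hypothetical, the whole 5xx range is treated like the 500/502/503 row, and the timeout and insufficient-balance rows are left out because they do not map to a single status code.

```java
/** Hypothetical helper: maps an upstream HTTP status to the handling strategy from the table. */
final class ErrorPolicy {

    /** Strategy names invented for this sketch. */
    enum Action { ALERT_NO_RETRY, WAIT_THEN_RETRY, RETRY_LIMITED, NOT_COVERED }

    static Action forStatus(int status) {
        if (status == 401 || status == 403) {
            return Action.ALERT_NO_RETRY;   // bad key or missing permission: retrying cannot help
        }
        if (status == 429) {
            return Action.WAIT_THEN_RETRY;  // rate limited: wait, or switch to a backup model
        }
        if (status >= 500 && status <= 599) {
            return Action.RETRY_LIMITED;    // server-side error: retry 1-3 times
        }
        return Action.NOT_COVERED;          // anything else is outside the table
    }

    public static void main(String[] args) {
        for (int status : new int[] {401, 403, 429, 500, 503}) {
            System.out.println(status + " -> " + forStatus(status));
        }
    }
}
```

Keeping the decision in one place like this makes it easier to reuse the same rules in a controller, a retry loop, or a global exception handler.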

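2. Basic exception handling: stop swallowing everything in one big catch

This approach is not elegant, so treat it as background. The most basic layer is to handle each exception type separately instead of lumping every case into a single catch (Exception e):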

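import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.HttpClientErrorException;
import org.springframework.web.client.HttpServerErrorException;

@RestController
public class ChatController {

    private static final Logger log = LoggerFactory.getLogger(ChatController.class);

    private final ChatClient chatClient;

    public ChatController(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/chat")
    public ResponseEntity<String> chat(@RequestParam String prompt) {
        try {
            // Same fluent ChatClient API as the later examples
            String response = chatClient.prompt().user(prompt).call().content();
            return ResponseEntity.ok(response);
        } catch (HttpClientErrorException.Unauthorized e) {
            // 401: raise an alert, do not retry
            log.error("Invalid API key", e);
            return ResponseEntity.status(401).body("Authentication failed, please check the API key");
        } catch (HttpClientErrorException.Forbidden e) {
            // 403: no permission for this model
            log.error("Access denied", e);
            return ResponseEntity.status(403).body("Not allowed to use this model");
        } catch (HttpClientErrorException.TooManyRequests e) {
            // 429: candidate for retry or a backup model
            log.warn("Rate limit hit");
            return ResponseEntity.status(429).body("Too many requests, please try again later");
        } catch (HttpServerErrorException e) {
            // 5xx: candidate for retry
            log.error("Server-side error", e);
            return ResponseEntity.status(502).body("AI service is temporarily unavailable");
        }
    }
}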

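But this only gets you as far as "report the error when it happens". For intermittent errors such as 429 and 5xx, the more graceful move is to wait a moment and try again.

3. Retry: give the request another chance

Option 1: Spring Retry (recommended, declarative retry)

Add the dependencies: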

<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-aspects</artifactId>
</dependency>

Add @EnableRetry to the application class:

@SpringBootApplication
@EnableRetry
public class AiApplication {
    //...
}

Put @Retryable on the Service method that should be retried; once every attempt has failed, @Recover acts as the fallback:

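import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;
import org.springframework.web.client.HttpClientErrorException;
import org.springframework.web.client.HttpServerErrorException;

@Service
public class AiService {

    private static final Logger log = LoggerFactory.getLogger(AiService.class);

    private final ChatClient chatClient;

    public AiService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    // Retry only 5xx and 429: up to 3 attempts, waiting 1s and then 2s between them
    @Retryable(
        retryFor = {HttpServerErrorException.class, HttpClientErrorException.TooManyRequests.class},
        maxAttempts = 3,
        backoff = @Backoff(delay = 1000, multiplier = 2)
    )
    public String callWithRetry(String prompt) {
        return chatClient.prompt().user(prompt).call().content();
    }

    // Fallback: same return type, the exception first, then the original arguments
    @Recover
    public String recover(Exception e, String prompt) {
        log.error("All retries failed, falling back", e);
        return "AI service is temporarily unavailable, please try again later";
    }
}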
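
To make the backoff settings concrete: with delay = 1000 and multiplier = 2, three attempts leave room for two waits, 1000 ms and then 2000 ms. The sketch below reproduces only that arithmetic. It is not Spring Retry's code; the class name BackoffSchedule is hypothetical, and the maxDelayMs parameter is an assumed stand-in for a cap on the interval.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the waits implied by an exponential backoff setting. */
final class BackoffSchedule {

    static List<Long> waits(int maxAttempts, long delayMs, double multiplier, long maxDelayMs) {
        List<Long> waits = new ArrayList<>();
        double next = delayMs;
        // There is a wait after every attempt except the last one
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            waits.add(Math.min((long) next, maxDelayMs));
            next *= multiplier;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Mirrors maxAttempts = 3, delay = 1000, multiplier = 2
        System.out.println(waits(3, 1000, 2.0, 30_000)); // prints [1000, 2000]
    }
}
```

In the worst case the caller therefore waits about 3 seconds on top of three request latencies, which is worth keeping in mind when choosing maxAttempts for a user-facing endpoint.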

The Controller only needs to call the Service:

@GetMapping("/chat-retry")
public ResponseEntity<String> chatRetry(@RequestParam String prompt) {
    String result = aiService.callWithRetry(prompt);
    return ResponseEntity.ok(result);
}

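Option 2: manual retry that honors the Retry-After header

Some APIs include a Retry-After response header with a 429, telling you how many seconds to wait. @Retryable cannot read response headers directly, so a manual loop fits better here: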

package com.jichi.springaialibaba.service;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;
import org.springframework.web.client.HttpClientErrorException;
import org.springframework.web.client.HttpServerErrorException;

@Service
public class ManualRetryChatService {

    private static final Logger log = LoggerFactory.getLogger(ManualRetryChatService.class);

    private final ChatClient chatClient;

    public ManualRetryChatService(@Qualifier("primaryChatClient") ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    public String chatWithManualRetry(String message) {
        int maxAttempts = 3;
        long delayMs = 1000;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return chatClient.prompt()
                        .user(message)
                        .call()
                        .content();

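            } catch (HttpClientErrorException.TooManyRequests e) {
                if (attempt == maxAttempts) {
                    throw new RuntimeException("Rate limit exceeded, please try again later", e);
                }
                // Prefer the Retry-After response header; fall back to the default delay without it
                String retryAfter = e.getResponseHeaders() != null
                        ? e.getResponseHeaders().getFirst("Retry-After") : null;
                long waitMs = parseRetryAfterMs(retryAfter, delayMs);
                log.warn("Rate limited, retrying in {}ms (attempt {}/{})", waitMs, attempt, maxAttempts);
                sleep(waitMs);

            } catch (HttpServerErrorException e) {
                if (attempt == maxAttempts) throw new RuntimeException("AI service error", e);
                log.warn("Server error, retrying in {}ms (attempt {}/{})", delayMs, attempt, maxAttempts);
                sleep(delayMs);
                delayMs *= 2; // exponential backoff

            } catch (Exception e) {
                // Anything else (401, 403, ...) is not retried
                throw new RuntimeException("AI call failed", e);
            }
        }

        // Not reached when maxAttempts >= 1: the last attempt either returns or throws
        return "AI service is temporarily unavailable";
    }

    // Only the seconds form of Retry-After is understood; anything else uses the default delay
    private long parseRetryAfterMs(String retryAfter, long defaultMs) {
        if (retryAfter == null) return defaultMs;
        try {
            return Long.parseLong(retryAfter.trim()) * 1000;
        } catch (NumberFormatException e) {
            return defaultMs;
        }
    }

    private void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("Retry was interrupted", e);
        }
    }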
}
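
The manual loop above reads Retry-After as a number of seconds. The HTTP specification also allows the header to carry an HTTP date, so a more tolerant parser might look like the sketch below. The class name RetryAfterParser, the fallback parameter, and the injected Clock are assumptions made for this example (the Clock only exists to keep the date branch testable).

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper: turns a Retry-After header value into a wait duration. */
final class RetryAfterParser {

    static Duration parse(String headerValue, Duration fallback, Clock clock) {
        if (headerValue == null || headerValue.isBlank()) {
            return fallback;
        }
        String value = headerValue.trim();
        try {
            // Form 1: delay in seconds, e.g. "120"
            long seconds = Long.parseLong(value);
            return seconds >= 0 ? Duration.ofSeconds(seconds) : fallback;
        } catch (NumberFormatException notSeconds) {
            // Not a number, try the date form next
        }
        try {
            // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            Instant retryAt = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
            Duration wait = Duration.between(clock.instant(), retryAt);
            return wait.isNegative() ? Duration.ZERO : wait; // a date in the past means "retry now"
        } catch (DateTimeParseException notADate) {
            return fallback; // unparseable header: use the caller's default
        }
    }

    public static void main(String[] args) {
        Duration fallback = Duration.ofSeconds(1);
        System.out.println(parse("120", fallback, Clock.systemUTC()));     // prints PT2M
        System.out.println(parse("garbage", fallback, Clock.systemUTC())); // prints PT1S
    }
}
```

In a real service it is also sensible to cap the returned wait, since an upstream could ask for a delay far longer than a web request should block.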

The corresponding Controller:

package com.jichi.springaialibaba.controller;

import com.jichi.springaialibaba.service.ManualRetryChatService;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/manual-retry")
public class ManualRetryChatController {

    private final ManualRetryChatService manualRetryChatService;

    public ManualRetryChatController(ManualRetryChatService manualRetryChatService) {
        this.manualRetryChatService = manualRetryChatService;
    }

    @GetMapping
    public String chat(@RequestParam String message) {
        return manualRetryChatService.chatWithManualRetry(message);
    }
}

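4. Circuit breaking and fallback: avoiding cascading failure

Retries only cover intermittent problems. If the model API stays down (say, for five minutes straight) and every request still retries three times, the whole system slows to a crawl. That is where a circuit breaker comes in: once the error rate crosses a threshold, calls fail fast without touching the real API, and after a waiting period the breaker automatically probes for recovery.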
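
The closed / open / half-open cycle described here can be sketched in a few dozen lines. This is a toy model for intuition only, not how Resilience4j is implemented: the class name ToyCircuitBreaker is hypothetical, a single trial call decides the half-open state, and the clock is injected as a LongSupplier so the behaviour can be exercised without sleeping.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Toy count-based circuit breaker, for illustration only. */
final class ToyCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;
    private final double failureRateThreshold; // between 0.0 and 1.0
    private final long openWaitMs;
    private final LongSupplier clockMs;
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAtMs;

    ToyCircuitBreaker(int windowSize, double failureRateThreshold, long openWaitMs, LongSupplier clockMs) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.openWaitMs = openWaitMs;
        this.clockMs = clockMs;
    }

    synchronized State state() {
        return state;
    }

    /** Runs the action unless the breaker is open; failures and rejections return the fallback. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (!tryAcquire()) {
            return fallback.get(); // fail fast, the real API is not touched
        }
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException e) {
            record(true);
            return fallback.get();
        }
    }

    private synchronized boolean tryAcquire() {
        if (state == State.OPEN) {
            if (clockMs.getAsLong() - openedAtMs < openWaitMs) {
                return false;
            }
            state = State.HALF_OPEN; // wait is over, let a trial call through
        }
        return true;
    }

    private synchronized void record(boolean failed) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                trip();                  // still broken, open again
            } else {
                state = State.CLOSED;    // recovered
                window.clear();
            }
            return;
        }
        window.addLast(failed);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
        if (window.size() == windowSize) {
            long failures = window.stream().filter(f -> f).count();
            if ((double) failures / windowSize >= failureRateThreshold) {
                trip();
            }
        }
    }

    private void trip() {
        state = State.OPEN;
        openedAtMs = clockMs.getAsLong();
        window.clear();
    }
}
```

A production breaker adds things this sketch leaves out, such as several trial calls in the half-open state, time-based windows, slow-call detection, and metrics.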

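Resilience4j is the go-to choice in the Spring ecosystem. Its annotations rely on Spring AOP, so spring-boot-starter-aop is also expected on the classpath. Add the dependency: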

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
    <version>2.3.0</version>
</dependency>

Configure the circuit breaker:

resilience4j:
  circuitbreaker:
    instances:
      aiService:
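        sliding-window-size: 10           # evaluate the last 10 calls
        failure-rate-threshold: 50        # open the circuit once the failure rate reaches 50%
        wait-duration-in-open-state: 30s  # stay open for 30s before moving to half-open
        permitted-number-of-calls-in-half-open-state: 3  # allow 3 trial calls while half-open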
  retry:
    instances:
      aiService:
        max-attempts: 3
        wait-duration: 1s
        retry-exceptions:
          - org.springframework.web.client.HttpServerErrorException

Then apply the circuit breaker and retry to the service method:

package com.jichi.springaialibaba.service;

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

@Service
public class ResilientChatService {

    private static final Logger log = LoggerFactory.getLogger(ResilientChatService.class);

    private final ChatClient primaryChatClient;
    private final ChatClient backupChatClient;

    public ResilientChatService(
            @Qualifier("primaryChatClient") ChatClient primaryChatClient,
            @Qualifier("backupChatClient") ChatClient backupChatClient) {
        this.primaryChatClient = primaryChatClient;
        this.backupChatClient = backupChatClient;
    }

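    /**
     * By default Resilience4j wraps Retry around CircuitBreaker, so every attempt
     * passes through the breaker and is recorded by it. The fallback sits on @Retry
     * so that it runs only after the retries are exhausted or the breaker rejects the call.
     * Placed on @CircuitBreaker, it would swallow the first failure and no retry would happen.
     */
    @CircuitBreaker(name = "aiService")
    @Retry(name = "aiService", fallbackMethod = "fallbackChat")
    public String chat(String message) {
        return primaryChatClient.prompt()
                .user(message)
                .call()
                .content();
    }

    /**
     * Fallback: switch to the backup model when the primary model is unavailable.
     * Same signature as the original method, plus a trailing Throwable parameter.
     */
    public String fallbackChat(String message, Throwable throwable) {
        log.warn("Primary model unavailable ({}), switching to the backup model", throwable.getMessage());
        try {
            return backupChatClient.prompt()
                    .user(message)
                    .call()
                    .content();
        } catch (Exception e) {
            log.error("Backup model is unavailable too", e);
            return "AI service is temporarily unavailable, please try again later. For urgent needs, please contact support.";
        }
    }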
}

The corresponding Controller:

package com.jichi.springaialibaba.controller;

import com.jichi.springaialibaba.service.ResilientChatService;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/resilient")
public class ResilientChatController {

    private final ResilientChatService resilientChatService;

    public ResilientChatController(ResilientChatService resilientChatService) {
        this.resilientChatService = resilientChatService;
    }

    @GetMapping
    public String chat(@RequestParam String message) {
        return resilientChatService.chat(message);
    }
}

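5. Global exception handling: keep the Controller clean

Earlier we wrote quite a few try-catch blocks in the Controller, which is repetitive and ugly. Use @RestControllerAdvice to intercept AI-related exceptions in one place and return a uniform error structure.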

package com.jichi.springaialibaba.exception;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.client.HttpClientErrorException;
import org.springframework.web.client.HttpServerErrorException;

import java.util.concurrent.TimeoutException;

@RestControllerAdvice
public class AiExceptionHandler {

    private static final Logger log = LoggerFactory.getLogger(AiExceptionHandler.class);

    record ErrorResponse(String code, String message) {}

    @ExceptionHandler(HttpClientErrorException.Unauthorized.class)
    public ResponseEntity<ErrorResponse> handleUnauthorized(HttpClientErrorException.Unauthorized e) {
        log.error("Invalid API key", e);
        return ResponseEntity.status(HttpStatus.UNAUTHORIZED)
                .body(new ErrorResponse("UNAUTHORIZED", "API key is invalid or has expired"));
    }

    @ExceptionHandler(HttpClientErrorException.Forbidden.class)
    public ResponseEntity<ErrorResponse> handleForbidden(HttpClientErrorException.Forbidden e) {
        log.error("Insufficient permissions", e);
        return ResponseEntity.status(HttpStatus.FORBIDDEN)
                .body(new ErrorResponse("FORBIDDEN", "Not allowed to access this model"));
    }

    @ExceptionHandler(HttpClientErrorException.TooManyRequests.class)
    public ResponseEntity<ErrorResponse> handleRateLimit(HttpClientErrorException.TooManyRequests e) {
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
                .body(new ErrorResponse("RATE_LIMIT", "Too many requests, please try again later"));
    }

    @ExceptionHandler(HttpServerErrorException.class)
    public ResponseEntity<ErrorResponse> handleServerError(HttpServerErrorException e) {
        log.error("AI server-side error", e);
        return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
                .body(new ErrorResponse("AI_SERVICE_ERROR", "AI service is temporarily unavailable"));
    }

    @ExceptionHandler(TimeoutException.class)
    public ResponseEntity<ErrorResponse> handleTimeout(TimeoutException e) {
        return ResponseEntity.status(HttpStatus.GATEWAY_TIMEOUT)
                .body(new ErrorResponse("TIMEOUT", "AI response timed out, please retry"));
    }
}

With this in place, the Controller can focus on business logic while exceptions are handled in one spot, and the code gets a lot cleaner.
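
One caveat: the TimeoutException handler only fires if some code actually throws java.util.concurrent.TimeoutException. A minimal way to produce one is to run the model call on another thread and bound the wait, as sketched below. The class name TimeoutGuard and the method callWithTimeout are hypothetical, and the sketch assumes the blocking call may safely keep running in the background after the caller gives up.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper: bounds how long the caller waits for a blocking call. */
final class TimeoutGuard {

    static <T> T callWithTimeout(Supplier<T> call, Duration timeout)
            throws TimeoutException, InterruptedException {
        // Runs on the common ForkJoinPool; a dedicated executor may suit production better
        CompletableFuture<T> future = CompletableFuture.supplyAsync(call);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; the underlying task is not interrupted
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Rethrow the original failure so type-specific handlers still match
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException) {
                throw (RuntimeException) cause;
            }
            if (cause instanceof Error) {
                throw (Error) cause;
            }
            throw new IllegalStateException(cause);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(callWithTimeout(() -> "ok", Duration.ofSeconds(1))); // prints ok
    }
}
```

Configuring connect and read timeouts on the underlying HTTP client is usually the first line of defence, because it also releases the connection; a wrapper like this only limits how long the caller waits.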

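6. Summary

A complete defense for AI calls should be thought of in layers:

  • Basic exception handling: respond differently to each type of error

  • Retry: immediate compensation for intermittent errors (declarative with Spring Retry / manual with Retry-After)

  • Circuit breaking and fallback: fail fast during prolonged outages (Resilience4j)

  • Timeout control: avoid blocked threads

  • Global exception interception: one consistent output to callers

Add this combination to your project and your AI calls go from "breaks at a touch" to rock solid. 鸡哥's blog will keep posting more hands-on details, so stay tuned!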
