Enterprise Java + LangChain4j RAG System: Rate Limiting, Circuit Breaking, and Degradation
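1. About This Document
This document is based on an enterprise AI knowledge-base RAG project built with Spring Boot 3 + LangChain4j + Milvus/Chroma + MySQL + Redis. It brings together the mainstream options for rate limiting, circuit breaking, and degradation of API endpoints, and includes source code, configuration, scenario-based selection guidelines, production rollout standards, and key interview topics.
The code is intended as a replacement for Sentinel that avoids dependency conflicts and can be deployed as-is. It covers every business scenario in the project: AI Q&A, document parsing, and chunked upload of large files.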
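2. Core Concepts (Essential for Production)
2.1 Rate Limiting (RateLimit)
Caps the QPS of an endpoint so that excessive request volume cannot overwhelm the service. It addresses traffic spikes and malicious hammering of endpoints.
Applies to: high-frequency AI Q&A, document upload, and batch parsing endpoints.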
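To make the idea of capping QPS concrete, here is a minimal token-bucket sketch in plain Java. It is an illustration only, not the implementation used by any of the libraries in this document; the class name `SimpleTokenBucket` and the injectable clock are assumptions introduced for the example.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket (hypothetical class, not a library API). */
final class SimpleTokenBucket {
    private final long capacity;          // maximum burst size
    private final double refillPerNano;   // tokens added per nanosecond
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity;           // the bucket starts full
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Consumes one token if available and returns true; never blocks. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Refill in proportion to elapsed time, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

With a capacity of 5 and a refill rate of 5 tokens per second, a burst of requests is admitted until the bucket is empty, and further requests are rejected until time passes and tokens are refilled.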
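2.2 Circuit Breaking (CircuitBreaker)
When a dependency (LLM API, vector store, database) times out or its error rate climbs too high, the breaker automatically cuts off calls to it. This avoids request pile-up and blocked threads, and prevents cascading failure.
A hard requirement for AI projects: LLM APIs are unstable, slow to respond, and prone to timeouts.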
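The CLOSED → OPEN → HALF_OPEN cycle behind this behavior can be sketched in a few lines of plain Java. This is a simplified illustration under stated assumptions, not how Resilience4j is implemented: `SimpleCircuitBreaker` is a hypothetical class, it uses a count-based window, a single trial call in the half-open state, and it holds a lock for the duration of each call, which a real library would not do.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative count-based circuit breaker (hypothetical class, not a library API). */
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;               // number of recent calls evaluated
    private final double failureRateThreshold;  // percent, e.g. 50.0
    private final long openMillis;              // how long to stay OPEN
    private final LongSupplier clockMillis;     // injectable clock
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;

    SimpleCircuitBreaker(int windowSize, double failureRateThreshold,
                         long openMillis, LongSupplier clockMillis) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.openMillis = openMillis;
        this.clockMillis = clockMillis;
    }

    synchronized <T> T call(Supplier<T> action) {
        if (state == State.OPEN) {
            if (clockMillis.getAsLong() - openedAt < openMillis) {
                // Fail fast instead of waiting on a dependency that is known to be unhealthy.
                throw new IllegalStateException("circuit open: call rejected");
            }
            state = State.HALF_OPEN; // wait time elapsed: allow one trial call
        }
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException e) {
            record(true);
            throw e;
        }
    }

    private void record(boolean failed) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                open();
            } else {
                state = State.CLOSED;
                window.clear();
            }
            return;
        }
        window.addLast(failed);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
        if (window.size() == windowSize) {
            long failures = window.stream().filter(Boolean::booleanValue).count();
            if (failures * 100.0 / windowSize >= failureRateThreshold) {
                open();
            }
        }
    }

    private void open() {
        state = State.OPEN;
        openedAt = clockMillis.getAsLong();
        window.clear();
    }

    synchronized State state() {
        return state;
    }
}
```

While the breaker is open, callers are rejected immediately, so no thread sits waiting on a slow LLM call; after the wait time a single successful trial call closes the breaker again.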
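2.3 Degradation (Fallback)
When a service is circuit-broken or throws, return a preset fallback result so that the service stays available, callers get a well-formed response instead of an error, and the failure does not cascade.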
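As a minimal sketch of the fallback idea, the helper below runs a primary call and substitutes a preset result when it throws. The class and method names (`Fallbacks`, `callWithFallback`) are hypothetical and exist only for this illustration.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative fallback helper (hypothetical, not a library API). */
final class Fallbacks {
    private Fallbacks() {
    }

    /** Runs the primary call; on a runtime failure, returns the fallback result instead. */
    static <T> T callWithFallback(Supplier<T> primary, Function<Throwable, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // The caller still receives a well-formed answer instead of an error.
            return fallback.apply(e);
        }
    }
}
```

For example, `Fallbacks.callWithFallback(() -> llm.answer(q), e -> "AI service is busy")` would return the preset message whenever the (hypothetical) `llm.answer` call fails.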
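3. Capability Comparison of Mainstream Solutions
| Solution | Rate limiting | Circuit breaking / degradation | Distributed cluster | Performance | Ops cost | Best suited for |
|---|---|---|---|---|---|---|
| Resilience4j | ✅ | ✅ full-featured | Needs Redis alongside | Very high | Very low | Monoliths/microservices; first choice for AI projects |
| Sentinel | ✅ | ✅ | ✅ | High | Medium (needs the dashboard) | Alibaba ecosystem; dynamic rules with a visual console |
| Redisson | ✅ distributed | ❌ | ✅ strong fit | High | Very low | Cluster-wide global rate limiting |
| Bucket4j | ✅ high-performance | ❌ | Supports Redis | Extremely high | Very low | Large file upload, high-concurrency endpoints |
| Guava RateLimiter | ✅ single instance | ❌ | ❌ | Very high | None | Small internal projects, simple anti-abuse |
| Redis+Lua | ✅ native | ❌ | ✅ | High | Very low | Framework-free, minimal stacks |
| Hystrix | Concurrency isolation only (thread pool/semaphore) | ✅ | ❌ per instance | Low | High | Legacy projects; in maintenance mode, avoid in new projects |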
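4. Selection Decision Flow (Reusable)
Start → decide whether you need circuit breaking/degradation and timeout protection against cascading failure
4.1 No circuit breaking needed (rate limiting only)
- Small single-instance project → Guava RateLimiter
- High concurrency / large file upload → Bucket4j
- Multi-instance cluster → Redisson / Redis+Lua
- Framework-free, home-grown → Redis+Lua
4.2 Circuit breaking and degradation needed (mandatory for AI projects)
- Alibaba ecosystem, visual console required → Sentinel
- Outside the Alibaba ecosystem, lightweight, no vendor lock-in → Resilience4j
- Single-instance deployment → Resilience4j for everything
- Cluster deployment → Resilience4j (circuit breaking) + Redisson (distributed rate limiting)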
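The decision flow above can also be written as a small function, which makes the branch order explicit. This is only a sketch: `LimiterSelector` and its parameters are hypothetical names, and the precedence between the rate-limiting-only branches (framework-free first, then cluster, then high concurrency) is an assumption, since the list above does not rank them.

```java
/** Illustrative encoding of the selection flow (hypothetical class). */
final class LimiterSelector {
    enum Deployment { SINGLE_INSTANCE, CLUSTER }

    private LimiterSelector() {
    }

    static String recommend(boolean needsCircuitBreaker,
                            boolean alibabaEcosystem,
                            Deployment deployment,
                            boolean highConcurrencyOrLargeUpload,
                            boolean frameworkFree) {
        if (needsCircuitBreaker) {
            // 4.2: circuit breaking and degradation needed
            if (alibabaEcosystem) {
                return "Sentinel";
            }
            return deployment == Deployment.CLUSTER
                    ? "Resilience4j + Redisson"
                    : "Resilience4j";
        }
        // 4.1: rate limiting only (branch precedence is an assumption)
        if (frameworkFree) {
            return "Redis+Lua";
        }
        if (deployment == Deployment.CLUSTER) {
            return "Redisson / Redis+Lua";
        }
        if (highConcurrencyOrLargeUpload) {
            return "Bucket4j";
        }
        return "Guava RateLimiter";
    }
}
```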
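5. Final Production Stack for This RAG Project
A fixed choice, so selection does not have to be revisited
- Circuit breaking, degradation, timeouts, cascading-failure protection: Resilience4j (integrated through Spring Cloud Circuit Breaker, no vendor lock-in)
- Single-instance QPS limiting: Resilience4j's built-in rate limiter
- Cluster-wide distributed rate limiting: Redisson
- High-concurrency limiting for chunked large-file upload: Bucket4j
- Minimal alternatives: Guava, Redis+Lua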
6. Full Project Dependencies (pom.xml)
Combines all of the rate limiting and circuit breaking options with the core RAG dependencies; can replace the existing pom directly
XML
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.6</version>
        <relativePath/>
    </parent>
    <groupId>com.ai</groupId>
    <artifactId>langchain4j-enterprise-rag</artifactId>
    <version>1.0.0</version>
    <name>LangChain4j Enterprise RAG System</name>
    <properties>
        <java.version>17</java.version>
        <langchain4j.version>0.32.0</langchain4j.version>
        <mybatis-plus.version>3.5.5</mybatis-plus.version>
        <fastjson2.version>2.0.52</fastjson2.version>
    </properties>
    <dependencies>
        <!-- Spring Boot basics -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Required for the Resilience4j annotations (@CircuitBreaker, @RateLimiter) to take effect -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <!-- Database -->
        <dependency>
            <groupId>com.mysql</groupId>
            <artifactId>mysql-connector-j</artifactId>
            <scope>runtime</scope>
        </dependency>
        <!-- Spring Boot 3 needs the boot3 variant of the MyBatis-Plus starter -->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-spring-boot3-starter</artifactId>
            <version>${mybatis-plus.version}</version>
        </dependency>
        <!-- LangChain4j RAG core -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-spring-boot-starter</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Tongyi Qianwen models are published as the DashScope integration -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-dashscope</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-milvus</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-chroma</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-tika</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Rate limiting and circuit breaking: all options -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
            <version>3.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.redisson</groupId>
            <artifactId>redisson-spring-boot-starter</artifactId>
            <version>3.27.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.vladimir-bukhtoyarov</groupId>
            <artifactId>bucket4j-core</artifactId>
            <version>7.6.0</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>32.1.3-jre</version>
        </dependency>
        <!-- Utilities -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>${fastjson2.version}</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.8.30</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
7. Global Configuration (application-dev.yml)
YAML
spring:
  datasource:
    url: jdbc:mysql://127.0.0.1:3306/rag_db?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&allowMultiQueries=true
    username: root
    password: 123456
    driver-class-name: com.mysql.cj.jdbc.Driver
  data:
    redis:
      host: 127.0.0.1
      port: 6379
      password:
      database: 0
# Tongyi Qianwen LLM settings
langchain4j:
  tongyi:
    api-key: sk-xxx-your-key-xxx
    model-name: qwen-turbo
    timeout: 60s
# Vector store settings
# (nesting assumed: top-level custom keys read by the project's own configuration classes)
milvus:
  host: 127.0.0.1
  port: 19530
  collection-name: enterprise_knowledge
chroma:
  host: 127.0.0.1
  port: 8000
# Resilience4j circuit breaker + rate limiter settings
resilience4j:
  circuitbreaker:
    instances:
      aiChatCircuit:
        slidingWindowSize: 10                      # evaluate the most recent 10 calls
        failureRateThreshold: 50                   # open the circuit at a 50% failure rate
        waitDurationInOpenState: 10s               # stay open for 10 s before probing
        permittedNumberOfCallsInHalfOpenState: 3   # trial calls allowed while half-open
      uploadCircuit:
        slidingWindowSize: 10
        failureRateThreshold: 50
  ratelimiter:
    instances:
      aiChatLimit:
        limitForPeriod: 5        # 5 permits per refresh period
        limitRefreshPeriod: 1s
        timeoutDuration: 2s      # wait at most 2 s for a permit
      uploadLimit:
        limitForPeriod: 2
        limitRefreshPeriod: 1s
8. Configuration Classes
8.1 Bucket4jConfig.java: High-Performance Rate Limiting Configuration
java
package com.ai.rag.config;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.time.Duration;

@Configuration
public class Bucket4jConfig {
    // AI chat: capacity of 5 tokens, refilled greedily at 5 tokens per second.
    // The bucket lives in this JVM's memory, so the limit applies per instance.
    @Bean
    public Bucket aiChatBucket() {
        Bandwidth bandwidth = Bandwidth.classic(5, Refill.greedy(5, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }

    // Upload: capacity of 2 tokens, refilled greedily at 2 tokens per second.
    @Bean
    public Bucket uploadBucket() {
        Bandwidth bandwidth = Bandwidth.classic(2, Refill.greedy(2, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }
}
8.2 GuavaLimitConfig.java: Lightweight Single-Instance Rate Limiting Configuration
java
package com.ai.rag.config;

import com.google.common.util.concurrent.RateLimiter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GuavaLimitConfig {
    // AI chat: 5 permits per second, local to this instance
    @Bean
    public RateLimiter aiGuavaLimiter() {
        return RateLimiter.create(5.0);
    }

    // Upload: 2 permits per second, local to this instance
    @Bean
    public RateLimiter uploadGuavaLimiter() {
        return RateLimiter.create(2.0);
    }
}
9. Rate Limiting Utility Classes
9.1 RedisLimitUtil.java: Distributed Rate Limiting with Redisson
java
package com.ai.rag.util;

// Spring Boot 3 uses the jakarta.* namespace; javax.annotation.Resource is not on the classpath
import jakarta.annotation.Resource;
import org.redisson.api.RRateLimiter;
import org.redisson.api.RateIntervalUnit;
import org.redisson.api.RateType;
import org.redisson.api.RedissonClient;
import org.springframework.stereotype.Component;

@Component
public class RedisLimitUtil {
    @Resource
    private RedissonClient redissonClient;

    public boolean tryLimit(String key, int qps) {
        RRateLimiter limiter = redissonClient.getRateLimiter(key);
        // OVERALL: the rate is shared by every instance using this key.
        // trySetRate only applies the rate if none has been set for the key yet.
        limiter.trySetRate(RateType.OVERALL, qps, 1, RateIntervalUnit.SECONDS);
        return limiter.tryAcquire(1);
    }
}
9.2 LuaLimitUtil.java: Native Rate Limiting with Redis + Lua
java
package com.ai.rag.util;

// Spring Boot 3 uses the jakarta.* namespace; javax.annotation.Resource is not on the classpath
import jakarta.annotation.Resource;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;
import java.util.List;

@Component
public class LuaLimitUtil {
    @Resource
    private StringRedisTemplate stringRedisTemplate;

    // Fixed one-second window: the key counts permitted requests and expires
    // one second after the first request of the window. The TTL is set only
    // when the counter is created, so later requests do not extend the window.
    private static final String LUA_SCRIPT =
            "local key = KEYS[1] " +
            "local limit = tonumber(ARGV[1]) " +
            "local curr = tonumber(redis.call('get', key) or '0') " +
            "if curr + 1 > limit then " +
            "  return 0 " +
            "end " +
            "curr = redis.call('incr', key) " +
            "if curr == 1 then " +
            "  redis.call('expire', key, 1) " +
            "end " +
            "return 1";

    // Built once and reused instead of being re-created on every call
    private static final DefaultRedisScript<Long> SCRIPT =
            new DefaultRedisScript<>(LUA_SCRIPT, Long.class);

    public boolean tryLimit(String key, int limit) {
        Long result = stringRedisTemplate.execute(SCRIPT, List.of(key), String.valueOf(limit));
        return result != null && result == 1;
    }
}
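The counting logic in the Lua script is a fixed-window counter. As a way to reason about it without a Redis instance, the sketch below reproduces the same idea in memory; `FixedWindowCounter` is a hypothetical class written for this illustration, and the window reset stands in for the key's one-second expiry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Illustrative in-memory fixed-window counter (hypothetical, single JVM only). */
final class FixedWindowCounter {
    private static final class Window {
        long startMillis;
        int count;
    }

    private final ConcurrentHashMap<String, Window> windows = new ConcurrentHashMap<>();
    private final long windowMillis;
    private final LongSupplier clockMillis; // injectable clock, e.g. System::currentTimeMillis

    FixedWindowCounter(long windowMillis, LongSupplier clockMillis) {
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
    }

    boolean tryAcquire(String key, int limit) {
        long now = clockMillis.getAsLong();
        boolean[] allowed = {false};
        // compute() runs atomically per key, much like a Lua script runs atomically in Redis.
        windows.compute(key, (k, w) -> {
            if (w == null || now - w.startMillis >= windowMillis) {
                w = new Window();      // analogous to the key having expired
                w.startMillis = now;
            }
            if (w.count < limit) {
                w.count++;
                allowed[0] = true;
            }
            return w;
        });
        return allowed[0];
    }
}
```

A known property of fixed windows is that a burst straddling a window boundary can briefly admit up to twice the limit; a token bucket or sliding window avoids this if it matters for the endpoint.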
10. Consolidated Controller (All 5 Options)
Resilience4j circuit breaking and rate limiting are enabled by default on the main endpoints; each of the other 4 options gets its own endpoint so you can switch between them
java
package com.ai.rag.controller;

import com.ai.rag.common.R;
import com.ai.rag.service.DocumentService;
import com.ai.rag.service.RagQaService;
import com.ai.rag.util.LuaLimitUtil;
import com.ai.rag.util.RedisLimitUtil;
import com.google.common.util.concurrent.RateLimiter;
import io.github.bucket4j.Bucket;
// Spring Boot 3 uses the jakarta.* namespace; javax.annotation.Resource is not on the classpath
import jakarta.annotation.Resource;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/rag")
@RequiredArgsConstructor
public class RagController {
    private final DocumentService documentService;
    private final RagQaService ragQaService;

    // @Resource injects by field name, which selects between the two Bucket / RateLimiter beans
    @Resource
    private Bucket aiChatBucket;
    @Resource
    private Bucket uploadBucket;
    @Resource
    private RateLimiter aiGuavaLimiter;
    @Resource
    private RateLimiter uploadGuavaLimiter;
    @Resource
    private RedisLimitUtil redisLimitUtil;
    @Resource
    private LuaLimitUtil luaLimitUtil;

    // 1. Resilience4j circuit breaking + rate limiting (default production option).
    // The annotations are fully qualified because RateLimiter clashes with Guava's class.
    @GetMapping("/chat")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "aiChatLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "aiChatCircuit", fallbackMethod = "chatFallback")
    public R<String> chat(@RequestParam String sessionId,
                          @RequestParam String question) {
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @PostMapping("/upload")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "uploadLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "uploadCircuit", fallbackMethod = "uploadFallback")
    public R<String> upload(@RequestParam MultipartFile file) throws Exception {
        documentService.uploadAndEmbed(file);
        return R.ok("Document uploaded and embedded");
    }

    // 2. Bucket4j high-performance rate limiting
    @GetMapping("/chat/bucket4j")
    public R<String> chatByBucket4j(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiChatBucket.tryConsume(1)) {
            return R.fail("[Bucket4j] AI Q&A endpoint is rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 3. Guava lightweight rate limiting
    @GetMapping("/chat/guava")
    public R<String> chatByGuava(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiGuavaLimiter.tryAcquire()) {
            return R.fail("[Guava] Too many AI Q&A requests, please try again later");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 4. Redisson distributed rate limiting
    @GetMapping("/chat/redisson")
    public R<String> chatByRedisson(@RequestParam String sessionId, @RequestParam String question) {
        if (!redisLimitUtil.tryLimit("ai:chat:cluster:limit", 5)) {
            return R.fail("[Redisson] Cluster-wide rate limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 5. Redis + Lua native rate limiting
    @GetMapping("/chat/lua")
    public R<String> chatByLua(@RequestParam String sessionId, @RequestParam String question) {
        if (!luaLimitUtil.tryLimit("ai:chat:lua:limit", 5)) {
            return R.fail("[Lua] Request limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @DeleteMapping("/clear")
    public R<String> clear() {
        documentService.clearVectorStore();
        return R.ok("Vector store cleared");
    }

    // Shared fallback methods: same parameters as the protected method plus a trailing Throwable
    public R<String> chatFallback(String sessionId, String question, Throwable e) {
        return R.fail("AI service is busy and has been degraded by the circuit breaker, please try again later");
    }

    public R<String> uploadFallback(MultipartFile file, Throwable e) {
        return R.fail("Document upload service is unavailable and has been degraded");
    }
}
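11. Production Pitfalls and Rules
- ❌ Do not rely on rate limiting alone without circuit breaking: slow or timed-out LLM calls pile up and can bring the service down
- ❌ Do not use Guava or a local (in-memory) Bucket4j bucket in a cluster: each instance enforces its own limit, so the global limit is not enforced
- ❌ Do not use Hystrix or legacy home-grown circuit breakers in new projects
- ✅ Split responsibilities: Resilience4j handles all circuit breaking, while rate limiting is chosen per scenario
- ✅ Give large-file upload its own limiter so it does not consume the Q&A endpoint's quota
- ✅ In a cluster, pair with Redisson for global traffic control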
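12. Startup Order
- Start Redis, MySQL, and the Milvus/Chroma vector store
- Set the Tongyi Qianwen API key in the yml file
- Reload the Maven dependencies
- Start the Spring Boot application
- The default endpoint /api/rag/chat comes with circuit breaking, rate limiting, and degradation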
13. Endpoint Test Checklist
- Resilience4j (default): GET /api/rag/chat
- Bucket4j: GET /api/rag/chat/bucket4j
- Guava: GET /api/rag/chat/guava
- Redisson (distributed): GET /api/rag/chat/redisson
- Lua (native): GET /api/rag/chat/lua
- Document upload: POST /api/rag/upload
- Clear the vector store: DELETE /api/rag/clear
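To exercise the chat endpoints above, a small client that fires a burst of concurrent requests is one way to see which ones come back limited. The sketch below uses the JDK's built-in `java.net.http.HttpClient`; the class name `RagSmokeTest`, the base URL `http://localhost:8080` (Spring Boot's default port), and the sample question are assumptions, and the response body format depends on the project's `R` class, which is not shown here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Illustrative burst client for the chat endpoints (hypothetical helper). */
final class RagSmokeTest {
    private RagSmokeTest() {
    }

    /** Builds the GET URL with URL-encoded sessionId and question parameters. */
    static URI chatUri(String baseUrl, String path, String sessionId, String question) {
        String query = "sessionId=" + URLEncoder.encode(sessionId, StandardCharsets.UTF_8)
                + "&question=" + URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + path + "?" + query);
    }

    /** Sends all requests concurrently and returns the response bodies in send order. */
    static List<String> burst(URI uri, int requests) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        List<CompletableFuture<HttpResponse<String>>> inFlight = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            inFlight.add(client.sendAsync(request, HttpResponse.BodyHandlers.ofString()));
        }
        List<String> bodies = new ArrayList<>();
        for (CompletableFuture<HttpResponse<String>> future : inFlight) {
            bodies.add(future.join().body());
        }
        return bodies;
    }
}
```

With the application running, calling `RagSmokeTest.burst(RagSmokeTest.chatUri("http://localhost:8080", "/api/rag/chat/bucket4j", "s1", "What is RAG?"), 10)` should return a mix of normal answers and rate-limit messages, since the bucket is configured with 5 tokens.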