Enterprise Java + LangChain4j RAG System: Rate Limiting, Circuit Breaking, and Fallback

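1. Document Overview

This document is based on an enterprise AI knowledge-base RAG project built with Spring Boot 3 + LangChain4j + Milvus/Chroma + MySQL + Redis. It brings together the mainstream rate limiting, circuit breaking, and fallback options for API endpoints, with source code, configuration, selection guidelines by scenario, production rollout standards, and key interview topics.

The code is intended as a drop-in replacement for Sentinel, is designed to avoid dependency conflicts, and covers the project's business scenarios: AI Q&A, document parsing, and chunked upload of large files.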

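2. Core Concepts (Essential for Production)

2.1 Rate Limiting (RateLimit)

Caps the QPS of an endpoint so that excess request volume cannot overwhelm the service. It addresses traffic spikes and malicious endpoint abuse.

Applies to: high-frequency AI Q&A, document upload, and batch parsing endpoints.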
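
As a rough illustration of how QPS control works, the following is a minimal single-JVM token bucket. The class name `TokenBucketSketch` and the injectable clock are assumptions made for this example; it is a sketch of the general technique, not how Bucket4j or Resilience4j are implemented internally.

```java
import java.util.function.LongSupplier;

/** Minimal single-JVM token bucket (illustrative sketch only). */
class TokenBucketSketch {
    private final long capacity;         // maximum burst size
    private final double refillPerNano;  // tokens added per nanosecond
    private final LongSupplier clock;    // injectable nanosecond clock (hypothetical helper for testing)
    private double tokens;
    private long lastRefill;

    TokenBucketSketch(long capacity, double tokensPerSecond, LongSupplier clock) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.tokens = capacity;          // the bucket starts full
        this.lastRefill = clock.getAsLong();
    }

    /** Takes one token if available; never blocks. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Refill in proportion to elapsed time, capped at capacity
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket = new TokenBucketSketch(5, 5.0, System::nanoTime);
        int admitted = 0;
        for (int i = 0; i < 20; i++) {
            if (bucket.tryAcquire()) admitted++;
        }
        System.out.println("admitted " + admitted + " of 20 burst requests");
    }
}
```

Rejected callers get `false` immediately, which maps naturally onto returning a "too many requests" response instead of queuing work.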

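2.2 Circuit Breaking (CircuitBreaker)

When a dependency (LLM API, vector store, database) times out or its error rate climbs too high, the circuit breaker cuts off calls to it automatically. This avoids request pile-up and blocked threads, and keeps the failure from cascading into a service avalanche.

This is a hard requirement for AI projects: LLM APIs are unstable, slow to respond, and prone to timeouts.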
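
The state machine behind a circuit breaker can be sketched in plain Java. This is a simplified, count-based illustration under assumed names (`CircuitBreakerSketch`, `allowRequest`, `record`); it is not Resilience4j's implementation, which offers more states and options.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

/** Simplified count-based circuit breaker (illustrative sketch only). */
class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;              // calls evaluated while CLOSED
    private final double failureRateThreshold; // percent, e.g. 50.0
    private final long openMillis;             // how long to stay OPEN
    private final int halfOpenCalls;           // trial calls allowed while HALF_OPEN
    private final LongSupplier clockMillis;    // injectable clock (hypothetical helper for testing)
    private final Deque<Boolean> outcomes = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int trialsIssued;

    CircuitBreakerSketch(int windowSize, double failureRateThreshold, long openMillis,
                         int halfOpenCalls, LongSupplier clockMillis) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.openMillis = openMillis;
        this.halfOpenCalls = halfOpenCalls;
        this.clockMillis = clockMillis;
    }

    /** Returns false while the circuit is open; the move to HALF_OPEN happens lazily here. */
    synchronized boolean allowRequest() {
        if (state == State.OPEN && clockMillis.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
            outcomes.clear();
            trialsIssued = 0;
        }
        if (state == State.OPEN) return false;
        if (state == State.HALF_OPEN) {
            if (trialsIssued >= halfOpenCalls) return false;
            trialsIssued++;
        }
        return true;
    }

    /** Records the outcome of a call that was allowed through. */
    synchronized void record(boolean failed) {
        if (state == State.OPEN) return;
        int needed = (state == State.HALF_OPEN) ? halfOpenCalls : windowSize;
        outcomes.addLast(failed);
        if (outcomes.size() > needed) outcomes.removeFirst(); // slide the window
        if (outcomes.size() < needed) return;                 // not enough data yet
        long failures = outcomes.stream().filter(f -> f).count();
        double failureRate = 100.0 * failures / outcomes.size();
        if (failureRate >= failureRateThreshold) {
            state = State.OPEN;
            openedAt = clockMillis.getAsLong();
            outcomes.clear();
        } else if (state == State.HALF_OPEN) {
            state = State.CLOSED;                             // trial calls succeeded
            outcomes.clear();
        }
    }

    synchronized State state() { return state; }

    public static void main(String[] args) {
        CircuitBreakerSketch cb =
                new CircuitBreakerSketch(10, 50.0, 10_000, 3, System::currentTimeMillis);
        for (int i = 0; i < 10; i++) cb.record(i % 2 == 0);   // half of the calls fail
        System.out.println(cb.state() + ", allowRequest=" + cb.allowRequest());
    }
}
```

While the circuit is open, callers are rejected without touching the slow dependency, which is what keeps threads from piling up behind a timing-out LLM call.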

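2.3 Fallback (Degradation)

When a service is circuit-broken or throws an exception, return a predefined fallback result so that the service stays available and callers receive a controlled response instead of an error that cascades.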
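
In its simplest form, a fallback is a wrapper that swaps in a canned answer when the primary call fails. The helper below is a hypothetical sketch (`FallbackSketch.callWithFallback` is not part of any library) that shows the idea without annotations.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Minimal fallback wrapper (illustrative sketch only). */
class FallbackSketch {

    /** Runs the primary call; on any runtime exception, returns the fallback result instead. */
    static <T> T callWithFallback(Supplier<T> primary, Function<RuntimeException, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.apply(e);
        }
    }

    public static void main(String[] args) {
        String answer = callWithFallback(
                () -> { throw new IllegalStateException("LLM timeout"); },
                e -> "The AI service is busy. Please try again later.");
        System.out.println(answer);
    }
}
```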

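3. Capability Comparison of Mainstream Solutions

Solution | Rate limiting | Circuit breaking / fallback | Distributed cluster | Performance | Ops cost | Typical use
Resilience4j | ✅ | All-round | Needs Redis alongside | Very high | Very low | Monoliths/microservices; first choice for AI projects
Sentinel | | | | | Medium (needs console) | Alibaba ecosystem; dynamic rules with a visual console
Redisson | ✅ Distributed | | ✅ Strong fit | | Very low | Cluster-wide global rate limiting
Bucket4j | ✅ High performance | | Supports Redis | Extremely high | Very low | Large file uploads, high-concurrency endpoints
Guava RateLimiter | ✅ Single node | | | Very high | 0 | Small internal projects, simple anti-abuse
Redis+Lua | ✅ Native | | | | Very low | No framework, minimal stack
Hystrix | | | | | | Legacy projects; not for new projects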

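4. Selection Decision Flow (Reusable)

Start → decide whether circuit breaking, fallback, or timeout protection against avalanches is needed

4.1 No circuit breaking needed (rate limiting only)

  • Small single-node project → Guava RateLimiter

  • High concurrency / large file uploads → Bucket4j

  • Multi-instance cluster → Redisson / Redis+Lua

  • No framework, home-grown → Redis+Lua

4.2 Circuit breaking and fallback needed (required for AI projects)

  • Alibaba ecosystem, visual console needed → Sentinel

  • Outside the Alibaba ecosystem, lightweight, no lock-in → Resilience4j

  • Single-node deployment → the full Resilience4j stack

  • Cluster deployment → Resilience4j (circuit breaking) + Redisson (distributed rate limiting)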
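
The decision flow above can be written down as a small function. This is only a sketch: the method name, the boolean parameters, and in particular the order in which the rate-limiting-only branches are checked are assumptions, since the flow does not rank them.

```java
/** Encodes the selection flow as code (illustrative sketch; branch precedence is assumed). */
class LimiterSelectionSketch {

    static String recommend(boolean needsCircuitBreaking,
                            boolean alibabaEcosystemWithConsole,
                            boolean clustered,
                            boolean highConcurrencyOrLargeUploads,
                            boolean noFramework) {
        if (needsCircuitBreaking) {
            if (alibabaEcosystemWithConsole) {
                return "Sentinel";
            }
            return clustered
                    ? "Resilience4j (circuit breaking) + Redisson (distributed rate limiting)"
                    : "Resilience4j";
        }
        // Rate limiting only
        if (noFramework) return "Redis+Lua";
        if (clustered) return "Redisson or Redis+Lua";
        if (highConcurrencyOrLargeUploads) return "Bucket4j";
        return "Guava RateLimiter";
    }

    public static void main(String[] args) {
        // A clustered AI project outside the Alibaba ecosystem
        System.out.println(recommend(true, false, true, false, false));
    }
}
```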

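5. Final Production Stack for This RAG Project

This is the fixed choice for this project, so selection does not need to be revisited.

  • Circuit breaking, fallback, timeouts, avalanche protection: Resilience4j (integrated through Spring Cloud Circuit Breaker, no vendor lock-in)

  • Per-instance endpoint QPS limiting: Resilience4j's built-in rate limiter

  • Distributed rate limiting across the cluster: Redisson

  • High-concurrency rate limiting for chunked large file uploads: Bucket4j

  • Minimal alternatives: Guava, Redis+Lua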

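6. Full Project Dependencies (pom.xml)

Combines all the rate limiting and circuit breaking options with the core RAG dependencies; it can replace an existing pom directly.

XML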
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.6</version>
        <relativePath/>
    </parent>

    <groupId>com.ai</groupId>
    <artifactId>langchain4j-enterprise-rag</artifactId>
    <version>1.0.0</version>
    <name>LangChain4j Enterprise RAG System</name>

    <properties>
        <java.version>17</java.version>
        <langchain4j.version>0.32.0</langchain4j.version>
        <mybatis-plus.version>3.5.5</mybatis-plus.version>
        <fastjson2.version>2.0.52</fastjson2.version>
    </properties>

    <dependencies>
        <!-- Spring Boot basics -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!-- Database -->
        <dependency>
            <groupId>com.mysql</groupId>
            <artifactId>mysql-connector-j</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-boot-starter</artifactId>
            <version>${mybatis-plus.version}</version>
        </dependency>

        <!-- LangChain4j RAG core -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-spring-boot-starter</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Tongyi Qianwen (Qwen) is accessed through the DashScope integration module -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-dashscope</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-milvus</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-chroma</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-tika</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>

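        <!-- Rate limiting and circuit breaking -->
        <!-- The Resilience4j annotations are applied through Spring AOP, so the AOP starter is required -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <!-- Keep this version aligned with the Spring Cloud release train that matches the Spring Boot parent -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
            <version>3.2.1</version>
        </dependency>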
        <dependency>
            <groupId>org.redisson</groupId>
            <artifactId>redisson-spring-boot-starter</artifactId>
            <version>3.27.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.vladimir-bukhtoyarov</groupId>
            <artifactId>bucket4j-core</artifactId>
            <version>7.6.0</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>32.1.3-jre</version>
        </dependency>

        <!-- Utilities -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>${fastjson2.version}</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.8.30</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

7. Global Configuration (application-dev.yml)

YAML
spring:
  datasource:
    url: jdbc:mysql://127.0.0.1:3306/rag_db?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&allowMultiQueries=true
    username: root
    password: 123456
    driver-class-name: com.mysql.cj.jdbc.Driver
  data:
    redis:
      host: 127.0.0.1
      port: 6379
      password:
      database: 0

# Tongyi Qianwen (Qwen) LLM configuration
langchain4j:
  tongyi:
    api-key: sk-xxx-your-key-xxx
    model-name: qwen-turbo
    timeout: 60s

# Vector store configuration
milvus:
  host: 127.0.0.1
  port: 19530
  collection-name: enterprise_knowledge

chroma:
  host: 127.0.0.1
  port: 8000

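# Resilience4j circuit breaker and rate limiter configuration
resilience4j:
  circuitbreaker:
    instances:
      aiChatCircuit:
        slidingWindowSize: 10                     # evaluate the last 10 calls (count-based by default)
        failureRateThreshold: 50                  # open at a failure rate of 50% or more
        waitDurationInOpenState: 10s              # stay open for 10s before moving to half-open
        permittedNumberOfCallsInHalfOpenState: 3  # trial calls allowed while half-open
      uploadCircuit:
        slidingWindowSize: 10
        failureRateThreshold: 50
  ratelimiter:
    instances:
      aiChatLimit:
        limitForPeriod: 5         # 5 permits per refresh period
        limitRefreshPeriod: 1s
        timeoutDuration: 2s       # wait up to 2s for a permit before rejecting
      uploadLimit:
        limitForPeriod: 2
        limitRefreshPeriod: 1s    # timeoutDuration is not set, so the Resilience4j default (5s) applies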

8. Configuration Class Source

8.1 Bucket4jConfig.java: High-Performance Rate Limiting

Java
package com.ai.rag.config;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.time.Duration;

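@Configuration
public class Bucket4jConfig {
    // AI chat: bucket of 5 tokens, refilled at 5 tokens per second.
    // These buckets live in local JVM memory, so the limit applies per instance.
    @Bean
    public Bucket aiChatBucket() {
        Bandwidth bandwidth = Bandwidth.classic(5, Refill.greedy(5, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }

    // Upload: bucket of 2 tokens, refilled at 2 tokens per second
    @Bean
    public Bucket uploadBucket() {
        Bandwidth bandwidth = Bandwidth.classic(2, Refill.greedy(2, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }
}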

8.2 GuavaLimitConfig.java: Lightweight Single-Node Rate Limiting

Java
package com.ai.rag.config;

import com.google.common.util.concurrent.RateLimiter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GuavaLimitConfig {
    // AI chat: 5 permits per second, per JVM
    @Bean
    public RateLimiter aiGuavaLimiter() {
        return RateLimiter.create(5.0);
    }

    // Upload: 2 permits per second, per JVM
    @Bean
    public RateLimiter uploadGuavaLimiter() {
        return RateLimiter.create(2.0);
    }
}

9. Rate Limiting Utility Classes

9.1 RedisLimitUtil.java: Distributed Rate Limiting with Redisson

Java
package com.ai.rag.util;

import org.redisson.api.RRateLimiter;
import org.redisson.api.RateIntervalUnit;
import org.redisson.api.RateType;
import org.redisson.api.RedissonClient;
import org.springframework.stereotype.Component;
import jakarta.annotation.Resource;

@Component
public class RedisLimitUtil {
    @Resource
    private RedissonClient redissonClient;

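    public boolean tryLimit(String key, int qps) {
        RRateLimiter limiter = redissonClient.getRateLimiter(key);
        // OVERALL shares one limit across every instance that uses this key.
        // trySetRate only takes effect the first time; an existing rate is left unchanged.
        limiter.trySetRate(RateType.OVERALL, qps, 1, RateIntervalUnit.SECONDS);
        // Non-blocking: returns false immediately when no permit is available
        return limiter.tryAcquire(1);
    }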
}

9.2 LuaLimitUtil.java: Native Rate Limiting with Redis + Lua

Java
package com.ai.rag.util;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;
import jakarta.annotation.Resource;
import java.util.List;

@Component
public class LuaLimitUtil {
    @Resource
    private StringRedisTemplate stringRedisTemplate;

    // Fixed-window counter: the first request in a window creates the key with a 1-second TTL;
    // later requests only increment it, so the TTL is not pushed back on every call.
    private static final String LUA_SCRIPT =
            "local key = KEYS[1] " +
            "local limit = tonumber(ARGV[1]) " +
            "local curr = tonumber(redis.call('get', key) or '0') " +
            "if curr + 1 > limit then " +
            "  return 0 " +
            "end " +
            "if redis.call('incr', key) == 1 then " +
            "  redis.call('expire', key, 1) " +
            "end " +
            "return 1";

    // Build the script object once and reuse it across calls
    private static final DefaultRedisScript<Long> SCRIPT =
            new DefaultRedisScript<>(LUA_SCRIPT, Long.class);

    public boolean tryLimit(String key, int limit) {
        Long result = stringRedisTemplate.execute(SCRIPT, List.of(key), String.valueOf(limit));
        return result != null && result == 1;
    }
}
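
To see the counting behaviour without a Redis instance, a fixed-window variant of the same idea can be mirrored in memory. This is a sketch under assumptions: `FixedWindowSketch` and its injectable clock are hypothetical, and a map entry stands in for a Redis key that expires one second after the window opens.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** In-memory fixed-window counter with a 1-second window (illustrative sketch only). */
class FixedWindowSketch {
    private static final long WINDOW_MILLIS = 1000;
    // key -> {windowStartMillis, count}; plays the role of a Redis key with a 1-second TTL
    private final Map<String, long[]> windows = new HashMap<>();
    private final LongSupplier clockMillis; // injectable clock (hypothetical helper for testing)

    FixedWindowSketch(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    synchronized boolean tryLimit(String key, int limit) {
        long now = clockMillis.getAsLong();
        long[] window = windows.get(key);
        if (window == null || now - window[0] >= WINDOW_MILLIS) {
            window = new long[] {now, 0};   // "key expired": open a new window
            windows.put(key, window);
        }
        if (window[1] + 1 > limit) {
            return false;                   // over the limit for this window
        }
        window[1]++;
        return true;
    }

    public static void main(String[] args) {
        FixedWindowSketch limiter = new FixedWindowSketch(System::currentTimeMillis);
        int admitted = 0;
        for (int i = 0; i < 8; i++) {
            if (limiter.tryLimit("ai:chat:lua:limit", 5)) admitted++;
        }
        System.out.println("admitted " + admitted + " of 8");
    }
}
```

A known trade-off of fixed windows is that a burst at the end of one window followed by a burst at the start of the next can briefly admit up to twice the limit.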

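10. Integrated Controller (All 5 Options)

Resilience4j circuit breaking and rate limiting are enabled by default; endpoints for the other 4 options are provided so that you can switch between them.

Java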
package com.ai.rag.controller;

import com.ai.rag.common.R;
import com.ai.rag.service.DocumentService;
import com.ai.rag.service.RagQaService;
import com.ai.rag.util.LuaLimitUtil;
import com.ai.rag.util.RedisLimitUtil;
import com.google.common.util.concurrent.RateLimiter;
import io.github.bucket4j.Bucket;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;
import jakarta.annotation.Resource;

@RestController
@RequestMapping("/api/rag")
@RequiredArgsConstructor
public class RagController {
    private final DocumentService documentService;
    private final RagQaService ragQaService;

    @Resource
    private Bucket aiChatBucket;
    @Resource
    private Bucket uploadBucket;

    @Resource
    private RateLimiter aiGuavaLimiter;
    @Resource
    private RateLimiter uploadGuavaLimiter;

    @Resource
    private RedisLimitUtil redisLimitUtil;
    @Resource
    private LuaLimitUtil luaLimitUtil;

    // 1. Resilience4j circuit breaking + rate limiting (default production option)
    // With Resilience4j's default aspect order the rate limiter runs inside the circuit breaker,
    // so rate-limit rejections (RequestNotPermitted) also reach the fallback method.
    @GetMapping("/chat")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "aiChatLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "aiChatCircuit", fallbackMethod = "chatFallback")
    public R<String> chat(@RequestParam String sessionId,
                           @RequestParam String question) {
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @PostMapping("/upload")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "uploadLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "uploadCircuit", fallbackMethod = "uploadFallback")
    public R<String> upload(@RequestParam MultipartFile file) throws Exception {
        documentService.uploadAndEmbed(file);
        return R.ok("Document uploaded and embedded");
    }

    // 2. Bucket4j high-performance rate limiting
    @GetMapping("/chat/bucket4j")
    public R<String> chatByBucket4j(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiChatBucket.tryConsume(1)) {
            return R.fail("[Bucket4j] AI Q&A endpoint is rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 3. Guava lightweight rate limiting
    @GetMapping("/chat/guava")
    public R<String> chatByGuava(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiGuavaLimiter.tryAcquire()) {
            return R.fail("[Guava] Too many AI Q&A requests, please try again later");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 4. Redisson distributed rate limiting
    @GetMapping("/chat/redisson")
    public R<String> chatByRedisson(@RequestParam String sessionId, @RequestParam String question) {
        if (!redisLimitUtil.tryLimit("ai:chat:cluster:limit", 5)) {
            return R.fail("[Redisson] Cluster-wide rate limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 5. Redis + Lua native rate limiting
    @GetMapping("/chat/lua")
    public R<String> chatByLua(@RequestParam String sessionId, @RequestParam String question) {
        if (!luaLimitUtil.tryLimit("ai:chat:lua:limit", 5)) {
            return R.fail("[Lua] Request rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @DeleteMapping("/clear")
    public R<String> clear() {
        documentService.clearVectorStore();
        return R.ok("Vector store cleared");
    }

    // Shared fallback methods: same parameters as the protected method, plus a trailing Throwable
    public R<String> chatFallback(String sessionId, String question, Throwable e) {
        return R.fail("The AI service is busy and has been degraded for protection. Please try again later.");
    }

    public R<String> uploadFallback(MultipartFile file, Throwable e) {
        return R.fail("The document upload service is unavailable and has been degraded");
    }
}

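11. Production Pitfalls and Rules

  • ❌ Do not use rate limiting alone without circuit breaking: slow or timed-out LLM calls pile up and can cascade into an avalanche

  • ❌ Do not use Guava or a local Bucket4j bucket in a cluster: each instance counts on its own, so the global limit is not enforced

  • ❌ Do not use Hystrix or legacy home-grown circuit breakers in new projects

  • ✅ Split responsibilities: Resilience4j for all circuit breaking, rate limiting chosen per scenario

  • ✅ Give large file uploads their own rate limit so they do not consume the Q&A endpoints' quota

  • ✅ In a cluster, pair with Redisson for global traffic control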
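
The rule about local limiters in a cluster comes down to simple arithmetic, sketched below. The round-robin load balancing and the method name are assumptions made for illustration: with N instances each enforcing its own limit, the cluster as a whole admits up to N times that limit.

```java
/** Shows why per-instance limits do not add up to a global limit (illustrative sketch only). */
class LocalLimitInClusterSketch {

    /** Counts requests admitted in one second when every instance keeps its own local counter. */
    static int admittedInOneSecond(int instances, int perInstanceLimit, int incomingRequests) {
        int[] used = new int[instances];
        int admitted = 0;
        for (int i = 0; i < incomingRequests; i++) {
            int target = i % instances;           // round-robin load balancing (assumption)
            if (used[target] < perInstanceLimit) {
                used[target]++;
                admitted++;
            }
        }
        return admitted;
    }

    public static void main(String[] args) {
        // Intended global limit: 5 QPS. Three instances each allow 5 locally.
        System.out.println("admitted = " + admittedInOneSecond(3, 5, 100));
    }
}
```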

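12. Startup Sequence

  1. Start Redis, MySQL, and the Milvus/Chroma vector store

  2. Set the Tongyi Qianwen API key in the yml file

  3. Refresh the Maven dependencies

  4. Start the Spring Boot application

  5. The default endpoint /api/rag/chat comes with circuit breaking, rate limiting, and fallback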

13. Endpoint Test Checklist

  • Resilience4j (default): GET /api/rag/chat

  • Bucket4j: GET /api/rag/chat/bucket4j

  • Guava: GET /api/rag/chat/guava

  • Redisson (distributed): GET /api/rag/chat/redisson

  • Lua (native): GET /api/rag/chat/lua

  • Document upload: POST /api/rag/upload

  • Clear vector store: DELETE /api/rag/clear
