📅 Difficulty: ⭐⭐⭐⭐⭐ Advanced | Reading time: about 30 minutes | Applies to: Spring Boot 3.x | Java 17+ | Redis
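Contents
[3. JWT Authentication: Spring Security Integration](#3. JWT Authentication: Spring Security Integration)
[3.1 Dependencies](#3.1 Dependencies)
[3.2 JWT Utility Class](#3.2 JWT Utility Class)
[3.3 JWT Authentication Filter](#3.3 JWT Authentication Filter)
[3.4 User Principal](#3.4 User Principal)
[3.5 Security Configuration](#3.5 Security Configuration)
[3.6 Auth Endpoints (Login / Token Refresh)](#3.6 Auth Endpoints (Login / Token Refresh))
[4. Multi-Dimensional Rate Limiting: Redis Sliding Window](#4. Multi-Dimensional Rate Limiting: Redis Sliding Window)
[4.1 Rate-Limit Configuration](#4.1 Rate-Limit Configuration)
[4.2 Lua Script: Atomic Sliding Window](#4.2 Lua Script: Atomic Sliding Window)
[4.3 Rate-Limit Interceptor](#4.3 Rate-Limit Interceptor)
[4.4 Registering the Interceptor](#4.4 Registering the Interceptor)
[5. Token Cost Tracking](#5. Token Cost Tracking)
[5.1 Token Pricing Configuration](#5.1 Token Pricing Configuration)
[5.2 Cost Calculation + Redis Accumulation](#5.2 Cost Calculation + Redis Accumulation)
[6. Micrometer Monitoring: End-to-End Instrumentation with AOP](#6. Micrometer Monitoring: End-to-End Instrumentation with AOP)
[6.1 Custom Metrics Aspect](#6.1 Custom Metrics Aspect)
[6.2 Exposing the Prometheus Endpoint](#6.2 Exposing the Prometheus Endpoint)
[6.3 Key Grafana Dashboard Metrics (PromQL)](#6.3 Key Grafana Dashboard Metrics (PromQL))
[7.1 Audit Log Entity](#7.1 Audit Log Entity)
[7.2 Asynchronous Audit Service](#7.2 Asynchronous Audit Service)
[7.3 Audit Thread Pool Configuration](#7.3 Audit Thread Pool Configuration)
[11. Docker Compose: Start Dependencies with One Command](#11. Docker Compose: Start Dependencies with One Command)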
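1. Why Aren't the First Five Articles Enough to Go Live?
After the first five articles, we already have:
- ✅ A callable AI endpoint
- ✅ A well-structured service layer
- ✅ Streaming output
- ✅ Multi-turn conversation
- ✅ A tool-calling framework
But exposing this directly to the public internet invites serious problems:
| Risk | Consequence |
|---|---|
| No authentication | Anyone can call the API and burn through your API budget |
| No rate limiting | A single abusive user can get the whole account rate-limited by Anthropic |
| No monitoring | Slow or failing endpoints go unnoticed until users complain |
| No cost tracking | Token usage is opaque; overspend only shows up on the monthly bill |
| No audit log | Incidents cannot be traced, and compliance becomes difficult |
This final article in the series hardens all of the earlier code into a service that is ready for production. It covers:
- JWT authentication: built on Spring Security, with per-endpoint access control
- Multi-dimensional rate limiting: per user, per IP, and global, built on Redis + Lua scripts
- Token cost tracking: real-time token usage and cost per user
- Micrometer monitoring: Prometheus + Grafana dashboards for AI call metrics
- Structured audit logging: the full context of every AI call, for later tracing
- Graceful shutdown: in-flight AI conversations are not cut off
2. Overall Architecture: The Protection Chain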
text
        External request
               │
┌──────────────▼───────────────┐
│ Nginx / Gateway              │ ← IP blocking, TLS termination
└──────────────┬───────────────┘
               │
┌──────────────▼───────────────┐
│ Spring Security filter chain │
│ ① JWT auth filter            │ ← validates the token
│ ② Authorization check        │ ← ROLE_USER / ROLE_VIP
└──────────────┬───────────────┘
               │
┌──────────────▼───────────────┐
│ RateLimitInterceptor         │ ← Redis sliding-window rate limiting
│ · User: 20/min (VIP 60/min)  │
│ · IP: 100/min                │
│ · Global: 1000/min           │
└──────────────┬───────────────┘
               │
┌──────────────▼───────────────┐
│ AiMetricsAspect              │ ← AOP instrumentation: latency, tokens, cost
└──────────────┬───────────────┘
               │
┌──────────────▼───────────────┐
│ ToolChatService /            │
│ ClawChatService              │ ← business layer (from earlier articles)
└──────────────┬───────────────┘
               │
┌──────────────▼───────────────┐
│ AuditLogService              │ ← asynchronous audit logging
└──────────────────────────────┘
3. JWT Authentication: Spring Security Integration
3.1 Dependencies
XML
<!-- pom.xml -->
<!-- Spring Security -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId>
</dependency>
<!-- JWT (JJWT, a widely used Java JWT library) -->
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-api</artifactId>
<version>0.12.6</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-impl</artifactId>
<version>0.12.6</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-jackson</artifactId>
<version>0.12.6</version>
<scope>runtime</scope>
</dependency>
<!-- Redis (rate limiting + sessions) -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<!-- Micrometer + Prometheus -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
3.2 JWT Utility Class
java
package com.example.openclaw_demo.security;
import io.jsonwebtoken.*;
import io.jsonwebtoken.security.Keys;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;
import java.util.Date;
import java.util.List;
import java.util.Map;
@Slf4j
@Component
public class JwtTokenProvider {
@Value("${app.jwt.secret}")
private String jwtSecret;
@Value("${app.jwt.expiration-ms:86400000}") // default: 24 hours
private long jwtExpirationMs;
/**
* Generates a JWT.
*
* @param userId user ID
* @param username username
* @param roles role list (e.g. ["ROLE_USER", "ROLE_VIP"])
*/
public String generateToken(String userId, String username, List<String> roles) {
Date now = new Date();
Date expiry = new Date(now.getTime() + jwtExpirationMs);
return Jwts.builder()
.subject(userId)
.claim("username", username)
.claim("roles", roles)
.issuedAt(now)
.expiration(expiry)
.signWith(getSignKey())
.compact();
}
/** Extracts the user ID from the token. */
public String getUserId(String token) {
return parseClaims(token).getSubject();
}
/** Extracts the username from the token. */
public String getUsername(String token) {
return (String) parseClaims(token).get("username");
}
/** Extracts the role list from the token. */
@SuppressWarnings("unchecked")
public List<String> getRoles(String token) {
return (List<String>) parseClaims(token).get("roles");
}
/** Returns true if the token is well-formed, correctly signed, and not expired. */
public boolean validateToken(String token) {
try {
parseClaims(token);
return true;
} catch (ExpiredJwtException e) {
log.warn("[JWT] Token expired: {}", e.getMessage());
} catch (MalformedJwtException e) {
log.warn("[JWT] Malformed token: {}", e.getMessage());
} catch (io.jsonwebtoken.security.SecurityException e) {
// Fully qualified on purpose: a bare SecurityException resolves to java.lang.SecurityException,
// which is not what JJWT throws for a bad signature.
log.warn("[JWT] Invalid token signature: {}", e.getMessage());
} catch (Exception e) {
log.warn("[JWT] Token validation failed: {}", e.getMessage());
}
return false;
}
private Claims parseClaims(String token) {
return Jwts.parser()
.verifyWith(getSignKey())
.build()
.parseSignedClaims(token)
.getPayload();
}
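private SecretKey getSignKey() {
// Keys.hmacShaKeyFor rejects keys shorter than 256 bits, so app.jwt.secret must be at least 32 bytes
byte[] keyBytes = jwtSecret.getBytes(StandardCharsets.UTF_8);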
return Keys.hmacShaKeyFor(keyBytes);
}
}
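To make concrete what `signWith` and `parseSignedClaims` do for an HMAC key, here is a minimal sketch of HS256 signing and verification using only the JDK. The class `Hs256Sketch` and its methods are hypothetical names invented for this illustration. The sketch checks the signature only; it ignores `exp`, does not validate the `alg` header, and does not parse claims, so it is no substitute for the JJWT-based provider above.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;

/** Illustration only: hand-rolled HS256 signing and signature verification. */
final class Hs256Sketch {

    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();

    /** Builds header.payload.signature for the given JSON payload. */
    static String sign(String payloadJson, byte[] secret) {
        String header = B64.encodeToString(
                "{\"alg\":\"HS256\",\"typ\":\"JWT\"}".getBytes(StandardCharsets.UTF_8));
        String payload = B64.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        String signingInput = header + "." + payload;
        return signingInput + "." + B64.encodeToString(hmac(signingInput, secret));
    }

    /** Recomputes the signature over header.payload and compares in constant time. */
    static boolean verify(String token, byte[] secret) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) return false;
        byte[] actual;
        try {
            actual = Base64.getUrlDecoder().decode(parts[2]);
        } catch (IllegalArgumentException e) {
            return false; // signature segment is not valid Base64URL
        }
        byte[] expected = hmac(parts[0] + "." + parts[1], secret);
        return MessageDigest.isEqual(expected, actual);
    }

    private static byte[] hmac(String data, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] secret = "0123456789abcdef0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        String token = sign("{\"sub\":\"user_001\"}", secret);
        System.out.println(token);
        System.out.println("valid: " + verify(token, secret));
    }
}
```

The point of the sketch is that anyone can read or rewrite the payload, but only a holder of the secret can produce a matching third segment. That is why the secret must stay on the server and be long enough.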
3.3 JWT Authentication Filter
java
package com.example.openclaw_demo.security;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.authority.SimpleGrantedAuthority;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils;
import org.springframework.web.filter.OncePerRequestFilter;
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;
@Slf4j
@Component
@RequiredArgsConstructor
public class JwtAuthFilter extends OncePerRequestFilter {
private final JwtTokenProvider jwtTokenProvider;
@Override
protected void doFilterInternal(HttpServletRequest request,
HttpServletResponse response,
FilterChain filterChain)
throws ServletException, IOException {
String token = extractToken(request);
if (StringUtils.hasText(token) && jwtTokenProvider.validateToken(token)) {
try {
// Parse the user information
String userId = jwtTokenProvider.getUserId(token);
String username = jwtTokenProvider.getUsername(token);
List<String> roles = jwtTokenProvider.getRoles(token);
// Build the authority list
List<SimpleGrantedAuthority> authorities = roles.stream()
.map(SimpleGrantedAuthority::new)
.collect(Collectors.toList());
// Create the Authentication object and store it in the security context.
// The principal is an AiUserPrincipal carrying the full user info for rate limiting and logging.
AiUserPrincipal principal = new AiUserPrincipal(userId, username, roles);
UsernamePasswordAuthenticationToken auth =
new UsernamePasswordAuthenticationToken(principal, null, authorities);
SecurityContextHolder.getContext().setAuthentication(auth);
log.debug("[JWT] Authenticated: userId={}, roles={}", userId, roles);
} catch (Exception e) {
log.warn("[JWT] Failed to parse authentication info: {}", e.getMessage());
SecurityContextHolder.clearContext();
}
}
filterChain.doFilter(request, response);
}
/** Extracts the token from the Authorization header, falling back to a query parameter. */
private String extractToken(HttpServletRequest request) {
// Prefer the header (the standard Bearer token scheme)
String bearerToken = request.getHeader("Authorization");
if (StringUtils.hasText(bearerToken) && bearerToken.startsWith("Bearer ")) {
return bearerToken.substring(7);
}
// Fallback: read it from the query string (for EventSource, which cannot set custom headers).
// Be aware that query-string tokens can end up in access logs and browser history.
return request.getParameter("token");
}
}
3.4 User Principal
java
package com.example.openclaw_demo.security;
import lombok.AllArgsConstructor;
import lombok.Data;
import java.util.List;
/**
* User information stored in the SecurityContext after authentication.
* Available anywhere through SecurityContextHolder.
*/
@Data
@AllArgsConstructor
public class AiUserPrincipal {
private String userId;
private String username;
private List<String> roles;
/** Whether this is a VIP user (entitled to a higher rate-limit quota). */
public boolean isVip() {
return roles != null && roles.contains("ROLE_VIP");
}
/** Whether this is an administrator. */
public boolean isAdmin() {
return roles != null && roles.contains("ROLE_ADMIN");
}
/** Returns the current user from the SecurityContext, or null if unauthenticated (static helper). */
public static AiUserPrincipal current() {
var auth = org.springframework.security.core.context.SecurityContextHolder
.getContext().getAuthentication();
if (auth == null || !(auth.getPrincipal() instanceof AiUserPrincipal p)) {
return null;
}
return p;
}
}
3.5 Security Configuration
java
package com.example.openclaw_demo.security;
import lombok.RequiredArgsConstructor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.HttpMethod;
import org.springframework.security.config.annotation.method.configuration.EnableMethodSecurity;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.http.SessionCreationPolicy;
import org.springframework.security.web.SecurityFilterChain;
import org.springframework.security.web.authentication.UsernamePasswordAuthenticationFilter;
@Configuration
@EnableWebSecurity
@EnableMethodSecurity // enables method-level security annotations (@PreAuthorize)
@RequiredArgsConstructor
public class SecurityConfig {
private final JwtAuthFilter jwtAuthFilter;
@Bean
public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
return http
// Disable CSRF (a REST API authenticated by bearer token does not need CSRF protection)
.csrf(csrf -> csrf.disable())
// Stateless sessions (every request is authenticated by its JWT)
.sessionManagement(sm -> sm.sessionCreationPolicy(SessionCreationPolicy.STATELESS))
// Route authorization rules
.authorizeHttpRequests(auth -> auth
// Fully public: login, registration, health check
.requestMatchers("/api/auth/**").permitAll()
.requestMatchers("/actuator/health").permitAll()
// Prometheus metrics endpoint: internal network / admins only (an IP allowlist is advisable in production)
.requestMatchers("/actuator/prometheus").hasRole("ADMIN")
// SSE streaming endpoints: the token may arrive as a query parameter (EventSource cannot set custom headers)
.requestMatchers("/api/v1/stream/**").authenticated()
// AI chat endpoints: login required
.requestMatchers("/api/v1/**").authenticated()
// VIP-only endpoints: VIP role required
.requestMatchers("/api/vip/**").hasRole("VIP")
// Everything else: deny
.anyRequest().denyAll()
)
// Uniform auth failure responses (JSON instead of the default HTML page)
.exceptionHandling(ex -> ex
.authenticationEntryPoint((req, res, e) -> {
res.setStatus(401);
res.setContentType("application/json;charset=UTF-8");
res.getWriter().write("{\"code\":401,\"message\":\"Not logged in or token expired\"}");
})
.accessDeniedHandler((req, res, e) -> {
res.setStatus(403);
res.setContentType("application/json;charset=UTF-8");
res.getWriter().write("{\"code\":403,\"message\":\"Insufficient permissions\"}");
})
)
// Insert the JWT filter before UsernamePasswordAuthenticationFilter
.addFilterBefore(jwtAuthFilter, UsernamePasswordAuthenticationFilter.class)
.build();
}
}
3.6 Auth Endpoints (Login / Token Refresh)
java
package com.example.openclaw_demo.controller;
import com.example.openclaw_demo.common.ApiResult;
import com.example.openclaw_demo.security.JwtTokenProvider;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import java.util.List;
import java.util.Map;
@RestController
@RequestMapping("/api/auth")
@RequiredArgsConstructor
public class AuthController {
private final JwtTokenProvider jwtTokenProvider;
// private final UserService userService; // inject the user service in a real project
/**
* POST /api/auth/login
* Body: { "username": "alice", "password": "xxx" }
*/
@PostMapping("/login")
public ApiResult<Map<String, Object>> login(@RequestBody LoginRequest req) {
// === Replace this part in a real project ===
// User user = userService.authenticate(req.username(), req.password());
// if (user == null) return ApiResult.fail(401, "Invalid username or password");
// String userId = user.getId();
// List<String> roles = user.getRoles();
// ===========================================
// Demo only: hard-coded user
if (!"admin".equals(req.username()) || !"123456".equals(req.password())) {
return ApiResult.fail(401, "Invalid username or password");
}
String userId = "user_001";
List<String> roles = List.of("ROLE_USER", "ROLE_VIP");
String token = jwtTokenProvider.generateToken(userId, req.username(), roles);
return ApiResult.ok(Map.of(
"token", token,
"userId", userId,
"username", req.username(),
"roles", roles,
"expiresIn", 86400 // seconds; matches the default app.jwt.expiration-ms
));
}
public record LoginRequest(String username, String password) {}
}
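4. Multi-Dimensional Rate Limiting: Redis Sliding Window
A simple counter (fixed window) has a boundary problem: 60 requests at the end of one minute plus 60 at the start of the next adds up to 120 requests within a few seconds, even though each window stays within its limit. For production, a sliding window is the better choice, implemented atomically with Redis and a Lua script.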
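The boundary problem is easy to reproduce in memory. The sketch below compares a fixed-window counter with a sliding-window log under the same burst pattern. All class and method names (`WindowComparison`, `FixedWindow`, `SlidingWindow`, `countAccepted`) are hypothetical and single-threaded; this is an illustration of the idea, not the Redis implementation that follows.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongPredicate;

/** In-memory illustration of the window-boundary problem (not thread-safe). */
final class WindowComparison {

    /** Fixed window: one counter per aligned window, reset when the window index changes. */
    static final class FixedWindow {
        private final long windowMs;
        private final int limit;
        private long currentWindow = -1;
        private int count;

        FixedWindow(long windowMs, int limit) {
            this.windowMs = windowMs;
            this.limit = limit;
        }

        boolean tryAcquire(long nowMs) {
            long window = nowMs / windowMs;
            if (window != currentWindow) {
                currentWindow = window;
                count = 0;
            }
            if (count >= limit) return false;
            count++;
            return true;
        }
    }

    /** Sliding window: remembers the timestamp of every accepted request. */
    static final class SlidingWindow {
        private final long windowMs;
        private final int limit;
        private final Deque<Long> accepted = new ArrayDeque<>();

        SlidingWindow(long windowMs, int limit) {
            this.windowMs = windowMs;
            this.limit = limit;
        }

        boolean tryAcquire(long nowMs) {
            // Evict timestamps at or before (now - window)
            while (!accepted.isEmpty() && accepted.peekFirst() <= nowMs - windowMs) {
                accepted.pollFirst();
            }
            if (accepted.size() >= limit) return false;
            accepted.addLast(nowMs);
            return true;
        }
    }

    /** Fires `requests` calls at the same instant and returns how many were accepted. */
    static int countAccepted(LongPredicate limiter, long atMs, int requests) {
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            if (limiter.test(atMs)) ok++;
        }
        return ok;
    }

    public static void main(String[] args) {
        FixedWindow fixed = new FixedWindow(60_000, 60);
        SlidingWindow sliding = new SlidingWindow(60_000, 60);
        // 60 requests one second before the minute boundary, 60 more one second after it
        int fixedTotal = countAccepted(fixed::tryAcquire, 59_000, 60)
                + countAccepted(fixed::tryAcquire, 61_000, 60);
        int slidingTotal = countAccepted(sliding::tryAcquire, 59_000, 60)
                + countAccepted(sliding::tryAcquire, 61_000, 60);
        System.out.println("fixed window accepted:   " + fixedTotal);
        System.out.println("sliding window accepted: " + slidingTotal);
    }
}
```

The fixed window lets the second burst through because the counter resets at the boundary. The sliding window still sees the first 60 timestamps and rejects the second burst until they age out.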
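4.1 Rate-Limit Configuration
yaml
# application.yml
# The rule groups sit directly under app.rate-limit so that they bind to RateLimitProperties below
app:
  rate-limit:
    enabled: true
    # Regular users: 20 per minute, 500 per day
    user:
      per-minute: 20
      per-day: 500
    # VIP users: 60 per minute, 2000 per day
    vip:
      per-minute: 60
      per-day: 2000
    # Per IP: 100 per minute (anti-abuse)
    ip:
      per-minute: 100
    # Global: 1000 per minute (protects the Anthropic account from being throttled or banned)
    global:
      per-minute: 1000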
java
package com.example.openclaw_demo.ratelimit;
import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
@Data
@Component
@ConfigurationProperties(prefix = "app.rate-limit")
public class RateLimitProperties {
private boolean enabled = true;
private UserRule user = new UserRule();
private UserRule vip = new UserRule();
private SimpleRule ip = new SimpleRule();
private SimpleRule global = new SimpleRule();
@Data
public static class UserRule {
private int perMinute = 20;
private int perDay = 500;
}
@Data
public static class SimpleRule {
private int perMinute = 100;
}
}
4.2 Lua Script: Atomic Sliding Window
java
package com.example.openclaw_demo.ratelimit;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;
import java.util.List;
@Slf4j
@Component
@RequiredArgsConstructor
public class SlidingWindowRateLimiter {
private final StringRedisTemplate redisTemplate;
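/**
* Sliding-window rate-limit Lua script.
*
* Arguments:
* KEYS[1] = Redis key (e.g. "rl:user:user_001:minute")
* ARGV[1] = current timestamp (ms)
* ARGV[2] = window size (ms)
* ARGV[3] = maximum number of requests
* ARGV[4] = unique ID for this request, generated by the caller
*
* How it works:
* 1. A sorted set stores one member per request, scored by its timestamp
* 2. Entries that have left the window are removed
* 3. The remaining entries are counted
* 4. Over the limit: return the negated wait time in ms;
* otherwise record the request and return the remaining quota (>= 0)
*/
private static final String SLIDING_WINDOW_SCRIPT = """
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
local clearBefore = now - window
-- Remove expired request records
redis.call('ZREMRANGEBYSCORE', key, '-inf', clearBefore)
-- Count the requests in the current window
local count = redis.call('ZCARD', key)
if count >= limit then
-- Over the limit: return the negated wait time (ms) until the oldest entry expires
local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
if oldest and #oldest > 0 then
return -(window - (now - tonumber(oldest[2])))
end
return -window
end
-- Under the limit: record this request. The member must be unique, so the caller supplies an ID;
-- math.random is avoided because Redis may seed it identically on every script run,
-- which would make two requests in the same millisecond collide.
local member = now .. '-' .. ARGV[4]
redis.call('ZADD', key, now, member)
redis.call('PEXPIRE', key, window)
-- Remaining quota in the current window
return limit - count - 1
""";
private final DefaultRedisScript<Long> rateLimitScript = new DefaultRedisScript<>(
SLIDING_WINDOW_SCRIPT, Long.class
);
/**
* Runs the rate-limit check.
*
* @param key rate-limit key
* @param windowMs window size (ms)
* @param maxRequests maximum number of requests
* @return RateLimitResult (allowed or not, remaining quota, wait time)
*/
public RateLimitResult check(String key, long windowMs, int maxRequests) {
long now = System.currentTimeMillis();
Long result;
try {
result = redisTemplate.execute(
rateLimitScript,
List.of(key),
String.valueOf(now),
String.valueOf(windowMs),
String.valueOf(maxRequests),
java.util.UUID.randomUUID().toString()
);
} catch (org.springframework.dao.DataAccessException e) {
// Redis unreachable: fail open (change this to reject if the business needs fail-closed)
log.warn("[RateLimit] Redis unavailable, failing open: key={}", key, e);
return RateLimitResult.allowed(maxRequests);
}
if (result == null) {
// No result from the script: treat it like an outage and fail open
log.warn("[RateLimit] Script returned no result, failing open: key={}", key);
return RateLimitResult.allowed(maxRequests);
}
if (result < 0) {
// Over the limit: the absolute value is the suggested wait time
long waitMs = Math.abs(result);
log.warn("[RateLimit] Rate limit hit: key={}, waitMs={}", key, waitMs);
return RateLimitResult.rejected(waitMs);
}
return RateLimitResult.allowed((int) (long) result);
}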
/** Rate-limit outcome. */
public record RateLimitResult(boolean allowed, int remaining, long retryAfterMs) {
public static RateLimitResult allowed(int remaining) {
return new RateLimitResult(true, remaining, 0);
}
public static RateLimitResult rejected(long retryAfterMs) {
return new RateLimitResult(false, 0, retryAfterMs);
}
}
}
4.3 Rate-Limit Interceptor
java
package com.example.openclaw_demo.ratelimit;
import com.example.openclaw_demo.security.AiUserPrincipal;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;
import java.io.IOException;
@Slf4j
@Component
@RequiredArgsConstructor
public class RateLimitInterceptor implements HandlerInterceptor {
private final SlidingWindowRateLimiter rateLimiter;
private final RateLimitProperties props;
@Override
public boolean preHandle(HttpServletRequest request,
HttpServletResponse response,
Object handler) throws IOException {
if (!props.isEnabled()) return true;
AiUserPrincipal user = AiUserPrincipal.current();
String clientIp = getClientIp(request);
// ── 1. Global limit (protects the Anthropic account) ─────────────
var globalResult = rateLimiter.check(
"rl:global:minute",
60_000L,
props.getGlobal().getPerMinute()
);
if (!globalResult.allowed()) {
return rejectWith(response, 503,
"The service is busy, please try again later",
globalResult.retryAfterMs() / 1000);
}
// ── 2. Per-IP limit (stops a single IP from hammering the API) ───
var ipResult = rateLimiter.check(
"rl:ip:" + clientIp + ":minute",
60_000L,
props.getIp().getPerMinute()
);
if (!ipResult.allowed()) {
return rejectWith(response, 429,
"Too many requests, please retry in " + (ipResult.retryAfterMs() / 1000) + " seconds",
ipResult.retryAfterMs() / 1000);
}
// ── 3. Per-user limit (authenticated users only) ─────────────────
if (user != null) {
// VIP and regular users get different quotas
int minuteLimit = user.isVip()
? props.getVip().getPerMinute()
: props.getUser().getPerMinute();
int dayLimit = user.isVip()
? props.getVip().getPerDay()
: props.getUser().getPerDay();
// Per-minute limit
var minuteResult = rateLimiter.check(
"rl:user:" + user.getUserId() + ":minute",
60_000L,
minuteLimit
);
if (!minuteResult.allowed()) {
return rejectWith(response, 429,
"You are sending requests too quickly, please retry in " + (minuteResult.retryAfterMs() / 1000) + " seconds",
minuteResult.retryAfterMs() / 1000);
}
// Daily limit (a rolling 24-hour window, not a reset at midnight)
var dayResult = rateLimiter.check(
"rl:user:" + user.getUserId() + ":day",
86_400_000L,
dayLimit
);
if (!dayResult.allowed()) {
return rejectWith(response, 429,
"You have used up your AI quota for the past 24 hours, please try again later",
dayResult.retryAfterMs() / 1000);
}
// Expose the remaining quota in response headers so the front end can display it
response.setHeader("X-RateLimit-Remaining", String.valueOf(minuteResult.remaining()));
response.setHeader("X-RateLimit-Day-Remaining", String.valueOf(dayResult.remaining()));
}
return true;
}
private boolean rejectWith(HttpServletResponse response,
int status, String message, long retryAfterSec)
throws IOException {
response.setStatus(status);
response.setContentType("application/json;charset=UTF-8");
response.setHeader("Retry-After", String.valueOf(retryAfterSec));
response.getWriter().write(
String.format("{\"code\":%d,\"message\":\"%s\",\"retryAfter\":%d}",
status, message, retryAfterSec)
);
return false;
}
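/**
* Resolves the real client IP behind a reverse proxy such as Nginx.
* X-Forwarded-For is only trustworthy if the proxy overwrites whatever the client sent.
*/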
private String getClientIp(HttpServletRequest request) {
String xff = request.getHeader("X-Forwarded-For");
if (xff != null && !xff.isBlank() && !"unknown".equalsIgnoreCase(xff)) {
return xff.split(",")[0].trim();
}
String xri = request.getHeader("X-Real-IP");
if (xri != null && !xri.isBlank() && !"unknown".equalsIgnoreCase(xri)) {
return xri;
}
return request.getRemoteAddr();
}
}
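One detail in the interceptor deserves a note: `retryAfterMs() / 1000` is integer division, so a remaining wait of 900 ms is reported as `Retry-After: 0`. A small helper that rounds up avoids telling clients to retry immediately. This is a sketch; the `RetryAfter` class and `toSeconds` method are hypothetical names, not part of the code above.

```java
/** Converts a wait time in milliseconds to whole seconds for the Retry-After header. */
final class RetryAfter {

    /** Rounds up to the next full second; zero or negative input yields 0. */
    static long toSeconds(long retryAfterMs) {
        if (retryAfterMs <= 0) return 0;
        return (retryAfterMs + 999) / 1000;
    }

    public static void main(String[] args) {
        System.out.println(toSeconds(900));    // a 900 ms wait rounds up to one second
        System.out.println(toSeconds(59_999)); // just under a minute rounds up to 60
    }
}
```

Assuming this helper, the calls in `preHandle` would pass `RetryAfter.toSeconds(result.retryAfterMs())` instead of dividing by 1000.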
4.4 Registering the Interceptor
java
package com.example.openclaw_demo.config;
import com.example.openclaw_demo.ratelimit.RateLimitInterceptor;
import lombok.RequiredArgsConstructor;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;
@Configuration
@RequiredArgsConstructor
public class WebConfig implements WebMvcConfigurer {
private final RateLimitInterceptor rateLimitInterceptor;
@Override
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(rateLimitInterceptor)
// Intercept the AI endpoints only
.addPathPatterns("/api/v1/**", "/api/vip/**")
// Let the login endpoints through
.excludePathPatterns("/api/auth/**");
}
}
5. Token Cost Tracking
The Claude API is billed per token, so accurate per-user usage figures are the foundation of cost control.
5.1 Token Pricing Configuration
yaml
# application.yml
app:
  billing:
    # Claude prices (USD per million tokens)
    # Always check Anthropic's pricing page for current values
    models:
      claude-sonnet-4-20250514:
        input-price-per-million: 3.0    # $3 / 1M input tokens
        output-price-per-million: 15.0  # $15 / 1M output tokens
      claude-opus-4-20250514:
        input-price-per-million: 15.0
        output-price-per-million: 75.0
      claude-haiku-4-5-20251001:
        input-price-per-million: 1.0
        output-price-per-million: 5.0
    # Exchange rate (USD → CNY); update periodically
    usd-to-cny: 7.25
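As a worked example of the arithmetic behind these prices, the sketch below computes the cost of a single call and converts it to integer micro-dollars for storage in a counter. `CostExample` and its methods are hypothetical names used only for this illustration, and the token counts are made up.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Worked example: per-call cost from per-million-token prices. */
final class CostExample {

    private static final BigDecimal MILLION = BigDecimal.valueOf(1_000_000);

    /** Cost in USD: tokens * pricePerMillion / 1,000,000, summed over input and output. */
    static BigDecimal costUsd(long inputTokens, long outputTokens,
                              BigDecimal inputPricePerMillion, BigDecimal outputPricePerMillion) {
        BigDecimal input = BigDecimal.valueOf(inputTokens)
                .multiply(inputPricePerMillion)
                .divide(MILLION, 8, RoundingMode.HALF_UP);
        BigDecimal output = BigDecimal.valueOf(outputTokens)
                .multiply(outputPricePerMillion)
                .divide(MILLION, 8, RoundingMode.HALF_UP);
        return input.add(output);
    }

    /** The same amount as a whole number of micro-dollars (1 USD = 1,000,000 micro-USD). */
    static long toMicroUsd(BigDecimal usd) {
        return usd.multiply(MILLION).setScale(0, RoundingMode.HALF_UP).longValueExact();
    }

    public static void main(String[] args) {
        // 1,200 input and 800 output tokens at $3 / $15 per million tokens
        BigDecimal usd = costUsd(1_200, 800, new BigDecimal("3.0"), new BigDecimal("15.0"));
        System.out.println("USD:       " + usd.stripTrailingZeros().toPlainString());
        System.out.println("micro-USD: " + toMicroUsd(usd));
    }
}
```

For that call the input side costs 1,200 × 3 / 1,000,000 = $0.0036 and the output side 800 × 15 / 1,000,000 = $0.012, for a total of $0.0156, or 15,600 micro-dollars. Storing the integer makes Redis `INCRBY` exact, whereas summing floating-point dollars would drift.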
5.2 Cost Calculation + Redis Accumulation
java
package com.example.openclaw_demo.billing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.concurrent.TimeUnit;
@Slf4j
@Service
@RequiredArgsConstructor
public class TokenBillingService {
private final StringRedisTemplate redisTemplate;
private final BillingProperties billingProps;
/**
* Records the token usage of one AI call.
*
* @param userId user ID
* @param model model name
* @param inputTokens number of input tokens
* @param outputTokens number of output tokens
*/
public void record(String userId, String model, int inputTokens, int outputTokens) {
String today = LocalDate.now().format(DateTimeFormatter.BASIC_ISO_DATE); // e.g. 20250421
// Cost of this call (USD)
BigDecimal costUsd = calculateCost(model, inputTokens, outputTokens);
BigDecimal costCny = costUsd.multiply(BigDecimal.valueOf(billingProps.getUsdToCny()))
.setScale(6, RoundingMode.HALF_UP);
// Batch the writes in one Redis pipeline to save network round trips.
// The cast selects the RedisCallback overload; without it javac can report the call as ambiguous.
redisTemplate.executePipelined((org.springframework.data.redis.core.RedisCallback<Object>) conn -> {
String prefix = "billing:" + userId + ":" + today;
// Accumulate token counts
conn.stringCommands().incrBy((prefix + ":input_tokens").getBytes(), inputTokens);
conn.stringCommands().incrBy((prefix + ":output_tokens").getBytes(), outputTokens);
// Accumulate the call count
conn.stringCommands().incr((prefix + ":calls").getBytes());
// Accumulate cost as integer micro-dollars (1 USD = 1,000,000 micro-USD) to avoid floating-point error
long costMicro = costUsd.multiply(BigDecimal.valueOf(1_000_000)).longValue();
conn.stringCommands().incrBy((prefix + ":cost_micro_usd").getBytes(), costMicro);
// Expire every counter after 90 days, not just the call count
for (String suffix : new String[] {":input_tokens", ":output_tokens", ":calls", ":cost_micro_usd"}) {
conn.keyCommands().expire((prefix + suffix).getBytes(), 90 * 24 * 3600);
}
return null;
});
log.info("[Billing] userId={}, model={}, input={}, output={}, cost=¥{}",
userId, model, inputTokens, outputTokens, costCny.toPlainString());
}
/**
* Returns one user's statistics for the given date (yyyyMMdd).
*/
public UserDailyStats getDailyStats(String userId, String date) {
String prefix = "billing:" + userId + ":" + date;
long inputTokens = parseLong(redisTemplate.opsForValue().get(prefix + ":input_tokens"));
long outputTokens = parseLong(redisTemplate.opsForValue().get(prefix + ":output_tokens"));
long calls = parseLong(redisTemplate.opsForValue().get(prefix + ":calls"));
long costMicroUsd = parseLong(redisTemplate.opsForValue().get(prefix + ":cost_micro_usd"));
BigDecimal costUsd = BigDecimal.valueOf(costMicroUsd)
.divide(BigDecimal.valueOf(1_000_000), 6, RoundingMode.HALF_UP);
BigDecimal costCny = costUsd.multiply(BigDecimal.valueOf(billingProps.getUsdToCny()))
.setScale(4, RoundingMode.HALF_UP);
return new UserDailyStats(userId, date, calls, inputTokens, outputTokens,
costUsd.toPlainString(), costCny.toPlainString());
}
/** Computes the cost of a single call (USD); unknown models fall back to the Sonnet price. */
private BigDecimal calculateCost(String model, int inputTokens, int outputTokens) {
BillingProperties.ModelPrice price = billingProps.getModels()
.getOrDefault(model, billingProps.getModels().get("claude-sonnet-4-20250514"));
BigDecimal inputCost = BigDecimal.valueOf(inputTokens)
.multiply(BigDecimal.valueOf(price.getInputPricePerMillion()))
.divide(BigDecimal.valueOf(1_000_000), 8, RoundingMode.HALF_UP);
BigDecimal outputCost = BigDecimal.valueOf(outputTokens)
.multiply(BigDecimal.valueOf(price.getOutputPricePerMillion()))
.divide(BigDecimal.valueOf(1_000_000), 8, RoundingMode.HALF_UP);
return inputCost.add(outputCost);
}
private long parseLong(String value) {
if (value == null || value.isBlank()) return 0L;
try { return Long.parseLong(value); } catch (NumberFormatException e) { return 0L; }
}
/** Daily statistics DTO. */
public record UserDailyStats(
String userId, String date, long calls,
long inputTokens, long outputTokens,
String costUsd, String costCny
) {}
}
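6. Micrometer Monitoring: End-to-End Instrumentation with AOP
An AOP aspect adds the instrumentation automatically, so the business code stays untouched.
6.1 Custom Metrics Aspect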
java
package com.example.openclaw_demo.metrics;
import com.example.openclaw_demo.billing.TokenBillingService;
import com.example.openclaw_demo.dto.ChatResponseDTO;
import com.example.openclaw_demo.security.AiUserPrincipal;
import io.micrometer.core.instrument.*;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
@Slf4j
@Aspect
@Component
@RequiredArgsConstructor
public class AiMetricsAspect {
private final MeterRegistry registry;
private final TokenBillingService billingService;
// Number of in-flight requests (live concurrency)
private final AtomicInteger activeRequests = new AtomicInteger(0);
/** Registers the concurrency gauge once at startup instead of on every call. */
@jakarta.annotation.PostConstruct
void registerGauges() {
Gauge.builder("ai.requests.active", activeRequests, AtomicInteger::get)
.description("Number of AI requests currently in flight")
.register(registry);
}
/** Pointcut: ClawChatServiceImpl.chat and ToolChatServiceImpl.chatWithTools */
@Around("execution(* com.example.openclaw_demo.service.impl.ClawChatServiceImpl.chat(..))" +
"|| execution(* com.example.openclaw_demo.service.impl.ToolChatServiceImpl.chatWithTools(..))")
public Object aroundAiCall(ProceedingJoinPoint pjp) throws Throwable {
String methodName = pjp.getSignature().getName();
AiUserPrincipal user = AiUserPrincipal.current();
String userId = user != null ? user.getUserId() : "anonymous";
// Track concurrency
activeRequests.incrementAndGet();
long startMs = System.currentTimeMillis();
String status = "success";
try {
Object result = pjp.proceed();
// Extract token usage (when the result is a ChatResponseDTO)
if (result instanceof ChatResponseDTO dto) {
recordTokenMetrics(userId, dto);
billingService.record(userId, dto.getModel(),
dto.getInputTokens(), dto.getOutputTokens());
}
return result;
} catch (Exception e) {
status = "error";
// Count the error
Counter.builder("ai.requests.errors")
.tag("method", methodName)
.tag("error_type", e.getClass().getSimpleName())
.description("Total number of failed AI requests")
.register(registry)
.increment();
throw e;
} finally {
activeRequests.decrementAndGet();
long elapsedMs = System.currentTimeMillis() - startMs;
// Record the request latency (histogram, used to compute P50/P95/P99)
Timer.builder("ai.request.duration")
.tag("method", methodName)
.tag("user_type", user != null && user.isVip() ? "vip" : "normal")
.tag("status", status)
.description("AI request latency distribution")
.register(registry)
.record(elapsedMs, TimeUnit.MILLISECONDS);
// Count the request
Counter.builder("ai.requests.total")
.tag("method", methodName)
.tag("status", status)
.description("Total number of AI requests")
.register(registry)
.increment();
log.debug("[Metrics] method={}, userId={}, status={}, cost={}ms",
methodName, userId, status, elapsedMs);
}
}
/** Records token usage metrics. */
private void recordTokenMetrics(String userId, ChatResponseDTO dto) {
// Caution: user_id is a high-cardinality tag. With many users, consider dropping it
// and relying on the per-user Redis billing counters instead.
// Cumulative input tokens (Counter)
Counter.builder("ai.tokens.input")
.tag("model", dto.getModel())
.tag("user_id", userId)
.description("Cumulative input tokens")
.register(registry)
.increment(dto.getInputTokens());
// Cumulative output tokens (Counter)
Counter.builder("ai.tokens.output")
.tag("model", dto.getModel())
.tag("user_id", userId)
.description("Cumulative output tokens")
.register(registry)
.increment(dto.getOutputTokens());
}
}
6.2 Exposing the Prometheus Endpoint
yaml
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health, info, prometheus, metrics
  endpoint:
    prometheus:
      enabled: true
    health:
      show-details: when-authorized
  metrics:
    tags:
      # Global tags: every metric carries the application name and environment
      application: ${spring.application.name}
      environment: ${APP_ENV:dev}
    distribution:
      # Publish histogram buckets and percentiles (P50/P95/P99) for ai.request.duration
      percentiles-histogram:
        ai.request.duration: true
      percentiles:
        ai.request.duration: 0.5, 0.95, 0.99
6.3 Key Grafana Dashboard Metrics (PromQL)
promql
# ── Request volume ────────────────────────────────────────
# AI requests per minute
sum(rate(ai_requests_total[1m])) * 60
# Error rate (%); both sides are aggregated because the two counters carry different labels
sum(rate(ai_requests_errors_total[5m])) / sum(rate(ai_requests_total[5m])) * 100
# ── Latency ───────────────────────────────────────────────
# Micrometer exports the ai.request.duration Timer in seconds, as ai_request_duration_seconds_*
# P50 latency (ms)
histogram_quantile(0.50, sum by (le) (rate(ai_request_duration_seconds_bucket[5m]))) * 1000
# P95 latency (ms): the key metric to watch
histogram_quantile(0.95, sum by (le) (rate(ai_request_duration_seconds_bucket[5m]))) * 1000
# P99 latency (ms)
histogram_quantile(0.99, sum by (le) (rate(ai_request_duration_seconds_bucket[5m]))) * 1000
# ── Token usage ───────────────────────────────────────────
# Tokens consumed per minute
sum(rate(ai_tokens_input_total[1m])) * 60 + sum(rate(ai_tokens_output_total[1m])) * 60
# Output tokens per second, grouped by model
sum by (model) (rate(ai_tokens_output_total[5m]))
# ── Concurrency ───────────────────────────────────────────
# Requests currently in flight
ai_requests_active
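7. Structured Audit Logging
Audit logs are the lifeline of compliance and troubleshooting: they record the full context of every AI call.
7.1 Audit Log Entity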
java
package com.example.openclaw_demo.audit;
import lombok.Builder;
import lombok.Data;
import java.time.LocalDateTime;
@Data
@Builder
public class AiAuditLog {
private String logId; // log ID (UUID)
private String userId; // user ID
private String username; // username
private String clientIp; // client IP
private String requestUri; // request path
private String userMessage; // user input (after masking)
private String aiResponse; // summary of the AI reply (first 200 characters)
private String model; // model used
private int inputTokens; // number of input tokens
private int outputTokens; // number of output tokens
private long elapsedMs; // elapsed time (ms)
private String status; // SUCCESS / ERROR
private String errorMessage; // error message (optional)
private boolean toolCalled; // whether a tool was called
private String toolNames; // names of the tools called (comma-separated)
private LocalDateTime createdAt; // request time
}
7.2 Asynchronous Audit Service
java
package com.example.openclaw_demo.audit;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import java.time.LocalDateTime;
import java.util.UUID;
@Slf4j
@Service
@RequiredArgsConstructor
public class AiAuditService {
private final ObjectMapper objectMapper;
// In a real project you might inject:
// private final AiAuditLogRepository repository; // write to a database
// private final KafkaTemplate<String, String> kafkaTemplate; // write to a message queue
/**
* Writes the audit log asynchronously (@Async keeps it off the request path).
* Requires @EnableAsync on a configuration class.
*/
@Async("auditExecutor")
public void log(AiAuditLog auditLog) {
try {
// Option 1: write to a local log file (structured JSON, easy for ELK to collect)
String jsonLog = objectMapper.writeValueAsString(auditLog);
log.info("[AUDIT] {}", jsonLog);
// Option 2: write to a database
// repository.save(convertToEntity(auditLog));
// Option 3: write to Kafka (suited to high-concurrency scenarios)
// kafkaTemplate.send("ai-audit-logs", auditLog.getLogId(), jsonLog);
} catch (Exception e) {
// An audit failure must not affect the main flow; log a warning only
log.warn("[AUDIT] Failed to write audit log: {}", e.getMessage());
}
}
/** Builds the audit log object. */
public AiAuditLog build(String userId, String username, String clientIp,
String uri, String message, String response,
String model, int inputTokens, int outputTokens,
long elapsedMs, String status, String errorMsg,
boolean toolCalled, String toolNames) {
return AiAuditLog.builder()
.logId(UUID.randomUUID().toString())
.userId(userId)
.username(username)
.clientIp(clientIp)
.requestUri(uri)
// Mask sensitive data, then keep at most the first 500 characters
.userMessage(truncate(desensitize(message), 500))
// Keep only a summary of the reply
.aiResponse(truncate(response, 200))
.model(model)
.inputTokens(inputTokens)
.outputTokens(outputTokens)
.elapsedMs(elapsedMs)
.status(status)
.errorMessage(errorMsg)
.toolCalled(toolCalled)
.toolNames(toolNames)
.createdAt(LocalDateTime.now())
.build();
}
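/** Simple masking of sensitive data such as ID card numbers and mobile numbers. */
private String desensitize(String text) {
if (text == null) return null;
return text
// ID card first (18 characters, the last may be X): 110101********1234.
// Running it before the phone rule stops the phone pattern from matching inside an ID number.
.replaceAll("(?<!\\d)(\\d{6})\\d{8}(\\d{3}[\\dXx])(?!\\d)", "$1********$2")
// Mobile number (exactly 11 digits): 138****8888
.replaceAll("(?<!\\d)(1[3-9]\\d)\\d{4}(\\d{4})(?!\\d)", "$1****$2");
}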
private String truncate(String text, int maxLen) {
if (text == null) return null;
return text.length() <= maxLen ? text : text.substring(0, maxLen) + "...";
}
}
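To make the masking and truncation rules concrete, here is a minimal standalone sketch that can be run without Spring. The class name `MaskingDemo` and the sample inputs are made up for illustration, and the patterns include digit-boundary guards (an assumption about what "safe" masking should look like, not necessarily the service's exact rules).

```java
import java.util.regex.Pattern;

public class MaskingDemo {
    // ID card number: 18 characters, the last may be X; never matches inside a longer digit run
    private static final Pattern ID_CARD =
            Pattern.compile("(?<!\\d)(\\d{6})\\d{8}(\\d{3}[\\dXx])(?!\\d)");
    // Mobile number: 11 digits starting with 13 to 19; never matches inside a longer digit run
    private static final Pattern MOBILE =
            Pattern.compile("(?<!\\d)(1[3-9]\\d)\\d{4}(\\d{4})(?!\\d)");

    /** Masks ID numbers first, then mobile numbers. */
    public static String desensitize(String text) {
        if (text == null) return null;
        String masked = ID_CARD.matcher(text).replaceAll("$1********$2");
        return MOBILE.matcher(masked).replaceAll("$1****$2");
    }

    /** Keeps at most maxLen characters and appends "..." when something was cut. */
    public static String truncate(String text, int maxLen) {
        if (text == null) return null;
        return text.length() <= maxLen ? text : text.substring(0, maxLen) + "...";
    }

    public static void main(String[] args) {
        System.out.println(desensitize("Call me at 13812348888")); // Call me at 138****8888
        System.out.println(desensitize("ID 110101199001011234"));  // ID 110101********1234
        System.out.println(truncate("abcdefghij", 4));             // abcd...
    }
}
```

Without the boundary guards, a mobile-style pattern can match a run of digits inside an ID number (here `19900101123`), so the order of the two replacements and the lookarounds both matter.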
7.3 Audit thread pool configuration
java
package com.example.openclaw_demo.config;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import java.util.concurrent.Executor;
@Configuration
@EnableAsync
public class AsyncConfig {
    /** Dedicated thread pool for audit logging: asynchronous, never blocks request threads */
    @Bean("auditExecutor")
    public Executor auditExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(5);
        executor.setQueueCapacity(1000); // bounded queue; overflow is dropped (auditing must not block business calls)
        executor.setThreadNamePrefix("audit-");
        executor.setRejectedExecutionHandler(
            (r, e) -> {
                // Queue full and all threads busy: log a warning and drop the task
                org.slf4j.LoggerFactory.getLogger(AsyncConfig.class)
                    .warn("[AUDIT] Audit queue is full, dropping log task");
            }
        );
        executor.initialize();
        return executor;
    }
}
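The drop-on-overflow behaviour is easier to see in isolation. The sketch below uses a plain `java.util.concurrent.ThreadPoolExecutor` (which is what `ThreadPoolTaskExecutor` wraps), shrunk to one thread and one queue slot so the rejection is deterministic. The class name `DropOnFullDemo` and the counters are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DropOnFullDemo {
    /** Submits blocking tasks to a 1-thread, 1-slot pool and returns {completed, dropped}. */
    public static int[] run(int tasks) {
        AtomicInteger completed = new AtomicInteger();
        AtomicInteger dropped = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                // Rejection handler: count the rejection and silently drop the task
                (task, executor) -> dropped.incrementAndGet());
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until every task has been submitted
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        release.countDown(); // let the running and queued tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {completed.get(), dropped.get()};
    }

    public static void main(String[] args) {
        int[] result = run(5);
        // One task runs, one waits in the queue, the other three are rejected
        System.out.println("completed=" + result[0] + ", dropped=" + result[1]); // completed=2, dropped=3
    }
}
```

The caller of `execute` never blocks and never sees an exception, which is the property the audit pool relies on. The cost is that audit records can be lost under sustained overload, so the warning log is worth alerting on.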
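8. Graceful shutdown
AI requests are slow (3 to 30 seconds). With immediate shutdown (the default before Spring Boot 3.4), stopping the application, for example on SIGTERM, cuts off in-flight requests and the frontend receives an error.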
java
package com.example.openclaw_demo.config;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.apache.catalina.connector.Connector;
import org.apache.coyote.ProtocolHandler;
import org.apache.coyote.AbstractProtocol;
@Slf4j
@Configuration
public class GracefulShutdownConfig {
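    // Custom key (not a built-in Spring Boot property); set in application.yml below
    @Value("${server.shutdown-timeout:60}")
    private int shutdownTimeoutSeconds;
    /**
     * Tomcat connector tuning that accompanies graceful shutdown.
     * The shutdown sequence itself is enabled by server.shutdown=graceful. After SIGTERM:
     * 1. Stop accepting new requests
     * 2. Wait for in-flight requests to finish (bounded by spring.lifecycle.timeout-per-shutdown-phase)
     * 3. Force shutdown
     */
    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> gracefulShutdownCustomizer() {
        return factory -> factory.addConnectorCustomizers(connector -> {
            ProtocolHandler protocolHandler = connector.getProtocolHandler();
            if (protocolHandler instanceof AbstractProtocol<?> protocol) {
                // Connection timeout: how long Tomcat waits for request data on an accepted
                // connection. It does not by itself extend the shutdown wait.
                protocol.setConnectionTimeout(shutdownTimeoutSeconds * 1000);
            }
        });
    }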
}
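yaml
# application.yml
server:
  # Built-in Spring Boot graceful shutdown (supported since 2.3)
  shutdown: graceful
  shutdown-timeout: 60 # custom key read by GracefulShutdownConfig, in seconds
spring:
  lifecycle:
    timeout-per-shutdown-phase: 65s # upper bound on the shutdown wait; 5 seconds above shutdown-timeout as headroom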
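The three-step sequence (stop accepting, wait with a deadline, then force) is the same pattern the JDK exposes on `ExecutorService`. The sketch below is only an analogy to show the control flow, not how Spring Boot or Tomcat implement it; `GracefulDrainDemo`, `drain` and `demo` are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulDrainDemo {
    /** Returns true if all in-flight work finished within the timeout, false if it was forced. */
    public static boolean drain(ExecutorService pool, long timeoutMs) {
        pool.shutdown(); // 1. stop accepting new work
        try {
            if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true; // 2. in-flight work completed in time
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow(); // 3. force: interrupt whatever is still running
        return false;
    }

    /** Starts one task that sleeps for taskMs, then drains the pool with the given timeout. */
    public static boolean demo(long taskMs, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.submit(() -> {
            try {
                Thread.sleep(taskMs); // stands in for a slow AI call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return drain(pool, timeoutMs);
    }

    public static void main(String[] args) {
        System.out.println(demo(50, 5_000));   // true: short task, generous timeout
        System.out.println(demo(10_000, 100)); // false: the task outlives the timeout and is interrupted
    }
}
```

The second case is what a user sees when the grace period is shorter than the longest AI call, which is why the timeout should sit above the slowest expected request.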
9. Complete configuration overview
yaml
# application.yml (complete production version)
spring:
  application:
    name: openclaw-demo
  data:
    redis:
      host: ${REDIS_HOST:localhost}
      port: ${REDIS_PORT:6379}
      password: ${REDIS_PASSWORD:}
      lettuce:
        pool:
          max-active: 20
          max-idle: 10
  lifecycle:
    timeout-per-shutdown-phase: 65s
server:
  port: 8080
  shutdown: graceful
  tomcat:
    connection-timeout: 300000 # milliseconds
    threads:
      max: 200
      min-spare: 10
# ── OpenClAW ──────────────────────────────────────────────
openclaw:
  api-key: ${ANTHROPIC_API_KEY}
  timeout: 180
# ── Business configuration ────────────────────────────────
app:
  jwt:
    secret: ${JWT_SECRET} # in production, must be a random value of at least 256 bits
    expiration-ms: 86400000 # 24 hours
  claw:
    default-model: claude-sonnet-4-20250514
    default-max-tokens: 1024
    global-system-prompt: "You are a professional assistant. Answer concisely and accurately, in Chinese."
    max-retries: 2
    log-requests: false # disable request logging in production
  rate-limit:
    enabled: true
    rules:
      user: { per-minute: 20, per-day: 500 }
      vip: { per-minute: 60, per-day: 2000 }
      ip: { per-minute: 100 }
      global: { per-minute: 1000 }
  billing:
    usd-to-cny: 7.25
    models: # prices in USD per million tokens; verify against Anthropic's current price list
      claude-sonnet-4-20250514:
        input-price-per-million: 3.0
        output-price-per-million: 15.0
      claude-haiku-4-5-20251001:
        input-price-per-million: 0.8
        output-price-per-million: 4.0
# ── Monitoring ────────────────────────────────────────────
management:
  endpoints:
    web:
      exposure:
        include: health, info, prometheus, metrics
  metrics:
    tags:
      application: ${spring.application.name}
      environment: ${APP_ENV:prod}
    distribution:
      percentiles-histogram:
        ai.request.duration: true
      percentiles:
        ai.request.duration: 0.5, 0.95, 0.99
# ── Logging ───────────────────────────────────────────────
logging:
  level:
    com.example.openclaw_demo: INFO
    io.openclaw: WARN
  pattern:
    console: "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n"
  file:
    name: logs/openclaw-demo.log
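The billing entries in the configuration above price tokens per million and convert to CNY with a fixed rate. As a worked example of that arithmetic, here is a minimal sketch; `TokenCostDemo` and the token counts are made up for illustration, and the real `TokenBillingService` may round or aggregate differently.

```java
import java.math.BigDecimal;

public class TokenCostDemo {
    /** Cost in USD: (inputTokens * inputPrice + outputTokens * outputPrice) / 1,000,000. */
    public static BigDecimal costUsd(long inputTokens, long outputTokens,
                                     BigDecimal inputPricePerMillion,
                                     BigDecimal outputPricePerMillion) {
        BigDecimal input = inputPricePerMillion.multiply(BigDecimal.valueOf(inputTokens));
        BigDecimal output = outputPricePerMillion.multiply(BigDecimal.valueOf(outputTokens));
        return input.add(output).movePointLeft(6); // exact division by 1,000,000
    }

    /** Converts a USD amount to CNY with a fixed exchange rate. */
    public static BigDecimal toCny(BigDecimal usd, BigDecimal usdToCny) {
        return usd.multiply(usdToCny);
    }

    public static void main(String[] args) {
        // Prices of the claude-sonnet-4-20250514 entry; 1200 input and 350 output tokens are sample values
        BigDecimal usd = costUsd(1200, 350, new BigDecimal("3.0"), new BigDecimal("15.0"));
        BigDecimal cny = toCny(usd, new BigDecimal("7.25"));
        System.out.println(usd.stripTrailingZeros().toPlainString()); // 0.00885
        System.out.println(cny.stripTrailingZeros().toPlainString()); // 0.0641625
    }
}
```

`BigDecimal` is used here instead of `double` so that per-request amounts this small can be summed without accumulating floating-point error.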
10. Complete project structure (series finale)
bash
openclaw-demo/
├── src/main/java/com/example/openclaw_demo/
│   │
│   ├── OpenClawDemoApplication.java
│   │
│   ├── config/
│   │   ├── ClawProperties.java
│   │   ├── StreamConfig.java
│   │   ├── WebConfig.java                    ★ registers the rate-limit interceptor
│   │   ├── AsyncConfig.java                  ★ audit thread pool
│   │   └── GracefulShutdownConfig.java       ★ graceful shutdown
│   │
│   ├── security/                             ★ authentication module
│   │   ├── JwtTokenProvider.java             ★ JWT utility class
│   │   ├── JwtAuthFilter.java                ★ JWT filter
│   │   ├── AiUserPrincipal.java              ★ user principal
│   │   └── SecurityConfig.java               ★ Security configuration
│   │
│   ├── ratelimit/                            ★ rate-limiting module
│   │   ├── RateLimitProperties.java          ★ rate-limit configuration
│   │   ├── SlidingWindowRateLimiter.java     ★ Redis sliding window
│   │   └── RateLimitInterceptor.java         ★ rate-limit interceptor
│   │
│   ├── billing/                              ★ billing module
│   │   ├── BillingProperties.java            ★ billing configuration
│   │   └── TokenBillingService.java          ★ token cost tracking
│   │
│   ├── metrics/                              ★ monitoring module
│   │   └── AiMetricsAspect.java              ★ AOP instrumentation aspect
│   │
│   ├── audit/                                ★ audit module
│   │   ├── AiAuditLog.java                   ★ audit log entity
│   │   └── AiAuditService.java               ★ asynchronous audit service
│   │
│   ├── controller/
│   │   ├── AuthController.java               ★ login / token endpoints
│   │   ├── ChatController.java
│   │   ├── ChatStreamController.java
│   │   └── ToolChatController.java
│   │
│   ├── service/                              ... (implemented in earlier parts)
│   ├── tool/                                 ... (implemented in part 5)
│   ├── dto/                                  ... (implemented in part 2)
│   ├── common/                               ... (implemented in part 2)
│   └── exception/                            ... (implemented in part 2)
│
├── src/main/resources/
│   └── application.yml                       complete production configuration
│
└── docker-compose.yml                        ★ starts the dependency services in one step
11. Docker Compose: start the dependencies in one step
yaml
# docker-compose.yml
version: '3.8'
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    restart: unless-stopped
  # Optional: Prometheus + Grafana monitoring stack
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    restart: unless-stopped
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin123 # local use only; change it anywhere else
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - prometheus
    restart: unless-stopped
volumes:
  redis_data:
  grafana_data:
yaml
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'openclaw-demo'
    static_configs:
      - targets: ['host.docker.internal:8080'] # macOS/Windows
        # On Linux use: ['172.17.0.1:8080']
    metrics_path: '/actuator/prometheus'
    # Prometheus must authenticate when scraping (admin token)
    # bearer_token: 'your_admin_jwt_token'
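12. Pre-launch checklist
bash
Security
☐ JWT secret is a random value of at least 256 bits, not a simple password
☐ API key is injected through an environment variable and appears in neither code nor config files
☐ The /actuator/prometheus endpoint is access-controlled
☐ log-requests: false in production (keeps user input out of the logs)
☐ HTTPS is configured (SSL termination at Nginx)
Rate limiting
☐ User-level limits are consistent with the Anthropic account quota
☐ Global rate limiting is enabled so the account as a whole does not get banned
☐ Rate-limit responses include a Retry-After header and the frontend shows a countdown
Monitoring
☐ Prometheus endpoint is reachable
☐ Grafana dashboard is imported and a P95 latency alert is configured
☐ Error-rate alert is configured (notify above 5%)
☐ Concurrency alert is configured (guards against thread pool exhaustion)
Cost
☐ Token prices in the billing config match Anthropic's official pricing
☐ Daily cost monitoring is configured (email/DingTalk notification above a threshold)
☐ Per-user token usage can be queried
Operations
☐ Graceful shutdown is configured (shutdown: graceful)
☐ Audit logs are persisted (database or object storage)
☐ Log file rotation is configured (prevents a full disk)
☐ Redis persistence is enabled (AOF mode)
☐ Health endpoint /actuator/health is wired to the load balancer probe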
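For the first security item, one way to produce a suitable secret is to draw 32 random bytes from `SecureRandom` and Base64-encode them. This is a minimal sketch; `JwtSecretGenerator` is a hypothetical helper, and whether `JwtTokenProvider` expects the secret as Base64 or as raw text depends on the implementation in section 3.2, so adapt the encoding accordingly.

```java
import java.security.SecureRandom;
import java.util.Base64;

public class JwtSecretGenerator {
    /** Returns a Base64-encoded random key of the given size in bits (a positive multiple of 8). */
    public static String generate(int bits) {
        if (bits <= 0 || bits % 8 != 0) {
            throw new IllegalArgumentException("bits must be a positive multiple of 8");
        }
        byte[] key = new byte[bits / 8];
        new SecureRandom().nextBytes(key); // cryptographically strong random bytes
        return Base64.getEncoder().encodeToString(key);
    }

    public static void main(String[] args) {
        // Print once, then export the value as the JWT_SECRET environment variable
        System.out.println(generate(256));
    }
}
```

Even if the provider uses the string's raw bytes rather than decoding it, the 44-character Base64 output is still longer than 256 bits.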
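13. Series summary
This concludes the "OpenClAW + Spring Boot" series. Here is what the six articles covered:
| Part | Key deliverables |
|---|---|
| Part 1 | Integrating OpenClAW into Spring Boot and getting a first Hello World running |
| Part 2 | Service-layer encapsulation, exception hierarchy, unified response, retry mechanism, unit tests |
| Part 3 | SSE streaming output, end-to-end frontend/backend integration, Nginx configuration |
| Part 4 | Multi-turn conversations, Redis session management, multi-user isolation |
| Part 5 | Function Calling framework, annotation-driven tool registration, three kinds of tool implementations |
| Part 6 | JWT authentication, Redis sliding-window rate limiting, token billing, Micrometer monitoring, audit logging, graceful shutdown |
From the Hello World in part 1 to the production-grade service in part 6, this architecture covers the core capabilities a real project needs before going live. You can use it as a base and trim or extend it to fit your own business scenario.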