Comprehensive Redis Monitoring in Spring Boot

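Preface

Cache failures are among the most frequent production incidents. Cache breakdown, cache avalanche, Redis connection pool exhaustion, big keys blocking the server, a collapsing hit rate, and Redis response timeouts all push load straight onto the database and cause widespread interface timeouts.

This article adds a complete Redis monitoring scheme, implemented in three parts:

  • Connection pool metrics built into Spring Data Redis (Lettuce/Jedis)
  • Native Redis INFO runtime status (memory, persistence, hit rate, clients, command latency)
  • Custom monitoring endpoints that feed a unified service dashboard, with matching alert rules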

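I. Telling the Two Redis Clients Apart: Jedis and Lettuce

Spring Boot 2.x defaults to Lettuce (Netty-based, asynchronous, and by default multiplexing requests over a shared connection). Older projects mostly use Jedis (synchronous, one pooled connection per operation). The two expose monitoring data differently, so they are covered separately.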

II. Approach 1: Jedis Connection Pool Monitoring (Synchronous Client)

1. Maven dependencies

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
    <!-- The starter pulls in Lettuce by default; exclude it so Spring Boot configures Jedis -->
    <exclusions>
        <exclusion>
            <groupId>io.lettuce</groupId>
            <artifactId>lettuce-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- Jedis client -->
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
</dependency>
```

Connection pool configuration (yml)

```yaml
spring:
  redis:
    host: 127.0.0.1
    port: 6379
    jedis:
      pool:
        max-active: 20
        max-idle: 8
        min-idle: 2
        max-wait: 300ms
```

2. Monitoring code (fully commented)

```java
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
import org.springframework.util.ReflectionUtils;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.util.Pool;

import javax.annotation.Resource;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

@RestController
public class RedisMonitorController {

    @Resource
    private JedisConnectionFactory jedisConnectionFactory;

    @GetMapping("/monitor/redis/jedis-pool")
    public Map<String, Object> jedisPoolStat() {
        Map<String, Object> res = new HashMap<>();

        // Configured thresholds, read through the factory's public getPoolConfig()
        GenericObjectPoolConfig<?> poolConfig = jedisConnectionFactory.getPoolConfig();
        if (poolConfig != null) {
            res.put("maxActive", poolConfig.getMaxTotal()); // maximum connections in the pool
            res.put("maxIdle", poolConfig.getMaxIdle());    // maximum idle connections
            res.put("minIdle", poolConfig.getMinIdle());    // minimum idle connections
            // Maximum wait for a connection; getMaxWaitDuration() needs commons-pool2 2.10+
            res.put("maxWaitMs", poolConfig.getMaxWaitDuration().toMillis());
        }

        // Live watermarks (the core alerting metrics)
        Pool<Jedis> pool = resolvePool();
        if (pool != null) {
            res.put("activeNum", pool.getNumActive());     // connections currently in use
            res.put("idleNum", pool.getNumIdle());         // idle connections available
            res.put("waitingCount", pool.getNumWaiters()); // requests queued waiting for a connection
            // The two counters below assume Jedis 4.x, where Pool extends GenericObjectPool
            res.put("createdCount", pool.getCreatedCount());     // connections created in total
            res.put("destroyedCount", pool.getDestroyedCount()); // connections destroyed in total
        }
        return res;
    }

    // JedisConnectionFactory exposes no public pool accessor, so this reads its private
    // "pool" field by reflection. The field name is an internal detail and an assumption;
    // verify it against your Spring Data Redis version.
    @SuppressWarnings("unchecked")
    private Pool<Jedis> resolvePool() {
        Field field = ReflectionUtils.findField(JedisConnectionFactory.class, "pool");
        if (field == null) {
            return null;
        }
        ReflectionUtils.makeAccessible(field);
        return (Pool<Jedis>) ReflectionUtils.getField(field, jedisConnectionFactory);
    }
}
```

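Risk signals in the core metrics

  • waitingCount > 0: the Redis connection pool is exhausted, requests are queuing for a connection, and interfaces start timing out in bulk;
  • activeNum == maxActive && idleNum == 0: the pool is completely saturated;
  • createdCount / destroyedCount rising steadily: connections are being rebuilt frequently, which points to an unstable network or poorly chosen idle-eviction settings.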
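The checks above can be expressed as a small pure function, which makes them easy to unit-test before wiring them into an alerting job. The sketch below is only an illustration: the class name `JedisPoolRiskCheck`, the finding labels, and the `churnThreshold` parameter are hypothetical and not part of Jedis or Spring.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns pool snapshots into human-readable risk findings. */
final class JedisPoolRiskCheck {

    /** Evaluates a single snapshot against the waiting and saturation rules. */
    static List<String> evaluate(int maxActive, int activeNum, int idleNum, int waitingCount) {
        List<String> findings = new ArrayList<>();
        if (waitingCount > 0) {
            findings.add("POOL_EXHAUSTED: " + waitingCount + " request(s) waiting for a connection");
        }
        if (activeNum == maxActive && idleNum == 0) {
            findings.add("POOL_SATURATED: all " + maxActive + " connections are in use");
        }
        return findings;
    }

    /** Compares two samples of createdCount; the threshold is an assumption to tune per service. */
    static boolean isChurning(long createdBefore, long createdNow, long churnThreshold) {
        return createdNow - createdBefore > churnThreshold;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(20, 20, 0, 3)); // both rules fire
        System.out.println(evaluate(20, 5, 3, 0));  // healthy pool, empty list
        System.out.println(isChurning(100, 180, 50));
    }
}
```

Feeding it the values returned by /monitor/redis/jedis-pool on each poll keeps the thresholds in one place instead of scattering them across dashboards.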

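III. Approach 2: Lettuce Connection Pool Monitoring (Spring Boot Default)

Lettuce is built on Netty. When pooling is enabled (commons-pool2 on the classpath plus spring.redis.lettuce.pool.* settings), it uses a GenericObjectPool, so the same pool watermarks can be read. Note that by default Spring shares one native connection for ordinary commands, so the pool mainly serves blocking and transactional operations.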

```java
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettucePoolingClientConfiguration;
import org.springframework.util.ReflectionUtils;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

@RestController
public class RedisMonitorController {

    @Resource
    private LettuceConnectionFactory lettuceConnectionFactory;

    @GetMapping("/monitor/redis/lettuce-pool")
    public Map<String, Object> lettucePoolStat() {
        Map<String, Object> res = new HashMap<>();

        // Pool configuration, via the public client configuration (present only when pooling is enabled)
        LettuceClientConfiguration clientConfig = lettuceConnectionFactory.getClientConfiguration();
        if (clientConfig instanceof LettucePoolingClientConfiguration) {
            GenericObjectPoolConfig<?> config =
                    ((LettucePoolingClientConfiguration) clientConfig).getPoolConfig();
            res.put("maxTotal", config.getMaxTotal());
            res.put("maxIdle", config.getMaxIdle());
            res.put("minIdle", config.getMinIdle());
        }

        // Live stats. LettuceConnectionFactory has no public pool accessor, so this walks
        // private fields (connectionProvider -> delegate -> pools). These names are internal
        // details and an assumption; verify them against your Spring Data Redis version.
        Object provider = readField(lettuceConnectionFactory, "connectionProvider");
        Object delegate = readField(provider, "delegate");
        Object pools = readField(delegate != null ? delegate : provider, "pools");

        long active = 0, idle = 0, waiters = 0, created = 0, destroyed = 0, borrowed = 0, returned = 0;
        if (pools instanceof Map) {
            for (Object p : ((Map<?, ?>) pools).values()) { // one pool per connection type
                GenericObjectPool<?> pool = (GenericObjectPool<?>) p;
                active += pool.getNumActive();
                idle += pool.getNumIdle();
                waiters += pool.getNumWaiters();
                created += pool.getCreatedCount();
                destroyed += pool.getDestroyedCount();
                borrowed += pool.getBorrowedCount();
                returned += pool.getReturnedCount();
            }
        }
        res.put("active", active);       // connections currently in use
        res.put("idle", idle);           // idle connections
        res.put("waiters", waiters);     // requests queued waiting for a connection
        res.put("created", created);
        res.put("destroyed", destroyed);
        res.put("borrowed", borrowed);   // total number of borrows
        res.put("returned", returned);   // total number of returns
        return res;
    }

    // Reads a private field by name; returns null when the target or the field is missing
    private static Object readField(Object target, String name) {
        if (target == null) {
            return null;
        }
        Field field = ReflectionUtils.findField(target.getClass(), name);
        if (field == null) {
            return null;
        }
        ReflectionUtils.makeAccessible(field);
        return ReflectionUtils.getField(field, target);
    }
}
```

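IV. Approach 3: Reading Redis Server Status via INFO (the Most Important Business Metrics)

Run the INFO command through redisTemplate to obtain server-wide status: memory, cache hit rate, persistence, clients, command statistics, disk, and expired keys.

Complete monitoring endpoint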

```java
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

@RestController
public class RedisInfoMonitorController {

    @Resource
    private RedisTemplate<String, Object> redisTemplate;

    @GetMapping("/monitor/redis/info")
    public Map<String, Object> redisInfo() {
        Map<String, Object> result = new HashMap<>();

        // Run INFO to fetch the full server status. The explicit RedisCallback cast avoids
        // the ambiguity between the execute(...) overloads when a lambda is passed.
        Properties info = redisTemplate.execute(
                (RedisCallback<Properties>) connection -> connection.serverCommands().info());
        if (info == null) {
            return result; // e.g. when called inside a pipeline or transaction
        }

        // 1. Memory
        result.put("usedMemory", info.getProperty("used_memory"));            // memory used by Redis, in bytes
        result.put("usedMemoryHuman", info.getProperty("used_memory_human")); // human-readable form
        result.put("maxMemory", info.getProperty("maxmemory_human"));         // configured memory limit

        // 2. Cache hit rate (the core business metric)
        long keyHits = Long.parseLong(info.getProperty("keyspace_hits", "0"));
        long keyMisses = Long.parseLong(info.getProperty("keyspace_misses", "0"));
        double hitRate = keyHits + keyMisses == 0 ? 1.0 : (double) keyHits / (keyHits + keyMisses);
        result.put("keyspace_hits", keyHits);
        result.put("keyspace_misses", keyMisses);
        result.put("cacheHitRate", String.format("%.2f%%", hitRate * 100));

        // 3. Client connections
        result.put("connectedClients", info.getProperty("connected_clients")); // current client connections

        // 4. Key statistics
        // db0 is a summary string such as "keys=...,expires=...,avg_ttl=...", not a bare count
        result.put("db0Keyspace", info.getProperty("db0"));
        result.put("expiredKeys", info.getProperty("expired_keys")); // keys removed by expiry, cumulative
        result.put("evictedKeys", info.getProperty("evicted_keys")); // keys evicted under memory pressure

        // 5. Command execution
        result.put("totalCommandsProcessed", info.getProperty("total_commands_processed"));

        // 6. Persistence (RDB/AOF)
        result.put("rdbLastSaveTime", info.getProperty("rdb_last_save_time"));
        result.put("aofEnabled", info.getProperty("aof_enabled"));
        return result;
    }
}
```

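Alerting notes for the core INFO metrics

  • Cache hit rate cacheHitRate < 90%
    The cache design is not working: many requests pass through to the database and its load climbs;
  • evictedKeys > 0 and still growing
    Redis memory is full and the eviction policy is firing; cached data is being lost and an avalanche becomes likely;
  • connectedClients climbing sharply
    A connection leak is likely: clients are not closing their connections;
  • usedMemory approaching maxmemory
    Memory is about to run out, which triggers eviction or blocks writes;
  • keyspace_misses spiking: a hot cache entry has been invalidated, or many keys expired at once.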
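The derived values behind these rules (hit rate, memory usage ratio, key count) are simple arithmetic over INFO fields, sketched below on a plain `Properties` object so it runs without a Redis server. The class name `RedisInfoMath` and its method names are hypothetical; the field names (`keyspace_hits`, `keyspace_misses`, `used_memory`, `maxmemory`) are standard INFO fields, and the `keys=...,expires=...,avg_ttl=...` layout is the usual format of a `dbN` line.

```java
import java.util.Properties;

/** Hypothetical helper: derives alerting values from a parsed INFO reply. */
final class RedisInfoMath {

    /** hits / (hits + misses); returns 1.0 when no lookups have happened yet. */
    static double hitRate(Properties info) {
        long hits = Long.parseLong(info.getProperty("keyspace_hits", "0"));
        long misses = Long.parseLong(info.getProperty("keyspace_misses", "0"));
        long total = hits + misses;
        return total == 0 ? 1.0 : (double) hits / total;
    }

    /** used_memory / maxmemory; returns -1.0 when maxmemory is 0 (no limit configured). */
    static double memoryUsageRatio(Properties info) {
        long used = Long.parseLong(info.getProperty("used_memory", "0"));
        long max = Long.parseLong(info.getProperty("maxmemory", "0"));
        return max == 0 ? -1.0 : (double) used / max;
    }

    /** Extracts the key count from a keyspace line such as "keys=12,expires=3,avg_ttl=0". */
    static long keyCount(String keyspaceLine) {
        if (keyspaceLine == null) {
            return 0;
        }
        for (String part : keyspaceLine.split(",")) {
            String[] kv = part.split("=", 2);
            if (kv.length == 2 && kv[0].trim().equals("keys")) {
                return Long.parseLong(kv[1].trim());
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        Properties info = new Properties();
        info.setProperty("keyspace_hits", "90");
        info.setProperty("keyspace_misses", "10");
        info.setProperty("used_memory", "512");
        info.setProperty("maxmemory", "1024");
        System.out.println(hitRate(info));
        System.out.println(memoryUsageRatio(info));
        System.out.println(keyCount("keys=12,expires=3,avg_ttl=0"));
    }
}
```

Because keyspace_hits and keyspace_misses are cumulative since server start, computing the rate over the difference between two polls gives a more responsive signal than the lifetime ratio.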

V. Supplement: Monitoring Big Keys and Slow Commands

1. Slow query monitoring

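```java
// Fetch the 10 most recent slow-log entries. RedisConnection does not appear to offer a
// dedicated slow-log method, so this sends SLOWLOG GET 10 through the generic execute(...);
// the shape of the raw reply depends on the client driver.
// Requires: import org.springframework.data.redis.core.RedisCallback;
Object slowLog = redisTemplate.execute((RedisCallback<Object>) conn ->
        conn.execute("SLOWLOG", "GET".getBytes(), "10".getBytes()));
```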

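The slow log records commands whose execution time exceeds the configured threshold (slowlog-log-slower-than), for example a full read of a large hash with hgetall, or keys *. Such commands block Redis's main thread.

2. Scanning for big keys

Do not run keys * frequently in production. Instead, have a scheduled background thread walk the keyspace in batches with scan and record every key whose value exceeds a threshold (for example, value > 10 KB).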
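The cursor loop behind a batched scan can be sketched independently of any client library. In the example below, `KeySource`, `Page`, and `inMemory` are hypothetical stand-ins: in a real service `scan` would wrap the client's SCAN call and `sizeInBytes` something like MEMORY USAGE, while the in-memory source exists only so the loop can be run without a server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BigKeyScanner {

    /** Hypothetical abstraction over the Redis client: one SCAN step plus a size lookup. */
    interface KeySource {
        Page scan(String cursor, int count);
        long sizeInBytes(String key);
    }

    /** One batch of keys plus the cursor to pass to the next call ("0" means finished). */
    static final class Page {
        final String nextCursor;
        final List<String> keys;

        Page(String nextCursor, List<String> keys) {
            this.nextCursor = nextCursor;
            this.keys = keys;
        }
    }

    /** Iterates batch by batch until the cursor returns to "0", collecting oversized keys. */
    static Map<String, Long> findBigKeys(KeySource source, long thresholdBytes, int batchSize) {
        Map<String, Long> bigKeys = new LinkedHashMap<>();
        String cursor = "0";
        do {
            Page page = source.scan(cursor, batchSize);
            for (String key : page.keys) {
                long size = source.sizeInBytes(key);
                if (size > thresholdBytes) {
                    bigKeys.put(key, size);
                }
            }
            cursor = page.nextCursor;
        } while (!"0".equals(cursor));
        return bigKeys;
    }

    /** In-memory stand-in used only to demonstrate the loop without a Redis server. */
    static KeySource inMemory(Map<String, Long> sizes) {
        List<String> keys = new ArrayList<>(sizes.keySet());
        return new KeySource() {
            @Override
            public Page scan(String cursor, int count) {
                int from = Integer.parseInt(cursor);
                int to = Math.min(from + count, keys.size());
                String next = to >= keys.size() ? "0" : String.valueOf(to);
                return new Page(next, new ArrayList<>(keys.subList(from, to)));
            }

            @Override
            public long sizeInBytes(String key) {
                return sizes.get(key);
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("user:1", 100L);
        sizes.put("report:2024", 20_000L);
        System.out.println(findBigKeys(inMemory(sizes), 10 * 1024, 1));
    }
}
```

Against a real server, SCAN may return the same key more than once and treats the count as a hint, which is why the results are collected into a map keyed by name; pausing briefly between batches also keeps the scan from competing with business traffic.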

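VI. Standardized Monitoring with Actuator (Recommended for Long-Term Production Dashboards)

Add Micrometer to collect Redis metrics automatically and feed them to Prometheus/Grafana, with no hand-written endpoints: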

```xml
<!-- Actuator provides the metrics auto-configuration and the /actuator/prometheus endpoint -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
```

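Metrics to look for (the names below are indicative and vary with the Spring Boot version and client; Lettuce command timers, for instance, are published under lettuce.command.*, so confirm the actual names at /actuator/metrics):

  • redis.operations.*: execution count and latency per command type
  • redis.client.connections.active: active connections
  • redis.cache.hit.ratio: cache hit rate
  • redis.memory.used: memory usage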

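VII. Layered Monitoring Summary (the Full Tomcat / Druid / Redis Stack)

  • Access layer (Tomcat)
    Business concurrency, active threads, and queued requests, to judge whether traffic has saturated the service;
  • Cache layer (Redis)
    Pool queue length, cache hit rate, memory usage, evictions, and slow commands;
  • Database layer (Druid)
    DB connection queue, active connections, slow SQL, SQL errors, and long transactions.

Reading the three layers together quickly pins down where a fault lives:

  • Tomcat queue rising and Redis waitingCount rising → cache bottleneck;
  • Tomcat queue rising and Druid pendingCount rising → database bottleneck;
  • Only Tomcat active threads high, with no queuing in the middleware → application code is blocking locally.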
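These three rules amount to a small decision table, sketched below. `BottleneckLocator` and its `Layer` enum are hypothetical names. Checking the cache before the database when both are queuing is an assumption made here, on the reasoning that a failing cache usually pushes its load onto the database.

```java
/** Hypothetical decision table for the three-layer correlation rules. */
final class BottleneckLocator {

    enum Layer { CACHE, DATABASE, APPLICATION, UNKNOWN }

    static Layer locate(boolean tomcatQueueRising,
                        boolean tomcatThreadsHigh,
                        boolean redisWaitingRising,
                        boolean druidPendingRising) {
        // Assumption: when both middleware layers queue, the cache is reported first
        if (tomcatQueueRising && redisWaitingRising) {
            return Layer.CACHE;
        }
        if (tomcatQueueRising && druidPendingRising) {
            return Layer.DATABASE;
        }
        if (tomcatThreadsHigh && !redisWaitingRising && !druidPendingRising) {
            return Layer.APPLICATION;
        }
        return Layer.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(locate(true, true, true, false));
        System.out.println(locate(true, true, false, true));
        System.out.println(locate(false, true, false, false));
    }
}
```

The boolean inputs would come from comparing consecutive samples of each layer's queue metric, so "rising" needs its own definition per service (for example, higher than the previous poll for several polls in a row).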

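VIII. Production Alert Rules

1. Critical alerts

  • Jedis/Lettuce waiters > 0: the Redis connection pool is exhausted;
  • evictedKeys increases by more than 0 within a minute: Redis is out of memory and evicting keys.

2. Performance alerts

  • Cache hit rate below 90%;
  • The number of slow queries keeps growing.

3. Resource alerts

  • connectedClients approaching the Redis maximum client limit;
  • usedMemory at 90% of maxmemory or above.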
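Several of these rules compare two consecutive samples rather than a single reading, since evictedKeys is a cumulative counter. A minimal sketch of a rule evaluator follows; `RedisAlertRules` and `Sample` are hypothetical names, and the 80% cutoff for "approaching" the client limit is an assumption, since the rule above gives no number. The slow-query rule is left out because it depends on how the slow log is collected.

```java
import java.util.ArrayList;
import java.util.List;

final class RedisAlertRules {

    /** One polling sample; the fields mirror the values used in this article. */
    static final class Sample {
        final int poolWaiters;
        final long evictedKeys;      // cumulative counter from INFO
        final double hitRate;        // 0.0 to 1.0
        final long usedMemory;       // bytes
        final long maxMemory;        // bytes, 0 means no limit configured
        final long connectedClients;
        final long maxClients;

        Sample(int poolWaiters, long evictedKeys, double hitRate,
               long usedMemory, long maxMemory, long connectedClients, long maxClients) {
            this.poolWaiters = poolWaiters;
            this.evictedKeys = evictedKeys;
            this.hitRate = hitRate;
            this.usedMemory = usedMemory;
            this.maxMemory = maxMemory;
            this.connectedClients = connectedClients;
            this.maxClients = maxClients;
        }
    }

    static List<String> evaluate(Sample previous, Sample current) {
        List<String> alerts = new ArrayList<>();
        if (current.poolWaiters > 0) {
            alerts.add("CRITICAL connection pool exhausted");
        }
        long evictedDelta = current.evictedKeys - previous.evictedKeys;
        if (evictedDelta < 0) {
            evictedDelta = current.evictedKeys; // counter restarted from zero, e.g. after a server restart
        }
        if (evictedDelta > 0) {
            alerts.add("CRITICAL keys evicted since last sample: " + evictedDelta);
        }
        if (current.hitRate < 0.90) {
            alerts.add("PERFORMANCE cache hit rate below 90%");
        }
        if (current.maxMemory > 0 && current.usedMemory >= 0.9 * current.maxMemory) {
            alerts.add("RESOURCE used memory at or above 90% of maxmemory");
        }
        // Assumption: "approaching" the client limit is taken as 80% of it
        if (current.maxClients > 0 && current.connectedClients >= 0.8 * current.maxClients) {
            alerts.add("RESOURCE connected clients near the client limit");
        }
        return alerts;
    }

    public static void main(String[] args) {
        Sample before = new Sample(0, 100, 0.95, 500, 1000, 10, 10000);
        Sample after = new Sample(2, 105, 0.85, 950, 1000, 10, 10000);
        evaluate(before, after).forEach(System.out::println);
    }
}
```

Polling once a minute and passing the previous and current samples makes the eviction rule match the "per minute" wording above.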

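IX. Pitfalls

  • Lettuce multiplexes by default: a single connection carries many requests, so pool exhaustion is less likely than with Jedis, but the waiters metric is still meaningful;
  • Do not poll INFO at high frequency (once every 30 s to 1 min is recommended), to avoid adding load to Redis's main thread;
  • Never run keys * repeatedly in production to scan the whole keyspace; it blocks Redis;
  • When the cache hit rate is low, check expiry times, hot keys, and the null-value caching strategy first.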
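One way to enforce the INFO polling interval, regardless of how often the monitoring endpoint itself is called, is to cache the last snapshot and refresh it only after a minimum interval has passed. The sketch below is an illustration under that assumption; `ThrottledSnapshot` is a hypothetical class, and the injected clock exists only to make it testable.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical wrapper: reloads a value at most once per minimum interval. */
final class ThrottledSnapshot<T> {

    private final Supplier<T> loader;
    private final long minIntervalMillis;
    private final LongSupplier clock;

    private T cached;
    private long loadedAt;
    private boolean loaded;

    ThrottledSnapshot(Supplier<T> loader, long minIntervalMillis, LongSupplier clock) {
        this.loader = loader;
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
    }

    /** Returns the cached value, calling the loader only when the interval has elapsed. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= minIntervalMillis) {
            cached = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        ThrottledSnapshot<String> info = new ThrottledSnapshot<>(
                () -> "snapshot #" + (++calls[0]), 30_000, System::currentTimeMillis);
        System.out.println(info.get());
        System.out.println(info.get()); // served from cache, the loader is not called again
    }
}
```

In the INFO endpoint, the loader would be the call that runs INFO through redisTemplate, so a burst of dashboard refreshes costs Redis a single command per interval.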

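X. A Unified Monitoring Dashboard

Aggregate the Tomcat, Druid, and Redis monitoring into a single endpoint, /monitor/system/stat, which returns the full set of performance metrics for the web, database, and cache layers in one call so that operations can troubleshoot production faults quickly.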
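A minimal sketch of the aggregation step is shown below, kept free of Spring so it can run on its own. `SystemStatAggregator` and the section names are hypothetical; in the real endpoint each supplier would delegate to the corresponding monitoring method. The design choice illustrated is that a failure in one layer is reported inside that layer's section instead of failing the whole response.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical aggregator: merges per-layer metric maps into one response body. */
final class SystemStatAggregator {

    private final Map<String, Supplier<Map<String, Object>>> sections = new LinkedHashMap<>();

    SystemStatAggregator register(String name, Supplier<Map<String, Object>> collector) {
        sections.put(name, collector);
        return this;
    }

    /** Collects every section; a failing collector yields an "error" entry for its section only. */
    Map<String, Object> collect() {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Map<String, Object>>> entry : sections.entrySet()) {
            try {
                out.put(entry.getKey(), entry.getValue().get());
            } catch (RuntimeException ex) {
                out.put(entry.getKey(),
                        Collections.singletonMap("error", String.valueOf(ex.getMessage())));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SystemStatAggregator aggregator = new SystemStatAggregator()
                .register("tomcat", () -> Collections.<String, Object>singletonMap("busyThreads", 12))
                .register("redis", () -> { throw new IllegalStateException("redis unreachable"); });
        System.out.println(aggregator.collect());
    }
}
```

Keeping the sections in registration order (a LinkedHashMap) means the response always lists the layers in the same sequence, which makes side-by-side comparison of two snapshots easier.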