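The problem
The project uses Redis Cluster in a 3-master, 3-replica setup, with Spring Boot 2.1.3. After one of the master nodes went down, the application started failing with io.lettuce.core.RedisCommandExecutionException: CLUSTERDOWN Hash slot not served.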
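Investigation and solution
In theory, when a master goes down, Redis Cluster fails over to a replica automatically. The cluster keeps running, and because every node has a replica, no data is lost. I logged on to the servers and checked the cluster info and state: everything was reported as healthy, and the data that had lived on the failed node could still be read.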
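Digging further, I found that the application was still sending requests to the node that had gone down, which pointed to a client-side caching problem. The documentation confirmed it: a client such as Lettuce caches the Redis Cluster topology (the routing information) locally for performance. If node state changes and the client does not update its routing table in time, it keeps routing commands to the dead node.
Spring Boot does not expose topology refresh settings before 2.3.0. Looking at the spring-boot-starter-data-redis package, I found no properties for refreshing the topology. Spring Boot uses Lettuce as its default Redis client, topology refresh is off by default, and on a version this old there is no property for configuring refresh rules.
The Lettuce reference documentation describes the topology update strategies. Lettuce itself has supported refresh configuration since versions 3.1 and 4.0; Spring Boot simply did not catch up until 2.3.0.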
5.3.4. Refreshing the cluster topology view
The Redis Cluster configuration may change at runtime. New nodes can be added, the master for a specific slot can change. Lettuce handles MOVED and ASK redirects transparently but in case too many commands run into redirects, you should refresh the cluster topology view. The topology is bound to a RedisClusterClient instance. All cluster connections that are created by one RedisClusterClient instance share the same cluster topology view. The view can be updated in three ways:
1. Either by calling RedisClusterClient.reloadPartitions
2. Periodic updates in the background based on an interval (periodic refresh)
3. Adaptive updates in the background based on persistent disconnects and MOVED/ASK redirections (adaptive refresh)
By default, commands follow -ASK and -MOVED redirects up to 5 times until the command execution is considered to be failed. Background topology updating starts with the first connection obtained through RedisClusterClient.
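To make the stale routing table concrete, here is a small toy model in plain Java. It is only a sketch and does not use Lettuce: the class names, the choice of which addresses are masters, and the name of the promoted replica are all assumptions made for illustration. The slot calculation (CRC16 of the key modulo 16384) follows the Redis Cluster specification, with hash tags ignored. A client that never refreshes keeps routing to the dead master, while a manual refresh or a failure-triggered (adaptive-style) refresh picks up the promoted replica.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of a cluster client that caches the slot-to-node routing table. */
class StaleTopologyDemo {

    /** CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), as used for key slots. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Key slot = CRC16(key) mod 16384. Hash tags ({...}) are ignored in this sketch. */
    static int slotOf(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    /** The "server side": the current owner of each slot range plus the set of dead nodes. */
    static final class Cluster {
        final TreeMap<Integer, String> owners = new TreeMap<>(); // first slot of range -> master
        final Set<String> down = new HashSet<>();

        Cluster() {
            // Assumption: these three addresses are the masters, with evenly split slots.
            owners.put(0, "192.168.1.1:7002");
            owners.put(5461, "192.168.1.2:7004");
            owners.put(10923, "192.168.1.3:7000");
        }

        /** Simulates a master crash followed by promotion of its replica. */
        void failover(String deadMaster, String promotedReplica) {
            down.add(deadMaster);
            owners.replaceAll((start, node) -> node.equals(deadMaster) ? promotedReplica : node);
        }
    }

    /** The "client side": routes by a cached copy of the topology. */
    static final class Client {
        private final Cluster cluster;
        private final boolean adaptive;
        private TreeMap<Integer, String> cachedView;

        Client(Cluster cluster, boolean adaptive) {
            this.cluster = cluster;
            this.adaptive = adaptive;
            refresh();
        }

        /** Reloads the routing table, analogous to a topology refresh. */
        void refresh() {
            cachedView = new TreeMap<>(cluster.owners);
        }

        /** Returns the node that served the key, or throws if the routed node is down. */
        String execute(String key) {
            String node = cachedView.floorEntry(slotOf(key)).getValue();
            if (adaptive && cluster.down.contains(node)) {
                refresh(); // adaptive-style trigger: a failed node causes a refresh and a retry
                node = cachedView.floorEntry(slotOf(key)).getValue();
            }
            if (cluster.down.contains(node)) {
                throw new IllegalStateException("routed to dead node " + node);
            }
            return node;
        }
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster();
        Client cachedOnly = new Client(cluster, false);
        Client adaptive = new Client(cluster, true);

        String master = cachedOnly.execute("foo");
        cluster.failover(master, "promoted-replica");

        try {
            cachedOnly.execute("foo");
        } catch (IllegalStateException e) {
            System.out.println("stale view: " + e.getMessage());
        }
        System.out.println("adaptive client: " + adaptive.execute("foo"));
        cachedOnly.refresh();
        System.out.println("after manual refresh: " + cachedOnly.execute("foo"));
    }
}
```

The real client is more involved (redirect handling, background threads, connection management), but the failure mode is the same: the cluster has already recovered, and only the client's cached view is out of date.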
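Upgrading Spring Boot is not strictly required: you can instead modify the Redis configuration class, that is, the Spring bean that configures Redis. In my case I upgraded Spring Boot to a version from 2.3.0 onward and added the following configuration to the yml file, which solved the problem.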
yaml
spring:
  redis:
    cluster:
      nodes: 192.168.1.1:7002,192.168.1.1:7003,192.168.1.2:7004,192.168.1.2:7005,192.168.1.3:7000,192.168.1.3:7001
      max-redirects: 3
    password:
    timeout: 5000
    lettuce:
      cluster:
        refresh:
          adaptive: true
          period: 10000
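P.S. The Spring Boot version itself has to be upgraded. Swapping in a newer spring-boot-starter-data-redis dependency on its own does not work, because the available properties are defined in the spring-boot-autoconfigure module. For details, see how Spring Boot auto-configuration works.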
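For the route that avoids upgrading Spring Boot, the following is a minimal sketch of what such a configuration bean could look like. It is not the configuration from my project: the class name and bean name are hypothetical, the 10-second period simply mirrors period: 10000 from the yml above, and it assumes that LettuceClientConfigurationBuilderCustomizer is available in your Spring Boot version and that your Lettuce version offers the Duration-based enablePeriodicRefresh. Check both against the versions you actually run.

```java
import java.time.Duration;

import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import org.springframework.boot.autoconfigure.data.redis.LettuceClientConfigurationBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical class name; place it wherever your Redis configuration lives.
@Configuration
public class RedisTopologyRefreshConfig {

    @Bean
    public LettuceClientConfigurationBuilderCustomizer topologyRefreshCustomizer() {
        ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
                // Periodic refresh: reload the topology in the background every 10 seconds.
                .enablePeriodicRefresh(Duration.ofSeconds(10))
                // Adaptive refresh: also reload on triggers such as MOVED/ASK redirects
                // and persistent disconnects.
                .enableAllAdaptiveRefreshTriggers()
                .build();

        ClusterClientOptions clientOptions = ClusterClientOptions.builder()
                .topologyRefreshOptions(refreshOptions)
                .build();

        // Spring Boot applies this customizer while it builds the Lettuce client configuration.
        return builder -> builder.clientOptions(clientOptions);
    }
}
```

If you build the LettuceConnectionFactory yourself instead of relying on auto-configuration, the same ClusterClientOptions can be passed through LettuceClientConfiguration.builder().clientOptions(...).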
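According to the Spring Boot documentation and source code, Spring Boot 2.3.0 added the two configuration properties adaptive and period. Dynamic refresh sources are on by default (dynamicRefreshSources=true), and Spring Boot 2.4.0 added the ability to switch them off:
Allow disabling Redis Cluster dynamic sources refresh using spring.redis.lettuce.cluster.refresh.dynamic-sources.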
For comparison, here are the relevant parts of the RedisProperties source in Spring Boot 2.3.x and 2.4.x.
Part of the Spring Boot 2.3.x version:
java
/**
 * Lettuce client properties.
 */
public static class Lettuce {

    public static class Cluster {

        private final Refresh refresh = new Refresh();

        public Refresh getRefresh() {
            return this.refresh;
        }

        public static class Refresh {

            /**
             * Cluster topology refresh period.
             */
            private Duration period;

            /**
             * Whether adaptive topology refreshing using all available refresh
             * triggers should be used.
             */
            private boolean adaptive;

            public Duration getPeriod() {
                return this.period;
            }

            public void setPeriod(Duration period) {
                this.period = period;
            }

            public boolean isAdaptive() {
                return this.adaptive;
            }

            public void setAdaptive(boolean adaptive) {
                this.adaptive = adaptive;
            }
        }
    }
}
Part of the Spring Boot 2.4.x version:
java
/**
 * Lettuce client properties.
 */
public static class Lettuce {

    public static class Cluster {

        private final Refresh refresh = new Refresh();

        public Refresh getRefresh() {
            return this.refresh;
        }

        public static class Refresh {

            /**
             * Whether to discover and query all cluster nodes for obtaining the
             * cluster topology. When set to false, only the initial seed nodes are
             * used as sources for topology discovery.
             */
            private boolean dynamicRefreshSources = true;

            /**
             * Cluster topology refresh period.
             */
            private Duration period;

            /**
             * Whether adaptive topology refreshing using all available refresh
             * triggers should be used.
             */
            private boolean adaptive;

            public boolean isDynamicRefreshSources() {
                return this.dynamicRefreshSources;
            }

            public void setDynamicRefreshSources(boolean dynamicRefreshSources) {
                this.dynamicRefreshSources = dynamicRefreshSources;
            }

            public Duration getPeriod() {
                return this.period;
            }

            public void setPeriod(Duration period) {
                this.period = period;
            }

            public boolean isAdaptive() {
                return this.adaptive;
            }

            public void setAdaptive(boolean adaptive) {
                this.adaptive = adaptive;
            }
        }
    }
}
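The dynamicRefreshSources and adaptive properties are easy to confuse, so here is a short explanation of each.
dynamicRefreshSources: controls which nodes the client queries when it refreshes the topology. When set to true (the default), the client discovers the cluster nodes and uses all of them as sources for the topology. When set to false, only the initial seed nodes from the configuration are queried. On its own it does not switch refreshing on or off.
adaptive: controls whether adaptive refresh is enabled. When set to true, the client refreshes the topology in response to triggers such as persistent disconnects and MOVED/ASK redirects, in addition to any periodic refresh. As the source above shows, the field has no initializer, so it defaults to false and has to be enabled explicitly. When it is false, disconnects and redirects do not trigger a refresh.
In short, the two properties answer different questions: dynamicRefreshSources decides where the topology information is fetched from, while adaptive decides whether disconnects and redirects trigger a refresh.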
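The difference can also be written down as a toy model in plain Java. This is only a sketch of the semantics described in the property comments, not Lettuce's actual implementation; the class and method names are hypothetical, and the node addresses are taken from the yml example.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model: dynamicRefreshSources picks WHO is asked, adaptive affects WHEN a refresh runs. */
class RefreshOptionsDemo {

    /**
     * dynamicRefreshSources = true: ask every node discovered in the cluster.
     * dynamicRefreshSources = false: ask only the configured seed nodes.
     */
    static List<String> refreshSources(boolean dynamicRefreshSources,
                                       List<String> seedNodes,
                                       List<String> discoveredNodes) {
        return dynamicRefreshSources ? discoveredNodes : seedNodes;
    }

    /** A refresh can only succeed if at least one source node is still reachable. */
    static boolean canRefresh(List<String> sources, List<String> downNodes) {
        return sources.stream().anyMatch(node -> !downNodes.contains(node));
    }

    /**
     * A periodic refresh fires whenever its interval has elapsed; with adaptive enabled,
     * a disconnect or MOVED/ASK redirect also fires one.
     */
    static boolean shouldRefresh(boolean adaptive, boolean periodElapsed,
                                 boolean sawDisconnectOrRedirect) {
        return periodElapsed || (adaptive && sawDisconnectOrRedirect);
    }

    public static void main(String[] args) {
        List<String> seeds = Arrays.asList("192.168.1.1:7002");
        List<String> discovered = Arrays.asList(
                "192.168.1.1:7002", "192.168.1.1:7003", "192.168.1.2:7004");
        List<String> down = Arrays.asList("192.168.1.1:7002"); // the only seed node is down

        System.out.println("seed-only sources can refresh: "
                + canRefresh(refreshSources(false, seeds, discovered), down));
        System.out.println("dynamic sources can refresh: "
                + canRefresh(refreshSources(true, seeds, discovered), down));
        System.out.println("adaptive reacts to a redirect: "
                + shouldRefresh(true, false, true));
        System.out.println("non-adaptive waits for the period: "
                + shouldRefresh(false, false, true));
    }
}
```

In this model, turning dynamic sources off while listing only a few seed nodes means a refresh has nobody to ask once those seeds are down, which is one reason to list several nodes in spring.redis.cluster.nodes.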