What does the Redis "Subscribe timeout" error mean?

🐛 Introduction

Redisson version: 2.8.2

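Recently our company's system has occasionally been throwing org.redisson.client.RedisTimeoutException: Subscribe timeout: (7500ms). The stack trace points to a place that uses a Redisson Redis lock. With the business logic stripped out, the code is essentially as follows: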

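public void mockLock(String phoneNum) {
    // assumption: threadName was defined outside the original snippet
    String threadName = Thread.currentThread().getName();
    log.info("{} - prepare lock", threadName);
    RLock lock = redissonClient.getLock("redis_cache_test" + phoneNum);
    try {
        lock.lock();
        log.info("{} - get lock", threadName);
        // sleep for 10 s
        Thread.sleep(10000);
    } catch (Exception e) {
        log.info("{} - exception", threadName, e);
    } finally {
        log.info("{} - unlock lock", threadName);
        // only unlock if lock() actually succeeded; otherwise unlock() itself throws
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }
}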

The exception is thrown inside the implementation of lock.lock(), in CommandAsyncService#syncSubscription:

@Override
public void syncSubscription(RFuture<?> future) {
    MasterSlaveServersConfig config = connectionManager.getConfig();
    try {
        int timeout = config.getTimeout() + config.getRetryInterval()*config.getRetryAttempts();
        if (!future.await(timeout)) {
            throw new RedisTimeoutException("Subscribe timeout: (" + timeout + "ms)");
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
    future.syncUninterruptibly();
}
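
The number in the error message comes from the expression in syncSubscription: timeout + retryInterval * retryAttempts. As a rough check, the sketch below evaluates that expression for two sets of values. The concrete values (3000/1500/3 and 1000/1500/3) are assumptions chosen because they are consistent with the 7500ms and 5500ms messages in this post; they are not read from the actual configuration.

```java
class SubscribeTimeoutExample {

    // Mirrors the expression used in syncSubscription:
    // timeout + retryInterval * retryAttempts
    static int subscribeTimeout(int timeout, int retryInterval, int retryAttempts) {
        return timeout + retryInterval * retryAttempts;
    }

    public static void main(String[] args) {
        // Assumed values: timeout=3000, retryInterval=1500, retryAttempts=3
        System.out.println(subscribeTimeout(3000, 1500, 3)); // 7500

        // Assumed values: timeout=1000, retryInterval=1500, retryAttempts=3
        System.out.println(subscribeTimeout(1000, 1500, 3)); // 5500
    }
}
```

So the wait is not a single Redis command timeout: it already includes the full retry budget.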

Root cause analysis

The future passed to syncSubscription comes from the RedissonLock.subscribe(long threadId) method:

protected RFuture<RedissonLockEntry> subscribe(long threadId) {
    return PUBSUB.subscribe(getEntryName(), getChannelName(), commandExecutor.getConnectionManager());
}

From this we can see that it obtains a subscription from PUBSUB. Reading further into the source:

public RFuture<E> subscribe(final String entryName, final String channelName, final ConnectionManager connectionManager) {
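    // holds the listener so that it can be removed later
    final AtomicReference<Runnable> listenerHolder = new AtomicReference<Runnable>();
    // get the semaphore that queues lock subscriptions for this channel
    final AsyncSemaphore semaphore = connectionManager.getSemaphore(channelName);
    // promise whose cancel() removes the queued listener (subscription cancellation)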
    final RPromise<E> newPromise = new PromiseDelegator<E>(connectionManager.<E>newPromise()) {
        @Override
        public boolean cancel(boolean mayInterruptIfRunning) {
            return semaphore.remove(listenerHolder.get());
        }
    };

    Runnable listener = new Runnable() {

        @Override
        public void run() {
            // check whether an entry with the same name already exists
            E entry = entries.get(entryName);
            if (entry != null) {
                entry.aquire();
                semaphore.release();
                entry.getPromise().addListener(new TransferListener<E>(newPromise));
                return;
            }
            // otherwise create a new one
            E value = createEntry(newPromise);
            value.aquire();
            
            E oldValue = entries.putIfAbsent(entryName, value);
            if (oldValue != null) {
                oldValue.aquire();
                semaphore.release();
                oldValue.getPromise().addListener(new TransferListener<E>(newPromise));
                return;
            }
            // create the listener for this entry
            RedisPubSubListener<Object> listener = createListener(channelName, value);
            // subscribe to the channel
            connectionManager.subscribe(LongCodec.INSTANCE, channelName, listener, semaphore);
        }
    };
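    // the semaphore manages the queue of listeners, because several threads may be waiting for the same lock
    semaphore.acquire(listener);
    // store the listener so that cancel() can remove it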
    listenerHolder.set(listener);
    
    return newPromise;
}
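
The semaphore.acquire(listener) call in the method above does not block the calling thread: the Runnable either runs immediately or is queued until someone calls release(). The following is a simplified sketch of that idea. It is not Redisson's actual AsyncSemaphore; the class name SimpleAsyncSemaphore and the demo() helper are hypothetical and exist only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class SimpleAsyncSemaphore {
    private int permits;
    private final Queue<Runnable> waiting = new ArrayDeque<>();

    SimpleAsyncSemaphore(int permits) {
        this.permits = permits;
    }

    // Runs the listener immediately if a permit is free, otherwise queues it.
    void acquire(Runnable listener) {
        synchronized (this) {
            if (permits == 0) {
                waiting.add(listener);
                return;
            }
            permits--;
        }
        listener.run();
    }

    // Returns a permit and offers it to the next queued listener, if any.
    void release() {
        Runnable next;
        synchronized (this) {
            permits++;
            next = waiting.poll();
        }
        if (next != null) {
            acquire(next);
        }
    }

    // Drops a listener that is still queued (what a cancel() would rely on).
    synchronized boolean remove(Runnable listener) {
        return waiting.remove(listener);
    }

    // Hypothetical demo: the second listener only runs after release().
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        SimpleAsyncSemaphore semaphore = new SimpleAsyncSemaphore(1);
        semaphore.acquire(() -> events.add("first runs"));
        semaphore.acquire(() -> events.add("second runs")); // queued, no permit left
        events.add("release");
        semaphore.release();
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [first runs, release, second runs]
    }
}
```

The important property for this analysis: if release() is never called, every queued listener waits forever, and the only thing the caller sees is its own timeout.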

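As you can see, this method only defines a Runnable named listener. semaphore.acquire(listener) ensures that only one thread at a time subscribes on a given channel, while the others keep waiting. The actual subscription logic lives in connectionManager.subscribe: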

private void subscribe(final Codec codec, final String channelName, final RedisPubSubListener<?> listener, 
        final RPromise<PubSubConnectionEntry> promise, final PubSubType type, final AsyncSemaphore lock) {
    final PubSubConnectionEntry connEntry = name2PubSubConnection.get(channelName);
    if (connEntry != null) {
        connEntry.addListener(channelName, listener);
        connEntry.getSubscribeFuture(channelName, type).addListener(new FutureListener<Void>() {
            @Override
            public void operationComplete(Future<Void> future) throws Exception {
                lock.release();
                promise.trySuccess(connEntry);
            }
        });
        return;
    }

    freePubSubLock.acquire(new Runnable() {

        @Override
        public void run() {
            if (promise.isDone()) {
                return;
            }
            // if no shared (free) connection is available, open a new one via connect() and return
            final PubSubConnectionEntry freeEntry = freePubSubConnections.peek();
            if (freeEntry == null) {
                connect(codec, channelName, listener, promise, type, lock);
                return;
            }
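            // each entry has a counter initialized from subscriptionsPerConnection;
            // -1 is treated as an error, because entries whose counter reaches 0 are removed from the pool below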
            int remainFreeAmount = freeEntry.tryAcquire();
            if (remainFreeAmount == -1) {
                throw new IllegalStateException();
            }
            
            final PubSubConnectionEntry oldEntry = name2PubSubConnection.putIfAbsent(channelName, freeEntry);
            if (oldEntry != null) {
                freeEntry.release();
                freePubSubLock.release();
                
                oldEntry.addListener(channelName, listener);
                oldEntry.getSubscribeFuture(channelName, type).addListener(new FutureListener<Void>() {
                    @Override
                    public void operationComplete(Future<Void> future) throws Exception {
                        lock.release();
                        promise.trySuccess(oldEntry);
                    }
                });
                return;
            }
            // when the remaining subscriptionsPerConnection count reaches 0, remove the entry from the shared pool
            if (remainFreeAmount == 0) {
                freePubSubConnections.poll();
            }
            freePubSubLock.release();
            
            freeEntry.addListener(channelName, listener);
            freeEntry.getSubscribeFuture(channelName, type).addListener(new FutureListener<Void>() {
                @Override
                public void operationComplete(Future<Void> future) throws Exception {
                    lock.release();
                    promise.trySuccess(freeEntry);
                }
            });
            
            if (PubSubType.PSUBSCRIBE == type) {
                freeEntry.psubscribe(codec, channelName);
            } else {
                freeEntry.subscribe(codec, channelName);
            }
        }
        
    });
}
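
To make the -1 and 0 checks in the method above easier to follow, here is a minimal sketch of a per-connection slot counter with the same return convention: tryAcquire() returns the number of slots left after taking one, or -1 if none were free. This is an illustration under that assumption, not the source of PubSubConnectionEntry; the class name ChannelSlotCounter is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ChannelSlotCounter {
    private final AtomicInteger remaining;

    ChannelSlotCounter(int subscriptionsPerConnection) {
        this.remaining = new AtomicInteger(subscriptionsPerConnection);
    }

    // Takes one slot; returns the slots left afterwards, or -1 if none were free.
    int tryAcquire() {
        while (true) {
            int value = remaining.get();
            if (value == 0) {
                return -1;
            }
            if (remaining.compareAndSet(value, value - 1)) {
                return value - 1;
            }
        }
    }

    // Gives one slot back.
    void release() {
        remaining.incrementAndGet();
    }

    public static void main(String[] args) {
        ChannelSlotCounter counter = new ChannelSlotCounter(2);
        System.out.println(counter.tryAcquire()); // 1
        System.out.println(counter.tryAcquire()); // 0  -> entry is full, remove it from the free pool
        System.out.println(counter.tryAcquire()); // -1 -> should not happen for an entry still in the pool
    }
}
```

With this convention, a result of 0 means "this call took the last slot", which is why the entry is polled from freePubSubConnections at that point.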

On the first call, freePubSubConnections is still empty, so subscribe always goes into connect(codec, channelName, listener, promise, type, lock):

private void connect(final Codec codec, final String channelName, final RedisPubSubListener<?> listener,
        final RPromise<PubSubConnectionEntry> promise, final PubSubType type, final AsyncSemaphore lock) {
    final int slot = calcSlot(channelName);
    // acquire the next pub/sub connection, bounded by subscriptionConnectionPoolSize
    RFuture<RedisPubSubConnection> connFuture = nextPubSubConnection(slot);
    connFuture.addListener(new FutureListener<RedisPubSubConnection>() {

        @Override
        public void operationComplete(Future<RedisPubSubConnection> future) throws Exception {
            if (!future.isSuccess()) {
                freePubSubLock.release();
                lock.release();
                promise.tryFailure(future.cause());
                return;
            }

            RedisPubSubConnection conn = future.getNow();
            
            final PubSubConnectionEntry entry = new PubSubConnectionEntry(conn, config.getSubscriptionsPerConnection());
            entry.tryAcquire();
            
            final PubSubConnectionEntry oldEntry = name2PubSubConnection.putIfAbsent(channelName, entry);
            if (oldEntry != null) {
                releaseSubscribeConnection(slot, entry);
                
                freePubSubLock.release();
                
                oldEntry.addListener(channelName, listener);
                oldEntry.getSubscribeFuture(channelName, type).addListener(new FutureListener<Void>() {
                    @Override
                    public void operationComplete(Future<Void> future) throws Exception {
                        lock.release();
                        promise.trySuccess(oldEntry);
                    }
                });
                return;
            }
            
            freePubSubConnections.add(entry);
            freePubSubLock.release();
            
            entry.addListener(channelName, listener);
            entry.getSubscribeFuture(channelName, type).addListener(new FutureListener<Void>() {
                @Override
                public void operationComplete(Future<Void> future) throws Exception {
                    lock.release();
                    promise.trySuccess(entry);
                }
            });
            
            if (PubSubType.PSUBSCRIBE == type) {
                entry.psubscribe(codec, channelName);
            } else {
                entry.subscribe(codec, channelName);
            }
            
        }
    });
}

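Here, RFuture<RedisPubSubConnection> connFuture = nextPubSubConnection(slot); eventually calls freeSubscribeConnectionsCounter.acquire(runnable) in ClientConnectionsEntry#acquireSubscribeConnection.

This is where we find the cause. Once the number of subscriptions concurrently waiting for locks reaches subscriptionConnectionPoolSize * subscriptionsPerConnection, one more subscription can never obtain a connection, so freePubSubLock in MasterSlaveConnectionManager is never released.

In addition, in the timeout scenario MasterSlaveConnectionManager caches the connection it obtained from the pool and does not return the pub/sub connection to the pool. freeSubscribeConnectionsCounter therefore keeps waiting, and the result is a deadlock.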

The visible symptom is org.redisson.client.RedisTimeoutException: Subscribe timeout: (7500ms).
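
The capacity argument can be illustrated with a toy model. The sketch below treats the subscription capacity as subscriptionConnectionPoolSize * subscriptionsPerConnection slots that are taken and never given back while the lock is being waited for, and counts how many subscriptions get a slot before a timeout. It only models the explanation given here; it does not use Redisson, and the class name SubscriptionCapacityDemo, the method countSubscribed and the 50 ms timeout are all hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class SubscriptionCapacityDemo {

    // Returns how many of `waiters` subscriptions obtain a slot within the timeout.
    static int countSubscribed(int poolSize, int perConnection, int waiters, long timeoutMs) {
        // Total number of subscription slots in this toy model
        Semaphore slots = new Semaphore(poolSize * perConnection);
        int subscribed = 0;
        for (int i = 0; i < waiters; i++) {
            try {
                // A slot that is taken is never released in this model,
                // mirroring a connection that is not returned to the pool.
                if (slots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                    subscribed++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return subscribed;
    }

    public static void main(String[] args) {
        // Pool size 1, one subscription per connection, three waiting subscriptions
        System.out.println(countSubscribed(1, 1, 3, 50)); // 1
    }
}
```

With both settings at 1, only one subscription succeeds and every further one runs into the timeout, which is the situation the reproduction below provokes on purpose.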

Reproduction

Redis configuration

public RedissonClient redissonClient(RedisConfig redisConfig) {

    Config config = new Config();
    config.useSingleServer()
        .setAddress(redisConfig.getHost() + ":" + redisConfig.getPort())
        .setPassword(redisConfig.getPassword())
        .setDatabase(redisConfig.getDatabase())
        .setConnectTimeout(redisConfig.getConnectionTimeout())
        .setTimeout(redisConfig.getTimeout())
        // set both options to 1
        .setSubscriptionConnectionPoolSize(1)
        .setSubscriptionsPerConnection(1);
    return Redisson.create(config);
}

Test method

void contextLoads() throws InterruptedException {
    Runnable runnable = () -> {
        redissonLock.tryRedissonLock();
    };
    new Thread(runnable, "thread-1").start();
    new Thread(runnable, "thread-12").start();
    new Thread(runnable, "thread-23").start();
    new Thread(runnable, "thread-21").start();
    
    Thread.sleep(200000);
}

Result

org.redisson.client.RedisTimeoutException: Subscribe timeout: (5500ms)
	at org.redisson.command.CommandAsyncService.syncSubscription(CommandAsyncService.java:126) ~[redisson-2.8.2.jar:na]
	at org.redisson.RedissonLock.lockInterruptibly(RedissonLock.java:121) ~[redisson-2.8.2.jar:na]
	at org.redisson.RedissonLock.lockInterruptibly(RedissonLock.java:108) ~[redisson-2.8.2.jar:na]
	at org.redisson.RedissonLock.lock(RedissonLock.java:90) ~[redisson-2.8.2.jar:na]
	at com.rick.redislock.lock.RedissonLock.registerPersonalMember(RedissonLock.java:30) ~[classes/:na]
	at com.rick.redislock.RedisLockApplicationTests.lambda$contextLoads$0(RedisLockApplicationTests.java:15) [test-classes/:na]
	at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_144]

This matches the expectation.
