How Redisson Implements Distributed Locks

Acquiring the lock

`lock.tryLock()`

Start with the no-argument tryLock() method.

org.redisson.RedissonLock
    
    @Override
    public boolean tryLock() {
        return get(tryLockAsync());
    }

    @Override
    public RFuture<Boolean> tryLockAsync() {
        // Pass in the id of the thread that is requesting the lock
        return tryLockAsync(Thread.currentThread().getId());
    }

    @Override
    public RFuture<Boolean> tryLockAsync(long threadId) {
        return tryAcquireOnceAsync(-1, -1, null, threadId);
    }

    private RFuture<Boolean> tryAcquireOnceAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId) {
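        // waitTime is the maximum time to wait for the lock; leaseTime is how long the lock lives before it expires
        // For the no-argument tryLock() leaseTime is always -1, so this branch is skipped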
        if (leaseTime != -1) {
            return tryLockInnerAsync(waitTime, leaseTime, unit, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
        }
        //commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout()
        //private long lockWatchdogTimeout = 30 * 1000; this is the default watchdog timeout (30 s)
        RFuture<Boolean> ttlRemainingFuture = tryLockInnerAsync(waitTime,
                                                    commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(),
                                                    TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
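        // Callback that runs once the lock attempt has completed
        ttlRemainingFuture.onComplete((ttlRemaining, e) -> {
            // The attempt failed with an exception: nothing to renew
            if (e != null) {
                return;
            }
            // Lock acquired: schedule the lease renewal, i.e. the watchdog logic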
            // lock acquired
            if (ttlRemaining) {
                scheduleExpirationRenewal(threadId);
            }
        });
        return ttlRemainingFuture;
    }

This code drives the lock acquisition; the reentrancy logic itself lives in the Lua script.

    <T> RFuture<T> tryLockInnerAsync(long waitTime, long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
        // Store the lease time in RedissonLock's internalLockLeaseTime field; the watchdog later uses it to renew the lock
        internalLockLeaseTime = unit.toMillis(leaseTime);
        // Acquire the lock atomically with a Lua script
        return evalWriteAsync(getName(), LongCodec.INSTANCE, command,
                "if (redis.call('exists', KEYS[1]) == 0) then " +
                        "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                        "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                        "return nil; " +
                        "end; " +
                        "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                        "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                        "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                        "return nil; " +
                        "end; " +
                        "return redis.call('pttl', KEYS[1]);",
                Collections.singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
    }

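Lua script parameters:

KEYS[1] is getName(), the name of the lock key.

ARGV[1] is internalLockLeaseTime, the lease time in milliseconds; it defaults to 30 s when no leaseTime is passed.

ARGV[2] is getLockName(threadId), which identifies the calling thread. It has the form UUID:ThreadId (the id held by the lock object plus the thread id), so that threads on different servers can be told apart.

The UUID identifies a client, because several threads on several clients may compete for the same lock.

Together, UUID:ThreadId says which thread on which client is asking for the lock, and this combination uniquely identifies a thread.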
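To see why the combination is unique, here is a minimal sketch that builds names in the same UUID:ThreadId form. `LockNameSketch`, `clientId` and `lockName` are hypothetical names made up for this example, not Redisson's; the only thing taken from the text above is the name format.

```java
import java.util.UUID;

// Hypothetical helper, not Redisson code: models one client instance that owns a UUID.
class LockNameSketch {
    private final String clientId = UUID.randomUUID().toString();

    // Builds the hash field name in the UUID:ThreadId form described above.
    String lockName(long threadId) {
        return clientId + ":" + threadId;
    }

    public static void main(String[] args) {
        LockNameSketch clientA = new LockNameSketch();
        LockNameSketch clientB = new LockNameSketch();
        long threadId = Thread.currentThread().getId();

        // The same thread id on two clients still yields two different lock names.
        System.out.println(clientA.lockName(threadId));
        System.out.println(clientB.lockName(threadId));
    }
}
```

Thread ids are only unique inside one JVM, which is why the thread id alone would not be enough as a hash field.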

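Lua script logic:

If the lock key does not exist:

Create a hash under the key and add a field named after the current thread (UUID:ThreadId) with the value 1, meaning that this thread has entered the lock once.

Set an expiration time on the key so that the lock cannot be held forever if the owning server crashes, then return nil and finish the script.

If the lock key exists, check whether the current thread holds it:

If it does, hincrby increments the thread's reentrancy count, the expiration time is reset, and the script returns nil.

If another thread holds it, pttl returns the remaining time to live of the lock in milliseconds.

Overall, the locking logic is simple:

The hash stored under the key records which thread on which client holds the lock, and the key is given an expiration time (30 s by default).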
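The branches of the script are easier to follow when replayed in plain Java. The sketch below is an in-memory model of a single lock key, not Redisson code: `LockScriptModel` and its members are invented for this example, and the current time is passed in explicitly so that the result does not depend on a real clock.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of the lock script for a single key; not Redisson code.
class LockScriptModel {
    private final Map<String, Integer> holders = new HashMap<>(); // the hash: field -> reentrancy count
    private long expireAt = -1;                                   // absolute expiry in ms, -1 = key absent

    // Mirrors the script: returns null when acquired, otherwise the remaining TTL in ms.
    Long tryLock(String lockName, long leaseMillis, long now) {
        if (expireAt != -1 && now >= expireAt) { // emulate Redis expiring the key
            holders.clear();
            expireAt = -1;
        }
        if (holders.isEmpty() || holders.containsKey(lockName)) {
            holders.merge(lockName, 1, Integer::sum); // hincrby KEYS[1] ARGV[2] 1
            expireAt = now + leaseMillis;             // pexpire KEYS[1] ARGV[1]
            return null;
        }
        return expireAt - now;                        // pttl KEYS[1]
    }

    int holdCount(String lockName) {
        return holders.getOrDefault(lockName, 0);
    }

    public static void main(String[] args) {
        LockScriptModel lock = new LockScriptModel();
        System.out.println(lock.tryLock("uuid-a:1", 30_000, 0));      // null: acquired
        System.out.println(lock.tryLock("uuid-a:1", 30_000, 1_000));  // null: re-entered, expiry reset
        System.out.println(lock.tryLock("uuid-b:1", 30_000, 5_000));  // 26000: held by another thread
        System.out.println(lock.tryLock("uuid-b:1", 30_000, 31_000)); // null: previous lease expired
    }
}
```

In Redis the whole script runs atomically, which is what makes the check-then-set safe; the model gets away without synchronization only because it is used from a single thread.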

Next, the renewal logic:

    private void scheduleExpirationRenewal(long threadId) {
        ExpirationEntry entry = new ExpirationEntry();
        ExpirationEntry oldEntry = EXPIRATION_RENEWAL_MAP.putIfAbsent(getEntryName(), entry);
        if (oldEntry != null) {
            oldEntry.addThreadId(threadId);
        } else {
            entry.addThreadId(threadId);
            renewExpiration();
        }
    }

    private void renewExpiration() {
        ExpirationEntry ee = EXPIRATION_RENEWAL_MAP.get(getEntryName());
        if (ee == null) {
            return;
        }
        
        // Create a delayed task
        Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
            @Override
            public void run(Timeout timeout) throws Exception {
                ExpirationEntry ent = EXPIRATION_RENEWAL_MAP.get(getEntryName());
                if (ent == null) {
                    return;
                }
                Long threadId = ent.getFirstThreadId();
                if (threadId == null) {
                    return;
                }
                // Reset the expiration time of the lock
                RFuture<Boolean> future = renewExpirationAsync(threadId);
                future.onComplete((res, e) -> {
                    if (e != null) {
                        log.error("Can't update lock " + getName() + " expiration", e);
                        return;
                    }
                    
                    if (res) {
                        // reschedule itself
                        renewExpiration();
                    }
                });
            }
            // internalLockLeaseTime is the lease time stored earlier (30 s by default), so the task fires every 10 s
        }, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);
        
        ee.setTimeout(task);
    }

    protected RFuture<Boolean> renewExpirationAsync(long threadId) {
        return evalWriteAsync(getName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
                "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                        "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                        "return 1; " +
                        "end; " +
                        "return 0;",
                Collections.singletonList(getName()),
                internalLockLeaseTime, getLockName(threadId));
    }

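The Lua script here is simple:

It checks whether the hash under the key still contains the field for thread UUID:ThreadId, that is, whether that thread still holds the lock. If so, it resets the expiration time to internalLockLeaseTime (30 s by default), which keeps the lock alive.

Lock renewal (the watchdog mechanism) means that each time a lock is acquired without an explicit leaseTime, a delayed timer task is scheduled. Every 10 s it checks whether the thread still holds the lock and, if so, extends the key's expiration back to 30 s.
The 10 s interval is internalLockLeaseTime / 3, and internalLockLeaseTime comes from the lockWatchdogTimeout setting, which defaults to 30 * 1000 ms.
So once the remaining time to live of the key has dropped to about 20 s, the timer task resets it to 30 s. As long as the timer task keeps running and the thread still holds the lock, the lock stays held.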
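The arithmetic behind the 10 s and 20 s figures can be checked with a small deterministic model. This is only a sketch under the assumption that every renewal fires exactly on schedule; `WatchdogTimingSketch` and `ttlBeforeEachRenewal` are names invented here, and a virtual clock replaces the real timer.

```java
import java.util.ArrayList;
import java.util.List;

// Deterministic model of the renewal schedule; uses a virtual clock instead of real timers.
class WatchdogTimingSketch {

    // Returns the remaining TTL (ms) observed just before each of the first `renewals` renewals.
    static List<Long> ttlBeforeEachRenewal(long leaseMillis, int renewals) {
        long interval = leaseMillis / 3;   // internalLockLeaseTime / 3
        long now = 0;
        long expireAt = now + leaseMillis; // set when the lock is acquired
        List<Long> observed = new ArrayList<>();
        for (int i = 0; i < renewals; i++) {
            now += interval;               // the timer task fires
            observed.add(expireAt - now);  // what pttl would report at this moment
            expireAt = now + leaseMillis;  // pexpire KEYS[1] ARGV[1]
        }
        return observed;
    }

    public static void main(String[] args) {
        // With the default 30 s lease the task fires every 10 s and always finds 20 s left.
        System.out.println(ttlBeforeEachRenewal(30_000, 3)); // [20000, 20000, 20000]
    }
}
```

The two thirds of the lease that are still left at each renewal are the safety margin: a renewal can be delayed by up to that long before the key expires.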

`lock.tryLock(waitTime, leaseTime, TimeUnit)`

    @Override
    public boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException {
        long time = unit.toMillis(waitTime);
        long current = System.currentTimeMillis();
        long threadId = Thread.currentThread().getId();
        // Try to acquire the lock: returns null on success, otherwise the remaining TTL of the lock
        Long ttl = tryAcquire(waitTime, leaseTime, unit, threadId);
        // lock acquired
        if (ttl == null) {
            return true;
        }
        // Subtract the time already spent from the wait budget
        time -= System.currentTimeMillis() - current;
        if (time <= 0) {
            // The wait budget is used up: report that the lock was not acquired
            acquireFailed(waitTime, unit, threadId);
            return false;
        }
        
        current = System.currentTimeMillis();
        RFuture<RedissonLockEntry> subscribeFuture = subscribe(threadId);
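        // subscribe() listens on the channel that the unlock Lua script publishes to
        // If the subscription is not established within the remaining wait time, cancel it and report failure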
        if (!subscribeFuture.await(time, TimeUnit.MILLISECONDS)) {
            if (!subscribeFuture.cancel(false)) {
                subscribeFuture.onComplete((res, e) -> {
                    if (e == null) {
                        unsubscribe(subscribeFuture, threadId);
                    }
                });
            }
            acquireFailed(waitTime, unit, threadId);
            return false;
        }

        try {
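            // Subtract the time spent subscribing from the wait budget
            time -= System.currentTimeMillis() - current;
            if (time <= 0) {
                // The wait budget is used up: report that the lock was not acquired
                acquireFailed(waitTime, unit, threadId);
                return false;
            }
        
            // Retry loop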
            while (true) {
                long currentTime = System.currentTimeMillis();
                ttl = tryAcquire(waitTime, leaseTime, unit, threadId);
                // lock acquired
                if (ttl == null) {
                    return true;
                }

                time -= System.currentTimeMillis() - currentTime;
                if (time <= 0) {
                    acquireFailed(waitTime, unit, threadId);
                    return false;
                }

                // waiting for message
                currentTime = System.currentTimeMillis();
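                // Wait for the smaller of the lock's remaining TTL (ttl) and the remaining wait budget (time)
                if (ttl >= 0 && ttl < time) {
                    // The latch is a semaphore: the unlock notification releases it, which wakes this thread to retry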
                    subscribeFuture.getNow().getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
                } else {
                    subscribeFuture.getNow().getLatch().tryAcquire(time, TimeUnit.MILLISECONDS);
                }

                time -= System.currentTimeMillis() - currentTime;
                // Re-check the wait budget; if time > 0 the loop continues
                if (time <= 0) {
                    acquireFailed(waitTime, unit, threadId);
                    return false;
                }
            }
        } finally {
            unsubscribe(subscribeFuture, threadId);
        }
//        return get(tryLockAsync(waitTime, leaseTime, unit));
    }

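The clever part of this design is the combination of publish/subscribe and a semaphore. A waiting thread neither spins in a busy loop nor blocks indefinitely: it sleeps until it is notified that the lock was released, or until the lock's TTL or its own wait budget runs out, and only then retries. This keeps CPU usage low.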
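The wait step can be sketched with a plain java.util.concurrent.Semaphore. This is an illustration only: `UnlockLatchSketch`, `waitMillis` and `awaitUnlock` are hypothetical names, and a helper thread stands in for the pub/sub callback that would release the latch when the unlock message arrives.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Sketch of the wait step only; the latch stands in for the semaphore held by the lock entry.
class UnlockLatchSketch {

    // Picks how long to sleep: the lock's TTL if it is known and shorter, else the remaining budget.
    static long waitMillis(long ttl, long remaining) {
        return (ttl >= 0 && ttl < remaining) ? ttl : remaining;
    }

    // Blocks until the latch is released or the chosen timeout elapses.
    static boolean awaitUnlock(Semaphore latch, long ttl, long remaining) {
        try {
            return latch.tryAcquire(waitMillis(ttl, remaining), TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Semaphore latch = new Semaphore(0);

        // Simulates the subscriber callback that runs when the unlock message arrives.
        Thread notifier = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            latch.release();
        });
        notifier.start();

        boolean notified = awaitUnlock(latch, 30_000, 5_000);
        System.out.println("woken by notification: " + notified);
        notifier.join();
    }
}
```

Waiting no longer than the TTL matters because a lock that simply expires never publishes an unlock message; the timeout is what lets the waiter retry in that case.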

`lock.lock()`

Now the no-argument lock() method:

    @Override
    public void lock() {
        try {
            // leaseTime is -1 (no explicit lease time)
            lock(-1, null, false);
        } catch (InterruptedException e) {
            throw new IllegalStateException();
        }
    }

    private void lock(long leaseTime, TimeUnit unit, boolean interruptibly) throws InterruptedException {
        long threadId = Thread.currentThread().getId();
        // Try to acquire the lock: returns null on success, otherwise the remaining TTL of the lock
        Long ttl = tryAcquire(-1, leaseTime, unit, threadId);
        // lock acquired
        if (ttl == null) {
            return;
        }

        // Subscribe to the unlock notification
        RFuture<RedissonLockEntry> future = subscribe(threadId);
        if (interruptibly) {
            commandExecutor.syncSubscriptionInterrupted(future);
        } else {
            commandExecutor.syncSubscription(future);
        }

        try {
            // Retry loop
            while (true) {
                // Keep trying; the loop only exits once the lock has been acquired
                ttl = tryAcquire(-1, leaseTime, unit, threadId);
                // lock acquired
                if (ttl == null) {
                    break;
                }

                // waiting for message
                if (ttl >= 0) {
                    try {
                        future.getNow().getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
                    } catch (InterruptedException e) {
                        if (interruptibly) {
                            throw e;
                        }
                        future.getNow().getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
                    }
                } else {
                    if (interruptibly) {
                        future.getNow().getLatch().acquire();
                    } else {
                        future.getNow().getLatch().acquireUninterruptibly();
                    }
                }
            }
        } finally {
            unsubscribe(future, threadId);
        }
//        get(lockAsync(leaseTime, unit));
    }

`lock.lock(leaseTime,TimeUnit)`

Compared with the no-argument lock() method, this overload takes a lease time for the lock, and the watchdog mechanism is not started.
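The decision between the two modes mirrors the leaseTime != -1 branch in tryAcquireOnceAsync shown earlier. A minimal sketch of that decision, with `LeaseDecisionSketch` and its method names invented for illustration and the 30 s default hard-coded as an assumption:

```java
import java.util.concurrent.TimeUnit;

// Sketch of the leaseTime branch; everything except the leaseTime parameter is illustrative.
class LeaseDecisionSketch {
    static final long WATCHDOG_TIMEOUT_MILLIS = 30_000; // assumed default lockWatchdogTimeout

    // Lease that ends up as ARGV[1] of the lock script.
    static long effectiveLeaseMillis(long leaseTime, TimeUnit unit) {
        return leaseTime != -1 ? unit.toMillis(leaseTime) : WATCHDOG_TIMEOUT_MILLIS;
    }

    // Renewal is only scheduled when the caller did not choose a lease time.
    static boolean watchdogScheduled(long leaseTime) {
        return leaseTime == -1;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLeaseMillis(-1, null));            // 30000
        System.out.println(watchdogScheduled(-1));                     // true
        System.out.println(effectiveLeaseMillis(5, TimeUnit.SECONDS)); // 5000
        System.out.println(watchdogScheduled(5));                      // false
    }
}
```

With an explicit lease time the lock expires when that time is up, even if the holder is still working, so the value has to cover the longest expected critical section.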

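Summary

tryLock() can fail to acquire the lock and returns false in that case; lock() blocks until the lock is acquired.

When no lease time (leaseTime) is passed, the watchdog mechanism is enabled and resets the expiration time at a fixed interval (every 10 s by default).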
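For callers, the difference between the two methods shows up in how the result is handled. Both tryLock() and lock() are declared on java.util.concurrent.locks.Lock, which Redisson's RLock extends, so the sketch below is written against that interface and uses a ReentrantLock as a stand-in so that it runs without a Redis server. `LockUsageSketch`, `runIfFree` and `runWhenFree` are hypothetical helper names.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Calling pattern only. ReentrantLock stands in for a Redisson lock so the sketch runs without Redis.
class LockUsageSketch {

    // tryLock(): may fail, so the caller must handle the false case.
    static boolean runIfFree(Lock lock, Runnable task) {
        if (!lock.tryLock()) {
            return false; // somebody else holds the lock
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock(); // always release in finally
        }
    }

    // lock(): blocks until the lock is acquired, so the task always runs.
    static void runWhenFree(Lock lock, Runnable task) {
        lock.lock();
        try {
            task.run();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Lock lock = new ReentrantLock();
        System.out.println(runIfFree(lock, () -> System.out.println("working"))); // working, then true

        // Hold the lock on the main thread and try from another thread: tryLock() fails there.
        lock.lock();
        try {
            Thread other = new Thread(() -> System.out.println(runIfFree(lock, () -> { }))); // false
            other.start();
            other.join();
        } finally {
            lock.unlock();
        }
    }
}
```

Releasing in a finally block matters even more with the watchdog enabled, because a lock that is never unlocked keeps being renewed for as long as the client process is alive.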

Releasing the lock

`lock.unlock()`

    @Override
    public void unlock() {
        try {
            get(unlockAsync(Thread.currentThread().getId()));
        } catch (RedisException e) {
            if (e.getCause() instanceof IllegalMonitorStateException) {
                throw (IllegalMonitorStateException) e.getCause();
            } else {
                throw e;
            }
        }
        
    }

    @Override
    public RFuture<Void> unlockAsync(long threadId) {
        RPromise<Void> result = new RedissonPromise<Void>();
        // Core method that releases the lock
        RFuture<Boolean> future = unlockInnerAsync(threadId);

        future.onComplete((opStatus, e) -> {
            // The unlock script has completed (successfully or not): cancel the watchdog renewal
            cancelExpirationRenewal(threadId);

            if (e != null) {
                result.tryFailure(e);
                return;
            }

            if (opStatus == null) {
                IllegalMonitorStateException cause = new IllegalMonitorStateException("attempt to unlock lock, not locked by current thread by node id: "
                        + id + " thread-id: " + threadId);
                result.tryFailure(cause);
                return;
            }

            result.trySuccess(null);
        });

        return result;
    }

    protected RFuture<Boolean> unlockInnerAsync(long threadId) {
        return evalWriteAsync(getName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
                "if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then " +
                        "return nil;" +
                        "end; " +
                        "local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1); " +
                        "if (counter > 0) then " +
                        "redis.call('pexpire', KEYS[1], ARGV[2]); " +
                        "return 0; " +
                        "else " +
                        "redis.call('del', KEYS[1]); " +
                        "redis.call('publish', KEYS[2], ARGV[1]); " +
                        "return 1; " +
                        "end; " +
                        "return nil;",
                Arrays.asList(getName(), getChannelName()), LockPubSub.UNLOCK_MESSAGE, internalLockLeaseTime, getLockName(threadId));
    }

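Like acquiring the lock, releasing it is done with a Lua script.

Lua script parameters:

KEYS[1] is getName(), the name of the lock.
KEYS[2] is getChannelName(), the channel used for publish/subscribe.
ARGV[1] is LockPubSub.UNLOCK_MESSAGE, the unlock message; its actual value is the number 0.
ARGV[2] is internalLockLeaseTime, the lease time (30 s by default).
ARGV[3] is getLockName(threadId), the UUID:ThreadId string (the id held by the lock object plus the thread id) that identifies the calling thread and distinguishes threads on different servers.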

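Lua script logic:

If the hash under the lock key has no field for the current thread:

The lock may have expired, it may already have been released by a concurrent unlock, or it may be held by another thread.

The script returns nil without publishing anything, and unlockAsync turns that result into an IllegalMonitorStateException.

If the current thread does hold the lock:

hincrby decrements the thread's reentrancy count by 1 and the result is stored in the local variable counter.

If counter is still greater than 0, the thread has outer lock() calls that have not been unlocked yet; the expiration time is reset and the script returns 0.

Otherwise the thread has fully released the lock: del removes the key, publish sends the unlock message on the channel, and the script returns 1.

The final return nil is never reached, because both branches above return.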
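The three possible results of the unlock script (nil, 0 and 1) can be replayed with a small in-memory model. As before, this is a sketch rather than Redisson code: `UnlockScriptModel` and its members are invented for the example, expiry is left out, and a counter stands in for the publish call.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of the unlock script for a single key; not Redisson code.
class UnlockScriptModel {
    private final Map<String, Integer> holders = new HashMap<>(); // the hash: field -> reentrancy count
    int published = 0;                                            // number of unlock messages sent

    // Simplified acquire used only to set up state for the example.
    void lock(String lockName) {
        holders.merge(lockName, 1, Integer::sum);
    }

    // Mirrors the script: null = not the holder, false = still held (0), true = released (1).
    Boolean unlock(String lockName) {
        if (!holders.containsKey(lockName)) {                    // hexists KEYS[1] ARGV[3] == 0
            return null;
        }
        int counter = holders.merge(lockName, -1, Integer::sum); // hincrby KEYS[1] ARGV[3] -1
        if (counter > 0) {
            return false;                                        // the real script also refreshes pexpire here
        }
        holders.clear();                                         // del KEYS[1]
        published++;                                             // publish KEYS[2] ARGV[1]
        return true;
    }

    public static void main(String[] args) {
        UnlockScriptModel lock = new UnlockScriptModel();
        lock.lock("uuid-a:1");
        lock.lock("uuid-a:1");                       // re-entered once
        System.out.println(lock.unlock("uuid-b:1")); // null: another thread cannot unlock
        System.out.println(lock.unlock("uuid-a:1")); // false: one hold remains
        System.out.println(lock.unlock("uuid-a:1")); // true: key deleted, waiters notified
        System.out.println(lock.unlock("uuid-a:1")); // null: nothing left to unlock
    }
}
```

The null result is the case that unlockAsync reports as an IllegalMonitorStateException, which is why calling unlock() after the lease has already expired fails.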
