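Outline
1. Comparing leaky bucket implementations
(1) A straightforward leaky bucket implementation
(2) A thread-saving leaky bucket implementation
(3) The leaky bucket implementation in Sentinel
(4) How Sentinel's leaky bucket differs from an ordinary leaky bucket
(5) Problems with Sentinel's leaky bucket
2. Comparing token bucket implementations
(1) A straightforward token bucket implementation
(2) A thread-saving token bucket implementation
(3) The token bucket implementation in Guava
(4) The token bucket implementation in Sentinel
(5) Summary of Sentinel's token bucket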
III. Initialization of SmoothWarmingUp
@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
...
//Creates a RateLimiter with the specified stable throughput,
//given as "permits per second" (commonly referred to as QPS, queries per second),
//and a warmup period, during which the RateLimiter smoothly ramps up its rate,
//until it reaches its maximum rate at the end of the period (as long as there are enough requests to saturate it).
//Similarly, if the RateLimiter is left unused for a duration of warmupPeriod,
//it will gradually return to its "cold" state,
//i.e. it will go through the same warming up process as when it was first created.
//The returned RateLimiter is intended for cases where the resource that actually fulfills the requests (e.g., a remote server) needs "warmup" time,
//rather than being immediately accessed at the stable (maximum) rate.
//The returned RateLimiter starts in a "cold" state (i.e. the warmup period will follow),
//and if it is left unused for long enough, it will return to that state.
//创建一个具有指定稳定吞吐量的RateLimiter,
//入参为:"每秒多少令牌"(通常称为QPS,每秒的查询量),以及平稳增加RateLimiter速率的预热期,
//直到RateLimiter在该预热周期结束时达到最大速率(只要有足够的请求使其饱和);
//类似地,如果RateLimiter在预热时段的持续时间内未被使用,它将逐渐返回到它的"冷"状态,
//也就是说,它将经历与最初创建时相同的预热过程;
//返回的RateLimiter适用于实际满足请求的资源(例如远程服务器)需要"预热"时间的情况,而不是以稳定(最大)速率立即访问;
//返回的RateLimiter在"冷"状态下启动(也就是说,接下来将是预热期),如果它被闲置足够长的时间,它就会回到那个"冷"状态;
//@param permitsPerSecond the rate of the returned RateLimiter, measured in how many permits become available per second
//@param warmupPeriod the duration of the period where the RateLimiter ramps up its rate, before reaching its stable (maximum) rate
//@param unit the time unit of the warmupPeriod argument
public static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit) {
checkArgument(warmupPeriod >= 0, "warmupPeriod must not be negative: %s", warmupPeriod);
return create(permitsPerSecond, warmupPeriod, unit, 3.0, SleepingStopwatch.createFromSystemTimer());
}
@VisibleForTesting
static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit, double coldFactor, SleepingStopwatch stopwatch) {
RateLimiter rateLimiter = new SmoothWarmingUp(stopwatch, warmupPeriod, unit, coldFactor);
//Delegate to RateLimiter.setRate()
rateLimiter.setRate(permitsPerSecond);
return rateLimiter;
}
//Updates the stable rate of this RateLimiter,
//that is, the permitsPerSecond argument provided in the factory method that constructed the RateLimiter.
//Currently throttled threads will not be awakened as a result of this invocation,
//thus they do not observe the new rate; only subsequent requests will.
//Note though that, since each request repays (by waiting, if necessary) the cost of the previous request,
//this means that the very next request after an invocation to setRate() will not be affected by the new rate;
//it will pay the cost of the previous request, which is in terms of the previous rate.
//The behavior of the RateLimiter is not modified in any other way,
//e.g. if the RateLimiter was configured with a warmup period of 20 seconds,
//it still has a warmup period of 20 seconds after this method invocation.
//更新该RateLimiter的稳定速率,即在构造RateLimiter的工厂方法中提供permitsPerSecond参数;
//当前被限流的线程将不会由于这个调用而被唤醒,因此它们没有观察到新的速率;只有随后的请求才会;
//但是要注意的是,由于每个请求(如果需要,通过等待)会偿还先前请求的成本,
//这意味着调用setRate()方法后的下一个请求将不会受到新速率的影响,
//它将按照先前的速率处理先前请求的成本;
//RateLimiter的行为不会以任何其他方式修改,
//例如:如果RateLimiter被配置为具有20秒的预热周期,在该方法调用之后,它仍然有20秒的预热期;
//@param permitsPerSecond the new stable rate of this {@code RateLimiter}
public final void setRate(double permitsPerSecond) {
checkArgument(permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
//Set the rate inside a synchronized block
synchronized (mutex()) {
//Delegate to SmoothRateLimiter.doSetRate()
doSetRate(permitsPerSecond, stopwatch.readMicros());
}
}
...
}
@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
//The currently stored permits.
//令牌桶中当前缓存的未消耗的令牌数
double storedPermits;
//The maximum number of stored permits.
//令牌桶中允许存放的最大令牌数
double maxPermits;
//The interval between two unit requests, at our stable rate.
//E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
//按照我们稳定的速率,两个单位请求之间的时间间隔;例如,每秒5个令牌的稳定速率具有200ms的稳定间隔
double stableIntervalMicros;
//The time when the next request (no matter its size) will be granted.
//After granting a request, this is pushed further in the future. Large requests push this further than small requests.
//下一个请求(无论大小)将被批准的时间.
//在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程。
private long nextFreeTicketMicros = 0L;//could be either in the past or future
...
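//doSetRate() can be called repeatedly; its purpose is to set a new rate.
//The first call behaves slightly differently from later calls, because maxPermits is still 0 at that point.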
@Override
final void doSetRate(double permitsPerSecond, long nowMicros) {
//Call SmoothRateLimiter.resync() to recompute and sync the stored permits up to nowMicros.
resync(nowMicros);
//Compute the stable interval between permits, in microseconds. E.g. at 5 QPS a permit is issued every 200ms, i.e. 200,000us.
double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
this.stableIntervalMicros = stableIntervalMicros;
//Call the subclass hook (here SmoothWarmingUp.doSetRate()) to derive its internal parameters.
doSetRate(permitsPerSecond, stableIntervalMicros);
}
//Updates storedPermits and nextFreeTicketMicros based on the current time.
//根据当前时间,更新storedPermits和nextFreeTicketMicros变量
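//Note: resync() is first called while the limiter is being initialized, when maxPermits is still 0 (not yet computed) and nextFreeTicketMicros = 0.
//For SmoothBursty, coolDownIntervalMicros() = 0 at that point, so newPermits is infinite;
//for SmoothWarmingUp with a positive warmup period, coolDownIntervalMicros() = warmupPeriodMicros / 0.0 is infinite, so newPermits = 0.
//Either way min(maxPermits, ...) yields storedPermits = 0,
//and nextFreeTicketMicros is set to the "start time" nowMicros.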
void resync(long nowMicros) {
//if nextFreeTicket is in the past, resync to now
if (nowMicros > nextFreeTicketMicros) {
double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
storedPermits = min(maxPermits, storedPermits + newPermits);
nextFreeTicketMicros = nowMicros;
}
}
abstract void doSetRate(double permitsPerSecond, double stableIntervalMicros);
...
static final class SmoothWarmingUp extends SmoothRateLimiter {
private final long warmupPeriodMicros;
//The slope of the line from the stable interval (when permits == 0), to the cold interval (when permits == maxPermits)
private double slope;//斜率
private double thresholdPermits;
private double coldFactor;
SmoothWarmingUp(SleepingStopwatch stopwatch, long warmupPeriod, TimeUnit timeUnit, double coldFactor) {
super(stopwatch);
//Convert warmupPeriod to microseconds and store it in warmupPeriodMicros
this.warmupPeriodMicros = timeUnit.toMicros(warmupPeriod);
this.coldFactor = coldFactor;
}
@Override
void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
double oldMaxPermits = maxPermits;
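//stableIntervalMicros was already set by SmoothRateLimiter.doSetRate() to 1 second / QPS, in microseconds
//coldFactor is 3 when the limiter is created through the public factory method
//So the interval between permits when the system is coldest, coldIntervalMicros, is 3x the stable interval
double coldIntervalMicros = stableIntervalMicros * coldFactor;
//warmupPeriodMicros is the warmup period passed in by the caller
//stableIntervalMicros is the interval between permits in the stable phase
//thresholdPermits is the number of stored permits above which the limiter is in its warmup zone:
//half of (warmup period / stable interval), i.e. half the permits issued at the stable rate during one warmup period
//Too low a threshold enters warmup too early and hurts throughput; too high a threshold puts pressure on the system and defeats the warmup
thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
//Maximum number of permits
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
//Slope: how much the per-permit interval grows for each stored permit above thresholdPermits
slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
//Set the number of permits currently stored in the bucket
//Bursty limiter (SmoothBursty):
//no permits are pre-generated at initialization, so storedPermits starts at 0;
//permits are produced as time passes, and unused ones accumulate in storedPermits.
//Warming-up limiter (SmoothWarmingUp):
//permits are pre-generated at initialization; the system is at its coldest then, so the bucket starts at maxPermits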
if (oldMaxPermits == Double.POSITIVE_INFINITY) {
//if we don't special-case this, we would get storedPermits == NaN, below
storedPermits = 0.0;
} else {
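//For SmoothWarmingUp the bucket starts full on the first call (oldMaxPermits == 0.0), i.e. storedPermits = maxPermits;
//on later calls the stored permits are rescaled in proportion to the new maxPermits.
//For the bursty limiter SmoothBursty, storedPermits starts at 0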
storedPermits = (oldMaxPermits == 0.0) ? maxPermits : storedPermits * maxPermits / oldMaxPermits;
}
}
...
}
...
}
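To make the formulas in SmoothWarmingUp.doSetRate() concrete, the sketch below recomputes the same quantities outside Guava. It is only an illustration under stated assumptions: the class WarmupParamsSketch and its fields are hypothetical names, not Guava API, and the example values (5 permits per second, a 3 second warmup, cold factor 3) are chosen purely for the arithmetic.

```java
// Hypothetical helper, not part of Guava: it repeats the arithmetic of
// SmoothWarmingUp.doSetRate() so the derived parameters can be inspected.
final class WarmupParamsSketch {
    final double stableIntervalMicros; // interval between permits at the stable rate
    final double coldIntervalMicros;   // interval between permits when fully cold
    final double thresholdPermits;     // stored permits above this are in the warmup zone
    final double maxPermits;           // bucket capacity
    final double slope;                // interval growth per stored permit above the threshold

    WarmupParamsSketch(double permitsPerSecond, long warmupPeriodMicros, double coldFactor) {
        stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        coldIntervalMicros = stableIntervalMicros * coldFactor;
        thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
        maxPermits = thresholdPermits
                + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
        slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
    }

    public static void main(String[] args) {
        // Assumed example: 5 permits/s, 3s warmup, coldFactor 3
        WarmupParamsSketch p = new WarmupParamsSketch(5.0, 3_000_000L, 3.0);
        System.out.println("stableIntervalMicros = " + p.stableIntervalMicros);
        System.out.println("coldIntervalMicros   = " + p.coldIntervalMicros);
        System.out.println("thresholdPermits     = " + p.thresholdPermits);
        System.out.println("maxPermits           = " + p.maxPermits);
        System.out.println("slope                = " + p.slope);
    }
}
```

For these inputs the stable interval is 200,000us, the cold interval 600,000us, the threshold 7.5 permits, the capacity 15 permits, and the slope roughly 53,333us per permit. With a cold factor of 3 the warmup zone (maxPermits - thresholdPermits) is exactly as wide as the stable zone.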
IV. The acquire() method of SmoothWarmingUp

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
...
//Acquire, waiting indefinitely if necessary
//Acquires the given number of permits from this RateLimiter,
//blocking until the request can be granted.
//Tells the amount of time slept, if any.
//@param permits the number of permits to acquire
//@return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
@CanIgnoreReturnValue
public double acquire(int permits) {
//Call RateLimiter.reserve():
//reserve `permits` permits up front and get back how long the caller has to wait
long microsToWait = reserve(permits);
//Sleep for that long so the rate limit is honored (the same for every limiter type)
stopwatch.sleepMicrosUninterruptibly(microsToWait);
//Return the time spent waiting, converted to seconds
return 1.0 * microsToWait / SECONDS.toMicros(1L);
}
//Reserves the given number of permits from this RateLimiter for future use,
//returning the number of microseconds until the reservation can be consumed.
//In other words: reserve the permits now and return how many microseconds the caller must wait before using them
//@return time in microseconds to wait until the resource can be acquired, never negative
final long reserve(int permits) {
checkPermits(permits);
//Concurrent callers share state, so mutual exclusion via synchronized is required
synchronized (mutex()) {
//Call RateLimiter.reserveAndGetWaitLength()
return reserveAndGetWaitLength(permits, stopwatch.readMicros());
}
}
//Reserves next ticket and returns the wait time that the caller must wait for.
//预定下一个ticket,并且返回需要等待的时间
final long reserveAndGetWaitLength(int permits, long nowMicros) {
//Call SmoothRateLimiter.reserveEarliestAvailable()
long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
return max(momentAvailable - nowMicros, 0);
}
//Reserves the requested number of permits and returns the time that those permits can be used (with one caveat).
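//The caveat: if the permits can be used immediately, the returned time may be an arbitrary past or present moment (see @return).
//Producing permits, taking permits, and computing the blocking time are all left to subclasses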
//@return the time that the permits may be used, or, if the permits may be used immediately, an arbitrary past or present time
abstract long reserveEarliestAvailable(int permits, long nowMicros);
...
}
@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
//The currently stored permits.
//令牌桶中当前缓存的未消耗的令牌数
double storedPermits;
//The maximum number of stored permits.
//令牌桶中允许存放的最大令牌数
double maxPermits;
//The interval between two unit requests, at our stable rate.
//E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
//按照我们稳定的速率,两个单位请求之间的时间间隔;例如,每秒5个令牌的稳定速率具有200ms的稳定间隔
double stableIntervalMicros;
//The time when the next request (no matter its size) will be granted.
//After granting a request, this is pushed further in the future. Large requests push this further than small requests.
//下一个请求(无论大小)将被批准的时间. 在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程.
private long nextFreeTicketMicros = 0L;//could be either in the past or future
...
@Override
final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
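//1. Use nextFreeTicketMicros to compute the permits generated since then and update the unused permits, storedPermits.
//Unlike the call made during initialization, resync() here credits the idle time that has elapsed as stored permits.
//If nextFreeTicketMicros lies in the future, resync() does nothing.
resync(nowMicros);
//The time at which the next request (whatever its size) will be granted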
long returnValue = nextFreeTicketMicros;
//2. Compute how long to wait
//2.1. Take stored permits from the bucket first; if there are not enough, take as many as are available
//Number of stored permits that can be spent
double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
//2.2. If the stored permits do not cover the request, the remainder must come from fresh permits
//Number of fresh permits to wait for
double freshPermits = requiredPermits - storedPermitsToSpend;
//Compute the wait time, in two parts:
//waitMicros = cost of taking storedPermitsToSpend stored permits + cost of generating freshPermits fresh permits
//Taking stored permits has a cost too: storedPermitsToWaitTime() is abstract and implemented by SmoothBursty and SmoothWarmingUp
//In SmoothBursty it returns 0, meaning stored permits need no waiting.
//The cost of fresh permits is freshPermits * stableIntervalMicros (the time per permit)
long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros);
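//3. Update nextFreeTicketMicros
//The permits just granted may have been borrowed from the future, so nextFreeTicketMicros moves later to account for the time consumed
this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
//4. Deduct the permits and update what remains in the bucket
//Subtract the stored permits spent above from storedPermits
this.storedPermits -= storedPermitsToSpend;
//Return the moment at which this request may proceed (the caller turns it into a wait time)
//Note that returnValue holds the previous nextFreeTicketMicros: the cost of this request is paid by the next one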
return returnValue;
}
//Updates storedPermits and nextFreeTicketMicros based on the current time.
//根据当前时间,更新storedPermits和nextFreeTicketMicros变量
//Compute the permits generated between nextFreeTicketMicros and now; this is lazy (on-demand) computation
void resync(long nowMicros) {
//if nextFreeTicket is in the past, resync to now
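//Usually the current time is later than the time at which the next request would be granted.
//In that case the elapsed time is converted into permits and stored, capped at maxPermits.
//This happens right after the RateLimiter is initialized with no traffic yet, or when traffic returns after an idle period.
//For SmoothBursty the cap is one second's worth of permits, which is its burst capacity.
//If nextFreeTicketMicros is in the future, the if condition is false
//and neither storedPermits nor nextFreeTicketMicros is updated.
//That is the case when permits have been "borrowed in advance".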
if (nowMicros > nextFreeTicketMicros) {
//Elapsed time divided by the time to generate one permit; coolDownIntervalMicros() is abstract and implemented by subclasses
double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
//Update the stored permits, never exceeding the maximum
storedPermits = min(maxPermits, storedPermits + newPermits);
//Move nextFreeTicketMicros to the current time
nextFreeTicketMicros = nowMicros;
}
}
//Translates a specified portion of our currently stored permits which we want to spend/acquire, into a throttling time.
//Conceptually, this evaluates the integral of the underlying function we use, for the range of [(storedPermits - permitsToTake), storedPermits].
//This always holds: 0 <= permitsToTake <= storedPermits
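//The cost of taking already-stored permits out of the bucket, implemented by subclasses.
//SmoothBursty's implementation returns 0: permits that are already stored need no waiting time when acquired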
abstract long storedPermitsToWaitTime(double storedPermits, double permitsToTake);
//Returns the number of microseconds during cool down that we have to wait to get a new permit.
//Time needed to generate one new permit while cooling down, implemented by subclasses
abstract double coolDownIntervalMicros();
...
static final class SmoothWarmingUp extends SmoothRateLimiter {
private final long warmupPeriodMicros;
private double slope;//斜率
private double thresholdPermits;
private double coldFactor;
...
@Override
long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
//How many stored permits lie above thresholdPermits, the threshold of the warmup zone
double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
long micros = 0;
//If the bucket holds more than thresholdPermits,
//the system has cooled down and is in its warmup zone, so the cost of those permits follows the sloped part of the curve
if (availablePermitsAboveThreshold > 0.0) {
//How many of the permits above the threshold are taken by this request
double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
//Cost in the warmup zone: the first permitsToTime() is the per-permit interval at the start, the second is the interval at the end
double length = permitsToTime(availablePermitsAboveThreshold) + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
//Total cost = (start interval + end interval) * permits / 2, i.e. the area of a trapezoid
micros = (long) (permitsAboveThresholdToTake * length / 2.0);
permitsToTake -= permitsAboveThresholdToTake;
}
//Add the cost of the permits taken from the stable zone to get the total
micros += (long) (stableIntervalMicros * permitsToTake);
return micros;
}
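//Each stored permit above the threshold adds a fixed slope microseconds to the per-permit interval.
//Starting from the stable interval stableIntervalMicros, the interval at `permits` permits above the threshold is therefore: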
private double permitsToTime(double permits) {
return stableIntervalMicros + permits * slope;
}
@Override
double coolDownIntervalMicros() {
//Warmup period / maximum number of permits
return warmupPeriodMicros / maxPermits;
}
}
...
}
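The trapezoid calculation in storedPermitsToWaitTime() is easier to follow with numbers. The sketch below repeats that calculation as a standalone class. It is an illustration only: WarmupWaitTimeSketch is a hypothetical name, and the constants assume a limiter of 5 permits per second with a 3 second warmup and cold factor 3 (stable interval 200,000us, threshold 7.5, capacity 15).

```java
// Hypothetical helper, not part of Guava: mirrors SmoothWarmingUp.storedPermitsToWaitTime()
// for one assumed configuration (5 permits/s, 3s warmup, coldFactor 3).
final class WarmupWaitTimeSketch {
    static final double STABLE_INTERVAL_MICROS = 200_000.0;
    static final double COLD_INTERVAL_MICROS = 600_000.0;
    static final double THRESHOLD_PERMITS = 7.5;
    static final double MAX_PERMITS = 15.0;
    static final double SLOPE =
            (COLD_INTERVAL_MICROS - STABLE_INTERVAL_MICROS) / (MAX_PERMITS - THRESHOLD_PERMITS);

    // Per-permit interval when `permitsAboveThreshold` permits are stored above the threshold
    static double permitsToTime(double permitsAboveThreshold) {
        return STABLE_INTERVAL_MICROS + permitsAboveThreshold * SLOPE;
    }

    // Cost, in microseconds, of taking `permitsToTake` permits from a bucket holding `storedPermits`
    static long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
        double availableAbove = storedPermits - THRESHOLD_PERMITS;
        long micros = 0;
        if (availableAbove > 0.0) {
            // Warmup zone: area of the trapezoid under the sloped line
            double aboveToTake = Math.min(availableAbove, permitsToTake);
            double length = permitsToTime(availableAbove) + permitsToTime(availableAbove - aboveToTake);
            micros = (long) (aboveToTake * length / 2.0);
            permitsToTake -= aboveToTake;
        }
        // Stable zone: flat cost per permit
        micros += (long) (STABLE_INTERVAL_MICROS * permitsToTake);
        return micros;
    }

    public static void main(String[] args) {
        System.out.println("cold bucket, 1 permit:   " + storedPermitsToWaitTime(15.0, 1.0) + "us");
        System.out.println("warm bucket, 1 permit:   " + storedPermitsToWaitTime(5.0, 1.0) + "us");
        System.out.println("cold bucket, 15 permits: " + storedPermitsToWaitTime(15.0, 15.0) + "us");
    }
}
```

Taking one permit from a full (cold) bucket costs about 573ms, while taking one from a bucket below the threshold costs the stable 200ms. Draining the whole warmup zone costs approximately the warmup period itself (3 seconds here), which is how the warmup period ends up being honored.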
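(4) The token bucket implementation in Sentinel
I. Initialization of WarmUpController
II. The WarmUpController.canPass() method
III. The WarmUpController.syncToken() method
IV. The WarmUpController.coolDownTokens() method
Guava implements warmup by controlling the interval at which permits are generated, whereas Sentinel implements it by controlling the number of requests allowed through per second. In Guava, the cold factor coldFactor is hard-coded to 3 in the public factory method. In Sentinel, coldFactor defaults to 3 and can be changed through a constructor parameter.
I. Initialization of WarmUpController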
public class WarmUpController implements TrafficShapingController {
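//count is the QPS threshold configured in FlowRule: the maximum QPS allowed in the stable phase
//During warmup the allowed QPS does not jump straight to count; it rises gradually (moving right to left on the warmup model diagram) until it reaches count
//Traffic reaching the system therefore ramps up smoothly instead of saturating it at once
protected double count;
//coldFactor is the cold factor: the ratio of the QPS allowed in the stable phase to the QPS allowed when the system is coldest (the start of warmup)
//The coldest-state QPS is count / coldFactor, so a larger cold factor means a lower starting QPS; the default is 3
private int coldFactor;
//Warning threshold: more stored tokens than this means the system is in warmup, fewer means it is in the stable phase
protected int warningToken = 0;
//Maximum number of tokens the bucket can hold
private int maxToken;
//Slope: how much the per-token interval (1 / allowed QPS) grows for each stored token above warningToken
protected double slope;
//Number of tokens currently stored in the bucket
protected AtomicLong storedTokens = new AtomicLong(0);
//Timestamp of the last refill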
protected AtomicLong lastFilledTime = new AtomicLong(0);
public WarmUpController(double count, int warmUpPeriodInSec, int coldFactor) {
construct(count, warmUpPeriodInSec, coldFactor);
}
public WarmUpController(double count, int warmUpPeriodInSec) {
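//warmUpPeriodInSec is the warmup period: how long the system takes to go from the cold state to the stable phase
//E.g. with a QPS limit of 100 and a 10s warmup, tokens are generated faster and faster during warmup:
//with the default coldFactor of 3, roughly 33 requests are allowed in the first second, and the allowance then rises step by step until it reaches 100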
construct(count, warmUpPeriodInSec, 3);
}
private void construct(double count, int warmUpPeriodInSec, int coldFactor) {
if (coldFactor <= 1) {
throw new IllegalArgumentException("Cold factor should be larger than 1");
}
this.count = count;
this.coldFactor = coldFactor;
//thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
//1. Warning threshold: above it the system is in warmup. E.g. with a 5s warmup and QPS 100, warningToken = 5 * 100 / (3 - 1) = 250
warningToken = (int)(warmUpPeriodInSec * count) / (coldFactor - 1);
//maxPermits = thresholdPermits + 2 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
//2. Tokens in the bucket when the system is coldest. E.g. with a 5s warmup and QPS 100, maxToken = 250 + 2 * 5 * 100 / (1 + 3) = 500
maxToken = warningToken + (int)(2 * warmUpPeriodInSec * count / (1.0 + coldFactor));
//slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
//3. Slope. E.g. with a 5s warmup and QPS 100, slope = (3 - 1) / 100 / (500 - 250) = 0.00008
slope = (coldFactor - 1.0) / count / (maxToken - warningToken);
}
...
}
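The three values derived in construct() can be checked with a few lines of standalone Java. The sketch below copies only the arithmetic; SentinelWarmUpParamsSketch is a hypothetical class name, not part of Sentinel, and the inputs (count = 100, a 5 second warmup, cold factor 3) are the ones used in the comments above.

```java
// Hypothetical helper, not part of Sentinel: repeats the arithmetic of
// WarmUpController.construct() so the derived values can be inspected.
final class SentinelWarmUpParamsSketch {
    final int warningToken; // above this many stored tokens the system is in warmup
    final int maxToken;     // bucket capacity, reached when the system is coldest
    final double slope;     // growth of the per-token interval per token above warningToken

    SentinelWarmUpParamsSketch(double count, int warmUpPeriodInSec, int coldFactor) {
        if (coldFactor <= 1) {
            throw new IllegalArgumentException("Cold factor should be larger than 1");
        }
        warningToken = (int) (warmUpPeriodInSec * count) / (coldFactor - 1);
        maxToken = warningToken + (int) (2 * warmUpPeriodInSec * count / (1.0 + coldFactor));
        slope = (coldFactor - 1.0) / count / (maxToken - warningToken);
    }

    public static void main(String[] args) {
        // Assumed example: QPS threshold 100, 5s warmup, coldFactor 3
        SentinelWarmUpParamsSketch p = new SentinelWarmUpParamsSketch(100.0, 5, 3);
        System.out.println("warningToken = " + p.warningToken);
        System.out.println("maxToken     = " + p.maxToken);
        System.out.println("slope        = " + p.slope);
    }
}
```

For these inputs warningToken is 250, maxToken is 500, and slope is approximately 0.00008, matching the worked numbers in the constructor comments.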
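II. The WarmUpController.canPass() method
**Step 1:** Call WarmUpController.syncToken() to generate tokens and sync them into the bucket.
**Step 2:** Check whether the tokens remaining in the bucket have reached the warning threshold.
**Case 1:** If the remaining tokens are at or above the warning threshold, the system is in warmup. Compare the token generation rate with the consumption rate: if consumption is higher, the request is throttled; otherwise it passes.
**Case 2:** If the remaining tokens are below the warning threshold, the system is in the stable phase. Compare the current QPS directly with the threshold count and throttle the request if it would exceed it.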
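The two cases above can be sketched as a small pure function that maps the tokens left in the bucket to the QPS allowed in the current second. This is a simplified illustration, not Sentinel code: WarmUpQpsSketch and its method names are hypothetical, and the constants assume count = 100, a 5 second warmup and cold factor 3 (so warningToken = 250 and maxToken = 500).

```java
// Hypothetical helper, not part of Sentinel: the decision made in
// WarmUpController.canPass(), with the bucket state passed in explicitly.
final class WarmUpQpsSketch {
    static final double COUNT = 100.0;     // assumed stable QPS threshold
    static final int WARNING_TOKEN = 250;  // assumed, from a 5s warmup and coldFactor 3
    static final int MAX_TOKEN = 500;
    static final double SLOPE = (3 - 1.0) / COUNT / (MAX_TOKEN - WARNING_TOKEN);

    // QPS allowed in the current second for a given number of remaining tokens
    static double allowedQps(long restToken) {
        if (restToken >= WARNING_TOKEN) {
            // Warmup: the token interval grows linearly with the tokens above the threshold
            long aboveToken = restToken - WARNING_TOKEN;
            return Math.nextUp(1.0 / (aboveToken * SLOPE + 1.0 / COUNT));
        }
        // Stable phase: the full threshold applies
        return COUNT;
    }

    static boolean canPass(long restToken, long passQps, int acquireCount) {
        return passQps + acquireCount <= allowedQps(restToken);
    }

    public static void main(String[] args) {
        for (long rest : new long[] {500, 375, 250, 100}) {
            System.out.println("restToken=" + rest + " -> allowed QPS ~ " + allowedQps(rest));
        }
    }
}
```

With a full bucket (500 tokens) the allowed QPS is about 33; at 375 tokens it is about 50; at or below the warning threshold it is 100. The allowance rises as the bucket drains, which is the opposite of an ordinary token bucket where more tokens mean more capacity.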
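III. The WarmUpController.syncToken() method
This method generates tokens and syncs them into the bucket. Its parameter passQps is the QPS of the previous time window, i.e. the number of requests that passed in the previous second. It first compares the current time with the last refill time, so tokens are not added twice within the same window. It then calls WarmUpController.coolDownTokens() to get the refilled token count and uses CAS to update the bucket in a thread-safe way. Finally it subtracts the previous second's passed QPS to obtain the tokens now remaining in the bucket.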
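IV. The WarmUpController.coolDownTokens() method
This method computes the updated token count from the current time and the QPS that passed in the previous window. It reads the tokens currently stored in the bucket and compares them with the warning threshold.
Case 1: the stored tokens are below the warning threshold
The system has finished its cold start: it has left warmup and entered the stable phase. The stored tokens have not reached the level that triggers warmup, so tokens are added at the full rate.
Case 2: the stored tokens are above the warning threshold
The system is still warming up. If the QPS that passed in the previous window is lower than the QPS allowed in the coldest state, the load is low and tokens can be added. The QPS allowed in the coldest state is 1 / (coldFactor / count) = count / coldFactor.
In both cases, adding tokens means taking the current count and adding the time elapsed since the last refill (in seconds) multiplied by the QPS threshold count.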
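The refill rule described above can be sketched as a pure function, with the state that WarmUpController keeps in fields passed in as arguments. This is a simplified illustration under that assumption: CoolDownTokensSketch and its parameter list are hypothetical, and the atomics and CAS of the real class are deliberately left out.

```java
// Hypothetical helper, not part of Sentinel: the refill rule of
// WarmUpController.coolDownTokens() with all state passed in explicitly.
final class CoolDownTokensSketch {
    static long coolDownTokens(long oldValue, long elapsedMillis, long passQps,
                               double count, int coldFactor, int warningToken, int maxToken) {
        long newValue = oldValue;
        if (oldValue < warningToken) {
            // Stable phase: always refill at the full rate
            newValue = (long) (oldValue + elapsedMillis * count / 1000);
        } else if (oldValue > warningToken) {
            // Warmup: refill only while traffic stays below the coldest-state QPS
            if (passQps < (int) count / coldFactor) {
                newValue = (long) (oldValue + elapsedMillis * count / 1000);
            }
        }
        // Never exceed the bucket capacity
        return Math.min(newValue, maxToken);
    }

    public static void main(String[] args) {
        // Assumed configuration: count=100, coldFactor=3, warningToken=250, maxToken=500
        System.out.println(coolDownTokens(200, 1000, 0, 100.0, 3, 250, 500));  // below threshold
        System.out.println(coolDownTokens(400, 1000, 50, 100.0, 3, 250, 500)); // warmup, busy
        System.out.println(coolDownTokens(400, 1000, 10, 100.0, 3, 250, 500)); // warmup, idle
    }
}
```

Under the assumed configuration, a bucket at 200 tokens grows to 300 after one second; a bucket at 400 tokens stays at 400 while 50 requests per second are passing (50 is not below 100 / 3 = 33) and is topped up to the cap of 500 once traffic drops to 10. When the stored tokens equal warningToken exactly, neither branch runs and the count is left unchanged.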
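Note: Guava implements warmup by controlling the interval at which permits are generated; Sentinel implements it by controlling the number of requests allowed through per second.
Guava's implementation focuses on adjusting the request interval, which resembles the leaky bucket algorithm. Sentinel focuses on controlling the number of incoming requests per second without computing intervals, which resembles the token bucket algorithm.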
//The principle idea comes from Guava.
//However, the calculation of Guava is rate-based, which means that we need to translate rate to QPS.
//这个原理来自于Guava;
//然而,Guava的计算是基于速率的,这意味着我们需要将速率转换为QPS;
//Requests arriving at the pulse may drag down long idle systems even though it has a much larger handling capability in stable period.
//It usually happens in scenarios that require extra time for initialization,
//e.g. DB establishes a connection, connects to a remote service, and so on.
//That's why we need "warm up".
//突发式的流量可能会拖累一个长期空闲的系统,即使这个系统在稳定阶段具有更大的流量处理能力;
//这通常发生在需要额外时间进行初始化的场景中,比如DB建立连接、连接到远程服务等;
//这就是为什么我们需要对系统进行"预热";
//Sentinel's "warm-up" implementation is based on the Guava's algorithm.
//However, Guava's implementation focuses on adjusting the request interval, which is similar to leaky bucket.
//Sentinel pays more attention to controlling the count of incoming requests per second without calculating its interval,
//which resembles token bucket algorithm.
//Sentinel的"预热"实现是基于Guava的算法的;
//然而,Guava的实现侧重于调整请求间隔,这类似于漏桶;
//而Sentinel更注重控制每秒传入请求的数量,而不计算其间隔,这类似于令牌桶算法;
//The remaining tokens in the bucket is used to measure the system utility.
//Suppose a system can handle b requests per second.
//Every second b tokens will be added into the bucket until the bucket is full.
//And when system processes a request, it takes a token from the bucket.
//The more tokens left in the bucket, the lower the utilization of the system;
//when the token in the token bucket is above a certain threshold,
//we call it in a "saturation" state.
//Note: "system utility" above means utilization; the tokens left in the bucket measure how idle the system is
//假设一个系统每秒可以处理b个请求;
//那么每秒就有b个令牌被添加到桶中,直到桶满为止;
//当系统处理一个请求时,就会从桶中获取一个令牌;
//桶中存储的令牌剩余得越多,那么就说明系统的利用率就越低;
//当令牌桶中的令牌数高于某个阈值时,我们称之为"饱和"状态;
//Base on Guava's theory, there is a linear equation we can write this in the form
//y = m * x + b where y (a.k.a y(x)), or qps(q)),
//is our expected QPS given a saturated period (e.g. 3 minutes in),
//m is the rate of change from our cold (minimum) rate to our stable (maximum) rate,
//x (or q) is the occupied token.
//根据Guava的理论,有一个线性方程,我们可以把它写成y = m * x + b;
//这是在给定饱和周期(例如3分钟)的情况下预期的QPS;
//m是从我们的冷(最小)速率到我们的稳定(最大)速率的变化率;
//x(或q)就是需要被占用的令牌数;
public class WarmUpController implements TrafficShapingController {
...
@Override
public boolean canPass(Node node, int acquireCount) {
return canPass(node, acquireCount, false);
}
@Override
public boolean canPass(Node node, int acquireCount, boolean prioritized) {
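//QPS that has passed so far in the current 1s window
long passQps = (long) node.passQps();
//QPS that passed in the previous window
long previousQps = (long) node.previousPassQps();
//1. Generate tokens and sync them into the bucket
syncToken(previousQps);
//Tokens remaining in the bucket
long restToken = storedTokens.get();
//2. If the bucket holds at least warningToken tokens, the system is still warming up: compare the token generation rate with the consumption rate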
if (restToken >= warningToken) {
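//Number of remaining tokens above the warning threshold
long aboveToken = restToken - warningToken;
//Current token interval = stable interval + tokens stored above the threshold * slope,
//where the stable interval is 1 / count and the tokens above the threshold are aboveToken.
//Note: the more tokens are stored above the threshold, the longer the interval,
//so the more tokens in the bucket, the lower the QPS allowed in this second.
//The line below computes
//how many tokens can be generated in the current 1s window, i.e. the QPS it can serve = 1 / current token interval
double warningQps = Math.nextUp(1.0 / (aboveToken * slope + 1.0 / count));
//Allow the request if the consumption rate (passQps + acquireCount) <= the generation rate (warningQps),
//i.e. if the QPS already passed in this window plus the tokens requested does not exceed the warmup-phase QPS limit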
if (passQps + acquireCount <= warningQps) {
return true;
}
}
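//3. If the bucket holds fewer than warningToken tokens, warmup is over and the system is in the stable phase
else {
//Allow the request if the consumption rate (passQps + acquireCount) <= the stable generation rate (count)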
if (passQps + acquireCount <= count) {
return true;
}
}
return false;
}
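//Generate tokens and sync them into the bucket.
//passQps is the QPS of the previous time window, i.e. the requests that passed in the previous second.
//syncToken() works as follows:
//1. Compare the current time with the last refill time, so tokens are not added twice within the same window;
//2. Call WarmUpController.coolDownTokens() to get the refilled token count;
//3. Use CAS to update the bucket in a thread-safe way;
//4. Subtract the previous second's passed QPS from the stored tokens to get the tokens now remaining.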
protected void syncToken(long passQps) {
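//Current time in ms
long currentTime = TimeUtil.currentTimeMillis();
//Drop the millisecond part so the time is aligned to a whole second (the unit is still ms)
currentTime = currentTime - currentTime % 1000;
//Time of the last refill of the bucket
long oldLastFillTime = lastFilledTime.get();
//If the last refill time equals the current time, or is later than it (e.g. after a clock rollback),
//there is nothing to update, so return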
if (currentTime <= oldLastFillTime) {
return;
}
//Tokens currently stored in the bucket
long oldValue = storedTokens.get();
//Call WarmUpController.coolDownTokens() to get the refilled token count
long newValue = coolDownTokens(currentTime, passQps);
//Update the stored tokens with CAS
//Note: on the first request after initialization, canPass() ends up here with newValue = maxToken, so the bucket starts full
if (storedTokens.compareAndSet(oldValue, newValue)) {
//Remaining tokens = refilled tokens - requests that passed in the previous window
long currentValue = storedTokens.addAndGet(0 - passQps);
if (currentValue < 0) {
storedTokens.set(0L);
}
//Record the time of this refill
lastFilledTime.set(currentTime);
}
}
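//Compute the updated token count from the current time and the previous window's passed QPS
private long coolDownTokens(long currentTime, long passQps) {
//Tokens currently stored in the bucket
long oldValue = storedTokens.get();
long newValue = oldValue;
//If the stored tokens are below the warning threshold, the cold start is over: the system has left warmup and is in the stable phase.
//The stored tokens have not reached the level that triggers warmup, so tokens are added at the full rate
if (oldValue < warningToken) {
//Current tokens + time elapsed since the last refill (in seconds) * token generation rate (the QPS threshold count)
newValue = (long)(oldValue + (currentTime - lastFilledTime.get()) * count / 1000);
}
//If the stored tokens are above the warning threshold, the system is still warming up
else if (oldValue > warningToken) {
//If the QPS that passed in the previous window is below the coldest-state QPS, count / coldFactor,
//the load is low and tokens can be added
if (passQps < (int)count / coldFactor) {
//Current tokens + time elapsed since the last refill (in seconds) * token generation rate (the QPS threshold count)
newValue = (long)(oldValue + (currentTime - lastFilledTime.get()) * count / 1000);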
}
}
//Cap the refilled count at maxToken.
//On the first request after initialization, oldValue = 0 and lastFilledTime = 0,
//so the elapsed time is huge and maxToken is returned
return Math.min(newValue, maxToken);
}
}
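(5) Summary of Sentinel's token bucket
The core idea of WarmUpController: first, sync the tokens in the bucket based on the current time and the QPS that passed in the previous window. Then compare the stored tokens with the warning threshold and compute the QPS allowed in the current window. Finally, throttle the request if the current QPS plus the requested tokens exceeds that allowed QPS.
Note: during warmup the token generation rate rises gradually, giving a smooth transition to the stable phase. At startup the bucket is full, tokens are generated at the lowest rate, and the allowed QPS is at its minimum. As the stored tokens drain, generation speeds up and the allowed QPS rises. Once the stable phase is reached, the allowed QPS equals the QPS threshold set in FlowRule.
In the stable phase the interval between tokens is 1/count; with the default cold factor of 3, the interval in the coldest state is 3/count. Warmup therefore starts at an allowed QPS of count/3 and ends at count.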
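The start and end points of the warmup can be derived from the formulas in construct() and canPass(). The derivation below is a sketch that ignores the integer truncation in warningToken and maxToken and the Math.nextUp() adjustment, so the real values may differ slightly in the last digits.

```latex
\begin{aligned}
\text{slope} &= \frac{\text{coldFactor} - 1}{\text{count}\,(\text{maxToken} - \text{warningToken})} \\[4pt]
\text{warningQps}(\text{maxToken})
  &= \frac{1}{(\text{maxToken} - \text{warningToken})\cdot\text{slope} + \frac{1}{\text{count}}}
   = \frac{1}{\frac{\text{coldFactor} - 1}{\text{count}} + \frac{1}{\text{count}}}
   = \frac{\text{count}}{\text{coldFactor}} \\[4pt]
\text{warningQps}(\text{warningToken})
  &= \frac{1}{0\cdot\text{slope} + \frac{1}{\text{count}}}
   = \text{count}
\end{aligned}
```

With count = 100 and coldFactor = 3 this gives roughly 33 QPS when the bucket is full and 100 QPS once the stored tokens have drained to the warning threshold.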