Sentinel源码—5.FlowSlot借鉴Guava的限流算法

大纲

1.Guava提供的RateLimiter限流使用示例

2.Guava提供的RateLimiter简介与设计

3.继承RateLimiter的SmoothBursty源码

4.继承RateLimiter的SmoothWarmingUp源码

1.Guava提供的RateLimiter限流使用示例

(1)拦截器示例

(2)AOP切面示例

(1)拦截器示例

一.pom文件中引入Guava的依赖包

复制代码
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>27.0.1-jre</version>
</dependency>

二.自定义拦截器并在拦截器中实现限流

首先定义一个拦截器抽象类,用于多个拦截器复用。主要是继承HandlerInterceptorAdapter,重写preHandle()方法,并且提供preFilter()抽象方法供子类实现。

复制代码
public abstract class AbstractInterceptor extends HandlerInterceptorAdapter {
    private Logger logger = LoggerFactory.getLogger(AbstractInterceptor.class);

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        ResponseEnum result;
        try {
            result = preFilter(request);
        } catch (Exception e) {
            logger.error("preHandle catch a exception:" + e.getMessage());
            result = ResponseEnum.FAIL;
        }
        if (ResponseEnum.SUCCESS.code.equals(result.code)) {
            return true;
        }
        handlerResponse(result, response);
        return false;
    }

    //自定义pre处理
    protected abstract ResponseEnum preFilter(HttpServletRequest request);

    //错误处理事件
    private void handlerResponse(ResponseEnum result, HttpServletResponse response) {
        ResponseDto responseDto = new ResponseDto();
        responseDto.setCode(result.code);
        responseDto.setStatus(result.status);
        responseDto.setMessage(result.message);
        response.setStatus(HttpServletResponse.SC_OK);
        response.setContentType(MediaType.APPLICATION_JSON_UTF8_VALUE);
        PrintWriter printWriter = null;
        try {
            printWriter = response.getWriter();
            printWriter.write(JsonUtils.toJson(responseDto));
        } catch (Exception e) {
            logger.error("handlerResponse catch a exception:" + e.getMessage());
        } finally {
            if (printWriter != null) {
                printWriter.close();
            }
        }
    }
}

然后定义流量控制拦截器,流量控制拦截器继承自上面的拦截器抽象类,并在preFilter()方法中进行流量控制。

使用Guava提供的RateLimiter类来实现流量控制,过程很简单:定义一个QPS为1的全局限流器,使用tryAcquire()方法来尝试获取令牌。如果成功则返回允许通过,否则返回限流提示。

复制代码
@Component("rateLimitInterceptor")
public class RateLimitInterceptor extends AbstractInterceptor {
    private Logger logger = LoggerFactory.getLogger(RateLimitInterceptor.class);
    //定义一个QPS为1的全局限流器
    private static final RateLimiter rateLimiter = RateLimiter.create(1);

    public static void setRate(double limiterQPS){
        rateLimiter.setRate(limiterQPS);
    }
    
    @Override
    protected ResponseEnum preFilter(HttpServletRequest request) {
        if (!rateLimiter.tryAcquire()) {
            logger.warn("限流中......");
            return ResponseEnum.RATE_LIMIT;
        }
        return ResponseEnum.SUCCESS;
    }
}

三.继承WebMvcConfigurationSupport来添加自定义拦截器

复制代码
@Configuration
public class MyWebAppConfigurer extends WebMvcConfigurationSupport {
    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        //多个拦截器组成一个拦截器链
        //addPathPatterns()方法用于添加拦截规则
        //excludePathPatterns()方法用于排除拦截
        registry.addInterceptor(new RateLimitInterceptor()).addPathPatterns("/**");
        super.addInterceptors(registry);
    }
}

四.写一个Controller来提供一个简单的访问接口

复制代码
@RestController
public class GuavaController {
    private Logger logger = LoggerFactory.getLogger(GuavaController.class);
    @RequestMapping(value = "getUserList", method = RequestMethod.GET)
    public String getUserList() {
        String result = null;
        try {
            result = "请求成功";
        } catch (Exception e) {
            logger.error("请求失败", e);
            return JsonUtils.toJson(ResponseUtils.failInServer(result));
        }
        return JsonUtils.toJson(ResponseUtils.success(result));
    }
}

(2)AOP切面示例

一.编写自定义注解Limiter

当需要对接口限流时,可直接使用@Limiter注解。

复制代码
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Limiter {
    //默认每秒放入桶中的token
    double limitNum() default 5;
    String name() default "";
}

二.编写切面处理逻辑

复制代码
@Aspect
@Component
public class RateLimitAspect {
    private static final Logger log = LoggerFactory.getLogger(RateLimitAspect.class);
    //以方法名为key,缓存每个方法对应的RateLimiter
    private final ConcurrentHashMap<String, RateLimiter> RATE_LIMITER = new ConcurrentHashMap<>();
    private RateLimiter rateLimiter;
    
    @Pointcut("@annotation(添加Limiter注解所在类路径)")
    public void serviceLimit() {
    }

    @Around("serviceLimit()")
    public Object around(ProceedingJoinPoint point) throws Throwable {
        //获取拦截的方法名
        Signature sig = point.getSignature();
        //转换为MethodSignature,以便获取方法及其参数类型
        MethodSignature msig = (MethodSignature) sig;
       
        //返回被织入增强处理的目标对象
        Object target = point.getTarget();
        //为了获取注解信息
        Method currentMethod = target.getClass().getMethod(msig.getName(), msig.getParameterTypes());
        //获取注解信息
        Limiter annotation = currentMethod.getAnnotation(Limiter.class);
       
        //获取注解每秒加入桶中的token
        double limitNum = annotation.limitNum();
        //注解所在方法名区分不同的限流策略
        String methodName = msig.getName();
        
        if (RATE_LIMITER.containsKey(methodName)) {
            rateLimiter = RATE_LIMITER.get(methodName);
        } else {
            RATE_LIMITER.put(methodName, RateLimiter.create(limitNum));
            rateLimiter = RATE_LIMITER.get(methodName);
        }
      
        if (rateLimiter.tryAcquire()) {
            log.info("流量正常范围内");
            return point.proceed();
        } else {
            log.info("您被限流了");
            //未获取到令牌时直接返回null,实际项目中也可以返回统一的限流响应或抛出限流异常
            return null;
        }
    }
}

三.业务代码中添加限流控制

注意:@Limiter注解中的limitNum参数表示每秒接口最大调用次数,而name表示限流名称,整个工程中需要保证全局唯一。

复制代码
@RequestMapping("/test/path")
@RestController
public class LimiterController {
    @PostMapping("/do")
    @Limiter(limitNum = 30, name = "test_name")
    public Result vote(@RequestBody @Validated TestRequest request) {
        //编写业务逻辑代码
        return null;
    }
}

2.Guava提供的RateLimiter简介与设计

(1)RateLimiter的简介

(2)RateLimiter通过延迟计算的方式实现限流

(3)SmoothRateLimiter的注释说明

(4)SmoothWarmingUp的注释说明

(1)RateLimiter的简介

RateLimiter是一个抽象类,从它的注释可知:

一.限流器会以固定配置的速率来分配令牌。每次调用acquire()方法,如果令牌不够则会阻塞,直到有足够的令牌。

二.RateLimiter是线程安全的。它会限制所有线程的总速率,但是它并不能保证公平。

三.acquire(1)和acquire(100)所产生的影响是一样的。它们都不会影响第一次调用这个方法的请求,影响的是后面的请求。

RateLimiter类本身是一个抽象类,子类SmoothRateLimiter又做了一层抽象。SmoothRateLimiter有两个子类SmoothBursty和SmoothWarmingUp,可以说SmoothWarmingUp是SmoothBursty的升级版,SmoothWarmingUp是为了弥补SmoothBursty的不足而实现的。

所以RateLimiter有两个具体的继承类:SmoothWarmingUp和SmoothBursty。SmoothWarmingUp和SmoothBursty都是SmoothRateLimiter的内部类。分别对应两种限流方式:一是有预热时间,一是没有预热时间。

复制代码
//Conceptually:在概念上;distributes:分发、分配;permits:许可、令牌;configurable:可配置的
//restrict:限制;in contrast to:与...相反、相比之下
//Absent additional configuration:缺少额外配置;individual:单独的;maintained:维持的
//steadily:稳定地;smoothly:平稳地、平滑地;accomplished:完成;specifying:明确规定
//throttling、throttle:阻碍、抑制、使节流;
//i.e.:也就是

//Conceptually, a rate limiter distributes permits at a configurable rate.
//Each acquire() blocks if necessary until a permit is available, and then takes it.
//Once acquired, permits need not be released.

//从概念上讲,速率限制器RateLimiter会以可配置的速率分配令牌;
//如果需要,每次acquire()调用都会阻塞,直到一个令牌可用,然后再获取到一个令牌;
//一旦获得令牌,就不需要释放了;

//RateLimiter is safe for concurrent use: 
//It will restrict the total rate of calls from all threads. 
//Note, however, that it does not guarantee fairness.

//并发使用RateLimiter时是安全的;
//它将限制来自所有线程的调用的总速率;
//然而,它并不能保证公平;

//Rate limiters are often used to restrict the rate at which some physical or logical resource is accessed.
//This is in contrast to JDK's Semaphore which restricts the number of concurrent accesses instead of the rate.

//速率限制器RateLimiter通常用于限制访问某些物理或逻辑资源的速率;
//这与JDK的Semaphore形成对比,后者限制并发访问的数量而不是速率;

//A RateLimiter is defined primarily by the rate at which permits are issued. 
//Absent additional configuration, permits will be distributed at a fixed rate, defined in terms of permits per second.
//Permits will be distributed smoothly, 
//with the delay between individual permits being adjusted to ensure that the configured rate is maintained.

//一个RateLimiter主要由发放令牌的速率来定义;
//如果没有额外的配置,令牌将以固定的速率分发,以每秒多少个令牌的形式定义;
//通过调整各个令牌之间的延迟来确保维持所配置的速率,来实现令牌被平滑地发放;

//It is possible to configure a RateLimiter to have a warmup period during which time
//the permits issued each second steadily increases until it hits the stable rate.
//可以将RateLimiter配置为具有预热期,在此期间每秒发放的令牌会稳步增加,直到达到稳定的速率;

//As an example, imagine that we have a list of tasks to execute, 
//but we don't want to submit more than 2 per second:
//举个例子,假设我们有一个要执行的任务列表,但我们不希望每秒超过2个提交:
//    final RateLimiter rateLimiter = RateLimiter.create(2.0); // rate is "2 permits per second"
//    void submitTasks(List<Runnable> tasks, Executor executor) {
//        for (Runnable task : tasks) {
//            rateLimiter.acquire(); // may wait
//            executor.execute(task);
//        }
//    }

//As another example, imagine that we produce a stream of data, and we want to cap it at 5kb per second.
//This could be accomplished by requiring a permit per byte, and specifying a rate of 5000 permits per second:
//作为另一个例子,假设我们产生一个数据流,我们希望将其限制在每秒5kb;
//这可以通过要求每个字节有一个令牌,并指定每秒5000个令牌的速率来实现:
//    final RateLimiter rateLimiter = RateLimiter.create(5000.0); // rate = 5000 permits per second
//    void submitPacket(byte[] packet) {
//        rateLimiter.acquire(packet.length);
//        networkService.send(packet);
//    }

//It is important to note that the number of permits requested never affects the throttling of the request itself 
//(an invocation to acquire(1) and an invocation to acquire(1000) will result in exactly the same throttling, if any), 
//but it affects the throttling of the next request. 
//I.e., if an expensive task arrives at an idle RateLimiter, it will be granted immediately, 
//but it is the next request that will experience extra throttling,
//thus paying for the cost of the expensive task.

//需要注意的是,请求的令牌数量永远不会影响请求本身的限流;
//调用acquire(1)和调用acquire(1000)将导致完全相同的限流(如果有的话);
//但是它会影响下一个请求的限流;
//也就是说,如果一个昂贵的任务到达空闲的RateLimiter,它将会被立即允许;
//但是下一个请求将经历额外的限流,从而为这个昂贵任务的成本买单;

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime") // lots of violations - also how should we model a rate?
public abstract class RateLimiter {
    ...
    ...
}

(2)RateLimiter通过延迟计算的方式实现限流

令牌桶算法就是以固定速率生成令牌放入桶中。每个请求都需要从桶中获取令牌,没有获取到令牌的请求会被阻塞限流。当令牌消耗速度小于生成速度时,令牌桶内就会预存这些未消耗的令牌。当有突发流量进来时,可以直接从桶中取出令牌,而不会被限流。

漏桶算法就是将请求放入桶中,然后以固定的速率从桶中取出请求来处理。当桶中等待的请求数超过桶的容量后,后续的请求就不再加入桶中。

漏桶算法适用于需要以固定速率处理请求的场景。在多数业务场景中,其实并不需要按照严格的速率进行请求处理。而且多数业务场景都需要应对突发流量的能力,所以会使用令牌桶算法。

但不管是令牌桶算法还是漏桶算法,都可以通过延迟计算的方式来实现。延迟计算指的是不需要单独的线程来定时生成令牌或者从漏桶中定时获取请求,而是由调用限流器的线程自己计算是否有足够的令牌以及需要sleep的时间。延迟计算的方式可以节省一个线程资源。

Guava提供的RateLimiter就是通过延迟计算的方式来实现限流效果的。
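
下面给出一个极简的延迟计算令牌桶示意(非Guava源码,类名LazyTokenBucket为说明用的假设命名),仅用来体现"没有后台线程,由调用限流器的线程按流逝时间补算令牌"的思路:

复制代码
public class LazyTokenBucket {
    //每秒生成的令牌数
    private final double permitsPerSecond;
    //令牌桶容量
    private final double maxPermits;
    //当前存储的令牌数
    private double storedPermits;
    //上次补算令牌的时间
    private long lastSyncNanos = System.nanoTime();

    public LazyTokenBucket(double permitsPerSecond, double maxPermits) {
        this.permitsPerSecond = permitsPerSecond;
        this.maxPermits = maxPermits;
    }

    //每次请求到来时,由当前线程根据距离上次补算所流逝的时间计算新生成的令牌
    public synchronized boolean tryAcquire() {
        long nowNanos = System.nanoTime();
        double newPermits = (nowNanos - lastSyncNanos) / 1_000_000_000.0 * permitsPerSecond;
        storedPermits = Math.min(maxPermits, storedPermits + newPermits);
        lastSyncNanos = nowNanos;
        if (storedPermits >= 1.0) {
            storedPermits -= 1.0;
            return true;
        }
        return false;
    }
}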

(3)SmoothRateLimiter的注释说明

复制代码
//How is the RateLimiter designed, and why?

//The primary feature of a RateLimiter is its "stable rate", 
//the maximum rate that is should allow at normal conditions. 
//This is enforced by "throttling" incoming requests as needed, 
//i.e. compute, for an incoming request, the appropriate throttle time, 
//and make the calling thread wait as much.

//RateLimiter的主要特点是其"稳定的速率"------即在正常条件下被允许的最大速率;
//这是通过根据需要,强制限制"到达的请求"来实现的;
//也就是说,对于一个到达的请求,计算一个合适的限流时间,然后让调用线程等待同样多的时间;

//The simplest way to maintain a rate of QPS is to keep the timestamp of the last granted request, 
//and ensure that (1/QPS) seconds have elapsed since then. 
//For example, for a rate of QPS=5 (5 tokens per second), 
//if we ensure that a request isn't granted earlier than 200ms after the last one, then we achieve the intended rate. 
//If a request comes and the last request was granted only 100ms ago, then we wait for another 100ms. 
//At this rate, serving 15 fresh permits (i.e. for an acquire(15) request) naturally takes 3 seconds.

//保持QPS速率的最简单方法是保留最后一个允许请求的时间戳,并确保从那时起过了(1/QPS)秒之后再放行另外一个请求;
//例如:对于QPS=5的速率(每秒五个令牌,200ms一个)
//如果我们能保证一个请求被允许通过的时间点比上次放行的请求的时间点之差不小于200ms,那么我们就算保证了这个QPS=5的速率;
//如果一个请求到达时,上次放行的请求才过了100ms,那么当前这个请求就得再等待100ms;
//按照这个速率,如果调用acquire(15),就是想得到15个新鲜的令牌,那么就需要花费3秒的时间;

//It is important to realize that such a RateLimiter has a very superficial memory of the past: it only remembers the last request.
//What if the RateLimiter was unused for a long period of time, then a request arrived and was immediately granted? 
//This RateLimiter would immediately forget about that past underutilization.
//This may result in either underutilization or overflow, 
//depending on the real world consequences of not using the expected rate.

//重要的是需要认识到,这样一个RateLimiter对于过去的请求是不怎么记忆的,唯一记忆的是上次的请求;
//如果RateLimiter在很长一段时间内未使用,那么一个请求到达并立即被批准,该怎么办?
//RateLimiter会立刻忘记它之前是处于一个未被充分利用的情况("过去未充分利用");
//这可能导致利用不足或溢出,具体取决于未使用预期速率的实际后果;

//Past underutilization could mean that excess resources are available. 
//Then, the RateLimiter should speed up for a while, to take advantage of these resources. 
//This is important when the rate is applied to networking (limiting bandwidth), 
//where past underutilization typically translates to "almost empty buffers", which can be filled immediately.

//"过去未充分利用"可能意味着资源过剩;RateLimiter应该要加快速度来充分利用资源;
//当这个速率代表着带宽限制的时候,"过去未充分利用"这种状况通常意味着"几乎是空的缓冲区",它可以被瞬间填充满;

//On the other hand, past underutilization could mean that 
//"the server responsible for handling the request has become less ready for future requests", 
//i.e. its caches become stale, and requests become more likely to trigger expensive operations 
//(a more extreme case of this example is when a server has just booted, 
//and it is mostly busy with getting itself up to speed).

//另一方面,"过去未充分利用"也可能意味着:负责处理请求的服务器对未来的请求准备不足;
//例如:它的缓存失效了,服务端需要花费更多的时间来处理请求;
//一个更极端的情况就是服务端刚刚启动,它忙于让自己跟上速度;

//To deal with such scenarios, we add an extra dimension, 
//that of "past underutilization", modeled by "storedPermits" variable. 
//This variable is zero when there is no underutilization, 
//and it can grow up to maxStoredPermits, for sufficiently large underutilization. 
//So, the requested permits, by an invocation acquire(permits), are served from:
// - stored permits (if available)
// - fresh permits (for any remaining permits)

//为了处理这种情况,我们增加了一个额外的维度:"过去未充分利用"使用storedPermits变量来表示;
//当不存在未充分利用时,该变量为零;当未充分利用持续足够久时,它可以一直增长到maxStoredPermits;
//因此acquire(permits)得到的令牌由两部分组成:存储的令牌、新鲜的令牌

//How this works is best explained with an example:
//For a RateLimiter that produces 1 token per second, 
//every second that goes by with the RateLimiter being unused, we increase storedPermits by 1. 
//Say we leave the RateLimiter unused for 10 seconds 
//(i.e., we expected a request at time X, but we are at time X + 10 seconds before a request actually arrives; 
//this is also related to the point made in the last paragraph), 
//thus storedPermits becomes 10.0 (assuming maxStoredPermits >= 10.0). 
//At that point, a request of acquire(3) arrives. 
//We serve this request out of storedPermits, 
//and reduce that to 7.0 (how this is translated to throttling time is discussed later). 
//Immediately after, assume that an acquire(10) request arriving. 
//We serve the request partly from storedPermits, using all the remaining 7.0 permits, 
//and the remaining 3.0, we serve them by fresh permits produced by the rate limiter.

//这是如何工作的呢,最好用一个例子来解释:
//对于每秒生成1个令牌的RateLimiter,
//在RateLimiter未使用的情况下,每过一秒,我们就会将storedPermits增加1;
//假设我们让RateLimiter闲置10秒钟(即,我们期望在时间点X收到请求,但是在时间点X+10秒时才收到请求;这也与上一段中提出的观点有关)
//那么storedPermits就会变成10.0(假设maxStoredPermits >= 10.0)
//在这种情况下,有一个acquire(3)的请求到达;
//我们会从storedPermits中取出令牌来服务这个请求,并将storedPermits减少到7.0;
//刚处理完这个请求,马上有个acquire(10)的请求到来,
//我们会继续从storedPermits中取出剩下的7个,其它3个从freshPermits中取出;

//We already know how much time it takes to serve 3 fresh permits: 
//if the rate is "1 token per second", then this will take 3 seconds. 
//But what does it mean to serve 7 stored permits? 
//As explained above, there is no unique answer. 
//If we are primarily interested to deal with underutilization, 
//then we want stored permits to be given out faster than fresh ones,
//because underutilization = free resources for the taking. 
//If we are primarily interested to deal with overflow, 
//then stored permits could be given out slower than fresh ones. 
//Thus, we require a (different in each case) function that translates storedPermits to throttling time.

//我们已经知道提供3个新鲜的令牌需要多长时间:如果速率是"每秒1个令牌",那么这将需要3秒;
//但是,提供7个存储的令牌意味着什么?如上所述,这没有唯一的答案;
//如果我们主要感兴趣的是处理未充分利用的问题,那么可以让存储的令牌比新鲜的令牌发放得更快,因为未充分利用 = 存在可供占用的空闲资源;
//如果我们主要感兴趣的是处理溢出的问题,那么可以让存储的令牌比新鲜的令牌发放得更慢;
//因此,我们需要一个(在每种情况下都不同)函数将storedPermits转换为限流时间;

//This role is played by storedPermitsToWaitTime(double storedPermits, double permitsToTake). 
//The underlying model is a continuous function mapping storedPermits (from 0.0 to maxStoredPermits)
//onto the 1/rate (i.e. intervals) that is effective at the given storedPermits. 
//"storedPermits" essentially measure unused time; 
//we spend unused time buying/storing permits. 
//Rate is "permits / time", thus "1 / rate = time / permits". 
//Thus, "1/rate" (time / permits) times "permits" gives time, 
//i.e., integrals on this function (which is what storedPermitsToWaitTime() computes) 
//correspond to minimum intervals between subsequent requests, 
//for the specified number of requested permits.

//这个角色由函数storedPermitsToWaitTime(double storedPermits,double permitsToTake)来扮演;
//该函数的底层模型是一个映射"storedPermits"到"1/rate"的连续函数;
//其中"storedPermits"主要衡量RateLimiter未被使用的时间,我们会在这段未使用的时间内存储令牌;
//而"rate"(速率)就是"申请的令牌/时间",即"rate" = "permits/time";
//因此,"1/rate" = "time/permits",表示每个令牌需要的时间;
//因此,"1/rate"(time/permits)乘以"permits"等于给定的时间time;
//也就是说,此连续函数上的积分(就是storedPermitsToWaitTime()计算的值),
//对应于随后的(申请指定令牌permits的)请求之间的最小时间间隔;

//Here is an example of storedPermitsToWaitTime: 
//If storedPermits == 10.0, and we want 3 permits,
//we take them from storedPermits, reducing them to 7.0, 
//and compute the throttling for these as a call to storedPermitsToWaitTime(storedPermits = 10.0, permitsToTake = 3.0), 
//which will evaluate the integral of the function from 7.0 to 10.0.

//下面是一个关于storedPermitsToWaitTime()函数的例子:
//如果storedPermits==10.0,并且我们想要3个令牌;
//那么我们会从存储的令牌中获取,将其降低到7.0;
//而且会调用函数storedPermitsToWaitTime(storedPermits = 10.0, permitsToTake = 3.0)计算出限流信息,
//这可以评估出连续函数从7.0到10.0的积分;
//注意:积分的结果是请求间的最小间隔;

//Using integrals guarantees that the effect of a single acquire(3) is equivalent to 
//{ acquire(1); acquire(1); acquire(1); }, or { acquire(2); acquire(1); }, etc, 
//since the integral of the function in [7.0, 10.0] is equivalent to 
//the sum of the integrals of [7.0, 8.0], [8.0, 9.0], [9.0, 10.0] (and so on), 
//no matter what the function is. 
//This guarantees that we handle correctly requests of varying weight (permits), 
//no matter what the actual function is - so we can tweak the latter freely. 
//(The only requirement, obviously, is that we can compute its integrals).

//使用积分可以保证acquire(3)得到的结果和{ acquire(1); acquire(1); acquire(1); }或者{ acquire(2); acquire(1); }的结果是一样的,
//由于不管连续函数是什么,[7.0,10.0]中的积分等于[7.0,8.0],[8.0,9.0],[9.0,10.0]的积分之和(依此类推);
//这保证了我们能正确处理不同令牌申请的请求;
//显然,我们唯一需要做的就是计算该连续函数的积分;

//Note well that if, for this function, we chose a horizontal line, at height of exactly (1/QPS),
//then the effect of the function is non-existent: 
//we serve storedPermits at exactly the same cost as fresh ones (1/QPS is the cost for each). 
//We use this trick later.

//注意,如果对于这个函数,我们选择了一条水平线,高度恰好为(1/QPS),则函数的效果不存在:
//我们以与新鲜的令牌完全相同的成本提供储存的令牌(1/QPS是每个令牌的成本);
//我们稍后会用到这个技巧;

//If we pick a function that goes below that horizontal line, 
//it means that we reduce the area of the function, thus time. 
//Thus, the RateLimiter becomes faster after a period of underutilization. 
//If, on the other hand, we pick a function that goes above that horizontal line, 
//then it means that the area (time) is increased, thus storedPermits are more costly than fresh permits, 
//thus the RateLimiter becomes slower after a period of underutilization.

//如果我们选择一个位于该水平线下方的函数,
//这意味着我们减少了函数的面积,从而减少了时间;
//因此,经过一段时间的未充分利用后,RateLimiter会变得更快;
//另一方面,如果我们选择一个位于该水平线上方的函数,
//则意味着面积(时间)增加,因此存储的令牌比新鲜的令牌更具成本,
//因此RateLimiter在一段时间的未充分利用之后变得更慢;

//Last, but not least: consider a RateLimiter with rate of 1 permit per second, 
//currently completely unused, and an expensive acquire(100) request comes. 
//It would be nonsensical to just wait for 100 seconds, and then start the actual task. 
//Why wait without doing anything? 
//A much better approach is to allow the request right away (as if it was an acquire(1) request instead), 
//and postpone subsequent requests as needed. 
//In this version, we allow starting the task immediately, and postpone by 100 seconds future requests, 
//thus we allow for work to get done in the meantime instead of waiting idly.

//最后,考虑速率为每秒1个令牌的RateLimiter,当前完全未使用,并且来了一个acquire(100)请求;
//如果先等待100秒,然后才开始执行实际的任务,那是没有意义的;为什么在等待的时候什么都不做呢?
//一个更好的方法是立即允许该acquire(100)请求,就好像它是一个acquire(1)请求一样,并根据需要推迟后续请求;
//在这个版本中,我们允许立即启动任务,并将未来的请求推迟100秒;
//因此,我们允许工作在此期间完成,而不是无所事事地等待;

//This has important consequences: 
//it means that the RateLimiter doesn't remember the time of the last request, 
//but it remembers the (expected) time of the next request. 
//This also enables us to tell immediately (see tryAcquire(timeout)) whether a particular timeout is enough to 
//get us to the point of the next scheduling time, since we always maintain that. 

//And what we mean by "an unused RateLimiter" is also defined by that notion: 
//when we observe that the "expected arrival time of the next request" is actually in the past, 
//then the difference (now - past) is the amount of time that the RateLimiter was formally unused, 
//and it is that amount of time which we translate to storedPermits. 
//(We increase storedPermits with the amount of permits that would have been produced in that idle time). 
//So, if rate == 1 permit per second, and arrivals come exactly one second after the previous, 
//then storedPermits is never increased -- we would only increase it for arrivals later than the expected one second.

//这有一个很重要的结论:
//这意味着RateLimiter不保存上一次请求的时间,但是它保存下一次请求期望到达的时间;
//这也使我们能够立即判断(参见tryAcquire(timeout))给定的超时时间是否足以等到下一次调度的时间点,因为我们始终维护着这个时间点;

//我们所说的"一个未使用的RateLimiter"也是由这个概念定义的:
//当我们观察到"下一个请求的预期到达时间"实际上已经过去时,并假设下次请求期望到达的时间点是past, 现在的时间点是now,
//那么now - past这段时间表示着RateLimiter没有被使用,所以在这段空闲时间内我们会增加storedPermits的数量;
//(我们会把这段空闲时间内本应生成的令牌数量累加到storedPermits上)
//注意:假设速率rate = 1,即每秒一个permit,并且请求刚好在上一次请求之后的1s到达,
//那么storedPermits是不会增加的,只有超过1s才会增加;

假设RateLimiter对于过去的请求不怎么记忆,只记忆上一次的请求。那么当RateLimiter在很长一段时间内未被使用且一个请求到达时,该如何处理?

由于RateLimiter之前是处于一个长时间未被使用的状态("过去未充分利用"),所以可能是由如下两种情况导致的:利用不足或溢出。

如果是利用不足导致的,即资源过剩,那么RateLimiter应该马上批准新到来的请求。如果是溢出导致的,即服务器因缓存失效等原因花更多时间处理请求,那么RateLimiter应该阻塞新到来的请求。

所以,为了处理RateLimiter在过去未充分利用时面对的情况,我们增加一个storedPermits变量来表示"过去未充分利用"。当不存在未充分利用时,storedPermits变量为零;当未充分利用持续足够久时,它可以一直增长到maxStoredPermits。因此acquire(permits)得到的令牌由两部分组成:存储的令牌、新鲜的令牌。

在RateLimiter处于长时间未被使用的状态下:如果是利用不足导致的,那么应该让存储的令牌比新鲜的令牌发放得更快。如果是溢出导致的,那么应该让存储的令牌比新鲜的令牌发放得更慢。因此,我们需要一个函数将storedPermits转换为限流时间。

这个函数是一个将storedPermits映射到1/rate的连续函数。其中storedPermits主要衡量RateLimiter未被使用的时间,我们会在这段未被使用的时间内存储令牌。而rate(速率)就是申请的令牌 / 时间,即rate = permits / time。因此1 / rate = time / permits,表示每个令牌需要的时间。也就是说,这个连续函数上的积分,对应于随后的请求之间的最小时间间隔。

接下来假设这个函数是一条水平线:如果其高度恰好为1/QPS,则函数的效果不存在,因为此时表示以与新鲜的令牌完全相同的成本提供存储的令牌。其中,1/QPS是每个新鲜令牌的成本。

如果该水平线的高度位于1/QPS下方,则意味着减少了函数的面积,从而减少了时间。因此RateLimiter经过一段时间的未充分利用后,会变得更快。

如果该水平线的高度位于1/QPS上方,则意味着增加了函数的面积,从而增加了时间。因此存储的令牌比新鲜的令牌更具成本,RateLimiter经过一段时间的未充分利用后,会变得更慢。
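
为了更直观地理解"积分对应等待时间",下面给出一个极简的数值示意(非Guava源码,类名和方法名均为说明用的假设命名)。对于水平线或单个线性段,梯形积分就是精确的积分值:

复制代码
import java.util.function.DoubleUnaryOperator;

public class StoredPermitsIntegralDemo {
    //对区间[storedPermits - permitsToTake, storedPermits]做梯形积分,
    //得到消耗permitsToTake个已存储令牌所对应的等待时间(微秒)
    static double waitMicros(DoubleUnaryOperator intervalOf, double storedPermits, double permitsToTake) {
        double left = storedPermits - permitsToTake;
        return 0.5 * (intervalOf.applyAsDouble(left) + intervalOf.applyAsDouble(storedPermits)) * permitsToTake;
    }

    public static void main(String[] args) {
        double stableIntervalMicros = 200_000;//QPS=5,每个新鲜令牌的成本是200ms
        //情况一:映射函数是高度恰好为1/QPS的水平线 => 存储的令牌与新鲜的令牌成本相同,3个令牌约600000微秒
        System.out.println(waitMicros(p -> stableIntervalMicros, 10.0, 3.0));
        //情况二:映射函数位于水平线下方(例如恒为0,对应SmoothBursty) => 消耗存储的令牌不需要等待
        System.out.println(waitMicros(p -> 0.0, 10.0, 3.0));
    }
}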

注意:RateLimiter不保存上一次请求的时间,但是它保存下一次请求期望到达的时间。如果下一个请求的预期到达时间实际上已经过去了,并且假设下次请求期望到达的时间点是past,现在的时间点是now。那么now - past的这段时间表示RateLimiter没有被使用,所以在这段空闲时间内我们会增加storedPermits的数量。

(4)SmoothWarmingUp的注释说明

复制代码
//This implements a "bursty" RateLimiter, where storedPermits are translated to zero throttling.
//The maximum number of permits that can be saved (when the RateLimiter is unused) is defined in terms of time, 
//in this sense: if a RateLimiter is 2qps, and this time is specified as 10 seconds, we can save up to 2 * 10 = 20 permits.
static final class SmoothBursty extends SmoothRateLimiter {
    ...
    ...
}

//This implements the following function where coldInterval = coldFactor * stableInterval.
//          ^ throttling
//          |
//    cold  +                  /
// interval |                 /.
//          |                / .
//          |               /  .   ← "warmup period" is the area of the trapezoid between thresholdPermits and maxPermits
//          |              /   .     "预热期"是指thresholdPermits和maxPermits之间的梯形区域
//          |             /    .
//          |            /     .
//          |           /      .
//   stable +----------/  WARM .
// interval |          .   UP  .
//          |          . PERIOD.
//          |          .       .
//        0 +----------+-------+--------------→ storedPermits
//          0 thresholdPermits maxPermits

//Before going into the details of this particular function, let's keep in mind the basics:
//The state of the RateLimiter (storedPermits) is a vertical line in this figure.
//When the RateLimiter is not used, this goes right (up to maxPermits).
//When the RateLimiter is used, this goes left (down to zero), since if we have storedPermits, we serve from those first.
//When unused, we go right at a constant rate! The rate at which we move to the right is chosen as maxPermits / warmupPeriod. 
//This ensures that the time it takes to go from 0 to maxPermits is equal to warmupPeriod.
//When used, the time it takes, as explained in the introductory class note, 
//is equal to the integral of our function, between X permits and X-K permits, 
//assuming we want to spend K saved permits.

//在深入了解这个特定函数的细节之前,让我们记住以下基本内容:
//RateLimiter的状态(即storedPermits的值)对应图中横坐标上的一条竖线;
//当RateLimiter没有被使用时,横坐标向右移动直到maxPermits;
//当RateLimiter被使用时,横坐标开始向左移动直到0,storedPermits有值,会优先使用它;
//当未被使用时,坐标以恒定的速率向右移动,这个速率被选为:maxPermits / warmupPeriod;
//这样可以保证横坐标从0到maxPermits花费的时间等于warmupPeriod;
//当被使用时,花费的时间就是花费K个permits宽度之间的积分;
//(纵坐标就是1 / rate = time / permits,每个令牌的时间)

//In summary, the time it takes to move to the left (spend K permits), 
//is equal to the area of the function of width == K.

//Assuming we have saturated demand, the time to go from maxPermits to thresholdPermits is equal to warmupPeriod. 
//And the time to go from thresholdPermits to 0 is warmupPeriod/2. 
//(The reason that this is warmupPeriod/2 is to maintain the behavior of the original implementation where coldFactor was hard coded as 3.)

//假设请求是饱和的(持续不断地有请求消耗令牌),从maxPermits到thresholdPermits花费的时间等于warmupPeriod;
//从thresholdPermits到0花费的时间是warmupPeriod / 2;
//(之所以是warmupPeriod / 2,是为了保持原实现中coldFactor硬编码为3时的行为)

//It remains to calculate thresholdsPermits and maxPermits.
//The time to go from thresholdPermits to 0 is equal to the integral of the function between 0 and thresholdPermits. 
//This is thresholdPermits * stableIntervals. By (5) it is also equal to warmupPeriod/2. 
//Therefore thresholdPermits = 0.5 * warmupPeriod / stableInterval
//The time to go from maxPermits to thresholdPermits is equal to the integral of the function between thresholdPermits and maxPermits. 
//This is the area of the pictured trapezoid, and it is equal to 0.5 * (stableInterval + coldInterval) * (maxPermits - thresholdPermits). 
//It is also equal to warmupPeriod, so maxPermits = thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval)

//从thresholdPermits到0花费的时间,是从0到thresholdPermits之间的函数积分;
//等于thresholdPermits * stableIntervals,按照上面讲的它也等于warmupPeriod / 2;
//所以thresholdPermits = 0.5 * warmupPeriod / stableInterval;
//从maxPermits到thresholdPermits花费的时间,等于上图梯形的面积:
//0.5 * (stableInterval + coldInterval) * (maxPermits - thresholdPermits) = warmupPeriod
//所以maxPermits = thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval)

static final class SmoothWarmingUp extends SmoothRateLimiter {
    ...
    ...
}

说明一:RateLimiter是令牌桶算法的具体实现

所以其获取令牌和放入令牌的方法可配合令牌桶算法的流程去理解。

说明二:预热模式图中涉及的变量

其中横轴表示令牌桶中的令牌数量,纵轴表示生成令牌的时间间隔。令牌的消费是从右往左进行的。当限流器RateLimiter未被使用时,即空闲时,会生成令牌放入桶中。

SmoothBursty生成令牌的速率是固定的:

复制代码
stableInterval = 1 / permitsPerSecond;

SmoothWarmingUp生成令牌的速率则会随storedPermits变化而变化:

复制代码
当storedPermits < thresholdPermits时,生成令牌的间隔为stableInterval;
当storedPermits > thresholdPermits时,生成令牌的间隔沿斜率逐渐增大,最大增长到coldInterval;

变量一:stableIntervalMicros

表示系统预热完成后,生成令牌的时间间隔。若QPS限制为100,则说明每10ms生成一个令牌。

复制代码
stableIntervalMicros = 1 / permitsPerSecond

变量二:coldIntervalMicros

表示系统水位最低时,生成令牌的时间间隔,与coldFactor有关。

变量三:coldFactor

冷却因子,表示倍数,即coldInterval是stableInterval的多少倍。

变量四:thresholdPermits

表示进入预热阶段的临界值。当令牌桶中的令牌数量减少到临界值时,系统预热结束。当令牌桶中的令牌数量大于临界值时,系统进入冷启动模式。

变量五:maxPermits

表示令牌桶的容量。当令牌桶中的令牌数达到最大容量时,生成的令牌将被抛弃。

变量六:slope

表示斜率。用于计算当前令牌生成时的时间间隔,从而计算当前每秒能生成多少令牌。

变量七:warmupPeriod

表示系统预热时间,即梯形的面积。在预热模型图中,梯形的面积 = (coldFactor - 1) * 长方形面积。

根据梯形面积的计算公式:知道warmupPeriodMicros和permitsPerSecond,就可以计算maxPermits和thresholdPermits。也就是thresholdPermits = 0.5 * warmupPeriodMicros / stableInterval;当coldFactor固定为3时,代入可得maxPermits = warmupPeriodMicros / stableInterval。

复制代码
stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
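
按照上面的公式做一个简单的代入演算(假设permitsPerSecond = 10、warmupPeriod = 2秒、coldFactor = 3,数值仅为示例):

复制代码
stableIntervalMicros = 1000000 / 10 = 100000微秒(即100ms)
coldIntervalMicros   = 3 * 100000 = 300000微秒
thresholdPermits     = 0.5 * 2000000 / 100000 = 10
maxPermits           = 10 + 2.0 * 2000000 / (100000 + 300000) = 20
slope                = (300000 - 100000) / (20 - 10) = 20000微秒/令牌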

3.继承RateLimiter的SmoothBursty源码

(1)SmoothBursty的初始化流程

(2)SmoothBursty的初始化完成后的变量值

(3)SmoothBursty的acquire()和tryAcquire()

(4)SmoothBursty的令牌生成规则分析

(5)SmoothRateLimiter对预支令牌的处理分析

(6)SmoothBursty的案例场景分析

(1)SmoothBursty的初始化流程

令牌桶算法是可以应对突发流量的,Bursty则有突发的含义。SmoothBursty应对突发流量是有前提条件的,只有在令牌桶内有存储的令牌情况下,才会放行相应的突发流量,而令牌桶内的已存储令牌是低流量时省下来的。如果系统一直处于高流量,导致令牌桶内没有存储的令牌,那么当突发流量过来时,也只能按照固定速率放行。

所以在SmoothBursty类中,获取令牌桶中的存储令牌是无需额外代价的。当令牌桶能满足请求线程所需的令牌数量时,就不会阻塞线程,从而达到应对突发流量的能力。当然,令牌桶中的存储令牌是有上限的,该上限会通过构造方法进行设置。

首先,new SmoothBursty(stopwatch, 1.0)构造方法表示的是:通过硬编码指定了令牌桶中最多存储1秒的令牌数。如果传入的permitsPerSecond = 10,表示的是每秒生成10个令牌,那么意味着令牌桶中最多存储10个令牌。

然后,初始化SmoothBursty的重点是RateLimiter的setRate()方法。该方法会调用SmoothRateLimiter的doSetRate()方法,然后调用SmoothRateLimiter的resync()方法,最后调用SmoothBursty的doSetRate()设定maxPermits和storedPermits。

复制代码
@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //Creates a RateLimiter with the specified stable throughput, 
    //given as "permits per second" (commonly referred to as QPS, queries per second).
    //The returned RateLimiter ensures that on average no more than permitsPerSecond are issued during any given second, 
    //with sustained requests being smoothly spread over each second.
    //When the incoming request rate exceeds permitsPerSecond the rate limiter will release one permit every (1.0 / permitsPerSecond) seconds. 
    //When the rate limiter is unused, bursts of up to permitsPerSecond permits will be allowed, 
    //with subsequent requests being smoothly limited at the stable rate of permitsPerSecond.
    //创建一个具有指定稳定吞吐量的RateLimiter,传入的"permits per second"通常称为QPS、每秒查询量;
    //返回的RateLimiter确保在任何给定的秒期间平均不超过permitsPerSecond的令牌被发出,持续的请求将在每一秒内被平稳地通过;
    //当传入请求的速率超过permitsPerSecond时,速率限制器将每隔(1.0/permitsPerSecond)秒释放一个令牌;
    //当速率限制器未被使用时,将允许突发式的高达permitsPerSecond的令牌,而随后的请求将以permitsPerSecond的稳定速率被平滑地限制;
    
    //对外暴露的创建方法
    //@param permitsPerSecond the rate of the returned RateLimiter, measured in how many permits become available per second.
    public static RateLimiter create(double permitsPerSecond) {
        //The default RateLimiter configuration can save the unused permits of up to one second. 
        //This is to avoid unnecessary stalls in situations like this: 
        //A RateLimiter of 1qps, and 4 threads, all calling acquire() at these moments:
        //T0 at 0 seconds、T1 at 1.05 seconds、T2 at 2 seconds、T3 at 3 seconds
        //Due to the slight delay of T1, T2 would have to sleep till 2.05 seconds, and T3 would also have to sleep till 3.05 seconds.
        //默认的RateLimiter配置可以保存长达一秒钟的未被使用的令牌;
        //这是为了避免在这种情况下出现不必要的停顿:
        //一个由1QPS和4个线程组成的RateLimiter,所有线程都在如下这些时刻调用acquire():
        //Thread0在0秒、Thread1在1.05秒、Thread2在2秒、Thread3在3秒
        //由于Thread1的轻微延迟,Thread2必须睡眠到2.05秒,Thread3也必须睡眠到3.05秒
        
        //内部调用一个QPS设定 + 起始时间StopWatch的构建函数.
        //这里传入的SleepingStopwatch是一个以系统启动时间的一个相对时间的计量.
        //后面的读时间偏移是以这个开始的时间偏移为起始的.
        return create(permitsPerSecond, SleepingStopwatch.createFromSystemTimer());
    }
    
    @VisibleForTesting
    static RateLimiter create(double permitsPerSecond, SleepingStopwatch stopwatch) {
        //指定了令牌桶中最多存储1秒的令牌数
        RateLimiter rateLimiter = new SmoothBursty(stopwatch, 1.0 /* maxBurstSeconds */);
        //调用RateLimiter的setRate()方法
        rateLimiter.setRate(permitsPerSecond);
        return rateLimiter;
    }
    
    //Updates the stable rate of this RateLimiter, 
    //that is, the permitsPerSecond argument provided in the factory method that constructed the RateLimiter. 
    //Currently throttled threads will not be awakened as a result of this invocation, 
    //thus they do not observe the new rate; only subsequent requests will.
    //Note though that, since each request repays (by waiting, if necessary) the cost of the previous request, 
    //this means that the very next request after an invocation to setRate() will not be affected by the new rate; 
    //it will pay the cost of the previous request, which is in terms of the previous rate.
    //The behavior of the RateLimiter is not modified in any other way, 
    //e.g. if the RateLimiter was configured with a warmup period of 20 seconds, 
    //it still has a warmup period of 20 seconds after this method invocation.
    //更新该RateLimiter的稳定速率,即在构造RateLimiter的工厂方法中提供permitsPerSecond参数;
    //当前被限流的线程将不会由于这个调用而被唤醒,因此它们没有观察到新的速率;只有随后的请求才会;
    //但是要注意的是,由于每个请求(如果需要,通过等待)会偿还先前请求的成本,
    //这意味着调用setRate()方法后的下一个请求将不会受到新速率的影响,
    //它将按照先前的速率处理先前请求的成本;
    //RateLimiter的行为不会以任何其他方式修改,
    //例如:如果RateLimiter被配置为具有20秒的预热周期,在该方法调用之后,它仍然有20秒的预热期;

    //@param permitsPerSecond the new stable rate of this {@code RateLimiter}
    public final void setRate(double permitsPerSecond) {
        checkArgument(permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
        //在同步代码块中设定速率
        synchronized (mutex()) {
            //调用SmoothRateLimiter.doSetRate()方法
            doSetRate(permitsPerSecond, stopwatch.readMicros());
        }
    }
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits.
    //令牌桶中当前缓存的未消耗的令牌数
    double storedPermits;
    //The maximum number of stored permits.
    //令牌桶中允许存放的最大令牌数
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    //按照我们稳定的速率,两个单位请求之间的时间间隔;例如,每秒5个令牌的稳定速率具有200ms的稳定间隔
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    //下一个请求(无论大小)将被批准的时间.
    //在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程。
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    //这是一个可以重复调用的函数.
    //第一次调用和非第一次调用的过程有些不一样,目的是设定最大令牌数maxPermits和已存储的令牌数storedPermits
    @Override
    final void doSetRate(double permitsPerSecond, long nowMicros) {
        //调用SmoothRateLimiter.resync()方法,重新计算和同步已存储的令牌.
        resync(nowMicros);
        //计算稳定的发放令牌的时间间隔. 单位us, 比如QPS为5, 则为200ms的间隔进行令牌发放. 
        double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
        this.stableIntervalMicros = stableIntervalMicros;
        //调用SmoothBursty.doSetRate()设定最大令牌数maxPermits和已存储的令牌数storedPermits
        doSetRate(permitsPerSecond, stableIntervalMicros);
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //根据当前时间,更新storedPermits和nextFreeTicketMicros变量
    //注意: 在初始化SmoothBursty时会第一次调用resync()方法,此时各值的情况如下:
    //coolDownIntervalMicros = 0、nextFreeTicketMicros = 0、newPermits = 无穷大.
    //maxPermits = 0(初始值,还没有重新计算)、最后得到的: storedPermits = 0;
    //同时,nextFreeTicketMicros = "起始时间"
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            storedPermits = min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    abstract void doSetRate(double permitsPerSecond, double stableIntervalMicros);
    ...
    
    //This implements a "bursty" RateLimiter, where storedPermits are translated to zero throttling.
    //The maximum number of permits that can be saved (when the RateLimiter is unused) is defined in terms of time, 
    //in this sense: if a RateLimiter is 2qps, and this time is specified as 10 seconds, we can save up to 2 * 10 = 20 permits.
    //SmoothBursty实现了一个"突发式"的速率限制器RateLimiter,其中的storedPermits会被转换为0;
    //它可以保存的最大令牌数量(当RateLimiter未使用时)是根据时间定义的,
    //从这个意义上说:如果RateLimiter是2QPS,并且这个时间被指定为10秒,那么最多可以保存2 * 10 = 20个令牌;
    static final class SmoothBursty extends SmoothRateLimiter {
        //The work (permits) of how many seconds can be saved up if this RateLimiter is unused?
        //如果这个速率限制器RateLimiter没有被使用,那么可以节省多少秒的工作(令牌)?
        final double maxBurstSeconds;
        SmoothBursty(SleepingStopwatch stopwatch, double maxBurstSeconds) {
            super(stopwatch);
            this.maxBurstSeconds = maxBurstSeconds;
        }
      
        @Override
        void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
            //初次设定的时候,oldMaxPermits  = 0.0
            double oldMaxPermits = this.maxPermits;
            //新的(当前的)maxPermits为burst的时间周期(1秒) * 每周期的令牌数.
            maxPermits = maxBurstSeconds * permitsPerSecond;
            if (oldMaxPermits == Double.POSITIVE_INFINITY) {
                //if we don't special-case this, we would get storedPermits == NaN, below
                storedPermits = maxPermits;
            } else {
                //初始化SmoothBursty,执行到此处时,storedPermits为0
                storedPermits = (oldMaxPermits == 0.0) ? 0.0 : storedPermits * maxPermits / oldMaxPermits;
            }
        }
        
        @Override
        long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
            return 0L;
        }
        
        @Override
        double coolDownIntervalMicros() {
            return stableIntervalMicros;
        }
    }
    ...
}

(2)SmoothBursty的初始化完成后的变量值

在构建完SmoothBursty这个RateLimiter后,其初始状态说明如下:

说明一:maxBurstSeconds为1秒。默认情况下,传入的突发周期参数为1秒。

说明二:storedPermits为0。没有预分配的令牌,因为此时还处于初始的状态。

说明三:stableIntervalMicros表示的是每个令牌发放时的时间间隔,会根据给定的QPS换算出来。

说明四:maxPermits表示的是最大允许存储的令牌个数(= 突发周期 * 每周期允许数),这里突发周期限定为1秒,也就是可以预存储一个周期的令牌。

说明五:nextFreeTicketMicros表示的是下一次可以发放令牌的起始时间,会被初始化为"开始时间"。
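
下面是一个简单的行为验证示例(基于前文引入的Guava依赖,类名SmoothBurstyInitDemo为说明用的假设命名),可以对照上面的初始状态来理解:刚创建完的SmoothBursty中storedPermits为0,并没有可供突发消费的存量令牌:

复制代码
import com.google.common.util.concurrent.RateLimiter;

public class SmoothBurstyInitDemo {
    public static void main(String[] args) {
        //QPS=5 => stableInterval=200ms、maxPermits=5,刚创建时storedPermits=0
        RateLimiter limiter = RateLimiter.create(5);
        //第一次获取会立即放行(其代价被转嫁给下一次请求,后文会详细分析)
        System.out.println(limiter.tryAcquire());//true
        //紧接着的第二次获取:nextFreeTicketMicros已被推到约200ms之后,超时为0的尝试会失败
        System.out.println(limiter.tryAcquire());//false
    }
}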

(3)SmoothBursty的acquire()和tryAcquire()

一.RateLimiter实现限流的过程

二.SmoothBursty的acquire()方法分析

三.SmoothBursty的tryAcquire()方法分析

一.RateLimiter的限流过程

RateLimiter的限流过程可以分为如下四个步骤:

步骤一:生产令牌

步骤二:获取令牌

步骤三:计算阻塞时间

步骤四:阻塞线程

既然RateLimiter做了抽象,那么说明它提取了限流过程中的共性,而RateLimiter里的共性就是阻塞线程的逻辑。即RateLimiter的acquire()方法将阻塞线程这个共性提取了出来,将生产令牌、获取令牌、计算阻塞时间的具体细节由子类去实现。RateLimiter的子类SmoothRateLimiter的几个重要属性如下:

属性一:nextFreeTicketMicros

表示的是下一次请求被允许的时间。当令牌数不足时,会由处理当前请求的线程延迟计算令牌生成数及耗时。即使需要等待,当前线程也不会去阻塞等待,而是提前预支令牌。而这个预支的代价会转嫁给下一个请求,这样做的目的是为了减少线程阻塞。

属性二:stableIntervalMicros

表示的是每产生一个令牌需要消耗的微秒,这个值是根据构造器传入的permitsPerSecond换算成微秒数得来的。

属性三:maxPermits

表示的是令牌桶中允许存放的最大令牌数。

属性四:storedPermits

表示的是令牌桶中当前缓存的未消耗的令牌数。当令牌消耗速度小于令牌产生速度时,令牌桶内就会开始堆积令牌,但是storedPermits不会大于maxPermits。

二.SmoothBursty的acquire()方法分析

执行SmoothBursty的acquire()方法时,会对互斥对象mutex加synchronized锁。通过加synchronized锁让并发的请求进行互斥,才能实现限流效果。其中SmoothRateLimiter的reserveEarliestAvailable()方法的细节说明如下:

说明一:该方法主要用来实现生产令牌、获取令牌、计算阻塞时间

计算阻塞时间时,会将总的阻塞时间拆分成两部分。第一部分是从桶中获取storedPermitsToSpend个现有令牌的代价,第二部分是等待生成freshPermits个新鲜令牌的代价。

对于子类,生成新鲜令牌的代价是相同的,只有获取现有令牌代价才会不同。所以从桶中获取令牌需要等待的时间的抽象方法storedPermitsToWaitTime()会由SmoothRateLimiter子类实现。其中的一个子类SmoothBursty的storedPermitsToWaitTime()方法返回0,表示不需要等待。

说明二:获取令牌的阻塞代价会转移给下一个请求

如果处理当前请求时发现需要阻塞等待,那么等待时间由下个请求承受。这样做的目的是为了减少线程的阻塞。因为下一个请求的请求时间是不确定的,可能很久后才到来下一个请求。而这段时间内生成的新鲜令牌已经可以满足下一个请求了,从而不用阻塞。

复制代码
@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //无限等待的获取
    //Acquires the given number of permits from this RateLimiter, 
    //blocking until the request can be granted. 
    //Tells the amount of time slept, if any.
    //@param permits the number of permits to acquire,获取的令牌数量
    //@return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
    @CanIgnoreReturnValue
    public double acquire(int permits) {
        //调用RateLimiter.reserve()方法
        //预支令牌并获取需要阻塞的时间:即预定数量为permits的令牌数,并返回需要等待的时间
        long microsToWait = reserve(permits);
        //将需要等待的时间补齐, 从而满足限流的需求,即根据microsToWait来让线程sleep(共性)
        stopwatch.sleepMicrosUninterruptibly(microsToWait);
        //返回这次调用使用了多少时间给调用者
        return 1.0 * microsToWait / SECONDS.toMicros(1L);
    }
        
    //Reserves the given number of permits from this RateLimiter for future use, 
    //returning the number of microseconds until the reservation can be consumed.
    //从这个RateLimiter限流器中预留给定数量的令牌以备将来使用,返回距离这些预留令牌可以被使用还需等待的微秒数
    //@return time in microseconds to wait until the resource can be acquired, never negative
    final long reserve(int permits) {
        checkPermits(permits);
        //由于涉及并发操作,所以必须使用synchronized进行互斥处理
        synchronized (mutex()) {
            //调用RateLimiter.reserveAndGetWaitLength()方法
            return reserveAndGetWaitLength(permits, stopwatch.readMicros());
        }
    }
    
    //Reserves next ticket and returns the wait time that the caller must wait for.
    //预定下一个ticket,并且返回需要等待的时间
    final long reserveAndGetWaitLength(int permits, long nowMicros) {
        //调用SmoothRateLimiter.reserveEarliestAvailable()方法
        long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
        return max(momentAvailable - nowMicros, 0);
    }
    
    //Reserves the requested number of permits and returns the time that those permits can be used (with one caveat).
    //保留请求数量的令牌,并返回可以使用这些令牌的时间(有一个警告)
    //生产令牌、获取令牌、计算阻塞时间的具体细节由子类来实现
    //@return the time that the permits may be used, or, if the permits may be used immediately, an arbitrary past or present time
    abstract long reserveEarliestAvailable(int permits, long nowMicros);
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits. 
    //令牌桶中当前缓存的未消耗的令牌数
    double storedPermits;
    //The maximum number of stored permits.
    //令牌桶中允许存放的最大令牌数
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    //按照我们稳定的速率,两个单位请求之间的时间间隔;例如,每秒5个令牌的稳定速率具有200ms的稳定间隔
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    //下一个请求(无论大小)将被批准的时间. 在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    @Override
    final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
        //1.根据nextFreeTicketMicros计算新产生的令牌数,更新当前未使用的令牌数storedPermits
        //获取令牌时调用SmoothRateLimiter.resync()方法与初始化时的调用不一样.
        //此时会把"还没有使用"的令牌存储起来.
        //但是如果计数时间nextFreeTicketMicros是在未来. 那就不做任何处理.
        resync(nowMicros);
        //下一个请求(无论大小)将被批准的时间,这个值将被作为方法结果返回
        long returnValue = nextFreeTicketMicros;
        
        //2.计算需要阻塞等待的时间
        //2.1.先从桶中取未消耗的令牌,如果桶中令牌数不足,看最多能取多少个
        //存储的令牌可供消费的数量
        double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
        //2.2.计算是否需要等待新鲜的令牌(当桶中现有的令牌数不足时就需要等待新鲜的令牌),如果需要,则计算需要等待的令牌数
        //需要等待的令牌:新鲜的令牌
        double freshPermits = requiredPermits - storedPermitsToSpend;
        //计算需要等待的时间
        //分两部分计算:waitMicros = 从桶中获取storedPermitsToSpend个现有令牌的代价 + 等待生成freshPermits个新鲜令牌的代价
        //从桶中取storedPermitsToSpend个现有令牌也是有代价的,storedPermitsToWaitTime()方法是个抽象方法,会由SmoothBursty和SmoothWarmingUp实现
        //对于SmoothBursty来说,storedPermitsToWaitTime()会返回0,表示已经存储的令牌不需要等待.
        //而生成新鲜令牌需要等待的代价是:新鲜令牌的个数freshPermits * 每个令牌的耗时stableIntervalMicros
        long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros);
        
        //3.更新nextFreeTicketMicros
        //由于新鲜的令牌可能已被预消费,所以nextFreeTicketMicros就得往后移,以表示这段时间被预消费了
        this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
        
        //4.扣减令牌数,更新桶内剩余令牌
        //最后把上面计算的可扣减的令牌数量从存储的令牌里减掉
        this.storedPermits -= storedPermitsToSpend;
        //返回请求需要等待的时间
        //需要注意returnValue被赋值的是上次的nextFreeTicketMicros,说明当前这次请求获取令牌的代价由下一个请求去支付
        return returnValue;
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //根据当前时间,更新storedPermits和nextFreeTicketMicros变量
    //计算nextFreeTicketMicros到当前时间内新产生的令牌数,这个就是延迟计算
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        //一般当前的时间是大于下个请求被批准的时间
        //此时:会把过去的时间换成令牌数存储起来,注意存储的令牌数不能大于最大的令牌数
        //当RateLimiter初始化好后,可能刚开始没有流量,或者是一段时间没有流量后突然来了流量
        //此时可以往"后"预存储一秒时间的令牌数. 也就是这里所说的burst能力
        
        //如果nextFreeTicketMicros在未来的一个时间点,那这个if判断便不满足
        //此时,不需要进行更新storedPermits和nextFreeTicketMicros变量
        //此种情况发生在:"预借"了令牌的时候
        if (nowMicros > nextFreeTicketMicros) {
            //时间差除以生成一个新鲜令牌的耗时,coolDownIntervalMicros()是抽象方法,由子类实现
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            //更新令牌桶内已存储的令牌个数,注意不超过最大限制
            storedPermits = min(maxPermits, storedPermits + newPermits);
            //更新nextFreeTicketMicros为当前时间
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    //Translates a specified portion of our currently stored permits which we want to spend/acquire, into a throttling time.
    //Conceptually, this evaluates the integral of the underlying function we use, for the range of [(storedPermits - permitsToTake), storedPermits].
    //This always holds: 0 <= permitsToTake <= storedPermits
    //从桶中取出已存储的令牌的代价,由子类实现
    //这是一个抽象函数,SmoothBursty中的实现会直接返回0,可以认为已经存储的令牌在获取时不需要等待时间
    abstract long storedPermitsToWaitTime(double storedPermits, double permitsToTake);
    
    //Returns the number of microseconds during cool down that we have to wait to get a new permit.
    //每生成一个新鲜令牌的耗时,由子类实现
    abstract double coolDownIntervalMicros();
    ...
    
    static final class SmoothBursty extends SmoothRateLimiter {
        ...
        @Override
        long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
            return 0L;
        }
        
        @Override
        double coolDownIntervalMicros() {
            return stableIntervalMicros;
        }
    }
    ...
}

三.SmoothBursty的tryAcquire()方法分析

其实就是在acquire()方法的基础上,增加了如下判断:如果当前时间 + 超时时间 >= nextFreeTicketMicros,那么就可以继续尝试,否则直接返回false。

复制代码
@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...    
    //有超时时间的获取
    //@param permits the number of permits to acquire,获取的令牌数量
    //@param timeout the maximum time to wait for the permits. Negative values are treated as zero.
    //@param unit the time unit of the timeout argument
    //@return true if the permits were acquired, false otherwise
    public boolean tryAcquire(int permits, long timeout, TimeUnit unit) {
        long timeoutMicros = max(unit.toMicros(timeout), 0);
        checkPermits(permits);
        long microsToWait;
        synchronized (mutex()) {
            long nowMicros = stopwatch.readMicros();
            //调用RateLimiter.canAcquire()方法看是否超时
            if (!canAcquire(nowMicros, timeoutMicros)) {
                return false;
            } else {
                microsToWait = reserveAndGetWaitLength(permits, nowMicros);
            }
        }
        stopwatch.sleepMicrosUninterruptibly(microsToWait);
        return true;
    }
    
    private boolean canAcquire(long nowMicros, long timeoutMicros) {
        //SmoothRateLimiter.queryEarliestAvailable()方法会返回nextFreeTicketMicros
        //如果当前时间nowMicros + 超时时间timeoutMicros >= nextFreeTicketMicros,那么就可以继续等待尝试获取
        return queryEarliestAvailable(nowMicros) - timeoutMicros <= nowMicros;
    }
    
    //Returns the earliest time that permits are available (with one caveat).
    //@return the time that permits are available, or, if permits are available immediately, an arbitrary past or present time
    abstract long queryEarliestAvailable(long nowMicros);
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    //下一个请求(无论大小)将被批准的时间. 在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    @Override
    final long queryEarliestAvailable(long nowMicros) {
        return nextFreeTicketMicros;
    }
    ...
}

(4)SmoothBursty的令牌生成规则分析

SmoothBursty的消费原理示意图如下:

青色代表已经存储的令牌(permits),紫色代表新同步时间产生的令牌(permits)。

一.分析时间t1

在时刻t1请求的令牌数量是permits_01,此时青色中已经存储了部分令牌,但是不够。同步时间轴后,产生了部分紫色的新令牌。重新计算可用令牌,青色与紫色加一起。如果两者之和超过了一个Burst周期,则取周期的最大值maxPermits。此时消费后,还剩余部分的令牌,在图中的t2时刻表示为青色部分,而且在t2时刻时nextFreeTicketMicros已经被标记为t1。

二.分析时间t2

在时刻t2请求的令牌数量是permits_02,此时青色部分代表的已存储令牌数不够。同步时间轴后,会产生紫色部分的令牌。但是此时,已经产生的令牌数量还是不够消费。因此需要睡眠一个由freshPermits转换的时间间隔,然后nextFreeTicketMicros被更新到t2时刻的未来时间,即产生了预支。如果在nextFreeTicketMicros这个时间点到来之前,直接调用tryAcquire(),并且给定的超时时间太短,那么tryAcquire()就会返回失败。

(5)SmoothRateLimiter对预支令牌的处理分析

Smooth的含义是平稳的、平滑的,Bursty的含义是突发式的、突发性的。

SmoothBursty类型的RateLimiter,在没有被消费的前提下,默认最多可以预存1秒的令牌来应对突发式流量。

关于nextFreeTicketMicros所表示的"下一个请求被批准的时间"补充说明:在一般场景下,它是一个过去的值,此时可以预存令牌。在初始化场景下,它会初始化为nowMicros,此时无法应对突发式流量。

对于任意时刻,SmoothRateLimiter都可以向以后的时间预支令牌来消费。但所预支令牌的时间成本,当前消费者不承担,由下一个消费者承担。这可参考SmoothRateLimiter.reserveEarliestAvailable()方法中的处理,也就是当前请求不进行等待,而是由下一次请求进行等待。

每次执行SmoothRateLimiter的reserveEarliestAvailable()方法时,都会先调用SmoothRateLimiter的resync()方法更新nextFreeTicketMicros。但如果存在预支令牌的情况,该方法是不会更新nextFreeTicketMicros的。因为预支令牌时,nextFreeTicketMicros是未来的时间,大于nowMicros。然后SmoothRateLimiter的reserveEarliestAvailable()方法最后会返回执行SmoothRateLimiter的resync()方法后的nextFreeTicketMicros,让处理当前请求的线程睡眠阻塞到nextFreeTicketMicros这个时间点。

在SmoothRateLimiter.reserveEarliestAvailable()方法中,会先执行SmoothRateLimiter的resync()方法更新nextFreeTicketMicros。所以只要处理当前请求时,上一个请求没有出现预支令牌的情况(也就是nextFreeTicketMicros比nowMicros小),那么即使桶内存储的令牌不够当前请求申请的数量,当前请求也不需要进行等待,只需要向后面的时间预支足够的令牌即可立刻返回。

(6)SmoothBursty的案例场景分析

当程序已经完成了初始化,但是没有任何流量,持续了很长的时间。此时来了一个acquire(200)的请求:不管已经存储的令牌有多少,处理该请求时都可以先消费这些已存储的令牌;如果不够,还可以预支后面时间产生的令牌,并且本次请求不需要等待。

但下一个请求acquire(250)到来时可能就没有这么幸运了,处理这个请求可能需要等待上一个请求所预支令牌的生成时间。比如上一个请求是借了后面时间产生的50个令牌,那么处理当前请求时就需要先等待生成这50个令牌的时间。如果等待完成之后,当前请求还需要额外的100个令牌,那么当前请求还需借后面时间产生的100个令牌。再下一个请求的处理,以此类推。
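
下面用一个简单的演示来观察这种"预支令牌、由下一个请求买单"的效果(基于前文引入的Guava依赖,类名SmoothBurstyDemo为说明用的假设命名,注释中的等待时间为大致估计):

复制代码
import com.google.common.util.concurrent.RateLimiter;

public class SmoothBurstyDemo {
    public static void main(String[] args) throws InterruptedException {
        //QPS=100,默认maxBurstSeconds=1,即桶内最多存储100个令牌
        RateLimiter limiter = RateLimiter.create(100);
        //闲置2秒,storedPermits最多也只能涨到100
        Thread.sleep(2000);
        //消耗100个存量令牌 + 预支100个未来的令牌,本次几乎不等待,输出约0.0
        System.out.println(limiter.acquire(200));
        //先偿还上一次预支的100个令牌(约等待1秒),而本次250个令牌的代价又转嫁给下一个请求
        System.out.println(limiter.acquire(250));
    }
}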

4.继承RateLimiter的SmoothWarmingUp源码

(1)SmoothWarmingUp的介绍

(2)SmoothWarmingUp的初始化

(3)SmoothWarmingUp中maxPermits的计算

(4)SmoothWarmingUp获取令牌

(5)SmoothBursty和SmoothWarmingUp的对比

(1)SmoothWarmingUp的介绍

SmoothWarmingUp支持预热功能,预热是对于冷系统来说的。当系统流量低时,系统就会冷下来,具体表现在:线程池会释放多余线程、连接池会释放多余连接、缓存会过期失效等。这时如果还放行满负荷流量甚至突发流量进入冷系统,则系统压力会暴增。

系统压力暴增很容易会导致系统出现问题,这也是SmoothBursty的不足。因为在SmoothBursty的实现逻辑里,流量低时桶内存储的令牌会增多。此时如果有满负荷流量甚至突发流量进入系统,SmoothBursty会放行,从而对系统产生比较大的压力。所以不能简单根据桶内是否有存储的令牌来放行流量,要判断系统冷热程度。

简单来说就是:流量越低时,桶内堆积的令牌数就会越高(因为生成速度大于消耗速度),而系统就会越冷,这时令牌生成速率就应该要越低,从而达到预热的目的。

上图中的变量含义如下:

变量一:coldIntervalMicros

表示的是系统最冷时生成一个令牌的时间间隔,这时单位令牌的耗时最大。

变量二:stableIntervalMicros

表示的是稳定阶段每生成一个令牌需要消耗的微秒数,这个值是根据构造方法传入的permitsPerSecond换算成微秒数得来的。

变量三:maxPermits

表示的是令牌桶中允许存放的最大令牌数。

变量四:storedPermits

表示的是令牌桶中当前缓存的未消耗的令牌数。当令牌消耗速度小于令牌产生速度时,桶内就会开始堆积令牌。但是堆积的令牌数不会大于maxPermits,这个值越大,说明系统越冷。

变量五:thresholdPermits

表示的是进入预热阶段的临界令牌数。当桶内存储的令牌数storedPermits大于该值时,说明系统冷下来了。此时需要进入预热阶段,加大生成单个令牌的耗时。当桶内存储的令牌数storedPermits小于该值时,说明进入热系统阶段,此时可以按正常速率生成令牌了。thresholdPermits默认是预热时间warmupPeriod内按稳定间隔stableInterval可生成令牌数的一半,即0.5 * warmupPeriod / stableInterval。该值太小会过早进入预热阶段,影响性能。该值太大会对系统产生压力,没能达到预热效果。

上图中,横轴表示当前桶内库存令牌数,纵轴表示生成单个令牌的耗时。当桶内存储的令牌数storedPermits大于thresholdPermits这个临界值时,系统就会进入预热阶段,对应的纵轴的生成单个令牌的耗时就会增加。当桶内存储的令牌数达到上限maxPermits时,系统处于最冷阶段,此时生成单个令牌的耗时就是最长,从而达到预热的目的。
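
下面是一个简单的预热效果演示(基于前文引入的Guava依赖,类名SmoothWarmingUpDemo为说明用的假设命名,注释中的等待时间为大致估计):

复制代码
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class SmoothWarmingUpDemo {
    public static void main(String[] args) {
        //QPS=5、预热2秒:稳定间隔200ms,最冷时的间隔约为600ms(coldFactor固定为3)
        RateLimiter limiter = RateLimiter.create(5, 2, TimeUnit.SECONDS);
        for (int i = 0; i < 10; i++) {
            //刚创建时处于"冷"状态(storedPermits = maxPermits):
            //第一次acquire()几乎不等待,之后的等待时间从约0.5~0.6秒逐渐降到稳定的0.2秒
            System.out.println("acquire " + i + " waited " + limiter.acquire() + "s");
        }
    }
}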

(2)SmoothWarmingUp的初始化

在初始化SmoothWarmingUp的RateLimiter的create()方法中,会传入如下参数:

参数一:permitsPerSecond

表示的是稳定阶段的速率,也就是稳定阶段每秒生成的令牌数。

参数二:warmupPeriod

表示的是预热时间。

参数三:unit

表示的是预热时间warmupPeriod的单位。

参数四:coldFactor

表示的是冷却因子,这里固定是3.0,决定了coldIntervalMicros的值。

参数五:stopwatch

这可以理解成计时器,记录限流的计时信息,通过计时信息来计算令牌的产生和消耗等信息。

复制代码
@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //Creates a RateLimiter with the specified stable throughput, 
    //given as "permits per second" (commonly referred to as QPS, queries per second), 
    //and a warmup period, during which the RateLimiter smoothly ramps up its rate, 
    //until it reaches its maximum rate at the end of the period (as long as there are enough requests to saturate it). 
    //Similarly, if the RateLimiter is left unused for a duration of warmupPeriod, 
    //it will gradually return to its "cold" state, 
    //i.e. it will go through the same warming up process as when it was first created.
    
    //The returned RateLimiter is intended for cases where the resource that actually fulfills the requests (e.g., a remote server) needs "warmup" time, 
    //rather than being immediately accessed at the stable (maximum) rate.
    //The returned RateLimiter starts in a "cold" state (i.e. the warmup period will follow), 
    //and if it is left unused for long enough, it will return to that state.
    
    //创建一个具有指定稳定吞吐量的RateLimiter,
    //入参为:"每秒多少令牌"(通常称为QPS,每秒的查询量),以及平稳增加RateLimiter速率的预热期,
    //直到RateLimiter在该预热周期结束时达到最大速率(只要有足够的请求使其饱和);
    //类似地,如果RateLimiter在预热时段的持续时间内未被使用,它将逐渐返回到它的"冷"状态,
    //也就是说,它将经历与最初创建时相同的预热过程;
    
    //返回的RateLimiter适用于实际满足请求的资源(例如远程服务器)需要"预热"时间的情况,而不是以稳定(最大)速率立即访问;
    //返回的RateLimiter在"冷"状态下启动(也就是说,接下来将是预热期),如果它被闲置足够长的时间,它就会回到那个"冷"状态;

    //@param permitsPerSecond the rate of the returned RateLimiter, measured in how many permits become available per second
    //@param warmupPeriod the duration of the period where the RateLimiter ramps up its rate, before reaching its stable (maximum) rate
    //@param unit the time unit of the warmupPeriod argument
    public static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit) {
        checkArgument(warmupPeriod >= 0, "warmupPeriod must not be negative: %s", warmupPeriod);
        return create(permitsPerSecond, warmupPeriod, unit, 3.0, SleepingStopwatch.createFromSystemTimer());
    }
    
    @VisibleForTesting
    static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit, double coldFactor, SleepingStopwatch stopwatch) {
        RateLimiter rateLimiter = new SmoothWarmingUp(stopwatch, warmupPeriod, unit, coldFactor);
        //调用RateLimiter.setRate()方法
        rateLimiter.setRate(permitsPerSecond);
        return rateLimiter;
    }
    
    //Updates the stable rate of this RateLimiter, 
    //that is, the permitsPerSecond argument provided in the factory method that constructed the RateLimiter. 
    //Currently throttled threads will not be awakened as a result of this invocation, 
    //thus they do not observe the new rate; only subsequent requests will.
    //Note though that, since each request repays (by waiting, if necessary) the cost of the previous request, 
    //this means that the very next request after an invocation to setRate() will not be affected by the new rate; 
    //it will pay the cost of the previous request, which is in terms of the previous rate.
    //The behavior of the RateLimiter is not modified in any other way, 
    //e.g. if the RateLimiter was configured with a warmup period of 20 seconds, 
    //it still has a warmup period of 20 seconds after this method invocation.
    //更新该RateLimiter的稳定速率,即在构造RateLimiter的工厂方法中提供permitsPerSecond参数;
    //当前被限流的线程将不会由于这个调用而被唤醒,因此它们没有观察到新的速率;只有随后的请求才会;
    //但是要注意的是,由于每个请求(如果需要,通过等待)会偿还先前请求的成本,
    //这意味着调用setRate()方法后的下一个请求将不会受到新速率的影响,
    //它将按照先前的速率处理先前请求的成本;
    //RateLimiter的行为不会以任何其他方式修改,
    //例如:如果RateLimiter被配置为具有20秒的预热周期,在该方法调用之后,它仍然有20秒的预热期;

    //@param permitsPerSecond the new stable rate of this {@code RateLimiter}
    public final void setRate(double permitsPerSecond) {
        checkArgument(permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
        //在同步代码块中设定速率
        synchronized (mutex()) {
            //调用SmoothRateLimiter.doSetRate()方法
            doSetRate(permitsPerSecond, stopwatch.readMicros());
        }
    }
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits. 
    //令牌桶中当前缓存的未消耗的令牌数
    double storedPermits;
    //The maximum number of stored permits.
    //令牌桶中允许存放的最大令牌数
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    //按照我们稳定的速率,两个单位请求之间的时间间隔;例如,每秒5个令牌的稳定速率具有200ms的稳定间隔
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    //下一个请求(无论大小)将被批准的时间.
    //在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程。
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    
    //这是一个可以重复调用的函数.
    //第一次调用和非第一次调用的过程有些不一样,目的是设定一个新的速率Rate.
    @Override
    final void doSetRate(double permitsPerSecond, long nowMicros) {
        //调用SmoothRateLimiter.resync()方法,重新计算并同步已存储的预分配令牌.
        resync(nowMicros);
        //计算稳定的发放令牌的时间间隔. 单位us, 比如qps为5, 则为200ms即20万us的间隔进行令牌发放. 
        double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
        this.stableIntervalMicros = stableIntervalMicros;
        //调用SmoothWarmingUp.doSetRate()设定其内部的比率.
        doSetRate(permitsPerSecond, stableIntervalMicros);
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //根据当前时间,更新storedPermits和nextFreeTicketMicros变量
    //注意: 在初始化SmoothBursty时会第一次调用resync()方法,此时各值的情况如下:
    //coolDownIntervalMicros = 0、nextFreeTicketMicros = 0、newPermits = 无穷大.
    //maxPermits = 0(初始值,还没有重新计算)、最后得到的: storedPermits = 0;
    //同时,nextFreeTicketMicros = "起始时间"
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            storedPermits = min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    abstract void doSetRate(double permitsPerSecond, double stableIntervalMicros);
    ...
    
    static final class SmoothWarmingUp extends SmoothRateLimiter {
        private final long warmupPeriodMicros;
        //The slope of the line from the stable interval (when permits == 0), to the cold interval (when permits == maxPermits)
        private double slope;//斜率
        private double thresholdPermits;
        private double coldFactor;

        SmoothWarmingUp(SleepingStopwatch stopwatch, long warmupPeriod, TimeUnit timeUnit, double coldFactor) {
            super(stopwatch);
            //将warmupPeriod转换成微秒并赋值给warmupPeriodMicros
            this.warmupPeriodMicros = timeUnit.toMicros(warmupPeriod);
            this.coldFactor = coldFactor;
        }

        @Override
        void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
            double oldMaxPermits = maxPermits;
            //stableIntervalMicros此时已由前面的SmoothRateLimiter.doSetRate()方法设为:1/qps
            //coldFactor的值默认会初始化为3
            //因此系统最冷时的令牌生成间隔:coldIntervalMicros等于3倍的普通间隔stableIntervalMicros
            double coldIntervalMicros = stableIntervalMicros * coldFactor;
            //warmupPeriodMicros是用户传入的预热时间
            //stableIntervalMicros是稳定期间令牌发放的间隔
            //进入预热阶段的临界令牌数thresholdPermits,默认就是:预热时间内按稳定间隔所能生成令牌数的一半
            //该值太小会过早进入预热阶段,影响性能;该值太大会对系统产生压力,没达到预热效果
            thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
            //最大令牌数
            maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
            //斜率
            slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
            //设置当前桶内的存储令牌数
            //突发型的RateLimiter------SmoothBursty:
            //初始化时不会预生成令牌,因为storedPermits初始为0;
            //随着时间推移,则会产生新的令牌,这些令牌如果没有被消费,则会存储在storedPermits里;
            //预热型的RateLimiter------SmoothWarmingUp:
            //初始化时会预生成令牌,并且初始化时肯定是系统最冷的时候,所以桶内默认就是maxPermits
            if (oldMaxPermits == Double.POSITIVE_INFINITY) {
                //if we don't special-case this, we would get storedPermits == NaN, below
                storedPermits = 0.0;
            } else {
                //对于SmoothWarmingUp的RateLimiter来说,其初始存储值storedPermits是满的,也就是存储了最大限流的令牌数
                //而对于突发型的限流器SmoothBursty来说,其初始存储值storedPermits是0
                storedPermits = (oldMaxPermits == 0.0) ? maxPermits : storedPermits * maxPermits / oldMaxPermits;
            }
        }
        ...
    }
    ...
}
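
结合上面create()和doSetRate()的源码,下面给出一个简单的演示示例(示例代码,非Guava源码,仅为示意):创建一个QPS为5、预热期为2秒的预热型RateLimiter,连续调用acquire()并打印每次的等待时间。由于每次请求的代价是由下一次请求支付的,第一次调用几乎立即返回,之后的等待时间会从约0.5秒多逐次递减,预热结束后稳定在0.2秒左右。

复制代码
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class WarmingUpDemo {
    public static void main(String[] args) {
        //QPS为5、预热期2秒:稳定间隔200ms,系统最冷时单个令牌的耗时约为稳定间隔的3倍
        RateLimiter limiter = RateLimiter.create(5, 2, TimeUnit.SECONDS);
        for (int i = 1; i <= 15; i++) {
            //acquire()返回本次调用实际阻塞等待的秒数
            double waitSeconds = limiter.acquire();
            System.out.println("第" + i + "次获取令牌, 等待了" + waitSeconds + "秒");
        }
    }
}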

SmoothWarmingUp初始化时就是系统最冷的时候,此时令牌桶内的已存储令牌数等于maxPermits。SmoothWarmingUp的doSetRate()方法涉及的变量有:

变量一:stableIntervalMicros

表示的是稳定阶段生成一个令牌的时间间隔(微秒),也就是1 / QPS秒换算成的微秒数。

复制代码
stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond

变量二:warmupPeriodMicros

表示的是根据构造方法中传入的预热阶段总时间warmupPeriod换算成的微秒值,将时间单位控制在微秒会让耗时更精确。

变量三:coldIntervalMicros

表示的是系统最冷时生成一个令牌的时间间隔,默认为稳定间隔的coldFactor(即3)倍。

复制代码
coldIntervalMicros = stableIntervalMicros * coldFactor

变量四:thresholdPermits

表示的是进入预热阶段的临界令牌数。

复制代码
thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros

变量五:maxPermits

表示的是令牌桶内的最大令牌数。

复制代码
maxPermits = 稳定阶段生成的令牌数 + 预热阶段生成的令牌数

稳定阶段生成的令牌数是thresholdPermits,预热阶段的总时间是warmupPeriodMicros,所以预热阶段生成一个令牌的平均间隔是:

复制代码
(stableIntervalMicros + coldIntervalMicros) / 2

所以预热阶段生成的令牌数就是:

复制代码
2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros)

变量六:slope

表示的是斜率或者坡度,单位是微秒/令牌。预热阶段是以固定的幅度匀速提速的:桶内超过thresholdPermits的已存储令牌每多一个,取出该令牌的耗时就固定增加slope微秒;反过来看,预热过程中每消耗一个已存储令牌,取下一个令牌的耗时就固定减少slope微秒。已知系统最冷时(storedPermits = maxPermits)取出一个令牌的耗时为coldIntervalMicros微秒,预热结束时(storedPermits = thresholdPermits)取出一个令牌的耗时为stableIntervalMicros微秒,整个预热阶段对应的令牌数是maxPermits - thresholdPermits,所以可以得出这个斜率为:

复制代码
slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits)

变量七:storedPermits

默认SmoothWarmingUp初始化时就是系统最冷的时候,此时的storedPermits = maxPermits。

(3)SmoothWarmingUp中maxPermits的计算

一.计算公式分析

复制代码
stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);

进行变换:
maxPermits - thresholdPermits = 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros)

继续变换:
(stableIntervalMicros + coldIntervalMicros) * (maxPermits - thresholdPermits ) / 2 = warmupPeriodMicros

其中梯形的斜边对应的斜率是:
slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);

如果把"取出一个已存储令牌的耗时"看成关于已存储令牌数的函数,那么在[thresholdPermits, maxPermits]区间上它是一条从stableIntervalMicros升到coldIntervalMicros的斜线,与横轴围出一个梯形(令牌的发放时间间隔随着已存储的令牌数不同而不同)。其中maxPermits - thresholdPermits就是梯形的高,stableIntervalMicros + coldIntervalMicros就是梯形的两个底的和。所以给定梯形的面积即warmupPeriodMicros,就可以计算出maxPermits。也就是根据传入的预热时间 + 稳定时的令牌发放间隔 + 冷却因子,就可以计算出令牌桶能存储的最大令牌数。
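
顺着上面的变换,当coldFactor取默认值3(即coldIntervalMicros = 3 * stableIntervalMicros)时,还可以进一步代入化简,得到一个直观的结论:

复制代码
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + 3 * stableIntervalMicros)
           = 0.5 * warmupPeriodMicros / stableIntervalMicros + 0.5 * warmupPeriodMicros / stableIntervalMicros
           = warmupPeriodMicros / stableIntervalMicros

也就是说,在默认的coldFactor下,maxPermits正好等于按稳定间隔在一个预热周期内能生成的令牌数,这与下面举例中QPS为100、预热5秒时maxPermits为500的结果一致。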

二.举例说明

假如QPS限制为100,预热时间为5秒,那么:

复制代码
stableIntervalMicros = 1s / 100 = 10ms
coldIntervalMicros = 10ms * 3 = 30ms

也就是说在预热期间,最慢时要每30ms才发放一个令牌,预热周期是5000ms。由于梯形面积5000ms = (上底10ms + 下底30ms) * h / 2,那么h = 10000 / 40 = 250个令牌。

由如下的公式可得如下的结果:

复制代码
公式:thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros
结果:thresholdPermits = 0.5 * 预热周期5000ms / 稳定间隔10ms = 250

也就是说,在5s的预热周期内,按正常速率本来可以生成500个令牌。但SmoothWarmingUp只把其中的一半(250个)划入稳定阶段,即按正常速率(每10ms一个令牌)直接发放;剩下的一半(250个)划入预热阶段,发放它们要耗费整个5s的预热时间。

根据上面计算maxPermits的公式:

复制代码
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);

因为冷却间隔时间是稳定间隔时间的3倍,所以:

复制代码
stableIntervalMicros + coldIntervalMicros = 4 * stableIntervalMicros

因此该公式的后半部分也是:0.5 * 预热周期5000ms / 稳定间隔10ms = 250。
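
下面用一段独立的计算代码(示例代码,非Guava源码,只是按前面doSetRate()中的公式手工复算)验证这个例子里的各个值:

复制代码
import java.util.concurrent.TimeUnit;

public class WarmingUpCalc {
    public static void main(String[] args) {
        double permitsPerSecond = 100;//QPS为100
        double warmupPeriodMicros = TimeUnit.SECONDS.toMicros(5);//预热5秒
        double coldFactor = 3.0;//默认的冷却因子

        double stableIntervalMicros = TimeUnit.SECONDS.toMicros(1) / permitsPerSecond;//10000us即10ms
        double coldIntervalMicros = stableIntervalMicros * coldFactor;//30000us即30ms
        double thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
        double maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
        double slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);

        System.out.println("thresholdPermits = " + thresholdPermits);//250.0
        System.out.println("maxPermits = " + maxPermits);//500.0
        System.out.println("slope = " + slope);//80.0,即每多一个已存储令牌,耗时增加80us
    }
}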

**总结:**带有预热功能的限流器SmoothWarmingUp,会用一个预热周期的时间配合稳定间隔时间来确定最大可存储的令牌数。这些已存储令牌中的一半按照稳定的、正常速率发放;另外一半属于预热阶段,发放它们的平均速率只有正常速率的一半(coldFactor为默认值3时)。

(4)SmoothWarmingUp获取令牌

在调用SmoothWarmingUp的acquire()方法获取令牌时,最后会调用到SmoothRateLimiter的reserveEarliestAvailable()方法计算当前线程需要阻塞等待的时间,这个阻塞等待的时间由两部分组成。第一部分是从桶中获取storedPermitsToSpend个现有令牌的耗时,第二部分是等待生成freshPermits个新鲜令牌的耗时。

SmoothWarmingUp从桶中获取storedPermitsToSpend个现有令牌的耗时,会调用SmoothWarmingUp.storedPermitsToWaitTime()方法计算具体的耗时。该等待时间又会分为两部分进行计算:第一部分是获取预热阶段的令牌的耗时,第二部分是获取稳定阶段的令牌的耗时。并且只有当预热阶段的令牌获取完还不够时,才会去获取稳定阶段的令牌。

比如请求4个令牌,此时桶内令牌数是22,进入预热阶段的临界值是20。那么桶内稳定阶段生成的令牌就是20个,预热阶段生成的令牌就是2个。面对要获取4个令牌的请求,会先获取预热阶段的全部令牌也就是2个,然后再获取稳定阶段中的2个令牌。

复制代码
获取预热阶段的令牌的耗时 = (第一个令牌的耗时 + 最后一个令牌的耗时) * 令牌数 / 2
获取稳定阶段的令牌的耗时 = 固定间隔stableIntervalMicros * 令牌数
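
为了更直观,下面沿用一组假设的演示数值(稳定间隔stableIntervalMicros = 10000us、slope = 80us、临界令牌数thresholdPermits = 20,均为示意值,并非由某个具体QPS推导而来),按这个两段式公式把上面"桶内22个令牌、请求4个令牌"的例子手工算一遍,其中permitsToTime(permits) = stableIntervalMicros + permits * slope,对应下面源码中的同名方法:

复制代码
第一步:取预热阶段的令牌,可取个数 = min(22 - 20, 4) = 2
第一个令牌的耗时 = permitsToTime(2) = 10000 + 2 * 80 = 10160us
最后一个令牌的耗时 = permitsToTime(0) = 10000us
预热阶段的耗时 = (10160 + 10000) * 2 / 2 = 20160us

第二步:再取稳定阶段的令牌,个数 = 4 - 2 = 2
稳定阶段的耗时 = 10000 * 2 = 20000us

从桶中取出这4个已存储令牌的总耗时 = 20160 + 20000 = 40160us,约40ms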

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //无限等待的获取
    //Acquires the given number of permits from this RateLimiter, 
    //blocking until the request can be granted. 
    //Tells the amount of time slept, if any.
    //@param permits the number of permits to acquire,获取的令牌数量
    //@return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
    @CanIgnoreReturnValue
    public double acquire(int permits) {
        //调用RateLimiter.reserve()方法
        //预支令牌并获取需要阻塞的时间:即预定数量为permits的令牌数,并返回需要等待的时间
        long microsToWait = reserve(permits);
        //将需要等待的时间补齐, 从而满足限流的需求,即根据microsToWait来让线程sleep(共性)
        stopwatch.sleepMicrosUninterruptibly(microsToWait);
        //返回这次调用使用了多少时间给调用者
        return 1.0 * microsToWait / SECONDS.toMicros(1L);
    }
        
    //Reserves the given number of permits from this RateLimiter for future use, 
    //returning the number of microseconds until the reservation can be consumed.
    //从这个RateLimiter限速器中保留给定数量的令牌以备将来使用,并返回距离这些预定令牌可以被使用还需等待的微秒数
    //@return time in microseconds to wait until the resource can be acquired, never negative
    final long reserve(int permits) {
        checkPermits(permits);
        //由于涉及并发操作,所以必须使用synchronized进行互斥处理
        synchronized (mutex()) {
            //调用RateLimiter.reserveAndGetWaitLength()方法
            return reserveAndGetWaitLength(permits, stopwatch.readMicros());
        }
    }
    
    //Reserves next ticket and returns the wait time that the caller must wait for.
    //预定下一个ticket,并且返回需要等待的时间
    final long reserveAndGetWaitLength(int permits, long nowMicros) {
        //调用SmoothRateLimiter.reserveEarliestAvailable()方法
        long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
        return max(momentAvailable - nowMicros, 0);
    }
    
    //Reserves the requested number of permits and returns the time that those permits can be used (with one caveat).
    //保留请求数量的令牌,并返回可以使用这些令牌的时间(有一个警告)
    //生产令牌、获取令牌、计算阻塞时间的具体细节由子类来实现
    //@return the time that the permits may be used, or, if the permits may be used immediately, an arbitrary past or present time
    abstract long reserveEarliestAvailable(int permits, long nowMicros);
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits. 
    //令牌桶中当前缓存的未消耗的令牌数
    double storedPermits;
    //The maximum number of stored permits. 
    //令牌桶中允许存放的最大令牌数
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    //按照我们稳定的速率,两个单位请求之间的时间间隔;例如,每秒5个令牌的稳定速率具有200ms的稳定间隔
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    //下一个请求(无论大小)将被批准的时间. 在批准请求后,这将在未来进一步推进,大请求比小请求更能推动这一进程.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    
    @Override
    final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
        //1.根据nextFreeTicketMicros计算新产生的令牌数,更新当前未使用的令牌数storedPermits
        //获取令牌时调用SmoothRateLimiter.resync()方法与初始化时的调用不一样.
        //此时会把"没有过期"的令牌存储起来.
        //但是如果计数时间nextFreeTicketMicros是在未来. 那就不做任何处理.
        resync(nowMicros);
        //下一个请求(无论大小)将被批准的时间,这个值将被作为方法结果返回
        long returnValue = nextFreeTicketMicros;
        
        //2.计算需要阻塞等待的时间
        //2.1.先从桶中取未消耗的令牌,如果桶中令牌数不足,看最多能取多少个
        //存储的令牌可供消费的数量
        double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
        //2.2.计算是否需要等待新鲜的令牌(当桶中现有的令牌数不足时就需要等待新鲜的令牌),如果需要,则计算需要等待的令牌数
        //需要等待的令牌:新鲜的令牌
        double freshPermits = requiredPermits - storedPermitsToSpend;
        //计算需要等待的时间
        //分两部分计算:waitMicros = 从桶中获取storedPermitsToSpend个现有令牌的代价 + 等待生成freshPermits个新鲜令牌的代价
        //从桶中取storedPermitsToSpend个现有令牌也是有代价的,storedPermitsToWaitTime()方法是个抽象方法,会由SmoothBursty和SmoothWarmingUp实现
        //对于SmoothBursty来说,storedPermitsToWaitTime()会返回0,表示已经存储的令牌不需要等待.
        //而生成新鲜令牌需要等待的代价是:新鲜令牌的个数freshPermits * 每个令牌的耗时stableIntervalMicros
        long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros);
        
        //3.更新nextFreeTicketMicros
        //由于新鲜的令牌可能已被预消费,所以nextFreeTicketMicros就得往后移,以表示这段时间被预消费了
        this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
        
        //4.扣减令牌数,更新桶内剩余令牌
        //最后把上面计算的可扣减的令牌数量从存储的令牌里减掉
        this.storedPermits -= storedPermitsToSpend;
        //返回请求需要等待的时间
        //需要注意returnValue被赋值的是上次的nextFreeTicketMicros,说明当前这次请求获取令牌的代价由下一个请求去支付
        return returnValue;
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //根据当前时间,更新storedPermits和nextFreeTicketMicros变量
    //计算nextFreeTicketMicros到当前时间内新产生的令牌数,这个就是延迟计算
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        //一般当前的时间是大于下个请求被批准的时间
        //此时:会把过去的时间换成令牌数存储起来,注意存储的令牌数不能大于最大的令牌数
        //当RateLimiter初始化好后,可能刚开始没有流量,或者是一段时间没有流量后突然来了流量
        //此时可以往"后"预存储一秒时间的令牌数. 也就是这里所说的burst能力
        
        //如果nextFreeTicketMicros在未来的一个时间点,那这个if判断便不满足
        //此时,不需要进行更新storedPermits和nextFreeTicketMicros变量
        //此种情况发生在:"预借"了令牌的时候
        if (nowMicros > nextFreeTicketMicros) {
            //时间差除以生成一个新鲜令牌的耗时,coolDownIntervalMicros()是抽象方法,由子类实现
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            //更新令牌桶内已存储的令牌个数,注意不超过最大限制
            storedPermits = min(maxPermits, storedPermits + newPermits);
            //更新nextFreeTicketMicros为当前时间
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    //Translates a specified portion of our currently stored permits which we want to spend/acquire, into a throttling time.
    //Conceptually, this evaluates the integral of the underlying function we use, for the range of [(storedPermits - permitsToTake), storedPermits].
    //This always holds: 0 <= permitsToTake <= storedPermits
    //从桶中取出已存储的令牌的代价,由子类实现
    //这是一个抽象函数,SmoothBursty中的实现会直接返回0,可以认为已经预分配的令牌在获取时不需要等待时间
    abstract long storedPermitsToWaitTime(double storedPermits, double permitsToTake);
    
    //Returns the number of microseconds during cool down that we have to wait to get a new permit.
    //每生成一个新鲜令牌的耗时,由子类实现
    abstract double coolDownIntervalMicros();
    ...
    
    static final class SmoothWarmingUp extends SmoothRateLimiter {
        private final long warmupPeriodMicros;
        private double slope;//斜率
        private double thresholdPermits;
        private double coldFactor;
        ...
        @Override
        long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
            //检查当前桶内存储的令牌数是否大于进入预热阶段的临界令牌数thresholdPermits
            double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
            long micros = 0;
            //如果当前桶内存储的令牌数大于进入预热阶段的临界令牌数thresholdPermits
            //则说明系统当前已经冷下来了,需要进入预热期,于是需要计算在预热期生成令牌的耗时
            if (availablePermitsAboveThreshold > 0.0) {
                //计算在超出临界值的令牌中需要取出多少个令牌,并计算耗时
                double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
                //计算预热阶段的耗时,前一个permitsToTime()计算的是取出第一个令牌的耗时(最慢),后一个permitsToTime()计算的是取出最后一个令牌的耗时(最快)
                double length = permitsToTime(availablePermitsAboveThreshold) + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
                //总耗时 = (第一个令牌的耗时 + 最后一个令牌的耗时) * 令牌数 / 2,即梯形面积
                micros = (long) (permitsAboveThresholdToTake * length / 2.0);
                permitsToTake -= permitsAboveThresholdToTake;
            }
            //加上稳定阶段的令牌耗时就是总耗时
            micros += (long) (stableIntervalMicros * permitsToTake);
            return micros;
        }
       
        //已知桶内超过临界值的已存储令牌每多一个,取出该令牌的耗时就固定增加slope微秒
        //那么在已知临界值处耗时为stableIntervalMicros的情况下,就可以按如下公式求出超出临界值permits个令牌处,取出一个令牌的耗时
        private double permitsToTime(double permits) {
            return stableIntervalMicros + permits * slope;
        }
      
        @Override
        double coolDownIntervalMicros() {
            //预热时长 / 最大令牌数
            return warmupPeriodMicros / maxPermits;
        }
    }
    ...
}

(5)SmoothBursty和SmoothWarmingUp的对比

SmoothBursty和SmoothWarmingUp这两种限流器都使用了预支令牌的思路,就是当前线程获取令牌的代价(阻塞时间)需要由下一个线程来支付。这样可以减少当前线程阻塞的概率,因为下一个请求不确定什么时候才来。如果下一个请求很久才来,那么这段时间产生的新令牌已经满足下一个线程的需求,这样就不用阻塞了。
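
下面用一个简单的示例(示例代码,非Guava源码,仅为示意)演示这种"预支令牌、由下一个请求支付代价"的效果:用一个QPS为1的SmoothBursty限流器,第一次acquire(5)几乎立即返回,但紧随其后的acquire(1)大约要等待5秒,替上一次请求偿还代价。

复制代码
import com.google.common.util.concurrent.RateLimiter;

public class PrepayDemo {
    public static void main(String[] args) {
        //QPS为1的突发型限流器
        RateLimiter limiter = RateLimiter.create(1);
        //第一次一次性获取5个令牌:几乎立即返回,但nextFreeTicketMicros被往后推了约5秒
        System.out.println("acquire(5)等待了" + limiter.acquire(5) + "秒");//约0.0秒
        //第二次只获取1个令牌:却要替上一次请求"还债",等待约5秒
        System.out.println("acquire(1)等待了" + limiter.acquire(1) + "秒");//约5.0秒
    }
}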

一.在SmoothBursty中

桶内的已存储令牌是可以直接拿来用的,不需要额外的耗时,以此应对突发的流量,但这些已存储的令牌是之前低流量时积累下来的。

如果流量一直处于满负荷,没有结余的令牌,那么当突发流量到来时,仍然会被限流。

而且令牌桶内默认最大的令牌数就是1秒内产生的令牌。比如QPS设置为10的话,那么令牌桶内最多存储10个令牌。当QPS=20的流量到来时,也只够1秒钟的消耗,后面又会进入限流状态。

二.在SmoothWarmingUp中

桶内的已存储令牌是不可以直接拿来用的,需要额外的耗时。为了弥补SmoothBursty的不足,它将系统分为热系统和冷系统两个阶段。

满负荷流量或者突发流量对于热系统来说,可能危害不大。因为系统的线程池、缓存、连接池在热系统下都火力全开、抗压能力强。但对于冷系统,满负荷流量和突发流量会加大系统压力,导致各种问题。

所以一般会加入预热的思路来控制冷系统下的流量(即预热阶段等待时间会更长),而系统的冷热程度就是通过令牌桶内已存储的未消耗的令牌数来判断。因为当系统冷下来时,也就是系统流量小的时候,令牌消耗速度就会少,相应的令牌桶内已存储的令牌数就会多起来。

如果桶内的令牌数超过了进入预热阶段的临界令牌数thresholdPermits,那么就代表系统进入了预热阶段,在该阶段获取令牌的耗时将会增大,而且增大的速度是slope。
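
最后用一个简单的对比示例(示例代码,非Guava源码,仅为示意)体会两者在系统"冷下来"之后的差异:两个限流器的QPS都是5,闲置几秒后再连续获取令牌,SmoothBursty会把攒下的已存储令牌几乎零代价地立即发放,而SmoothWarmingUp取出已存储令牌需要额外耗时,等待时间随着预热推进逐渐缩短到稳定间隔。

复制代码
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

public class BurstyVsWarmingUpDemo {
    public static void main(String[] args) throws InterruptedException {
        RateLimiter bursty = RateLimiter.create(5);//突发型,QPS为5
        RateLimiter warmingUp = RateLimiter.create(5, 2, TimeUnit.SECONDS);//预热型,QPS为5,预热2秒

        //闲置3秒,让两个限流器都积累已存储令牌(预热型会退回"冷"状态)
        Thread.sleep(3000);

        for (int i = 1; i <= 8; i++) {
            //前几次几乎为0(直接消耗攒下的令牌),之后稳定在0.2秒左右
            System.out.println("bursty第" + i + "次等待" + bursty.acquire() + "秒");
        }
        for (int i = 1; i <= 8; i++) {
            //第1次几乎为0,之后的等待时间从0.5秒多逐渐降到0.2秒左右(取出已存储令牌需要额外的预热耗时)
            System.out.println("warmingUp第" + i + "次等待" + warmingUp.acquire() + "秒");
        }
    }
}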
