Sentinel Study Notes 7: Circuit Breaking and Degradation

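This post is part of my Sentinel study notes series. I came across Wu Jiuye's (吴就业) column online; it is well written and worth recommending. My notes are an abridged version, so I suggest reading the original.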

https://blog.csdn.net/baidu_28523317/category_10400605.html

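Rate limiting requires load testing against each hardware setup, and the right limit is hard to estimate accurately, so flow thresholds are usually configured slightly above the load-test result. Circuit breaking is then needed as a safety net. On top of flow rules, we can configure degrade (circuit-breaking) rules for the same resource. When an endpoint's QPS is still below the flow threshold but many requests are already timing out, the degrade rule's threshold may be reached and trip the breaker, which protects the service itself. Degradation is commonly used; rate limiting is used less, because from a business standpoint the preferred answer to more traffic is to add machines rather than simply reject requests. This post focuses on the com.alibaba.csp.sentinel.slots.block.degrade package.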

Let's start the analysis with the circuit breaker.

Circuit breaker

Sentinel implements circuit breaking and degradation with circuit breakers. First, the concept, as described in an encyclopedia entry:

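Circuit breakers come in a simple version and a more advanced one. A simple breaker only needs to know whether the service is available. The advanced version is more efficient and has at least three states: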

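  • Closed: The breaker is closed by default. It keeps a counter: each error increments the count, and once the count reaches a set number the breaker opens. At that point an internal timer starts, and when it expires the breaker switches to the half-open state.
  • Open: Every request is rejected immediately and an exception is thrown.
  • Half-Open: The breaker lets a limited number of requests through. If they all succeed, the fault is considered gone, so the breaker switches back to closed and resets the counter. If any of them fails, it returns to open and restarts the timer, giving the system time to recover.

Sentinel models this with the CircuitBreaker interface: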
public interface CircuitBreaker {

    /**
     * Get the associated circuit breaking rule.
     * @return associated circuit breaking rule
     */
    DegradeRule getRule();

    /**
     * Acquires permission of an invocation only if it is available at the time of invoking.
     * Decides from the invocation context whether the request may pass.
     * @param context context of current invocation
     * @return {@code true} if permission was acquired and {@code false} otherwise
     */
    boolean tryPass(Context context);

    /**
     * Get current state of the circuit breaker.
     * @return current state of the circuit breaker
     */
    State currentState();

    /**
     * <p>Record a completed request with the context and handle state transformation of the circuit breaker.</p>
     * <p>Called when a <strong>passed</strong> invocation finished.</p>
     * Handles completion of a request.
     * @param context context of current invocation
     */
    void onRequestComplete(Context context);

    /**
     * Circuit breaker state enum.
     */
    enum State {
        /**
         * In {@code OPEN} state, all requests will be rejected until the next recovery time point.
         */
        OPEN,
        /**
         * In {@code HALF_OPEN} state, the circuit breaker will allow a "probe" invocation.
         * If the invocation is abnormal according to the strategy (e.g. it's slow), the circuit breaker
         * will re-transform to the {@code OPEN} state and wait for the next recovery time point;
         * otherwise the resource will be regarded as "recovered" and the circuit breaker
         * will cease cutting off requests and transform to {@code CLOSED} state.
         */
        HALF_OPEN,
        /**
         * In {@code CLOSED} state, all requests are permitted. When current metric value exceeds the threshold,
         * the circuit breaker will transform to {@code OPEN} state.
         */
        CLOSED
    }
}

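The CircuitBreaker interface defines the contract. The abstract class AbstractCircuitBreaker implements getRule, tryPass, and currentState. Slow-call breaking uses ResponseTimeCircuitBreaker, while exception count and exception ratio use ExceptionCircuitBreaker; these two classes mainly implement onRequestComplete.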

AbstractCircuitBreaker

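Although breakers for different strategies differ in implementation, the differences lie only in how the threshold is checked and which metrics are collected. Whether a request is let through depends only on the breaker's current state. Sentinel therefore provides one shared abstract class for all strategies: AbstractCircuitBreaker.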

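Besides implementing the interface methods, the abstract breaker defines the state-transition methods. A breaker cannot go directly from open to closed, so there are four transition methods: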

  • fromCloseToOpen: closed to open
  • fromHalfOpenToOpen: half-open to open
  • fromHalfOpenToClose: half-open to closed
  • fromOpenToHalfOpen: open to half-open
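The four transitions above can be sketched with a single AtomicReference and compare-and-set, which is also how the tryPass and fromOpenToHalfOpen code later in this post guards the state. This is a minimal sketch, not Sentinel's class: ToyBreakerStates and its State enum are hypothetical names, and observer notification is left out.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical toy class for illustration; not Sentinel's AbstractCircuitBreaker.
class ToyBreakerStates {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final AtomicReference<State> state = new AtomicReference<>(State.CLOSED);

    State current() {
        return state.get();
    }

    // Each transition is one compare-and-set, so when several threads race
    // for the same transition exactly one of them gets 'true'.
    boolean fromCloseToOpen() {
        return state.compareAndSet(State.CLOSED, State.OPEN);
    }

    boolean fromOpenToHalfOpen() {
        return state.compareAndSet(State.OPEN, State.HALF_OPEN);
    }

    boolean fromHalfOpenToOpen() {
        return state.compareAndSet(State.HALF_OPEN, State.OPEN);
    }

    boolean fromHalfOpenToClose() {
        return state.compareAndSet(State.HALF_OPEN, State.CLOSED);
    }
    // There is deliberately no fromOpenToClose: OPEN must pass through HALF_OPEN.
}
```

Because a transition only succeeds from its expected source state, a second caller of fromOpenToHalfOpen gets false, which is what limits the half-open state to a single probe request.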

I'll skip the code for now and return to the main flow.

Circuit breaking and degradation

Circuit-breaking strategies

Sentinel provides the following circuit-breaking strategies:

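  • Slow request ratio (SLOW_REQUEST_RATIO): The threshold is a ratio of slow calls. You set the maximum allowed RT (response time); a request whose response time exceeds it counts as a slow call. When, within one statistics interval (statIntervalMs), the number of requests reaches the configured minimum and the slow-call ratio exceeds the threshold, requests are rejected for the following break duration. After that duration the breaker enters the probing state (HALF-OPEN): if the next request's response time is within the allowed RT the break ends, and if it exceeds the allowed RT the breaker opens again.
  • Error ratio (ERROR_RATIO): When, within one statistics interval (statIntervalMs), the number of requests reaches the configured minimum and the ratio of failed requests exceeds the threshold, requests are rejected for the following break duration. After that duration the breaker enters the probing state (HALF-OPEN): if the next request completes without error the break ends, otherwise the breaker opens again. The ratio threshold ranges over [0.0, 1.0], meaning 0% to 100%.
  • Error count (ERROR_COUNT): When the number of exceptions within one statistics interval exceeds the threshold, the breaker opens (the minimum request count still applies, as the ExceptionCircuitBreaker code later in this post shows). After the break duration the breaker enters the probing state (HALF-OPEN): if the next request completes without error the break ends, otherwise the breaker opens again.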
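The three strategies share one shape: a minimum-request guard followed by a comparison against the threshold. A minimal sketch of that decision, assuming a hypothetical TripDecision helper (the enum and method names are illustrative and not part of Sentinel; the boundary handling for a slow ratio of exactly 1.0 is omitted here):

```java
// Hypothetical helper for illustration; names are not part of Sentinel.
class TripDecision {
    enum Strategy { SLOW_REQUEST_RATIO, ERROR_RATIO, ERROR_COUNT }

    /**
     * @param bad   slow calls (SLOW_REQUEST_RATIO) or failed calls (ERROR_*) in the window
     * @param total all completed calls in the window
     * @return true if a CLOSED breaker should open
     */
    static boolean shouldTrip(Strategy strategy, long bad, long total,
                              int minRequestAmount, double threshold) {
        // Too few samples: never trip, whatever the ratio looks like.
        if (total < minRequestAmount) {
            return false;
        }
        if (strategy == Strategy.ERROR_COUNT) {
            // The threshold is an absolute number of failed calls.
            return bad > threshold;
        }
        // Both ratio strategies compare a fraction in [0.0, 1.0].
        return (double) bad / total > threshold;
    }
}
```

For example, with minRequestAmount = 5 and an error-ratio threshold of 0.5, three failures out of four calls do not trip the breaker (too few requests), while three failures out of five do.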

Degrade rule: DegradeRule

public class DegradeRule extends AbstractRule {

    public DegradeRule() {}

    public DegradeRule(String resourceName) {
        setResource(resourceName);
    }

    /**
     * Circuit breaking strategy (0: average RT, 1: exception ratio, 2: exception count).
     * Degrade strategy.
     */
    private int grade = RuleConstant.DEGRADE_GRADE_RT;

    /**
     * Threshold count. The exact meaning depends on the field of grade.
     * <ul>
     *     <li>In average RT mode, it means the maximum response time(RT) in milliseconds.</li>
     *     <li>In exception ratio mode, it means exception ratio which between 0.0 and 1.0.</li>
     *     <li>In exception count mode, it means exception count</li>
     * </ul> Circuit-breaking threshold.
     */
    private double count;

    /**
     * Recovery timeout (in seconds) when circuit breaker opens. After the timeout, the circuit breaker will
     * transform to half-open state for trying a few requests.
     * Recovery time window of the breaker; the default value is 0.
     */
    private int timeWindow;

    /**
     * Minimum number of requests (in an active statistic time span) that can trigger circuit breaking.
     * Minimum number of requests in the statistics window before the breaker may open.
     * @since 1.7.0
     */
    private int minRequestAmount = RuleConstant.DEGRADE_DEFAULT_MIN_REQUEST_AMOUNT;

    /**
     * The threshold of slow request ratio in RT mode.
     * In RT mode, the slow-request ratio above which the breaker opens (a ratio, not a count).
     * @since 1.8.0
     */
    private double slowRatioThreshold = 1.0d;

    /**
     * The interval statistics duration in millisecond.
     *
     * @since 1.8.0
     */
    private int statIntervalMs = 1000;

    // Getters, setters, equals, hashCode and toString omitted.
}

Building the circuit breakers

Degrade rules are loaded by DegradeRuleManager, much like the FlowRuleManager covered earlier.

When rules take effect through DegradeRuleManager.loadRules, each DegradeRule is converted into a CircuitBreaker.

    private static class RulePropertyListener implements PropertyListener<List<DegradeRule>> {

        private synchronized void reloadFrom(List<DegradeRule> list) {
            // Build the circuit breakers.
            Map<String, List<CircuitBreaker>> cbs = buildCircuitBreakers(list);
            Map<String, Set<DegradeRule>> rm = new HashMap<>(cbs.size());

            for (Map.Entry<String, List<CircuitBreaker>> e : cbs.entrySet()) {
                assert e.getValue() != null && !e.getValue().isEmpty();

                Set<DegradeRule> rules = new HashSet<>(e.getValue().size());
                for (CircuitBreaker cb : e.getValue()) {
                    rules.add(cb.getRule());
                }
                rm.put(e.getKey(), rules);
            }

            DegradeRuleManager.circuitBreakers = cbs;
            DegradeRuleManager.ruleMap = rm;
        }


        @Override
        public void configUpdate(List<DegradeRule> conf) {
            reloadFrom(conf);
            RecordLog.info("[DegradeRuleManager] Degrade rules has been updated to: {}", ruleMap);
        }

        @Override
        public void configLoad(List<DegradeRule> conf) {
            reloadFrom(conf);
            RecordLog.info("[DegradeRuleManager] Degrade rules loaded: {}", ruleMap);
        }
        private Map<String, List<CircuitBreaker>> buildCircuitBreakers(List<DegradeRule> list) {
            Map<String, List<CircuitBreaker>> cbMap = new HashMap<>(8);
            if (list == null || list.isEmpty()) {
                return cbMap;
            }
            for (DegradeRule rule : list) {
                if (!isValidRule(rule)) {
                    RecordLog.warn("[DegradeRuleManager] Ignoring invalid rule when loading new rules: {}", rule);
                    continue;
                }

                if (StringUtil.isBlank(rule.getLimitApp())) {
                    rule.setLimitApp(RuleConstant.LIMIT_APP_DEFAULT);
                }
                // Core step: reuse an existing breaker or create a new one.
                CircuitBreaker cb = getExistingSameCbOrNew(rule);
                if (cb == null) {
                    RecordLog.warn("[DegradeRuleManager] Unknown circuit breaking strategy, ignoring: {}", rule);
                    continue;
                }

                String resourceName = rule.getResource();

                List<CircuitBreaker> cbList = cbMap.get(resourceName);
                if (cbList == null) {
                    cbList = new ArrayList<>();
                    cbMap.put(resourceName, cbList);
                }
                cbList.add(cb);
            }
            return cbMap;
        }
    }

DegradeRuleManager.RulePropertyListener#buildCircuitBreakers builds the circuit breakers. For each valid rule it calls getExistingSameCbOrNew:

 private static CircuitBreaker getExistingSameCbOrNew(/*@Valid*/ DegradeRule rule) {
        List<CircuitBreaker> cbs = getCircuitBreakers(rule.getResource());
        if (cbs == null || cbs.isEmpty()) { // No breaker for this resource yet: create one.
            return newCircuitBreakerFrom(rule);
        }
        for (CircuitBreaker cb : cbs) {
            if (rule.equals(cb.getRule())) { // Same rule: return the existing breaker.
                // Reuse the circuit breaker if the rule remains unchanged.
                return cb;
            }
        }
        // No existing breaker matches: create a new one.
        return newCircuitBreakerFrom(rule);
    }
    private static CircuitBreaker newCircuitBreakerFrom(/*@Valid*/ DegradeRule rule) {
        switch (rule.getGrade()) {
            case RuleConstant.DEGRADE_GRADE_RT: // slow request ratio
                return new ResponseTimeCircuitBreaker(rule);
            case RuleConstant.DEGRADE_GRADE_EXCEPTION_RATIO: // exception ratio
            case RuleConstant.DEGRADE_GRADE_EXCEPTION_COUNT: // exception count
                return new ExceptionCircuitBreaker(rule);
            default:
                return null;
        }
    }

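A different breaker is created depending on the degrade strategy. If a rule is unchanged, the existing breaker object is reused, so it keeps its current state across a rule reload.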
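The reuse-or-create step can be illustrated without Sentinel on the classpath. The sketch below uses hypothetical stand-in types (ReuseDemo.Rule and ReuseDemo.Breaker are not Sentinel classes) and assumes rules are compared by value, as rule.equals(cb.getRule()) does in the code above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-ins; none of these types are Sentinel classes.
class ReuseDemo {
    // Stand-in for DegradeRule: compared by value.
    static final class Rule {
        final String resource;
        final int grade;
        final double count;

        Rule(String resource, int grade, double count) {
            this.resource = resource;
            this.grade = grade;
            this.count = count;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Rule)) return false;
            Rule r = (Rule) o;
            return grade == r.grade && Double.compare(count, r.count) == 0
                    && resource.equals(r.resource);
        }

        @Override
        public int hashCode() {
            return Objects.hash(resource, grade, count);
        }
    }

    // Stand-in for a breaker; 'open' is the runtime state worth keeping.
    static final class Breaker {
        final Rule rule;
        boolean open;

        Breaker(Rule rule) { this.rule = rule; }
    }

    // Rebuild the resource -> breakers map, reusing breakers whose rule is unchanged.
    static Map<String, List<Breaker>> rebuild(Map<String, List<Breaker>> old, List<Rule> rules) {
        Map<String, List<Breaker>> next = new HashMap<>();
        for (Rule rule : rules) {
            Breaker chosen = null;
            for (Breaker b : old.getOrDefault(rule.resource, Collections.<Breaker>emptyList())) {
                if (b.rule.equals(rule)) {
                    chosen = b; // unchanged rule: keep the breaker and its state
                    break;
                }
            }
            if (chosen == null) {
                chosen = new Breaker(rule); // new or modified rule: start fresh
            }
            next.computeIfAbsent(rule.resource, k -> new ArrayList<>()).add(chosen);
        }
        return next;
    }
}
```

Reloading the same rule list returns the same Breaker instance, so an open breaker stays open. Changing any field of the rule yields a fresh breaker that starts out closed.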

The circuit-breaking flow

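DegradeSlot is the entry point for circuit breaking. It is inserted into the ProcessorSlotChain linked list as a ProcessorSlot. Its entry method asks each CircuitBreaker whether the current request should be rejected, and if so throws a DegradeException (a BlockException).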

@Spi(order = Constants.ORDER_DEGRADE_SLOT)
public class DegradeSlot extends AbstractLinkedProcessorSlot<DefaultNode> {

    @Override
    public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count,
                      boolean prioritized, Object... args) throws Throwable {
        // Run the circuit-breaker check before triggering the following slots.
        performChecking(context, resourceWrapper);
        // Trigger the following slots.
        fireEntry(context, resourceWrapper, node, count, prioritized, args);
    }

    void performChecking(Context context, ResourceWrapper r) throws BlockException {
        // Look up the circuit breakers by resource name.
        List<CircuitBreaker> circuitBreakers = DegradeRuleManager.getCircuitBreakers(r.getName());
        if (circuitBreakers == null || circuitBreakers.isEmpty()) {
            return;
        }
        // Check every breaker; if one refuses the request, throw.
        for (CircuitBreaker cb : circuitBreakers) {
            if (!cb.tryPass(context)) {
                throw new DegradeException(cb.getRule().getLimitApp(), cb.getRule());
            }
        }
    }
    // exit(...) is shown later in this post.
}

AbstractCircuitBreaker#tryPass
public boolean tryPass(Context context) {
        // Template implementation. CLOSED: let the request through.
        if (currentState.get() == State.CLOSED) {
            return true;
        }
        // OPEN: try to let a probe request through.
        if (currentState.get() == State.OPEN) {
            // For half-open state we allow a request for probing.
            return retryTimeoutArrived() && fromOpenToHalfOpen(context);
        }
        return false;
    }

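As listed earlier, a breaker has three states: CLOSED (requests pass normally), HALF_OPEN (a probe request is allowed), and OPEN (requests are rejected). The decision logic here is: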

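  • If the breaker is CLOSED, return true: the request passes.

  • If the breaker is OPEN, the break duration has elapsed, and the OPEN to HALF_OPEN transition succeeds, return true: this request passes as the probe.

  • If the breaker is OPEN and the break duration has not elapsed, return false: the request is rejected.

  • If the breaker is already HALF_OPEN, return false: a probe is in flight, so other requests are rejected until it completes.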
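These cases can be exercised with a small deterministic sketch. It is not Sentinel's implementation: ToyTryPass is a hypothetical class, the clock is passed in as a parameter instead of being read from the system, and the nextRetryTimestamp field is an assumption about how the "break duration has elapsed" check could be stored.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical toy breaker; the caller supplies 'now' so the example is deterministic.
class ToyTryPass {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final AtomicReference<State> state = new AtomicReference<>(State.CLOSED);
    private final long recoveryTimeoutMs;
    private volatile long nextRetryTimestamp;

    ToyTryPass(long recoveryTimeoutMs) {
        this.recoveryTimeoutMs = recoveryTimeoutMs;
    }

    // Trip the breaker at time 'now' and schedule the earliest probe.
    void open(long now) {
        state.set(State.OPEN);
        nextRetryTimestamp = now + recoveryTimeoutMs;
    }

    boolean tryPass(long now) {
        if (state.get() == State.CLOSED) {
            return true;
        }
        if (state.get() == State.OPEN) {
            // Only after the timeout, and only for the caller that wins the CAS.
            return now >= nextRetryTimestamp
                    && state.compareAndSet(State.OPEN, State.HALF_OPEN);
        }
        return false; // HALF_OPEN: a probe is already in flight
    }

    State state() {
        return state.get();
    }
}
```

With a 1000 ms recovery timeout and the breaker opened at t = 0, a call at t = 999 is rejected, the call at t = 1000 becomes the probe and moves the breaker to HALF_OPEN, and a call at t = 1001 is rejected because the probe has not finished.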

public abstract class AbstractCircuitBreaker implements CircuitBreaker {

    protected final DegradeRule rule;
    protected final int recoveryTimeoutMs;

    private final EventObserverRegistry observerRegistry;

    protected final AtomicReference<State> currentState = new AtomicReference<>(State.CLOSED);
    
    protected boolean fromOpenToHalfOpen(Context context) {
        // Try to move the state from OPEN to HALF_OPEN.
        if (currentState.compareAndSet(State.OPEN, State.HALF_OPEN)) {
            // Notify observers of the state change.
            notifyObservers(State.OPEN, State.HALF_OPEN, null);
            Entry entry = context.getCurEntry();
            // Register an exit handler on the entry; it runs on entry.exit().
            entry.whenTerminate(new BiConsumer<Context, Entry>() {
                @Override
                public void accept(Context context, Entry entry) {
                    // Note: This works as a temporary workaround for https://github.com/alibaba/Sentinel/issues/1638
                    // Without the hook, the circuit breaker won't recover from half-open state in some circumstances
                    // when the request is actually blocked by upcoming rules (not only degrade rules).
                    if (entry.getBlockError() != null) {
                        // Fallback to OPEN due to detecting request is blocked
                        // The probe request was blocked, so set the state back to OPEN.
                        currentState.compareAndSet(State.HALF_OPEN, State.OPEN);
                        notifyObservers(State.HALF_OPEN, State.OPEN, 1.0d);
                    }
                }
            }); // The state is now HALF_OPEN, so let this request pass.
            return true;
        }
        return false;
    }
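Why the exit handler is needed: if the probe request is blocked by a later rule, DegradeSlot#exit skips onRequestComplete, so nothing would move the breaker out of HALF_OPEN, and tryPass rejects everything in that state. The sketch below reproduces that fallback with hypothetical stand-ins (ProbeFallbackDemo and ToyEntry are not Sentinel classes; the hook signature is simplified to a single-argument Consumer).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical stand-ins for illustration; not Sentinel's Entry or breaker classes.
class ProbeFallbackDemo {
    enum State { CLOSED, OPEN, HALF_OPEN }

    // Stand-in for Entry: remembers a block error and runs hooks on exit().
    static final class ToyEntry {
        Exception blockError;
        private final List<Consumer<ToyEntry>> hooks = new ArrayList<>();

        void whenTerminate(Consumer<ToyEntry> hook) {
            hooks.add(hook);
        }

        void exit() {
            for (Consumer<ToyEntry> hook : hooks) {
                hook.accept(this);
            }
        }
    }

    final AtomicReference<State> state = new AtomicReference<>(State.OPEN);

    boolean fromOpenToHalfOpen(ToyEntry entry) {
        if (!state.compareAndSet(State.OPEN, State.HALF_OPEN)) {
            return false;
        }
        // If the probe ends up blocked, fall back to OPEN so a later probe can be tried.
        entry.whenTerminate(e -> {
            if (e.blockError != null) {
                state.compareAndSet(State.HALF_OPEN, State.OPEN);
            }
        });
        return true;
    }
}
```

If the entry exits with a block error, the breaker returns to OPEN. If it exits cleanly, the hook does nothing and the breaker stays HALF_OPEN until the completion callback decides between CLOSED and OPEN.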

Calling Entry#exit() triggers the exit calls along the slot chain. For circuit breaking, that is DegradeSlot#exit.

  public void exit(Context context, ResourceWrapper r, int count, Object... args) {
        Entry curEntry = context.getCurEntry();
        // If another slot has already raised a BlockException, just call fireExit and skip the breaker logic.
        if (curEntry.getBlockError() != null) {
            fireExit(context, r, count, args);
            return;
        }
        // Look up all circuit breakers by resource name.
        List<CircuitBreaker> circuitBreakers = DegradeRuleManager.getCircuitBreakers(r.getName());
        if (circuitBreakers == null || circuitBreakers.isEmpty()) {
            fireExit(context, r, count, args);
            return;
        }

        if (curEntry.getBlockError() == null) {
            // passed request
            for (CircuitBreaker circuitBreaker : circuitBreakers) {
                circuitBreaker.onRequestComplete(context);
            }
        }

        fireExit(context, r, count, args);
    }

Circuit breaker state transitions

Continuing from exit above: circuitBreaker.onRequestComplete calls back into the breaker, which performs the state transition.

ExceptionCircuitBreaker

ExceptionCircuitBreaker handles breaking by exception count and exception ratio. It uses a sliding window to count errors and total requests.
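To make "sliding window" concrete, here is a minimal single-threaded sketch of bucketed counting. It is an illustration only: ToyErrorWindow is a hypothetical class, time is passed in by the caller, and Sentinel's own statistics structure is thread-safe and more elaborate.

```java
import java.util.Arrays;

// Hypothetical, single-threaded sketch of a bucketed sliding window.
class ToyErrorWindow {
    private final int bucketCount;
    private final long bucketLengthMs;
    private final long[] bucketStart;
    private final long[] errors;
    private final long[] totals;

    ToyErrorWindow(int bucketCount, long intervalMs) {
        this.bucketCount = bucketCount;
        this.bucketLengthMs = intervalMs / bucketCount;
        this.bucketStart = new long[bucketCount];
        this.errors = new long[bucketCount];
        this.totals = new long[bucketCount];
        Arrays.fill(bucketStart, -1L); // -1 marks a bucket that was never used
    }

    void record(long now, boolean failed) {
        int idx = (int) ((now / bucketLengthMs) % bucketCount);
        long start = now - now % bucketLengthMs;
        if (bucketStart[idx] != start) {
            // The slot holds data from an older lap: reset it before reuse.
            bucketStart[idx] = start;
            errors[idx] = 0;
            totals[idx] = 0;
        }
        totals[idx]++;
        if (failed) {
            errors[idx]++;
        }
    }

    long errorCount(long now) {
        return sum(errors, now);
    }

    long totalCount(long now) {
        return sum(totals, now);
    }

    private long sum(long[] values, long now) {
        long result = 0;
        for (int i = 0; i < bucketCount; i++) {
            // Count only buckets that started within the last full interval.
            if (bucketStart[i] >= 0 && now - bucketStart[i] < bucketLengthMs * bucketCount) {
                result += values[i];
            }
        }
        return result;
    }
}
```

With two 500 ms buckets, events recorded at t = 0, 100 and 600 are all visible at t = 700, but by t = 1200 the first bucket has aged out and only the event from t = 600 remains.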

ExceptionCircuitBreaker#onRequestComplete

    public void onRequestComplete(Context context) {
        Entry entry = context.getCurEntry();
        if (entry == null) {
            return;
        }
        Throwable error = entry.getError();
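        // Counter for the current time window.
        SimpleErrorCounter counter = stat.currentWindow().value();
        if (error != null) { // An exception occurred: increment the error count.
            counter.getErrorCount().add(1);
        }
        // Increment the total count.
        counter.getTotalCount().add(1);
        // Handle a possible state change.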
        handleStateChangeWhenThresholdExceeded(error);
    }

ExceptionCircuitBreaker#handleStateChangeWhenThresholdExceeded

private void handleStateChangeWhenThresholdExceeded(Throwable error) {
        // While the breaker is OPEN there is nothing to do: it stays open.
        if (currentState.get() == State.OPEN) {
            return;
        }
        // The breaker is HALF_OPEN: this was the probe request.
        if (currentState.get() == State.HALF_OPEN) {
            // In detecting request
            if (error == null) {
                // No exception: HALF_OPEN -> CLOSED, all requests pass again.
                fromHalfOpenToClose();
            } else {
                // The probe failed too: HALF_OPEN -> OPEN, all requests are rejected.
                fromHalfOpenToOpen(1.0d);
            }
            return;
        }
        // From here on the breaker is CLOSED.
        List<SimpleErrorCounter> counters = stat.values();
        long errCount = 0;
        long totalCount = 0;
        for (SimpleErrorCounter counter : counters) {
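            // Sum the error count and the total request count over the window.
            errCount += counter.errorCount.sum();
            totalCount += counter.totalCount.sum();
        }
        // Below the minimum request amount the breaker never opens.
        if (totalCount < minRequestAmount) {
            return;
        }
        // Current error count.
        double curCount = errCount;
        if (strategy == DEGRADE_GRADE_EXCEPTION_RATIO) {
            // Use errorRatio: compute the current error ratio.
            curCount = errCount * 1.0d / totalCount;
        }
        // Check whether the error count or ratio exceeds the configured threshold.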
        if (curCount > threshold) {
            transformToOpen(curCount);
        }
    }

ResponseTimeCircuitBreaker

ResponseTimeCircuitBreaker handles breaking on slow calls. It uses a sliding window to count slow calls and total requests.

ResponseTimeCircuitBreaker#onRequestComplete

 public void onRequestComplete(Context context) {
        // Get the counter of the current sliding-window bucket.
        SlowRequestCounter counter = slidingCounter.currentWindow().value();
        Entry entry = context.getCurEntry();
        if (entry == null) {
            return;
        }
        // Completion time of the request.
        long completeTime = entry.getCompleteTimestamp();
        if (completeTime <= 0) {
            completeTime = TimeUtil.currentTimeMillis();
        }
        // Compute the response time (RT).
        long rt = completeTime - entry.getCreateTimestamp();
        if (rt > maxAllowedRt) {
            // RT exceeds the allowed maximum: count a slow call.
            counter.slowCount.add(1);
        }
        counter.totalCount.add(1);
        // Handle a possible state change.
        handleStateChangeWhenThresholdExceeded(rt);
    }

ResponseTimeCircuitBreaker#handleStateChangeWhenThresholdExceeded

  private void handleStateChangeWhenThresholdExceeded(long rt) {
        // While the breaker is OPEN all requests are already rejected; nothing to do.
        if (currentState.get() == State.OPEN) {
            return;
        }
        // The breaker is HALF_OPEN.
        if (currentState.get() == State.HALF_OPEN) {
            // In detecting request
            // TODO: improve logic for half-open recovery
            if (rt > maxAllowedRt) {
                // Probe RT exceeds the allowed maximum: HALF_OPEN -> OPEN.
                fromHalfOpenToOpen(1.0d);
            } else {
                // Probe RT is within the allowed maximum: HALF_OPEN -> CLOSED.
                fromHalfOpenToClose();
            }
            return;
        }

        // From here on the breaker is CLOSED.
        List<SlowRequestCounter> counters = slidingCounter.values();
        long slowCount = 0;
        long totalCount = 0;
        for (SlowRequestCounter counter : counters) {
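            // Sum slow calls and total calls over the window.
            slowCount += counter.slowCount.sum();
            totalCount += counter.totalCount.sum();
        }
        // Below the minimum request amount the breaker never opens.
        if (totalCount < minRequestAmount) {
            return;
        }
        // Compute the slow-request ratio.
        double currentRatio = slowCount * 1.0d / totalCount;
        if (currentRatio > maxSlowRequestRatio) {
            // The slow-call ratio exceeds the threshold: CLOSED -> OPEN.
            transformToOpen(currentRatio);
        }
        // Boundary case: the ratio equals the threshold and the threshold is the maximum ratio,
        // so the ratio can never exceed it. Open the breaker here as well.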
        if (Double.compare(currentRatio, maxSlowRequestRatio) == 0 &&
                Double.compare(maxSlowRequestRatio, SLOW_REQUEST_RATIO_MAX_VALUE) == 0) {
            transformToOpen(currentRatio);
        }
    }
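The extra equality check at the end exists because a ratio can never be greater than its maximum: with the threshold set to that maximum, the plain "greater than" test would never fire, even if every call is slow. A minimal sketch of just that decision, assuming the maximum is 1.0 (SlowRatioBoundary is a hypothetical class, and the constant's value here is an assumption consistent with the [0.0, 1.0] ratio range):

```java
// Hypothetical helper isolating the threshold comparison; not a Sentinel class.
class SlowRatioBoundary {
    // Assumption: the maximum slow-request ratio is 1.0 (100%).
    static final double SLOW_REQUEST_RATIO_MAX_VALUE = 1.0d;

    static boolean shouldOpen(long slowCount, long totalCount, double maxSlowRequestRatio) {
        double currentRatio = slowCount * 1.0d / totalCount;
        if (currentRatio > maxSlowRequestRatio) {
            return true;
        }
        // Equality only counts when the threshold is the maximum itself.
        return Double.compare(currentRatio, maxSlowRequestRatio) == 0
                && Double.compare(maxSlowRequestRatio, SLOW_REQUEST_RATIO_MAX_VALUE) == 0;
    }
}
```

With a threshold of 1.0, ten slow calls out of ten open the breaker while nine out of ten do not. With a threshold of 0.5, five out of ten (exactly equal) does not open it, and six out of ten does.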

That's all for now.
