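This post collects a few thoughts on flow control when a downstream service consumes from an MQ 🤔
What happens on the MQ consumer side when messages suddenly spike?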
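MQ: decoupling, peak shaving, asynchrony. (A consumer can set its number of consumer threads to control how fast it consumes.)
Sentinel: flow control, circuit breaking and degradation.
Scenario: an upstream service suddenly sends a large batch of messages
while the downstream consumer has limited capacity (low-spec machine, consumer thread count set too high, etc.)
-> the downstream service pulls messages with many threads, in batches, consumes each batch as soon as it arrives, then pulls again without pause;
-> the downstream service's CPU is quickly saturated and stays around 100%, until the machine hangs and restarts.
Possible fixes:
Option 1: set the consumer thread count very low. The consumer is never under pressure, but messages pile up and go unconsumed. ❌
Option 2: add rate-limiting logic in the consumer code, for example with Sentinel, so that the consume rate is controlled and the consumer stops pulling nonstop. ✅
Going further:
Option 2 is the same thing we routinely do when using Sentinel to limit web requests or calls from upstream services:
configure the flow-control threshold from the service's actual capacity (load-test the service to find the rate it can sustain).
With MQ, though, it is easy to fall for a misconception: the MQ already shaves peaks, so the downstream does not need to do anything.
In reality the downstream has to throttle itself, explicitly, according to its own capacity!
For example: configure a sensible number of threads for pulling messages, or deliberately slow down while consuming (a short sleep, etc.).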
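To make the trade-off between the two approaches above more concrete, here is a rough sketch of the arithmetic. The `BacklogEstimate` helper and every number in it are made up for illustration; the real consume rate should come from load-testing your own service.

```java
/** Back-of-the-envelope model: how long a burst takes to drain at a fixed consume rate. */
final class BacklogEstimate {

    /**
     * @param burstSize     messages dumped into the topic at once (hypothetical)
     * @param arrivalPerSec steady arrival rate that continues after the burst
     * @param consumePerSec rate the consumer is throttled to (taken from load testing)
     * @return seconds until the backlog is empty, or -1 if it never drains
     */
    static long drainSeconds(long burstSize, long arrivalPerSec, long consumePerSec) {
        long net = consumePerSec - arrivalPerSec;
        if (net <= 0) {
            return -1; // the consumer is not faster than the producers: the backlog only grows
        }
        return (burstSize + net - 1) / net; // ceiling division
    }

    public static void main(String[] args) {
        // Throttled to what the machine can sustain: the backlog drains in bounded time.
        System.out.println(drainSeconds(600_000, 100, 500));
        // Throttled too hard (consume rate == arrival rate): the backlog never drains.
        System.out.println(drainSeconds(600_000, 100, 100));
    }
}
```

The point of the sketch: a limit that is too low leaves the backlog sitting in the broker forever, while a limit matched to the measured capacity lets the broker absorb the burst and the consumer work it off at a steady pace.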
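Ref: "I heard your MQ consumer thread count is set to 300" (听说你的MQ消费线程数设置300)
Rate-limiting MQ consumption with Sentinel: hands-on!
Official documentation ✈️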

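Practice is the sole criterion for testing truth!
Findings:
1. Sentinel's rate-limiter (uniform queueing) flow-control behavior works as expected ✅
2. In rate-limiter mode, a thread that exceeds the threshold is blocked and waits in the queue ✅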
```java
import com.alibaba.csp.sentinel.Entry;
import com.alibaba.csp.sentinel.SphU;
import com.alibaba.csp.sentinel.slots.block.BlockException;
import com.alibaba.csp.sentinel.slots.block.RuleConstant;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRuleManager;
import com.alibaba.csp.sentinel.util.TimeUtil;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class PaceFlowDemoTest {

    private static final String KEY = "abc";

    private static volatile CountDownLatch countDown;

    private static final Integer requestQps = 20; // number of requests fired at once
    private static final Integer count = 1;       // QPS threshold of the flow rule

    private static final AtomicInteger done = new AtomicInteger();
    private static final AtomicInteger pass = new AtomicInteger();
    private static final AtomicInteger block = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pace behavior");
        countDown = new CountDownLatch(1);
        initPaceFlowRule();
        simulatePulseFlow();
        countDown.await();

        System.out.println("done");
        System.out.println("total pass:" + pass.get() + ", total block:" + block.get());

        // Default behavior (reject immediately), disabled for this run:
        // System.out.println();
        // System.out.println("default behavior");
        // TimeUnit.SECONDS.sleep(5);
        // done.set(0);
        // pass.set(0);
        // block.set(0);
        // countDown = new CountDownLatch(1);
        // initDefaultFlowRule();
        // simulatePulseFlow();
        // countDown.await();
        // System.out.println("done");
        // System.out.println("total pass:" + pass.get() + ", total block:" + block.get());
        System.exit(0);
    }

    private static void initPaceFlowRule() {
        List<FlowRule> rules = new ArrayList<>();
        FlowRule rule1 = new FlowRule();
        rule1.setResource(KEY);
        rule1.setCount(count);
        rule1.setGrade(RuleConstant.FLOW_GRADE_QPS);
        rule1.setLimitApp("default");
        /*
         * CONTROL_BEHAVIOR_RATE_LIMITER means requests more than threshold will be queueing in the queue,
         * until the queueing time is more than {@link FlowRule#maxQueueingTimeMs}, the requests will be rejected.
         */
        rule1.setControlBehavior(RuleConstant.CONTROL_BEHAVIOR_RATE_LIMITER);
        rule1.setMaxQueueingTimeMs(10 * 1000);

        rules.add(rule1);
        FlowRuleManager.loadRules(rules);
    }

    private static void initDefaultFlowRule() {
        List<FlowRule> rules = new ArrayList<>();
        FlowRule rule1 = new FlowRule();
        rule1.setResource(KEY);
        rule1.setCount(count);
        rule1.setGrade(RuleConstant.FLOW_GRADE_QPS);
        rule1.setLimitApp("default");
        // CONTROL_BEHAVIOR_DEFAULT means requests more than threshold will be rejected immediately.
        rule1.setControlBehavior(RuleConstant.CONTROL_BEHAVIOR_DEFAULT);

        rules.add(rule1);
        FlowRuleManager.loadRules(rules);
    }

    private static void simulatePulseFlow() {
        for (int i = 0; i < requestQps; i++) {
            Thread thread = new Thread(new Runnable() {
                @Override
                public void run() {
                    long startTime = TimeUtil.currentTimeMillis();
                    Entry entry = null;
                    try {
                        // Check 1: in rate-limiter mode, does the over-limit thread itself wait,
                        // or is the task put into a queue and executed asynchronously?
                        // If it were asynchronous, an MQ consumer would keep pulling messages and
                        // the queued tasks would grow rapidly (my guess: it does not work this way).
                        // If the thread itself waits, the thread name printed before and after
                        // entry must be the same. Result: it is the same.
                        long before = TimeUtil.currentTimeMillis();
                        // Printed text: "thread before entry"
                        System.out.println(Thread.currentThread().getName() + ", entry前线程");
                        entry = SphU.entry(KEY);
                        long after = TimeUtil.currentTimeMillis();
                        // Printed text: "thread after entry, elapsed time (ms)"
                        System.out.println(Thread.currentThread().getName() + ", entry后线程,线程耗时:" + (after - before));
                    } catch (BlockException e1) {
                        block.incrementAndGet();
                    } catch (Exception e2) {
                        // biz exception
                    } finally {
                        if (entry != null) {
                            entry.exit();
                            pass.incrementAndGet();
                            // Check 2: Sentinel's uniform-queueing effect (passes are spaced evenly)
                            long cost = TimeUtil.currentTimeMillis() - startTime;
                            System.out.println(Thread.currentThread().getName() + ": one request pass, cost " + cost + " ms");
                        }
                    }
                    if (done.incrementAndGet() >= requestQps) {
                        countDown.countDown();
                    }
                }
            }, "Thread:" + i);
            thread.start();
        }
    }
}
```
Output (in the log, "entry前线程" means "thread before entry" and "entry后线程,线程耗时" means "thread after entry, elapsed time in ms"):

```text
pace behavior
INFO: Sentinel log output type is: file
INFO: Sentinel log charset is: utf-8
INFO: Sentinel log base directory is: /Users/a1021500048/logs/csp/
INFO: Sentinel log name use pid is: false
INFO: Sentinel log level is: INFO
Thread:1, entry前线程
Thread:0, entry前线程
Thread:2, entry前线程
Thread:4, entry前线程
Thread:3, entry前线程
Thread:5, entry前线程
Thread:6, entry前线程
Thread:7, entry前线程
Thread:8, entry前线程
Thread:9, entry前线程
Thread:10, entry前线程
Thread:11, entry前线程
Thread:12, entry前线程
Thread:13, entry前线程
Thread:14, entry前线程
Thread:15, entry前线程
Thread:16, entry前线程
Thread:17, entry前线程
Thread:18, entry前线程
Thread:19, entry前线程
Thread:7, entry后线程,线程耗时:39
Thread:7: one request pass, cost 39 ms
Thread:8, entry后线程,线程耗时:1044
Thread:8: one request pass, cost 1044 ms
Thread:14, entry后线程,线程耗时:2042
Thread:14: one request pass, cost 2042 ms
Thread:12, entry后线程,线程耗时:3042
Thread:12: one request pass, cost 3042 ms
Thread:19, entry后线程,线程耗时:4041
Thread:19: one request pass, cost 4041 ms
Thread:0, entry后线程,线程耗时:5043
Thread:0: one request pass, cost 5043 ms
Thread:3, entry后线程,线程耗时:6043
Thread:3: one request pass, cost 6043 ms
Thread:6, entry后线程,线程耗时:7043
Thread:6: one request pass, cost 7043 ms
Thread:15, entry后线程,线程耗时:8043
Thread:15: one request pass, cost 8043 ms
Thread:16, entry后线程,线程耗时:9042
Thread:16: one request pass, cost 9042 ms
Thread:17, entry后线程,线程耗时:10042
Thread:17: one request pass, cost 10042 ms
done
total pass:11, total block:9 // rule1.setMaxQueueingTimeMs(10s);
Process finished with exit code 0
```
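The 11 passed / 9 blocked split in the output is consistent with plain pacing arithmetic: at 1 QPS each admitted request reserves a slot one second after the previous one, and a request whose slot lies more than 10 s away is rejected. The sketch below models that with a virtual clock. `PacingModel` is a hypothetical class written for this post, not Sentinel's implementation; it is single-threaded and assumes all 20 requests arrive at the same instant.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, single-threaded model of fixed-interval pacing (not Sentinel's actual code). */
final class PacingModel {
    private final long intervalMs;      // gap between two passes: 1000 / qps
    private final long maxQueueingMs;   // requests that would wait longer than this are rejected
    private long latestPassedTime = -1; // virtual timestamp of the last reserved slot

    PacingModel(double qps, long maxQueueingMs) {
        this.intervalMs = Math.round(1000.0 / qps);
        this.maxQueueingMs = maxQueueingMs;
    }

    /** Returns the wait in ms for a request arriving at {@code now}, or -1 if it is rejected. */
    long tryPass(long now) {
        long expected = latestPassedTime + intervalMs;
        if (expected <= now) {       // idle for long enough: pass immediately
            latestPassedTime = now;
            return 0;
        }
        long wait = expected - now;
        if (wait > maxQueueingMs) {  // would have to queue for too long: reject
            return -1;
        }
        latestPassedTime = expected; // reserve the next slot; the caller would sleep for `wait`
        return wait;
    }

    /** Fires {@code requests} arrivals at the same instant and collects the waits of admitted ones. */
    static List<Long> burst(int requests, double qps, long maxQueueingMs) {
        PacingModel model = new PacingModel(qps, maxQueueingMs);
        long now = 1_000_000L;       // arbitrary virtual clock value
        List<Long> waits = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            long wait = model.tryPass(now);
            if (wait >= 0) {
                waits.add(wait);
            }
        }
        return waits;
    }

    public static void main(String[] args) {
        List<Long> waits = burst(20, 1, 10_000);
        System.out.println("pass=" + waits.size() + ", block=" + (20 - waits.size()));
        System.out.println("waits(ms)=" + waits);
    }
}
```

Under these assumptions the model admits one request immediately and ten more at 1 s intervals (a wait of exactly 10 000 ms is still allowed), which lines up with the roughly 1 s spacing of the "cost" values in the log.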
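MQ consumer rate-limiting utility class 🔨

```java
import com.alibaba.csp.sentinel.SphO;

import java.util.concurrent.TimeUnit;

// MessageExt and ConsumeConcurrentlyContext come from the RocketMQ client;
// BaseConsumer, MqConsumerConfig, AcmValueConstant, StringUtils and log are project-specific
// and their imports are omitted here.

/**
 * MQ consumer with built-in rate limiting.
 * Implementation: Sentinel's default flow-control behavior (reject immediately) plus sleep.
 * Possible optimization: switch to Sentinel's rate-limiter (uniform queueing) mode.
 */
public abstract class LimitBaseConsumer extends BaseConsumer {

    @Override
    protected void consume(MessageExt msg, ConsumeConcurrentlyContext context) throws Exception {
        MqConsumerConfig mqConsumerConfig = getConsumerConfig();
        String group = mqConsumerConfig.getConsumerGroup();
        String topic = mqConsumerConfig.getTopic();
        log.info("start consuming topic {} msgId = {}", topic, msg.getMsgId());
        // Check the parts separately: the joined key always contains ":" and is never blank
        if (StringUtils.isBlank(group) || StringUtils.isBlank(topic)) {
            log.info("sentinel resource name is blank");
            return;
        }
        // The Sentinel resource key is `groupName:topicName`
        String resourceName = group + ":" + topic;

        // If rate limiting is switched off, keep the original consume logic
        if (!AcmValueConstant.MQ_CONSUMER_LIMIT_SWITCH) {
            consumeAfterLimit(msg, context);
            return;
        }

        // Total time (ms) this message may wait for a permit
        long rcvIntervalTimeLeft = AcmValueConstant.MQ_CONSUMER_RCV_INTERVAL_TIME;
        long sleepTime = getSleepTimeAfterLimit();

        // Sentinel check with a time-based fallback: once the budget is used up, consume anyway
        boolean entered = SphO.entry(resourceName);
        while (!entered && rcvIntervalTimeLeft > 0) {
            log.info("{} msgId = {} execute limited ", resourceName, msg.getMsgId());
            // Guard against the configured budget having been lowered in the meantime
            if (rcvIntervalTimeLeft > AcmValueConstant.MQ_CONSUMER_RCV_INTERVAL_TIME) {
                rcvIntervalTimeLeft = AcmValueConstant.MQ_CONSUMER_RCV_INTERVAL_TIME;
            }
            long pause = Math.min(sleepTime, rcvIntervalTimeLeft);
            rcvIntervalTimeLeft -= pause;
            try {
                TimeUnit.MILLISECONDS.sleep(pause);
            } catch (InterruptedException e) {
                log.error("{} sleep exception ", resourceName);
                Thread.currentThread().interrupt(); // keep the interrupt flag for the caller
                break;
            }
            entered = SphO.entry(resourceName);
        }

        try {
            consumeAfterLimit(msg, context);
        } finally {
            // Only exit when an entry was actually obtained
            if (entered) {
                SphO.exit();
            }
        }
    }

    /**
     * Sleep time after being rate limited.
     *
     * @return milliseconds to sleep between two permit attempts
     */
    protected abstract long getSleepTimeAfterLimit();

    /**
     * Business processing, invoked once the rate limiter lets the message through
     * (or the wait budget has run out).
     *
     * @param msg     the message to consume
     * @param context the concurrent consume context
     */
    protected abstract void consumeAfterLimit(MessageExt msg, ConsumeConcurrentlyContext context);
}
```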
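The core of the consumer above is a small pattern: try a non-blocking permit check, sleep a little when rejected, and give up once a time budget is spent. Here is a minimal, library-free sketch of just that loop so it can be read and tested in isolation. `BudgetedAcquire`, its parameters and the injectable `sleeper` are hypothetical names introduced for illustration; in the real consumer the permit check would be the Sentinel call.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

/** Hypothetical helper: retry a non-blocking permit check within a time budget. */
final class BudgetedAcquire {

    /**
     * @param tryAcquire non-blocking permit check (for example a wrapper around a limiter)
     * @param budgetMs   maximum total time to spend waiting
     * @param sleepMs    pause between attempts, must be positive
     * @param sleeper    performs the pause; injectable so tests need no real sleeping
     * @return true if a permit was obtained, false if the budget ran out first
     */
    static boolean acquire(BooleanSupplier tryAcquire, long budgetMs, long sleepMs, LongConsumer sleeper) {
        if (sleepMs <= 0) {
            throw new IllegalArgumentException("sleepMs must be positive");
        }
        long left = budgetMs;
        while (true) {
            if (tryAcquire.getAsBoolean()) {
                return true;            // permit obtained: the caller must release it later
            }
            if (left <= 0) {
                return false;           // budget spent: the caller proceeds without a permit
            }
            long pause = Math.min(sleepMs, left);
            sleeper.accept(pause);
            left -= pause;
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        long[] slept = {0};
        // The fake limiter rejects twice, then grants a permit on the third attempt.
        boolean ok = acquire(() -> ++calls[0] >= 3, 1_000, 100, ms -> slept[0] += ms);
        System.out.println(ok + " after " + calls[0] + " attempts, slept " + slept[0] + " ms");
    }
}
```

Returning whether a permit was actually obtained is the useful part of this shape: the caller can then release the permit only when it holds one, instead of calling exit unconditionally and swallowing the resulting exception.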