RocketMQ Source Code Study -- Consumer-02: Message Load Balancing and Rebalancing

Study middleware with questions in mind, and think about how you would implement it yourself.

Prerequisites

From the architecture we know the following relationships between the roles:

Topic : MessageQueue = 1 : n
Topic : Producer = 1 : n (n >= 1)
Topic : Consumer = 1 : n (n >= 1)

Questions

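Where is PullRequest created?
1. From the previous post we know that the consumer only pulls data from the Broker when a PullRequest exists. So where does it come from?
Consumer-side message load balancing and rebalancing

1. How do multiple consumers in one consumer group share the message queues (one topic has multiple message queues)?

2. How do multiple threads inside one consumer cooperate (concurrently) to consume the messages in the queues assigned to that consumer?
What is MsgTreeMap for, and why does it keep another copy of the data?
How consumption progress is persisted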

Where PullRequest comes from

Use IDEA's Navigate > Call Hierarchy to inspect the call chain.
The root is the RebalanceService.run method. So what is this object for?

RebalanceService

After several layers of delegation, the call ends up in RebalanceImpl.doRebalance.

RebalanceImpl

First, look at the fields it has.

// Process queue table: for each message queue, one ProcessQueue that holds the consumption progress information for that queue
protected final ConcurrentMap<MessageQueue, ProcessQueue> processQueueTable = new ConcurrentHashMap<MessageQueue, ProcessQueue>(64);

// Topic -> set of message queues of that topic
protected final ConcurrentMap<String/* topic */, Set<MessageQueue>> topicSubscribeInfoTable =
  new ConcurrentHashMap<String, Set<MessageQueue>>();

// Subscription information
protected final ConcurrentMap<String /* topic */, SubscriptionData> subscriptionInner =
  new ConcurrentHashMap<String, SubscriptionData>();

// Consumer group name
protected String consumerGroup;

// Message model (clustering or broadcasting)
protected MessageModel messageModel;

// Queue allocation algorithm
protected AllocateMessageQueueStrategy allocateMessageQueueStrategy;

// MQ client instance
protected MQClientInstance mQClientFactory;

doRebalance

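/**
* IMP RebalanceImpl#doRebalance performs the rebalance.
*     It gets the current consumer's subscription table, iterates over it,
*     takes each subscribed topic, and calls rebalanceByTopic to rebalance that topic.
*
* @param isOrder whether consumption is ordered
*/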
public void doRebalance(final boolean isOrder) {

  // Get the current consumer's subscription table
  Map<String, SubscriptionData> subTable = this.getSubscriptionInner();
  if (subTable != null) {

    // Iterate over the subscription table
    for (final Map.Entry<String, SubscriptionData> entry : subTable.entrySet()) {

      // Get the topic
      final String topic = entry.getKey();
      try {

        /*
        * Rebalance this topic
        */
        this.rebalanceByTopic(topic, isOrder);
      } catch (Throwable e) {
        if (!topic.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
          log.warn("rebalanceByTopic Exception", e);
        }
      }
    }
  }

  /*
  * Drop the ProcessQueue snapshots of queues whose topic this consumer no longer subscribes to
  */
  this.truncateMessageQueueNotMyTopic();
}

rebalanceByTopic

Internally it is split into two modes, clustering and broadcasting. Look at clustering mode first.

Clustering

/*
* Clustering mode
* Use the load balancing strategy to decide which MessageQueues are assigned to the current consumer, then update this consumer's processQueueTable
*/
case CLUSTERING: {

  // Get all message queues of the topic
  Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);

  /*
  * IMP
  *  Ask a broker that hosts the topic for the clientId list of the current consumerGroup, i.e. the consumer client ids
  *  One clientId stands for one consumer
  */
  List<String> cidAll = this.mQClientFactory.findConsumerIdList(topic, consumerGroup);
  if (null == mqSet) {
    if (!topic.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
      log.warn("doRebalance, {}, but the topic[{}] not exist.", consumerGroup, topic);
    }
  }

  if (null == cidAll) {
    log.warn("doRebalance, {} {}, get consumer id list failed", consumerGroup, topic);
  }

  if (mqSet != null && cidAll != null) {

    // Copy the topic's message queues into a list
    List<MessageQueue> mqAll = new ArrayList<MessageQueue>();
    mqAll.addAll(mqSet);


    /*
    * Sort the topic's message queues and the clientId list.
    * Sorting guarantees that every consumer client sees mqAll and cidAll in the same order when it rebalances.
    */
    Collections.sort(mqAll);
    Collections.sort(cidAll);

    // Get the queue allocation strategy, i.e. the load balancing strategy class
    AllocateMessageQueueStrategy strategy = this.allocateMessageQueueStrategy;

    List<MessageQueue> allocateResult = null;
    try {

      /*
      * Allocate message queues to the current clientId, i.e. the current consumer.
      * This step runs the load balancing (rebalance) algorithm.
      */
      allocateResult = strategy.allocate(
        this.consumerGroup,
        this.mQClientFactory.getClientId(),
        mqAll,
        cidAll);
    } catch (Throwable e) {
      log.error("AllocateMessageQueueStrategy.allocate Exception. allocateMessageQueueStrategyName={}", strategy.getName(),
                e);
      return;
    }

    // Deduplicate the allocated message queues
    Set<MessageQueue> allocateResultSet = new HashSet<MessageQueue>();
    if (allocateResult != null) {
      allocateResultSet.addAll(allocateResult);
    }

    /*
    * Update processQueueTable for the newly allocated message queues, create the initial pullRequests and dispatch them to PullMessageService
    */
    boolean changed = this.updateProcessQueueTableInRebalance(topic, allocateResultSet, isOrder);
    // If processQueueTable changed
    if (changed) {
      log.info(
        "rebalanced result changed. allocateMessageQueueStrategyName={}, group={}, topic={}, clientId={}, mqAllSize={}, cidAllSize={}, rebalanceResultSize={}, rebalanceResultSet={}",
        strategy.getName(), consumerGroup, topic, this.mQClientFactory.getClientId(), mqSet.size(), cidAll.size(),
        allocateResultSet.size(), allocateResultSet);

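      /*
      * Set a new local subscription version, reset the flow control parameters, and immediately send a heartbeat to all brokers so they update the current subscription
      */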
      this.messageQueueChanged(topic, mqSet, allocateResultSet);
    }
  }
  break;
}

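Steps:

1. Get all message queues of the topic.
2. Get all consumer ids for the topic: fetch from the broker the clientIds of the consumers in this consumer group.
3. Both of the above must be present; otherwise nothing more is done. Sort both lists so that every node sees the same order.
4. Run the allocation algorithm AllocateMessageQueueStrategy to get the MessageQueues assigned to this consumer.
5. updateProcessQueueTableInRebalance applies the allocation result: it updates processQueueTable for the newly allocated message queues, creates the initial pullRequests, dispatches them to PullMessageService, and returns whether anything changed.
6. Finally, if the assigned message queues really changed, messageQueueChanged updates the local state and notifies the Broker that the current subscription has changed.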
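To make the sort-then-allocate steps concrete, here is a minimal sketch of an even split. It is not RocketMQ's AllocateMessageQueueStrategy implementation; the class name AverageAllocationSketch and the use of plain strings for queues and client ids are assumptions made for illustration. It shows why sorting matters: each consumer runs the same function on its own, and they only agree on a non-overlapping split because they all see the same order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration, not RocketMQ code: queues and client ids are plain strings.
final class AverageAllocationSketch {

    /** Returns the queues assigned to currentCid out of mqAll, shared among cidAll. */
    static List<String> allocate(String currentCid, List<String> mqAll, List<String> cidAll) {
        // Sort copies of both lists so every consumer computes on the same order.
        List<String> queues = new ArrayList<>(mqAll);
        List<String> consumers = new ArrayList<>(cidAll);
        Collections.sort(queues);
        Collections.sort(consumers);

        List<String> result = new ArrayList<>();
        int index = consumers.indexOf(currentCid);
        if (index < 0 || queues.isEmpty()) {
            return result; // unknown consumer, or nothing to assign
        }

        int base = queues.size() / consumers.size();
        int remainder = queues.size() % consumers.size();
        // The first `remainder` consumers take one extra queue each.
        int size = base + (index < remainder ? 1 : 0);
        int start = index * base + Math.min(index, remainder);
        for (int i = 0; i < size; i++) {
            result.add(queues.get(start + i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queues = List.of("q0", "q1", "q2", "q3", "q4");
        List<String> consumers = List.of("c2", "c1"); // deliberately unsorted
        System.out.println(allocate("c1", queues, consumers));
        System.out.println(allocate("c2", queues, consumers));
    }
}
```

With five queues and two consumers, the first consumer in sorted order gets three queues and the second gets the remaining two. If there are more consumers than queues, the consumers at the end of the sorted list get nothing.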

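After the analysis above, the key part is the processing around updateProcessQueueTableInRebalance. If I were implementing it myself:

In-memory cache: update the in-memory information about the message queues.
Update the message queue information:
1. Queues that used to belong to this consumer but no longer do: stop pulling data from them.
2. Newly assigned queues: start pulling data from them.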
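The two cases in that plan amount to a set difference in each direction. Below is a minimal sketch under stated assumptions: the class AssignmentDiffSketch and its method names are hypothetical, and queues are represented as plain strings rather than MessageQueue objects.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not RocketMQ code.
final class AssignmentDiffSketch {

    /** Queues held before the rebalance that are no longer assigned: stop pulling from these. */
    static Set<String> revoked(Set<String> current, Set<String> assigned) {
        Set<String> result = new HashSet<>(current);
        result.removeAll(assigned);
        return result;
    }

    /** Queues assigned by the rebalance that were not held before: start pulling from these. */
    static Set<String> added(Set<String> current, Set<String> assigned) {
        Set<String> result = new HashSet<>(assigned);
        result.removeAll(current);
        return result;
    }

    public static void main(String[] args) {
        Set<String> current = Set.of("q0", "q1", "q2");
        Set<String> assigned = Set.of("q2", "q3");
        System.out.println("revoked=" + revoked(current, assigned));
        System.out.println("added=" + added(current, assigned));
    }
}
```

Queues present in both sets need no action, because they are already being pulled.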

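/**
*
* @param topic   the topic being rebalanced
* @param mqSet   the message queues that belong to the current client after allocation
* @param isOrder whether consumption is ordered
* @return whether processQueueTable changed
*/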
private boolean updateProcessQueueTableInRebalance(final String topic, final Set<MessageQueue> mqSet,
                                                   final boolean isOrder) {
  boolean changed = false;

  Iterator<Entry<MessageQueue, ProcessQueue>> it = this.processQueueTable.entrySet().iterator();
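  /**
  * K1
  *  Iterate over the MessageQueue -> ProcessQueue cache and only handle entries whose mq belongs to this topic. If an mq is no longer in the
  *  allocation for this topic (for example, the number of message queues changed, so this consumer's queues changed and the queue has been
  *  assigned to another consumer), first mark its ProcessQueue as dropped (dropped is declared volatile), which promptly stops further pulls
  *  into that ProcessQueue, then call removeUnnecessaryMessageQueue(mq, pq) to decide whether to remove it
  */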
  while (it.hasNext()) {
    Entry<MessageQueue, ProcessQueue> next = it.next();
    MessageQueue mq = next.getKey();
    ProcessQueue pq = next.getValue();


    // Only handle queues of this topic
    if (mq.getTopic().equals(topic)) {

      // The new allocation does not contain this old message queue, so the assignment has changed
      if (!mqSet.contains(mq)) {

        // Mark the process queue as dropped
        pq.setDropped(true);
        if (this.removeUnnecessaryMessageQueue(mq, pq)) {
          it.remove();
          changed = true;
          log.info("doRebalance, {}, remove unnecessary mq, {}", consumerGroup, mq);
        }

        // The pull has expired: this queue has not been pulled for a long time
      } else if (pq.isPullExpired()) {
        switch (this.consumeType()) {
          case CONSUME_ACTIVELY:
            break;
          case CONSUME_PASSIVELY:
            pq.setDropped(true);
            if (this.removeUnnecessaryMessageQueue(mq, pq)) {
              it.remove();
              changed = true;
              log.error("[BUG]doRebalance, {}, remove unnecessary mq, {}, because pull is pause, so try to fixed it",
                        consumerGroup, mq);
            }
            break;
          default:
            break;
        }
      }
    }
  }


  
  // Message queues that did not exist before
  // K1 Remove the MessageQueue's offset from memory, compute the next pull offset, then create one pull task (PullRequest) per MessageQueue
  List<PullRequest> pullRequestList = new ArrayList<PullRequest>();
  for (MessageQueue mq : mqSet) {

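    // K2 Only handle queues that did not exist before
    // If the MessageQueue -> ProcessQueue table does not contain it, this is a brand-new assignment and consumption has to start for it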
    if (!this.processQueueTable.containsKey(mq)) {
      if (isOrder && !this.lock(mq)) {
        log.warn("doRebalance, {}, add a new mq failed, {}, because lock failed", consumerGroup, mq);
        continue;
      }

      this.removeDirtyOffset(mq);
      ProcessQueue pq = new ProcessQueue();

      long nextOffset = -1L;
      try {
        nextOffset = this.computePullFromWhereWithException(mq);
      } catch (Exception e) {
        log.info("doRebalance, {}, compute offset failed, {}", consumerGroup, mq);
        continue;
      }

      if (nextOffset >= 0) {
        ProcessQueue pre = this.processQueueTable.putIfAbsent(mq, pq);
        if (pre != null) {
          log.info("doRebalance, {}, mq already exists, {}", consumerGroup, mq);
        } else {
          log.info("doRebalance, {}, add a new mq, {}", consumerGroup, mq);
          PullRequest pullRequest = new PullRequest();
          pullRequest.setConsumerGroup(consumerGroup);
          pullRequest.setNextOffset(nextOffset);
          pullRequest.setMessageQueue(mq);
          pullRequest.setProcessQueue(pq);
          pullRequestList.add(pullRequest);
          changed = true;
        }
      } else {
        log.warn("doRebalance, {}, add new mq failed, {}", consumerGroup, mq);
      }
    }
  }

  // Dispatch the pull requests
  this.dispatchPullRequest(pullRequestList);

  return changed;
}

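The dispatchPullRequest method hands the pull requests to PullMessageService, which puts them on its request queue. So this is where PullRequest comes from. Looking back at the source above, a rebalance covers two cases: the initial allocation, when nothing is in memory yet, and a later rebalance, when some queues were already assigned before.
1. Message queues that stay assigned across the rebalance
  1. When such a queue was first assigned it did not exist in memory either, so a PullRequest was created for it then. As the pull flow in the previous post showed, once a PullRequest exists it is put back on the queue after each pull finishes, which keeps the pulling going.
2. Message queues that are new after the rebalance
  1. A ProcessQueue is created and a PullRequest is put on the PullMessageService request queue; once the request is taken off the queue, the rest of the pull flow runs.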
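The "create once, re-enqueue after every pull" behavior described above can be sketched with a plain blocking queue. This is an illustration under stated assumptions, not RocketMQ's PullMessageService: the PullLoopSketch and Request types are hypothetical, a "pull" is simulated by advancing the offset by one, and the loop runs a fixed number of rounds so the example terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration, not RocketMQ code.
final class PullLoopSketch {

    /** Stand-in for a pull request: a queue name plus the next offset to pull from. */
    static final class Request {
        final String queue;
        long nextOffset;

        Request(String queue, long nextOffset) {
            this.queue = queue;
            this.nextOffset = nextOffset;
        }
    }

    private final BlockingQueue<Request> requests = new LinkedBlockingQueue<>();

    /** Record of simulated pulls, in the form "queue@offset". */
    final List<String> pulled = new ArrayList<>();

    /** Called by the rebalance step, once per newly assigned queue. */
    void dispatch(Request request) {
        requests.add(request);
    }

    /** Runs at most `rounds` pulls; a real service would loop until it is stopped. */
    void run(int rounds) {
        for (int i = 0; i < rounds; i++) {
            Request request = requests.poll();
            if (request == null) {
                return; // no request was ever dispatched, so nothing is pulled
            }
            pulled.add(request.queue + "@" + request.nextOffset);
            request.nextOffset += 1; // pretend one message was pulled
            requests.add(request);   // re-enqueue so this queue keeps being pulled
        }
    }

    public static void main(String[] args) {
        PullLoopSketch service = new PullLoopSketch();
        service.dispatch(new Request("q0", 0));
        service.dispatch(new Request("q1", 10));
        service.run(4);
        System.out.println(service.pulled);
    }
}
```

The early return mirrors the point made in the questions at the top: without a dispatched request, no pull ever happens, so the rebalance step is what starts the cycle.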

Broadcasting mode: to be covered in a later post.

Reference: blog.csdn.net/prestigedin...